Joel Horwitz, IBM & David Richards, WANdisco - Hadoop Summit 2016 San Jose - #theCUBE

>> Narrator: From San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >> Welcome back everyone. We are here live in Silicon Valley at Hadoop Summit 2016, actually San Jose. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guests: David Richards, CEO of WANdisco, and Joel Horwitz, strategy and business development, IBM Analytics. Guys, welcome back to theCUBE. Good to see you guys. >> Thank you for having us. >> It's great to be here, John. >> Give us the update on WANdisco. What's the relationship with IBM and WANdisco? 'Cause, you know. I can just almost see it, but I'm not going to predict. Just tell us. >> Okay, so, I think the last time we were on theCUBE, I was sitting with Re-ti-co who works very closely with Joel. And we began to talk about how our partnership was evolving. And of course, we were negotiating an OEM deal back then, so we really couldn't talk about it very much. But this week, I'm delighted to say that we announced, I think it's called IBM Big Replicate? >> Joel: Big Replicate, yeah. We have a big everything and Replicate's the latest addition. >> So it's going really well. It's OEM'd into IBM's analytics, big data products, and cloud products. >> Yeah, I'm smiling and smirking because we've had so many conversations, David, on theCUBE with you on and following your business through the bumpy road or the wild seas of big data. And it's been a really interesting tossing and turning of the industry. I mean, Joel, we've talked about it too. The innovation around Hadoop and then the massive slowdown and realization that cloud is now on top of it. The consumerization of the enterprise created a little shift in the value proposition, and then a massive rush to build enterprise grade, right? And you guys had that enterprise grade piece of it. IBM, certainly you're enterprise grade. 
You have enterprise everywhere. But the ecosystem had to evolve really fast. What happened? Share with the audience this shift. >> So, it's the classic product adoption lifecycle and the buying audience has changed over that time continuum. In the very early days when we first started talking more at these events, when we were talking about Hadoop, we all really cared about whether it was Pig and Hive. >> You once had a distribution. That's a throwback. Today's Thursday, we'll do that tomorrow. >> And the buying audience has changed, and consequently, the companies involved in the ecosystem have changed. So where we once used to really care about all of those different components, we don't really care about the machinations below the application layer anymore. Some people do, yes, but by and large, we don't. And that's why cloud for example is so successful because you press a button, and it's there. And that, I think, is where the market is going to very, very quickly. So, it makes perfect sense for a company like WANdisco who've got 20, 30, 40, 50 sales people to move to a company like IBM that have 4 or 5,000 people selling our analytics products. >> Yeah, and so this is an OEM deal. Let's just get that news on the table. So, you're an OEM. IBM's going to OEM their product and brand it IBM, Big Replication? >> Yeah, it's part of our Big Insights Portfolio. We've done a great job at growing this product line over the last few years, with last year talking about how we decoupled all the value-adds from the core distribution. So I'm happy to say that we're both part of the ODPi. It's an ODPi-certified distribution. That is Hadoop that we offer today for free. But then we've been adding not just in terms of the data management capabilities, but the partnership here that we're announcing with WANdisco and how we branded it as Big Replicate is squarely aimed at the data management market today. But where we're headed, as David points out, is really much bigger, right? 
We're talking about support for not only distributed storage and data, but we're also talking about a hybrid offering that will get you to the cloud faster. So not only does Big Replicate work with HDFS, it also works with the Swift object store, which, as you know, is kind of the underlying storage for our cloud offering. So what we're hoping to see from this great partnership is as you see around you, Hadoop is a great market. But there's a lot more here when you talk about managing data that you need to consider. And I think hybrid is becoming a lot larger of a story than simply distributing your processing and your storage. It's becoming a lot more about okay, how do you offset different regions? How do you think through that there are multiple, I think there's this idea that there's one Hadoop cluster in an enterprise. I think that's factually wrong. I think what we're observing is that there's actually people who are spinning up, you know, multiple Hadoop distributions at the line of business for maybe a campaign or for maybe doing fraud detection, or maybe doing log file, whatever. And managing all those clusters, and they'll have Cloudera. They'll have Hortonworks. They'll have IBM. They'll have all of these different distributions that they're having to deal with. And what we're offering is sanity. It's like give me sanity for how I can actually replicate that data. >> I love the name Big Replicate, fantastic. Big Insights, Big Replicate. And so go to market, you guys are going to have a bigger sales force. It's a nice pop for you guys. I mean, it's a good deal. >> We were just talking before we came on air about sort of a deal flow coming through. It's coming through, this potential deal flow coming through, which has been off the charts. I mean, obviously when you turn on the tap, and then suddenly you enable thousands and thousands of sales people to start selling your products. I mean, IBM are doing a great job. 
And I think IBM are in a unique position where they own both cloud and on-prem. There are very few companies that own both the on-prem-- >> They're going to need to have that connection for the companies that are going hybrid. So hybrid cloud becomes interesting right now. >> Well, actually, it's, there's a theory that says okay, so, and we were just discussing this, the value of data lies in analytics, not in the data itself. It lies in your being able to pull out information from that data. Most CIOs-- >> If you can get the data. >> If you can get the data. Let's assume that you've got the data. So then it becomes a question of, >> That's a big assumption. Yes, it is. (laughs) I just had Nancy Handling on about metadata. No, that's an issue. People have data they store and they can't do anything with it. >> Exactly. And that's part of the problem because what you actually have to have is CPU slash processing power for an unknown amount of data at any one moment in time. Now, that sounds like an elastic use case, and you can't do elastic on-prem. You can only do elastic in cloud. That means that virtually every distribution will have to be a hybrid distribution. IBM realized this years ago and began to build this hybrid infrastructure. We're going to help them to move data, completely consistent data, between on-prem and cloud, so when you query things in the cloud, it's exactly the same results and the correct results you get. >> And also the stability too on that. There's so many potential, as we've discussed in the past, that sounds simple and logical. To do an enterprise grade is pretty complex. And so it just gives a nice, stable enterprise grade component. >> I mean, the volumes of data that we're talking about here are just off the charts. >> Give me a use case of a customer that you guys are working with, or has there been any go-to-market activity or an ideal scenario that you guys see as a use case for this partnership? 
>> We're already seeing a whole bunch of things come through. >> What's the number one pattern that bubbles up to the top? Use case-wise. >> As Joel pointed out, he doesn't believe that any one company just has one version of Hadoop behind their firewall. They have multiple vendors. >> 100% agree with that. >> So how do you create one, single cluster from all of those? >> John: That's one problem you solved. >> That's of course a very large problem. Second problem that we're seeing in spades is I have to move data to cloud to run analytics applications against it. That's huge. That requires completely guaranteed consistent data between on-prem and cloud. And I think those two use cases alone account for pretty much every single company. >> I think there's even a third here. I think the third is actually, I think frankly there's a lot of inefficiencies in managing just HDFS and how many times you have to actually copy data. If I looked across, I think the standard right now is having like three copies. And actually, working with Big Replicate and WANdisco, you can actually have more assurances and actually have to make less copies across the cluster and actually across multiple clusters. If you think about that, you have three copies of the data sitting in this cluster. Likely, an analyst has dragged a bunch of the same data into other clusters, so that's another multiple of three. So there's an amount of waste in terms of the same data living across your enterprise. I think there's a huge cost-savings component to this as well. >> Does this involve anything with Project Atlas at all? You guys are working with, >> Not yet, no. >> That project? It's interesting. We're seeing a lot of opening up the data, but all they're doing is creating versions of it. And so then it becomes version control of the data. You see a master or a centralization of data? Actually, not centralize, pull all the data in one spot, but why replicate it? Do you see that going on? 
I guess I'm not following the trend here. I can't see the mega trend going on. >> It's cloud. >> What's the big trend? >> The big trend is I need an elastic infrastructure. I can't build an elastic infrastructure on-premise. It doesn't make economic sense to build massive redundancy maybe three or four times the infrastructure I need on premise when I'm only going to use it maybe 10, 20% of the time. So the mega trend is cloud provides me with a completely economic, elastic infrastructure. In order to take advantage of that, I have to be able to move data, transactional data, data that changes all the time, into that cloud infrastructure and query it. That's the mega trend. It's as simple as that. >> So moving data around at the right time? >> And that's transaction. Anybody can say okay, press pause. Move the data, press play. >> So if I understand this correctly, and just, sorry, I'm a little slow. End of the day today. So instead of staging the data, you're moving data via the analytics engines. Is that what you're getting at? >> You use data that's being transformed. >> I think you're accessing data differently. I think today with Hadoop, you're accessing it maybe through like Flume or through Oozie, where you're building all these data pipelines that you have to manage. And I think that's obnoxious. I think really what you want is to use something like Apache Spark. Obviously, we've made a large investment in that earlier, actually, last year. To me, what I think I'm seeing is people who have very specific use cases. So, they want to do analysis for a particular campaign, and so they may just pull a bunch of data into memory from across their data environment. And that may be on the cloud. It may be from a third-party. It may be from a transactional system. It may be from anywhere. And that may be done in Hadoop. It may not, frankly. 
>> Yeah, this is the great point, and again, one of the themes on the show is, this is a question that's kind of been talked about in the hallways. And I'd love to hear your thoughts on this. There are some people saying that there's really no traction for Hadoop in the cloud. And that customers are saying, you know, it's not about just Hadoop in the cloud. I'm going to put in S3 or object store. >> You're right. I think-- >> Yeah, I'm right as in what? >> Every single-- >> There's no traction for Hadoop in the cloud? >> I'll tell you what customers tell us. Customers look at what they actually need from storage, and they compare whatever it is, Hadoop or any on-premise proprietary storage array and then look at what S3 and Swift and so on offer to them. And if you do a side-by-side comparison, there isn't really a difference between those two things. So I would argue that it's a fact that functionally, storage in cloud gives you all the functionality that any customer would need. And therefore, the relevance of Hadoop in cloud probably isn't there. >> I would add to that. So it really depends on how you define Hadoop. If you define Hadoop by the storage layer, then I would say for sure. Like HDFS versus an object store, that's going to be a difficult one to find some sort of benefit there. But if you look at Hadoop, like I was talking to my friend Blake from Netflix, and I was asking him so I hear you guys are kind of like replatforming on Spark now. And he was basically telling me, well, sort of. I mean, they've invested a lot in Pig and Hive. So if you think now about Hadoop as this broader ecosystem which you brought up Atlas, we talk about Ranger and Knox and all the stuff that keeps coming out, there's a lot of people who are still invested in the peripheral ecosystem around Hadoop as that central point. My argument would be that I think there's still going to be a place for distributed computing kind of projects. 
And now whether those will continue to interface through YARN and then down to HDFS, or whether that'll be YARN on say an object store or something and those projects will persist on their own. To me that's kind of more of how I think about the larger discussion around Hadoop. I think people have made a lot of investments in terms of that ecosystem around Hadoop, and that's something that they're going to have to think through. >> Yeah. And Hadoop wasn't really designed for cloud. It was designed for commodity servers, deployment with ease and at low cost. It wasn't designed for cloud-based applications. Storage in cloud was designed for storage in cloud. Right, that's with S3. That's what Swift and so on were designed specifically to do, and they fulfill most of those functions. But Joel's right, there will be companies that continue to use-- >> What's my whole argument? My whole argument is that why would you want to use Hadoop in the cloud when you can just do that? >> Correct. >> There's object store out. There's plenty of great storage opportunities in the cloud. They're mostly shoe-horning Hadoop, and I think that's, anyway. >> There are two classes of customers. There were customers that were born in the cloud, and they're not going to suddenly say, oh you know what, we need to build our own server infrastructure behind our own firewall 'cause they were born in the cloud. >> I'm going to ask you guys this question. You can choose to answer or not. Joel may not want to answer it 'cause he's from IBM and gets his wrist slapped. This is a question I got on DM. Hadoop ecosystem consolidation question. People are mailing in the questions. Now, keep sending me your questions if you don't want your name on it. Hold on, Hadoop system ecosystem. When will this start to happen? What is holding back the M and A? >> So, that's a great question. 
First of all, consolidation happens when you sort of reach that tipping point or leveling off, that inflection point where the market levels off, and we've reached market saturation. So there's no more market to go after. And the big guys like IBM and so on come in-- >> Or there was never a market to begin with. (laughs) >> I don't think that's the case, but yes, I see the point. Now, what's stopping that from happening today, and you're a naughty boy by the way for asking this question, is a lot of these companies are still very well funded. So while they still have cash on the balance sheet, of course, it's very, very hard for that to take place. >> You picked up my next question. But that's a good point. The VCs held back in 2009 after the crash of 2008. Sequoia's memo, you know, the good times roll, or RIP good times. They stopped funding companies. Companies are getting funded, continually getting funding. Joel. >> So I don't think you can look at this market as like an isolated market like there's the Hadoop market and then there's a Spark market. And then even there's like an AI or cognitive market. I actually think this is all the same market. Machine learning would not be possible if you didn't have Hadoop, right? I wouldn't say it. It wouldn't have a resurgence that it has had. Mahout was one of the first machine learning languages that caught fire from Ted Dunning and others. And that kind of brought it back to life. And then Spark, I mean if you talk to-- >> John: I wouldn't say it creates it. Incubated. >> Incubated, right. >> And created that Renaissance-like experience. >> Yeah, deep learning. Some of those machine learning algorithms require you to have a distributed kind of framework to work in. And so I would argue that it's less of a consolidation, but it's more of an evolution of people going okay, there's distributed computing. 
Do I need to do that on-premise in this Hadoop ecosystem, or can I do that in the cloud, or in a growing Spark ecosystem? But I would argue there's other things happening. >> I would agree with you. I love both areas. My snarky comment there was never a market to begin with, what I'm saying there is that the monetization of commanding the hill that everyone's fighting for was just one of many hills in a bigger field of hills. And so, you could be in a cul-de-sac of being your own champion of no paying customers. >> What you have-- >> John: Or a free open-source product. >> Unlike the dotcom era where most of those companies were in the public markets, and you could actually see proper valuations, most of the companies, the unicorns now, most are not public. So the valuations are really difficult to, and the valuation metrics are hard to come by. There are only a few of those companies that are in the public market. >> The cash story's right on. I think to Joel's point, it's easy to pivot in a market that's big and growing. Just 'cause you're in the wrong corner of the market, pivoting or vectoring into the value is easier now than it was 10 years ago. Because, one, if you have a unicorn situation, you have cash in the bank. So they have a good flush cash. Your runway's so far out, you can still do your thing. If you're a startup, you can get time to value pretty quickly with the cloud. So again, I still think it's very healthy. In my opinion, I kind of think you guys have good analysis on that point. >> I think we're going to see some really cool stuff happen working together, and especially from what I'm seeing from IBM, in the fact that in the IT crowd, there is a behavioral change that's happening that Hadoop opened the door to. That we're starting to see more and more IT professionals walk through. In the sense that, Hadoop has opened the door to not thinking of data as a liability, but actually thinking about data differently as an asset. 
And I think this is where this market does have an opportunity to continue to grow as long as we don't get carried away with trying to solve all of the old problems that we solved for on-premise data management. Like if we do that, then we're just, then there will be a consolidation. >> Metadata is a huge issue. I think that's going to be a big deal. And on the M and A, my feeling on the M and A is that, you got to buy something of value, so you either have revenue, which means customers, and/or intellectual property. So, in a market of open source, it comes back down to the valuation question. If you're IBM or Oracle or HP, they can pivot too. And they can be agile. Now slower agile, but you know, they can literally throw some engineers at it. So if there's no customers and IP, they can replicate, >> Exactly. >> That product. >> And we're seeing IBM do that. >> They don't know what they're buying. My whole point is if there's nothing to buy. >> I think it depends on, ultimately it depends on where we see people deriving value, and clearly in WANdisco, there's a huge amount of value that we're seeing our customers derive. So I think it comes down to that, and there is a lot of IP there, and there's a lot of IP in a lot of these companies. I think it's just a matter of widening their view, and I think WANdisco is probably the earliest to do this frankly. Was to recognize that for them to succeed, it couldn't just be about Hadoop. It actually had to expand to talk about cloud and talk about other data environments, right? >> Well, congratulations on the OEM deal. IBM, great name, Big Replicate. Love it, fantastic name. >> We're excited. >> It's a great product, and we've been following you guys for a long time, David. Great product, great energy. So I'm sure there's going to be a lot more deals coming on your. Good strategy is OEM strategy thing, huh? >> Oh yeah. >> It reduces sales cost. >> Gives us tremendous operational leverage. 
Getting 4,000, 5,000-- >> You get a great partner in IBM. They know the enterprise, great stuff. This is theCUBE bringing all the action here at Hadoop. IBM OEM deal with WANdisco all happening right here on theCUBE. Be back with more live coverage after this short break.

Published Date : Jul 1 2016

Virginia Heffernan, Author of Magic and Loss | Hadoop Summit 2016 San Jose

From San Jose, California, in the heart of Silicon Valley, it's theCUBE, covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >>Okay, welcome back. We are here live in Silicon Valley for theCUBE. This is our flagship program. We go out to the events and extract the signal from the noise. Of course, we're here at the big data event, Hadoop Summit 2016. We have a special guest, a celebrity now, author of the bestselling book Magic and Loss, Virginia Heffernan. Magic and Loss, rising on the bestseller lists. Welcome to theCUBE. Thanks in our show, you are my internet friend and now you're my real life friend. You're my favorite Facebook friend that I just now met for the first time. Great to meet you. We had never met and now we, but we know each other of course intimately through the interwebs. So I've been following your writing your time. Send you do some stuff on Medium and then you, you kind of advertise. You're doing this book. I saw you do the Google Glass experiment in. >>It was Brooklyn and it might, I was so into Google Glass and I will admit it, I fall for everything. I fell for VR and all its incarnations and um, and the Google last year, it was like that thing that was supposed to put the internet all voice activated, just put the internet always in front of your face. So I started to wear it around in Brooklyn, my prototype. I thought everyone would stop me and say how cool it was. In fact they didn't think it was pull it off New Yorkers. That's how you would, how they really feel. Got a problem with that. Um, your book Magic and Loss is fantastic and I think it really is good because uh, Dan Lyons wrote Disrupted, loved, which was fantastic. Dan Lyons, big fan of him and his work, but it really, it wasn't a parody of civil rights for Silicon Valley. >>The show that's kinda taken that culture and made it mainstream. I had people call me up and say, Hey, you live in Palo Alto. My God, do you live near the house? 
Something like it's on Newell, which is one of my cross streets. But the point is tech culture now is kind of in a native, my youngest is 13 and you know, we're in an iPad generation for the youth and we're from the generation where there was no cell phones. And Mike, I remember when pages were the big innovation and internet. But I think, I think when I'm telling you, I think, I know I'm talking to a fellow traveler when I say that there was digital culture before the advent of the worldwide web in the early nineties you know, I, I'm sure you did too. Got electronic games like crazy. I would get any Merlin or Simon or whatever that they, they introduced. >>And then I also dialed into a mainframe in the late seventies and the early eighties to play the computer as we call it. We didn't even call it the internet. And the thing about the culture too was email was very, you know, monochrome screens, but again, clunky but still connected. Right? So we were that generation of, you know, putting that first training wheels on and now exposed to you. So in the book, your premise is, um, there's magical things happening in the internet and art countering the whole trolling. Uh, yeah, the Internet's bad. And we know recently someone asked me, how can the internet be art when Twitter is so angry? What do you think art is? You know, this is an art. Art is emotional. Artists know powerful >>emotions represented in tranquility and this is, you know, what you see on the internet all the time. Of course the aid of course are human. It needs a place to live and call it Twitter. For now it used to be YouTube comments. So, but we are always taking the measure of something we've lost. Um, I get the word loss from lossy compression, you know, the engineering term that, how does, how MP3 takes that big broad music signal and flattens it out. And something about listening to music on MP3, at least for me, made me feel a sense that I was grieving for something. 
It was missing something from my analog life. On the other hand, more than counterbalanced by the magic that I think we all experienced on the internet. We wouldn't have a friendship if it weren't for social media and all kinds of other things. And strange serendipity happens not to mention artistic expression in the form of photography, film, design of poetry and music, which are the five chapters of the book. >>So the book is fantastic. The convergence and connection of people, concepts, life with the internet digitally is interesting, right? So there's some laws with the MP3. Great example, but have you found post book new examples? I'm sure the internet culture, geese like Mia, like wow, this is so awesome. There's a cultural aspect of it is the digital experience and we see it on dating sites. Obviously you see, you know Snapchat, you know, dating sites like Tinder and other hookups apps and the real estate, everything being Uberized. What's the new things that you've, that's coming out and you must have some >>well this may be controversial, but one thing I see happening is anti digital culture. Partly as an epi phenomenon of side effect of digitization. We have a whole world of people who really want to immerse themselves in things like live music maker culture, things made by hand, vinyl records, vinyl records, which are selling more than ever in the days of the rolling stones. Gimme shelter less they sold less than than they do now. The rolling stones makes $1 billion touring a year. Would we ever have thought that in the, in the, you know, at the Genesis of the iPod when it seemed like, you know, recorded music represented music in that MP3 thing that floated through our, our phones was all we needed. No, we want to look in the faces of the rolling stones, get as close as we can to the way the music is actually made and you know, almost defiantly, and this is how the culture works. This is how youth culture works. 
Um, reject, create experiences that cannot be digitized. >>This is really more of a counter culture movement on the overt saturation of digital. >>Yes. Yes. You see the first people to scale down from, you know, high powered iPhones, um, when we're youth going to flip phones. You know, it's like the greatest like greatest punk, punk, punk tech. Exactly. It's like, yeah, I'm going to use these instruments, but like if I break a string, who cares on a PDs? The simplest one, right? >>My mom made me use my iPhone. Are we going to, how are we going to have that? it'd >>be like, Oh, look at you with your basic iPhone over there. And I've got my just like hack down, downscale, whatever. And you know what, I don't spend the weekends, don't pick up my phone on the weekends. But you know, there are interesting markets there. And interesting. I mean, for instance, the, you know, the live phenomenon, I know that, you know, there's this new company by one of the founders of Netflix movie pass, which um, for a $30 subscription you've seen movies in the theater as much as you want and the theaters are beautiful. And what instead of Netflix and chill, you know, the, the, the contemporary, you know, standard date, it's dinner and movie. You're out again. You're eating food, which can't be digitized with in-company, which can't be digitized. And then sitting in a theater, you know, a public experience, which is, um, a pretty extraordinary way that the culture and business pushes back on digital. >>Remember I was a comma on my undergraduate days in computer science in the 80s. And before when it was nerdy and eh, and there was a sociology class at Hubba computers and social change. And the big thing was we're going to lose social interactions because of email. And if you think about what you're talking about here is that the face to face presence, commitment of being with somebody right now is a scarce resource. You have an abundance of connections. 
>>I mean, in fact, what has happened is digital culture has jacked up the value of undigital culture. So for instance, you know, we met on Facebook, we talk on Facebook Messenger, we notice that we're, you know, like kindred spirits in a certain way, and we like each other's posts and so forth. Then we have a more extensive talk in Messenger. When we meet in person for the first time, both of us are East Coast people, but we hugged hello, because it's like, oh wow, you in the flesh. You know, it's something exciting. >>Connection virtually. That's right. There's a synchronous connection, a presence, but we haven't really met face to face. >>Yeah, there's this great little experiment going on. A group of kids in Silicon Valley have decided they're too addicted to their phones and Facebook. Now, I am not recommending for your viewers and listeners that anybody do what these kids do. Sounds good. Ready, go. Hey, all right, so what they do is take an LSD breakfast. Now, I don't take drugs. I think you can do this without the LSD, but they put a little bit of a hallucinogen under their skin in the morning, and what they find is they lose interest in the boring interface in their phones, because people on the bus suddenly look so fascinating to them. The human face is an ratable interface. It can't be reproduced anywhere. Steve, you know, Jony Ive can't make it. They can't make it at Google. And that, I think, is something we will see young markets doing, which is this renewed appreciation for nature, for humans, and for analog culture. >>That's right. The Navy is going to sextants and compasses. You may have seen, they're training sailors on those devices because of the fear that GPS might be hacked.
So you know, the young kids probably don't even know what a compass is. >>Well, I bought myself a compass recently, because I suddenly was like, you know, we talk a lot about digital technology, but what the heck, this thing can point toward the poles, right in my hands. You know, I was suddenly like, we are on this floating ball with these poles with different magnetic charges. And I think it's time to appreciate it. >>Okay, so I've got to ask about the feedback that you've gotten from the book. Again, we're here with Virginia, Magic and Loss, great, great book. Go buy it. It's fantastic and will open your mind up. It's thought-provoking, but with really specific, good use cases. I've got to think that, you know, when you talk at Google and when you talk to some of the groups that you're talking to, certainly book clubs and others online, there must be reactions like, oh my God, you hit the cultural nerve. What have you heard from some of these folks, from my age, 50, down to the 20-something-year-olds? Have you had any aha moments where you said, oh my God, I hit a nerve here? >>I mean, I didn't want to write one of those books that's like, the one thing you need to know to get your startup to succeed, or whatever. You know, I was at the airport and every single one of them is like, the only thing you need to do to save this or whatever. And they do take a very short view. Now, if you're thinking about your quarterly return, or what you're going to do this quarter and when you're going to be profitable, or user acquisition, those books are good manuals. But if you're going to buy a hardcover book and you're going to really invest in reading every page, not just the bolded part, not just, you know, the two points that you have to know... I really wanted readers like the ones I had found on the internet, people like you, who have an interest in a long view.
You know, I mean a really long view. >>And in a prose that's not a listicle or, you know, shorts. It's just a thought provoker, where somebody can go, hey, you know, at the beach on the weekend, and say, hey wow, this is really cool. What if, you know, we went analog for a while? Or, what's best for my kids, to let my kids play multiplayer games more, 'cause they simulate life? That was my, so these are the kinds of questions that digital parents are asking. >>Yeah. So, you know, let's take the parents question, which is, surprisingly to me, a pressing question. I am a parent, but my kids' digital habits are not, you know, of obsessive interest to me. Sometimes I think the worry about our kids is a proxy for how we worry about ourselves. You know, it's funny, because there's the model of the parent saying, my kid has attention deficit disorder, my kid has attention deficit disorder. The kid's over here, the parent's here, you know, who has the attention deficit disorder? But in any case, I have realized that parents are talking about computers and the internet as though they're something kids have to have a very ambivalent relationship with, and a very wary relationship with. So limit the time, and it sounds a little bit like the abstinence movement around sexuality, that, like, you know, you only dip in, it's very, you know, they only date. Right, right, right. >>Instead of joining sides with their kids and helping to create a durable, powerful, interesting online avatar, which is what kids want to do. And it's also what we want to do. So, like, in your Facebook profile, there are all kinds of strategic choices you can make as a creator of that profile. We know it as adults. Some people put up pictures of their kids, some people don't. Vacation pictures. Some people promote the heck out of themselves, some people don't do so much of that. Do you put up a lot of photographs? Whatever.
Those are the decisions we started to make when we went on Facebook, and kids are making them too: what armor to have on their gaming profile, that's kind of how they want to play, you know, are you going to wear feathers. These are important things. But, you know, small questions, like talking to your kids, and I don't mean a touchy-feely conversation, but literally: are you going to write in all lower case? You know, write in all lower case, you're cute and you're this, and that means a certain thing and you should get it. And if you're going to write in all caps and you're going to talk about white nationalist ideology, well, that also has a set of consequences. >>What have you learned in terms of the virtual space? Actually, augmented reality, virtual reality, these promise to be virtual spaces. What is one of them? They always hope to replicate the real world. I mean, yes. Will there be any parallels of the kind of commitment in the moment? >>I'll give you one thing I say to kids, that, you know, the subtitle of the book is The Internet as Art, Magic and Loss. The internet is art, and the kind of art the internet is, is what I think of as realist art. It purports to be reality. You know, every technology, pick one, photography, film, or think of even the introduction of a third dimension in painting, you know, perspective in Renaissance painting, purports to represent reality better than it's been represented before. And if you're right in sync with the technology, you're typically fooled by it. >>I mean, this is a seductive representation of reality. You know, people watching us now believe they're seeing us, flesh and blood, talk. You know, they don't think they're seeing pixels that are designed in certain ways, and certainly in your ways. So trying to sort out the incredibly interesting, immersive, artful experience of being online, which has some dangers and some emotions, and to divide it from real life, is a really important thing.
And, you know, for us to learn first and then to model for our kids. So I had a horrible day on Twitter one day, 2012, 2013, worst day ever on Twitter. It was a great day for me. I spent the day at the beach; my Twitter avatar took sniper fire for me all day. People called her an idiot. I separated them out. And anyone who likes role-play and games knows that, like, I'm not a high priestess in Dungeons and Dragons. >>You know, I'm a much smaller person than that. And, you know, in the case of this Twitter battle, I'm a less embattled person than the one that takes fire for me on Twitter. That's my armor. >>Your armor. So let's talk about poetry. Twitter, you mentioned poetry. Twitter, 140 characters. >>140 characters is a lot if, like a lot of internet users, you have a pictographic language like Chinese. So 140 characters is a novel, well, not a novel, but it's a short story for, you know, a writer of short-form Chinese aphorisms, like Confucius. So one of the things I wanted to say is there's nothing about it being short that makes it low culture. You know, I mean, it takes a second to take in a sculpture or to take in a painting, and yet the amount of craft that went into that might be much more. Good tweeting, and you're excellent at it, is not easy. You know, I know there are times I've been like, I tagged the wrong person and then I have to delete it because the name didn't come up, or, you know, I get the hashtags wrong and then I'm like, oh, it would have been better this other way, or I don't have a smart enough interjection. >>It's like playing sports. Twitter's like, you know, firing the tennis ball, baseline rallies with people. I mean, there's a cultural thing. And this is the thing that I love about your book: you really bring in the metaphors around art and the cultural aspect.
Have you found that there's one art period that we represent right now, that could be a comparison? >>I mean, you know, it's always tempting to compare everything to the Renaissance. And, you know, obviously in the Italian Renaissance there was so much technological innovation and so much artistic innovation. But, you know, the other thing is, it might be bigger than that, which sounds grandiose, but we're talking about something that nearly 6 billion people use and have access to. So we're talking about something bigger than we've ever seen. It's the dawn of a civilization. So, like, we pay a lot of attention to the aqueducts in Rome and, you know, later pay attention to the frescoes. I attend in this book to the frescoes, to the sculpture, to the music, to the art. So instead of talking about frescoes, as an art historian might, I talk about Instagram. Yeah. >>And these things all weave together, 'cause we come back to the global fabric. If you look at civilization as, you know, not to use the World Is Flat kind of metaphor, but that book kind of brings out that notion of, okay, if you just say it's one global fabric, yes, you have poetry, you have photography of soiling with a Johnny Susana ad in London. He says, you know, cricket is a sport in England, a bug, and a delicacy, depending on where in the world you are. >>Love that. I wonder if that's the HSBC ad. Actually, HSBC has done a beautiful campaign, I should find out who did it, about perspective. And that is also a wonderful way to think about the internet, because, you know, I know a lot of people who don't like Twitter, who don't like YouTube comments. I do like them, because I am perpetually surprised at what people bring to their interpretation. Insights in the comments can be revealing. You know, you don't want to get your feelings hurt.
Sometimes you don't want that much exposure to the microflora and fauna of ideas that could be frightening. But you know, when you're up for it, it's a really nice test of your immune system, you know. >>All right. So what's next for you? Virginia Heffernan, Magic and Loss, great book. >>I think I will continue to write tech criticism, which is just this growing field. Sarah Watson had a wonderful piece today in the Columbia Journalism Review about how we really need to bring all our faculties to tech criticism, and treating tech with care and with proper awe. And the next book is on anti-digital culture. I will continue writing journalism, and you'll see little previews of that book in the next work. >>Super inspirational. And I think the culture needs this kind of rallying cry, because, you know, there is art and science and all this beauty in the internet, and it's not mutually exclusive with the analog world. You can look and take, and come offline. So it's an interesting case study of this revolution, I think. And I think the counterculture, if you go back, and I spoke with John Markoff about this, when he wrote his first book, What the Dormouse Said, about the counterculture in Silicon Valley, which is a great book. And countercultures usually create another wave of innovation. So the question that comes out of this one is, this could be a seminal moment in history. >>I mean, I think it absolutely is. You know, in some ways, every moment is a great moment if you know what to make of it. But I am just tired of people telling us that we're ruining our brains and that this is the end of innovation and that we're at some low period. I think we will look back and think of this as an incredibly fertile time for our imaginations, if we don't lose hope, if we keep our creativity fired, and if we commit to this incredible period we're in. >>Virginia, thanks for spending the time here on theCUBE.
Really appreciate it. We are live in Silicon Valley. This is theCUBE, with author Virginia Heffernan, Magic and Loss. Great book. Get it. If you don't have it, hard copies are still available, get it. We'll be right back with more live coverage here. This is theCUBE. I'm John Furrier. We'll be right back with more after this short break.

Published Date : Jun 30 2016

SUMMARY :

Hadoop Summit 2016, brought to you by Hortonworks.


Emer Coleman, Disruption - Hadoop Summit 2016 Dublin - #HS16Dublin - #theCUBE

>> Narrator: Live from Dublin, Ireland. It's theCUBE, covering Hadoop Summit Europe 2016. Brought to you by Hortonworks. Now your hosts, John Furrier and Dave Vellante. >> Okay, welcome back here, we are here live in Dublin, Ireland, it's theCUBE, SiliconANGLE's flagship program where we go out to the events and extract the signal from the noise, I'm John Furrier, my cohost Dave Vellante, our next guest is Emer Coleman who's with Disruption Limited, Open Data Governance Board in Ireland and Transport API, a growing startup built self-sustainable, growing business, open data, love that keynote here at Hadoop Summit, very compelling discussion around digital goods, digital future. Emer, welcome to theCUBE. >> It's great to be here. >> So what was your keynote? Let's just quickly talk about what you talked about, and then we can get in some awesome conversation. >> Sure. So the topic yesterday was we need to talk about techno ethics. So basically, over the last couple of months, I've been doing quite a lot of research on ethics and technology, and many people have different interpretations of that, but yesterday I said it's basically about three things. It's about people, it's about privacy, and it's about profits. So it's asking questions about how do we look at holistic technology development that moves away from a pure technocratic play and looks at the deep societal impacts that technology has. >> One of the things that we're super excited about and passionate about is this new era of openness going to a whole other level. Obviously, open source tier one software development environment, cloud computing allows for instant access to resources, almost limitless at this point, as you can project it forward with Moore's Law and whatnot.
But the notion that digital assets are not just content, it's data, it's people, it's the things you mentioned, create a whole new operating environment or user experience, user expectations with mobile phones and Internet of Things and Transport API which you have, if it moves, you capture it, and you're providing value there. So a whole new economy is developing around digital capital. Share your thoughts around this, because this is an area that you're passionate about, you've just done work here, what's your thoughts on this new digital economy, digital capital, digital asset opportunity? >> I think there's huge excitement about the digital economy, isn't there? And I think one of the things I'm concerned about is that that excitement will lead us to the same place that we are now, where we're not really thinking through what the equitable distribution in that economy is, because it seems to me that the spoils are going to a very tiny elite at the top. So if you look at Instagram, 13 employees when it was purchased by Facebook for a billion dollars, but that's all our stuff, so I'm not getting any shares in the billion, those 13 people are. That's fantastic that you can build a business, build it to that stage and sell, but you have to think about two things, really: what are we looking at in terms of sustainable businesses into the future that create ethical products, and also the demands from citizens to get some value for their data back, because we're becoming shadow employees, we're shadow employees of Google, so when we email, we're not just corresponding, we're creating value for that company. >> And Facebook is a great example. >> And Facebook, and the thing is, when we were at the beginning of that digital journey, it was quite naive. So we were very seduced by free, and we thought, "This is great," and so we're happy with the service. And then the next stage of that, we realize what if we're not paying for the service, we're the product?
>> John: Yeah. >> But we were too embedded in the platform to extricate ourselves. But now, I think, when we look at the future of work and great uncertainty that people are facing, when their labor's not going to be required to the same degree, are we going to slavishly keep producing capital and value for companies like Google, and ask for nothing more than the service in return? I don't think so. >> And certainly, the future will be impacted, and one of the things we see now in our business of online media and online open data, is that the data's very valuable. We see that, I'll say data is the new capital, new oil, whatever phrases of the day is used, and the brand marketers are the first ones to react to it, 'cause they're very data driven. Who are you, how do I sell stuff to you? And so what we're seeing is, brand marketers are saying, "Hey, I'm going to money to try to reach out to people, "and I'm going to activate that base and connect with, "engage with them on Facebook or other platform. "I'm going to add value to your Facebook or Google platform, "but yet I'm parasitic to your platform for the data. "Why just don't I get it directly?" So again, you're starting to see that thinking where I don't want to be a parasite or parasitic to a network that the value's coming from. The users have not yet gotten there, and you're teasing that out. What's your thoughts there, progression, where we're at, have people realized this? Have you seen any movement in the industry around this topic? >> No, I think there's a silence around... Technology companies want to get all the data they can. They're not going to really declare as much as they should, because it bends their service model a bit. Also, the data is emergent. Zuckerberg didn't start Facebook as something that was going to be a utility for a billion people, he started it as a social network for a university. And what grew out of that, we learned as we went along. 
So I'm thinking, now that we have that experience, we know that happens, so let's start the thinking now. And also, this notion of just taking data because you can, almost speculatively getting data at the point of source, without even knowing what you want it for but thinking, "I'm going to monetize this in the end." Jaron Lanier in his book Who Owns The Future talks about micro licensing back content. And I think that's what we need to do. We start, at the very beginning, we need to start baking in two things: privacy by design and different business models where it's not a winner takes all. It's a dialog between the user and the service, and that's iterated together. >> This idea that it's not a zero sum game is very important, and I want to go back to your Instagram and Facebook example. At its peak, I think Eastman Kodak had hundreds of thousands of employees, maybe four or five hundred, 450,000 employees, huge. Facebook has many many more photos, but maybe a few thousand employees? Wow, so all the jobs are gone, but at the same time, we don't want to be protecting the past from the future, so how do you square that circle? >> Correct, but I think what we know is that the rise of robotics and software is going to eat jobs, and basically, there's going to be a hollowing out of the middle class. You know, for sure, whether it's medicine, journalism, retail, exactly. >> Dave: It's not future, it's now. (laughs) >> Exactly. So we maybe come into a point where large swaths of people don't have work. Now, what do you do in a world where your labor is no longer required? Think about the public policy implications of that. Do we say you either fit in this economy or you die? Are we going to look at ideas which they are looking at in Europe, which is like a universal wage? And all of these things are a challenge to government, because they're going to have a citizenry who are not included in this brave new world. 
So some public policy thinking has to go into what happens when our kids can't get jobs. When the jobs that used to be done by people like us are done by machines. I'm not against the movement of technology, what I'm saying is there are deep societal implications that need some thinking, because if we get to a point where we suddenly realize, if all of these people who are unemployed and can't get work, this isn't a future we envisioned where robots would take all the crap jobs and we would go off to do wonderful things, like how are we going to bring the bacon home? >> It seems like in a digital world that the gap is creativity to combine technologies and knowledge. I find that it's scary when you talk about maybe micromanaging wages and things like that, education is the answer, but that's... How do you just transfer that knowledge? That's sort of the discussion that we're having in the United States anyway. >> I think some of the issue is that the technology is so, we're kind of seduced by simplicity. So we don't see the complexity underneath, and that's the ultimate aim of a technology, is to make something so simple, that complexity is masked. That's what the iPhone did wonderfully. But that's actually how society is looking now. So we're seduced by this simplicity, we're not seeing the complexity underneath, and that complexity would be about what do we do in a world where our labor is no longer required? >> And one of the things that's interesting about the hollowing of the middle class is the assumption is there's no replacements, so one of the things that could be counter argued is that, okay, as the digital natives, my daughter, she's a freshman in high school, my youngest son's eighth grade, they're natives now, so they're going to commit. So what is the replacement capital and value for companies that can be sustained in the new economy versus the decay and the darwinism of the old? So the digital darwinism aspect's interesting, that's one dilemma. 
The other one is business models, and I want to get your thoughts on this 'cause this is something we were teasing out with this whole value extraction and company platform issue. A company like Twitter. Highly valuable company, it's a global network of people tweeting and sharing, but yet is under constant pressure from Wall Street and investors that they basically suck. And they don't, they're good, people love Twitter, so they're being forced to behave differently against their mission because their profit motive doesn't really match maybe something like Facebook, so therefore they're instantly devalued, yet the future of someone connecting on Twitter is significantly high. That being said, I want to get your thoughts on that and your advice to Twitter management, given the fact it is a global network. What should they do? >> It's the same old capitalism, just it's digital, it's a digital company, it's a digital asset. It's the same approach, right? Twitter has been a wonderful thing. I've been a Twitter user for years. How amazing, it's played a role in the Arab Spring, all sorts of things. So they're really good, but I think you need as a company, so for example, in our company, in Transport API, we're not really looking to build to this massive IPO, we're trying to build a sustainable company in a traditional way using digital. So I think if you let yourself be seduced by the idea of a phenomenal IPO, you kind of take your eye off the ball. >> Or in this case, in case you got IPOed, now you're under pressure to produce-- >> Emer: Absolutely, yeah. >> Which changes your behavior. But in Twitter's management defense, they see the value of their product. Now, they got there by accident and everyone loves it, but now they're not taking the bait to try to craft a short term solution to essentially what is already a valuable product, but not on the books. >> Yes, and also I think where the danger is, we know that there are generation shifts across channels.
So teenagers probably look at Facebook, I think one of them said, like an awkward family dinner they can't quite leave. But for next gen, they're just not going to go there, 'cause that's where your grandmother is. So the same is true of Twitter and Snapchat, these platforms come and go. It's an interesting phenomenon then to see Wall Street putting that much money into something which is essentially quite ephemeral. I'm not saying that Twitter won't be around for years, it may be, but that's the thing about digital, isn't it? Something else comes in and it's well, that becomes the platform of choice. >> Well, it's interesting, right? Everybody, us included, we criticize the... Michael Dell calls it the 90 day shock clock. But it's actually worked out pretty well, I mean, economically, for the United States companies. Maybe it doesn't in the future. What are your thoughts on that, particularly from a European perspective? Where you're reporting maybe twice a year, there's not as much pressure, but yet from a technology industry standpoint, companies outside the Silicon Valley in particular seem to be less competitive, why? >> For example, in our company, in Transport API, we've got some pretty heavyweight clients, we have a wonderful angel investor who has given us two rounds of investment. And it isn't that kind of avaricious absolutely built this super price. And that's allowed us to build from starting off with 2, now to a team of 10, and we're just about coming into break even, so it's doable. But I think it's a philosophy. We didn't want necessarily to build something huge, although we want to go global, but it was let's do this in a sustainable way with reasonable wages, and we've all put our own soul and money into it, but it's a different cultural proposition, I think. >> Well, the valuations always drive the markets. 
It's interesting too, to your point about channels that come and go, it kind of reminds me, Dave and I used to joke about social networks like nightclubs, they're hot and then it's just too crowded and nobody goes there, as Yogi Berra would say. And then they shift and they go out of business, some don't open with fanfare, no one goes 'cause it's got different context. You have a contextual challenge in the world now. Technology can change things, so I want to ask you about identity 'cause there was a great article posted by the founder of the company called Secret which is one of these anonymous apps like Yik Yak and whatnot, and he shut it down. And he wrote a post, kind of a postmortem, saying, "These things come and go, they don't work, "they're not sustainable because there's no identity." So the role of identity in a social global virtual world, virtual being not just virtual reality, is interesting. You live in a world, and your company, Transport API, provides data which enables stuff and the role of identity. So anonymous versus identity, thoughts there, and that impact to the future of work? If you know who you're dealing with, and if they're present, these are concepts that are now important, presence, identity, attention. >> And that's the interesting thing, isn't it? Who controls that identity? Mark Zuckerberg said, "You only have one identity," which is what he said when he set up Facebook. You think, really? No, that's what a young person thinks. When we're older, we know. >> He also said that young people are smarter than older people. >> Yeah, right, okay. (John laughs) He could be right there, he could be right there, but we all have different identities in different parts of our lives. Who we are here, at the Hadoop Summit, is different from who we are at home or when we're with friends. So identity is a multifaceted thing. But also, who gets to determine your identity? So I have 16 years of my search life in Google.
Now, who am I in that server, compared to who I am? I am the sum total of my searches. But I'm not just the sum total of my searches, am I? Or even that contextualized, so I'll give you an example. A number of years ago I was searching for a large, very large waterproof plastic bag. And I typed it in, and I thought, "Oh my god, that sounds like I'm going to murder my husband "and try to bury him." (John and Dave laugh) It was actually-- >> John: Into the compost. >> Right, right. And I thought, "Oh my god, what does this look like "on the other side?" Now, it was actually for my summer garden furniture. But the point is, if you looked at that in an analytic way, who would I be? And so I think identity is very, you know-- >> John: Mistaken. >> Yeah, and also this idea of what Frank Pasquale calls the black box society. These secret algorithms that are controlling flows of money and information. How do they decide what my identity is? What are the moral decisions that they make around that? What does it say if I search for one thing over another? If I search constantly for expensive shoes, does that make me shallow? What do these things say? If I search for certain things around health. >> And there's a value judgment now associated with that that you're talking about, that you do not control. >> Absolutely, and which is probably linked to other things which will determine things like whether I get credit or not, but these can almost be arbitrary decisions, 'cause I have no oversight of the logic that's creating that decision making algorithm. So I think it's not just about identity, it's about who's deciding what that identity is. >> And it's also the reality that you're in, context, situations. Dark side, bright side of technology in this future where this new digital asset economy, digital capital. There's going to be good and bad, education can be consumed non-linear, new forms of consumptions, metadata, as you're pointing out, with the algorithms. 
Where do you see some bright spots and where do you see the danger areas? >> I think the great thing is, when you were saying software is the future. It's our present, but it's going to be even more so in our future. Some of the brightest brains in the world are involved in the creation of new technology. I just think they need to be focusing a bit more of that intellectual rigor towards the impact they're having on society and how they could do it better. 'Cause I think it's too much of a technocratic solution. Technologists say, "We can do this." The question is, should they? So I think what we need to do is to loop them back into the more social and philosophical side of the discussion. And of course it's a wonderful thing, hopefully technology is going to do amazing things around health. We can't even predict how amazing it's going to be. But all I'm saying is that, if we don't ask the hard questions now about the downsides, we're going to be in a difficult societal position. But I'm hoping that we will, and I'm hoping that raising issues like techno ethics will get more of that discussion going. >> Well, transparency and open data make a big difference. >> Emer: Absolutely. >> Well, and public policy, as you said earlier, can play a huge role here. I wonder if you could give us your perspective on... Public policy, we're in the US most of the time, but it's interesting when we talk to customers here. To hear about the emphasis, obviously, on privacy, data location and so forth, so in the digital world, do you see Europe's emphasis and, I think, leading on those types of topics as an advantage in a digital world, or does it create friction from an economic standpoint? >> Yeah, but it's not all about economics. Friction is a good thing. There are some times when friction is a good thing. Most technologists think all friction is bad. 
>> Sure, and I'm not implying that it's necessarily good or bad, I'm curious though, is it potentially an economic advantage to have thought through and have policy on some of those issues? >> Well, what we're seeing here-- >> Because I feel like the US is a ticking time bomb on a lot of these issues. >> I was talking to VCs, some VC friends of mine here in the UK, and what they said they're seeing more and more, VCs asking what we call SMEs, small to medium enterprises, about their data policies, and SMEs not being able to answer those questions, and VCs getting nervous. So I think over time it's going to be a competitive advantage that we've done that homework, that we're basically not just rushing to get more users, but that we're looking at it across the piece. Because, fundamentally, that's more sustainable in the longer term. People will not be dumb too forever. They will not, and so doing that thinking now, where we work with people as we create our technology products, I think it's more sustainable in the long term. When you look at economics, sustainability is really important. >> I want to ask you about the Transport API business, 'cause in the US, same thing, we've seen some great openness of data and amazing innovations that have come out of nowhere. In some cases, unheard of entrepreneurs and/or organizations that better society for the betterment of people, from delivering healthcare to poor areas and whatnot. What has been the coolest thing, or of things you've seen come out of your enablement of the transport data. Use cases, have you seen any things that surprised you? >> It's quite interesting, because when I worked for the mayor of London as his director of digital projects, my job was to set up the London data store, which was to open all of London's public sector data. So I was kind of there from the beginning as a lobbyist, and when I was asking agencies to open up their data, they'd go, "What's the ROI?" And I'd just say, "I don't know." 
Because government's one and oh, I'm saying that was a chicken and egg, you got to put it out there. And we had a funny incident where some of the IT staff in Transport for London accidentally let out this link, which is to the Trackernet feed, and that powers the tube notice boards that say, "Your next tube is in a minute," whatever. And so the developer community went, "Ooh, this is interesting." >> John: Candy! >> Yeah, and of course, we had no documentation with it because it kind of went out under the radar. And one developer called Matthew Somerville made this map which showed the tubes on a map in real time. And it was like surfacing the underground. And people just thought, "Oh my god, that is amazing." >> John: It's illuminating. >> Yeah. It didn't do anything, but it showed the possibility. The newspapers picked it up, it was an absolutely brilliant example, and the guy made it in half a day. And that was the first time people saw their transport system kind of differently. So that was amazing, and then we've seen hundreds of different applications that are being built all the time. And what we're also seeing is integration of transport data with other things, so one of our clients in Transport API is called Toothpick, and they're an online dental booking agency. And so you can go online, you can book your dental appointment with your NHS dentist, and then they bake in transport information to tell you how to get there. So we have pubs using them, and screens so people can order their dinner, and then they say, "You've got 10 minutes till the next bus." So all sorts of cross-platform applications. >> That you never could've envisioned. >> Emer: Never. >> And it's just your point earlier about it's not a zero sum game, you're giving so many ways to create value. >> Emer: Right, right. 
>> Again, I come back to this notion of education and creativity in the United States education system, so unattainable for so many people, and that's a real concern, and you're seeing the middle class get hollowed out. I think the stat is, the average wage in the United States was 55,000 in 1999, it's 50,000 today. The political campaigns are obviously picking at that scab. What's the climate like in Europe from that standpoint? >> In terms of education? >> No, just in terms of, yes, the education, middle class getting hollowed out, the sentiment around that. >> I don't think people are up to speed with that yet, I really don't think that they're aware of the scale. I think when they think robots or automation, they don't really think software. They think robots like there were in the movies, that would come, as I say, and do those jobs nobody wanted. But not like software. So when I say to them, look, E-discovery software, when it's applied retrospectively, what it shows is that human lawyers are only 60% accurate compared to it. Now, that's a no-brainer, right? If software is 100% accurate, I'm going to use the software. And the ratio difference is 1 to 500. Where you needed 500 lawyers before you need 1. So I don't think people are across the scale of change. >> But it's interesting, you're flying to Heathrow, you fly in and out, you're dealing with a kiosk. You drive out, the billboards are all electronic. There aren't guys doing this anymore. So it's tangible. >> And I think, to your point about education, I'm not as familiar with the education system in the US, but I certainly think, in Europe and in the UK, the education system is not capable of dealing even with the latest digital natives. They're still structuring their classrooms in the same way. These kids, you know-- >> John: They have missed the line with the technology. >> Absolutely. >> So reading, writing and arithmetic, fine. And the cost of education is maybe acceptable. 
But they may be teaching the wrong thing. >> Asynchronous non-linear, is the thing. >> There's a wonderful example of an Indian academic called Sugata Mitra, who has a fabulous project called a Hole in the Wall. And he goes to non-English speaking little Indian villages, and he builds a computer, and he puts a roof over it so only the children can do it. They don't speak English. And he came back, and he leaves a little bit of stuff they have to get around before they can play a game. And he came back six months later, and he said to them, "What did you think?" And one of the children said, "We need a faster CPU and a better mouse." Now, his point is self-learning, once you have access to technology, is amazing, and I think we have to start-- >> Same thing with the non-linear consumption, asynchronous, all this, the API economy enabling new kinds of expectation and opportunities. >> And it was interesting because the example, some UK schools tried to follow his example. And six months later, they rang him up and they said, "It's not working," and he said, "What did you do?" And they said, "Well, we got every kid a laptop." He said, "That's not the point." The point was putting a scarce resource that the children had to collaborate over. So in order to get to the game, they had to figure out certain things. >> I think you're right on some of these (mumbles) that no one's talking about. And Dave and I are very passionate on this, and we're actually investing in a whole new e-learning concept. But it's not about doing that laptop thing or putting courseware online. That's old workflow in a new model. Come on, old wine in a new bottle. So that's interesting. I want to get your thoughts, so a personal question to end this segment. What are you passionate about now, what are you working on, outside of the venture, which is exciting. 
You have a lot of background going back to technology entrepreneurship, public policy, and you're in the front lines now, thought leading on this whole new wide open sea of opportunity, confusion, enabling it. What are you passionate about, what are you working on? Share with the folks that are watching. >> So one of the main things we're trying to do. I work as an associate with Ernst & Young in London. And we've been having discussions over the past couple of months around techno ethics, and I've basically said, "Look, let's see if we can get EY to build an EY good governance index." Like, what does good governance look like in this space, a massively complex area, but what I would love is if people would collaborate with us on that. If we could help to draw up an ethical framework that would convene the technology industry around some ethical good governance issues. So that's what I'm going to be working on as hard as I can over the next while, to try and get as much collaboration from the community, because I think we'd be so much more powerful if the technology industry was to say, "Yeah, let's try and do this better rather than waiting for regulation," which will come, but will be too clunky and not fit for purpose. >> And which new technology that's emerging do you get most excited about? >> Hmm. Drones. (laughter) >> How about anything with bitcoin, block chains? >> Absolutely, absolutely, block chain. Yeah, block chain, you have to say, yeah. I think, 'cause bitcoin, you know, it's worth 20 p today, it's worth 200,000 tomorrow. >> Dave: Yeah, but block chain. >> Right, right. I mean, that is incredible potentiality. >> New terms like federated, that's not a new term, but federation, universal, unification. These are the themes right now. >> Emer: Well, it's like the road's been coated, isn't it? And we don't know where it's going to go. What a time we live in, right? 
>> Emer Coleman, thank you so much for spending your time and joining us on theCUBE here, we really appreciate the conversation. Thanks for sharing that great insight here on theCUBE, thank you. It's theCUBE, we are live here in Dublin, Ireland. I'm John Furrier with Dave Vellante. We'll be right back with more of SiliconANGLE's theCUBE, extracting the signal from the noise, after this short break. (bright music)

Published Date : Apr 14 2016


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Jaron Lanier	PERSON	0.99+
John	PERSON	0.99+
Europe	LOCATION	0.99+
Emer Coleman	PERSON	0.99+
55,000	QUANTITY	0.99+
Disruption Limited	ORGANIZATION	0.99+
US	LOCATION	0.99+
10 minutes	QUANTITY	0.99+
John Furrier	PERSON	0.99+
four	QUANTITY	0.99+
100%	QUANTITY	0.99+
Mark Zuckerberg	PERSON	0.99+
UK	LOCATION	0.99+
1999	DATE	0.99+
Google	ORGANIZATION	0.99+
Frank Pasquale	PERSON	0.99+
Ernst & Young	ORGANIZATION	0.99+
Zuckerberg	PERSON	0.99+
Emer	PERSON	0.99+
200,000	QUANTITY	0.99+
London	LOCATION	0.99+
16 years	QUANTITY	0.99+
Open Data Governance Board	ORGANIZATION	0.99+
Heathrow	LOCATION	0.99+
Michael Dell	PERSON	0.99+
1	QUANTITY	0.99+
Facebook	ORGANIZATION	0.99+
Twitter	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
50,000	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Sugata Mitra	PERSON	0.99+
500 lawyers	QUANTITY	0.99+
Dublin, Ireland	LOCATION	0.99+
yesterday	DATE	0.99+
United States	LOCATION	0.99+
Dublin, Ireland	LOCATION	0.99+
Who Owns The Future	TITLE	0.99+
two things	QUANTITY	0.99+
tomorrow	DATE	0.99+
today	DATE	0.99+
20 p	QUANTITY	0.99+
two rounds	QUANTITY	0.99+
13 people	QUANTITY	0.99+
half a day	QUANTITY	0.99+
Ireland	LOCATION	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
NHS	ORGANIZATION	0.99+
90 day	QUANTITY	0.99+
United States	LOCATION	0.99+
one	QUANTITY	0.99+
13 employees	QUANTITY	0.98+
English	OTHER	0.98+
billion	QUANTITY	0.98+
500	QUANTITY	0.98+
Hadoop Summit	EVENT	0.98+
six months later	DATE	0.98+

Brett Rudenstein - Hadoop Summit 2014 - theCUBE - #HadoopSummit

>> Narrator: theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks, "We do Hadoop," and headline sponsor WANdisco, "We make Hadoop invincible." >> Okay, welcome back. We're here at Hadoop Summit, live in Silicon Valley. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, with Jeff Frick, drilling down on the topics. We're here with WANdisco. Welcome, Brett Rudenstein, senior director. Tell us what's going on for you guys. Obviously a big presence here. Saw all the guys last night, you guys have a great booth. So (mumbles) and the crew, what's happening? >> Yeah, I mean, the show is going very well. What's really interesting is we have a lot of very, very technical individuals approaching us. They're asking us some of the tougher, more technical, in-depth questions about how our consensus algorithm is able to do all this distributed replication, which is really great, because there's a little bit of disbelief, and then of course we get to do the demonstration for them and suspend disbelief, if you will. And I think the attendance has been great for our booth. >> Okay, I always get that. We always have the geek conversations, you guys are a very technical company. Jeff and I always comment, certainly Dave Vellante and Jeff Kelly, that WANdisco has their fair share of geeks, dudes who know what they're talking about, so I'm sure you get that. But now, on the business side, you talk to customers. I want to get more into the outcomes. That seems to be the show focus this year, that Hadoop is serious. What are some of the outcomes that your customers are talking about when they get you guys in there? What are their business issues, what are they working on to solve? >> Yeah, I mean, I think the first thing is to look at why they're looking at us, and then at the particular business issues that we solve. And the first thing, and sort of 
the trend that we're starting to see, is the prospects and the customers that we have are looking at us because of the data that they have, and it's data that matters. So it's important data, and that's when people start to come to us, that's when they look to us, as they have data that's very important to them. In some cases, if you saw some of the UCI stuff, you see that the data is doing live monitoring of various patient activity, where it's not just about monitoring a life but potentially about saving the life, and systems that go down not only can't save lives, they can potentially lose them. >> So you have a demo. You want to jump into this demo here? What is this all about? >> The demonstration that I'm going to do for you today, I want to show you our Non-Stop Hadoop product. I'm going to show you how we can basically stand up a single HDFS, or a single Hadoop cluster, across multiple data centers. And I think that's one of the tough things that people are really having trouble getting their heads wrapped around, because most people, when they do multi-data-center Hadoop, tend to do two different clusters and then synchronize the data between the two of them. The way they do that is they'll use Flume, or they'll use some form of parallel ingest, or they'll use technologies like DistCp to copy data between the data centers. And each one of those has sort of an administrative burden on it, and then various flaws in the underlying architecture that don't allow them to do a really detailed job of ensuring that all blocks are replicated properly, that no mistakes are ever made. And again, there's the administrative burden, somebody who always has to have eyes on the system. We alleviate all those things. So I think the first thing I want to start off with, we had somebody come to our booth and we were talking about this consensus algorithm that we perform and the way we 
synchronize multiple NameNodes across multiple geographies. And again, in that sort of spirit of disbelief, I said, one of the key tenets of our application is it doesn't change the behavior of the application when you go from LAN scope to WAN scope. And so I said, for example, if you create a file in one data center, and 3,000 miles or 7,000 miles apart from that you were to hit the same create file operation, you would expect that the right thing happens, which is somebody gets "file created" and somebody gets "file already exists," even if at 7,000 miles distance they both hit this button at the exact same time. I'm going to do a very quick demonstration of that for you here. I'm going to put a file into HDFS. My top right-hand window is in Northern Virginia, and then 3,000 miles distance from that, my bottom right-hand window is in Oregon. I'm going to put the /etc/hosts file into a temp directory in Hadoop at the exact same time, 3,000 miles apart, and you'll see that exact behavior. So I've just launched them both, and again, if you look at the top window, the file is created. If you look at the bottom window, it says file already exists. It's exactly what you'd expect of a LAN-scope application and the way you'd expect it to behave. So that is how we ensure consistency, and that was the question that the prospect had. >> At that distance, even the speed of light takes a little time, right? So what are some of the tips and tricks you can share that enable you guys to do this? >> Well, one of the things is that our consensus algorithm is a majority-quorum-based algorithm. It's based off of a well-known consensus algorithm called Paxos. We have a number of significant enhancements and innovations beyond that: dynamic membership, automatic scale, and things of that nature. But in this particular case, every transaction that goes into our system gets a global sequence number, and what we're able to do is ensure 
that those sequence numbers are executed in the correct order. So you can't put a delete before a create. Everything has to happen in the order that it actually occurred in, regardless of the WAN distance between data centers. >> So what is the biggest aha moment you get from customers when you show them the demo? Is it the replication, is it availability? What is the big feature focus that they jump on? >> Yeah, I think the biggest ones are basically when we start crashing nodes while we're running jobs, when we separate the WAN link, and maybe I'll just do that for you now. So let's kick into the demonstration here. What I have here is a single HDFS cluster. It is spanning two geographic territories, so it's one cluster: part of it is in Northern Virginia and the other part is in Oregon. I'm going to drill down into the graphing application here, and inside you see all of the NameNodes. So you see I have three NameNodes running in Virginia, three NameNodes running in Oregon. And the demonstration is as follows: I'm going to run TeraGen and TeraSort. In other words, I'm going to create some data in the cluster, I'm then going to sort it into a total order, and then I'm going to run TeraValidate in the alternate data center and prove that all the blocks replicated from one side to the other. However, along the way I'm going to create some failures. I am going to kill some of the active NameNodes during this replication process, I am going to shut down the WAN link between the two data centers during the replication process, and then show you how we heal from those kinds of conditions, because our algorithm treats failure as a first-class citizen, so there's really no way to deal in the system, if you will. So let's start (mumbles). So let's go ahead and run the TeraGen and the TeraSort. I'm going to put it in the directory called cube one. So we're creating about 400 
megabytes of data, so a fairly small set that we're going to replicate between the two data centers. Now, the first thing that you see over here on the right-hand side is that all of these NameNodes kind of sprung to life. That is because in an active-active configuration with multiple NameNodes, clients actually load balance their requests across all of them. Also, it's a synchronous namespace, so any change that I make to one immediately occurs on all of them. The next thing you might notice in the graphing application is these blue lines over in, and only in, the Oregon data center. The blue lines essentially represent what we call a foreign block, a block that has not yet made its way across the wide area network from the site of ingest. Now, we move these blocks asynchronously from the site of ingest so that I have LAN-speed performance. In fact, you can see I just finished the TeraGen part of the application, all at the same time pushing data across the wide area network as fast as possible. Now, as we start to get into the next phase of the application here, which is going to run TeraSort, I'm going to start creating some failures in the environment. So the first thing I'm going to do is pick two NameNodes. I'm going to fail a local NameNode, and then we're also going to fail a remote NameNode. So let's pick one of these. I'm going to pick hdp2, that's the name of the machine, so I'm going to SSH to hdp2 and I'm just going to reboot that machine. So as I hit the reboot button, the next time the graphing application updates, what you'll notice here in the monitor is a flat line, so it's no longer taking any data in. But if you're watching the application on the right-hand side, there's no interruption of the service. The application is going to continue to run, and you'd expect that to happen maybe in a LAN-scope cluster, but remember, this is a single cluster at WAN scope, with 3,000 miles between the two of them. So I've killed one of the six active NameNodes.
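For readers following the demo, the ordering guarantee described above (every transaction gets a global sequence number, and each site applies transactions in that order, so simultaneous creates resolve to "created" and "file already exists" everywhere) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not WANdisco's implementation: the `Replica` class, its `deliver` method, and the hard-coded `agreed` list are hypothetical names, and the consensus step that assigns sequence numbers is assumed rather than modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that applies namespace operations strictly in
    global-sequence-number (GSN) order. Illustration only."""
    name: str
    next_gsn: int = 1
    files: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def deliver(self, gsn, op, path):
        # Buffer the operation, then apply every operation that is now in order.
        self.pending[gsn] = (op, path)
        while self.next_gsn in self.pending:
            next_op, next_path = self.pending.pop(self.next_gsn)
            self.log.append(self._apply(next_op, next_path))
            self.next_gsn += 1

    def _apply(self, op, path):
        if op == "create":
            if path in self.files:
                return f"{path}: file already exists"
            self.files.add(path)
            return f"{path}: created"
        if op == "delete":
            if path not in self.files:
                return f"{path}: no such file"
            self.files.remove(path)
            return f"{path}: deleted"
        raise ValueError(f"unknown op {op!r}")


# Agreed order, assumed to come from a consensus layer that is not modeled here.
agreed = [
    (1, "create", "/tmp/hosts"),
    (2, "create", "/tmp/hosts"),
    (3, "delete", "/tmp/hosts"),
]

virginia, oregon = Replica("virginia"), Replica("oregon")
for gsn, op, path in agreed:
    virginia.deliver(gsn, op, path)
for gsn, op, path in reversed(agreed):  # same operations, arriving out of order
    oregon.deliver(gsn, op, path)
```

Because the second replica buffers operations until the gap in sequence numbers is filled, both replicas end up with the same log even though the messages arrived in a different order.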
The next thing I'm going to do is kill one of the NameNodes over in the Oregon data center. So I'm going to go ahead and SSH into, I don't know, let's pick the bottom one, hdp9 in this case, and then again, another reboot operation. So I've just rebooted two of the six NameNodes while running the job. But again, if you look in the upper right-hand corner, the job running in Oregon, the job running in Northern Virginia, continues without any interruption. And see, we just went from 84 to 88 percent MapReduce, and so forth. So again, uninterrupted, what we like to call continuous availability at WAN distances. >> Explain that. What is continuous availability at WANs? Because that's really important to drill down on. >> Yeah, I mean, I think if you look at the difference between what people traditionally call high availability, that means that, generally speaking, the system is there, there is a very short time that the system will be unavailable, and then it will become available again. A continuously available system ensures that regardless of the failures that happen around it, the system is always up and running, something is able to take the request. And in a leaderless system like ours, where no one single node actually creates a leadership role, we're able to continue replication and we're also able to continue the coordination. >> So that's two distinct things: high availability, which everyone kind of knows and loves, expensive, and then continuous availability, which is a little bit kind of the son or cousin, I guess, you know what I'm saying. Can you put that in context, in cost and implementation? >> From the perspective of a WANdisco deployment, it's kind of a continuously available system, even though people look at us as somewhat traditional disaster recovery, because we are replicating data to another data center. But remember, it's active-active. That means both data centers are able to write at the same time. You get to maximize your cluster 
resources. And again, if we go back to one of the first questions you asked, what are customers doing with this, what do prospects want to do? They want to maximize their resource investment. If they have half a million dollars sitting in another data center that is only able to perform in an emergency recovery situation, that means they either have to (a) scale the primary data center, or (b), what they want to do, utilize existing resources in an active-active configuration, which is why I say continuous availability. They're able to do that in both data centers, maximizing all their resources. >> So versus, the consequences of not having that would be? >> The consequences of not being able to do that is you have a one-way synchronization. A disaster occurs, you then have to bring that data center online, you have to make sure that all the appropriate resources are there, you have an administrative burden. That means a lot of people have to go into action very quickly. >> Without the WANdisco system, right, what would that look like? I mean with time, effort, cost. Do you have any kind of order of magnitude spec? Like, hey, we called some guy up, said, "Dude, get in the office, log in"? >> You have to look at individual customer service level agreements. A number that I hear thrown out very, very often is about 16 hours: "We can be back online within 16 hours." Really, RTO for a WANdisco deployment is essentially zero, because both sites are active. You're able to essentially continue without any downtime. >> Some would say that's continuous availability, that's high availability, because it's essentially zero. Sixteen hours, I mean, any time down is bad, but 16 hours is huge. >> Yeah, that's the service level agreement, then everyone says, "But we know we can do it in five hours." The other part of that, of course, is ensuring that once a year somebody runs through the emergency (mumbles) procedure to know that they truly can be back up and online in the service level 
agreement timeframe. So again, there's a tremendous amount of effort that goes into the ongoing administration. >> Some great comments here on our CrowdChat, crowdchat.net/hadoopsummit, join the conversation. Obviously, we have one that says, "Nice." He's talking about how the system has latency. "The demo is pretty cool, the map was excellent, excellent visual." Dave Vellante just weighed in and said he did a survey with Jeff Kelly, said a large portion, 27 percent of respondents, said lack of enterprise-grade availability was the biggest barrier to adoption. Is this what you're referring to? >> Yeah, this is exactly what we're seeing. People are not able to meet the uptime requirements, and therefore applications stay in proof-of-concept mode, or those that make it out of proof of concept are heavily burdened by administrators and a large team to ensure that same level of uptime that can be handled without error through software configuration like WANdisco. >> So another comment from Burt, thanks Burt for watching: "There's availability, how about security?" >> Yeah, so security is a good one. Of course, we run on standard Hadoop distributions, and as such, if you want to run your cluster with on-wire encryption, that's okay. If you want to run your cluster with Kerberos authentication, that's fine. We fully support those environments. >> Got a new use case for CrowdChat, (mumbles) the questions. Got more coming in, so send them in, we're watching the CrowdChat, crowdchat.net/hadoopsummit. Great questions. And I think people have a hard time parsing HA versus continuous availability, because you can get confused between the two. Is it semantics, or is it infrastructure concerns? How do you differentiate between those two definitions? >> You know, I think part of it is semantics, but also, from a WANdisco perspective, we like to differentiate because there really isn't that moment of downtime, there 
really isn't that switchover moment where something has to fail over and then go somewhere else. That's why I use the term continuous availability. The system is able to simply continue operating, with clients load-balancing their requests to available nodes. In a similar fashion, when you have multiple data centers, as I do here, I'm able to continue operations simply by running the jobs in the alternate data center. Remember that it's active-active, so any data ingested on one side immediately transfers to the other. So let me do the next part. I showed you one failure scenario; you've seen that all the nodes have actually come back online and self-healed. For the next part of this I want to do a separation. I want to run it again, so let me kick that off. I'll create another directory structure here, only this time I'm going to actually chop the network link between the two data centers. And then after I do that, I'm going to show you some of our new products in the works and give you a demonstration of that as well.
>>Well, that's fair enough, Brett. What are some of the applications that this enables people to use Hadoop for that they were afraid to before?
>>Well, when we look at our customer base and our prospects who are evaluating our technologies, it opens up all the regulated industries: things like pharmaceutical companies, financial services companies, healthcare companies, all these people who have strict regulations and auditing requirements, and who now have a very clear, concise way to prove not only that they're replicating data but that the data has actually made its way across. They can prove that it's in both locations, and not just that it's in both locations but that it's the correct data. Sometimes we see, in cases like DistCp copying files between data centers, that a file isn't actually copied because the tool thinks it's the same, but there is a slight difference between the two. When the cluster diverges like that, it's days
of administration, or hours, depending on the size of the cluster, to actually figure out what went wrong, what went different. And then of course you have to involve multiple users to figure out which one of the two files that you have is the correct one to keep. So let me go ahead and stop the WAN link here. Of course, with WANdisco technology there's nothing to keep track of. You simply allow the system to do HDFS replication, because it is essentially native HDFS. So I've stopped the tunnel between the two data centers while running this job. One of the things that you're going to see on the left-hand side is that it looks like all the nodes no longer respond. Of course, that's just because I have no visibility to those nodes. They're no longer replicating any data, because the tunnel between the two has been shut down. But if you look on the right-hand side of the application, the upper right-hand window, you see that the MapReduce job is still running; it's unaffected. And what's interesting is, once I start the tunnel up again between the two data centers, I'll immediately start replicating data. This is at the block level. So again, when we look at other copy technologies, they are doing things at the file level. So if you had a large file, 10 gigabytes in size, and for some reason your file transfer crashed when you were seventy percent through, you're starting that whole transfer again. Because we're doing block replication, if you had seventy percent of your blocks that had already gone through, like perhaps what I've done here, then when I start the tunnel back up, which I'm going to do now, what's going to happen, of course, is we just continue with those blocks that simply haven't made their way across the net. So I've started the tunnel back up. The monitor, you'll see, springs back to life. All the NameNodes will have to resync, since they've been out of sync for some period
of time. They'll learn any transactions that they missed, they'll heal themselves into the cluster, and we immediately start replicating blocks. And then, to kind of show you the bi-directional nature of this, I'm going to run TeraValidate in the opposite data center, over in Oregon, and I'll just do it on that first directory that we created. And what you'll see is that we now wind up with foreign blocks on both sides. I'm running applications at the same time across data centers, in a fully active-active configuration, in a single Hadoop cluster.
>>Okay, so the question on that one is, what is the net-net? Summarize that demo real quick. Bottom line, in two sentences, what's important?
>>Bottom line is, if NameNodes fail, if the WAN fails, you are still continuously operational.
>>Okay, so we have questions from the commentary here, from the CrowdChat: "Does this eliminate the need for backup? And what is actually transferring? Certainly not petabytes of data?"
>>I mean, you somewhat have to transfer what's important. I suppose if it was important for you to transfer a petabyte of data, then you would need the bandwidth to support a transfer of a petabyte of data.
>>But we talked to a lot of Hollywood studios. We were at OpenStack Summit, and that was a big concern. A lot of people are moving to the cloud for workflow and for optimization. The Star Wars guys were telling us off the record that, you know, the new film is in remote locations. They set up data centers basically in the desert, and they've actually got provisioned infrastructure. So, huge issues.
>>Yeah, absolutely. So what we're replicating, of course, is HDFS. In this particular case I'm replicating all the data in this fairly small cluster between the two sites. This demo is only between two sites; I could add a third site, and then a failure between any two would actually still allow complete availability of all the other sites that still participate in the algorithm.
>>Brett, great to
have you on. I want to get the perspective from you, in the trenches, out with customers, on what's going on. And WANdisco: tell us about the culture there, what's going on at the company, what it's like to work there. What are the guys like? I mean, we know some of the dudes there. Cos, we always drink some vodka with him, because, you know, he likes to tip back a little bit once in a while. Great guy, great geeks. But what's it like at WANdisco?
>>I think, you know, you touched on a little piece of it at first: there are a lot of smart people at WANdisco. In fact, when I first came on board I was like, wow, I'm probably the least smart person at this company. But culturally this is a great group of guys. They like to work very hard, but equally they like to play very hard. And as you said, I've been out with Cos several times myself. These are all great guys to be out with. The culture is great, it's a great place to work, and, you know, people who are interested should certainly...
>>Yeah, great culture, and it fits in. We were talking last night: very social crowd here. You know, something with a Hortonworks guy, so javi medicate fortress ada, just saw him walk up. IBM's here. People are really sociable. This event really has a camaraderie feel to it, but yet it's serious business. And at the end of the day they're all a bunch of geeks building an industry, and now it's got everyone's attention. Cisco's here, Intel's here, IBM's here. I mean, what's your take on the big guys coming in?
>>I mean, I think the big guys realize that Hadoop, the elephant, is as large as it appears. The elephant is in the room, and it's exciting, and everybody wants a little piece of it, as well they should.
>>Want a piece of it. Brett, thanks for coming on theCUBE, really appreciate it. WANdisco, you guys are a great, great company. We love to have your support. Thanks for supporting theCUBE, we appreciate it. We'll be right back after this short break with our next guest.
thank you
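
The block-level resume described in the demo can be illustrated with a toy sketch. This is not WANdisco's implementation, and every name in it (`RemoteSite`, `replicate_blocks`, `link_budget`) is hypothetical. It only assumes what the talk states: a file-level copy that fails partway restarts from the beginning, while block-level replication continues with the blocks that have not yet crossed the link. A 10-block file stands in for the 10-gigabyte file, and the link drops after 7 blocks, matching the "seventy percent" example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RemoteSite:
    """Toy stand-in for the second data center: block id -> block payload."""
    blocks: Dict[int, bytes] = field(default_factory=dict)


def replicate_blocks(source: List[bytes], remote: RemoteSite,
                     link_budget: Optional[int] = None) -> int:
    """Send only the blocks the remote site lacks; return how many were sent.

    link_budget simulates the WAN link dropping after that many blocks
    (None means the link stays up for the whole pass).
    """
    sent = 0
    for block_id, payload in enumerate(source):
        if block_id in remote.blocks:
            continue  # replicated before the outage, nothing to resend
        if link_budget is not None and sent >= link_budget:
            break  # simulated link failure mid-transfer
        remote.blocks[block_id] = payload
        sent += 1
    return sent


# A "file" of 10 blocks, standing in for the 10 GB file in the talk.
source_file = [bytes([i]) * 4 for i in range(10)]
oregon = RemoteSite()

sent_before_outage = replicate_blocks(source_file, oregon, link_budget=7)
sent_after_restore = replicate_blocks(source_file, oregon)

# A whole-file copy that failed at 70% would start over from block 0.
file_level_resend = len(source_file)

print(sent_before_outage, sent_after_restore, file_level_resend)
```

In this sketch the second pass sends only the three missing blocks, whereas a file-level retry would resend all ten. A real system would also have to verify block contents and reconcile metadata, which the sketch leaves out.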

Published Date : Jun 4 2014

**Summary and sentiment analysis are not shown because of an improper transcript**

ENTITIES

Entity	Category	Confidence
two sites	QUANTITY	0.99+
Jeff Kelly	PERSON	0.99+
seventy percent	QUANTITY	0.99+
Oregon	LOCATION	0.99+
3,000 miles	QUANTITY	0.99+
Virginia	LOCATION	0.99+
Jeff Rick	PERSON	0.99+
Burt	PERSON	0.99+
84	QUANTITY	0.99+
Northern Virginia	LOCATION	0.99+
North Virginia	LOCATION	0.99+
two	QUANTITY	0.99+
five hours	QUANTITY	0.99+
7,000 miles	QUANTITY	0.99+
two data centers	QUANTITY	0.99+
Brett	PERSON	0.99+
Star Wars	TITLE	0.99+
10 gigabytes	QUANTITY	0.99+
half a million dollars	QUANTITY	0.99+
16 hours	QUANTITY	0.99+
Brett Rudenstein	PERSON	0.99+
Jeff	PERSON	0.99+
both locations	QUANTITY	0.99+
two sentences	QUANTITY	0.99+
two files	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
two datacenters	QUANTITY	0.99+
one	QUANTITY	0.99+
two different clusters	QUANTITY	0.99+
both sides	QUANTITY	0.99+
both sites	QUANTITY	0.99+
first directory	QUANTITY	0.98+
third site	QUANTITY	0.98+
first thing	QUANTITY	0.98+
first	QUANTITY	0.98+
Cisco	ORGANIZATION	0.98+
twenty-seven percent	QUANTITY	0.98+
John	PERSON	0.98+
one side	QUANTITY	0.97+
Britain	LOCATION	0.97+
today	DATE	0.97+
two definitions	QUANTITY	0.97+
OpenStack	EVENT	0.96+
Hortonworks	ORGANIZATION	0.96+
eighty eight percent	QUANTITY	0.96+
last night	DATE	0.96+
both data centers	QUANTITY	0.94+
each one	QUANTITY	0.94+
zero	QUANTITY	0.94+
once a year	QUANTITY	0.94+
one failure	QUANTITY	0.93+
the cube and hadoop summit 2014	EVENT	0.93+
two geographic territory	QUANTITY	0.93+
Intel	ORGANIZATION	0.92+
both	QUANTITY	0.92+
single	QUANTITY	0.92+
this year	DATE	0.91+
one data center	QUANTITY	0.91+
dupe summit	EVENT	0.9+
Brett room Stein	PERSON	0.9+

Jack Norris - Hadoop Summit 2014 - theCUBE - #HadoopSummit

>>theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks, "We do Hadoop," and headline sponsor WANdisco, "We make Hadoop invincible."
>>Okay, welcome back, everyone, live here in Silicon Valley in San Jose. This is Hadoop Summit. This is SiliconANGLE and Wikibon's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, joined by my cohost, Jeff Kelly, top big data analyst in the community. Our next guest is Jack Norris, CMO of MapR. Security and enterprise: that's the buzz of the show, and it was the buzz of OpenStack Summit, another open source show. And here this year, you're just seeing move after move at the moment, talking about a couple of critical issues. Enterprise-grade Hadoop: Hortonworks announced a big acquisition, went all in, as they said, and now Cloudera follows suit with their news today. Are you sitting back saying they're catching up to you guys? I mean, how do you look at that? Because you guys have the security stuff nailed down. So how do you feel about that now?
>>I think if you look at the Hadoop market, it's definitely moving from a test, experimental phase into a production phase. We've got tremendous customers across verticals that are doing some really interesting production use cases. And we recognized very early on that really meeting the needs of customers required some architectural innovation. So we're combining the open source ecosystem packages with some innovations underneath to really deliver high availability, data protection, and disaster recovery features. Security is part of that. But if you can't protect the data, if you can't have multitenancy and separate workflows across the cluster, then it doesn't matter how secure it is. You know, you need those.
>>I've got to ask you a direct question since we're here at Hadoop Summit, because we get this question all the time:
"SiliconANGLE Wikibon is so successful, but I just don't understand your business model." Well, it's free content with some underwriters. So you guys have been very successful, yet people aren't looking at MapR as, you know, the quiet leader. You're doing your business, you're making money. Jeff, you shared some numbers with us that in the Hadoop community, about 20% are paying subscriptions. That's unlike your business model. So explain to the folks out there the business model, and specifically the traction, because you have customers.
>>Yeah. Oh no, we've got over 500 paying customers. We've got at least one $1 million customer in seven different verticals. So we've got breadth and depth, and our business model is simple. We're an enterprise software company that's looking at how to provide the best of open source as well as innovations underneath.
>>The most open distribution of Hadoop, but you add that value separately to that, right? So it's not so much that you're proprietary at all. Right? Okay. Can you clarify that?
>>Right. So if you look at this exciting ecosystem, Hadoop is fairly early in its life cycle. In a commoditization phase, like Linux or relational databases with MySQL, open source kind of equates to the whole technology. Here, at the beginning of this life cycle, the early stages of the life cycle, there are some architectural innovations that are really required. If you look at Hadoop, it's an append-only file system relying on Linux, and that really limits the types of operations, the types of use cases that you can do. What MapR has done is provide some deep architectural innovations: provide a complete read-write file system, integrate data protection with snapshots and mirroring, et cetera. So there's a whole host of capabilities that make it easy to integrate, enterprise secure, and scale much better.
>>Do you think... I feel like you were maybe a little early to the market, in the sense that we heard Merv Adrian in his keynote this morning talk about how, you know, it's about 10 years when you start to get these questions about security and governance, and we're about nine years into Hadoop. Do you feel like maybe you guys were a little early and now you're at a tipping point, where, as more and more deployments get ready to go to production, this is going to be an area that's going to become increasingly important?
>>I think our timing has been spectacular, because we kind of came out at a time when there were some customers that were really serious about Hadoop. We were able to work closely with them and prove our technology. And now, as the market is just ramping, we're here with all of those features that they need. And what's an issue is that an incremental improvement to provide those kinds of key features is not really possible if the underlying architecture isn't there, and it's hard to provide online, real-time capabilities in an underlying platform that's append-only. So the HDFS layer, written in Java, relying on the Linux file system, is kind of the weak underbelly, if you will, of the ecosystem. There are a lot of important developments happening, YARN on top of it, a lot of really exciting things. So we're actively participating in those, including Apache Drill, and on top of a complete read-write file system and an integrated Hadoop database, it just makes it all come to life.
>>Yeah. I mean, those things on top are critical, but it's the underlying infrastructure. You know, we asked the Wikibon community about that: what are the things that are really holding you back from putting Hadoop in production? And the biggest challenges they cited were high availability, backup and recovery, and maintaining performance at scale.
Those are the top three, and that's kind of where MapR has been focused, you know, since day one.
>>So if you look at a major retailer: 2,000 nodes on MapR, 50 unique applications running on a single cluster, 10,000 jobs a day running on top of that. If you look at the Rubicon Project, they recently went public: a hundred million, a hundred billion ad auctions a day on top of that platform. Beats Music, which just got acquired for $3 billion: basically it's the underlying MapR engine that allowed them to scale and personalize that music service. So there are a lot of proof points in terms of how quickly we scale, the enterprise-grade features that we provide, and kind of the blending of deep predictive analytics in a batch environment with online capabilities.
>>So I've got to ask you about your go-to-market. Obviously Cloudera and Hortonworks have different business models. Just talk about that. But Cloudera got the massive funding, so you get this question all the time. How do you counter that army and the arms race?
>>I think somebody just wrote an article in Forbes, and he says cash is not a strategy. And I think that was an excellent article. And he goes in and, you know, in this fast-growing market, an amount of money doesn't necessarily translate to architectural innovations or speeding the development of that. This is a fairly fragmented ecosystem in terms of the stack that runs on top of it. There's no single application or single vendor that kind of drives value. So an acquisition strategy is...
>>So your field sales force, is it direct or indirect, or both mixed? How do you handle that? Because Cloudera has got feet on the street, and every squirrel will find a nut if they're parking sales reps and SEs in all the enterprise accounts. You know, the squirrel's going to find a nut once in a while. Yeah. And they're going to actually try to engage the clients.
So, you know, I guess it is a strategy if they're deploying sales and marketing, right?
>>So the beauty about that, and in fact we're all in this together in terms of sharing an API and driving an ecosystem: it's not a fragmented market. You can start with one distribution and move to another without recompiling or doing any sort of changes. So it's a fairly open community. If this were a vendor lock-in, then spending money on brand, et cetera, would be important. Our focus is on the sales execution. Direct sales, yes, we have direct sales. We also have partners, and it depends on the geographies as to what that percentage is.
>>And John Schroeder was on with HP at Big Data NYC and updated us on the HP relationship.
>>Oh, excellent. In fact, we just launched our application gallery, App Gallery, to make it very easy for administrators, developers, and analysts to get access and understand what's available in the ecosystem. That's available directly on our website. And one of the featured applications there today is an integration with the MapR Sandbox and HP Vertica. So you can get early access, try it, and get the best of kind of enterprise-grade SQL.
>>First Hadoop app store, basically. Yeah, if you want to call it that. Right.
>>Sure. It's available. We launched with close to 30, with, you know, a whole wave kind of following that.
>>So talk a little bit about, you know, speaking of Vertica, SQL on Hadoop. There's a lot of talk about that, some confusion about the different methods for applying SQL on Hadoop. MapR takes an open approach. I know you'll support things like Impala from a competitor, Cloudera. Talk about that approach from MapR's perspective.
>>So I guess our perspective is kind of unbiased open source.
We don't try to pick and choose and dictate what's the right open source based on either our participation or some community involvement. And the reality is, with multiple applications being run on the platform, there are different use cases where different options make sense. So whether it's a Hive solution or, you know, Drill, which is available, or HP Vertica, people have the choice. And it's part of a broad range of capabilities that you want to be able to run on the platform for your workflows, whether it's SQL access or MapReduce or a Spark framework, Shark, et cetera.
>>So, yeah, I mean, there are so many different options: there's Spark, you can run HP Vertica, you've got Impala, you've got Hive and the Stinger initiative. Is that whole SQL-on-Hadoop ecosystem still working itself out? Are we going to have this many options a year or two years from now? Or are they complementary and potentially, you know, each has its role?
>>I think the major difference is kind of how it deals with the new data formats. Can it deal with self-describing data sources? Can it leverage JSON files? Does it require centralized metadata? Those are some of the perspectives and advantages that, say, Apache Drill has to expand the data sets that are possible and enable data exploration without dependency on an IT administrator to define that metadata.
>>So another topic, maybe not always as exciting: taking workloads from existing systems and moving them to Hadoop is one of the ways that a lot of people get started with Hadoop, whether it's transformation workloads or something in that vein. So I know you've announced a partnership with Syncsort, and one of the things that they focus on is really making it as easy as possible to move those.
Talk a little bit about that partnership, why that makes sense for you and your customers.
>>I think it's a great proof point, because we announced that partnership around mainframe offload. We had comScore and Experian in that press release. And if you look at a workload on a mainframe going to Hadoop, that seems like it's really an oxymoron. But by having the capabilities that MapR has, and making that a system of record with full high availability and data protection, we're actually an option to offload from the mainframe, offload from SAN processing, and provide a really cost-effective, scalable alternative. And we've got customers that had tried to offload from the mainframe multiple times in the past, unsuccessfully, and have done it successfully with MapR.
>>So talk a little bit more about the broader partnership strategy. I mean, we're here at Hadoop Summit. Of course, Hortonworks talks a lot about their partnerships and their reseller arrangements. Cloudera seems to take a little bit more of a direct approach. What's MapR's approach to partnering, as that relates to resell arrangements and things like that?
>>I think the App Gallery is probably a great proof point there. The strategy is an ecosystem approach. It's having a collection of tools and management facilities as well as applications on top. So it's a very open strategy. We focus on making sure that we have open APIs at that application layer, and that it's very easy to get data in and out. Part of that architecture is presenting a standard file system format, allowing non-Java applications to run directly on our platform, and supporting standard database connections, ODBC and JDBC, to provide database functionality. In addition to this deep predictive analytics, really it's about supporting the broadest set of applications on top of a single platform.
What we're seeing in this modern architecture is that data gravity matters. And the more processing you can do on a single platform, the better off you are: the more agile, the more competitive, right?
>>So you're partnering with people like SAS, for example, to bring some of the analytic capabilities into the platform. Can you tell us a little bit about any...
>>Companies like SAS and Revolution Analytics and Skytree, I mean, just a whole host of companies on the analytics side, as well as on the tools and visualization, et cetera. Yeah.
>>Well, I mean, I bring up SAS because I think they get the whole data gravity situation. They've got to go to where the data is and not have the data come to them. So, you know, I give them credit for acknowledging that big data truism, that it's...
>>All going to the data, not bringing the data to the compute.
>>Jack, talk about the success you've had with customers. You've had some pretty impressive numbers, talking about 500 customers. Merv Adrian of Gartner was on with us earlier, essentially reiterating, not mentioning MapR, he was just saying what you guys are doing is right where the puck is going. And some would say the puck is not even in the same rink for some other vendors. So I've got to give you props on that. So I want you to talk about the success you have, specifically around where you're winning and where you're successful, and where you guys have struggled and need to improve.
>>Yeah, there's a whole class of applications that I think Hadoop is enabling, which is about operations and analytics. It's taking this higher-arrival-rate, machine-generated data, doing analytics as it happens, and then impacting the business. So whether it's fraud detection or recommendation engines or, you know, supply chain applications using sensor data, it's happening very, very quickly.
So a system that can tolerate and accept streaming data sources, that has real-time operations, that is 24 by seven and highly available, is what really moves the needle. And those are the examples I used with, you know, ads at the Rubicon Project and, you know, cable TV.
>>So the outcome. What are the primary outcomes your clients want with your product? Is it stability? Is it that the platform enables development? Is there an outcome that's consistent across all your wins?
>>Well, in the big picture, some of them are focused on revenues, like how do we optimize revenue? Either it's a new data source, or it's a new application, or it's an existing application where we're exploding the data set. Some of it is reducing costs, so they want to do things like a mainframe offload or data warehouse offload. And then there are some that are focused on risk mitigation. And if there's anything that they have in common, it's that, as they move from test and look at production, it's the key capabilities that they have in enterprise systems today that they want to make sure are in Hadoop. So it's not anything new. It's just like, "Hey, we've got SLAs, and I've got data protection policies, and I've got a disaster recovery procedure. Why can't I expect the same level of capabilities in Hadoop that I have today in those other systems?"
>>So, final question. Where are you guys heading this year? What are your key objectives? Obviously, you're getting this flurry of announcements, good success. What's the state of the company? How many employees are you guys at? Give us a quick update on the numbers.
>>So, you know, we just reported this incredible momentum where we've tripled core growth year over year, and we've added a tremendous amount of customers. We're over 500 now. So we're basically sticking to our knitting, focusing on the customers, elevating the proof points here.
Some of the most significant customers we have in the telco, financial services, healthcare, and retail areas view this as a strategic weapon, view this as a huge competitive advantage, and it's helping them impact their business. That's really spurring our success. You know, we're growing at an incredible clip here, and it's a great time to have made those calls and those investments early on and to be reaping the benefits.
>>Now, I've always said, since the first Hadoop Summit, when Hortonworks came out of Yahoo and this whole community kind of burst open, you had Hadoop World. Now O'Reilly runs that; it's a whole different vibe in itself. This one's got the developer vibe. So I've got to ask you, and we've been a big fan: I mean, everyone has enough beachhead to be successful. It's not about MapR versus Hortonworks or Cloudera. And this is why I always kind of smile when everyone goes, "Oh, Cloudera or Hortonworks." I mean, they're two different animals at this point. They do different things. You guys are over here. Everyone has their, quote, swim lanes or beachhead; there's not a lot of super competition. Do you think it's going to be this way for a while? What's your forecast? At what point do you see more competition? 10 years out? I mean, Merv was talking about a 10-year horizon for innovation.
>>I think that the more people learn and understand about Hadoop, the more they'll appreciate this set of capabilities that matter in production and post-production, and it'll migrate earlier. And as we, you know, focus on more developer tools like our sandbox, so people can easily get experience and understand what MapR is, I think we'll start to see a lot more understanding and momentum.
>>Awesome. Jack Norris here, inside theCUBE, CMO of MapR, a very successful enterprise-grade Hadoop player, a leader in the space. Thanks for coming on. We really appreciate it.
We'll be right back after this short break, live here in Silicon Valley at Hadoop Summit 2014. We'll be right back.

Published Date : Jun 4 2014


ENTITIES

Entity	Category	Confidence
Jeff Kelly	PERSON	0.99+
Jack Norris	PERSON	0.99+
John Schroeder	PERSON	0.99+
HP	ORGANIZATION	0.99+
Jeff	PERSON	0.99+
$3 billion	QUANTITY	0.99+
December, 2014	DATE	0.99+
Jason	PERSON	0.99+
Matt BARR	PERSON	0.99+
10,000 jobs	QUANTITY	0.99+
Today	DATE	0.99+
10 year	QUANTITY	0.99+
Syncsort	ORGANIZATION	0.99+
Dan	PERSON	0.99+
Silicon valley	LOCATION	0.99+
John barrier	PERSON	0.99+
Java	TITLE	0.99+
Yahoo	ORGANIZATION	0.99+
10 years	QUANTITY	0.99+
24	QUANTITY	0.99+
Hadoop	TITLE	0.99+
Cloudera	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
this year	DATE	0.99+
Jack	PERSON	0.99+
fifth	QUANTITY	0.99+
Linux	TITLE	0.99+
Skytree	ORGANIZATION	0.99+
each	QUANTITY	0.99+
both	QUANTITY	0.99+
today	DATE	0.98+
one	QUANTITY	0.98+
Merv	PERSON	0.98+
about 10 years	QUANTITY	0.98+
San Jose	LOCATION	0.98+
Hadoop	EVENT	0.98+
about 20%	QUANTITY	0.97+
seven	QUANTITY	0.97+
over 500	QUANTITY	0.97+
a year	QUANTITY	0.97+
about 500 customers	QUANTITY	0.97+
SQL	TITLE	0.97+
seven different verticals	QUANTITY	0.97+
two years	QUANTITY	0.97+
single platform	QUANTITY	0.96+
2014	DATE	0.96+
Apache	ORGANIZATION	0.96+
Hadoop	LOCATION	0.95+
SiliconANGLE	ORGANIZATION	0.94+
comScore	ORGANIZATION	0.94+
single vendor	QUANTITY	0.94+
day one	QUANTITY	0.94+
Salesforce	ORGANIZATION	0.93+
about nine years	QUANTITY	0.93+
Hadoop Summit 2014	EVENT	0.93+
Merv	ORGANIZATION	0.93+
two different animals	QUANTITY	0.92+
single application	QUANTITY	0.92+
top three	QUANTITY	0.89+
SAS	ORGANIZATION	0.89+
Riley	PERSON	0.88+
First	QUANTITY	0.87+
Forbes	TITLE	0.87+
single cluster	QUANTITY	0.87+
Mapbox	ORGANIZATION	0.87+
map R	ORGANIZATION	0.86+
map	ORGANIZATION	0.86+

Steve Wooledge - Hadoop Summit 2013 - Studio B - #HadoopSummit

>>Winston Edmondson here at Hadoop Summit. We've got Steve Wooledge from Teradata. He's going to talk to me a little bit about an exciting new announcement that you had with Hortonworks today. Tell me a little bit about that. >>Yeah. So Teradata has been in the data management and analytics space for over 30 years. And with the announcement today, we announced the Teradata Portfolio for Hadoop, which is a collection of products, services, and customer support for an entire portfolio of products. So we've got turnkey appliances, we've got commodity offerings, and with Hortonworks, we've got a shared customer support model, so we can give our customers everything they need around >>Ultimate support. Pretty exciting. Now this seems like it must've been a long process to put all this together. >>Well, we've had a partnership with Hortonworks for about a year. We've had Hadoop product offerings in the market for about six months. We've seen a lot of uptake from our customers, and it's really about broadening that to make sure that customers can buy Hadoop standalone or integrated in with the rest of their data architecture, and make it a trusted component within that next-generation data architecture. >>Tell me what excites you right now with the customers that you're helping, where you're meeting their needs. Where do you see things going? What trends are you following right now? >>The big thing we're seeing is that our customers want to better serve their customers. And there are so many new interaction points that they have with those customers through social networks and email, and being able to take things like call center voice records; that's data that hasn't really been explored in the past to figure out how to better serve those customers. So now with Hadoop and other MapReduce technologies, we can incorporate that analysis into how we better serve our customers' customers at the end of the day. 
If that makes sense. Ultimately, it's about getting deeper insights into how to better serve the customers. And I think with all the new data that's out there and the hype around big data, that's really what it's about. >>Do you find the customers are coming to you with their own ideas, or are they looking to you for suggestions on just how they can bring these different data sets together and how they can maximize and leverage some of this data? >>Well, the problem is there's so much hysteria in this market. I mean, it's an exciting place to be, but there's a lot of technologies, right? So I think the thing with Teradata is we do provide that trusted advisor status. I mean, we've been implementing data analytics solutions for a long, long time, and a lot of the problems aren't new, they're just incorporating new analytics techniques. So they have ideas in terms of things they've heard about. They're not really sure how to implement it sometimes. So part of our offering is we have services, so we can look across their entire data architecture and figure out where does Hadoop really fit? What are the best use cases for it? How do we integrate that across the enterprise, so the end users and the applications that can benefit from that data can really get the value from it? >>How important do you think it is, or how much of an advantage is it, that you are tried and true? You've been here. I mean, some of these solution providers, you can call them fly-by-night. I mean, they're just here, you know, they've just formed. They don't have a track record. Is your track record of success one of the main things that customers are attracted to? >>I think so. 
I mean, the reality is we're in the trenches with our customers. It's not just the technology; we have business consulting people that come in with domain expertise from a given industry. So you can call it a track record or whatever it is, but it's really understanding not just technology, but the business and how these things come together to really get the most value from all the cool technology that's out there. So yeah, a lot of the fly-by-nighters, I mean, there's a lot of innovative things that are happening. And Hadoop, five years ago, was one of those very new things. And so we've been looking at it for a while, and now we've figured out the best way to incorporate it into our solution portfolio and to roll it out to customers. >>When you're helping a customer, and you're looking at the here and now, this is what they need to be addressing. I would imagine a lot of customers want to know what's around the corner, what's around the bend that we should be aware of, that we should try to be prepared for. What do you tell them? >>Well, I think, you know, everybody will say there's just more and more data coming at you. I think other analytic techniques like graph analysis are something that people, particularly with social networks, are using to figure out how people are interrelated to each other. So it's a lot of different use cases, and there's different analytic techniques that can be combined in unique ways. So a lot of our R&D investment is going into how do we bring more of those analytic techniques together and unify them for people in one system. So that regardless of whether you're a data scientist or a business analyst, you can ask really interesting, tough questions that you couldn't ask before. So it's about giving answers to sometimes the unknown questions and helping them explore that data in unique ways. 
>>What would you say are some of the industries where there's probably more urgency for them to adopt some of these strategies, or perhaps they're just more likely to have a big return on investment? What industries would you point to? >>I mean, for us, it's a lot of the traditional industries where you have a lot of consumers, right? Telecommunications, retail, financial services, anybody who's working with a lot of customers, that has a lot of products, a lot of complexity, a lot of customer interaction touchpoints. So I think those are the people that typically we see adopting new technology and really thinking about how to better serve their customers. >>For folks that are watching, tuning in, and they're pretty excited about what you might be able to help them with, what's the best way for them to get in touch with you or, or >>You just go to teradata.com and check us out there. That's probably the best way to reach us. >>Right. Fantastic. Thanks for your time. Winston Edmondson here with Studio B signing out.

Published Date : Jul 8 2013

SUMMARY :

He's going to talk to me a little bit about an exciting new announcement that you had with Hortonworks today. So we've got turnkey appliances, we've got commodity offerings and with Hortonworks, Now this seems like it must've been a long process to put all this together. Well, we've had a partnership with Hortonworks for about a year. Tell me what excites you right now with the customers that you're helping, you're meeting their needs. but that's been data that hasn't really been explored in the past to figure out how to better serve those customers. So I think the thing with Teradata is we do provide that trusted advisor status. I mean, they just, they're just here on, you know, they've just formed. I mean, the reality is we have, we're like in the trenches with our customers, I would imagine a lot of customers want to know what's around the corner, So it's a lot of different use cases and there's I mean, for us, it's a lot of the traditional industries where you have a lot of consumers, to get in touch with you or, or That's probably the best way Winston Edmondson here with Studio B signing out.

ENTITIES

Entity	Category	Confidence
Hortonworks	ORGANIZATION	0.99+
Teradata	ORGANIZATION	0.99+
Winston Edmundson	PERSON	0.99+
Steve Wooledge	PERSON	0.99+
Winston Edmondson	PERSON	0.99+
over 30 years	QUANTITY	0.99+
Steve woolens	PERSON	0.98+
five years ago	DATE	0.98+
about six months	QUANTITY	0.98+
one system	QUANTITY	0.98+
Hadoop	ORGANIZATION	0.97+
today	DATE	0.96+
one	QUANTITY	0.95+
teradata.com	OTHER	0.93+
about a year	QUANTITY	0.92+
One	QUANTITY	0.92+
Hadoop	EVENT	0.91+
studio B	ORGANIZATION	0.9+
Hadoop Summit 2013	EVENT	0.84+
Hadoop	TITLE	0.77+
Studio	EVENT	0.46+
MapReduce	ORGANIZATION	0.45+

Scott Gnau - Hadoop Summit 2013 - theCUBE - #HadoopSummit

Live at Hadoop Summit, this is SiliconANGLE and Wikibon's exclusive coverage of Hadoop Summit. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm joined by my co-host Jeff Kelly. Jeff, welcome to theCUBE. Scott, welcome to theCUBE. >>Great to have you here. So you helped kick off the show this morning with your keynote, talking about a number of things, among them the new Teradata appliance for Hadoop. You brought it on stage, which I thought was great. I love it. >>I was joined by a dancing appliance. >>Okay, great. It was fantastic. A good-looking appliance, it was. But why don't you tell us a little bit about yourself, kind of your role, and then we'll get into what Teradata is doing here at the show and some of the strategies you're taking towards the big data market. >>Okay, great. Well, I'm Scott Gnau. I'm from Teradata Labs, and Teradata Labs is actually the organization within Teradata that is responsible for research, development, engineering, product management, product marketing, all the products, all of the technology that we roll out. Kind of the innovation engine of Teradata is what we're responsible for. And we've obviously been affiliated with Hadoop Summit. We were here last year. It's really great to be back. Having been in the data warehouse, big data, kind of data analytics business for a long time, the one thing I have to say about this whole movement in the Hadoop space is that it's unlike anything else I've seen, in that it's every geography, it's every industry, and there's so much energy and emotion around it. It's unlike any other transition that I've seen. And even the difference between our visit here last year and this year, where we've seen the promise turned into reality, where we've got customers who are implementing, where we've got businesses who are driving value from the solutions that they're integrating with the solutions that they've already got, and being able to 
demonstrate that value really emphasizes the importance, and I think will help to continue the momentum that we feel in this market. >>Scott, one of the things I want to ask you: obviously the theme at Hadoop Summit was offloading data warehouses, and what Hadoop does is a benefit there. But you have a relationship with Hortonworks, and we were talking earlier with Merv, who is an analyst at Gartner, about the early adopters and the mainstream getting it now. But there's always a question of value, right? Where's the value? Because there's legacy involved, right? So most of the web-based companies are going to be cloud, they'll be SaaS, they might have a greenfield, clean sheet of paper to work with on big data. But an existing enterprise, a large financial institution, insurance company, or what have you, they have legacy technology, but they want Hadoop. They want to bring it in. When you talk to folks out there, what are some of the challenges and opportunities they have with that environment and the technology specifically? >>Sure. That was a long question; there's a lot of threads in there. I want to really try to hit on a couple of important themes, because you hear it here, and I get asked a lot about it. One of the things that people often say is, why are you here? This whole Hadoop thing is offloading data warehouses. Isn't that bad? Doesn't that bother you? And the answer is absolutely not. Certainly there's some hype around that, and there's some marketing around that. But when you really look at the technology and the value of what it brings to the table, it's a new technology that really allows us to harness new kinds of data and store those new kinds of data in the native format. And storing detailed data in the native format really enables the best world-class analytics. We've seen this happen for as long as my career has been in the traditional data space. So that's a really good thing. The way I view it, 
though, is: sure, will some workload move around the infrastructure from the data warehouse to a Hadoop cluster? Potentially, right. And by the way, if Hadoop is a great solution for it, it should go there. But at the same time, there is more demand than there is supply of technology. What I mean by that is the demand for analytics is so extreme that actually adding this tool to the toolkit gives customers more choice and gives them the opportunity to really catch up with the backlog of things that they've wanted to invest in over time. And then the final point: I really view what's happening here as perhaps one of the single largest opportunities for expansion of the role and size and scope of the data warehouse in an enterprise. Because one of the big things that Hadoop brings to the table is a whole lot of raw material, a whole lot more data. Data that used to be thrown away, data that never existed a year ago, is now going to be able to be captured, be stored, be refined, be analyzed. And as companies start to find relationships, as companies start to find actionable tidbits from the analytics in this huge source of raw material, I think it's actually an opportunity for upside for them to integrate more data into their data warehouse, where they can actually do the real-time interaction and streaming that's going to get them to the demonstrable business benefit. >>So it's the modernization of the enterprise. >>It's modernization. The way I look at it is also, sometimes the word incremental can sound like we're trying to downplay it, but I see it as incremental in that it's different data and it's incremental data. It's incremental subject areas. It's new stuff that's going to come into the environment. And based on what we've seen in the history of analytics, there's no end to the value that companies find, and there's no end to competition in their businesses. So this is a huge opportunity for the entire community to deliver more analytics, and I 
think that there's actually more upside for traditional legacy data warehouse vendors than there is anything. >>I think that's a really important point, because as you said, a lot of people think about that offloading of workloads, but it's also about bringing in new data, doing more analytics, and then moving some of that back into the data warehouse, where you can actually create more value from it. >>Yeah. I mean, one of the things that I've seen over time: Moore's law is something that's been going on for some time, right? And cost erosion in hardware has been going on for a long time. You think about the thing that you buy today for your BI implementation; the hardware costs, what, twenty percent of what it cost three or four years ago. And you know what? Revenues continue to increase, because there's such pent-up demand that as it gets less expensive, it becomes more consumable. And I think the same thing is really going to continue to happen as we add in these new technologies and these new data types. >>So one of the things I want to commend Teradata for doing is focusing on kind of that reference architecture and helping customers understand how this new technology of Hadoop and big data fits in with everything else that they're doing. Talk a little bit about how, from a reference architecture and then maybe even from a product perspective, Teradata goes about turning this into a reality for enterprise customers who, you know, are not looking to just kick the tires of Hadoop. They want to use this to really support applications and workflows that are really critical to their business. >>Yeah. I think one of the biggest things that we can do to help the industry and to help our customers really is to define a realistic roadmap that's consumable for them in their enterprise. And so while it's certainly easy to have a marketing release or press release that says this new technology 
does everything, it slices bread, it washes your car, does all these things, in reality there are very few things like that in the world, right? But the new technologies and the new innovations really do fit into some very interesting new use cases. And so providing this integrated roadmap of how customers can deploy and fit these technologies together is a really great education process, and it's been extremely well received by our customers and prospects. I have to tell you that even in advance of the announcement of the things that we had here today, we've already got customers who have gone down this path with us, because it's such a compelling value proposition. The other thing is that we don't actually put specific technology in those boxes. It's a reference architecture. We hope that there's some Teradata product in there, but at the same time, our customers understand that there is choice in the marketplace and the best solution is going to win. And by providing this reference architecture, I think we help elevate ourselves to more of a trusted advisor status with the industry, in how we see these things fitting together and providing very effective, very low-risk kinds of solutions. >>Well, I think you hit on something there: trusted advisor. I think companies and enterprises are just crying out for some leadership to help them really understand how they're going to make this a reality in their organizations. And you mentioned kind of the openness, allowing enterprises to choose a technology that fits the use case. Of course, you hope that's Teradata in a lot of cases, but it could be something else. So talk a little bit about your relationship with Hortonworks. I know you announced today kind of a reseller agreement; you're going to be actually reselling the subscription service, the Hortonworks service offering. Talk about that a little bit. And also I want to dive into the tech as well, the Hadoop 
appliance I mentioned earlier that you announced, and maybe just kind of walk us through some of the news there. >>Sure. So, I mean, obviously we have a strategic relationship with Hortonworks, and it's our second year here at Summit. It really started with, I think, a very common view of what's happening in the marketplace and how these technologies should really play well together. At the same time, we also really believe that it's important that the community embrace the open source Apache version of the software, so that it doesn't become fragmented and become obsolete. So Horton is spot-on in terms of business model, putting everything back into the Apache open source version. That means that I think this is the version that will win, and this will be the version that companies can count on to be sustainable. So I think that there's an advantage there implied. That said, I think it fits into the right place. We've got a great engineering relationship and a great common vision on how the enterprise architecture and how the pieces can fit together and be optimized for different workloads, for different service levels, and for different applications. So having that common vision, and bringing two best-of-breed providers together, with Hortonworks on the Hadoop side and Teradata for what we're very well known for, I think it's really the best of all worlds. And we worked together to lay out this reference architecture. So it's not just, you know, Teradata came down from the mountain and said this should be your reference architecture. We've got some validation of use cases, and then we went to work from an engineering perspective on how we go build these things out and make them work and optimize them and support them end to end. Because obviously, with all of the new solutions, there's kind of a scarcity of talent and some confusion, so support becomes really, really important. So one of the things we added to 
our portfolio that we announced today is an expanded relationship on the support side, where customers can come to Teradata for integrated support of all of their data analytics environments, whether it be Teradata, whether it be Aster, whether it be Hadoop with HDP. That's a really nice thing, where there's one phone number to call. We've got fully integrated processes. We can help with a global footprint in the 80 countries where we do business. And obviously Hortonworks, with the extreme depth and ability to manage the content of the kernel, can get it done unlike anyone else. >>Scott, we've been talking enterprise-grade all morning, as you did; that was the theme of the keynote. Merv from Gartner talked about security and compliance. I mean, these are meat-and-potatoes enterprise issues, right? So I've got to ask you, what are you guys looking at? What's coming next? Obviously the platform of Hadoop has stabilized. Developers are going to want to program on it in different environments, but the reality in the enterprise is that there are certain requirements. So what are you looking at in the labs that's coming around the corner, that's going to be really, really important for customers to realize the value of scaling and harnessing the big data of Hadoop with the existing infrastructure? >>Yeah. I mean, I think there are two things that we'll continue to do. One is we'll look to build out kind of that framework of ecosystem. In all of the keynotes this morning, everyone talked about the value of the ecosystem, and it's amazing how there are just more and more logos this year than there were last year. I think that will continue, but really building out that ecosystem so that those things that are important can be realized, and they can be realized in a very repeatable fashion. I think in addition to that, kind of ease of use, right? Because despite the fact that we have burgeoning numbers of newly minted data scientists and people getting into the marketplace, and that's 
really good, there still aren't enough. And so de-risking things by making them easier to deploy and easier to support, I think, is a key focus area. And then finally, I said two things, but now a third: we'll continue to look at performance, and just making sure that we have the best density, the best performance, the cost-performance value proposition that our customers will want. Because I also continue to believe that the supply of data will outstrip any customer's ability to invest in infrastructure. >>I'd love to get your take on, I want to go back to what you mentioned about the Hadoop distribution, focusing on Apache and being Apache compatible. So I take that, number one, to mean Teradata is not going to be coming out with their own Hadoop distribution. >>Absolutely not. >>But how do you think about that? >>Yeah, I think we can say that pretty definitively. >>So how do you see this whole Hadoop market playing out then? You've got Hortonworks, Cloudera, MapR, some others. How do you see this playing out in the next year or so? You mentioned you think that kind of the open source Apache version is going to win. When do you think that's going to happen? You've got some competitors in the market and different business models. >>Yeah, there are different business models and different innovators, and my crystal ball is probably only about as clear as anyone else's. But for the long term, I think it's best for the industry if it mimics a model similar to the way Linux is deployed, where there's kind of a duopoly, maybe three vendors. It's very largely open source. There's a lot of portability between them. I think that really strengthens the position of Hadoop as a core technology and foundation for some of the things that we're doing. And so I would hope that the most successful outcome would be that we'd end up with a duopoly 
or, you know, maybe three kind of providers around a similar kernel, because that would remove fragmentation from the market. By the way, we are a software company, so I think it's fair for companies to have value-add proprietary software. That's not a bad thing. But at the file system level, at a core level, I think the open source community cannot be out-innovated, right? And so I think that's a really important thing. So hopefully we'll get to that duopoly, or maybe three companies that kind of have that. I don't know if we will, but I sure hope we do. And if I were to bet on it, I would say it's odds-on that that will be the case. Now, will that be 18 months, three years, five years? I don't know. >>Scott, thanks for coming inside theCUBE. Obviously you guys have a great position in the marketplace, and the enterprise message is strong here. That's what the demand is. We're seeing a lot of trends out there that want enterprise-grade big data, which is not just one thing, but Hadoop's a big part of it. Thanks for coming inside theCUBE and sharing your perspective and what you've got working. Certainly having the new products come out is going to be great. So thanks for coming onto theCUBE. This is SiliconANGLE and Wikibon's coverage of Hadoop Summit. We'll be right back with our next guest after this short break.

Published Date : Jul 2 2013


ENTITIES

Entity	Category	Confidence
Jeff Kelly	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
twenty percent	QUANTITY	0.99+
Scott	PERSON	0.99+
Gartner	ORGANIZATION	0.99+
hortonworks	ORGANIZATION	0.99+
last year	DATE	0.99+
Horton	ORGANIZATION	0.99+
second year	QUANTITY	0.99+
this year	DATE	0.99+
18 months	QUANTITY	0.99+
Scott Gnau	PERSON	0.99+
last year	DATE	0.99+
80 countries	QUANTITY	0.99+
three years	QUANTITY	0.99+
today	DATE	0.99+
two things	QUANTITY	0.98+
next year	DATE	0.98+
Linux	TITLE	0.98+
three companies	QUANTITY	0.98+
five years	QUANTITY	0.98+
Wharton	ORGANIZATION	0.98+
a year ago	DATE	0.98+
two things	QUANTITY	0.98+
Hadoop	TITLE	0.97+
third	QUANTITY	0.97+
tarde de labs	ORGANIZATION	0.96+
one	QUANTITY	0.95+
this year	DATE	0.95+
SAS	ORGANIZATION	0.94+
Hadoop Summit 2013	EVENT	0.94+
kernel	TITLE	0.93+
one phone	QUANTITY	0.93+
Greenfield	ORGANIZATION	0.92+
Murph	PERSON	0.92+
Jeff	PERSON	0.92+
this morning	DATE	0.9+
two	QUANTITY	0.9+
three kind	QUANTITY	0.9+
hadoop summit	EVENT	0.9+
three vendors	QUANTITY	0.89+
teradata	ORGANIZATION	0.88+
one thing	QUANTITY	0.87+
this morning	DATE	0.87+
Apache	TITLE	0.86+
Duke	ORGANIZATION	0.8+
four years ago	DATE	0.79+
Apache	ORGANIZATION	0.79+
two labs	QUANTITY	0.77+
Hadoop	ORGANIZATION	0.77+
Tara day	PERSON	0.74+
three	DATE	0.7+
one of the biggest things	QUANTITY	0.7+
#HadoopSummit	EVENT	0.68+
lot of people	QUANTITY	0.67+
lot more data	QUANTITY	0.66+
a lot of threads	QUANTITY	0.66+
SiliconANGLE	ORGANIZATION	0.66+
Cloudera	TITLE	0.66+
single	QUANTITY	0.65+
things	QUANTITY	0.61+
turny	ORGANIZATION	0.6+
lot	QUANTITY	0.55+
wiki	TITLE	0.55+
Best	ORGANIZATION	0.52+
Moore	ORGANIZATION	0.48+
wiki	ORGANIZATION	0.47+

Jack Norris - Hadoop Summit 2013 - theCUBE - #HadoopSummit

>>Ash it's, you know, what will that mean to my investment? And the announcement with Fusion-io is that, you know, we're 25 times faster on read-intensive HBase applications with the combination. So as organizations are deploying Hadoop, and they're looking at technology changes coming down the pike, they can rest assured that they'll be able to take advantage of those in a much more aggressive fashion with MapR than with other distributions. >>Jack, I've got to ask you. We were talking last night at the Hadoop Summit, kind of the kickoff party, and, you know, everyone was there. All the top execs were there and all the developers. You know, we were on theCUBE. I think that either Dave or myself coined the term, the big three of big data, you guys: Cloudera, MapR, and Hortonworks, really the key players early on. And Charles from Cloudera was just recently on, and he's like, oh no, this enterprise-grade stuff has been kicked around, it's been there from the beginning. You guys have been there from the beginning, and MapR has never, ever waffled on your messaging. You've always been very clear: hey, we're going to take Hadoop, open source Hadoop, and turn it into an enterprise-grade product. Right? So that's clear. So what's your take on this? Because now enterprise grade is kind of there, I guess, the buzz around getting the, like the folks that have crossed the chasm implemented. So what can you comment on that about, one, enterprise grade, the reality of it, certainly from your perspective, you haven't been any but others. And then those folks that are now rolling it out for the first time, what can you share with them around what it means to be enterprise grade? >>So enterprise grade is more about the customer experience than a marketing claim. 
And, you know, by enterprise grade, what we're talking about are some of the capabilities and features that they've grown to expect in their other enterprise applications. So, you know, the ability to meet full SLAs, full HA, recovery from multiple failures, rolling upgrades, data protection with consistent snapshots, business continuity with mirroring, the ability to share a cluster across multiple groups and have, you know, volumes. I mean, there's a host of features that fall under the umbrella of enterprise grade. And when you move from no support for any of those features to support for a few of them, I don't think that's going to HA. It's more like moving to low availability. And there's just a lot of differences in terms of what we say enterprise grade with those features means versus what we view as kind of an incomplete story. >>So what do you mean by low availability? Well, I mean, it's tongue in cheek. It's nice, it's a good term. It's really saying, you know, just available sometimes, is that what you mean? Is this not true availability? I mean, availability is 99.9%, right? >>Right. So if you've got an HA solution that can't recover from multiple failures, that's downtime. If you've got an HBase application that's running online and you have data that goes down and it takes 10 to 30 minutes to have the region servers recover it from another place in the distribution, that's downtime. If you have snapshots that aren't consistent across the cluster, that doesn't provide data protection; there's no point-in-time recovery for a cluster. So, you know, there's a lot of details underneath that, but what it amounts to is: do you have interruptions? Do you have downtime? Do you have the potential for losing data? And our answer is you need a series of features that are hardened and proven to deliver that. >>What about recoverability? 
You mentioned that you guys have done a lot of work in that area with snapshotting, that's kind of being kicked around. Are folks addressing it? What's your competition doing in those areas of recoverability? You just mentioned availability. Okay, got that. Recoverability, security, compliance, and usability: those are the areas that seem to be the hot focus areas. What's going on in the industry? How would you give them the grade, the letter grade, if you will, candidly, compared to what you guys offer? Well, >>First of all, let's take recoverability. You know, one of the tenets is you have point-in-time recovery, the ability to restore to a previous point that's consistent across the cluster. And right now there's no point-in-time recovery for HDFS, for the files, and there's no point-in-time recovery for HBase tables. So there's snapshot support being talked about in the open source community, but it's being referred to in the JIRAs as fuzzy snapshots and really compared to copy table. >>So, Jack, I want to turn the conversation to kind of the topic we've talked about before, kind of the open versus proprietary, that whole debate. We've heard about that. We talked about that before here on theCUBE. So just kind of reiterate for us your take. I mean, we hear, perhaps because of the show we're at, there's a lot of talk about the open source nature of Hadoop, and some of the purists, as you might call them, are saying it's got to be open, a hundred percent Apache compatible, et cetera. And then there's others that are taking a different approach. Explain your approach and why you think that's the key way to really spur adoption of Hadoop and make it >>We're a part of the community. We've got, you know, commitment going on. We've, you know, pioneered and pushed Apache Drill, but we have done innovations as well.
And I think that those innovations are really required to support and extend the whole ecosystem. So Canonical distributes our M3 distribution. We've got, you know, all our packages available on GitHub and open source. So it's not a binary debate. And I think the point being that there's companies that have jumped ahead, and now that peloton is, you know, pedaling faster and will catch up, will streamline. I think the difference is we rearchitected. So we're basically in a race car and, you know, are racing ahead with enterprise-grade features that are required. And there's a lot of work that still needs to be accomplished before that full rearchitecture is in place. >>Well, I mean, I think for me, the proof is really in the pudding when it comes to customers that are doing real things, real production-grade, mission-critical applications that they're running. And to me that shows the success or relative success of a given approach. So I know you guys are working with companies like Ancestry.com, Live Nation, and Quicken Loans. Could you walk us through a couple of those scenarios? Let's take Ancestry.com. Obviously they've got a huge amount of data based on the kind of genealogical information. What do you guys do >>With them? Yeah, so they've got, I mean, they've got the world's largest family genealogy services available on the web. So there's a massive amount of data that they make accessible and, you know, ability for analysis. And then they've rolled out new features and new applications, one of which is to ship a kit out, have people spit in a tube, return it back, and they do DNA matching and reveal additional details. So really some fabulous leading-edge things that are being done with the use of Hadoop. >>Interesting.
So talk about when you went to work with them, what were some of their key requirements? Was it more around the enterprise-grade security and uptime kind of equation, or was it more around some of the analytics? What's kind of the killer use case for them? >>It's kind of, you know, it's hard with a specific company or even, you know, to generalize across companies, 'cause there are really three main areas: ease of use and administration; dependability, which includes the full HA; and then performance. And in some cases it's just one of those that kind of drives it and is used to justify it. In other cases it's kind of a collection. The ease of use is being able to use a cluster not only as Hadoop, but to access it and treat it like enterprise storage. So it's a complete POSIX-compliant file system underneath that allows the mounting and access and updates and using it in dynamic read-write. So what that means from an application level is it's faster, it's much easier to administer, and it's much easier and more reliable for developers to utilize. >>I got to ask you the marketing question, 'cause I see, you know, MapR, you guys have done a good job of marketing. Certainly we want to be thankful to you guys for supporting theCUBE in the past, and you guys have been great supporters of our mission, but now the ecosystem's evolving, a lot more competition. Cloudera mentioned those eight companies they're tracking in, quote, Hadoop, and certainly Jeff and I and SiliconANGLE look at it, there's a lot more, because Hadoop washing has been going on. The term Hadoop washing meaning jumping in and doing Hadoop, slapping that onto an existing solution. It's now been happening full bore for a year at least. What's next for you guys to break above the noise? Obviously the communities are very active, projects are coming online.
You guys have your mission in the enterprise. What's the strategy for you guys going forward? Is it more of the same, and anything new you can share? >>Yeah, I think as far as breaking above the noise, it will be our customers, their success and their use cases, that really put the spotlight on what the differences are in terms of, you know, using a big data platform. And I think what companies will start to realize is, I'd draw an analogy with supply chain. The big revolution in supply chain was focusing on inventory at each stage in the supply chain: how do you reduce that inventory level, and how do you speed the flow of goods and the agility of a company for competitive advantage? And I think we're going to view data the same way. So companies, instead of raw data that they're copying and moving across different silos, if they're able to process data in place and send small result sets, they're going to be faster, more agile, and more competitive. >>And that puts the spotlight on what data platform is out there that can support a broad set of applications and can have the broadest set of functionality. So, you know, what we're delivering is an enterprise-grade, mission-critical platform that supports MapReduce and does that with high performance, provides NFS and POSIX access so you can use it like a file system, and integrates, you know, enterprise-grade NoSQL applications. So now you can do, you know, high-speed, consistent-performance, real-time operations in addition to batch, streaming, integrated search, et cetera. So it's really exciting to provide that platform and have organizations transform what they're doing. >>How's the feedback with Ted Dunning? I haven't seen a lot of buzz on the Twittersphere. Is he getting positive feedback here? He's a tech athlete. He's a guru, he's an expert. He's got his hands in all the pies. He's a scientist type. What's he up to?
What's his role within MapR? And he's obviously playing in the open-source community. What's he up to these days? >>Chief application architect. He's on the leading edge of Mahout, so machine learning, so, you know, sharing insights there. He was speaking at the Storm meetup two nights ago and sharing how you can integrate long-running batch predictive analytics with real-time streaming, and how the use of snapshots really makes that easy and possible. He travels the world and is helping organizations understand how they can take some very complex, long-running processes and really simplify and shorten those. >>Had a chance to meet him in New York City at the last Hadoop World, at a party. Great guy, fantastic geek, and he certainly is doing great work, and shout out to Ted. Congratulations, keep up that support. How's everyone else doing? How's John and Srivas doing? How's the team at MapR? You're pedaling as best as you can, growing >>Really quickly. No, we're just shifting gears. Would it be on pedaling >>Engine? >>Yeah. Give us an update on the company in terms of the growth and kind of where you guys are moving on that. >>Yeah. We're expanding worldwide. You know, just in the last few months we've opened up offices in London and Munich and Paris. We're expanding in Asia, Japan, and Korea. So our sales and services and engineering, and basically the whole company, continues to expand rapidly. Some really great, interesting partnerships and a lot of growth. Not only as we add customers, but it's nice to see customers that continue to really grow their use of MapR within their organization, both in terms of the amount of data that they're analyzing and the number of applications that they're bringing to bear on the platform.
>>Well, talk about that a little bit, because I think, you know, one of the trends we do see is when a company brings in a big data platform, they might start experimenting with it, build an application, maybe in the marketing department, and then the sales guys see it and they say, well, maybe we can do something with that. Is that typically the kind of experience you're seeing, and how do you support companies that want to start expanding beyond those initial use cases to support other departments, potentially even other physical locations around the world? How do you kind of >>That's been the beauty of it, if you have a platform that can support those new applications. So if, you know, mission-critical workloads are not an issue, and if you support volumes so that you can logically separate, that makes it much easier, which we have. So one of our customers, Zions Bank, they brought in MapR to do fraud detection. And pretty soon, with the fact that they were able to collect all of that data, they had other departments coming to them and saying, hey, we'd like to use that to do analysis on, because we're not getting that data from our existing system. >>Yeah. They come in and you're sitting on a goldmine; there are use cases. And you also mentioned, as you're expanding internationally, what's your take on the international market for big data, and Hadoop specifically? Is the U.S. kind of leaps and bounds ahead of the rest of the world in terms of adoption of the technology? What are you seeing out there in terms of where the rest of the >>I wouldn't say leaps and bounds, and I think internationally they're able to maybe skip some of the experimental steps. So we're seeing deployment across financial services and telecom, and it's fairly broad. Recruit Technologies there.
The largest provider of recruiting services, Indeed.com is one of their subsidiaries. They're doing a lot with Hadoop, and MapR specifically. So it's been expanding rapidly. Fantastic. >>I also, you know, when you think about Europe, what's going on with Google and some of the privacy concerns, even here, I should say, are there different regulatory environments you've got to navigate when you're talking about data and how you use data when you're starting to expand to other locales? >>Yeah. Typically by vertical there's different requirements: HIPAA in healthcare, and Basel II in financial services, and so all of those. And basically it's the same theme: when you're bringing Hadoop into an organization and into a data center, the same sorts of concerns and requirements and privacy that you're applying in other areas will be applied on Hadoop. >>Now kind of turning back to the technology. You mentioned Apache Drill. I'd love to get an update on kind of where that stands. You know, put that into context for people. We hear a lot about the SQL-on-Hadoop question here. Where does Drill fit into that equation? >>Well, you know, there's a lot of different approaches to provide SQL access. A lot of that is driven by how do you leverage some of the talent in an organization that, you know, speaks SQL. So there's developments with respect to Hive, you know, there's other projects out there. Apache Drill is an open source project getting a lot of community involvement. And the design center there is pretty interesting. It started from the beginning as an open source project, with two main differences. One was in looking at supporting SQL: let's do full ANSI SQL. So it's full 2003 ANSI SQL, not SQL-like, and that'll support the greatest number of applications and, you know, avoid a lot of support issues.
And the second design center is let's support a broad set of data sources. So nested sources like JSON, schema discovery, and basically fitting it into an enterprise environment, which sometimes is kind of messy and can get messy as acquisitions happen, et cetera. So it's complementary; it's about, you know, enabling interactive, low-latency queries. >>Jack, I want to give you the final word. We are out of time. Thanks for coming on theCUBE. Really appreciate it. Great to see you again, CUBE alumni. But the final word, and we'll end the segment here on theCUBE, is your quick thoughts on what's happening here at Hadoop World. What is this show about? Share with the audience: what's the vibe, the summary, a quick soundbite on Hadoop? >>I think I'll go back to how we started. It's not if you use Hadoop, it's how you use Hadoop, and, you know, look at not only the first application, but what it's going to look like in multiple applications, and pay attention to what enterprise grade means. >>Okay. They were secure. We've got more coverage coming. Jack Norris with MapR, I'll say one of the big three, original big three, still on the list in our mind and the market's mind, with a unique approach to Hadoop and enterprise grade. This is theCUBE. I'm John Furrier with Jeff Kelly. We'll be right back after this short break.

Published Date : Jun 27 2013



Amr Awadallah - Hadoop Summit 2013 - theCUBE - #HadoopSummit

>>Welcome back here. This is Silicon Valley coverage of Hadoop Summit. I'm John Furrier, the founder. We're pleased to have a friend inside theCUBE. It's rare to have such luminaries: Amr Awadallah, good friend and also co-founder of Cloudera, really the pioneer in the space that helped build this industry that we're living here at Hadoop Summit. I'm with Dave Vellante from wikibon.org. Amr, welcome back to theCUBE, CUBE alumni. Thank you for having me here. Wow, what a journey. You co-founded Cloudera. I remember when you were in stealth mode: I really can't talk about it. And then of course the history of SiliconANGLE being, you know, founded and kind of built in your office when you only had like 20-something employees. Yep. We owe a great deal of gratitude to you, and congratulations to you, Michael Olson, and the team for building an industry. So I just wanted to say thank you. Thank you. And welcome to theCUBE. >>Thank you. It's great to be here. >>So what do you think, what's your take on the current Hadoop ecosystem right now? I mean, obviously a lot's happened. I mean, it's big now. It's growing up fast. Yeah. The word enterprise grade is out there. You're seeing it move from, you know, trying to change the world. Our first interview, you said, I've seen the future, I want to bring it to the mainstream. It's here. Yeah. It's hitting mainstream right now. Yeah. What's your take on the current situation of the ecosystem and its value? >>Yeah, so I have a quick question first. Should I look to you or look to the camera? Look to >>The camera or both? Whatever you'd like. >>So I think the ecosystem is definitely growing, which is very, very healthy. However, there is a side question there, which is what do you think of all the competition coming into the space? So five years ago when Cloudera was started, it was just Cloudera. There was no other commercial vendor trying to support or enable Hadoop in the industry for enterprises.
And today there is at least 10 of them trying to compete with us, right? And that includes big companies, established companies that decided, hey, we're gonna start addressing the space, but also includes many, many newcomers, like Hortonworks, who were founded over the last couple of years. That's a healthy thing. I mean, that's absolutely a sign of a growing market. If the market wasn't growing, if there wasn't money in the market, if it was just hype, there wouldn't have been all of these new companies and new ventures showing up. That said, I never look at competition as something that worries me, that I'm afraid now of what's gonna happen to me. That's normal. That's exactly what happens to successful companies. If you look at Red Hat, when Red Hat was launching with Linux, they had 25 competitors, or even more, 30 competitors. That's when Red Hat was forming. And today, even of these 25, 30 competitors, they still have six or seven still left. So I think it's a very, very healthy sign of the growth of this market and the maturity that it's reaching. >>What do you think about some of the white spaces that are evolving? You guys have obviously been involved in a lot of deployments at Cloudera. Again, you're doing a lot of work with the top names, and the clients that you have aren't usually disclosed 'cause you really can't disclose them. What are you seeing right now as the white spaces for things to do in the Hadoop platform? >>It's a very, very good question. So first, I can't talk about the future roadmap. Right now we're becoming a big company at that level where we can't comment on future roadmaps. >>Ah, that's a sign of the >>Time. You're well media trained; good to see they're doing a good job keeping you >>You want more information on that? I can connect you with PR, >>Please. No, no, no, we're good. We're good. We'll get it outta you.
But, >>But our vision, our vision for Cloudera from day one, like you were saying earlier, we saw the future, right? So our vision from day one was really to build this data system where we can have data of any type, whether that data is structured or unstructured or images, it doesn't matter. And then on top of that data run any type of workload. That workload could be the initial genesis of Hadoop, which is MapReduce, which is batch processing. But now, as we made many announcements through the last few years, we also now have Impala for interactive analytics as a workload. We have a very, very strong partnership with SAS for doing machine learning and statistics as a workload. And a few weeks ago we announced search as another workload. So you have multiple types of workloads that can handle different types of problems that you have within your organization, and bring all of these workloads to all of your data regardless of type. And that's the vision that we'll continue to deliver on. That's exactly what we're building going into the >>Future. So how's that fit in with YARN, right? We're hearing a lot at this conference about YARN, the ability to, you know, do more with less, and a lot of the things that you typically hear within the enterprise. And so talk about that a little bit. >>YARN is a very core part of our platform. In fact, YARN has been part of CDH4 for more than a year now out in the market. So I think we were the first vendor who brought YARN into a distribution of Hadoop out there. It's very, very fundamental to us because that is how we're gonna coordinate. We are gonna be using YARN to coordinate launching all of these different types of workloads. You're gonna have the MapReduce workload, which is very batch oriented. The Impala workload, which is very latency sensitive. The search workload, which is also very latency sensitive.
The machine learning workload, which is more batch oriented, et cetera, et cetera. And YARN is a very, very central piece to helping us coordinate all of these different types of workloads onto the >>Platform. Cloudera has been a great citizen in the community also. You mentioned, and we witnessed, that your team created the industry. You guys were there, you took the chance, you were the first ones commercially funded by the venture capitalists, you know, then others will follow, and I see a huge ecosystem here. Yes. A lot of noise. A lot of people trying to get attention. So I got to ask you, because I want you to address this, because I know it's been talked about in some of the other blogs: there's a lot of FUD going on around who's doing what, and in some cases maybe flat out, you know, misinformation, and that happens in a growing market, you know, the elbows get sharp. Yes. So I want you to share with the audience anything that you want to say about the FUD around what people say about Cloudera or about others or what you're doing. Just to clarify, 'cause there has been, I mean, I've gotten back-channel information around, you know, not sure the committers this, and it's been well documented. There's a lot of FUD out there. What would you say to the folks out there to clarify >>That? Yes, I would say that our focus should be to continue to work as a community to push the platform forward. I would say that at Cloudera we do a lot of contributions. Hortonworks definitely is one of the top contributors out there as well; I'll acknowledge that. So are many, many, many other companies, and we wanna continue to see the platform evolve. I will stress though that at Cloudera we do have a number of the original project founders working at the company. So it's not just the contribution that we bring, but the fact that we have the founders of these projects working at Cloudera.
And some of these projects actually were created at Cloudera from day one, as opposed to created in some other company, and then you hire the employee and they work for you. So I'll give you examples from Cloudera: Doug Cutting. >>He is the creator of Hadoop. Doug Cutting is also the creator of Lucene, which became Solr, which is part of the search project that we launched recently. Doug Cutting wasn't with Cloudera from day one, right? So when he created these technologies, he actually was at Yahoo, for example; when he created Hadoop he was at Yahoo, he wasn't at Cloudera. However, he now works for Cloudera. So we get that because now Doug Cutting works for Cloudera. So that's one example. On the flip side, there are projects like Flume and Sqoop that are now part of every single distribution out there. And Flume and Sqoop were both created at Cloudera. They were actually created inside of Cloudera. Yeah. So the key point is, and that's what I would like from all of the vendors out there that are trying to leverage Hadoop and get benefit out of Hadoop, is please don't be just takers. >>There are some vendors out there who are just takers. They just wanna take from the open source, take from the open source, and don't give back. Right? I'm not gonna name them, but there are a few of them out there. Please, please, please. I mean, that is very selfish behavior. It's not gonna help the ecosystem in the long term. We would like to see you both take and give at the same time. So that would be my core message. And for example, I thank Hortonworks because that's exactly what Hortonworks is doing. They're both giving and taking at the same >>Time. You guys have always been clear on that. I mean, your contribution to open source has been well documented and there's no question about that. John and I have talked about it a lot, that you guys helped get it all started.
And even Haak, when we had 'em on a couple years ago, when Hortonworks came to the market, said, hey, the more people work on open source, the better. >>Yeah, >>Exactly. So yeah, it's always been your posture. You're not playing games there. Anyways, having said that, you have a strategy to layer on top of that open source some of your own proprietary code. And so you have choices to make. Yes. In terms of how you allocate those resources. So as an engineering manager, how do you allocate those resources in terms of, okay, what do we do for the community and what do we do for our own, you know, future because of the business model that we chose? How do you make those trade-offs? >>Yes, that's a very, very good question. So first, it's important to stress that our core platform, CDH, is open source. Everything we put in the core platform is open source. So for example, Impala, which we launched very recently as GA, we launched the beta last year, but now it's GA, is a hundred percent Apache licensed, a hundred percent open source. Search, which we announced very recently, is also open source. So the platform itself, we're committing to everything in there to be open source. Now, we believe fundamentally, just from having lots of history in studying the open source markets, from our CEO Mike Olson himself being one of the very first open source people in the world with Sleepycat, the company that he sold to Oracle before founding Cloudera, and from our investors helping many other open source companies: to have a successful open source company, you need to have a very good engine between the business model that generates revenue and the product that you are creating. If you don't have a good feedback loop there between these two, you won't be able to sustain the innovation to continue to push the boundaries of how good the product is.
So we strongly believe in that. If your product is literally a hundred percent open source, meaning both the management and everything, there is nothing proprietary whatsoever inside of your products. I can't tell what that is. It's >>Taking a picture. >>Oh, sorry, I thought somebody was waiting >>For me. >>Sorry about that. >>It's a cheap signal. >>It >>Was like a's really good. >>I thought it's like a card of paper with some writing. You, >>You have fans out there. They're storming the concert here. >>Okay, that's good to hear. Sorry about that interruption. So if you have everything a hundred percent open source, that creates two problems. First, you have no differentiation whatsoever, meaning another big corporation, without naming who the big corporations could be, can just take everything you do, literally every single bit of source code you have, and say, hey, we can do it too. Come to us, don't work with those guys. Right? We have the latest, greatest things that they have. Why do you wanna continue to work with them? So no differentiation is number one, which is very dangerous. And number two, if it's a hundred percent open source and there are lots of other vendors able to take the open source artifact and work with it, then it becomes purely about maintenance and insurance on the product, which is a commodity product, and obviously the prices for that will go down to the ground, and you won't be able to sustain this positive feedback effect between your business model and your product roadmap, and won't be able to build a long-lasting company. >>So that's why we do have a combination of open source artifacts and proprietary artifacts. Now, our proprietary artifacts are always around the management of the system, right? So how do we manage the security of the system? How do we manage the data flow within the system?
How do we manage the services inside of the system across all layers, right? Not just the Hadoop layer but the HBase layer, the ZooKeeper layer, et cetera. So that's where we focus our efforts going forward, and that's how we differentiate ourselves from other vendors out there. Cloudera Manager and Cloudera Navigator are very unique to us. Nobody else has anything close to those capabilities out there. >> So it sounds like the contributions you make to open source are cultural in nature, I mean, DNA of sorts. Right. And so that's something that you guys do because you've always done it. Absolutely. And then the artifacts that are proprietary are essentially around rationalizing the revenue opportunity with the expense that you're going to apply there, and making a business case to decide >> How to balance. That's one. And then two, the differentiation from other competitors. So these two things, yes. >> Okay. >> I believe that's fundamental to open source business models. >> Yeah, I mean, there are many open source business models, right? You can go pure service, you can go, like you said, you can totally bogart the code. >> There is no pure-service open source company that was able to build a long-lasting, surviving public company. It never happened in history. They always get acquired because it becomes a commodity. >> I mean, right. I mean, and even IBM, right? >> Amr, I want to ask you about the storage thing. We were talking before camera about the Hortonworks storage announcement. What's your take on that? >> Which one? The Gluster one, the one with Red Hat? Yes. Yes. So there has been recent news about Red Hat with Hortonworks having a version of the Hadoop platform that uses MapReduce for the computation but uses Red Hat for the storage, right? So Red Hat has a new storage offering that was built based off of a company they acquired called Gluster. 
And that news was very, very surprising to me. And the reason why it was surprising is that it's correlated also with a shift in messaging from Hortonworks. If you look at Hortonworks at Hadoop Summit last year, one of the key messages that they delivered to us was that within the next five years, or by 2015, the tagline back then was by 2015 (and you're doing research right now to see if I'm saying the right thing), by 2015, half the world's data will be stored in Hadoop. Yes. If you look today at the slides, it >> doesn't say that. It says within five years, >> right? No, no, no. It says, well >> That was the second iteration, within five years. And now they say something >> different. Now they say by 2015, half the world's data will be processed by Hadoop, instead of stored by Hadoop. And that's a very, very fundamental >> So it's a nuance. >> It's a very important >> nuance. Well, it's a big deal, because when I first saw that I said, Hmm, what does this all mean? And then 2015 sounds a little early. Yes. And now you're saying processed by. Okay, that's different. >> Yes, exactly. And the reason why is, we believe HDFS is very, very core to the Hadoop platform, the storage system of Hadoop. It's really the layer that made Hadoop what it is. More than anything else, it is how scalable, how reliable and how economical the HDFS storage layer is. So we really ask Hortonworks and ask all the companies working in the Hadoop community not to fragment at the storage layer. We need the storage for Hadoop to stay inside of Hadoop and not to fragment that out. That's very, very critical. >> Okay. So, but so >> You're saying that they're indicating through the gesture, that they've not come out saying we're going to fragment HDFS, but the way that this is positioned might signal >> No, no, no. 
The announcement with Red Hat is >> That is the direct signal. It's >> Literally, you'll be able to run MapReduce directly on top of Red Hat storage instead of HDFS. >> Okay. So >> I >> I interpreted it as Hortonworks just hedging on its prediction, and I said, Okay, I'll give them a break on that. You're saying it's something different. >> It's a shift in strategy, potentially. Yeah. Which can be dangerous. >> Is that a compliance issue? Because, you know, the issue with Hadoop's POSIX. Yeah. Red Hat does have a lot of enterprise customers. Yeah. So is that just maybe, if >> Then invest in making Hadoop POSIX compliant, which actually, by the way, we are investing in as a community. Yeah. Yes. You must have. Yeah. So we are investing in adding POSIX compliance to Hadoop, and we're investing in adding snapshots to Hadoop, which will be coming very, very soon. >> Well, do you think that, pick a year, I don't care if it's 2015, 2000, 22,000, whenever, that the majority of the world's data will be running in Hadoop? >> The majority of the world's data that has to do with analytics, yes. Okay. So there is >> So that is >> It's a very important caveat. Yes, exactly. Because there are lots of types of data that are not very suitable for Hadoop at all. For example, the data storage for Oracle database systems. No, you want to store that in NetApp or EMC; you don't want to store that in Hadoop. The data storage for streaming video files, right? For just streaming lots and lots of video files. No, you don't want to store that in Hadoop. >> It's a huge >> proportion of the data. Yeah. Which is a huge, huge >> proportion of data files. In fact, that could overwhelm the data. >> Yeah. So the new nuance, I would say, is that I agree with the half thing, but it's half within the world of data for the purpose of analysis. >> Yeah. Okay. So that's >> Narrowed down the >> Yeah, okay. 
But it's a more reasonable, but I've, I >> never. It's still a huge market, by the way. >> It is. Yeah, >> it is. Yes. Okay. So what's next for you? You've gone on this journey, you started this company. You've been traveling around like crazy working with customers. What's the next phase of Amr Awadallah's, you know, career? >> What >> do you want to have happen next? I mean, what excites you? What are you working on? >> Yeah, it's just to continue to grow Cloudera to be the biggest company it can be. I mean, we want to be one of the very few companies that were able to take an open source model and turn that into a large publicly traded corporation. >> So you've talked about that. You guys brought a new CEO on, right? Look at the background of the CEO and, you know, clearly he's got some IPO chops. Yes. So that's an aspiration that you guys have put forth. Okay. >> And you're outward facing now. So you're doing a lot of travel. Yes. So where have your travels taken you? You've been in China, and obviously you've got a European office open. So what's going on internationally? Give us some sound bites of what's happening in the field. >> Yeah, so internationally, I mean, Europe definitely is our next big focus right now. We now have a big operation in Europe, an office presence and a big team there. And it's growing very quickly. I would say Europe is about two years behind the US; that's how the growth usually maps to what's happening here. And yeah, so our next big market is Europe. We are looking at China. We don't have a big presence in China right now. Japan, we have a big presence in Japan. Japan is growing very quickly. So yeah, I mean, and obviously Canada, with the US, is growing very quickly as well. 
>> Great to have you on theCUBE again, for me personally and for Dave. And I want to say thanks to Cloudera for some great support over the years. You guys have been fantastic. You know, I say you've built a great company. It's so hard to build a company. You guys have done a great job. I've got to ask you the final question, because you did bring that first sound bite, which was, I saw the future. This is back when you guys were just in your B round in the Palo Alto office, just starting to ramp. What's next? What do you see as around the corner? Obviously we're on a trajectory right now. A lot of things are going to get done. POSIX compliance, a lot of stuff's going to fill in. The platform's going to get stronger. Yeah. We think that open source will win. Yeah. Through all the democratization of open source. What's next? What's around the corner that you're watching personally, that's interesting to you, around where this will take us? >> Yeah. So what's next is having this vision become true. Having this future vision that you refer to become true. Meaning having a single platform that can store all of your data, regardless of the type of that data, and allow you to extract value for different types of workloads, whether that be batch, interactive, machine learning, or search, or more, right? There will be more things that will come to the platform. But how to bring your applications, all of your data applications, to your data, all of your data, as opposed to having the data go to them. >> And what are the landmines out there that you need to avoid, yes, that the industry and community need to avoid to make that a reality? >> The key landmine, it's a bit technical. 
The landmine is a bit technical, which is making sure that the YARN vision continues to evolve and that we have the capability to properly have a multi-workload resource management system that allows me to run all of these types of workloads without having them step on each other's toes. That's the key step going forward. >> And of course, playing well together in the sandbox. And as always, competition is good. And again, Hadoop is doing great. Amr Awadallah, co-founder of Cloudera, inside theCUBE. This is SiliconANGLE and Wikibon's exclusive coverage of Hadoop Summit here in Silicon Valley. We'll be right back with our next guest after the short break.

Published Date : Jun 27 2013


Jack Norris | Hadoop Summit 2012

>> Okay. We're back live in Silicon Valley, in San Jose, California, for the continuous coverage of siliconangle.tv at Hadoop Summit 2012. This is ground zero for the alpha geeks in big data, just the tech elite. We call them tech athletes, and we're excited to cover it on the ground and extract the signal from the noise here. This is theCUBE, our flagship telecast. I'm joined by my co-host Jeff Kelly from Wikibon.org, the best analyst in the business. Jeff, welcome back for another segment. End of the day, day one, loving every minute. Okay. We're here with our guest, Jack Norris, CMO of MapR. Jack, welcome back to theCUBE. You've been on a few times. So you guys have some news. Yes. So let's get right to the news. You guys are a player in the business, so share your news with the folks. Excellent, jump right in. >> So, two big announcements today. We announced that Amazon is integrating MapR as part of their Elastic MapReduce service, and both editions, our free edition, M3, as well as M5, are available directly with Amazon in the cloud. >> So what's the value proposition? Why would a customer say, all right, I want to do this in the cloud with MapR, in the Amazon cloud, rather than doing it on premise? >> Okay. So let's start with, I mean, there are a lot of value propositions all balled up into one here. First of all, in the cloud, it allows them to spin up very quickly. Within a couple of minutes, you can get, you know, hundreds of nodes available. And depending on where you're processing the data, if you've got a lot of data in the cloud already, it makes a lot of sense to do the Hadoop processing directly there. So that's one area. A second is you might have an on-premise cloud deployment and need to have disaster recovery. MapR provides point-in-time snapshots as well as wide-area replication. So you can use mirroring, and having Amazon available as a target is a huge advantage. 
And then there's also a third application area where you can do processing of the data in the cloud and then synchronize those results to an on-premise cluster. So basically, process where the data is, and combine the results into a cluster on premise. >> So you don't have to move the raw data >> on premise. Actually, it's all about, let's do the processing on the data. >> Well, you know, the whole value proposition of big data in general is, let's move data as little as possible. Yep. You know, so you bring the computation to the data, if you can. So what's your take on this event? I mean, this is, you know, the fourth Hadoop Summit. Hortonworks has now fully taken over the show. Talk about what you see out here in terms of the other vendors that play, and just kind of the attendees, the vibe you're seeing. >> It's a lot of excitement. I think a big difference from last year, which seemed to be very developer focused, is that we're seeing a lot of presentations by customers. A lot of information was shared by our customers today. It was fun to see that comScore shared their success. Boeing gap map is, it was great for us. >> Fantastic. We look at Amazon. Amazon, first of all, is the gold standard for public cloud, right? They've knocked it out of the park. Everyone knows Amazon. But they've been criticized on the big data front because of the cycle times involved. And for some developers, I mean, for web services spinning up and down, no problem. And we're seeing businesses like Netflix run on Amazon. So Amazon is not a stranger to running at scale for cloud, but Hadoop has kind of been a kludgy thing for Amazon. So, you know, talk about why Amazon and you guys are a good fit for the market. The market reach is great, so you guys now have a huge addressable market. Are you guys helping solve some of that complexity on the MapReduce side? 
What's, >> What's the core? I guess my first comment, first response would be, I think every customer should have that type of kludge. They could have the success that Amazon has in Hadoop. They have a huge number of Hadoop deployments and have been very, very successful. I think >> I mean, you know what I mean by it's natural, it's kludgy everywhere right now. That's the problem. But Amazon has huge scale, and Hadoop is not a natural fit there. >> Is not a natural fit? >> For the data, for the data component. And HBase, for example. >> Component. So where Amazon, you know, made it very frictionless is the ability to spin up Hadoop to do the analysis. The gap that was missing is some of the HA capabilities, the data protection features, the disaster recovery. And, you know, that's where MapR now gives options to those customers. If they want those kinds of enterprise-grade features, now they have an option within EMR. They can select M5 and get moving. If they want performance and NFS, they've got the M3 option. >> Well, congratulations. I think it's a great deal for you guys and for Amazon customers. My question for you is, as you guys explore the enterprise-ready equation, which has been a big topic this week, what does that mean to you guys? Because it means different things to different people, depending on how high up to OLTP you go, right? I mean, how far from batch to real-time transactional levels you go. I mean, batch, no problem. But as you start to get more near real time, it's going to be a little bit different gray in this house used security HDFS. Yeah. >> Yeah. So Hadoop represents a strategic platform, right? Deploying that in an organization, you know, moving from kind of an experimental, lab-based environment to a production environment, creates a different set of feature requirements. How available is it? How easy is it to integrate, right? 
How do I kind of protect that information and how do I share it? So when we say enterprise grade, we mean you can have SLAs, you can put the data there and be confident that the data will remain there, that you can have point-in-time recovery from an application error or user mistake. You can have disaster recovery features in place. And then the integration is about not recreating the wheel to get access to the information. So Hadoop is very powerful, but it requires interacting through the HDFS API. If you can leverage it, like through MapR, with standard NFS file-based access and standard ODBC access, it opens it up. >> So I can use a standard file browser and applications to see and manipulate the data, which really opens up the use cases. And then finally, what we announced in 2.0 was multitenancy features. So as you share that information, all of a sudden there are the SLAs of different groups, and, well, these guys need it immediately. And if you've got some low-grade batch jobs, they're going to impact that. So you want the ability to protect, to isolate, to secure information, and basically have virtual clusters within a cluster. And those features are important to cloud, but they're also important on premise. >> So great for the hybrid cloud environments out there. I mean, cracking the code on multitenancy, exactly, is huge. I mean, right now most enterprises like private cloud because it's basically an extension of their data center, and you're seeing a lot more activity in the hybrid cloud as a gateway to the public cloud. So, >> And, you know, frankly, people are kind of struggling in an experimental sense. With Apache Hadoop and the other distributions, the policies are either at the individual file level or the whole cluster. And it almost forced the creation of separate physical clusters, which kind of goes against the whole Hadoop concept. 
So the ability to manage it at a logical layer, to have separate volumes where you can apply policies that apply to all the content underneath, really makes it much, much easier for administrators to deal with these multiple use cases. >> Amazon has always been one of those cases for the enterprise, and this has been talked about for years: put the credit card down, go play on Amazon, but then bring it back into the IT group for certification. And so I think this is a nice product for you guys to bring that comfort. >> You know, we're very excited. >> The enterprise saying, Hey, come play in Amazon. It's bulletproof, enterprise ready. So congratulations. >> I wonder, can we talk use cases? So what are you seeing in terms of evolving use cases as Hadoop continues to become more enterprise grade, depending on your definition? How is that impacting what you're seeing, even if it's just, you know, the mindset? People think, okay, now it's enterprise grade. Well, maybe, depending on who you talk to, it's been that way for a bit. But what kind of use cases are you seeing develop now that it's starting to gain acceptance? It's like, okay, we can trust our data is going to be there, et cetera. >> So there's a huge range of use cases that differ by industry and by the kind of dataset that's being used, everything from really a deep store where you can do analytics on it, so you're selecting the content, to something that's very, very analytic and machine-learning intensive, where you're doing sophisticated clustering algorithms, et cetera. Where we've seen an expansion of use cases is around real-time streaming, and you get streaming data sets that are entering into the cloud. 
And some of the more mission-critical data, moving beyond just maybe clickstream data, or things where if you happen to drop a few, you know, it's not a big deal, right? Versus the kind of trust-the-business type of content. >> Talk a little bit about the streaming aspects, because of course, you know, when we think of Hadoop, we think of a batch system. Streaming data into Hadoop, that's something we haven't heard a lot about. So how do you guys approach that? >> So, one of the artifacts of HDFS, which is a distributed file system that stores in the underlying Linux file system, is that it's append only. So as an administrator, you decide, how frequently do I close the file? Am I going to do that on an hourly basis, or every eight hours? Because you have to close the file for other applications to see the data that's been written, right? So one of the innovations that we pursued was to rewrite that and create this dynamic read-write layer. So you can continue to write data, and any application sees the latest data that's written. So you can mount the cluster as if it's storage and just continue to write data there. That really opens up what's possible. Companies like Informatica, their Ultra Messaging product integrates directly with MapR and provides >> So what kind of advantage does that provide to the end user? Translate that into real business value. Why is that important? >> Well, so one example is comScore. comScore handles 30 billion objects a day as they go out and try to measure the use of the web, and being able to continually write and stream that information, and scale and handle that in real time, and do analytics and turn around data faster, has tremendous business value to them. 
If they're stuck in a batch environment where the load times lengthen to the point where all of a sudden they can't keep up, they're actually reporting on, you know, old news. And I think the analogy is, forecasting rain a day after it's wet isn't exactly valuable. >> Yeah. So, you guys, obviously a great deal, enterprise ready for Amazon, big story, big coup for the company. What's next for you? I want to ask that and make sure you get that out there, your agenda for the next year. But then I want you to take a step back a year, maybe a year and a half ago. Look back at how much has changed in this landscape. Share your perspective, because the market has gone through an evolution where there's been a market opportunity, and then everyone goes, oh my God, it's bigger than we actually thought. I mean, Jeff Kelly's groundbreaking report about the $50 billion market is now being talked about as too low. So big data has absolutely opened up to a huge market, and it's changed some of the tactics around strategies. So your strategy, Hortonworks' strategy, even Cloudera's. And it's still evolving. So what's changed for the folks out there from a year and a half ago, a year ago, to today? And then look out for the next 12 months. What's on your agenda? >> Well, if you look back, I think we've been fairly consistent. I'm not going to take credit for the vision of our CEO and CTO, but they recognized early on that Hadoop was a strategic platform, and that to be a strategic platform that applied to the broadest number of use cases and organizations required some areas of innovation, and particularly that how it scaled, how it was managed, and how you stored and protected the information needed a rearchitecture. And I think that, you know, architecture matters when you're going through a paradigm shift. Having the right one in place creates this ability, you know, to speed innovation. 
And I think, if there's anything that's changed, it's that the speed of innovation has even increased in the Hadoop community. I think it's created a focus on these enterprise-grade features, on how do we store this valuable information and continue to explore. >> And I think one of the observations I'll make on that note is that it really focuses everyone to just mind your own business and get the products out. You know what I'm saying? We've seen product focus be the number one conversation. >> What we've seen is customers, you know, start and they expand rapidly. Some of that is due to data growth, but a lot of it is due to more and more applications being delivered, and the value is kind of extracted from the Hadoop platform, and success breeds success. >> Well, congratulations on all your success, a great win with Amazon Web Services, making that a little bit easier, more robust, with more features for them and more revenue for MapR. And I want to personally thank you for your support of theCUBE. We've expanded with a new studio B for extra interviews, and we want to expand the conversation. Thanks to your generous support, we can bring the independent coverage out to the market. Great community, thanks for helping us out. We appreciate it. So thank you. Okay. Jack Norris with MapR. We'll be right back to wrap up day one. Jeff and I will give our analysis right after the short break.

Published Date : Jun 14 2012


Kickoff | theCUBE NYC 2018

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hello, everyone, welcome to this CUBE special presentation here in New York City for CUBENYC. I'm John Furrier with Dave Vellante. This is our ninth year covering the big data industry, starting with Hadoop World and evolved over the years. This is our ninth year, Dave. We've been covering Hadoop World, Hadoop Summit, Strata Conference, Strata Hadoop. Now it's called Strata Data, I don't know what Strata O'Reilly's going to call it next. As you all know, theCUBE has been present for the creation at the Hadoop big data ecosystem. We're here for our ninth year, certainly a lot's changed. AI's the center of the conversation, and certainly we've seen some horses come in, some haven't come in, and trends have emerged, some gone away, your thoughts. Nine years covering big data. >> Well, John, I remember fondly, vividly, the call that I got. I was in Dallas at a storage networking world show and you called and said, "Hey, we're doing "Hadoop World, get over there," and of course, Hadoop, big data, was the new, hot thing. I told everybody, "I'm leaving." Most of the people said, "What's Hadoop?" Right, so we came, we started covering, it was people like Jeff Hammerbacher, Amr Awadallah, Doug Cutting, who invented Hadoop, Mike Olson, you know, head of Cloudera at the time, and people like Abi Mehda, who at the time was at B of A, and some of the things we learned then that were profound-- >> Yeah. >> As much as Hadoop is sort of on the back burner now and people really aren't talking about it, some of the things that are profound about Hadoop, really, were the idea, the notion of bringing five megabytes of code to a petabyte of data, for example, or the notion of no schema on write. You know, put it into the database and then figure it out. >> Unstructured data. >> Right. >> Object storage. 
>> And so, that created a state of innovation, of funding. We were talking last night about, you know, many, many years ago at this event this time of the year, concurrent with Strata you would have VCs all over the place. There really aren't a lot of VCs here this year, not a lot of VC parties-- >> Mm-hm. >> As there used to be, so that somewhat waned, but some of the things that we talked about back then, we said that big money in big data is going to be made by the practitioners, not by the vendors, and that's proved true. I mean... >> Yeah. >> The big three Hadoop distro vendors, Cloudera, Hortonworks, and MapR, you know, Cloudera's $2.5 billion valuation, you know, not bad, but it's not a $30, $40 billion value company. The other thing we said is there will be no Red Hat of big data. You said, "Well, the only Red Hat of big data might be Red Hat," and so, (chuckles) that's basically proved true. >> Yeah. >> And so, I think if we look back we always talked about Hadoop and big data being a reduction, the ROI was a reduction on investment. >> Yeah. >> It was a way to have a cheaper data warehouse, and that's essentially-- >> Well, what did we get right and wrong? I mean, let's look at some of the trends. I mean, first of all, I think we got pretty much everything right, as you know. We tend to make the calls pretty accurately with theCUBE. Got a lot of data, we look, we have the analytics in our own system, plus we have the research team digging in, so you know, we pretty much get, do a good job. I think one thing that we predicted was that Hadoop certainly would change the game, and that did. We also predicted that there wouldn't be a Red Hat for Hadoop, that was a prediction. The other prediction was that we said Hadoop won't kill data warehouses, it didn't, and then data lakes came along. You know my position on data lakes. >> Yeah. >> I've always hated the term. 
I always liked data ocean because I think it was much more fluidity of the data, so I think we got that one right and data lakes still doesn't look like it's going to be panning out well. I mean, most people that deploy data lakes, it's really either not a core thing or as part of something else and it's turning into a data swamp, so I think the data lake piece is not panning out the way it, people thought it would be. I think one thing we did get right, also, is that data would be the center of the value proposition, and it continues and remains to be, and I think we're seeing that now, and we said data's the development kit back in 2010 when we said data's going to be part of programming. >> Some of the other things, our early data, and we went out and we talked to a lot of practitioners who are the, it was hard to find in the early days. They were just a select few, I mean, other than inside of Google and Yahoo! But what they told us is that things like SQL and the enterprise data warehouse were key components on their big data strategy, so to your point, you know, it wasn't going to kill the EDW, but it was going to surround it. The other thing we called was cloud. Four years ago our data showed clearly that much of this work, the modeling, the big data wrangling, et cetera, was being done in the cloud, and Cloudera, Hortonworks, and MapR, none of them at the time really had a cloud strategy. Today that's all they're talking about is cloud and hybrid cloud. >> Well, it's interesting, I think it was like four years ago, I think, Dave, when we actually were riffing on the notion of, you know, Cloudera's name. It's called Cloudera, you know. If you spell it out, in Cloudera we're in a cloud era, and I think we were very aggressive at that point. I think Amr Awadallah even made a comment on Twitter. He was like, "I don't understand "where you guys are coming from." 
We were actually saying at the time that Cloudera should actually leverage more cloud at that time, and they didn't. They stayed on their IPO track and they had to because they had everything bet on Impala and this data model that they had and being the business model, and then they went public, but I think clearly cloud is now part of Cloudera's story, and I think that's a good call, and it's not too late for them. It never was too late, but you know, Cloudera has executed. I mean, if you look at what's happened with Cloudera, they were the only game in town. When we started theCUBE we were in their office, as most people know in this industry, that we were there with Cloudera when they had like 17 employees. I thought Cloudera was going to run the table, but then what happened was Hortonworks came out of Yahoo! That, I think, changed the game, and I think that competitive battle between Hortonworks and Cloudera, in my opinion, changed the industry, because if Hortonworks did not come out of Yahoo!, Cloudera would've had an uncontested run. I think the landscape of the ecosystem would look completely different had Hortonworks not competed, because you think about, Dave, they had that competitive battle for years. The Hortonworks-Cloudera battle, and I think it changed the industry. I think it could've been a different outcome. If Hortonworks wasn't there, I think Cloudera probably would've taken Hadoop and made it so much more, and I think they would've gotten more done. >> Yeah, and I think the other point we have to make here is complexity really hurt the Hadoop ecosystem, and it was just bespoke, new projects coming out all the time, and you had Cloudera, Hortonworks, and maybe to a lesser extent MapR, doing a lot of the heavy lifting, particularly, you know, Hortonworks and Cloudera. 
They had to invest a lot of their R&D in making these systems work and integrating them, and you know, complexity just really broke the back of the Hadoop ecosystem, and so then Spark came in, everybody said, "Oh, Spark's going to basically replace Hadoop." You know, yes and no, the people who got Hadoop right, you know, embraced it and they still use it. Spark definitely simplified things, but now the conversation has turned to AI, John. So, I got to ask you, I'm going to use your line on you in kind of the ask-me-anything segment here. AI, is it same wine, new bottle, or is it really substantively different in your opinion? >> I think it's substantively different. I don't think it's the same wine in a new bottle. I'll tell you... Well, it's kind of, it's like the bad wine... (laughs) Is going to be kind of blended in with the good wine, which is now AI. If you look at this industry, the big data industry, if you look at what O'Reilly did with this conference. I think O'Reilly really has not done a good job with the conference of big data. I think they blew it, I think that they made it a, you know, monetization, closed system when the big data business could've been all about AI in a much deeper way. I think AI is subordinate to cloud, and you mentioned cloud earlier. If you look at all the action within the AI segment, Diane Greene talking about it at Google Next, Amazon, AI is a software layer substrate that will be underpinned by the cloud. 
Cloud will drive more action, you need more compute, that drives more data, more data drives the machine learning, machine learning drives the AI, so I think AI is always going to be dependent upon cloud ends or some sort of high compute resource base, and all the cloud analytics are feeding into these AI models, so I think cloud takes over AI, no doubt, and I think this whole ecosystem of big data gets subsumed under either an AWS, VMworld, Google, and Microsoft Cloud show, and then also I think specialization around data science is going to go off on its own. So, I think you're going to see the breakup of the big data industry as we know it today. Strata Hadoop, Strata Data Conference, that thing's going to crumble into multiple, fractured ecosystems. >> It's already starting to be forked. I think the other thing I want to say about Hadoop is that it actually brought such great awareness to the notion of data, putting data at the core of your company, data and data value, the ability to understand how data at least contributes to the monetization of your company. AI would not be possible without the data. Right, and we've talked about this before. You call it the innovation sandwich. The innovation sandwich, last decade, last three decades, has been Moore's law. The innovation sandwich going forward is data, machine intelligence applied to that data, and cloud for scale, and that's the sandwich of innovation over the next 10 to 20 years. >> Yeah, and I think data is everywhere, so this idea of being a categorical industry segment is a little bit off, I mean, although I know data warehouse is kind of its own category and you're seeing that, but I don't think it's like a Magic Quadrant anymore. Every quadrant has data. >> Mm-hm. >> So, I think data's fundamental, and I think that's why it's going to become a layer within a control plane of either cloud or some other system, I think. I think that's pretty clear, there's no, like, one. 
You can't buy big data, you can't buy AI. I think you can have AI, you know, things like TensorFlow, but it's going to be a completely... Every layer of the stack is going to be impacted by AI and data. >> And I think the big players are going to infuse their applications and their databases with machine intelligence. You're going to see this, you're certainly, you know, seeing it with IBM, the sort of Watson heavy lift. Clearly Google, Amazon, you know, Facebook, Alibaba, and Microsoft, they're infusing AI throughout their entire set of cloud services and applications and infrastructure, and I think that's good news for the practitioners. People aren't... Most companies aren't going to build their own AI, they're going to buy AI, and that's how they close the gap between the sort of data haves and the data have-nots, and again, I want to emphasize that the fundamental difference, to me anyway, is having data at the core. If you look at the top five companies in terms of market value, US companies, Facebook maybe not so much anymore because of the fake news, though Facebook will be back with its two billion users, but Apple, Google, Facebook, Amazon, who am I... And Microsoft, those five have put data at the core and they're the most valuable companies in the stock market from a market cap standpoint, why? Because it's a recognition that that intangible value of the data is actually quite valuable, and even though banks and financial institutions are data companies, their data lives in silos. So, these five have put data at the center, surrounded it with human expertise, as opposed to having humans at the center and having data all over the place. So, how do they, how do these companies close the gap? How do the companies in the flyover states close the gap? 
The way they close the gap, in my view, is they buy technologies that have AI infused in it, and I think the last thing I'll say is I see cloud as the substrate, and AI, and blockchain and other services, as the automation layer on top of it. I think that's going to be the big tailwind for innovation over the next decade. >> Yeah, and obviously the theme of machine learning drives a lot of the conversations here, and that's essentially never going to go away. Machine learning is the core of AI, and I would argue that AI truly doesn't even exist yet. It's machine learning really driving the value, but to put a validation on the fact that cloud is going to be driving AI business is some of the terms in popular conversations we're hearing here in New York around this event and topic, CUBENYC and Strata Conference, is you're hearing Kubernetes and blockchain, and you know, these automation, AI operation kind of conversations. That's an IT conversation, (chuckles) so you know, that's interesting. You've got IT, really, with storage. You've got to store the data, so you can't not talk about workloads and how the data moves with workloads, so you're starting to see data and workloads kind of be tossed in the same conversation, that's a cloud conversation. That is all about multi-cloud. That's why you're seeing Kubernetes, a term I never thought I would be saying at a big data show, but Kubernetes is going to be key for moving workloads around, of which there's data involved. (chuckles) Instrumenting the workloads, data inside the workloads, data driving data. This is where AI and machine learning's going to play, so again, cloud subsumes AI, that's the story, and I think that's going to be the big trend. >> Well, and I think you're right, now. 
I mean, that's why you're hearing the messaging of hybrid cloud and from the big distro vendors, and the other thing is you're hearing from a lot of the no-SQL database guys, they're bringing ACID compliance, they're bringing enterprise-grade capability, so you're seeing the world is hybrid. You're seeing those two worlds come together, so... >> Their worlds, it's getting leveled in the playing field out there. It's all about enterprise, B2B, AI, cloud, and data. That's theCUBE bringing you the data here. New York City, CUBENYC, that's the hashtag. Stay with us for more coverage live in New York after this short break. (techy music)

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media for the creation at the Hadoop big data ecosystem. and some of the things we learned then some of the things that are profound about Hadoop, We were talking last night about, you know, but some of the things that we talked about back then, You said, "Well, the only Red Hat of big data might be being a reduction, the ROI was a reduction I mean, first of all, I think we got and I think we're seeing that now, and the enterprise data warehouse were key components and I think we were very aggressive at that point. Yeah, and I think the other point and all the cloud analytics are and cloud for scale, and that's the sandwich Yeah, and I think data is everywhere, and I think that's why it's going to become I think that's going to be the big tailwind and I think that's going to be the big trend. and the other thing is you're hearing New York City, CUBENYC, that's the hashtag.

ENTITIES

Entity	Category	Confidence
Apple	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Diane Greene	PERSON	0.99+
Google	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
John	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Jeff Hammerbacher	PERSON	0.99+
$30	QUANTITY	0.99+
New York	LOCATION	0.99+
2010	DATE	0.99+
IBM	ORGANIZATION	0.99+
Doug Cutting	PERSON	0.99+
Mike Olson	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Dallas	LOCATION	0.99+
O'Reilly	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
five	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Abi Mehda	PERSON	0.99+
John Furrier	PERSON	0.99+
New York City	LOCATION	0.99+
$2.5 billion	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
MapR	ORGANIZATION	0.99+
Amr Awadallah	PERSON	0.99+
$40 billion	QUANTITY	0.99+
17 employees	QUANTITY	0.99+
VMworld	ORGANIZATION	0.99+
Today	DATE	0.99+
Impala	ORGANIZATION	0.99+
Nine years	QUANTITY	0.99+
four years ago	DATE	0.98+
last night	DATE	0.98+
last decade	DATE	0.98+
Strata Data Conference	EVENT	0.98+
Strata Conference	EVENT	0.98+
Hadoop Summit	EVENT	0.98+
ninth year	QUANTITY	0.98+
Four years ago	DATE	0.98+
two worlds	QUANTITY	0.97+
five companies	QUANTITY	0.97+
today	DATE	0.97+
Strata Hadoop	EVENT	0.97+
Hadoop World	EVENT	0.96+
CUBE	ORGANIZATION	0.96+
Google Next	ORGANIZATION	0.95+
Twitter	ORGANIZATION	0.95+
this year	DATE	0.95+
Spark	ORGANIZATION	0.95+
US	LOCATION	0.94+
CUBENYC	EVENT	0.94+
Strata O'Reilly	ORGANIZATION	0.93+
next decade	DATE	0.93+

Bala Chandrasekaran, Dell EMC | Dell EMC: Get Ready For AI

(techno music) >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're in Austin, Texas at the Dell EMC HPC and AI Innovation Lab. As you can see behind me, there's racks and racks and racks of gear, where they build all types of vessel configurations around specific applications, whether it's Oracle or SAP. And more recently a lot more around artificial intelligence, whether it's machine learning, deep learning, so it's a really cool place to be. We're excited to be here. And our next guest is Bala Chandrasekaran. He is on the technical staff as a systems engineer. Bala, welcome! >> Thank you. >> So how do you like playing with all these toys all day long? >> Oh I love it! >> I mean you guys have literally everything in there. A lot more than just Dell EMC gear, but you've got switches and networking gear-- >> Right. >> Everything. >> And not just the gear, it's also all the software components, it's the deep learning libraries, deep learning models, so a whole bunch of things that we can get to play around with. >> Now that's interesting 'cause it's harder to see the software, right? >> Exactly right. >> The software's pumping through all these machines but you guys do all types of really, optimization and configuration, correct? >> Yes, we try to make it easy for the end customer. And the project that I'm working on, machine learning for Hadoop, we try to make things easy for the data scientists. >> Right, so we got all the Hadoop shows, Hadoop World, Hadoop Summit, Strata, Big Data NYC, Silicon Valley, and the knock on Hadoop is always it's too hard, there aren't enough engineers, I can't get enough people to do it myself. It's a cool open source project, but it's not that easy to do. You guys are really helping people solve that problem. >> Yes and what you're saying is true for the infrastructure guys. Now imagine a data scientist, right? So accessing the Hadoop cluster, securing it, is going to be really tough for them. 
And they shouldn't be worried about it. Right? They should be focused on data science. So those are some of the things that we try to do for them. >> So what are some of the tips and tricks as you build these systems that throw people off all the time that are relatively simple things to fix? And then what are some of the hard stuff where you guys have really applied your expertise to get over those challenges? >> Let me give you a small example. So this is a new project, A.I., and we hired data scientists. So I walked the data scientist through the lab. He looked at all the clusters and he pulled me aside and said hey you're not going to ask me to work on these things, right? I have no idea how to do these things. So that kind of gives you a sense of what a data scientist should focus on and what they shouldn't focus on. So some of the things that we do, and some of the things that are probably difficult for them, is all the libraries that are needed to run their project, the conflicts between libraries, the dependencies between them. So one of the things that we do is deliver this pre-configured engine that you can readily download into our product and run. So data scientists don't have to worry about what library they should use. >> Right. >> They have to worry about the models and accuracy and whatever data science needs to be done, rather than focusing on the infrastructure. >> So you not only package the hardware and the systems, but you've packaged the software distribution and all the kind of surrounding components of that as well. >> Exactly right. Right. >> So when you have the data scientists here talking about the Hadoop cluster, if they didn't want to talk about the hardware and the software, what were you helping them with? How did you engage with the customers here at the lab? >> So the example that I gave is for the data scientist that we newly hired for our team so we had to set up environments for them. 
So that was the example, but the same thing applies for a customer as well. So again to help them in solving the problem we tried to package some of the things as part of our product and deliver it to them so it's easy for them to deploy and get started on things. >> Now the other piece that's included and again is not in this room is the services -- >> Right. >> And the support so you guys have a full team of professional services. Once you configure and figure out what the optimum solution is for them then you got a team that can actually go deploy it at their actual site. >> So we have packaged things even for our services. So the services would go to the customer site. They would apply the solution and download and deploy our packages and be able to demonstrate how easy it is. Think of them as tutorials, if you like. So here are the tutorials. Here's how you run various models. So here's how easy it is for you to get started. So that's what they would train the customer on. So there's not just the deployment piece of it but just packaging things for them so they can show customers how to get started quickly, how everything works and kind of give a green check mark if you will. >> So what are some of your favorite applications that people are using these things for? Do you get involved in the applications stack on the customer side? What are some of the fun use cases that people are using your technology to solve? >> So for the application, my project is about machine learning on Hadoop. We are packaging Cloudera's CDSW, that's Cloudera Data Science Workbench, as part of the product. So that allows data scientists access to the Hadoop cluster while abstracting the complexities of the cluster. So they can access the cluster. They can access the data. They can have security without worrying about all the intricacies of the cluster. In addition to that they can create different projects, have different libraries in different projects. 
So they don't have to conflict with each other and also they can add users to it. They can work collaboratively. So basically it's to help data scientists, software developers, do their job and not worry about the infrastructure. >> Right. >> They should not be. >> Right great. Well Bala it's pretty exciting place to work. I'm sure you're having a ball. >> Yes I am thank you. >> All right. Well thanks for taking a few minutes with us and really enjoyed the conversation. >> I appreciate it thank you. >> All right he's Bala. I'm Jeff. You're watching theCUBE from Austin, Texas at the Dell EMC High Performance Computing and Artificial Intelligence Labs. Thanks for watching. (techno music)

Published Date : Aug 7 2018

SUMMARY :

He is in the technical staff as a systems engineer. I mean you guys have literally everything in there. And not just the gear, And the project that I'm working on, but it's not that easy to do. So those are some of the things that we try to do for them. So some of the things that we do, They have to worry about the models and accuracy and all the kind of surrounding components of that as well. Right. So the example that I gave is for the data scientist And the support so you guys So the services would go to the customer side. So for the application my project is about Well Bala it's pretty exciting place to work. All right. at the Dell EMC High Performance Computing

ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
Bala Chandrasekaran	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Jeff	PERSON	0.99+
Bala	PERSON	0.99+
Austin, Texas	LOCATION	0.99+
AI Innovation Lab	ORGANIZATION	0.99+
one	QUANTITY	0.98+
Dell EMC High Performance Computing	ORGANIZATION	0.98+
Dell EMC	ORGANIZATION	0.98+
Cloudera	ORGANIZATION	0.97+
Dell EMC HPC	ORGANIZATION	0.96+
Hadoop	TITLE	0.95+
S.A.P.	ORGANIZATION	0.94+
Artificial Intelligence Labs	ORGANIZATION	0.87+
NYC	LOCATION	0.85+
theCUBE	ORGANIZATION	0.83+
Silicone Valley	LOCATION	0.79+
Hadoop Summit	EVENT	0.78+
Big Data	EVENT	0.72+
Strata	EVENT	0.58+
Hadoop World	EVENT	0.44+
Hadoop	ORGANIZATION	0.41+

Michael Bennett, Dell EMC | Dell EMC: Get Ready For AI

(energetic electronic music) >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're in a very special place. We're in Austin, Texas at the Dell EMC HPC and AI Innovation Lab. High performance computing, artificial intelligence. This is really where it all happens. Where the engineers at Dell EMC are putting together these ready-made solutions for the customers. They got every type of application stack in here, and we're really excited to have our next guest. He's right in the middle of it, he's Michael Bennett, Senior Principal Engineer for Dell EMC. Mike, great to see you. >> Great to see you too. >> So you're working on one particular flavor of the AI solutions, and that's really machine learning with Hadoop. So tell us a little bit about that. >> Sure yeah, the product that I work on is called the Ready Solution for AI Machine Learning with Hadoop, and that product is a Cloudera Hadoop distribution on top of our Dell PowerEdge servers. And we've partnered with Intel, who has released a deep learning library, called BigDL, to bring both the traditional machine learning capabilities as well as deep learning capabilities to the product. The product also adds a data science workbench that's released by Cloudera. And this tool allows the customer's data scientists to collaborate together, provides them secure access to the Hadoop cluster, and we think all-around makes a great product to allow customers to gain the power of machine learning and deep learning in their environment, while also kind of reducing some of those overhead complexities that IT often faces with managing multiple environments, providing secure access, things like that. >> Right, cause the big knock always on Hadoop is that it's just hard. It's hard to put in, there aren't enough people, there aren't enough experts. So you guys are really offering a pre-bundled solution that's ready to go? >> Correct, yeah. 
We've built seven or eight different environments going in the lab at any time to validate different hardware permutations that we may offer of the product as well as, we've been doing this since 2009, so there's a lot of institutional knowledge here at Dell to draw on when building and validating these Hadoop products. Our Dell services team has also been going out installing and setting these up, and our consulting services has been helping customers fit the Hadoop infrastructure into their IT model. >> Right, so is there one basic configuration that you guys have? Or have you found there's two or three different standard-use cases that call for two or three different kinds of standardized solutions? >> We find that most customers are preferring the R740xd series. This platform can hold 12 3.5-inch form-factor drives in the front, along with four in the mid-plane, while still providing four SSDs in the back. So customers get a lot of versatility with this. It's also won several Hadoop benchmarking awards. >> And do you find, when you're talking to customers or you're putting this together, that they've tried themselves and they've tried to kind of stitch together and cobble together the open-source proprietary stuff all the way down to network cards and all this other stuff to actually make the solution come together? And it's just really hard, right? >> Yeah, right exactly. What we hear over and over from our product management team is that their interactions with customers come back with customers saying it's just too hard. They get something that's stable and they come back and they don't know why it's no longer working. They have customized environments that each developer wants for their big data analytics jobs. Things like that. So yeah, overall we're hearing that customers are finding it very complex. >> Right, so we hear time and time again that same thing. And even though we've been going to Hadoop Summit and Hadoop World and Strata, since 2010. 
The momentum seems to be a little slower in terms of the hype, but now we're really moving into heavy-duty real time production and that's what you guys are enabling with this ready-made solution. >> So with this product, yeah, we focused on enabling Apache Spark on the Hadoop environment. And that Apache Spark distributed computing has really changed the game as far as what it allows customers to do with their analytics jobs. No longer are we writing things to disk, but multiple transformations are being performed in memory, and that's also a big part of what enables the BigDL library that Intel released for the platform to train these deep-learning models. >> Right, cause Spark enables the real-time analytics, right? Now you've got streaming data coming into this thing, versus the batch which was kind of the classic play of Hadoop. >> Right and not only do you have streaming data coming in, but Spark also enables you to load your data in memory and perform multiple operations on it. And draw insights that maybe you couldn't before with traditional MapReduce jobs. >> Right, right. So what gets you excited to come to work every day? You've been playing with these big machines. You're in the middle of nerd nirvana I think-- >> Yeah exactly. >> With all of the servers and spinning disks. What gets you up in the morning? What are you excited about, as you see AI get more pervasive within the customers and the solutions that you guys are enabling? >> You know, for me, what's always exciting is trying new things. We've got this huge lab environment with all kinds of lab equipment. So if you want to test a new iteration, let's say tiered HDFS storage with SSDs and traditional hard drives, throw it together in a couple of hours and see what the results are. If we wanted to add new PCIe devices like FPGAs for the inference portion of the deep-learning development we can put those in our servers and try them out. 
So I enjoy that, on top of the validated, thoroughly-worked-through solutions that we offer customers, we can also experiment, play around, and work towards that next generation of technology. >> Right, 'cause any combination of hardware that you basically have at your disposal to try together and test and see what happens? >> Right, exactly. And this is my first time actually working at an OEM, and so I was surprised, not only do we have access to anything that you can see out in the market, but we often receive test and development equipment from partners and vendors, that we can work with and collaborate with to ensure that once the product reaches market it has the features that customers need. >> Right, what's the one thing that trips people up the most? Just some simple little switch configuration that you think is like a minor piece of something, that always seems to get in the way? >> Right, or switches in general. I think that people focus on the application because the switch is so abstracted from what the developer or even somebody troubleshooting the system sees, that oftentimes it's some misconfiguration or some typo that was entered during the switch configuration process that throws customers off or has somebody scratching their head, wondering why they're not getting the kind of performance that they thought. >> Right, well that's why we need more automation, right? That's what you guys are working on. >> Right, yeah, exactly. >> Keep the fat-finger typos out of the config settings. >> Right, consistent, reproducible. None of that "I did it yesterday and it worked, I don't know what changed." >> Right, alright Mike. Well thanks for taking a few minutes out of your day, and don't have too much fun playing with all this gear. >> Awesome, thanks for having me. >> Alright, he's Mike Bennett and I'm Jeff Frick. You're watching The Cube, from Austin Texas at the Dell EMC High Performance Computing and AI Labs. Thanks for watching. (energetic electronic music)

Published Date : Aug 7 2018

ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
Michael Bennett	PERSON	0.99+
two	QUANTITY	0.99+
Mike Bennett	PERSON	0.99+
Dell	ORGANIZATION	0.99+
seven	QUANTITY	0.99+
Mike	PERSON	0.99+
Dell EMC	ORGANIZATION	0.99+
The Cube	TITLE	0.99+
yesterday	DATE	0.99+
2010	DATE	0.99+
Austin, Texas	LOCATION	0.98+
both	QUANTITY	0.98+
Austin Texas	LOCATION	0.98+
Spark	TITLE	0.98+
2009	DATE	0.98+
R7-40XC	COMMERCIAL_ITEM	0.98+
Intel	ORGANIZATION	0.98+
each developer	QUANTITY	0.98+
AI Innovation Lab	ORGANIZATION	0.97+
Hadoop	TITLE	0.97+
first time	QUANTITY	0.96+
Dell EMC High Performance Computing	ORGANIZATION	0.96+
four	QUANTITY	0.95+
one	QUANTITY	0.94+
Apache	ORGANIZATION	0.94+
one thing	QUANTITY	0.93+
The Cube	ORGANIZATION	0.92+
12 3 1/2"	QUANTITY	0.92+
Dell EMC HPC	ORGANIZATION	0.9+
three different standard-use cases	QUANTITY	0.9+
eight different environments	QUANTITY	0.89+
three different	QUANTITY	0.88+
Stratus	ORGANIZATION	0.83+
Hadoop World	ORGANIZATION	0.79+
one basic configuration	QUANTITY	0.76+
AI Labs	ORGANIZATION	0.74+
four SSDs	QUANTITY	0.73+
Cloudera	TITLE	0.71+
Hadoop Summit	EVENT	0.69+
hours	QUANTITY	0.67+
Hadoop benchmarking awards	TITLE	0.67+
Sparks	COMMERCIAL_ITEM	0.48+
Hadoop	COMMERCIAL_ITEM	0.34+

Day Two Kickoff | DataWorks Summit 2018

>> Live from San Jose, in the heart of Silicon Valley, it's theCube. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to day two of theCube's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight along with my co-host James Kobielus. James, it's great to be here with you in the hosting seat again. >> Day two, yes. >> Exactly. So here we are, this conference, 2,100 attendees from 32 countries, 23 industries. It's a relatively big show. They do three of them during the year. One of the things that I really-- >> It's a well-established show too. I think this is like the 11th year since Yahoo started up the first Hadoop summit in 2008. >> Right, right. >> So it's an established event, yeah go. >> Exactly, exactly. But I really want to talk about Hortonworks the company. This is something that you had brought up in an analyst report before the show started and that was talking about Hortonworks' cash flow positivity for the first time. >> Which is good. >> Which is good, which is a positive sign and yet what are the prospects for this company's financial health? We're still not seeing really clear signs of robust financial growth. >> I think the signs are good for the simple reason they're making significant investments now to prepare for the future that's almost inevitable. And the future that's almost inevitable, and when I say the future, the 2020s, the decade that's coming. Most of their customers will shift more of their workloads, maybe not entirely yet, to public cloud environments for everything they're doing, AI, machine learning, deep learning. And clearly the beneficiaries of that trend will be the public cloud providers, all of whom are Hortonworks' partners and established partners, AWS, Microsoft with Azure, Google with, you know, Google Cloud Platform, IBM with IBM Cloud. Hortonworks, and this is... 
You know, their partnerships with these cloud providers go back several years so it's not a new initiative for them. They've seen the writing on the wall practically from the start of Hortonworks' founding in 2011 and they now need to go deeper towards making their solution portfolio capable of being deployable on-prem, in cloud, public clouds, and in various and sundry funky combinations called hybrid multi-clouds. Okay, so, they've been making those investments in those partnerships and in public cloud enabling the Hortonworks Data Platform. Here at this show, DataWorks 2018 here in San Jose, they've released the latest major version, HDP 3.0 of their core platform with a lot of significant enhancements related to things that their customers are increasingly doing-- >> Well I want to ask you about those enhancements. >> But also they have partnership announcements, the deep ones of integration and, you know, lift and shift of the Hortonworks portfolio of HDP with Hortonworks DataFlow and DataPlane Services, so that those solutions can operate transparently on those public cloud environments as the customers, as and when the customers choose to shift their workloads. 'Cause Hortonworks really... You know, like Scott Gnau yesterday, I mean just laid it on the line, they know that the more of the public cloud workloads will predominate now in this space. They're just making these speculative investments that they absolutely have to now to prepare the way. So I think this cost that they're incurring now to prepare their entire portfolio for that inevitable future is the right thing to do and that's probably why they still have not attained massive rock and rollin' positive cash flow yet but I think that they're preparing the way for them to do so in the coming decade. >> So their financial future is looking brighter and they're doing the right things. >> Yeah, yes. >> So now let's talk tech. And this is really where you want to be, Jim, I know you. 
>> Oh I get sleep now and I don't think about tech constantly. >> So as you've said, they're really doing a lot of emphasis now on their public cloud partnerships. >> Yes. >> But they've also launched several new products and upgrades to existing products, what are you seeing that excites you and that you think really will be potential game changers? >> You know, this is geeky but this is important 'cause it's at the very heart of Hortonworks Data Platform 3.0, containerization of more... When you're a data scientist, and you're building a machine learning model using data that's maintained, and is persisted, and processed within Hortonworks Data Platform or any other big data platform, you want the ability increasingly for developing machine learning, deep learning, AI in general, to take that application you might build while you're using TensorFlow models, that you build on HDP, they will containerize it in Docker and, you know, orchestrate it all through Kubernetes and all that wonderful stuff, and deploy it out, those AI, out to increasingly edge computing, mobile computing, embedded computing environments where, you know, the real venture capital mania's happening, things like autonomous vehicles, and you know, drones, and you name it. So the fact is that Hortonworks has made that in many ways the premier new feature of HDP 3.0 announced here this week at the show. That very much harmonizes with what their partners, where their partners are going with containerization of AI. IBM, one of their premier partners, very recently, like last month, I think it was, announced the latest version of IBM, what do they call it, IBM Cloud Private, which has embedded as a core feature containerization within that environment which is a prem-based environment of AI and so forth. The fact that Hortonworks continues to maintain close alignment with the capabilities that its public cloud partners are building to their respective portfolios is important. 
But also Hortonworks with its, they call it, you know, a single pane of glass, the DataPlane Services for metadata and monitoring and governance and compliance across this sprawling hybrid multi-cloud, these scenarios. The fact that they're continuing to make, in fact, really focusing on deep investments in that portfolio, so that when an IBM introduces or, AWS, whoever, introduces some new feature in their respective platforms, Hortonworks has the ability to, as it were, abstract above and beyond all of that so that the customer, the developer, and the data administrator, all they need to do, if they're a Hortonworks customer, is stay within the DataPlane Services environment to be able to deploy with harmonized metadata and harmonized policies, and harmonized schemas and so forth and so on, and query optimization across these sprawling environments. So Hortonworks, I think, knows where their bread is buttered and it needs to stay on the DPS, DataPlane Services, side which is why a couple months ago in Berlin, Hortonworks made, I think, the most significant announcement of the year for them and really for the industry, which was that they announced the Data Steward Studio in Berlin. That really clearly was to address the GDPR mandate that was coming up, but really data stewardship as an end-to-end workflow for lots of, you know, core enterprise applications, absolutely essential. Data Steward Studio is a DataPlane Service that can operate across multi-cloud environments. Hortonworks is going to keep on, you know... They didn't have any DPS, DataPlane Services, announcements here in San Jose this week but you can best believe that next year at this time at this show, and in the interim they'll probably have a number of significant announcements to deepen that portfolio. Once again it's to grease the wheels towards a more purely public cloud future in which there will be Hortonworks DNA inside most of their customers' environments going forward. 
>> I want to ask you about themes of this year's conference. The thing is is that you were in Berlin at the last big Hortonworks DataWorks Summit. >> (speaks in foreign language) >> And really GDPR dominated the conversations because the new rules and regulations hadn't yet taken effect and companies were sort of bracing for what life was going to be like under GDPR. Now the rules are here, they're here to stay, and companies are really grappling with it, trying to understand the changes and how they can exist in this new regime. What would you say are the biggest themes... We're still talking about GDPR, of course, but what would you say are the bigger themes that are this week's conference? Is it scalability, is it... I mean, what would you say we're going, what do you think has dominated the conversations here? >> Well scalability is not the big theme this week though there are significant scalability announcements this week in the context of HDP 3.0, the ability to persist in a scale-out fashion across multi-cloud, billions of files. Storage efficiency is an important piece of the overall announcement with support for erasure coding, blah blah blah. That's not, you know, that's... Already, Hortonworks, like all of their cloud providers and other big data providers, provide very scalable environments for storage, workload management. That was not the hugest, buzzy theme in terms of the announcements this week. The buzz of course was HDP 3.0. Containerization, that's important, but you know, we just came out of the day two keynote. AI is not a huge focus yet for a lot of the Hortonworks customers who are here, the developers. They're, you know, most of their customers are not yet that far along in their deep learning journeys and whatever but they're definitely going there. There's plenty of really cool keynote discussions including the guy with the autonomous vehicles or whatever that, the thing we just came out of. 
That was not the predominant theme this week here in terms of the HDP 3.0. I think what it comes down to is that with HDP 3.0... Hive, though you tend to take it for granted, it's been in Hadoop from the very start, practically, Hive is now a full enterprise database and that's the core, one of the cores, of HDP 3.0. Hive itself, Hive 3.0 now is its version, is ACID compliant and that may be totally geeky to most of the world but that enables it to support transactional applications. So more big data in every environment is supporting more traditional enterprise application, transactional applications that require like two-phase commit and all that goodness. The fact is, you know, Hortonworks, from what I can see, is the first of the big data vendors to incorporate those enhancements to Hive 3.0 because they're so completely tuned in to the Hive environment in terms of a committer. I think in many ways that is the predominant theme in terms of the new stuff that will actually resonate with the developers, their customers here at the show. And with the, you know, enterprises in general, they can put more of their traditional enterprise application workloads on big data environments and specifically, Hortonworks hopes, its HDP 3.0. >> Well I'm excited to learn more here on theCube with you today. We've got a lot of great interviews lined up and a lot of interesting content. We got a great crew too so this is a fun show to do. >> Sure is. >> We will have more from day two of the.

Published Date : Jun 20 2018

ENTITIES

Entity	Category	Confidence
James Kobielus	PERSON	0.99+
Rebecca Knight	PERSON	0.99+
Hortonworks'	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
2011	DATE	0.99+
Jim	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Berlin	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
San Jose	LOCATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
James	PERSON	0.99+
23 industries	QUANTITY	0.99+
Yahoo	ORGANIZATION	0.99+
San Jose, California	LOCATION	0.99+
Hive 3.0	TITLE	0.99+
2020s	DATE	0.99+
next year	DATE	0.99+
this week	DATE	0.99+
32 countries	QUANTITY	0.99+
Hive	TITLE	0.99+
11th year	QUANTITY	0.99+
yesterday	DATE	0.99+
first time	QUANTITY	0.99+
GDPR	TITLE	0.98+
last month	DATE	0.98+
DataPlane Services	ORGANIZATION	0.98+
One	QUANTITY	0.98+
Scott Gnau	PERSON	0.98+
2008	DATE	0.98+
three	QUANTITY	0.98+
2,100 attendees	QUANTITY	0.98+
HDP 3.0	TITLE	0.98+
today	DATE	0.98+
Data Steward Studio	ORGANIZATION	0.98+
two-phase	QUANTITY	0.98+
one	QUANTITY	0.97+
DataWorks Summit 2018	EVENT	0.96+
DataPlane	ORGANIZATION	0.96+
Day two	QUANTITY	0.96+
billions of files	QUANTITY	0.95+
first	QUANTITY	0.95+
day two	QUANTITY	0.95+
DPS	ORGANIZATION	0.95+
Data Platform 3.0	TITLE	0.94+
Hortonworks DataWorks Summit	EVENT	0.94+
DataWorks	EVENT	0.92+

James Markarian, SnapLogic | SnapLogic Innovation Day 2018

>> Announcer: From San Mateo, California, it's theCUBE! Covering SnapLogic, Innovation Day, 2018. Brought to you by SnapLogic. >> Hey welcome back everybody, Jeff Frick here with theCUBE. We are in San Mateo, at what they call the crossroads, it's 92 and 101. If you're coming by and probably sitting in traffic, look up and you'll see SnapLogic. It's their new offices. We're really excited to be here for Innovation Day. We're excited to have their CTO, James Markarian. James, great to see you and I guess, the last time we talked was a couple years ago in New York City. >> Yeah that's right, and why was I there? It was like a big data show. >> That's right. >> And here we are two years later talking about big data. >> Big data, big data is fading a little bit, because now big data is really an engine, that's powering this new thing that's so exciting, which is all about analytics, and machine learning, and we're going to eventually stop saying artificial intelligence and say augmented intelligence, 'cause there's really nothing artificial about it. >> Yeah and we might stop saying big data and just talk about data because it's becoming so ubiquitous. >> Jeff: Right. >> I know that big data, it's not necessarily going away but how we're thinking about handling it has kind of evolved over time, especially in the last couple of years. >> Right. >> That's what we're kind of seeing from our customers. >> 'Cause it's kind of an ingredient now, right? It's no longer this new shiny object now. It's just part of the infrastructure that helps you get everything else done. >> Yeah, and I think when you think about it, from like, an enterprise point of view, that shift is going from experimentation to operationalizing. 
I think that the things you look for in experimentation, there's, like, one set of things you're looking for: proving out the overall value, regardless maybe of cost and uptime and other things, and as you operationalize you start thinking about other considerations that obviously Enterprise IT has to think about. >> Right, so if you think back to, like, Hadoop Summit and Hadoop World, when they were first cutting their teeth, like in 2010 or around that time frame, one of the big discussions that always comes up and that was before kind of the rise of public cloud, you know which has really taken off over the last several years, there's this kind of ongoing debate between, do you move the data to the compute or do you move the compute to the data? There was always like, this monster data gravity issue which was almost insurmountable and many would say, oh, you're never going to get all your data into the cloud. It's just way too hard and way too expensive. But, now Amazon has Snowball and Snowball isn't big enough. They actually had a diesel truck that'll come and help you come move your data. Amazon rolled that thing across the stage a couple of years ago. The data gravity thing seems to be less and if you think of a world with infinite compute, infinite storage, infinite networking, asymptotically approaching zero, not necessarily good news for some vendors out there but that's a world that we're eventually getting to that changes the way that you organize all this stuff. >> Yeah, I think so and so much has changed. I was fortunate to be one of the early speakers, at, like, Hadoop Worlds and everything, and I was adamantly proclaiming you know, the destiny of Hadoop as bright and shiny and there's this question about what really happened. I think that there's a kind of a few different variables that kind of shifted at the same time. One, is of course, this like glut of computing in the cloud happened and there are so many variables moving at once. 
It's like, how much time do you have Jeff? >> Ask them to get a couple more drinks for us. >> Seeing our lovely new headquarters here and one of the things is that there is no big data center. We have a little closet with some of the servers we keep around but mostly, everything we do is on Amazon. You're even looking at things like, commercial real estate is changing because I don't need all the cooling and the power and the space for my data center that I once had. >> Jeff: Right, right. >> I become a lot more space efficient than I used to be and so the cloud is really kind of changing everything. On the data side, you mention this like, interesting philosophical shift, going from I couldn't possibly do it in the cloud to why in the world would we not do things in the cloud. Maybe the one stalwart in there being some fears about security. Obviously there's been a lot of breaches. I think that there's still a lot of introspection everyone needs to do about, are my on premise systems actually more secure than some of these cloud providers? It's really not clear that we know the answer to that. In fact, we suspect that some of the cloud providers are actually more secure because they are professionals about it and they have the best practice. >> And a whole lot of money. >> The other thing that happened that you didn't mention, that's approaching infinity and we're not quite there yet, is interconnect speeds. So it used to be the case that I have a bunch of mainframes and I have a Teradata system and I have a high speed interconnect that puts the two together. Now with fiber networks and just in general, you can run super high speed, like WAN. Especially if you don't care quite as much about latency. So if 500 millisecond latency is still okay with you. >> Great. >> You can do a heck of a lot and move a lot to the cloud. 
In fact, it's so good, that we went from worrying, could I do this in the cloud at all to well, why wouldn't I do some things in Amazon and some things in Microsoft and some things in Google? Even if it meant replicating my data across all these environments. The backdrop for some of that is, we had a lot of customers and I was thinking that people would approach it this way, they would install on premise Hadoop, whether it's like Apache or Cloudera or the other vendors and I would hire a bunch of folks that are the administrators and retire Teradata and I'm going to put all my ETL jobs on there, etc. It turned out to be a great theory and the practice is real for some folks but it turned out to be moving a lot of things to kind of shifting sands because Hadoop was evolving at the time. A lot of customers were putting a lot of pressure on it, operational pressure. Again, moving from experimentation phase over to like, operational phase. >> Jeff: Right, right. >> When you don't have the uptime guarantee and I can't just hire somebody off the street to administer this, it has to be a very sharp, knowledgeable person that's very expensive, people start saying, what am I really getting from this and can I just dump it all in S3 and apply a bunch of technology there and let Amazon worry about keeping this thing up and running? People start to say, I used to reject that idea and now it's sounding like a very smart idea. >> It's so funny we talk about people, process, and tech all the time, right? But they call them tech shows, they don't call them people and process shows. >> Right. >> At least not the ones we go to but time and time again I remember talking to some people about the Hadoop situation and there's just like, no Hadoop people. Sometimes technology all day long. There just aren't enough people with the skills to actually implement it. It's probably changed now but I remember that was such a big problem. It's funny you talk about security and cloud security. 
You know, at AWS, on Tuesday night of re:Invent, they have a special, kind of a technical keynote and like, James Hamilton would go. In the amount of resources, and I just remember one talk he gave just on their cabling across the ocean, and the amount of resources that he can bring to bear, relative to any individual company, is so different; much less a mid-tier company or a small company. I mean, you can bring so much more resources, expertise and knowledge. >> Yeah, the economies of scale, they're just there. >> They're just crazy. >> That's right and that's why, you know, you sort of assume that the cloud sort of, eventually eats everything. >> Right, right. >> So there's no reason to believe this won't be one of those cases. >> So you guys are getting Extreme. So what is SnapLogic Extreme? >> Well, SnapLogic Extreme is kind of like a response to this trend of data moving from on premise to the cloud and there are some interesting dynamics of that movement. First of all, you need to get data into the cloud, and we've been doing that for years. Connect to everything, dump it in S3, ADLS, etc. No problem. The thing we're seeing with cloud computing is like, there's another interesting shift. Not only is it kind of like more for less, and let Amazon manage all this, and I probably refer to Amazon more than other vendors would appreciate. >> Right, right. They're the leaders so let's call a spade a spade. >> Yeah. >> Certainly Google and Microsoft are out there as well so those are the top three and we've acknowledged that. >> One of the interesting things about it is that what you couldn't really adequately achieve on premises is the burstiness of your compute. I run at a steady state where I need, you know, 10 servers or a 100 servers, but every once in a while, I need like, 1,000 or 10,000 servers to apply to something. So what's the on premise model? 
Rack and stack, 10,000 machines, and it's like waiting for the great pumpkin, waiting for that workload to come that I've been waiting months and months for and maybe it never comes but I've been paying for it. I paid for a software license for the thing that I need to run there. I'm paying for the cabling and the racking and everything and the person administering. Make sure the disks are all operating in the case where it gets used. Now, all of a sudden, we are taking Amazon and they're saying, hey, pay us for what you're using. You can use reserved pricing and pay a lower rate for the things you might actually care about on a consistent basis but then I'm going to allow you to spike, and I'll just run the meter. So this has caused software vendors like us, to look at the way we charge and the way that we deploy our resources and say, hey, that's a very good model. We want to follow that and so we introduced Snaplogic Extreme, which has a few different components. Basically, it enables us to operate in these elastic environments, shift our thinking in pricing so that we don't think about like, node based or god forbid, core based pricing and say like, hey, basically pay us for what you do with your data and don't worry about how many servers it's running on. Let Snaplogic worry about spinning up and spinning down these machines because a lot of these workloads are data integration or application workloads that we know lots about. >> Right. >> So first of all, we manage these ephemeral, what we call ephemeral or elastic clusters. Second of all, the way that we distribute our workload is by generating Spark code currently. We use the same graphic environment that you use for everything but instead of running on our engines, we kind of spit out Spark code on the end that takes advantage of the massive scale out potential for these ephemeral environments. >> Right. 
>> We've also kind of built this in such a way that it's Spark today but it could be like, Native or some other engine like Flink or other things that come up. We really don't care like what back end engine actually is as long as it can run certain types of data oriented jobs. It's actually like lots of things in one. We combine our data acquisition and distribution capability with this like, massive elastic scale out capability. >> Yeah, it's unbelievable how you can spin that up and then of course, most people forget you need to spin it down after the event. >> James: Yeah, that's right. >> We talked to a great vendor who talked about, you know, my customer spends no money with me on the weekend, zero. >> James: Right. >> And I'm thrilled because they're not using me. When they do use me, then they're buying stuff. I think what's really interesting is how that changes. Also, your relationship with your customer. If you have a recurring revenue model, you have to continue to deliver a value. You have to stay close to your customer. You have to stay engaged because it's not a one time pop and then you send them the 15% or 20% maintenance bill. It's really this ongoing relationship and they're actually gaining value from your products each and every time they use that. It's a very different way. >> Yeah, that's right. I think it creates better relationships because you feel like, what we do is proportionate to what they do and vice versa, so it has this fundamental fairness about it, if you will. >> Right, it's a good relationship but I want to go down another path from before we turned the cameras on. Talk a little bit about the race always between the need for compute and the compute. It used to be personified best with Microsoft and Intel. Intel would come out with a new chip and then Microsoft's OS would eat up all the extra capacity and then they'd come up with a new chip and it was an ongoing thing. 
You made an interesting comment that, especially in the cloud world where the scale of these things is much, much bigger, that we're in a world now where the compute and the storage have kind of outpaced the applications, if you will, and there's an opportunity for the application to catch up. Oh by the way, we have this cool new thing called machine learning and augmented intelligence. I wonder if you could, is that what's going to fill or kind of rebalance the consumption pattern? >> Yeah, it seems that way and I always think about kind of like, compute and software spiraling around each other like a helix. >> Like at one point, one is leading the other and they sort of just, one eventually surpasses the other and then you need innovation on the other side. I think for a while, like if you turn the clock way back to like, when the Pentium was introduced and everyone was like, how are we ever going to use all of the compute power. >> Windows 95, whoo! >> You know, power of like the Pentium. Do I really need to run my spreadsheets 100% faster? There's no business value whatsoever in transacting faster, or like general user interface or like graphical user interfaces or rendering web pages. Then you start seeing this new glut, often led by like researchers first. Like, software applications coming up that use all of this power because in academia you can start saying, what if I did have infinite compute? What would I do differently? You see things, you know like VR and advanced gaming, come up on the consumer side. Then I think the real answer on the business side is AI and ML. The general trend I start thinking of is something I used to talk about, back in the old days, which is conversion of like, having machines work for us instead of us working for machines. 
The only way we're ever going to get there is by having higher and higher intelligence on the application side so that it kind of intuits more based on what it's seen before and what it knows about you, etc., in terms of the task that needs to get done. Then there's this whole new breed of person that you need in order to wield all that power because like Hadoop, it's not just natural. You don't just have people floating around like, hey, you know, I'm going to be an Oozie expert or a YARN expert. You don't run into people every day who are like, oh, yeah, I know neural nets well. I'm a gradient descent expert or whatever your model is. It's really going to drive like, lots of changes I think. >> Right, well hopefully it does and especially like we were talking about earlier, you know, within core curriculums at schools and stuff. We were at Grace Hopper and Brenda Wilkerson, the new head of the Anita Borg organization, was at this Chicago public school district and they're actually starting to make CS a requirement, along with biology and physics and chemistry and some of these other things. >> Right. >> So we do have a huge, a huge dearth of that but I want to just close out on one last concept before I let you go and you guys are way on top of this. Greg talked about what you just talked about, which is making the computers work for us versus the other way around. That's where the democratization of the power comes in. We heard a lot about the democratization of big data and the tools, and now you guys are talking about the democratization of the integration, especially when you have a bunch of cloud based applications that everybody has access to and maybe, needs to stitch together a different way. But when you look at this whole concept of democratization of that power, how do you see that kind of playing out over the next several years? >> Yeah, that's a very big- >> Sorry I didn't bring you a couple of beers before I brought that up. 
>> Oh no, I got you covered. So it's a very big, interesting question because I think that you know, first of all, it's one of these, god knows, we can't predict with a lot of accuracy how exactly that's going to look because we're sort of juxtaposing two things. One is, part of the initial move to the cloud was the failure to properly democratize data inside the enterprise, for whatever reason, and we didn't do it. Now we have the computer resources and the central, kind of web based access to everything. Great. Now we have Cambridge Analytica and like, Facebook and people really thinking about data privacy and the fact that we want ubiquitous safe access. I think we know how to make things ubiquitous. The question is, do we know how to make it safe and fair so that the right people are using the right data in the right way? It's a little bit like, you know, there's all these cautionary tales out there like, beware of AI and robotics and everything and nobody really thinks about the danger of the data that's there. It's a much more immediate problem and yet it's sort of like the silent killer until some scandal comes up. We start thinking about these different ways we can tackle it. Obviously there's great solutions for tokenization and encryption and everything at the data level but even if you have the access to it, the question is, how do you control that wildfire that could happen as soon as the horse leaves the barn. Maybe not in its current form, but when you look at things like Blockchain, there's been a lot of predictions about how Blockchain can be used around like, data. I think that with this privacy and this curation and tracking of who has the data, who has access to it and can we control it, I think you are looking at even more like, centralized and guarded access to this private data. >> Great, interesting times. >> Yeah, yeah Jeff, for sure. >> Alright James, well thanks for taking a couple of minutes with us. I really enjoyed the conversation. 
>> Yeah, it's always great. Thanks for having me Jeff. >> He's James, I'm Jeff, and you're watching theCUBE. We're at the SnapLogic headquarters in San Mateo, California. Thanks for watching. (electronic music)

Published Date : May 21 2018

SUMMARY :

Brought to you by SnapLogic. James, great to see you and I guess, Yeah that's right, and why was I there? and we're going to eventually stop saying Yeah and we might stop saying big data especially in the last couple of years. that helps you get everything else done. Yeah, and I think when you think about it, from like, that changes the way that you organize all this stuff. and I was adamantly proclaiming you know, and one of the things is that there is no big data center. On the data side, you mention this like, that puts the two together. and I'm going to put all my ETL jobs on there, etc. and I can't just hire somebody off the street processing tech all the time, right? and the amount of resources that he can bring to bear, That's right and that why you know, So there's no reason to believe So you guys are getting Extreme. First of all, you need to get data into the cloud, They're the leaders so let's call a spade a spade. Certainly Google and Microsoft are out there as well so for the things you might actually care Second of all, the way that we distribute It's actually like lots of things in one. Yeah, it's unbelievable how you can spin that up you know, my customer spends no money you have to continue to deliver a value. I think it creates better relationships because you feel have kind of, outpaced the applications, if you will, Yeah, it seems that way and I always think and then you need innovation on the other side. in terms of the task that needs to get done. and they're actually starting to make CS a requirement, of the integration, especially when you have Sorry I didn't bring you a couple of beer before and fair so that the right people are using I really enjoyed the conversation. Yeah, it's always great. We're at the Snaplogic headquarters in

ENTITIES

Entity	Category	Confidence
James	PERSON	0.99+
Jeff	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Jeff Frick	PERSON	0.99+
James Markarian	PERSON	0.99+
James Hamilton	PERSON	0.99+
Greg	PERSON	0.99+
Google	ORGANIZATION	0.99+
100 servers	QUANTITY	0.99+
15%	QUANTITY	0.99+
20%	QUANTITY	0.99+
San Mateo	LOCATION	0.99+
2010	DATE	0.99+
AWS	ORGANIZATION	0.99+
10 servers	QUANTITY	0.99+
New York City	LOCATION	0.99+
1,000	QUANTITY	0.99+
10,000 machines	QUANTITY	0.99+
Brenda Wilkerson	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
Spark	TITLE	0.99+
10,000 servers	QUANTITY	0.99+
100%	QUANTITY	0.99+
Intel	ORGANIZATION	0.99+
SnapLogic	ORGANIZATION	0.99+
Tuesday night	DATE	0.99+
San Mateo, California	LOCATION	0.99+
Windows 95	TITLE	0.99+
One	QUANTITY	0.99+
500 millisecond	QUANTITY	0.99+
two years later	DATE	0.98+
two things	QUANTITY	0.98+
Snaplogic	ORGANIZATION	0.98+
one time	QUANTITY	0.97+
two	QUANTITY	0.97+
one	QUANTITY	0.97+
Innovation Day	EVENT	0.97+
Second	QUANTITY	0.96+
Cambridge Analytica	ORGANIZATION	0.96+
Chicago	LOCATION	0.96+
S3	TITLE	0.95+
Flank	ORGANIZATION	0.95+
First	QUANTITY	0.94+
theCUBE	ORGANIZATION	0.94+
today	DATE	0.93+
Grace Hopper	PERSON	0.93+
first	QUANTITY	0.93+
SnapLogic Innovation Day 2018	EVENT	0.92+
one point	QUANTITY	0.92+
Pentium	COMMERCIAL_ITEM	0.92+
last couple of years	DATE	0.9+
one last concept	QUANTITY	0.9+
one talk	QUANTITY	0.88+
one set	QUANTITY	0.88+
zero	QUANTITY	0.87+
Snaplogic Extreme	ORGANIZATION	0.85+
Anita Borg	ORGANIZATION	0.84+
couple years ago	DATE	0.82+
couple of years ago	DATE	0.81+

James Markarian, SnapLogic | SnapLogic Innovation Day 2018

Published Date : May 19 2018


Janet George, Western Digital | Western Digital the Next Decade of Big Data 2017

>> Announcer: Live from San Jose, California, it's theCUBE, covering Innovating to Fuel the Next Decade of Big Data, brought to you by Western Digital. >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're at Western Digital at their global headquarters in San Jose, California, it's the Almaden campus. This campus has a long history of innovation, and we're excited to be here, and probably have the smartest person in the building, if not the county, area code and zip code. I love to embarrass her. Janet George, she is the Fellow and Chief Data Scientist for Western Digital. We saw you at Women in Data Science, you were just at Grace Hopper, you're everywhere, and we get a chance to sit down again. >> Thank you Jeff, I appreciate it very much. >> So as a data scientist, today's announcement about MAMR, how does that make you feel, why is this exciting, how is this going to make you be more successful in your job and more importantly, the areas in which you study? >> So today's announcement is actually a breakthrough announcement, both in the field of machine learning and AI, because we've been on this data journey, and we have been very selectively storing data on our storage devices, and the selection is actually coming from the preconstructed queries that we do with business data, and now we no longer have to preconstruct these queries. We can store the data at scale in raw form. We don't even have to worry about the format or the schema of the data. We can look at the schema dynamically as the data grows within the storage and within the applications. >> Right, 'cause there's been two things, right. Before data was bad 'cause it was expensive to store >> Yes. 
>> Now suddenly we want to store it 'cause we know data is good, but even then, it still can be expensive, but you know, we've got this concept of data lakes and data swamps and data all kind of oceans, pick your favorite metaphor, but we want the data 'cause we're not really sure what we're going to do with it, and I think what's interesting that you said earlier today, is it was schema on write, then we evolved to schema on read, which was all the rage at Hadoop Summit a couple years ago, but you're talking about the whole next generation, which is an evolving dynamic schema >> Exactly. >> Based on whatever happens to drive that query at the time. >> Exactly, exactly. So as we go through this journey, we are now getting independent of schema, we are decoupled from schema, and what we are finding out is we can capture data at its raw form, and we can do the learning at the raw form without human interference, in terms of transformation of the data and assigning a schema to that data. We got to understand the fidelity of the data, but we can train at scale from that data. So with massive amounts of training, the models already know how to train themselves from raw data. So now we are only talking about incremental learning, as the trained model goes out into the field in production, and actually performs, now we are talking about how does the model learn, and this is where fast data plays a very big role. >> So that's interesting, 'cause you talked about that also earlier in your part of the presentation, kind of the fast data versus big data, which kind of maps to flash versus hard drive, and the two are not, it's not either or, but it's really both, because within the storage of the big data, you build the base foundations of the models, and then you can adapt, learn and grow, change with the fast data, with the streaming data on the front end, >> Exactly >> It's a whole new world. 
>> Exactly, so the fast data actually helps us after the training phase, right, and these are evolving architectures. This is part of your journey. As you come through the big data journey you experience this. But for fast data, what we are seeing is, these architectures like Lambda and Kappa are evolving, and especially the Lambda architecture is very interesting, because it allows for batch processing of historical data, and then it allows for what we call a low latency layer or a speed layer, where this data can then be promoted up the stack for serving purposes. And then Kappa architecture's where the data is being streamed near real time, bounded and unbounded streams of data. So this is again very important when we build machine learning and AI applications, because evolution is happening on the fly, learning is happening on the fly. Also, if you think about the learning, we are mimicking more and more how humans learn. We don't really learn with very large chunks of data all at once, right? That's important for initial model training and model learning, but on a regular basis, we are learning with small chunks of data that are streamed to us near real time. >> Right, learning on the Delta. >> Learning on the Delta. >> So what is the bound versus the unbound? Unpack that a little bit. What does that mean? >> So what is bounded is basically saying, hey we are going to get certain amounts of data, so you're sizing the data for example. Unbounded is infinite streams of data coming to you. And so if your architecture can absorb infinite streams of data, like for example, the sensors constantly transmitting data to you, right? At that point you're not worried about whether you can store that data, you're simply worried about the fidelity of that data. But bounded would be saying, I'm going to send the data in chunks. 
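Editor's note: the bounded versus unbounded distinction described here can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not anything Western Digital or the speakers describe building. The names `sensor_stream`, `bounded_chunks`, `looks_healthy`, and `running_mean`, the simulated readings, and the plausibility range are all hypothetical.

```python
import itertools
import random
from typing import Iterable, Iterator, List


def sensor_stream(seed: int = 0) -> Iterator[float]:
    """Unbounded source: a hypothetical sensor that yields readings forever."""
    rng = random.Random(seed)
    while True:
        yield rng.gauss(20.0, 2.0)


def bounded_chunks(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Cut any stream into bounded chunks of at most `size` readings."""
    iterator = iter(stream)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return  # the underlying stream ended, so there are no more chunks
        yield chunk


def looks_healthy(chunk: List[float], low: float = -50.0, high: float = 150.0) -> bool:
    """Cheap up-front check (assumed range): every reading is plausible."""
    return all(low <= value <= high for value in chunk)


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Incremental update: consumes one reading at a time in constant memory."""
    count, mean = 0, 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count
        yield mean


if __name__ == "__main__":
    # Take a finite slice of the unbounded stream so the demo terminates.
    first_ten = itertools.islice(sensor_stream(), 10)
    for chunk in bounded_chunks(first_ten, 4):
        print(len(chunk), looks_healthy(chunk))
```

The unbounded generator never finishes on its own, so a consumer either updates state incrementally, as `running_mean` does, or imposes its own bounds, as `bounded_chunks` does, which also gives a natural place for the up-front health check.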
You could also do bounded where you basically say, I'm going to pre-process the data a little bit just to see if the data's healthy, or if there is signal in the data. You don't want to find that out later as you're training, right? You're trying to figure that out up front. >> But it's funny, everything is ultimately bounded, it just depends on how you define the unit of time, right, 'cause you take it down to infinite zero, everything is frozen. But I love the example of the autonomous cars. We were at the event with, just talking about navigation just for autonomous cars. Goldman Sachs says it's going to be a seven billion dollar industry, and the great example that you used of the two systems working well together, 'cause is it the car sensors or is it the map? >> Janet: That's right. >> And he says, well you know, you want to use the map, and the data from the map as much as you can to set the stage for the car driving down the road to give it some level of intelligence, but if today we happen to be paving lane number two on 101, and there's cones, now it's the real time data that's going to train the system. But the two have to work together, and the two are not autonomous and really can't work independent of each other. >> Yes. >> Pretty interesting. >> It makes perfect sense, right. And why it makes perfect sense is because first the autonomous cars have to learn to drive. Then the autonomous cars have to become an experienced driver. And the experience cannot be learned. It comes on the road. So one of the things I was watching was how insurance companies were doing testing on these cars, and they had a human, a human driving a car, and then an autonomous car. And the autonomous car, with the sensors, was predicting the behavior, every permutation and combination of how a bicycle would react to that car. It was almost predicting what the human on the bicycle would do, like jump in front of the car, and it got it right in 80% of the cases. 
But a human driving a car, we're not sure how the bicycle is going to perform. We don't have peripheral vision, and we can't predict how the bicycle is going to perform, so we get it wrong. Now, we can't transmit that knowledge. If I'm a driver and I just encountered a bicycle, I can't transmit that knowledge to you. But a driverless car can learn, it can predict the behavior of the bicycle, and then it can transfer that information to a fleet of cars. So it's very powerful in where the learning can scale. >> Such a big part of the autonomous vehicle story that most people don't understand, that not only is the car driving down the road, but it's constantly measuring and modeling everything that's happening around it, including bikes, including pedestrians, including everything else, and whether it gets in a crash or not, it's still gathering that data and building the model and advancing the models, and I think that's, you know, people just don't talk about that enough. I want to follow up on another topic. So we were both at Grace Hopper last week, which is a phenomenal experience, if you haven't been, go. I'll just leave it at that. But Dr. Fei-Fei Li gave one of the keynotes, and she made a really deep statement at the end of her keynote, and we were both talking about it before we turned the cameras on, which is, there's no question that AI is going to change the world, and it's changing the world today. The real question is, who are the people that are going to build the algorithms that train the AI? So you sit in your position here, with the power, both in the data and the tools and the compute that are available today, and this brand new world of AI and ML. How do you think about that? How does that make you feel about the opportunity to define the systems that drive the cars, et cetera. >> I think not just the diversity in data, but the diversity in the representation of that data are equally powerful. We need both. 
Because we cannot tackle diverse data, diverse experiences with only a single representation. We need multiple representations to be able to tackle that data. And this is how we will overcome bias of every sort. So it's not the question of will the AI models be built, because the AI models are already being built. It is a question of who is going to build the models, because some of the models have biases built into them from a lack of representation. Like who's building the model, right? So I think it's very important. I think we have a powerful moment in history to change that, to make real impact. >> Because the trick is we all have bias. You can't do anything about it. We grew up in the world in which we grew up, we saw what we saw, we went to our schools, we had our family relationships et cetera. So everyone is locked into who they are. That's not the problem. The problem is the acceptance of bringing in some other, (chuckles) and the combination will provide better outcomes, it's a proven scientific fact. >> I very much agree with that. I also think that having the freedom, having the choice to hear another person's conditioning, another person's experiences is very powerful, because that enriches our own experiences. Even if we are constrained, even if we are like that storage that has been structured and processed, we know that there's this other storage, and we can figure out how to get the freedom between the two points of view, right? And we have the freedom to choose. So that's very, very powerful, just having that freedom. >> So as we get ready to turn the calendar on 2017, which is hard to imagine it's true, it is. You look to 2018, what are some of your personal and professional priorities, what are you looking forward to, what are you working on, what's top of mind for Janet George? >> So right now I'm thinking about genetic algorithms, genetic machine learning algorithms. 
This has been around for a while, but I'll tell you where the power of genetic algorithms is, especially when you're creating a powerful new technology memory cell. So when you start out trying to create a new technology memory cell, you have materials, material deformations, you have process, you have hundreds of permutations and combinations, and with the genetic algorithms, we can quickly assign a cost function and apply survival of the fittest: all that won't fit we can kill, arriving at the fastest, quickest new technology node, and then from there, we can scale that in mass production. So we can use these survival of the fittest mechanisms that evolution has used for a long period of time. So this is biology inspired. And using a cost function we can figure out how to get the best of every process, every technology, all the coupling effects, all the master effects of introducing a program voltage on a particular cell, reducing the program voltage on a particular cell, resetting and setting, and the neighboring effects, we can pull all that together, so 600, 700 permutations and combinations that we've been struggling with and now trying to figure out how to quickly narrow down to that perfect cell, which is the new technology node that we can then scale out into tens of millions of vehicles, right? >> Right, you're going to have to >> Getting to that spot. >> You're going to have to get me on the whiteboard on that one, Janet. That is amazing. Smart lady. >> Thank you. >> Thanks for taking a few minutes out of your time. Always great to catch up, and it was terrific to see you at Grace Hopper as well. >> Thank you, I really appreciate it, I appreciate it very much. >> All right, Janet George, I'm Jeff Frick. You are watching theCUBE. We're at Western Digital headquarters at Innovating to Fuel the Next Generation of Big Data. Thanks for watching.
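
Editor's note: the survival-of-the-fittest search Janet George describes in this interview, where candidates are scored with a cost function and the ones that do not fit are discarded, can be sketched as a toy genetic algorithm. This is a minimal sketch and not Western Digital's method. The cost function `toy_cost`, its target values, the three-parameter candidate, and every tuning parameter of `evolve` are hypothetical stand-ins for real cell parameters and measurements.

```python
import random
from typing import Callable, List, Tuple

Candidate = List[float]


def toy_cost(candidate: Candidate) -> float:
    """Hypothetical cost: squared distance from an arbitrary target setting."""
    target = [1.5, 0.2, 3.0]
    return sum((value - goal) ** 2 for value, goal in zip(candidate, target))


def evolve(
    cost: Callable[[Candidate], float],
    dims: int = 3,
    population_size: int = 40,
    generations: int = 60,
    keep: int = 10,
    mutation_scale: float = 0.3,
    seed: int = 0,
) -> Tuple[Candidate, float]:
    """Keep the fittest candidates each generation and breed replacements."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(0.0, 5.0) for _ in range(dims)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: rank by cost and discard everything outside the top `keep`.
        population.sort(key=cost)
        survivors = population[:keep]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover picks each gene from one parent; mutation adds noise.
            child = [
                rng.choice(pair) + rng.gauss(0.0, mutation_scale)
                for pair in zip(mother, father)
            ]
            children.append(child)
        population = survivors + children
    best = min(population, key=cost)
    return best, cost(best)


if __name__ == "__main__":
    best, best_cost = evolve(toy_cost)
    print(best, best_cost)
```

Because the survivors are carried over unchanged, the best cost can never get worse from one generation to the next. In a real setting the cost function would come from measurements or simulation rather than a fixed target.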

Published Date : Oct 11 2017

SUMMARY :

the Next Decade of Big Data, in San Jose, California, it's the Almaden campus. the preconstructed queries that we do with business data, Right, cause there's been two things, right. of the data and assigning a schema to that data. and especially the Lambda architecture is very interesting, So what is the bound versus the unbound? the sensors constantly transmitting data to you, right? and the great example that you used and the data from the map as much as you can and it got it right 80% of the cases. and advancing the models, and I think that's, So it's not the question of who is going to Because the trick is we all have bias. having the choice to hear another person's conditioning, So as we get ready to turn the calendar on 2017, and the genetic algorithms, we can quickly assign You're going to have to get me on the whiteboard and it was terrific to see you at Grace Hopper as well. I appreciate it very much. at Innovating to Fuel the Next Generation of Big Data.

ENTITIES

Entity	Category	Confidence
Janet George	PERSON	0.99+
Jeff	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Janet	PERSON	0.99+
Western Digital	ORGANIZATION	0.99+
80%	QUANTITY	0.99+
two things	QUANTITY	0.99+
2018	DATE	0.99+
last week	DATE	0.99+
2017	DATE	0.99+
Goldman Sachs	ORGANIZATION	0.99+
San Jose, California	LOCATION	0.99+
two systems	QUANTITY	0.99+
two	QUANTITY	0.99+
today	DATE	0.99+
both	QUANTITY	0.99+
seven billion dollar	QUANTITY	0.99+
Fei-Fei Li	PERSON	0.98+
Almaden	LOCATION	0.98+
two point	QUANTITY	0.97+
one	QUANTITY	0.97+
first	QUANTITY	0.95+
Grace Hopper	ORGANIZATION	0.95+
theCUBE	ORGANIZATION	0.95+
hundred permutation	QUANTITY	0.95+
MAMR	ORGANIZATION	0.94+
Women in Data Science	ORGANIZATION	0.91+
tens of millions of vehicles	QUANTITY	0.9+
one of	QUANTITY	0.89+
Kappa	ORGANIZATION	0.89+
Dr.	PERSON	0.88+
single representation	QUANTITY	0.83+
a couple years ago	DATE	0.83+
earlier today	DATE	0.82+
Next Decade	DATE	0.81+
Lambda	TITLE	0.8+
101	OTHER	0.8+
600, 700 permutation	QUANTITY	0.77+
Lambda	ORGANIZATION	0.7+
of data	QUANTITY	0.67+
keynotes	QUANTITY	0.64+
Hadoop Summit	EVENT	0.62+
zero	QUANTITY	0.6+
number	OTHER	0.55+
Delta	OTHER	0.54+
two	OTHER	0.35+

Nathan Trueblood, DataTorrent | CUBEConversations

(techno music) >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're having a CUBE conversation in the Palo Alto studio. It's a different kind of format of CUBE. Not in the context of a big show. Got a great guest here lined up who we just had on at a show recently. He's Nathan Trueblood, he's the vice president of product management for DataTorrent. Nathan great to see you. >> Thanks for having me. >> We just had you on theCUBE at Hadoop, or Data Works now, >> That's right. >> not Hadoop Summit anymore. So just a quick follow up on that, we were just talking before we turned the cameras on. You said that was a pretty good show for you guys. >> Yeah it was a really great show. In fact as a software company one of the things you really want to see at shows is a lot of customer flow and a lot of good customer discussions, and that's definitely what happened at Data Works. It was also really good validation for us that everyone was coming and talking to us about what can you do from a real time analytics perspective? So that was also a good strong signal that we're onto something in this marketplace. >> It's interesting, I heard your quote from somewhere, that really the streaming and the real time streaming in the big data space is really grabbing all the attention. Obviously we do Spark Summit. We did Flink Forward. So we're seeing more and more activity around streaming and it's so logical that now that we have the compute horsepower, the storage horsepower, the networking horsepower, to enable something that we couldn't do very effectively before but now it's opening up a whole different way to look at data. >> Yeah it really is and I think as someone who's been working in the tech world for a while, I'm always looking for simplifying ways to explain what this means. 'Cause people say streaming and real time and all of that stuff. 
For us what it really comes down to is the faster I can make decisions or the closer to when something happens I can make a decision, that gives me competitive advantage. And so if you look at the whole big data evolution. It's always been towards how quickly can we analyze this data so that we can respond to what it's telling us? And in many ways that means being more responsive to my customer. So a lot of this came out of course originally from very large scale systems at some of the big internet companies like Yahoo where Hadoop was born. But really it all comes down to if I'm more responsive to my customer, I'm more competitive and I win. And I think what a lot of customers are saying across many different verticals is real time means more responsiveness and that means competitive advantage. >> Right and even we hear all the time moving into a predictive model, and then even to a prescriptive model where you're offloading a lot of the grunt work of the decision making, letting the machine do a lot more of that, and so really it's the higher value stuff that finally gets to the human at the end of the interaction who's got to make a judgment. >> That's exactly right, that's right. And so to me all the buzz about streaming is really representative of just this is now the next evolution of where big data architecture has been going which is towards moving away from a batch oriented world into something where we're making decisions as close to the time of data creation as possible. >> So you've been involved in not only tech for a long time but Hadoop specifically and Big Data specifically. And one of the knocks, I remember that first time I ever heard about Hadoop, was actually from Bill Schmarzo at EMC, the Dean of Big Data. And I was talking to a friend of his and he goes yeah but what Bill didn't tell you, there's not enough people. 
You know Hadoop's got all this great promise, there just aren't enough people for all the enterprises at the individual company level to implement this stuff. Huge part of the problem. And now you're at DataTorrent and as we talked before, interesting kind of shift in strategy and going to really an application focus strategy as opposed to more of a platform focus strategy so that you can help people at companies solve problems faster. >> That's right we've definitely focused, especially recently on more of an application strategy. But to kind of peel that back a little bit, you need a platform with all the capabilities that a platform has to be able to deliver large scale operable streaming analytics. But customers aren't looking for platforms, they're looking for please solve my business problem, give me that competitive advantage. I think it's a long standing problem in technology and particularly in Big Data where you build a tremendous platform but there's only a handful of people who know how to actually construct the applications to deliver that value. And I think increasingly in big data but also across all of tech, customers are looking for outcomes now and the way for us to deliver outcomes is to deliver applications that run on our platform. So we've built a tremendous platform and now we are working with customers and delivering applications for that platform so that it takes a lot of the complexity out of the equation for them. And we kind of think of it like if in the past it required sort of an architect level person in order to construct an application on our platform, now we're gearing towards a much larger segment of developers in the enterprise who are tremendously capable but don't have that deep Big Data experience that they need to build an application from scratch. 
>> And it's pretty interesting too 'cause another theme we see over and over and over and over, especially around the innovation theme is the democratization of the access to the data, the democratization of the tools to access the data so that anyone in the company or a much greater set of individuals inside the company have the opportunity to have a hypothesis, to explore the hypothesis, to come back with solutions. And so by kind of removing this ivory tower, either the data scientists or the super smart engineer who's the only one that has the capability to play with the data and the tools. That's really how you open up innovation is democratizing access and ability to test and try things. >> That's right, to me I look at it very simply, when you have large scale adoption of a technology, usually it comes down to simplifying abstractions of one kind or another. And the big simplifying abstraction really of Big Data is providing the ability to break up a huge amount of data and make some sense of it, using of course large scale distributed computing. The abstraction we're delivering at DataTorrent now is building on all that stuff, on all those layers, we've obscured all of that and now you can download with our software an application that produces an outcome. So for example one of the applications we're shipping shortly is a Omni-Channel credit card fraud prevention application. Now our customers in the past have already constructed applications like this on our platform. But now what we're doing like you said is democratizing access to those kinds of applications by providing an application that works out of the box. And that's a simplifying abstraction. Now truthfully there's still a lot of complexity in there but we are providing the pattern, the foundational application that then the customer can focus on customizing to their particular situation, their integrations, their fraud rules and so forth. 
And so that just means getting you closer to that outcome much more quickly. >> Watching your video from Data Works, one of the interesting topics you brought up is really speed and how faster, better, cheaper, which is innovative for a little while, becomes the new norm. And as soon as you reset the bar on speed, then they just want it, well can you go faster. So whether you went from a week to a day, a day to an hour, there's just this relentless pressure to be able to get the data, analyze the data, make a decision faster and faster and faster. And you've seen this just changing by leap years right over time. >> Right and I literally started my career in the days of ETL extracting data from tape that was data produced weeks or months ago, down to now we're analyzing data at volumes that were inconceivable and producing insight in less than a second, which is kind of mind boggling. And I think the interesting thing that's happening when we think about speed, and I've had a few discussions with other folks about this, they say well speed really only matters for some very esoteric applications. It's one of the things that people bring up. But no one has ever said well I wish my data was less fresh or my insight was not as current. And so when you start to look at the kinds of customers that want to bring real time data processing and analytics, it turns out that nearly every vertical that we look at has a whole host of applications where if you could bring real time analytics you could be more responsive to what your customer's doing. >> Right right. >> Right and that can be, certainly that's the case in retail, but we see it in industrial automation and IoT. The way I think of IoT is as a way to sense what's going on in the world, bring that data in, get insight and take action from it. And so real time analytics is a huge part of that, which you know again, healthcare, insurance, banking, all these different places have use cases. 
And so what we're aiming to do at DataTorrent is make it easy for the businesses in those different verticals to really get the outcome they're looking for, not produce a platform and say imagine what you could do, but produce an application that actually delivers on a particular problem they have. >> It's funny too the speed equation, you saw it in Flash, remembering to shift gears a little bit into the hardware space right, is people said well it's only super low latency, super high volume transactions, financial services, is the only benefit we're going to get from Flash. >> Right yeah we've had the same knock for real time analytics. >> Same thing right, but as soon as you put it in, there's all these second order impacts, third order impacts that nobody ever thought of, that speed that delivers, that aren't directly tied to that transactional speed, but now enable you because of that transactional speed, to do so many other things that you couldn't even imagine to do and so that's why I think we see this pervasiveness of Flash, why wouldn't you want Flash? I mean why wouldn't you want to go faster? 'Cause there's so much upside. >> Yeah so again all of these innovations in IT come down to how can I be more flexible and more responsive to changing conditions? More responsive to my customer, more flexible when it comes to changing business conditions and so forth. And so now as we start to instrument the world and have technologies like machine learning and artificial intelligence, that all needs to be fed by data that is delivered as quickly as possible and then it can be analyzed to make decisions in real time. >> So I wanted to shift gears a little bit, kind of back to the application strategies. So you said you had the first app that's going to be, (Jeff drowned out by Nathan) >> Yeah so the first application yes it was fraud prevention. 
That's an important distinction there because the distinction between detection and prevention is the competitive advantage of real time. Because what we deliver in DataTorrent is the ability to process massive amounts of data in a very, very short time frame. Sub-second time frames. And so that's the kind of fundamental capability you need in order to do something like respond to some kind of fraud event. And what we see in the market is that fraud is becoming a greater and greater problem. The market itself is expanding. But I think as we see fraud is also evolving in terms of the ways it can take place across e-commerce and point of sale and so forth. And so merchants and processors and everyone in the whole spectrum of that market is facing a massive problem and an evolving problem. And so that's where we're focused in one of our first I would say vertically oriented business applications is it's really easy to be able to take in new sources of data with our application but also to be able to process all that data and then run it through a decision engine to decide if something is fraudulent or not in a short period of time. So you need to be able to take in all that data to be able to make a good decision. And you need to be able to decide quickly if it's going to matter. And you also need to be able to have a really strong model for making decisions so that you avoid things like false positives which are as big a problem as preventing fraud itself if you deliver bad customer experience. And we've all had that experience as well which is your card gets shut down for what you think is a legitimate activity. >> It's just so ironic that false positives are the biggest problem with credit card fraud. >> Yeah it's one of yeah. >> You would think we would be thankful for a false positive but all you hear over and over and over is that false positive and the customer experience. It shows that we're so good at it is the thing that really irks people. 
>> Well if you think about that, having an application that allows you to make better decisions more quickly and prevent those false positives and take care of fraud is a huge competitive advantage for all the different players in that industry. And it's not just for the credit card companies of course, it's for the whole spectrum of people from the merchant all the way to the bank that are trying to deal with this problem. And so that's why it's one of the applications that we think of as a key example where we see a lot of opportunity. And certainly people that are looking at credit card fraud have been thinking about this problem for a while. But there's the complexity like we were discussing earlier of finding the talent, being able to deliver these kinds of applications, finding the technology that can actually scale to the processing volume. And so by delivering Omni-Channel fraud prevention as a Big Data application, that just puts our customers so much closer to the outcome that they want. And it makes it a lot easier to adopt. >> So as you sit, shift gears a little bit, with your VP of product hat on, and there's a huge wide world of opportunity in front of you, we talked about IoT a little bit, obviously fraud, you've talked about Omni-Channel retail. How are you guys going to figure out where you want to go next? How are you prioritizing the world, and as you build up more of these applications is it going to be vertically focused, horizontally focused, what are your thoughts as you start down the application journey? >> So a few thoughts on that. Certainly one of the key indicators for me as a product manager when I look at where to go next and what applications we should build next, it comes down to what signal are the customers giving us? As we mentioned earlier, we built a platform for real time analytics and decision making, and one of the things that we see is broad adoption across a lot of different verticals. 
So I mentioned industrial IoT and financial services fraud prevention and advertising technology, and, and, and. We have a company that we're working with in GPS geofencing. So the possibilities are pretty interesting. But when it comes to prioritizing those different applications we have to also look at what are the economics involved for the customer and for us. So certainly one of the reasons we chose fraud prevention is that the economics are pretty obvious for our customers. Some of these other things are going to take a little bit longer for the economics to show up when it comes to the applications. So you'll certainly see us focusing on vertically oriented business applications because again the horizontals tend to be more like a platform and it's not close enough to delivering an outcome for a customer. But it's worth noting one of the things we see is that while we will deliver vertically oriented applications that oftentimes switching from one vertical app to another is really not a lot more than changing the kind of data we're analyzing, and changing the decision engine. But the fundamental idea of processing data in a pipeline at very high volume with fault tolerance and low latency, that remains the same in every case. So we see a lot of opportunity essentially as we solve an application in one vertical, to re-skin it into another. >> So you can say you're tweaking the dials and tweaking the UDI. >> Tweaking the data and the rules that you apply to that data. So if you think about Omni-Channel fraud prevention, well it's not that big of a leap to look at healthcare fraud or to look at all the other kinds of fraud in different verticals that you might see. >> Do you ever see that you'll potentially break out the algorithm, I forget which one we're at, people are talking about algorithms as a service. Or is that too much of a bit, does there need to be a little bit more packaging? 
>> No I mean I think there will be cases where we will have an algorithm out of the box that provides some basics for the decision support. But as we see a huge market springing up around AI and machine learning and machine scoring and all of that, there's a whole industry that's growing up around essentially, we provide you the best way to deliver that algorithm or that decision engine, that you train on your data and so forth. So that's certainly an area where we're looking from a partnership perspective. Where we already today partner with some of the AI vendors for what I would say is some custom applications that customers have deployed. But you'll see more of that in our applications coming up in the future. But as far as algorithms as a service, I think that's already here in the form of being able to query against some kind of AI with a question, you know essentially a model and then getting an answer back. >> Right well Nathan, exciting times, and your Big Data journey continues. >> It certainly does, thanks a lot Jeff. >> Thanks Nathan Trueblood from DataTorrent. I'm Jeff Frick, you're watching The CUBE, we'll see you next time, thanks for watching. (techno music)
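
(Editor's illustration, not part of the interview: a minimal Python sketch of the point made above that moving an application from one vertical to another mostly means changing the data and the decision rules, while the processing pipeline stays the same. Every name, field, and threshold below is hypothetical; none of this is a DataTorrent API, and a real deployment would be a distributed, fault-tolerant streaming job rather than a list comprehension.)

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, float]
Rule = Callable[[Event], bool]  # returns True when the event looks suspicious


def run_pipeline(events: Iterable[Event], rules: List[Rule]) -> List[bool]:
    """Flag an event if any rule fires; the pipeline knows nothing about the vertical."""
    return [any(rule(event) for rule in rules) for event in events]


# Hypothetical card-fraud rules (illustrative thresholds only).
card_rules: List[Rule] = [
    lambda e: e["amount"] > 5000,
    lambda e: e["txns_last_minute"] > 10,
]

# Hypothetical healthcare-claim rules: same pipeline, different data and rules.
claim_rules: List[Rule] = [
    lambda e: e["claims_same_day"] > 3,
]

card_flags = run_pipeline(
    [
        {"amount": 42.0, "txns_last_minute": 1},
        {"amount": 9000.0, "txns_last_minute": 2},
    ],
    card_rules,
)
claim_flags = run_pipeline([{"claims_same_day": 5}], claim_rules)
```

Only the rule lists and the event fields differ between the two calls; `run_pipeline` is reused unchanged, which is the sense in which an application can be carried from one vertical into another.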

Published Date : Jul 21 2017


ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
Bill Schmarzo	PERSON	0.99+
Jeff	PERSON	0.99+
Nathan Trueblood	PERSON	0.99+
Nathan	PERSON	0.99+
Yahoo	ORGANIZATION	0.99+
EMC	ORGANIZATION	0.99+
a week	QUANTITY	0.99+
Bill	PERSON	0.99+
DataTorrent	ORGANIZATION	0.99+
first app	QUANTITY	0.99+
Data Works	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
first application	QUANTITY	0.99+
a day	QUANTITY	0.99+
less than a second	QUANTITY	0.99+
second order	QUANTITY	0.98+
one	QUANTITY	0.98+
Hadoop	ORGANIZATION	0.97+
today	DATE	0.97+
third order	QUANTITY	0.97+
an hour	QUANTITY	0.97+
Big Data	ORGANIZATION	0.96+
first	QUANTITY	0.96+
first time	QUANTITY	0.95+
Flash	TITLE	0.94+
Hadoop	PERSON	0.92+
Hadoop	TITLE	0.91+
weeks	DATE	0.85+
one vertical	QUANTITY	0.83+
Hadoop Summit	EVENT	0.81+
The CUBE	ORGANIZATION	0.79+
one of the applications	QUANTITY	0.77+
Flink	ORGANIZATION	0.72+
Omni-Channel	ORGANIZATION	0.72+
UDI	ORGANIZATION	0.7+
Summit	EVENT	0.66+
CUBE	ORGANIZATION	0.57+
CUBEConversations	ORGANIZATION	0.47+
Spark	ORGANIZATION	0.46+
months	QUANTITY	0.43+

Janet George, Western Digital – When IoT Met AI: The Intelligence of Things - #theCUBE

(upbeat electronic music) >> Narrator: From the Fairmont Hotel in the heart of Silicon Valley, it's theCUBE. Covering when IoT met AI, The Intelligence of Things. Brought to you by Western Digital. >> Welcome back here everybody, Jeff Frick here with theCUBE. We are at downtown San Jose at the Fairmont Hotel. When IoT met AI it happened right here, you saw it first. The Intelligence of Things, a really interesting event put on by ReadWrite and Western Digital and we are really excited to welcome back a many-time CUBE alumni and always a fan favorite, she's Janet George. She's Fellow & Chief Data Officer of Western Digital. Janet, great to see you. >> Thank you, thank you. >> So, as I asked you when you sat down, you're always working on cool things. You're always kind of at the cutting edge. So, what have you been playing with lately? >> Lately I have been working on neural networks and TensorFlow. So really trying to study and understand the behaviors and patterns of neural networks, how they work and then unleashing our data at it. So trying to figure out how it's training through our data, how many nets there are, and then trying to figure out what results it's coming with. What are the predictions? Looking at how the predictions are, whether the predictions are accurate or less accurate and then validating the predictions to make it more accurate, and so on and so forth. >> So it's interesting. It's a different tool, so you're learning the tool itself. >> Yes. >> And you're learning the underlying technology behind the tool. >> Yes. >> And then testing it actually against some of the other tools that you guys have, I mean obviously you guys have been doing- >> That's right. >> Mean time between failure analysis for a long long time. >> That's right, that's right. >> So, first off, kind of experience with the tool, how is it different? >> So with machine learning, fundamentally we have to go into feature extraction. 
So you have to figure out all the features and then you use the features for predictions. With neural networks you can throw all the raw data at it. It's in fact data-agnostic. So you don't have to spend enormous amounts of time trying to detect the features. Like for example, if you throw hundreds of cat images at the neural network, the neural network will figure out image features of the cat; the nose, the eyes, the ears and so on and so forth. And once it trains itself through a series of iterations, you can throw a lot of deranged cats at the neural network and it's still going to figure out what the features of a real cat are. >> Right. >> And it will predict the cat correctly. >> Right. So then, how does that apply to, you know, the more specific use case in terms of your failure analysis? >> Yeah. So we have failures and we have multiple failures. Some failures through the human eye, it's very obvious, right? But humans get tired, and over a period of time we can't endure looking at hundreds and millions of failures, right? And some failures are interconnected. So there is a relationship between these failure patterns or there is a correlation between two failures, right? It could be an edge failure. It could be a radial failure, eye pattern type failure. It could be a radial failure. So these failures, for us as humans, we can't escape. >> Right. >> And we used to be able to take these failures and train them at scale and then predict. Now with neural networks, we don't have to take and do all that. We don't have to extract these labels and try to show them what these failures look like. Training is almost like throwing a lot of data at the neural networks. >> So it almost sounds like kind of the promise of the data lake if you will. >> Yes. >> If you have heard about, from the Hadoop Summit- >> Yes, yes, yes. >> For ever and ever and ever. Right? You dump it all in and insights will flow. But we found, often, that that's not true. You need a hypothesis. 
>> Yes, yes. >> You need to structure and get it going. But what you're describing though, sounds much more along kind of that vision. >> Yes, very much so. Now, the only caveat is you need some labels, right? If there is no label on the failure data, it's very difficult for the neural networks to figure out what the failure is. >> Jeff: Right. >> So you have to give it some labels to understand what patterns it should learn. >> Right. >> Right, and that is where the domain experts come in. So we train it with labeled data. So if you are training with a cat, you know the features of a cat, right? In the industrial world, cat is really what's in the heads of people. The domain knowledge is not so authoritative. Like the sky or the animals or the cat. >> Jeff: Right. >> The domain knowledge is much more embedded in the brains of the people who are working. And so we have to extract that domain knowledge into labels. And then you're able to scale the domain. >> Jeff: Right. >> Through the neural network. >> So okay so then how does it then compare with the other tools that you've used in the past? In terms of, obviously the process is very different, but in terms of just pure performance? What are you finding? >> So we are finding very good performance and actually we are finding very good accuracy. Right? So once it's trained, and it's doing very well on the failure patterns, it's getting it right 90% of the time, right? >> Really? >> Yes, but in a machine learning program, what happens is sometimes the model is over-fitted or it's under-fitted or there is bias in the model and you got to remove the bias in the model or you got to figure out, well, is the model false-positive or false-negative? You got to optimize for something, right? >> Right, right. >> Because we are really dealing with mathematical approximation, we are not dealing with preciseness, we are not dealing with exactness. >> Right, right. 
>> In neural networks, actually, it's pretty good, because it's actually always dealing with accuracy. It's not dealing with precision, right? So it's accurate most of the time. >> Interesting, because that's often what's common about the kind of difference between computer science and statistics, right? >> Yes. >> Computer science is binary. Statistics always has a kind of a confidence interval. But what you're describing, it sounds like the confidence is tightening up to such a degree that it's almost reaching binary. >> Yeah, yeah, exactly. And see, brute force is good when your traditional computing programming paradigm is very brute force type paradigm, right? The traditional paradigm is very good when the problems are simpler. But when the problems are of scale, like you're talking 70 petabytes of data or you're talking 70 billion rows, right? Find all these patterns in that, right? >> Jeff: Right. >> I mean you just, the scale at which that operates and at the scale at which traditional machine learning even works is quite different from how neural networks work. >> Jeff: Okay. >> Right? Traditional machine learning you still have to do some feature extraction. You still have to say "Oh I can't." Otherwise you are going to have dimensionality issues, right? It's too broad to get the prediction anywhere close. >> Right. >> Right? And so you want to reduce the dimensionality to get a better prediction. But here you don't have to worry about dimensionality. You just have to make sure the labels are right. >> Right, right. So as you dig deeper into this tool and expose all these new capabilities, what do you look forward to? What can you do that you couldn't do before? >> It's interesting because it's grossly underestimating the human brain, right? The human brain is supremely powerful in all aspects, right? And there is a great deal of difficulty in trying to code the human brain, right? 
But with neural networks and because of the various propagation layers and the ability to move through these networks we are coming closer and closer, right? So one example: When you think about driving, recently, a Google driverless car got into an accident, right? And where it got into an accident was the driverless car was merging into a lane and there was a bus and it collided with the bus. So where did A.I. go wrong? Now if you train an A.I., birds can fly, and then you say penguin is a bird, it is going to assume penguin can fly. >> Jeff: Right, right. >> We as humans know penguin is a bird but it can't fly like other birds, right? >> Jeff: Right. >> It's that anomaly thing, right? Naturally when we are driving and a bus shows up, even if it's yield, the bus goes. >> Jeff: Right, right. >> We yield to the bus because it's bigger and we know that. >> A.I. doesn't know that. It was taught that yield is yield. >> Right, right. >> So it collided with the bus. But the beauty is now large fleets of cars can learn very quickly based on what it just got from that one car. >> Right, right. >> So now there are pros and cons. So think about you driving down Highway 85 and there is a collision, it's Sunday morning, you don't know about the collision. You're coming down on the hill, right? Blind corner and boom that's how these crashes happen and so many people died, right? If you were driving a driverless car, you would have knowledge from the fleet and from everywhere else. >> Right. >> So you know ahead of time. We don't talk to each other when we are in cars. We don't have universal knowledge, right? >> Car-to-car communication. >> Car-to-car communications and A.I. has that so directly it can save accidents. It can save people from dying, right? But people still feel, it's a psychology thing, people still feel very unsafe in a driverless car, right? So we have to get over- >> Well they will get over that. They feel plenty safe in a driverless airplane, right? 
>> That's right. Or in a driverless light rail. >> Jeff: Right. >> Or, you know, when somebody else is driving they're fine with the driver who's driving. You just sit in the driver's car. >> But there's that one pesky autonomous car problem, when the pedestrian won't go. >> Yeah. >> And the car is stopped it's like a friendly battle-lock. >> That's right, that's right. >> Well good stuff Janet and always great to see you. I'm sure we will see you very shortly 'cause you are at all the great big data conferences. >> Thank you. >> Thanks for taking a few minutes out of your day. >> Thank you. >> Alright she is Janet George, she is the smartest lady at Western Digital, perhaps in Silicon Valley. We're not sure but we feel pretty confident. I am Jeff Frick and you're watching theCUBE from When IoT meets AI: The Intelligence of Things. We will be right back after this short break. Thanks for watching. (upbeat electronic music)
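
(Editor's illustration, not part of the interview: a minimal Python sketch of the trade-off mentioned above, where a model has to be tuned toward fewer false positives or fewer false negatives. The scores, labels, and thresholds are made-up sample values, and the function name is hypothetical; this is not Western Digital's method.)

```python
from typing import List, Tuple


def confusion_counts(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating scores >= threshold as predicted failures."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_failure = score >= threshold
        if predicted_failure and label == 1:
            tp += 1  # real failure, correctly flagged
        elif predicted_failure and label == 0:
            fp += 1  # healthy unit flagged as a failure (false positive)
        elif label == 1:
            fn += 1  # real failure that was missed (false negative)
        else:
            tn += 1  # healthy unit, correctly passed
    return tp, fp, fn, tn


# Made-up model scores and ground-truth labels (1 = failure, 0 = no failure).
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0]

strict = confusion_counts(scores, labels, threshold=0.9)   # few alarms
lenient = confusion_counts(scores, labels, threshold=0.3)  # many alarms
```

With these sample values the strict threshold raises no false alarms but misses two real failures, while the lenient threshold catches every failure at the cost of one false alarm. Which side to favor is the "you got to optimize for something" choice described in the conversation.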

Published Date : Jul 2 2017


ENTITIES

Entity	Category	Confidence
Jeff	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Janet George	PERSON	0.99+
Janet	PERSON	0.99+
Western Digital	ORGANIZATION	0.99+
90%	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
one car	QUANTITY	0.99+
Highway 85	LOCATION	0.99+
Sunday morning	DATE	0.99+
two failures	QUANTITY	0.99+
70 billion roles	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
CUBE	ORGANIZATION	0.98+
one example	QUANTITY	0.96+
The Intelligence of Things	TITLE	0.94+
hundreds of cat images	QUANTITY	0.93+
first	QUANTITY	0.92+
theCUBE	ORGANIZATION	0.84+
San Jose	LOCATION	0.8+
one pesky autonomous car	QUANTITY	0.77+
70 petabytes of data	QUANTITY	0.77+
hundreds and	QUANTITY	0.76+
IoT	ORGANIZATION	0.74+
millions of failures	QUANTITY	0.66+
Fairmont Hotel	LOCATION	0.66+
ollision	PERSON	0.65+
meets	TITLE	0.64+
#theCUBE	ORGANIZATION	0.57+
Hadoop Summit	EVENT	0.51+
of	TITLE	0.47+

Brita Rosenheim & Seana Day, The Mixing Bowl | Food IT 2017

>> Announcer: From the Computer History Museum, in the heart of Silicon Valley, it's theCUBE. Covering Food IT: Fork to Farm, brought to you by Western Digital. >> Hey welcome back everybody Jeff Frick here with theCUBE. We're at the Food IT show at the Computer History Museum here in Mountain View, California. Really an amazing show, 350 people, all kind of pieces of the spectrum from academia to technology, to start-ups to Yamaha. Who thought Yamaha was into food tech, I didn't think that. To start-ups and we're really excited to have two of the partners from the Mixing Bowl and the Better Food Ventures, Brita Rosenheim and Seana Day welcome. >> Thank you. >> Thanks Jeff. >> So first off, congratulations on the event, what are your impressions? You guys been doing this for a couple years now I think. Bigger, badder, better? >> No I think this is great. We've had a fantastic turnout and the content's always very interesting and the interaction between the audience and the speakers is fantastic. >> Yeah, we just finished up a panel, IoT, Internet of Tomatoes, so there's always some great conversations really going. >> I think we're talking about that later this afternoon. >> Oh fantastic. >> It is interesting right, because all the big megatrends of cloud and we cover these in tech infrastructure all the time and big data and sensors and IoT and drones and these things. Really, all being brought to bear in agriculture from everything from producing the food to eating the food to the scraps that we don't eat I guess. >> No, you're spot on, some of the big macro challenges are what's driving a lot of the innovation. As you said food scraps, but waste is a major challenge. Labor, certainly here in California is something that we've seen a lot of innovation around solving some of those labor pain points. Certainly sort of environmental sustainability and resource management, you know, how are we using water, how are we using our inputs. 
Those are a lot of big themes that are driving interest in this sector and driving investment. >> Right so you guys are talking about some of the investments, like you guys put on a show, but you also have an investment arm, so you're looking for new technologies that play in this space correct? >> Yeah, Better Food Ventures makes early stage, seed investments so really kind of, not ideation stage, but pretty close after that. So working with entrepreneurs and really helping them, nurture them, and grow into hopefully successful companies. We've made 12 investments so far, I think seven of them have stepped up to priced equity so. >> Excellent, and you guys have brought this architecture landscape of the innovation. We won't share this on camera because it's way too many names for you to see, but obviously you can go online. >> Seana: It's available for download on our website MixingBowlHub.com. >> It's fascinating, there are literally what, a dozen categories and many firms within each category per side, so I wonder if you can give us a little bit more color on this landscape. I had no idea, the level of innovation that's happening in the food tech space, you just don't think about it probably if you're not in the industry. >> I'll let Seana kick off, between Seana and I, we cover Fork to Farm, so Seana covers from the farm, all the way through distribution and the area that I focus on, distribution all the way to consumer consumption. So we have a nice harmony there. We'll start at the beginning with Seana. >> Looking at over 3,000 companies. >> Jeff: 3,000? >> 3,000 between the two of our sort of databases. My coverage area is really infield technologies, hardware, software, applications. So anything from sensors, drones, soil moisture, weather, crop management, farm management software, all the way through as Brita said, distribution. So looking at supply chain management, logistics, trading platforms, collaboration platforms, so there's a lot going on. 
Every time I roll out one of these technology landscapes, I'm always adding categories, which is sort of representative of the way that the market is evolving. I think that there is a lot of interesting stuff happening now in the post-harvest part of this market that more investors are starting to pay attention to. We've heard more of that today as well. Technologies that are focused on minimizing waste in the supply chain, making things more efficient helping shorten that supply chain so that we've got fresher food. More local options for consumers. >> I've been tracking the space for the last six or seven years, and to echo Seana's point on every time you put a new map out, you know we're thinking about different categories I mean every single year you've looked at it, the ecosystem has changed so much in terms of even how you categorize or even think of the different innovations that are shaping the space. I focus on, the way I look at my map is from in-home media consumption, discovery, so media, marketing, advertising, all the way through eCommerce, so both the B2B and B2C eCommerce platforms, all the way through restaurant and retail. So grocery, delivery, hyper-local marketing and the like. >> So can you explain the crazy success of these little, event handling, short food videos that are just taking the internet by storm? It's fascinating right? >> Yeah, BuzzFeed's Tasty. >> Media consumption is really something to see. >> Yeah, I think BuzzFeed really took the traditional food media category by surprise. They really created the new, literally, video content for consumption that is extremely addicting, short, it makes everything seem approachable. It's kind of the bite-size version of the Food Network and I find myself. >> Off the chart right? >> You can't stop. Whether I'll make it or not you know, like the twirling potato and. (Brita chuckling) >> So the other, the sub-theme for this year's conference is Fork to Farm and I'm just curious right. 
Because we've seen consumerization of IT impact all the different industries that we cover. It is really the end user at the end point that's driving the innovation back upstream. I wonder if you could speak to kind of the acceleration of that trend over time. Or is it relatively recent or you know there's some specific catalyst that you've seen as you've studied the market that has really driven an acceleration of that? >> Seana: Do you want to start with consumer and then we'll get back into the grower side of that? >> Yeah, I mean, I think you've seen kind of the long evolution since the MyWebGrocer, Kozmo era of 10, 15 years ago and, you know, people thinking, I'm never going to buy food online, really don't have that trust level, and you know, kind of eCommerce in general, mobile technology in general has changed the consumer's expectations and purchase and consumption patterns, period, for all other goods, so we've gotten to a point where there is a level of trust of, if something is going to come to you in the mail, there's just an expected level of trust or you can send it back. So that's kind of lent itself to this food category. I think in one way, that's been an overall industry shift in terms of the changing expectations of the consumer. You want to push a button, you've got your shoes, your lipstick, you know, your dog toys at the push of a button, why not your food. So the problem with that is food is very different, it has to be hot or cold, you have the cold chain, speed, the manual labor involved. Just kind of the cost infrastructure is totally different than sending a box of lipstick and makeup to a consumer, so I think you've seen a tremendous amount of funding in this on-demand delivery category, a ton of different Uber for this, Uber for that, around the food space. Meal kits, but I think the reality of running those businesses has proven to be very difficult in terms of making the costs work out in terms of a business model so. 
>> Don't they all know why Webvan failed? They're all probably too young to miss the Webvan and AT&T. >> Yeah, that being said, there's some opportunity there, it's just about getting to the right scale. So obviously Amazon just bought Whole Foods last week. I think there is room for a brick and mortar approach here, but I think on-demand delivery's not going away in the food category, so who can actually deliver that, because the consumer's not going to say, oh, the business model doesn't make sense, I don't want this anymore. They just don't want to pay for it. Somebody has to figure out a way to. >> Oh, that other pesky little detail about. And Seana, it used to be, if we make it, they will eat, right? I guess that doesn't hold true anymore. >> Well, you know, it's a different adoption dynamic in the grower part of the technology adoption curve. The consumers tend to pick things up more quickly than the traditional Ag player, Ag stakeholder. The growers have been a little bit more tentative in terms of trying to figure out what kinds of technologies actually work. They're all of a sudden confronted with this idea of data overload. All of a sudden, you go from having no data to more data than you know what to do with. That's driving some of these adoption dynamics. People really trying to figure out what works, what business models are sustainable in agriculture, and I don't mean sustainable from a resource standpoint. But just, will that business be around in six to nine to 12 months to support the technology that's in the field. So it's been a little slower I would say, on the production agriculture and grower side in terms of that uptake, but you know the other challenge that I think we face in terms of those models is really the flow of data. 
The flow of information is still very siloed and in order to get the kind of decision support tools and the supply chain efficiencies that we're looking for in the food system, we really need to figure out how to integrate those data sources better. What's coming out of the field, what's happening in the mid-stream processing, and then what's happening on the supply chain and logistics side before you get to that consumer who's demanding it. But there's a lot of stages of information that need to harmonize before we can really have a more optimized system. >> Right, and are you seeing within the data side specifically some of the traditional players, like Tableau, and clearly there's been a lot of activity in big data for a while, we've been going to Hadoop Summit and Hadoop World for ever and ever, are those people building Ag specific solutions or are there new players that really see the specific opportunity and are better positioned to build, you know, the analytics to enable the use of that data? >> I think the big IT incumbents are looking at this very, very carefully. But there are a lot of nuances to agriculture that are different from some of the other vertical industries and there's been a lot of observing from the sidelines down there, less from the deployment of actual technologies. Until people really understand how this market is starting to shake out. I think IBM and some of those big tech players are definitely on the fringes here, but I think again, we've got this challenge of how do you actually deliver value to growers. So, you've got all this data and you can crunch all this data, how do you present that in a way that a grower can make a better decision about their operation. And oh, by the way, does the grower trust that data. That sort of is the challenge that I think we're still in the early innings in terms of how that. It will come, but we're still in the early innings. 
>> Which is always the case, right, to go from kind of an intuition, we've always done it this way, you know, like three generations of grandfathers that have worked this land, to, you know, here's the data, you can micro-optimize for this, that and the other and really take a different approach. >> I'd say one of the challenges both on the Ag side, but also even on the food side, is that there's a lot of start-ups that you meet with that are all about big data, big data, but big data really needs to be big data. So the incumbents are really the only ones that are in the position to crunch that amount of data. You can't actually get the insights when you don't have scale, so there's a tremendous amount of companies that have a really interesting, innovative approach to collecting data, to how you can use it, and all they need is scale. That's virtually impossible unless they're acquired by or have a partnership with a larger incumbent, which isn't going to happen, so big data, you really need a tremendous amount of data points to actually get to something that's useful. >> Alright, well, Seana and Brita, thanks for taking a few minutes. Again, where can people go to get the pretty download? It's a lot of data on this thing. >> It's MixingBowlHub.com, so that's available, both the AgTech landscape and the Food Tech landscape. >> Alright great, well again thanks for inviting us to the show, really great show and congrats to you both for pulling it off. >> Thank you very much. >> Thanks very much. >> Alright, Brita, Seana, I'm Jeff, you're watching theCUBE, we're at FoodIT in the Computer History Museum in Mountain View, California. We'll be back after the short break. Thanks for watching.

Published Date : Jun 28 2017

SUMMARY :

in the heart of Silicon Valley, it's theCUBE. all kind of pieces of the spectrum So first off, congratulations on the event, and the interaction between the audience IoT, Internet of Tomatoes, so there's always the food to the scraps that we don't eat I guess. and resource management, you know, We've made 12 investments so far, I think seven architecture landscape of the innovation. on our website MixingBowlHub.com. I had no idea, the level of innovation and the area that I focus on, distribution in the post-harvest part of this market that are shaping the space. It's kind of the bite-size version of the Food Network like the twirling potato and. kind of the acceleration of that trend over time. in terms of the changing expectations of the consumer. They all probably too young to miss the Webvan and AT&T. because the consumer's not going to say, I guess that doesn't hold true anymore. the consumers tend to pick things up a lot of observing from the sidelines down there, Which is always the case right, that are in the position to crunch that amount of data. to get the pretty download it's a lot of data on this thing. both the AdTech landscape and the Food Tech landscape. to you both for pulling it off. We'll be back after the short break.

ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
Jeff	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Seana	ORGANIZATION	0.99+
California	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
AT&T.	ORGANIZATION	0.99+
Uber	ORGANIZATION	0.99+
Yamaha	ORGANIZATION	0.99+
Western Digital	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
two	QUANTITY	0.99+
12 investments	QUANTITY	0.99+
Whole Foods	ORGANIZATION	0.99+
last week	DATE	0.99+
Brita Rosenheim	PERSON	0.99+
350 people	QUANTITY	0.99+
Mountain View, California	LOCATION	0.99+
Brita	PERSON	0.99+
Seana	PERSON	0.99+
BuzzFeed	ORGANIZATION	0.99+
over 3,000 companies	QUANTITY	0.99+
12 months	QUANTITY	0.99+
each category	QUANTITY	0.98+
today	DATE	0.98+
3,000	QUANTITY	0.98+
Better Food Ventures	ORGANIZATION	0.98+
Webvan	ORGANIZATION	0.98+
seven	QUANTITY	0.98+
one way	QUANTITY	0.97+
both	QUANTITY	0.97+
10, 15 years ago	DATE	0.97+
FoodIT	ORGANIZATION	0.97+
nine	QUANTITY	0.96+
one	QUANTITY	0.96+
Food Network	ORGANIZATION	0.96+
a dozen categories	QUANTITY	0.96+
MixingBowlHub.com	OTHER	0.95+
first	QUANTITY	0.94+
AdTech	ORGANIZATION	0.93+
Van	ORGANIZATION	0.93+
later this afternoon	DATE	0.89+
Brita	ORGANIZATION	0.88+
seven years	QUANTITY	0.88+
Seana Day	PERSON	0.87+
Computer History Museum	LOCATION	0.87+
IT	EVENT	0.86+
theCUBE	ORGANIZATION	0.85+
Food Tech	ORGANIZATION	0.85+
six	QUANTITY	0.85+
Computer History Museum	ORGANIZATION	0.83+
2017	DATE	0.82+
Fork to Farm	EVENT	0.81+
two of the partners	QUANTITY	0.79+
three generations	QUANTITY	0.78+
Food IT	EVENT	0.77+
echo Seana	COMMERCIAL_ITEM	0.76+
Hadoop Summit	EVENT	0.73+
single year	QUANTITY	0.68+
Mixing Bowl	ORGANIZATION	0.58+
Tableau	TITLE	0.56+
Museum	LOCATION	0.54+
Bowl	EVENT	0.51+
Hadoop World	EVENT	0.49+
Fork to Farm	TITLE	0.49+
couple	QUANTITY	0.49+
last	DATE	0.47+
IT	ORGANIZATION	0.39+
Mixing	ORGANIZATION	0.31+

Susie Wee, Cisco DevNet - Cisco DevNet Create 2017 - #DevNetCreate - #theCUBE

(upbeat music) >> Announcer: Live from San Francisco, it's theCUBE, covering DevNet Create 2017. Brought to you by Cisco. >> Hello, everyone, and welcome back to our live coverage from theCUBE exclusive, two days with Cisco's inaugural DevNet Create event. I'm John Furrier, with my co-host, Peter Burris, who's the general manager of Wikibon.com, and head of research for SiliconANGLE Media. We're talking with Susie Wee, who is the vice president and CTO of Cisco's DevNet, the creator of DevNet, the developer program that was started as grassroots, now a full-blown Cisco developer program. Now starting another foray into the cloud-native open-source community with this new event, DevNet Create. Welcome to theCUBE, thanks for joining us. >> Thank you, John. >> Thanks for having us. We love going to the inaugural events because they're always the first, and you know, being bloggers, and media, you got to be first. First news, first comments. >> Susie: Always first. >> Always first, and we're the only media here, so thank you. >> Susie: Thank you. >> So tell us about the event (Susie chuckles). You're the host and the creator, with your team. >> Susie: Yes. >> How did this come together, why DevNet Create? You have DevNet, this event is going extremely well, tell us. >> Awesome, so, yeah, so we have DevNet, we've had DevNet for about three years. It was actually exactly three years ago that we had our first DevNet Zone, a developer conference at Cisco Live, three years ago. And there, we felt like we pretty squarely hit... We've had successes there, we've had a pretty strong handle on our infrastructure audience, but what we see is that there's this huge transition, transformation going on in the industry, with IoT and cloud, that changes the definition of how applications meet infrastructure. And so this whole thing with, you know, applications, what is an application? What is the infrastructure? The infrastructure is now programmable, how can apps interact? 
It opens up a whole new world, and so what we did was we created DevNet Create as a standalone developer conference focused on IoT and cloud to focus on that transformation. >> And a lot of industry trends kind of going on, and moves you're making, it's the company, or you, Cisco is making, AppDynamics, big acquisition, kind of speaks to that, but also, there's always a natural progression for Cisco to have moving up the stack with software, but IoT gives you guys a unique opportunity with the network concept. So, making it network programmable, infrastructure as code, as some say in the DevOps world, is the ethos. >> Absolutely. >> How do you guys see yourselves engaging with the community, and what are some of the plans, and what's some of the feedback you're getting here at the event? >> So what we've done here at the event is that, you know, as you've seen from the channel is that, our content is 90% from the community, maybe 10% from Cisco, 90% from the community, because we believe it is all about the ecosystem. It's about how applications meet the infrastructure, it's the systems people are building together. And there's a lot of movement in developing these technologies. We don't know the final form of how an IoT app... Like, who's going to build the app, who's going to build the users, who's going to run the service, who's going to run the infrastructure? It's all still evolving, and we think that the community needs to come together to solve this to make the most of the opportunity. And so that's what, really, this is all about. And then, we think it actually involves learning the languages, making sure that the app folks know the language of the infrastructure folks. They don't have to become experts in it, but just knowing the language. Understand what part's programmable, what part's not, what benefit can you derive from the infrastructure. 
And then, by really having knowledge of what you can get across, and creating a forum for people to get together to have this conversation, we can make those breakthroughs. >> So just a clarification, you said that 90% of the sessions are non-Cisco, or from the community, and only 10% from Cisco? >> Susie: That's right. >> Is that by design? >> That is absolutely by design. So, when we have the DevNet Zone at Cisco Live, that's all about all of Cisco's products, platforms, APIs, bringing in the community to come and learn about those, but DevNet Create was really, squarely for IoT and app developers, IoT app developers, cloud developers, people working on DevOps, to look at that intersection. So we didn't go into all the gory details of networking, like we very much like to do, but we were really trying to focus on, "What's the value to application developers, "and what are the opportunities?" >> Well, it's interesting because, Susie, we're in the midst, as you said, of a pretty significant transformation, and there's a lot of turbulence, not only in business and how business conceives of digital technology, and the role it's going to play, the developer world, cloud-this, cloud-that, different suppliers, but one of the anchor points is the network, even though the network itself is changing, >> It is. >> in the midst of a transformation, but it's a step function. So, you go from, on the wireless, go outside, 1G to 3G, to 5G, et cetera, that kind of thing, but how is the developer going to inform that next step function in the network, the next big transformation in the network, and to what degree is this kind of a session going to really catalyze that kind of a change? >> Absolutely. So, what happens is, you're right, it's something that we all know, all app developers know, and actually, every person in the world knows, the network is important. The network provides connectivity, the network is what provides Internet, data, and everything there. 
That's critical to apps, but the thing that's been hard about it is it's not programmable. Like, you kind of get that thing configured, it's working now, you leave it. Don't touch it. >> It's still wires. In the minds of a lot of people, (Susie laughs) it's still wires, right? >> It is, it's wires, or even if it's wireless, once you can get it configured, you leave it. You're not playing with it again, it's too, kind of, dangerous or fragile to change it. >> Because of the sensitivity to operational... >> Because of the sensitivity to operations. The big change that's happening is the network is becoming programmable. The network has APIs, and then, we have things like automation and controller-based networking coming into play, so you don't actually configure it by going one network device at a time, you feed these into a controller, and then, now you're actually doing network-wide commands. That takes out the human error, it actually makes it easy to configure and reconfigure. And when you have that ability to provision resources, to kind of reset configurations, when you can do that quickly through APIs, you suddenly have a tool that you never had before. So let me give you an example. So let's say that you're in a building, you have your badging systems, your automated elevators, you have your surveillance cameras, you want to put out a new security system with surveillance cameras. You don't want to put that on the same network segment as your vending machines. You have a different level of security required. Could put in a work order to say... >> Unless you're really worried about who's stealing from the vending machines. (all laugh) >> So what you can do, now that it's programmable, is use infrastructure as code, is basically say, "Boom, give me a new network segment, let me drop these new devices onto it, let the programmable network automatically create a separate network segment that has all of these devices together." 
Then you can start to use group-based policy to now set, you know, the rules that you want, for how those cameras are accessed, who they're accessible by, what kind of data can come in and out of it. You can actually do that with infrastructure as code. That was not a knob that app developers had before. So they don't need to become networking experts, but now they have these knobs that they can use to give you that next level of security, to give you that next level of programmability, and to do it at the speed that an app developer needs. >> So I was talking to Steve Pousty earlier this morning, and he's from Red Hat, he's a lead developer, he's not a network guy, he's self-proclaimed, "Hey, I'm not a networking person, I care about apps," and he's a developer, and he brought up something interesting I want to get your thoughts on. I think you're onto something really big with your vision, which is why we're so pumped about it, and he brought up an example of ecosystem's edges, and margins of the edge of these, that when they come together, creates innovation opportunities. And he used the example of data science meets cloud. And what he was using in particular was the example of most data people in the old days were data jocks, they did data, they did things, and they weren't really computer scientists, but as those two communities came together, the computer scientist saying, "Hey, I don't know about data," and the data guy's like, "Hey, you know about algorithms," "I know about algorithms," so innovation happened when that came together. What you're doing here, if I got this right, is you're saying, "Hey, DevNet's doing great," from a Cisco perspective, "but now this whole new creative innovation world in the cloud is happening in real time. Bring 'em together, so best of Cisco knowledge to the guys who don't want to be (chuckles) experts in that can share information." Is that kind of where this is going? 
>> Yeah, that's exactly where it's going, and same example, earlier in my career, I was working on sending video over networks, and then you had the networking people doing networking, you had the video people doing video compression, but then video networking, or streaming media, kind of, oh, you can put, you know, your knowledge of the compression and the network all together, so that kind of emerged as a field. The same thing, so, so far, the applications, and the infrastructure, and IT departments have been completely separate. You would just do the best you can, it was the job of IT to provide it, but now, suddenly there's an opportunity to bring these together. And it's, again, it's because the infrastructure's becoming programmable, and now it has knobs and can work quickly. So, yes, this is kind of new ground. And things could continue the way they are, right? And it's okay, we're getting by, but you just won't be realizing the potential of the real kind of... >> Well, open-source has clearly demonstrated that the collective intelligence of communities can really move fast, and share, and it's now tier one, so you're seeing companies go public, MuleSoft, Cloudera, and the list goes on and on. So now you have the dynamic of open-source, so I got to ask you the question, as you go out with DevNet Create, as this creation, the builders that are out there building apps are going to have programmable networks, how do you see this next leg of the journey? Because you have the foray now with DevNet Create, looks good, really well done, what's next? >> What's next is going on and making the real instances that show the application and infrastructure synergy. So let me just give you a really simple example of something that we're doing, which is that Apple and Cisco have had a partnership, and this partnership is coming together in that we have iOS developers who are writing mobile apps. 
So you have your mobile apps people are writing, we have iOS 10, your app developers are writing these apps. But everybody knows you run into a situation where your app gets congested on the network. Let's say that we're here in Westfield Mall, and they want to put out an AR/VR app, and you want that traffic to work, right? 'Cause if the mall wants to offer an AR/VR service, it takes a lot of bandwidth to get that data through, but through this partnership, what we have is an ability we have to use an iOS 10 SDK to, basically, business optimize your app so that it can run well on a Cisco infrastructure. So basically, it's just saying, "Hey, this is important, "put it in the highest QoS (John laughs) level setting, "and make your AR/VR work." So it's just having these real instances where these work together. >> I mean, I used to be a plumber back in my day when I used to work at HP, and I know how hard it is, and so I'm going to bring this up, because networks used to be stable and fragile/brittle, and then that would determine what you could do on top of it. But there are things like DNS, we hear about DNS, we hear about configuration management, setting ports, and doing this, to your point, I want dynamic provisioning or policy at any given moment, yet the network's got to be ready to do that. >> You don't want to submit a work order for that. (laughs) >> You don't want to have to say, "Hey, can you provision port, whatever, "I need to send a bunch of bandwidth." This is what we're talking about when we say programmable infrastructure, just letting the apps interface with network APIs, right? >> Absolutely, and I think that, you heard earlier, that with CNCF, the Cloud Native Computing Foundation, just announced CNI, so that what they're doing is now offering an ability to take your kind of container orchestration and take into consideration what's going on in the network, right? 
So if this link is more congested than that, then make sure that you're doing your orchestration in the right ways, that the network is informing the cloud layer, that the cloud platform's informing the network, so that's going to be huge. >> But do you think, I'm curious, Susie, do you think that we're going to see a time when we start bringing conventions at layer 7 in the network, so we start to parse layer 7 down a little bit, so developers can think in terms of some of those higher-level services that previously have been presentation? Are we likely to see that kind of a thing? As the pain of the network starts to go away, and an explicit knowledge of layer 1-6 become a lot less important, are we going to see a natural expansion at layer 7, and think about distributed data, distributed applications, distributed services, more coherence to how that happens on an industry-wide basis? What do you think? >> Yeah, so let's see, I don't know if I have a view on which layers go away, or which layers compress... >> But the knowledge, the focal point of those? >> But the knowledge, absolutely. So it comes into play, and what happens is, like, what is the infrastructure? In the Internet of things, things are a part of your infrastructure. That's just different. As you're going to microservices, applications aren't applications, they're being written as microservices, and then once you put those microservices in containers, they can move around. So you actually have a pretty different paradigm for thinking about the architecture of applications, of how they're orchestrated, what resources they sit on, and how you provision, so you get a very new paradigm for that. And then the key is... >> But they're inherently networked? >> That's right, that's right. It's all about connectivity, it's all about, you know, they don't do anything without the network. And we're pushing the boundaries of the network. 
>> These aren't function calls over memory like we used to think about things, these things are inherently networked. We know we have network SOAs, and service levels, and whatnot... >> Susie: There is. >> It sounds like we have... I was wondering, here, at this conference, are developers starting to talk about, "Geez, I would like to look at Kubernetes "as a lower-level feature in layer 7," >> Susie: They are. (laughs) >> "where there's a consistent approach to thinking about "how that orchestration layer is going to work, "and how containers work above that, "because I don't have to worry about session anymore, I don't have to worry about transmission." >> Susie: Absolutely. >> That goes away, so give me a little bit more visibility into some of that higher-level stuff, where, really, the connectivity issues are becoming more obvious. >> Absolutely, and an interesting example is that, you know, we actually talked about AppDynamics in the keynote, and so, with AppDynamics, what kind of information can you get from these bits of code that are running in different places? And it comes into where we have the Royal Bank of Scotland, who's saying, "What's my busiest bank branch "where people are doing mobile banking in the country?" And they're like, "Well, how do I answer that question?" And then you see that, oh, someone has their mobile phone, they take an app, then you actually break it down to how is that request, that API, how is that being, kind of, operated throughout your network. And when you take a look, you say, "Okay, well, this called this "piece of code that's running here. "This piece of code used this API to talk to this other service, to talk to this other," you can map that out, get back the calls of, "Hey, this is how many times this API has been called, "this is how many times this service has been called, "this is the ones that are talking to who," then they came up with the answer, saying that our busiest bank branch is the 9 a.m. 
Paddington Train Station. >> And that's a great example, because now you gain visibility >> Exactly >> into where the dependencies are, which even if you don't explicitly render it that way, starts to build a picture of what the layers of function might look like based on the dependencies and the sharing of the underlying services. >> That's right, and that's where you're saying, like, "What? The infrastructure just gave me business value (John laughs) in a very direct way. How did that happen?" >> John: That's a huge opportunity for Cisco. >> So it's a big... >> Well, let's get in the studio and let's break down the Kubernetes and the containers, 'cause Docker's here, a lot of other folks are here. We've had, also, Abby Kearns, the executive director of Cloud Foundry. We've had the executive director from the Cloud Native Computing Foundation, Dan was here, a lot of folks here in the industry kind of validating >> Yeah, Craig was here. >> your support. Sun used to have an expression, the network is the computer, but now, maybe Chuck Robbins should go for network is the app, or the app is the network, (Susie laughs) I mean, that's what's happening here. The interplay between the two is happening big time. >> It is happening here, yeah. Just every element, every piece of code, what we saw is that this year, developers will write 111 billion lines of code. You think about that, every piece of... >> Peter: That we know about. (chuckles) >> That we know about, there's probably more. (chuckles) And all of that, you're right, these are broken up into pieces that are inherently networked, right? They have data, it's all about data and information that they're sharing to give interesting experiences. So this is absolutely a new paradigm. >> Well, congratulations on your success. 
What a great journey, I know it's been a short time, but I noticed after our in-studio interview, when you came in to share with us, the show, as a preview, Chuck Robbins retweeted one of the tweets. >> Susie: He did. >> And so I got to ask you, internally at Cisco, I know you put this together kind of as an entrepreneurial effort inside the company, and had support for that, what is the conversation you have with Chuck and the executive team about this effort? Because they got to see a clear line of sight that the value of the network is creating business value. What are some of the internal conversations, can you give us a little bit of color without giving away all the trade secrets? >> Yeah, well, internally, we're getting huge support. Chuck Robbins checks in on this, he actually has been checking in saying, "How's it going?" Rowan Trollope sending, "Hey, how's it going? I heard it's going great." >> Did he text you today? >> Chuck did a couple days ago. >> John: Okay. (chuckles) >> And then Rowan, today, so, yeah, so we have a lot of conversation. >> Rowan's a CUBE alumni, Chuck's got to get on theCUBE, (Susie laughs) Rowan's been on before. >> Yeah, so they're all kind of checking in on it. We have the IoT World Forum going on in parallel, in London, so, otherwise, they would be here as well. But they understand... >> John: There's a general excitement? This is not a rogue event? >> There's huge excitement. >> This is not, like, a rogue event? >> It's not, it's not, and what happens is... They also understand that we're talking about bringing in the ecosystem. It's not just a Cisco conversation, it is a community... >> Yeah, you're doing it right, you're not trying to take over the sandbox. You're coming in with respect and actually putting out content, and learning. >> Putting out content, and really, it's all about letting people interact and create this new area. It's breaking new ground, it's facilitating a conversation. 
I mean, where apps meet infrastructure, it's controversial as well. Some people would say, "They should never meet. Why would they ever meet?" (Susie and John laugh) >> So, we do a lot of shows, I was telling Peter that, you know, we were at the first Hadoop Summit, second Hadoop World, with Cloudera, when they were a small startup, Docker's first event, KubeCon's first event, we do a lot of firsts, and I got to tell you, the energy here feels a lot like those events, where it's just so obvious that (chuckles) "Okay, finally, programmable infrastructure." >> Well, I'll be honest, I'm relieved, because, you know, we were taking a bet. So, you know, when I was bouncing this idea off of you, we were talking about it, it was a risk. So the question is, will it appeal to the app developers, will it appeal to the cloud developers, will it appeal overall? And I'm very relieved and happy to see that the vibe is very positive. >> Very positive. >> So people are very receptive to these ideas. >> Well, you know community, give more than you take has always been a great philosophy. >> I'm always a little paranoid and (John laughs) nervous but I'm very pleased, 'cause people seem to be really happy. There's a lot of action. >> There are a lot of PCs with Docker stickers on them here. (John laughs) >> There are. (laughs) There are, yes, yes. We have the true cloud, IoT, we have the hardcore developers here, and they seem to be very engaged and really embracing... >> Well, we've always been covering DevOps, again, from the beginning, and cloud-native is, to me, it's just a semantic word for DevOps. It's happening, it's going mainstream, and great to see Cisco, and congratulations on all your work, and thanks for including theCUBE in your inaugural event. >> Susie: Thank you. >> Susie Wee, Vice President and CTO at Cisco's DevNet. We're here for the inaugural event, DevNet Create, with the community, two great communities coming together. 
I'm John Furrier with Peter Burris, stay tuned for more coverage from our exclusive DevNet Create coverage, stay with us. (upbeat music) >> Hi, I'm April Mitchell, and I'm the senior director of strategy.

Published Date : May 24 2017


ENTITIES

Entity	Category	Confidence
Chuck	PERSON	0.99+
Peter Burris	PERSON	0.99+
Apple	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
John	PERSON	0.99+
Susie Wee	PERSON	0.99+
Abby Kearns	PERSON	0.99+
Susie	PERSON	0.99+
Craig	PERSON	0.99+
Dan	PERSON	0.99+
Chuck Robbins	PERSON	0.99+
April Mitchell	PERSON	0.99+
Cloud Native Compute Foundation	ORGANIZATION	0.99+
Rowan Trollope	PERSON	0.99+
John Furrier	PERSON	0.99+
CNCF	ORGANIZATION	0.99+
Cloud Native Computing Foundation	ORGANIZATION	0.99+
Peter	PERSON	0.99+
Steve Post	PERSON	0.99+
Rowan	PERSON	0.99+
London	LOCATION	0.99+
90%	QUANTITY	0.99+
iOS 10	TITLE	0.99+
San Francisco	LOCATION	0.99+
Royal Bank of Scotland	ORGANIZATION	0.99+
CNI	ORGANIZATION	0.99+
HP	ORGANIZATION	0.99+
10%	QUANTITY	0.99+
Cloud Foundry	ORGANIZATION	0.99+
two days	QUANTITY	0.99+
three years ago	DATE	0.99+
Westfield Mall	LOCATION	0.99+
today	DATE	0.99+
two	QUANTITY	0.99+
two communities	QUANTITY	0.99+
first	QUANTITY	0.99+
Wikibon.com	ORGANIZATION	0.99+
MuleSoft	ORGANIZATION	0.99+
iOS	TITLE	0.99+
SiliconANGLE Media	ORGANIZATION	0.98+
IoT World Forum	EVENT	0.98+
DevNet	ORGANIZATION	0.98+
first event	QUANTITY	0.98+
one	QUANTITY	0.98+

Christoph Streubert, SAP - DataWorks Summit Europe 2017 - #DWS17 - #theCUBE

>> Announcer: Live from Munich, Germany, it's The CUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Okay, welcome back everyone, we are here live in Munich, Germany for DataWorks 2017, the DataWorks Summit, formerly Hadoop Summit. I'm John Furrier with SiliconANGLE's theCUBE, my co-host Dave Vellante, wrapping up day two of coverage here with Christoph Streubert, who's the Senior Director of SAP Big Data, handles all the go-to-market for SAP Big Data, @sapbigdata is the Twitter handle. You have a great shirt there, Go Live >> Go Live or go home. (Laughs) >> John: You guys are a part. Welcome to theCUBE. >> Christoph: Thank you, I appreciate it. >> Thanks for joining us and on the wrap up. You and I have known each other, we've known each other for a long time. We've been in many Sapphires together, we've had many conversations around the role of data, the role of architecture, the role of how organizations are transforming at the speed of business, which is SAP, it's a lot of software that powers business, under transformation right now. You guys are no stranger to analytics, we have the HANA Cloud Platform now. >> Christoph: We know a thing or two about that, yeah. (laughs) >> You know a little bit about data and legacy as well. You guys power pretty much most of the Fortune 100, if not all of them. What's your thoughts on this? >> Yeah, good point. On the topic of some numbers, about 75% of the world GDP runs through SAP systems eventually. So yes, we know a thing or two about transactional and analytical systems, definitely. >> John: And you're a partner with Hortonworks >> With Hortonworks and other Cloud providers, Hadoop Providers, certainly, absolutely but in this case, Hortonworks. We have, specifically, a solution that runs on Hadoop Spark and that allows, actually, our customers to unify much, much larger data sets with a system of records that we now do so many of them around the world for new and exciting new cases. 
>> And you were born in Munich. This is your hometown. >> This is actually a home gig for me, exactly. So, yes, unfortunately I'll also be presenting in English but yeah, I want to talk German, Bavarian, all the time. (laughs) >> I see my parents tonight. >> I wish we could help you >> but we don't speak Bavarian. But we do like to drink the beer though. It's the fifth season but a lot of great stuff here in Germany. Dave, you guys, I want to get your thoughts on something. I wanted to get you, just 'cause you're both, you're like an analyst, Christoph as well. I know you're over at SAP but, you know, you have such great industry expertise and Dave obviously covers the stuff everyday. I just think that the data world is so undervalued, in my mind. I think the ecosystem of startups that are coming out in the, out of the open source ecosystems, which are well-defined, by the way, and getting better. But now you have startups doing things like VIMTEC, we just had a bank on. Startups creating value and things like block chain on the horizon. Other new paradigms are coming on, is going to change the landscape of how wealth is created and value is created and charged. So, you've got a whole new tsunami of change. What's your thoughts on how this expands and obviously, certainly, Hortonworks as a public company and Cloudera is going public, so you expect to see that level up in valuation. >> They're in the process, yes. >> But I still think they're both undervalued. Your thoughts. >> Well it's not just the platform, right? and that what, I think, where Hadoop also came from. The legacy of Hadoop is that you don't have to really think about how you want to use your data. You have to, don't think ahead what kind of schema you want to apply and how you want to correlate your data. You can create a large data lake, right? 
That's the term that was created a long time ago, that allows customers to just collect all that data and think in the second stage about what to use with it and how to correlate it. And that's exactly, now, we're also seeing in the third stage, to not just create analytics but also creating applications instead of analytics or on top of analytics, correlating with data that also drives the business, the core business, from an OLTP perspective or also from an OLAP perspective. >> I mean, Dave, you were the one who said Amazon's a trillion dollar TAM, will be the first trillion dollar company and you were kind of, but you looked at the thousand points of Live with Cloud enables, all these aggregated all together, what's your thoughts on valuation of this industry? Because if Hortonworks continues on this peer play and they've got Cloudera coming in and they're doing well, you could argue that they're both undervalued companies if you count the ecosystem. >> Well, we always knew that big data was going to be a heavy lift, right? And I would agree with what Christoph was saying, was that Hadoop is profound in that it was no schema on write and ship five megabytes of code to a petabyte of data. But it was hard to get that right. And I remember something you said, John, at one of our early SAP Sapphires, When the big data meme was just coming through. You said, "You know, SAP is not just big data, it's fast data". And you were talking about bringing transaction and analytic data together. >> John: Right. >> Again, something that has only recently been enabled. And you think about, you know, continuous streaming. I think that, now, big data has sort of entered the young-adulthood phase, we're going to start seeing steep part of that S-curve returns, and I think the hype will be realized. I think it is undervalued, much like the internet was. It was overvalued, then nobody wanted to touch it, and then it became. 
Actually, if you think back to 1999, the internet was undervalued in terms of what it actually achieved. >> John: Yeah. >> I think the same or similar thing is going to happen with big data. And since we have an SAP guest on, I'll say as well, We all remember the early days of ERP. >> Mhm, oh yeah. >> It wasn't clear >> Nope. >> Who was going to emerge as the king. >> Right. >> There were a few solutions. You're right. >> That's right. And, as well, something else we said about big data, it was the practitioners of ERP that made the most money, that created the most value and the same thing is happening here. >> Yeah. In fact, on that topic, I believe that 2017 and 2018 will be the big years for big data, so to speak. >> John: Uh huh. >> In fact, because of some statistics. >> John: In what way? >> Well, we just did >> Adoption, S-curve? >> Right, exactly. Utilizing the value of big data. You're talking about valuation here, right? 75% of CEOs of the top 1000 believe that the next three years are more important to their business than the last 50. And so that tells me that they're willing to invest. Not just the financial market, where I believe really run the most sophisticated big data analytics and models today. They had real use cases with real results very quickly. And so, they showed many how it's done. They created sort of the new role of a data scientist. They have roles like an AML officer. It's a real job, they do nothing else but anti-money laundering, right? So, in that industry they've shown us how to do that and I think others will follow. >> Yeah, and I think that when you look at this whole thing about digital transformation, it's all about data. >> John: Yeah. >> I mean, if you're serious about digital transformation, you must become a data-driven company and you have to hop on that curb. Even if you're talking to the, you know, bank today who got on in 2014, which was relatively late, but the pace at which they're advancing is astronomical. 
>> John: Yeah. >> I don't remember his name, a British mathematician, coined, about 11 years ago already, the phrase "Data is the new oil". >> John: Mhm. >> And I think it's very true because crude oil, in its original form, you also can't use it. >> John: It has to be refined. >> Right, exactly. It has to be refined to actually use it and use the value of it. Same thing with data. You have to distill it, you have to correlate it, you have to align it, you have to relate it to business transactions so the business really can take advantage of it. >> And then we're seeing, you know, to your point, you've got, I don't know, a list of big data companies that are now public is growing. It's still small, not much profit. >> I mean, I just think, and this is while I'm getting your reaction, I mean, I'm just reading right now some news popping on my dashboard. Google just released some benchmarks on the TPU, the tensor processing unit, >> Dave: Right. >> Basically a chip dedicated to machine learning. >> Yep. >> You know, so, you're going to start to see some abstraction layers develop, whether it's a hardened-top processor hardware, you guys have certainly done innovation on the analytic side, we've seen that with some of the specialty apps. Just to make things go faster. I mean, so, more and more action is coming, so I would agree that this S-curve is coming. But the game might shift. I mean, this is not an easy, clear path. There's bets being made in big data and there's potential for huge money shift, of value. >> See, one of the things I see, and we talked to Hortonworks about this, the new president, you know, betting all on open source. I happen to think a hybrid model is going to win. I think the rich get richer here. 
SAP, IBM, even Oracle, you know, they can play the open source game and say, "Hey, we're going to contribute to open source, we're going to participate, we're going to utilize open source, but we're also going to put the imprimatur of our install base, our business model, our trusted brands behind so-called big data." We don't really use that term as much anymore. It's the confluence of not only the technology but the companies who, what'd you say, 75% of the world's transactions run though SAP at some point? >> Christoph: Yeah. >> With companies like SAP behind it, and others, that's when this thing, I think, really takes off. >> What I think a lot of people don't realize, and I've been a customer, also, for a long time before I joined the vendor side, and what is under-realized is the aspect of risk management. Once you have a system and once you have business processes digitized and they run your business, you can't introduce radical changes overnight as quickly anymore as you'd like or your business would like. So, risk management is really very important to companies. That's why you see innovation within organizations not necessarily come from the core digitization organization within their enterprise, it often happens on the outside, within different business units that are closer to the product or to the customer or something. >> Something else that's happening, too, that I wanted to address is this notion of digitization, which is all about data, allows companies to jump industries. You're seeing it everywhere, you're seeing Amazon getting into content, Apple getting into financial services. You know, there's this premise out there that Uber isn't about taxicabs, it's about logistics. >> John: Yeah. >> And so you're seeing these born-digital, born in the cloud companies now being able to have massive impacts across different industries. Huge disruption creates, you know, great opportunities, in my view. >> Christoph: Yeah. >> David: What do you think? 
>> I mean, I just think that the disruption is going to be brutal, and I want to, I'm trying to synthesize what's happening in this show, and you know, you're going to squint through all the announcements and the products, really an upgrade to 2.6, a new data platform. But here in Europe the IOT thing just, to me, is a catalyst point because it's really a proof point to where the value is today. >> David: Mhm. >> That people can actually look at and say, "This is going to have an impact on our business, to your digitization point" and I think IOT is pulling the big data industry and cloud together. And I think machine learning and things that come over the top on it are only going to make it go faster. And so that intersection point, where the AI, augmented intelligence, is going to come in, I think that's where you're going to start to see real proof points on value proposition of data. I mean, right now it's all kind of an inner circle game. "Oh yeah, got to get the insights, optimize this process here and there" and so there's some low hanging fruit, but the big shifting, mind blowing, CEO changing strategies will come from some bigger moves. >> To that point, actually, two things I want to mention that SAP does in that space, specifically, right? Startups, we have a program actually, SAP.io, that Bill McDermott also recently introduced again, where we invest in startups in this space to help foster innovation faster, right? And also connecting that with our customers. >> John: What is it called? >> SAP.io Something to look out for. And on the topic of IOT, we made, also, an announcement at the beginning of the year, Project Leonardo. >> Yeah. >> It's a commitment, it's a solution set, and it's also an investment strategy, right? We're committed in this market to invest, to create solutions, we have solutions already in the cloud and also on-premises. There are a few companies we also purchased in conjunction with Leonardo, IOT specifically. 
Some of our customers in the manufacturing space, very strong opportunity for IOT, sensor collection, creating SLAs for robotics on the manufacturing floor. For example, we have a complete solution set to make that possible and realize that for our customers and that's exactly a perfect example where these sensor applications in IOT, edge, compute rich environments come together also with a core where, then, a system of references like machine points, for example, matter because if you manage the SLA for a machine, for example, you just not only monitor it, you want to also automatically trigger the replacement of a part, for example, and that's why you need an SAP component, as well. So, in that space, we're heavily investing, as well. >> The other thing I want to say about IOT is, I see it, I mean, cloud and big data have totally disrupted the IT business. You've seen Dell buying EMC, HP had to get out of the cloud business, Oracle pivoted to the cloud, SAP obviously, going hard after the cloud. Very, very disruptive, those two trends. I see IOT as not necessarily disruptive. I see those who have the install base as adopting IOT and doing very, very well. I think it's maybe disruptive to the economy at large, but I think existing companies like GE, like Siemens, like Daimler, are going to do very, very well as a result of IOT. I mean, to the extent they embrace digitization, which they would be crazy not to. >> Alright guys, final thoughts. What's your walkaway from this show? Dave, we'll start with you. >> I was going to say, you know, Hadoop has definitely not failed, in my mind, I think it's been wildly successful. It is entering this new phase that I call sort of young-adulthood and I think it's, we know it's gone mainstream into the enterprise, now it's about, okay, how do I really drive the value of data, as we've been discussing, and hit that steep part of the S-curve. 
Which, I agree, it's going to be within the next two years, you're going to start to see massive returns. And I think this industry is going to be realized, looked back, it was undervalued in 2017. >> Remember how long it took to align on TCP/IP? (laughter) >> Walk away, I mean interoperability was key with TCP/IP. >> Christoph: Yeah. One of the things that made things happen. >> I remember talking about it. (laughter) >> Yeah, two megabits per second. Yeah, but I mean, bringing back that, what's your walkaway? Because is it a unification opportunity? Is it more of an ecosystem? >> A good friend of mine, also at SAP on the West Coast, Andreas Walter, he shared an observation that he saw in another presentation years ago. It was suits versus hoodies. Different kind of way to run your IT shop, right? Top-down structure, waterfall projects, and suits, open source, hack it, quickly done, you know, get in, walk away, make money. >> Whoa, whoa, whoa, the suits were the waterfall, hoodies was the agile. >> Christoph: That's correct. >> Alright, alright, okay. >> Christoph: Correct. So, I think that it's not just the technology that's coming together, it's mindsets that are coming together. And I think organizationally for companies, that's the bigger challenge, actually. Because one is very subscribed, change control oriented, risk management aware. The other is very progressive, innovative, fast adopters. That these two can't bring those together, I think that's the real challenge in organizations. >> John: Mhm, yeah. >> Not the technology. And on that topic, we have a lot of very intelligent questions, very good conversations, deep conversations here with the audience at this event here in Munich. >> Dave, my walkaway was interesting because I had some preconceived notions coming in. Obviously, we were prepared to talk about, and because we saw the S1 File by Cloudera, you're starting to see the level of transparency relative to the business model. 
One's worth one billion dollars in private value, and then Hortonworks pushing only 2700 million in a public market, which I would agree with you is undervalued, vis a vis what's going on. So obviously, you're going to see my observation coming in from here is that I think that's going to be a haircut for Cloudera. The question is how much value will be chopped down off Cloudera, versus how much value of Hortonworks will go up. So the question is, does Cloudera plummet, or does Cloudera get a little bit of a haircut or stay and Hortonworks rises? Either way, the equilibrium in the industry will be established. The other option would be >> Dave: I think the former and the numbers are ugly, let's not sugarcoat it. And so that's got to change in order for this prediction that we're making. >> John: Former being the haircut? >> Yeah, the haircut's going to happen, I think. But the numbers are really ugly. >> But I think the question is how far does it drop and how much of that is venture. >> Sure. >> Venture, arbitrage, or just how they are capitalized but Hortonworks could roll up. >> But my point is that those numbers have to change and get better in order for our prediction to come true. Okay, so, but in your second talk, sorry to interrupt you but >> No, I like a debate and I want to know where that line is. We'll be watching. >> Dave: Yeah. >> But the value in, I think you guys are pointing out but I walk away, is IOT is bigger here, and I already said that, but I think the S-curve is, you're right on. I think you're going to start to see real, fast product development around incorporating data, whether that's a Hortonworks model, which seems to be the nice unifying, partner-oriented one, that's going to start seeing specialized hardware that people are going to start building chips for using flash or other things, and optimizing hard complexities. You pointed that out on the intro yesterday. And putting real product value on the table. 
I think the cards are going to start hitting the table in ecosystem, and what I'm seeing is that happening now. So, I think just an overall healthy ecosystem. >> Without a doubt. >> Okay. >> Great. >> Any final comments? >> Let's have a beer. >> Great to see you in Munich. (laughter) >> We'll have a beer, we had a pig knuckle last night, Dave. We had some sauerkraut. >> Christoph: (speaks foreign word) >> Yeah, we had the (speaks foreign word). Dave, we'll grab the beer, thanks. Good to be with you again. Thanks to the crew, thanks to everyone watching. >> Thanks, John. >> The CUBE, signing off from Munich, Germany for DataWorks 2017. Thanks for watching, see ya next time. (soft techno music)

Published Date : Apr 7 2017


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Christoph Schubert	PERSON	0.99+
Christoph	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Siemens	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
John	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
GE	ORGANIZATION	0.99+
Germany	LOCATION	0.99+
Andreas Walter	PERSON	0.99+
2014	DATE	0.99+
Europe	LOCATION	0.99+
Munich	LOCATION	0.99+
2017	DATE	0.99+
David	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
1999	DATE	0.99+
HP	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
75%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Uber	ORGANIZATION	0.99+
Dimar	ORGANIZATION	0.99+
Christoph Streubert	PERSON	0.99+
2018	DATE	0.99+
Bill McDermont	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
third stage	QUANTITY	0.99+
first trillion dollar	QUANTITY	0.99+
second stage	QUANTITY	0.99+
one billion dollars	QUANTITY	0.99+
two	QUANTITY	0.99+
SAP	ORGANIZATION	0.99+
second talk	QUANTITY	0.99+
yesterday	DATE	0.99+
Munich, Germany	LOCATION	0.99+
DataWorks Summit	EVENT	0.99+
SAP Big Data	ORGANIZATION	0.98+
both	QUANTITY	0.98+
fifth season	QUANTITY	0.98+
Bavarian	OTHER	0.98+
One	QUANTITY	0.98+

Nadeem Gulzar | DataWorks Summit Europe 2017

>> Announcer: Live from Munich, Germany, it's the CUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hey welcome back everyone. We're here live in Munich, Germany for DataWorks 2017 Summit, formerly known as Hadoop Summit, now called DataWorks. I'm John Furrier with the CUBE, my co-host Dave Vellante, here for two days of wall-to-wall coverage. Our next guest is Nadeem Gulzar, head of advanced Analytics at Danske Bank. Welcome to the CUBE. >> Thank you. >> You're a customer but also talking here at the event, bringing all your folks here. Your observation, I mean, Hadoop is not going away, certainly we see that. But now, as John Kreisa, who was MC'ing, was on earlier said, open up the aperture to analytics, is really where the action is. >> Nadeem: Absolutely. >> Your reaction to that. >> I completely agree, because again, Hadoop is basically just the basic infrastructure, right. Components build on components, and things like that. But, when you really utilize it, is when you add the advanced analytics frameworks. There are many out there. I'm not going to favor one over another. But the main thing is, you need that to really leverage Hadoop. And, at the same time, I think it's very important to realize how much power there actually is in this. For us at, in Danske Bank, getting Hadoop, getting the advanced analytics framework, has really proven quite a lot. It allowed us actually to dig into our core data, transaction data for instance, which we haven't been able to for decades. >> So take me through, because you guys are an interesting use case because you're advanced. You're gettin' at the data, which is cutting edge. But you're going through this transformation, and you have to because you're on the front lines. Take us inside the company, without giving away any trade secrets, and describe the environment. 
What's the current situation, and how is it evolving from an IT standpoint, and also from the relationship with the stakeholders in the business side. >> So again, we are a bank with 20,000 employees, so of course in a large organization you have silos, People feeling okay, this is my domain, this is my kingdom, don't touch it. Don't approach me, or you can approach me, talk to me, you have to convince me, otherwise don't talk to me at all. So we get that quite a lot, and to be honest, from my point of view, if we do not lift as a bank, we're not going to succeed. If I have success, if my organization of almost 60 people have success, that's good in itself, but we are not going to succeed as a bank. So for me, it's quite important that I go down and break down these barriers, and allow us to come in, tell the business units, tell them what sort of capabilities do we bring, and include them. That is actually the main key. I don't want to replace them or anything like that. >> So an organizational challenge is to get the mindset shifted. How 'about process gaps and product gaps? 'Cause I mean I almost see the sequence, kind of a group hug if you will, organizational mindset, kind of a reset or calibration. And then identify processes and then product gaps, seem to be the next transition. >> Absolutely, absolutely, and there are some gaps. Still, even though we have been on this journey for a considerable amount of time, there are still gaps, both in terms of processes and products. Because again, even though we have top management buy in, it doesn't go through all the way down to the middle layer. So we still struggle with this from time to time. >> How do you break down those barriers? What do you do, what's your strategy? >> I'm humble, to be honest. I go in, I tell them, listen you guys I have some capabilities that I can add to your capabilities. I want you to leverage me to make your life easier. I want to lift you as an organization. 
I don't care about myself, I want you to be better at what you're doing. >> So Nadeem, the money business and the technology business have always had a close relationship. It was like in 2010 after we came out of the downturn, it was like this other massive collision. You had begun experimenting with Cloud, the shift, CapEx to OpEx. The data thing hit in a big way, obviously mobile became real. So talk about the confluence of those technologies, specifically in the context of your big data journey. Where did you get started, and how did it evolve? >> So actually it fit in quite nicely because we were coming out of this down period, right, so there was extreme amount of focus on cost. So, of course at the time where we wanted to go into this journey, a lot of people were asking, okay how much does this cost, what's the big strategy, and so on. And how's the road map going to look like, and what's the cost of the road map? The thing is, if you buy some off the shelf commercial product, it's quite expensive. We can easily talk like half a billion, something like that, for a full end to end system. So with this, you were allowed, or we were allowed, to start up with relatively small funding, and I'm actually talking about just like a million dollars, roughly. And that actually allowed us a substantial boost in the capability department, in allowing us to show what kind of use cases we could build, and what kind of value we could bring to Danske Bank. >> So you started with understanding Hadoop? Is that right, was that the starting point? >> Yes, in a fairly small, very researched team set up. We did the initial research, we looked at, okay what could this bring? We did some initial, what we call, proof of value. So small, small, pilot projects, looking at, okay this is the data. We can leverage it in this way, this is the value we can bring. How much can we actually boost the business? So everything is directly linked to business value. 
So, for instance, one of the use cases was within customers, understanding customer behavior, directly linking it to marketing, do more targeted marketing, and at the end get more results in terms of increased sales. >> We just started a journey 2009, 2010, is that right? Or was it later? >> No, we started somewhat later. The initial research was in '14. >> In '14? Okay, alright, so '14 you sort of became familiar with Hadoop, and then I imagine, like many customers, you said okay, wow this stuff is complicated, but you were takin' it in small chunks, low risk. Let's get some value. Marketing is an obvious use case. I would imagine fraud is another obvious use case. So then, how did that evolve? I mean it's only a few years now, but I imagine you've evolved very quickly. >> Extremely quickly. Actually, within two months of the research, we actually saw a huge benefit in this area, and directly we went with the material to the senior members of the different boards we wanted to affect, and actually, you could call it luck. But, maybe we were just well prepared and convincing, so we actually directly got funding at that point in time. They said, listen, this is very promising. Here you go, start off with the initial, slightly larger projects, prove some value, and then come back to us. Initially they wanted us to do two things, look into the customer journey, or doing deeper customer behavior analytics, and the second was within risk. Doing things like, text mining, financial statements, getting some deeper into that, doing some web crawling on financial data such as Bloomberg, etcetera, and then pull it into the system. >> To inform your investments as a financial institution. From an architecture and infrastructure standpoint, we talked about starting at Hadoop. Has it evolved, how has it evolved? Where do you see it going? >> It has evolved quite a lot in the past couple of years. 
And again, to be honest, it's like every quarter something new is happening and we need to do some adjustments even to the core architecture. And with the introduction of HDP 3.0 later this year, I think we're going to see a massive change once again. Hortonworks already calls it a major change, or a major release. But actually, the things they are doing is extremely promising, so we want to take that step with them. But again, it's going to affect us. >> What's exciting about that to you? >> The thing that's very exciting is, we are now at like a balance point, where we have played quite a lot, we have released a couple of production grade solutions, but we have really not reached the full enterprise potential. So getting like into the real deep stuff with living under heavy SLAs, regulation stuff. All these kind of things is not in place yet, from my point of view. >> We talk a lot about, in the CUBE, and in our company, about these emergent work loads; you had batch, interactive, and the world went back to batch with Hadoop, and now you have this continuous workload, these streaming real-time workloads. How is that affecting your organization, generally, and specifically, your thinking about architecture? How real is that and where do you see that in the future? >> It's the core, to be honest. Again, one of the main things we are trying to do is look into, so, gone are the days with heavy, heavy batches of data coming in. Because if you look at web logs for instance, so when customers interact with our web, or our tablet solution, or mobile solution, the amount of data generated is humongous. So, no way on earth you can think about batches anymore. So it's more about streaming the data all the way in, doing real time analytics and then produce results. >> What would you say are your biggest, big data challenges, problems that you really want to attack and solve? >> So, what I really want to attack is, getting all sorts of data into the system. 
So, you can imagine, as a bank we have 2,000 plus systems. We have approximately 4,000 different points that deliver data. So getting all that mass into our data lake, it's a huge task. We actually underestimated it. But now, we have seen we have to attack it and get it in because that is the gold. Data is the future gold. So we need to mine it in, we need to do analytics on top of it and produce value. >> And then once you get it in there, I'm sure you're anticipating that you want to make sure this doesn't go stale, doesn't become a swamp, doesn't get frozen. It's your job to talk about data oceans, which is really the long term vision I presume, right? >> And that is a key as well because with the GDPR for instance, we need to have full mapping and full control of all the data coming in. We need to be able to generate metadata, we need to have full data lineage. We need to know what, all the data where it came from, how it's interconnected, relations, all that. >> And that's what, two years away from implementation? Is that about right? >> It's going to take a while, of course. But again, the key thing is we make the framework so all the data coming in step by step, has that. >> Yeah, but so GDPR though, it goes into effect in '19, is that correct? >> It's actually May '18. >> May '18, oh, so it's much tighter time frame than I realized. >> John: You're under the gun. >> Nadeem: Yes. >> Okay, observation here at this event, obviously a lot of IOT, for you that's people. People and things are kind of the edge of the network. The intelligent edge is a big, big topic. Very dynamic. >> Nadeem: Extremely dynamic. >> A lot of things happening. Lot of opportunities for you to be this humble service provider to your constituents, but also your customers. How do you guys view that? What's the current landscape look like as you look outside the company and look at what's happening around you, the world? >> A lot of cool things are going on, to be honest. 
Especially in IOT, right? I mean, even though we are a core bank, still, there are a lot of sensors we can use. I talked a bit about, under the keynote, about ATM's, right? So, we're also looking at how can we utilize this technology? How can we enable our customers? If you look at our apps, they also generate extreme amounts of data, right? The mobile solution that we have, it gives away GPS location and things like that. And we want to include all that data in. At the end of the day, it's not for our gain, we are not always looking at making the next buck, right? It's also about being there for the customer, providing the services they need, making their banking life easier. >> And your ecosystem is evolving and rapidly adding new constituents to your network because, then you have the consumer with the phone, the mobile app alone, never mind the point of sale opportunity at the ATM. Now a digital, augmented reality experience could be enabled where you now have fintech suppliers, and potentially other suppliers in this now digital network that could be relational with you. >> Yes, and our job is to make sure that we leverage that. Acquiring a banking license is extremely difficult. But we have it, and what we need to do is to engage these fintechs, partners, even other banks, and say listen guys we invite you in. Utilize our services, utilize our framework, utilize our foundation and let's build something upon that. >> If you had to explain, Nadeem, this fintech start up trend because it is super hot, what is it? I mean how would you describe to someone who's not in the banking world. 'Cause most people would be scratching their head and say, isn't that banking? But, now this ecosystem is developing of new entrepreneurial activity and they're skyrocketing with success 'cause they have either a specialty focus, they do something extremely well. It may or may not be in a direct big space with a bank, but a white space. Use cases. So, is it good? Is it bad? 
Is it hype? What's the current state of the fintech situation? >> From my point of view, it's awesome. And the reason is, these guys are pushing us. Remember, we are a hundred fifty plus year old bank. And sometimes we do tend to just pat on our back and say, okay, this is going good, right? But, these guys are coming in, giving some competition, and we love it. >> Give me an example of a fintech capabilities. Randomly bring up some examples to highlight what fintech is. >> So what we've seen in, for instance the German market, is the fintechs coming in, utilizing some of the customer data, and then producing awesome new applications. Whether it is a new net bank, where a customer can interact with it, in a much, much more smoother way. Some of the banks tend to over clutter things, not make it simple. So things like, where you can put in, you can look at your transactions in a Google Map, for instance. You can see how much do you spend at this location. You can move around. >> You could literally follow the money, on a map. (laughing) >> So this is your home base, you go out here, you spend this amount of money, and maybe even add more on it. So, let's say you do your grocery shopping over here, but if I moved all my business from this company to this company, how much could I save? Imagine if you could just drag and drop it and see, okay, I could actually save a couple of thousand bucks, awesome. >> And machine learning is going to totally change the game with Augmented Intelligence. AI is called Artificial Intelligence, or Augmented Intelligence, depending upon your definition. This is a good thing for consumers. >> It is, it is. >> And thinking about disruption, what do you guys, what are your thoughts on blockchain? What is your research showing? You playing around with Hyperledger at all? >> Yes we are. And blockchain, it's also quite interesting. We're doing lots of research on that. 
What it's shown actually is that this is a technology that we can also use. And we can also really utilize, even the security aspects of it. If you just take that, you could really implement that. >> The identity aspect, it's federating identity around fraud, another area you can innovate on. I'm bullish on blockchain, a lot of people are skeptical, but Dave knows I really, I love blockchain. Because it's not about Bitcoin per se, it's sort of the underlying opportunity. It just seems fascinating. Dave you know, I got to get on my soapbox, blockchain soapbox. >> We've never really looked at Bitcoin as just a currency, it's more of a technology platform, and I have always been fascinated with the security angle. Virtually unhackable, put that in quotes. No need for a third party to intermediate. So many positive fundamentals, now it's guys like you figuring out, okay the practitioner saying, here's how we're going to implement it and commercialize it. >> And actually it fits in quite well with things like GDPR. This is also about opening up, the same with PSD 2. Exposing the customer data, making it available for the general public. And ultimately the goal is, so you as a consumer, me as a consumer, we own our data. >> Nadeem, thank you so much for coming on the CUBE and sharing your practitioner situation, and your advice, as well as commentary. I'll give you the last word. As you and your team embark from DataWorks 2017 and head back to the ranch, so to speak, and bring back some stuff. What are you going to work on? What's the to do item? What are you going to sharpen the saw on and cut when you get back? >> So for us on the very, very short term, it's about taking our platform and our capabilities and move it into the real enterprise world. That is our first key milestone that we are going to go for. And, I'll tell you, we're going to go all in for that. 
Because, unless we do that, we're not able to really attack the core of banking, which requires this, right? Please remember that a consumer doing a transaction somewhere in the world, he cannot stand and wait for ages for something to be processed. It needs to be instantaneous. So, this is what we need to do. >> You think this event, you're armed up with product. >> Absolutely, absolutely. Lots of good insight we've gotten from this. Lots of potential, lots of networking guys and other companies that we can talk to about this. >> Also great recruiting, get some developers out there too, lot of great people. Congratulations on your success and thanks for sharing this great insight here on the CUBE, exposing the data to you live on the CUBE. Silicon Angle dot TV, I'm John Furrier, with Dave Vellante my co-host, more great coverage stay with us here live in Munich, Germany for DataWorks 2017 Summit. We'll be right back.

Published Date : Apr 6 2017

SUMMARY :

Brought to you by Hortonworks. Welcome to the CUBE. You're a customer but also talking here at the event, is when you add the advanced analytics frameworks. and you have to because you're on the front lines. So again, we are a bank with 20,000 employees, kind of a group hug if you will, So we still struggle with this from time to time. I want you to leverage me to make your life easier. the shift, CapEx to OpEx. And how's the road map going to look like, We did the initial research, we looked at, No, we started somewhat later. so '14 you sort of became familiar with Hadoop, and directly we went with the material Where do you see it going? and we need to do some adjustments So getting like into the real deep stuff and now you have this continuous workload, Again, one of the main things we are trying to do What would you say are your biggest, and get it in because that is the gold. And then once you get it in there, of all the data coming in. But again, the key thing is we make the framework so it's much tighter time frame than I realized. obviously a lot of IOT, for you that's people. Lot of opportunities for you A lot of cool things are going on, to be honest. then you have the consumer with the phone, and say listen guys we invite you in. I mean how would you describe to someone and we love it. Give me an example of a fintech capabilities. Some of the banks tend to over clutter things, You could literally follow the money, on a map. So, let's say you do your grocery shopping over here, And machine learning is going to totally change the game that we can also use. Dave you know, I got to get on my soapbox, and I have always been fascinated with the security angle. so you as a consumer, me as a consumer, we own our data. and cut when you get back? That is our first key milestone that we are going to go for. that we can talk to about this. exposing the data to you live on the CUBE.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Danske Bank	ORGANIZATION	0.99+
Nadeem	PERSON	0.99+
John Kreisa	PERSON	0.99+
Nadeem Gulzar	PERSON	0.99+
Dave	PERSON	0.99+
May '18	DATE	0.99+
2009	DATE	0.99+
2010	DATE	0.99+
John Furrier	PERSON	0.99+
Bloomberg	ORGANIZATION	0.99+
20,000 employees	QUANTITY	0.99+
two days	QUANTITY	0.99+
half a billion	QUANTITY	0.99+
two years	QUANTITY	0.99+
two months	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
'19	DATE	0.99+
'14	DATE	0.99+
CUBE	ORGANIZATION	0.99+
two things	QUANTITY	0.99+
Google Map	TITLE	0.99+
Munich, Germany	LOCATION	0.99+
both	QUANTITY	0.98+
DataWorks 2017 Summit	EVENT	0.98+
GDPR	TITLE	0.98+
Hadoop	TITLE	0.98+
DataWorks	EVENT	0.98+
Munich Germany	LOCATION	0.98+
PSD 2	TITLE	0.98+
first key milestone	QUANTITY	0.98+
second	QUANTITY	0.97+
DataWorks Summit	EVENT	0.97+
Hadoop Summit	EVENT	0.97+
one	QUANTITY	0.97+
almost 60 people	QUANTITY	0.95+
Hadoop	ORGANIZATION	0.95+
later this year	DATE	0.94+
approximately 4,000 different points	QUANTITY	0.94+
2,000 plus systems	QUANTITY	0.93+
a hundred fifty plus year old	QUANTITY	0.93+
Silicon Angle dot TV	ORGANIZATION	0.93+
2017	EVENT	0.93+
a couple of thousand bucks	QUANTITY	0.87+
DataWorks Summit Europe 2017	EVENT	0.85+
decades	QUANTITY	0.85+
German	LOCATION	0.84+
OpEx	ORGANIZATION	0.82+
past couple of years	DATE	0.76+
a million dollars	QUANTITY	0.76+
earth	LOCATION	0.75+
CapEx	ORGANIZATION	0.74+
Hyperledger	ORGANIZATION	0.71+
2017	DATE	0.7+
3	COMMERCIAL_ITEM	0.68+
Bitcoin	OTHER	0.64+
SLA	TITLE	0.64+
Europe	LOCATION	0.59+
Weblocks	ORGANIZATION	0.59+
HDB	TITLE	0.58+
years	QUANTITY	0.57+
DataWorks	TITLE	0.49+
Cloud	TITLE	0.44+
CUBE	TITLE	0.41+

Carlo Vaiti | DataWorks Summit Europe 2017

>> Announcer: You are CUBE Alumni. Live from Munich, Germany, it's theCUBE. Covering, DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hello, everyone, welcome back to live coverage at DataWorks 2017, I'm John Furrier with my cohost, Dave Vellante. Two days of coverage here in Munich, Germany, covering Hortonworks and Yahoo, presenting Hadoop Summit, now called DataWorks 2017. Our next guest is Carlo Vaiti, who's the HPE chief technology strategist, EMEA Digital Solutions, Europe, Middle East, and Africa. Welcome to theCUBE. >> Thank you, John. >> So we were just chatting before we came on, of your historic background at IBM, Oracle, and now HPE, and now back into the saddle there. >> Don't forget Sun Microsystems. >> Sun Microsystems, sorry, Sun, yeah. I mean, great, great run. >> It was a long run. >> You've seen the computer revolution happen. I worked at HP for nine years, from '88 to '97. Again, Dave was a premier analyst during that run of client-server. We've seen the computer revolution happen. Now we're seeing the digital revolution where the iPhone is now 10 years old, Cloud is booming, data's at the center of the value proposition, so a completely new disruptive capability. >> Carlo: Sure, yes. >> So what are you doing as the CTO, chief technologist for HPE, how are you guys bringing this story together? 'Cause there's so much going on at HPE. You got the services split, you got the software split, and HP's focusing on the new style of IT, as Meg Whitman calls it. >> So, yeah. My role in EMEA is actually all about having basically a visionary kind of strategy role for what's going to be HP in the future, in terms of IT. And one of the things that we are looking at is, is specifically to have, we split our strategy in three different aspects, so three transformation areas. The first one which we usually talk is what I call hybrid IT, right, which is basically making services around either On-Premise or on Cloud for our customer base. 
The second one is actually power the Intelligent Edge, so is actually looking after our collaboration and when we acquire Aruba components. And the third one, which is in the middle, and that's why I'm here at the DataWorks Summit, is actually the data-analytics aspects. And we have a couple of solutions in there. One is the enterprise-grade Hadoop, which is part of this. This is actually how we generalize all the figure and the strategy for HP. >> It's interesting, Dave and I were talking yesterday, being in Europe, it's obviously a different sideshow, it's smaller than the DataWorks or Hadoop Summit in North America in San Jose, but there's a ton of Internet of things, IoT or IIoT, 'cause here in Germany, obviously, a lot of industrial nations, but in Europe in general, a lot of smart cities initiatives, a lot of mobility, a ton of Internet of things opportunity, more than in the US. >> Absolutely. >> Can you comment on how you guys are tackling the IoT? Because it's an Intelligent Edge, certainly, but it's also data, it's in your wheelhouse. >> Yes, sure. So I'm actually working, it's a good question, because I'm actually working a couple of projects in Eastern Europe, where it's all about Industrial IoT Analytics, IIoTA. That's the new terminology we use. So what we do is actually, we analyze from a business perspective, what are the business pain points, in an oil and gas company for example. And we understand for example, what kind of things that they need and must have. And what I'm saying here is, one of the aspects for example, is the drilling opportunity. So how much oil you can extract from a specific rig in the middle of the North Sea, for example. This is one of the key question, because the customer want to understand, in the future, how much oil they can extract. The other one is for example, the upstream business. 
So doing on the retail side and having, say, when my customer is stopping in a gas station, I want go in the shop, immediately giving, I dunno, my daughter, a kind of campaign for the Barbie, because they like the Barbie. So IoT, Industrial IoT help us in actually making a much better customer experience, and that's the case of the upstream business, but is also helping us in actually much faster business outcomes. And that's what the customer wants, right? 'Cause, and was talking with your colleague before, I'm talking to the business guy. I'm not talking to the IT anymore in these kind of place, and that's how IoT allow us a chance to change the conversation at the industry level. >> These are first-time conversations too. You're getting at the kinds of business conversations that weren't possible five years ago. >> Carlo: Yes, sure. >> I mean and 10 years ago, they would have seemed fantasy. Now they're reality. >> The role of analytics in my opinion, is becoming extremely key, and I said this morning, for me my best sentence is that the data is the stone foundation of the digital economy. I continue to repeat this terminology, because it's actually where everything is starting from. So what I mean is, let's take a look at the analytic aspect. So if I'm able to analyze the data close to the shop floor, okay, close to the shop manufacturing floor, if I'm able to analyze my data on the rig, in the oil and gas industry, if I'm able to analyze doing preprocessing analytics, with Kafka, Druid, these kind of open-source software, where close to the Intelligent Edge, then my customer's going to be happy, because I give them very fast response, and the decision-maker can get to decision in a faster time. Today, it takes a long time to take these type of decision. So that's why we want to move into the power Intelligent Edge. >> So you're saying, data's foundational, but if you get to the Intelligent Edge, it's dynamic. 
So you have a dynamic reactive, realtime time series, or presences of data, but you need the foundational pre-data. >> Perfect. >> Is that kind of what you're getting at? >> Yes, that's the first step. Preprocessing analytics is what we do. In the next generation of, we think is going to be Industrial IoT Analytics, we're going to actually put massive amount of compute close to the shop manufacturing floor. We call internally or actually externally, Converged Plant Infrastructure. And that's the key point, right? >> John: Converged plant? >> Converged Plant Infrastructure, CPI. If you look at in Google, you will find. It's a solution we bring in the market a few months ago. We announce it in December last year. >> Yeah, Antonio's smart. He also had a converged systems as well. One of the first ones. >> Yeah, so that's converge compute at the edge basically. >> Correct, converge compute-- >> Very powerful. >> Very powerful, and we run analytics on the edge. That's the key point. >> Which we love, because that means you don't have to send everything back to the Cloud because it's too expensive, it's going to take too long, it's not going to work. >> Carlo: The bandwidth on the network is much less. >> There's no way that's going to be successful, unless you go to the edge and-- >> It takes time. >> With a cost. >> Now the other thing is, of course, you've got the Aruba asset, to be able to, I always say, joke, connect the windmill. But, Carlo, can we go back to the IoTA example? >> Carlo: Correct, yeah. >> I want to help, help our audience understand, sort of, the new HP, post these spin merges. So previously you would say, okay, we have Vertica. You still have partnership, or you still own Vertica, but after September 1st-- >> Absolutely, absolutely. It's part of the columnar side-- >> Right, yes, absolutely, but, so. But the new strategy is to be more of a platform for a variety of technology. 
So how for instance would you solve, or did you solve, that problem that you described? What did you actually deliver? >> So again, as I said, we're, especially in the Industrial IoT, we are an ecosystem, okay? So we're one element of the ecosystem solution. For the oil and gas specifically, we're working with other system integrator. We're working with oil and gas industry expertise, like DXC company, right, the company that we just split a few days ago, and we're working with them. They're providing the industry expertise. We are an infrastructure provider around that, and the services around that for the infrastructure element. But for the industry expertise, we try to have a kind of little bit of knowledge, to start the conversation with the customer. But again, my role in the strategy is actually to be an ecosystem digital integrator. That's the new terminology we like to bring in the market, because we really believe that's the way HP role is going to be. And the relevance of HP is totally depending if we are going to be successful in these type of things. >> Okay, now a couple other things you talked about in your keynote. I'm just going to list them, and then we can go wherever we want. There was Data Lake 3.0, Storage Disaggregation, which is kind of interesting, 'cause it's been a problem. Hadoop as a service, Realtime Everywhere, and then Analytics at the Edge, which we kind of just talked about. Let's pick one. Let's start with Data Lake 3.0. What is that? John doesn't like the term data lake. He likes data ocean. >> I like data ocean. >> Is Data Lake 3.0 becoming an ocean? >> It's becoming an ocean. So, Data Lake 3.0 for us is actually following what is going to be the future for HDFS 3.0. So we have three elements. The erasure coding feature, which is coming on HDFS. The second element is around having HDFS data tier, multi-data tier. So we're going to have faster SSD drives. We're going to have big memory nodes. We're going to have GPU nodes. 
And the reason why I say disaggregation is because some of the workload will be only compute, and some of the workload will be only storage, okay? So we're going to bring, and the customer require this, because it's getting more data, and they need to have for example, YARN application running on compute nodes, and the same level, they want to have storage compute block, sorry, storage components, running on the storage model, like HBase for example, like HDFS 3.0 with the multi-tier option. So that's why the data disaggregation, or disaggregation between compute and storage, is the key point. We call this asymmetric, right? Hadoop is becoming asymmetric. That's what it mean. >> And the problem you're solving there, is when I add a node to a cluster, I don't have to add compute and storage together, I can disaggregate and choose whatever I need, >> Everyone that we did. >> based on the workload. >> They are all multitenancy kind of workload, and they are independent and they scale out. Of course, it's much more complex, but we have actually proved that this is the way to go, because that's what the customer is demanding. >> So, 3.0 is actually functional. It's erasure coding, you said. There's a data tier. You've got different memory levels. >> And I forgot to mention, the containerization of the application. Having dockerized the application for example. Using mesosphere for example, right? So having the containerization of the application is what all of that means, because what we do in Hadoop, we actually build the different clusters, they need to talk to each other, and change data in a faster way. And a solution like, a product like SQL Manager, from Hortonworks, is actually helping us to get this connection between the cluster faster and faster. And that's what the customer wants. >> And then Hadoop as a service, is that an on-premise solution, is that a hybrid solution, is it a Cloud solution, all three? >> I can offer all of them. 
Hadoop as a service could be run on-premise, could be run on a public Cloud, could be run on Azure, or could be mix of them, partially on-premise, and partially on public. >> And what are you seeing with regard to customer adoption of Cloud, and specifically around Hadoop and big data? >> I think the way I see adoption is all the customer want to start very small. The maturity is actually better from a technology standpoint. If you're asking me the same question maybe a year ago, I would say, it's difficult. Now I think they've got the point. Every large customer, they want to build this big data ocean, data lake, ocean, whatever you want to call it. >> John: Love that. (laughs) >> All right. They want to build this data ocean, and the point I want to make is, they want to start small, but they want to think very high. Very big, right, from their perspective. And the way they approach us is, we have a kind of methodology. We establish the maturity assessment. We do a kind of capability maturity assessment, where we find that if the customer is actually a pioneer, or is actually a very traditional one, so it's very slow-going. Once we determine where is the stage of the customer is, we propose some specific proof of concept. And in three months usually, we're putting this in place. >> You also talked about realtime everywhere. We in our research, we talk about the, historically, you had batch, interactive, and now you have what we call continuous, or realtime streaming workloads. How prevalent is that? Where do you see it going in the future? >> So I think is another trend for the future, as I mentioned this morning in my presentation. So and Spark is actually doing the open-source memory engine process, is actually the core of this stuff. We see 60 to 70 times faster analytics, compared to not to use Spark. So many customer implemented Spark because of this. 
The requirement are that the customer needs an immediate response time, okay, for a specific decision-making that they have to do, in order to improve their business, in order to improve their life. But this require a different architecture. >> I have a question, 'cause you, you've lived in the United States, you're obviously global, and spent a lot of time in Europe as well, and a lot of times, people want to discuss the differences between, let's make it specific here, the European continent and North America, and from a sophistication standpoint, same, we can agree on that, but there are still differences. Maybe, more greater privacy concerns. The whole thing with the Cloud and the NSA in the United States, created some concerns. What do you see as the differences today between North America and Europe? >> From my perspective, I think we are much more for example take IoT, Industrial IoT. I think in Europe we are much more advanced. I think in the manufacturing and the automotive space, the connected car kind of things, autonomous driving, this is something that we know already how to manage, how to do it. I mean, Tesla in the US is a good example that what I'm saying is not true, but if I look at for example, large German manufacturing car, they always implemented these type of things already today. >> Dave: For years, yeah. >> That's the difference, right? I think the second step is about the faster analytic approach. So what I mentioned before. The Power the Intelligent Edge, in my opinion at the moment, is much more advanced in the US compared to Europe. But I think Europe is starting to run back, and going on the same route. Because we believe that putting compute capacity on the edge is what actually the customer wants. But that's the two big differences I see. >> The other two big external factors that we like to look at, are Brexit and Trump. So (laughs) how 'about Brexit? 
Now that it's starting to sort of actually become, begin the process, how should we think about it? Is it overblown? Is it critical? What's your take? >> Well, I think it's too early to say. UK just split a few days ago, right, officially. It's going to take another 18 months before it's going to be completed. From a commercial standpoint, we don't see any difference so far. We're actually working the same way. For me it's too early to say if there's going to be any implication on that. >> And we don't know about Trump. We don't have to talk about it, but the, but I saw some data recently that's, European sentiment, business sentiment is trending stronger than the US, which is different than it's been for the last many years. What do you see in terms of just sentiment, business conditions in Europe? Do you see a pick up? >> It's getting better, it is getting better. I mean, if I look at the major countries, the P&L is going positive, 1.5%. So I think from that perspective, we are getting better. Of course we are still suffering from the Chinese, and Japanese market sometimes. Especially in some of the big large deals. The inclusion of the Japanese market, I feel it, and the Chinese market, I feel that. But I think the economy is going to be okay, so it's going to be good. >> Carlo, I want to thank you for coming on and sharing your insight, final question for you. You're new to HPE, okay. We have a lot of history, obviously I was, spent a long part of my career there, early in my career. Dave and I have covered the transformation of HP for many, many years, with theCUBE certainly. What attracted you to HP and what would you say is going on at HP from your standpoint, that people should know about? >> So I think the number one thing is that for us the word is going to be hybrid. It means that some of the services that you can implement, either on-premise or on Cloud, could be done very well by the new Pointnext organization. I'm not part of Pointnext. 
I'm in the EG, Enterprise Group division. But I am a fan of Pointnext because I believe this is the future of our company, is on the services side, that's where it's going. >> I would just point out, Dave and I, our commentary on the spin merge has been, create these highly cohesive entities, very focused. Antonio now running EG, big fans, of where it's actually an efficient business model. >> Carlo: Absolutely. >> And Chris Hsu is running the Micro Focus, CUBE Alumni. >> Carlo: It's a very efficient model, yes. >> Well, congratulations and thanks for coming on and sharing your insights here in Europe. And certainly it is an IoT world, IIoT. I love the analytics story, foundational services. It's going to be great, open source powering it, and this is theCUBE, opening up our content, and sharing that with you. I'm John Furrier, Dave Vellante. Stay with us for more great coverage, here from Munich after the short break.

Published Date : Apr 6 2017

SUMMARY :

Brought to you by Hortonworks. Welcome to theCUBE. and now back into the saddle there. I mean, great, great run. data's at the center of the value proposition, and HP's focusing on the new style And one of the things that we are looking at is, it's smaller than the DataWorks or Hadoop Summit Can you comment on how you guys are tackling the IoT? and that's the case of the upstream business, You're getting at the kinds of business conversations I mean and 10 years ago, they would have seemed fantasy. and the decision-maker can get to decision in a faster time. So you have a dynamic reactive, And that's the key point, right? It's a solution we bring in the market a few months ago. One of the first ones. That's the key point. it's going to take too long, it's not going to work. Now the other thing is, sort of, the new HP, post these spin merges. It's part of the columnar side-- But the new strategy is to be more That's the new terminology we like to bring in the market, John doesn't like the term data link. and the same level, they want to have but we have actually proved that this is the way to go, So, 3.0 is actually functional. So having the containerization of the application Hadoop is a service could be run on-premise, all the customer want to start very small. John: Love that. and the point I want to make is, they want to start small, and now you have what we call continuous, is actually the core of this stuff. in the United States, created some concerns. I mean, Tesla in the US is a good example is much more advanced in the US compared to Europe. actually become, begin the process, before it's going to be completed. We don't have to talk about it, but the, and the Chinese market, I feel that. Dave and I have covered the transformation of HP It means that some of the services that you can implement, our commentary on the spin merge has been, I love the analytics story, foundational services.

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Carlo	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Germany	LOCATION	0.99+
Trump	PERSON	0.99+
Meg Whitman	PERSON	0.99+
Vertica	ORGANIZATION	0.99+
Pointnext	ORGANIZATION	0.99+
Chris Hsu	PERSON	0.99+
John	PERSON	0.99+
Carlo Vaiti	PERSON	0.99+
John Furrier	PERSON	0.99+
HP	ORGANIZATION	0.99+
Munich	LOCATION	0.99+
HPE	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
Sun Microsystems	ORGANIZATION	0.99+
Antonio	PERSON	0.99+
US	LOCATION	0.99+
EG	ORGANIZATION	0.99+
second element	QUANTITY	0.99+
United States	LOCATION	0.99+
second step	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
December last year	DATE	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
San Jose	LOCATION	0.99+
1.5%	QUANTITY	0.99+
yesterday	DATE	0.99+
North America	LOCATION	0.99+
September 1st	DATE	0.99+
'97	DATE	0.99+
'88	DATE	0.99+
Africa	LOCATION	0.99+
one	QUANTITY	0.99+
Today	DATE	0.99+
three months	QUANTITY	0.99+
Eastern Europe	LOCATION	0.99+
Sun	ORGANIZATION	0.99+
Two days	QUANTITY	0.99+
60	QUANTITY	0.99+
DataWorks 2017	EVENT	0.99+
10 years ago	DATE	0.99+
DXC	ORGANIZATION	0.98+
EMEA Digital Solutions	ORGANIZATION	0.98+
five years ago	DATE	0.98+
a year ago	DATE	0.98+
Tesla	ORGANIZATION	0.98+

John Kreisa, Hortonworks - DataWorks Summit Europe 2017 - #DWS17 - #theCUBE

>> Announcer: Live from Munich, Germany, it's theCUBE, covering DataWorks Summit Europe 2017. Brought to you by HORTONWORKS. (electronic music) (crowd) >> Okay, welcome back everyone, we are here live in Munich, Germany, for DataWorks 2017, formerly Hadoop Summit, the European version. Again, different kind of show than the main show in North America, in San Jose, but it's a great show, a lot of great topics. I'm John Furrier, my co-host, Dave Vellante. Our next guest is John Kreisa, Vice President of International Marketing. Great to see you emceeing the event. Great job, great event! >> John Kreisa: Great. >> Classic European event, it's got the European vibe. >> Yep. >> Germany everything's tightly buttoned down, very professional. (laughing) But big IOT message-- >> Yes. >> Because in Germany a lot of industrial action-- >> That's right. >> And then Europe, in general, a lot of smart cities, a lot of mobility, and issues. >> Umm-hmm. >> So a lot of IOT, a lot of meat on the bone here. >> Yep. >> So congratulations! >> John Kreisa: Thank you. >> What are your thoughts? Are you happy with the event? Give us by the numbers, how many people, what's the focus? >> Sure, yeah, no, thanks, John, Dave. Long-time CUBE attendee, I'm really excited to be here. Always great to have you guys here-- >> Thanks. >> Thanks. >> And be participating. This is a great event this year. We did change the name as you mentioned from Hadoop Summit to DataWorks Summit. Perhaps, I'll just riff on that a little bit. I think that really was in response to the change in the community, the breadth of technologies. You mentioned IOT, machine learning, and AI, which we had some of in the keynotes. So just a real expansion from data loading, data streaming, analytics, and machine learning and artificial intelligence, which all sit on top and use the core Hadoop platform. We felt like it was time to expand the conference itself. 
Open up the aperture to really bring in the other technologies that were involved, and really represent what was already starting to kind of feed into Hadoop Summit, so it's kind of a natural change, a natural evolution. >> And there's a 2-year visibility. We talked about this two years ago. >> John Kreisa: Yeah, yeah. >> That you are starting to see this aperture open up a little bit. >> Yeah. >> But it's interesting. I want to get your thoughts on this because Dave and I were talking yesterday. It's like we've been to every single Hadoop Summit. Even theCUBE's been following it all as you know. It's interesting the big data space was created by the Hadoop ecosystem. >> Umm-hmm. >> So, yeah, you rode in on the Hadoop horse. >> Yeah. >> I get that. A lot of people don't get that. They say, Oh, Hadoop's dead, but it's not. >> No. >> It's evolving to a much broader scope. >> That's right. >> And you guys saw that two years ago. Comment on your reaction to Hadoop is not dead. >> Yeah, wow (laughing). It's far from dead if you look at the momentum, largest conference ever here in Europe. I think strong interest from them. I think we had a very good customer panel, which talked about the usage, right. How they were really transforming. You had Walgreens Boots talking about how they're redoing their shelf, shelving, and how they're redesigning their stores. Danske Bank talking about how they're analyzing, how they replenish their cash machines. Centrica talking about how they redo their... Or how they're driving down cost of energy by being smarter around energy consumption. So, these are real transformative use cases, and so, it's far from dead. Really what might be confusing people is probably the fact that there are so many other technologies and markets that are being enabled by these open source technologies and the breadth of the platform. And I think that's maybe people see it kind of move a little bit back as a platform play. 
And so, we talk more about streaming and analytics and machine learning, but all that's enabled by Hadoop. It's all riding on top of this platform. And I think people kind of just misconstrue the fact that there's one enabling-- >> It's a fundamental element, obviously. >> John Kreisa: Yeah. >> But what's the new expansion? IOT, as I mentioned, is big here. >> Umm-hmm. >> But there's a lot more in connective tissue going on, as Shaun Connolly calls it. >> Yeah, yep. >> What are those other things? >> Yeah, so I think, as you said, smart cities, smart devices, the analytics, getting the value out of the technologies. The ability to load it and capture it in new ways with new open source technology, NiFi and some of those other things, Kafka we've heard of. And some of those technologies are enabling the broader use cases, so I don't think it's... I think that's really the fundamental change in shift that we see. It's why we renamed it to DataWorks Summit because it's all about the data, right. That's the thing-- >> But I think... Well, if you think about from a customer perspective, to me anyway, what's happened is we went through the adolescent phase of getting this stuff to work and-- >> Yeah. >> And figuring out, Okay, what's the relationship with my enterprise data warehouse, and then they realize, Wow, the enterprise data warehouse is critical to my big data platform. >> Umm-hmm. >> So what customers have done as they've evolved, as Hadoop has evolved, their big data platforms internally-- >> Umm-hmm. >> And now they're turning to their business saying, Okay, we have this platform. Let's now really start to go up the steep part of the S-curve and get more value out of it. >> John Kreisa: Umm-hmm. >> Do you agree with that scenario? >> I would definitely agree with that. I think that as companies have, and in particularly here in Europe, it's interesting because they kind of waited for the technology to mature and it's reached that inflection point. 
To your point, Dave, such that they're really saying, Alright, let's really get this into production. Let's really drive value out of the data that they see and know they have. And there's sort of... We see a sense of urgency here in Europe, to get going and really start to get that value out. >> Yeah, and we call it a ratchet game. (laughing) The ratchet is, Okay, you get the technology to work. Okay, you still got to keep the lights on. Okay, and oh, by the way, we need some data governance. Let's ratchet it up that side. Oh, we need a CDO! >> Umm-hmm. >> And so, because if you just try to ratchet up one side of the house (laughing) (cross-talk)-- >> Well, Carlo from HPE said it great on our last segment. >> Yeah. >> And I thought this was fundamental. And this was kind of like you had a CUBE moment where it's like, Wow, that's a really amazing insight. And he said something profound, The data is now foundational to all conversations. >> Right. >> And that's from a business standpoint. It hasn't always been the case. Now, it's like, Okay, you can look at data as a fundamental foundation building block. >> Right. >> And then react from there. So if you get the data locked in, to Dave's point about compliance, you can then do clever things. You can have a conversation about a dynamic edge or-- >> Right. >> Something else. So the foundational data is really now fundamental, and I think that is... Changes, it's not a database issue. It's just all data. >> Right, now all data-- >> All databases. >> You're right, it's all data. It's driving the business in all different functions. It's operational efficiency. It's new applications. It's customer intimacy. All of those different ways that all these companies are going, We've got this data. We now have the systems, and we can go ahead and move forward with it. 
And I think that's the momentum that we're seeing here in Europe, as evidenced by the conference and those kinds of things, just I think really shows how maybe... We used to say... I'd say when I first moved over here, that Europe was maybe a year and a half behind the U.S., in terms of adoption. I'd say that's shrunk to where a lot of the conversations are the exact same conversations that we're having with big European companies, that we're having with U.S. companies. >> And, even in... >> Yeah. >> Like we were just talking to Carlo, he was like, Well, and Europe is ahead in things like certain IOT-- >> Yeah. >> And Industrial IOT. >> Yeah. >> Yeah. >> Even IOT analytics. Some of the... Tesla notwithstanding, some of the automated vehicles. >> John Kreisa: Correct. >> Autonomous vehicles activity that's going on. >> John Kreisa: That's right. >> Certainly with Daimler and others. So there's an advancement. It almost reminds me of the early days of mobile, so... (laughing) >> It's actually, it's a good point. If you look at... Squint through some of the perspectives, it depends on where you are in the room and what your view is. You could argue there are many things that Europe is advanced on and where we're behind. If you look at Amazon Web Services, for instance. >> Umm-hmm. >> They are clearly running as fast as they can to deploy regions. >> Umm-hmm. >> So the scoop's coming out now. I'm hearing buzz that there's another region coming out. >> Right. >> From Amazon soon (laughing). They can't go fast enough. Google is putting out regions again. >> Right. >> Data centers are now pushing global, yet, there's more industrial here than is there. So it's an interesting perspective. It depends on how you look at it! >> Yeah, yeah, no, I think it's... And it's perfectly fair to say there are many places where it's more advanced. I think this technology and open source technologies, in general, are helping drive some of those and enable some of those trends. >> Yeah. 
>> Because if you have the sensors, you need a place to store and analyze that data whether it's smart cars or smart cities, or energy, smart energy, all those different places. That's really where we are. >> What's different in the international theater that you're involved in because you've been on both sides. >> Yep. >> As you came from the U.S. then when we first met. What's different out here now? And I see the gaps closing? What other things are notable that you could share? >> Yeah, yeah, so I'd say, we still see customers in the U.S. that are still very much wanting to use the shiniest, new thing, like the very latest version of Spark or the very latest version of NiFi or some other technologies. They want to push and use that latest version. In Europe, now the conversations are slightly different, in terms of understanding the security and governance. I think there's a lot more consciousness, if you will, around data here. There's other rules and regulations that are coming into place. And I think they're a little bit more advanced in how they think of-- >> Yeah. >> Data, personal data, how to be treated, and so, consequently, those are where the conversations are about the platform. How do we secure it? How does it get governed? So that you need regulations-- >> John Furrier: It's not as fast, as loose as the U.S. >> Yeah, it's not as fast. And you look and see some of the regulations. (laughing) My wife asked me if we should set up a VPN on our home WiFi because of this new rule about being able to sell the personal data. I've said, Well, we're not in the U.S., but perhaps, when we move to the U.S. >> In order to get the right to block chain (laughing). (cross-talk) >> Yeah, absolutely (cross-talk). >> John Furrier: Encrypt everything. >> (laughing) Yeah, exactly. >> Well, another topic is... Let's talk about the ecosystem a little bit. >> Umm-hmm. 
>> You've got now some additional public brethren, obviously Cloudera's, there's been a lot of talk here about-- >> Umm-hmm. >> Talend and Alteryx have gone public. >> Yeah. >> The ecosystem you've evolved that. IBM was up on stage with you guys. >> Yeah, yep. >> So that continues to be-- >> Gallium C. >> Can we talk about that a little bit? >> Gallium C >> Gallium C. >> We had a great... Partners are great. We've always been about the ecosystem. We were talking about before we came on-screen that for us it's not Barney partnerships. They're very much of substance, engineering to try to drive value for the customers. It's where we see that value in that joint value. So IBM is working with us across all of the DataWorks Summit, but, even in all of the engineering work that we're doing, participated in HDP 2.6 announcement that we just did. And I'm sure what you covered with Shaun and others, but those partnerships really help drive value for the customer. >> Umm-hmm. >> For us, it's all making sure the customer is successful. And to make a complete solution, it is a range of products, right. It is whether it's data warehousing, servers, networks, all of the different analytics, right. There's not one product that is the complete solution. It does take a stack, a multitude of technologies, to make somebody successful. >> Cloudera's S-1 was filed, that's been part of the conversation, and we've been digging into it, it's great to see the numbers. >> Umm-hmm. >> Anything surprise you in the S-1? Any advice you'd give to open source companies looking to go public because, as Dave pointed out, there's a string now of comrades in arms, if you will, MuleSoft, that's doing very well. >> Yeah, yeah. >> And Alteryx just went public. >> Yeah. >> You guys have been public for a long time. You guys been operating the public open-- >> Yeah. >> Both open source, pure open source. But also on the public markets. You guys have experience. You got some scar tissue. 
>> John Kreisa: (laughing) Yeah, yeah. >> What's your advice to Cloudera or others that are... Because the risk certainly will be a rush for more public companies. >> Yeah. >> It's a fantastic trend. >> I think it is a fantastic trend. I completely agree. And I think that it shows the strength of the market. It shows both the big data market, in general, the analytics market, kind of all the different components that are represented in some of those IPOs or planned IPOs. I think that for us, we're always driving for success of the customer, and I think any of the open source companies, they have to look at their business plan and take it step-wise in approach, that keeps an eye on making the customer successful because that's ultimately what's going to drive the company success and drive revenue for it and continue to do it. But we welcome as many companies as possible to come into the public market because A: it just allows everybody to operate in an open and honest way, in terms of comparison and understanding how growth is. But B: it shows that strength of how open source and related technologies can help-- >> Yeah. >> Drive things forward. >> And it's good for the customer, too, because now they can compare-- >> Yes! >> Apples to Apples-- >> Exactly. >> Vis-à-vis Cloudera, and what's interesting is that they had such a head start on you guys, HORTONWORKS, but the numbers are almost identical. >> Umm-hmm, yeah. >> Really close. >> Yeah, I think it's indicative of the opportunity that they're now coming out and there's rumors of other companies coming out. And I think it just gives that visibility. We welcome it, absolutely-- >> Yeah. >> To show because we're very proud of our performance and now our growth. And I think that's something that we stand behind and stand on top of. And we want to see others come out and show what they got. >> Let's talk about events, if we can? >> Yeah. >> We were there at the first Hadoop Summit in San Jose. 
Thrilled to be-- >> John Kreisa: In a few years. >> In Dublin last year. >> Yeah. >> So what's the event strategy? I love going into the local flavor. >> Umm-hmm. >> Last year we had the Irish singers. This year we had a great (laughing) local band. >> John Kreisa: (laughing) Yeah, yeah, yeah. >> So I don't know if you've announced where next year's going to be? Maybe you can share with us some of the roll-out strategies? >> Yeah, so first of all, DataWorks Summit is a great event as you guys know. And you guys are long participants, so it's a great partnership. We're moving them international, of course, we did a couple... We are already international, but moving a couple to Asia last year so-- >> Right. >> Those were a tremendous success, we actually exceeded our targets, in terms of how many people we thought would go. >> Dave: Where did you do those? >> We were in Melbourne and Tokyo. >> Dave: That's right, yeah. >> Yeah, so in both places great community, kind of rushed to the event and kind of understanding, really showed that there is truly a global kind of data community around Hadoop and other related technologies. So from here as you guys know because you're going to be there, we're thinking about San Jose and really wanting to make sure that's a great event. It's already stacking up to be tremendous, call for papers is all done. And all that's announced so, even the sessions we're really starting to build for that. We'll be later this year. We'll be in Sydney, so we're going to have to take DataWorks into Sydney, Australia, in September. So throughout the rest of this year, there's going to be continued building momentum and just really global participation in this community, which is great. >> Yeah. >> Yeah. >> Yeah, it's fantastic. >> Yeah, Sydney should be great. >> Yeah. >> Looking forward to it. We're going to expand theCUBE down under. Dave and I are excited-- >> Dave: Yeah, let's talk about that. >> We got a lot of interest (laughing). >> Alright. 
>> John, great to have you-- >> Come on down. >> On theCUBE again. Great to see you. Congratulations, I'm going to see you up on stage. >> Thank you. >> Doing the emcee. Great show, a lot of great presenters and great customer testimonials. And as always the sessions are packed. And good learning, great community. >> Yeah. >> Congratulations on your ecosystem. This is theCUBE broadcasting live from Munich, Germany for DataWorks 2017, presented by HORTONWORKS and Yahoo. I'm John Furrier with Dave Vellante. Stay with us, great interviews on day two still up. Stay with us. (electronic music)

Published Date : Apr 6 2017

SUMMARY :

Brought to you by HORTONWORKS. Great to see you emceeing the event. its got the European vibe. But big IOT message-- a lot of smart cities, a lot of meat on the bone here. Always great to have you guys here-- We did change the name as you mentioned And there's a 2-year visibility. to see this aperture It's interesting the big data space in on the Hadoop horse. A lot of people don't get them. to a much broader scope. And you guys saw that two years ago. that the fact that there's one enabling-- But what's the new expansion? But there's a lot more in because it's all about the data, right. of getting this stuff to work and-- Wow, the enterprise data warehouse of the S-curve and get for the technology to mature it great on our last segment. And I thought It's never always been the case. So if you get the data locked in, So the foundational data a lot of the conversations of the automated vehicles. activity that's going on. It almost reminds me of the it depends on where you are in the room as fast as they can to deploy regions. So the scoop's Google is putting out regions again. It depends on how you look at it! And it's perfectly fair to have the sensors, the international theater And I see the gaps closing? or the very latest version of NyFy So that you need regulations-- fast, as loose as the U.S. some of the regulations. In order to get the right Let's talk about the Tow-len and Al-trex-is IBM was up on stage with you guys. even in all of the engineering work networks, all of the it's great to see the numbers. in the S-1? You guys been operating the public open-- But also on the public markets. Because the risk certainly will be kind of all the different components HORTONWORKS, but the numbers indicative of the opportunity And I think that's something at the first Hadoop Summit in San Jose. I love going into the local flavor. the Irish singers. Yeah, yeah, yeah. 
And you guys are long participants, in terms of how many kind of rushed to the event We're going to expand theCUBE down under. to see you up on stage. And as always the sessions are packed. I'm John Furrier with Dave Vellante.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
John Kreisa	PERSON	0.99+
John Furrier	PERSON	0.99+
Carlo	PERSON	0.99+
Sydney	LOCATION	0.99+
Asia	LOCATION	0.99+
Shaun Connolly	PERSON	0.99+
2-year	QUANTITY	0.99+
San Jose	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
Tokyo	LOCATION	0.99+
Dublin	LOCATION	0.99+
Melbourne	LOCATION	0.99+
San Jose	LOCATION	0.99+
North America	LOCATION	0.99+
John	PERSON	0.99+
Last year	DATE	0.99+
U.S.	LOCATION	0.99+
Daimler	ORGANIZATION	0.99+
Germany	LOCATION	0.99+
Google	ORGANIZATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
September	DATE	0.99+
Centrica	ORGANIZATION	0.99+
Tesla	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
last year	DATE	0.99+
Walgreens Boots	ORGANIZATION	0.99+
Both	QUANTITY	0.99+
both sides	QUANTITY	0.99+
HORTONWORKS	ORGANIZATION	0.99+
This year	DATE	0.99+
S-1	TITLE	0.99+
yesterday	DATE	0.98+
next year	DATE	0.98+
Munich, Germany	LOCATION	0.98+
Shawn	PERSON	0.98+
HPE	ORGANIZATION	0.98+
Hadoop Summit	EVENT	0.98+
both	QUANTITY	0.98+
two years ago	DATE	0.98+
a year and a half	QUANTITY	0.98+
one product	QUANTITY	0.98+
DataWorks 2017	EVENT	0.98+
this year	DATE	0.98+
Sydney, Australia	LOCATION	0.97+
DataWorks Summit	EVENT	0.97+
Apples	ORGANIZATION	0.97+
day two	QUANTITY	0.97+
Hortonworks	ORGANIZATION	0.97+
Gallium C	ORGANIZATION	0.96+

Search Results for Hadoop Summit: