David Richards, WANdisco - BigDataNYC - #BigDataNYC - #theCUBE

(silence) (upbeat techno music) >> Narrator: Live from New York, it's theCUBE, covering Big Data NYC 2016, brought to you by headline sponsors: Cisco... IBM... Nvidia, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris. >> Welcome back to New York City, everybody. This is theCUBE, the worldwide leader in live tech coverage. David Richards is here. He's the CEO of WANdisco, a long time CUBE alum. Great to see you again. >> Great to be back. >> It was good fun hanging out with you last night and a good surprise at the IBM event. There was good action across the street. >> Yeah, you're both looking surprisingly well, actually. >> (Dave laughs) Yes. >> Well, we also heard about the WANdisco versus theCUBE golf tournament, that apparently theCUBE just did really, really well in it and WANdisco went running away with their tail between their legs. >> Well, I talked to Furrier last night. I said, "David Richards was telling me "that he kicked your butt on the golf course." He goes, "Yeah, that's true, actually." (laughter) >> I think I've got some video proof that he actually gave me $20 live on air because, of course, his wallet was empty. (laughter) He was blowing the dust off it, you know? >> Of course, yeah, the body swerve. >> Alligator arms. >> So David, it's, again, great to see you again. You guys have been in this business since day one, and things are evolving. How are things changing for WANdisco? >> So, when we first came into this market, back in mid-2006, 2007, and then we obviously made a bunch of acquisitions around 2011 and 2012 that took us headlong into the big data marketplace. We pretty much had a completely different business model to our business model now. Then, we had a product called Non-Stop NameNode... My God, can you imagine that?
(Dave laughs) That was very focused on the Hadoop marketplace because, at that time, we believed, like everybody else, that Hadoop was going to take over the world, people were going to move to commoditized servers, open-source software, and solve the huge storage problems that they were going to have from both a cost and efficiency perspective. What I think has happened, or is happening right now, is this evolution, and it really is more of a revolution than an evolution is taking place, where workloads, and we were discussing this last night, are moving at massive scale to cloud, and people are really skipping that step, where we thought they were going to have 5, 10,000 sort of clusters on-premise, but now they have some clusters on-prem, but the bulk of the workloads are actually moving into cloud. I was just discussing with George, off-camera a few minutes ago, why that is happening, and there's a lot of applications that are very efficient. The cloud packs are up there ready to use, off the shelf, and it becomes very simplistic, and to be quite frank, do we really care anymore about all these different open-source components? Is the CIO waking up in the middle of the night thinking, oh, my God, am I going to use Ignite, am I going to use Spark, am I going to use Pig, am I going to use Hive, et cetera, et cetera, et cetera? Of course they're not. They really just want to-- Let's inverse the question to ourselves. If you were going to start a competitor to Uber tomorrow, would you go and build a data center (Dave laughs) or would you just throw up a thousand servers up in the cloud and have done with it, and use all the apps that are up there? Of course, the answer's simple, so that's really what's happening. >> Well, one of the things that I... I wrote a piece of research a million years ago in which I prognosticated, the Dictionary Word of the Day, that the value of middleware was inversely proportional to the degree to which anybody knew anything about it. 
(Dave laughs) CIOs are waking up and asking those questions today, which is an indication that they're creating a problem. >> Yep. >> Infrastructure has to do no harm in the organization. I had a CIO friend for years who still asks his chief CTO, "To what degree is infrastructure creating a problem "for me today?" >> Yeah. >> And if it's creating a problem, it's a problem. >> Mm-hmm. >> You don't want to have to know about this stuff, and so to what degree are you helping companies mask some of those... that visibility, so that people can spend less time worrying about the infrastructure? >> So, what we're focused on is a business model that has gone from direct, where we were hiring out a very large direct sales force enterprise, the classic enterprise sales guys that would go knock on doors, knock deals down, go and sell to the Global 1000s, to an indirect model, and we announced that OEM recently with IBM, IBM Big Replicate, which, under the covers, is WANdisco Fusion, which is a great deal for us. So, our focus very much is on data movement, and data movement between data centers, for companies that want to stay on-prem, and between data centers and in and out of cloud seamlessly, and the word there is seamlessly. So, we worked very hard for the past 18 months on our product such that anybody can go to, if you want to go to the AWS Marketplace, you can, in a few clicks, begin to replicate petabyte-scale in and out of cloud, and we think, and we were discussing this last night, that the hybrid-cloud model is really fascinating, so the ability to take data on-premise, query it in cloud, get complete consistency between on-prem and cloud, but also have all the efficiency in the cloud economics, the elasticity, all the applications that exist in cloud, and I think that model is really interesting, and what's interesting is, I'm not sure that the little guys can execute in that model other than, like we're doing, via an OEM, an indirect model.
So, I'm not sure whether or not, just to go back to the conversation, CIOs are as concerned as they used to be about which Hadoop distribution, for example, they're using. I never hear that question anymore. That question was a 2012, 2013 question. What the CIOs are now concerned about is the economics of cloud, and how do I get that less than $5 per terabyte of data economics that I get in a cloud environment. >> Well, but also increasingly, they're talking about the use cases. >> David: Yeah. >> They want to get their people... They don't want to replicate the Linux or Unix versus NT wars of the 1990s, which was made possible because they were focused on what accounting package am I going to run? Am I going to run it-- >> Yeah. >> on this or that? You know, it was known process, unknown technology. In today's universe, it's unknown process, and they don't want to know as much about the technology, so they're focused on how do I get my men and women focused on use cases that are delivering value for their business. >> Exactly, and the economics question is really simple. Am I going to build a massive, partially used, elastic infrastructure on-premise or am I just going to go and use the elastic infrastructure that already exists in the cloud? That's a no-brainer. That's already happening, and the good news for us, the good news for WANdisco, is it's precisely what we do. It's a data movement problem. Now, I'm bound to say that, but it is actually a data movement problem. In this idea that you have data that changes, active transactional data, as we call it, so the active transactional data movement is a really hard problem. You can't just take a snapshot, right? A file scan and then a snapshot and then move the data, and that's the problem that all the other data replication guys have got. 
That's why the IBM OEM, that's why we've got strategic partnerships with companies like Oracle, like Amazon, and why I'm sure we'll be announcing things in due course with the other cloud vendors, like Google, for example, and Microsoft with their Azure products. They all have that problem, so data movement, in and out of cloud, if it's batch, if it's static, if it's archival data, easy problem to solve. There's a million and one different replication products. >> Dave: Right. >> You can use rsync if you really wanted to do that, but active transactional data, data that changes, data that moves, you know, at petabyte scale, hard problem. That's the problem that we solve. >> Because you've got speed of light problems and you're exposing yourself to data loss-- >> Yep. >> if something goes wrong. >> Peter: Fidelity is a problem. >> An eventual consistency replication model-- >> Yeah, it... >> doesn't work. You can't... If I'm query... We've got a customer that's trying to look at cardiographs, right, in and out of cloud. I mean, would you really feel comfortable with your cardiograph eventually getting into the cloud and being analyzed? You know, would you? You've got to be absolutely crystal clear that the data is completely consistent from the stuff that I'm generating on-premise versus the models that I'm building in cloud. It's vitally important. >> Well, I would imagine there's regulations, in certain industries anyway, that-- >> Oh, yeah, absolutely. >> require that eventual consistency doesn't fit, right? >> Yeah. Well, I mean, at the moment, without us, that's all you got, I'm afraid... >> Okay. >> Well, so, I'm on a mission, and I want to get your take on it: we always talk about elastic infrastructure, which is a given workload being able to scale up and scale down. >> David: Yeah. >> I think it's time to start talking about plastic infrastructure-- >> David: Oh, yeah, I like it.
>> where a given workload, but a reconfiguration of how that workload is applied because of the value of data, because of integration, because of the need to be able to move in response to business needs. So we talk about plastic infrastructure, where we are reconfiguring based on policy and rules and some other things. What do you think about that? >> I love it, and the reason I love it is because, just to take a step back, the definition of hybrid cloud is... You would imagine it would be relatively simple, but to me, a hybrid means that you have... You know, it's a bit like a hybrid golf club. It's neither a driver nor an iron. It's somewhere in between. So, you have the same workload that can exist both on-premise and in the cloud. I can use both the cloud and on-premise interchangeably. What hybrid cloud actually means, for all the vendors, and this is their dirty little secret, it means that you have some workloads running against some data in the cloud and others that will run against some data on-premise. Now, why do they do that? Because they have to. Because they can't guarantee complete consistency between on-premise and cloud. Our definition of hybrid cloud is exactly the same data, if you want, between on-premise and cloud, and I love this plastic phrase, the idea of repurposing all of those applications, and they can live anywhere. It doesn't matter 'cause it's the same data. >> Yeah, so we have two terms we have to copyright here, plastic infrastructure. >> Plastic... >> What was the other one we heard? >> Data portfolio. >> Data portfolio, yeah. We'll run the tape back >> Plastic infrastructure. (laughter) >> Plastic infrastructure. >> I'm going to steal it (laughs). >> Please do, you know? But the key thing is, as these technologies get more deeply embedded within business and how the business runs, it's incumbent upon the technology leadership to be able to rapidly be able to reconfigure the infrastructure in response to what the business needs. 
That's not elasticity. >> Yeah. >> That's plasticity. >> I love it, absolutely. (Peter laughs) And I think you're touching on something that's changing, and what we discussed earlier, which is that CIOs aren't waking up in the middle of the night thinking, am I going to use Pig or Hive or any of those other open source components. They're thinking about the applications that they're going to build. How am I actually going to start using this data? And I think the agenda's kind of moved on, and walking around the whole... There's still a little bit of confusion. You still have people talking about infrastructure like it really still matters. I'm not absolutely sure it does. >> Well, so let's talk about that. We got a few minutes or something like that. >> Dave: It matters when it breaks, you know? >> What's that? >> It matters when it breaks. >> It sure does matter when it breaks. >> You know, but otherwise, nobody wants to think about it. >> No, yeah, because like I said earlier, it's the degree to which-- >> We have time, but I want to explore the new distribution model as well. >> Yeah, go ahead. >> Let me do that, get that out, tick that box, if I can. Help me understand, David, how it all works. So you, the partnership with IBM and others, you mentioned Amazon, how does it work? You are in the IBM cloud offering? IBM is actually selling that offering? Is it a branded IBM product? >> So, it's in the big data analytics and cloud offerings. So, at the moment, IBM are very focused, as you know, on owning the platform. IBM, as a company, have to own the platform. >> Dave: Yeah, absolutely. >> So, I'm delighted to say that we're embedded into their platform. Now, they had a big launch of some products last night. >> Yeah. >> I know that they were talking about IBM Big Replicate, which is 100% white label OEM of WANdisco Fusion to solve some very specific problems, primarily around data movement.
So, at the hybrid cloud, how do I punch data out into clouds, run the analytics against it, and be sure that I'm going to get the right results? That's what Big Replicate solves, and also, they're moving into mixed environments, whether they're NetApp, just kind of Teradata environment, SAS-based environments, or whether a customer already has an existing distribution of, say, Cloudera or Hortonworks, so they can live alongside that, so we can replicate data between existing deployments, where they may have already made a strategic decision to go with one of those distributions, and also be able to migrate not just into IBM Big Insights, but also into their cloud offering, so that's a great deal for us. We're not... They're selling it themselves. I mean, obviously we've done a lot of field enablement, trained 5,000 or so IBM sales reps, and, you know, if a small company like WANdisco, or a small company like virtually any of the vendors in there that are not in the Global 1000 list, the go-to market has to be indirect. >> And so you're... Totally agree, and so you're basically, if I understand it correctly, you're moving what are conventional filers into the cloud. Customers are doing that. >> Oh. >> How fast is that happening and why are they doing that? >> My, God. I mean, we have not announced this product yet, but we're in the middle of launching it. It's, at scale, moving petabyte-scale data from, and this is transactional data, so it's a hard problem to solve, right, so it's an active data... It's an active transactional data replication problem. So, a lot of... The dirty little secret in the cloud is that a lot of those NFS filers have not moved yet-- >> Right. >> And why haven't they moved? 'Cause they can't. Because you can't just...
You know, our customers are banks and travel companies, and they can't press pause in their organization, do a file scan that's going to take six months, and then turn it back on again, and hey, presto, it's in the cloud. You can't do that. So, you kind of have to... Every single migration of those filers, of any sort of data, is a hybrid model, so you have to be able to run both on-prem and cloud while that migration is happening, and there, I can tell you, are a lot, a hell of a lot of NetApp filers that are going to move very soon here, in time. >> Dave: Oh, 'cause that's the problem that you solve. Otherwise, you'd have to freeze everything, which would kill your business, so you can't do it. >> Yeah, so when human beings imagine things, we're always imagining small use cases, small sets, like moving a few files into Dropbox or something, and that's okay that I can't edit those files for the few seconds it takes to move. I took a look at a deal the other day that was 3 billion files. (Dave laughs) Right, 3 billion. You can't even... My brain can't even calculate that, right? That's a three to six month data movement, and Amazon, for example, thought of this product called Snowball, which-- >> Yeah. >> You know, no techy ever believes this story, but, of course, they FedEx a box, a ruggedized hard drive to you essentially, a ruggedized server that you pour your data into and then you mail it back to them and they can put it there. That doesn't work, of course, for transactional data, for data that changes all the time. >> These are hard problems to solve, and our go-to-market, getting back to your question, it is all about indirect, you know? So, AWS, a strategic partnership, that, Oracle, a strategic partnership, that, IBM... And as I said, I'm sure that we'll be doing things with Google and Microsoft soon, and they're the five partnerships that I really care about, to be quite frank with you. >> Mm-hmm.
And this comes back to this notion of infrastructure, the value of infrastructure, and just to touch on it for a second, so many years ago, when we were doing client-server, >> David: Mm-hmm. >> We would test it on a local area network and deploy it on a WAN (David laughs) and wonder why it blew up. >> David: Yeah. >> The realities of the speed of light and the practical limitations have a real impact on design, and so where infrastructure still matters is we still have to worry about design, we still have to worry about legacy financial assets, how we're deploying those assets, and I want to come back to this because we were talking earlier about data as an asset, the value of data within the business, and you don't want to be limited by the legacy as you try to find new ways of generating value out of your data, and what you guys are trying to allow is that the data can be moved in response to the use case as opposed to the use case not being made possible because of the legacy decisions about where to put your data. >> David: That's precisely it, and I don't think that any CIO, in their right mind, wants to continue with the huge maintenance costs, maintenance payments they have to make to some of those vendors, some of those NFS-based vendors. They need to shut them down. They have to figure out a way to move them into cloud so you get cloud economics, and also be able to query the data in a massively efficient way. You simply cannot do that at the moment. They simply cannot do that at the moment, so, as I said, as we continue to launch these products in the marketplace, I'm sure you'll see, at scale, some pretty large companies surprising-- You know, the two that spring to my mind are that the regulators in the US and the UK, FINRA and the FCA, are both in the process of moving all in to cloud, 100% into cloud, and I would expect to see that trend continue. I mean, the re:Invent...
I don't want to talk about another-- and we're here at Strata, but the AWS re:Invent, I would expect to see several major financial service companies announcing cloud strategy. >> Yeah, and FINRA's a big user of the AWS cloud. They talk about it pretty aggressively, and really interesting use case there. So, yeah, so we got to end. What's next for you guys? You've mentioned you're going to be at re:Invent, you're going to be at World of Watson (laughs)? Where are we going to find you next? >> Both of those. Obviously, the white label with IBM is a really interesting deal for us. I can't talk about deal flow yet 'cause it's our end of quarter at the moment, but I can tell you that they're doing a pretty damn good job of selling, so we're in execution mode at the moment, where we've already announced some key partnerships. There'll be more key partnerships to come, I'm sure. We're obviously chasing deals down with some of the other cloud vendors, and I'd expect to see us announcing some interesting new customer wins in the coming days and weeks. >> Dave: Great. Well, congratulations on the momentum and the renewed strategy. I love it, and I appreciate you coming to theCUBE. >> Always a pleasure. >> All right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE. We're live at Big Data NYC, Strata and Hadoop World. Be right back. (spacey electronica music)
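
Editor's note: the "3 billion files" and "three to six month" figures quoted in the interview can be turned into a rough sustained copy rate. The Python sketch below is only a back-of-envelope illustration. The 30-day month, the helper name required_files_per_second, and the idea of a constant around-the-clock rate are assumptions made for the arithmetic, not anything described by WANdisco, and the sketch ignores file sizes and any files that change during the copy.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_files_per_second(total_files: int, days: int) -> float:
    """Average rate needed to copy total_files within the given number of days."""
    return total_files / (days * SECONDS_PER_DAY)


# "3 billion files" is the figure quoted in the interview.
TOTAL_FILES = 3_000_000_000

# Assumption: a month is treated as 30 days.
for label, days in (("three months", 90), ("six months", 180)):
    rate = required_files_per_second(TOTAL_FILES, days)
    print(f"{label}: about {rate:.0f} files per second, around the clock")
```

Under those assumptions the copy has to sustain a few hundred files per second for months, which gives a sense of why pausing the source systems for the duration is not an option.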

Published Date : Sep 29 2016



Joel Horwitz, IBM & David Richards, WANdisco - Hadoop Summit 2016 San Jose - #theCUBE

>> Narrator: From San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >> Welcome back everyone. We are here live in Silicon Valley at Hadoop Summit 2016, actually San Jose. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest, David Richards, CEO of WANdisco. And Joel Horwitz, strategy and business development, IBM Analytics. Guys, welcome back to theCUBE. Good to see you guys. >> Thank you for having us. >> It's great to be here, John. >> Give us the update on WANdisco. What's the relationship with IBM and WANdisco? 'Cause, you know. I can just almost see it, but I'm not going to predict. Just tell us. >> Okay, so, I think the last time we were on theCUBE, I was sitting with Re-ti-co who works very closely with Joel. And we began to talk about how our partnership was evolving. And of course, we were negotiating an OEM deal back then, so we really couldn't talk about it very much. But this week, I'm delighted to say that we announced, I think it's called IBM Big Replicate? >> Joel: Big Replicate, yeah. We have a big everything and Replicate's the latest addition. >> So it's going really well. It's OEM'd into IBM's analytics, big data products, and cloud products. >> Yeah, I'm smiling and smirking because we've had so many conversations, David, on theCUBE with you on and following your business through the bumpy road or the wild seas of big data. And it's been a really interesting tossing and turning of the industry. I mean, Joel, we've talked about it too. The innovation around Hadoop and then the massive slowdown and realization that cloud is now on top of it. The consumerization of the enterprise created a little shift in the value proposition, and then a massive rush to build enterprise grade, right? And you guys had that enterprise grade piece of it. IBM, certainly you're enterprise grade.
You have enterprise everywhere. But the ecosystem had to evolve really fast. What happened? Share with the audience this shift. >> So, it's classic product adoption lifecycle and the buying audience has changed over that time continuum. In the very early days when we first started talking more at these events, when we were talking about Hadoop, we all really cared about whether it was Pig and Hive. >> You once had a distribution. That's a throwback. Today's Thursday, we'll do that tomorrow. >> And the buying audience has changed, and consequently, the companies involved in the ecosystem have changed. So where we once used to really care about all of those different components, we don't really care about the machinations below the application layer anymore. Some people do, yes, but by and large, we don't. And that's why cloud for example is so successful because you press a button, and it's there. And that, I think, is where the market is going to very, very quickly. So, it makes perfect sense for a company like WANdisco who've got 20, 30, 40, 50 sales people to move to a company like IBM that have 4 or 5,000 people selling our analytics products. >> Yeah, and so this is an OEM deal. Let's just get that news on the table. So, you're an OEM. IBM's going to OEM their product and brand it IBM, Big Replication? >> Yeah, it's part of our Big Insights Portfolio. We've done a great job at growing this product line over the last few years, with last year talking about how we decoupled all the value-adds from the core distribution. So I'm happy to say that we're both part of the ODPI. It's an ODPI-certified distribution. That is Hadoop that we offer today for free. But then we've been adding not just in terms of the data management capabilities, but the partnership here that we're announcing with WANdisco and how we branded it as Big Replicate is squarely aimed at the data management market today. But where we're headed, as David points out, is really much bigger, right?
We're talking about support for not only distributed storage and data, but we're also talking about a hybrid offering that will get you to the cloud faster. So not only does Big Replicate work with HDFS, it also works with the Swift object store, which, as you know, is kind of the underlying storage for our cloud offering. So what we're hoping to see from this great partnership is as you see around you, Hadoop is a great market. But there's a lot more here when you talk about managing data that you need to consider. And I think hybrid is becoming a lot larger of a story than simply distributing your processing and your storage. It's becoming a lot more about okay, how do you offset different regions? How do you think through that there are multiple, I think there's this idea that there's one Hadoop cluster in an enterprise. I think that's factually wrong. I think what we're observing is that there's actually people who are spinning up, you know, multiple Hadoop distributions at the line of business for maybe a campaign or for maybe doing fraud detection, or maybe doing log file, whatever. And managing all those clusters, and they'll have Cloudera. They'll have Hortonworks. They'll have IBM. They'll have all of these different distributions that they're having to deal with. And what we're offering is sanity. It's like give me sanity for how I can actually replicate that data. >> I love the name Big Replicate, fantastic. Big Insights, Big Replicate. And so go to market, you guys are going to have bigger sales force. It's a nice pop for you guys. I mean, it's good deal. >> We were just talking before we came on air about sort of a deal flow coming through. It's coming through, this potential deal flow coming through, which has been off the charts. I mean, obviously when you turn on the tap, and then suddenly you enable thousands and thousands of sales people to start selling your products. I mean, IBM, are doing a great job.
And I think IBM are in a unique position where they own both cloud and on-prem. There are very few companies that own both the on-prem-- >> They're going to need to have that connection for the companies that are going hybrid. So hybrid cloud becomes interesting right now. >> Well, actually, it's, there's a theory that says okay, so, and we were just discussing this, the value of data lies in analytics, not in the data itself. It lies in you've been able to pull out information from that data. Most CIOs-- >> If you can get the data. >> If you can get the data. Let's assume that you've got the data. So then it becomes a question of, >> That's a big assumption. Yes, it is. (laughs) I just had Nancy Handling on about metadata. No, that's an issue. People have data they store they can't do anything with it. >> Exactly. And that's part of the problem because what you actually have to have is CPU slash processing power for an unknown amount of data any one moment in time. Now, that sounds like an elastic use case, and you can't do elastic on-prem. You can only do elastic in cloud. That means that virtually every distribution will have to be a hybrid distribution. IBM realized this years ago and began to build this hybrid infrastructure. We're going to help them to move data, completely consistent data, between on-prem and cloud, so when you query things in the cloud, it's exactly the same results and the correct results you get. >> And also the stability too on that. There's so many potential, as we've discussed in the past, that sounds simple and logical. To do an enterprise grade is pretty complex. And so it just gives a nice, stable enterprise grade component. >> I mean, the volumes of data that we're talking about here are just off the charts. >> Give me a use case of a customer that you guys are working with, or has there been any go-to-market activity or an ideal scenario that you guys see as a use case for this partnership? 
>> We're already seeing a whole bunch of things come through. >> What's the number one pattern that bubbles up to the top? Use case-wise. >> As Joel pointed out, that he doesn't believe that any one company just has one version of Hadoop behind their firewall. They have multiple vendors. >> 100% agree with that. >> So how do you create one, single cluster from all of those? >> John: That's one problem you solved. >> That's of course a very large problem. Second problem that we're seeing in spades is I have to move data to cloud to run analytics applications against it. That's huge. That requires completely guaranteed consistent data between on-prem and cloud. And I think those two use cases alone account for pretty much every single company. >> I think there's even a third here. I think the third is actually, I think frankly there's a lot of inefficiencies in managing just HDFS and how many times you have to actually copy data. If I looked across, I think the standard right now is having like three copies. And actually, working with Big Replicate and WANdisco, you can actually have more assurances and actually have to make less copies across the cluster and actually across multiple clusters. If you think about that, you have three copies of the data sitting in this cluster. Likely, an analyst has dragged a bunch of the same data into other clusters, so that's another multiple of three. So there's an amount of waste in terms of the same data living across your enterprise. That I think there's a huge cost-savings component to this as well. >> Does this involve anything with Project Atlas at all? You guys are working with, >> Not yet, no. >> That project? It's interesting. We're seeing a lot of opening up the data, but all they're doing is creating versions of it. And so then it becomes version control of the data. You see a master or a centralization of data? Actually, not centralize, pull all the data in one spot, but why replicate it? Do you see that going on?
I guess I'm not following the trend here. I can't see the mega trend going on. >> It's cloud. >> What's the big trend? >> The big trend is I need an elastic infrastructure. I can't build an elastic infrastructure on-premise. It doesn't make economic sense to build massive redundancy, maybe three or four times the infrastructure I need on premise, when I'm only going to use it maybe 10, 20% of the time. So the mega trend is cloud provides me with a completely economic, elastic infrastructure. In order to take advantage of that, I have to be able to move data, transactional data, data that changes all the time, into that cloud infrastructure and query it. That's the mega trend. It's as simple as that. >> So moving data around at the right time? >> And that's transaction. Anybody can say okay, press pause. Move the data, press play. >> So if I understand this correctly, and just, sorry, I'm a little slow. End of the day today. So instead of staging the data, you're moving data via the analytics engines. Is that what you're getting at? >> You use data that's being transformed. >> I think you're accessing data differently. I think today with Hadoop, you're accessing it maybe through like Flume or through Oozie, where you're building all these data pipelines that you have to manage. And I think that's obnoxious. I think really what you want is to use something like Apache Spark. Obviously, we've made a large investment in that earlier, actually, last year. To me, what I think I'm seeing is people who have very specific use cases. So, they want to do analysis for a particular campaign, and so they may just pull a bunch of data into memory from across their data environment. And that may be on the cloud. It may be from a third-party. It may be from a transactional system. It may be from anywhere. And that may be done in Hadoop. It may not, frankly.
>> Yeah, this is the great point, and again, one of the themes on the show is, this is a question that's kind of been talked about in the hallways. And I'd love to hear your thoughts on this. There are some people saying that there's really no traction for Hadoop in the cloud. And that customers are saying, you know, it's not about just Hadoop in the cloud. I'm going to put in S3 or object store. >> You're right. I think-- >> Yeah, I'm right as in what? >> Every single-- >> There's no traction for Hadoop in the cloud? >> I'll tell you what customers tell us. Customers look at what they actually need from storage, and they compare whatever it is, Hadoop or any on-premise proprietary storage array, and then look at what S3 and Swift and so on offer to them. And if you do a side-by-side comparison, there isn't really a difference between those two things. So I would argue that it's a fact that functionally, storage in cloud gives you all the functionality that any customer would need. And therefore, the relevance of Hadoop in cloud probably isn't there. >> I would add to that. So it really depends on how you define Hadoop. If you define Hadoop by the storage layer, then I would say for sure. Like HDFS versus an object store, that's going to be a difficult one to find some sort of benefit there. But if you look at Hadoop, like I was talking to my friend Blake from Netflix, and I was asking him so I hear you guys are kind of like replatforming on Spark now. And he was basically telling me, well, sort of. I mean, they've invested a lot in Pig and Hive. So if you think now about Hadoop as this broader ecosystem which you brought up Atlas, we talk about Ranger and Knox and all the stuff that keeps coming out, there's a lot of people who are still invested in the peripheral ecosystem around Hadoop as that central point. My argument would be that I think there's still going to be a place for distributed computing kind of projects.
And now whether those will continue to interface through YARN and then down to HDFS, or whether that'll be YARN on, say, an object store or something and those projects will persist on their own. To me that's kind of more of how I think about the larger discussion around Hadoop. I think people have made a lot of investments in terms of that ecosystem around Hadoop, and that's something that they're going to have to think through. >> Yeah. And Hadoop wasn't really designed for cloud. It was designed for commodity servers, deployment with ease and at low cost. It wasn't designed for cloud-based applications. Storage in cloud was designed for storage in cloud. Right, that's with S3. That's what Swift and so on were designed specifically to do, and they fulfill most of those functions. But Joel's right, there will be companies that continue to use-- >> What's my whole argument? My whole argument is that why would you want to use Hadoop in the cloud when you can just do that? >> Correct. >> There's object store out there. There's plenty of great storage opportunities in the cloud. They're mostly shoe-horning Hadoop, and I think that's, anyway. >> There are two classes of customers. There were customers that were born in the cloud, and they're not going to suddenly say, oh you know what, we need to build our own server infrastructure behind our own firewall 'cause they were born in the cloud. >> I'm going to ask you guys this question. You can choose to answer or not. Joel may not want to answer it 'cause he's from IBM and gets his wrist slapped. This is a question I got on DM. Hadoop ecosystem consolidation question. People are mailing in the questions. Now, keep sending me your questions if you don't want your name on it. Hold on, Hadoop system ecosystem. When will this start to happen? What is holding back the M and A? >> So, that's a great question.
First of all, consolidation happens when you sort of reach that tipping point or leveling off, that inflection point where the market levels off, and we've reached market saturation. So there's no more market to go after. And the big guys like IBM and so on come in-- >> Or there was never a market to begin with. (laughs) >> I don't think that's the case, but yes, I see the point. Now, what's stopping that from happening today, and you're a naughty boy by the way for asking this question, is a lot of these companies are still very well funded. So while they still have cash on the balance sheet, of course, it's very, very hard for that to take place. >> You picked up my next question. But that's a good point. The VCs held back in 2009 after the crash of 2008. Sequoia's memo, you know, the good times roll, or RIP good times. They stopped funding companies. Companies are getting funded, continually getting funding. Joel. >> So I don't think you can look at this market as like an isolated market like there's the Hadoop market and then there's a Spark market. And then even there's like an AI or cognitive market. I actually think this is all the same market. Machine learning would not be possible if you didn't have Hadoop, right? I wouldn't say it. It wouldn't have a resurgence that it has had. Mahout was one of the first machine learning languages that caught fire from Ted Dunning and others. And that kind of brought it back to life. And then Spark, I mean if you talk to-- >> John: I wouldn't say it creates it. Incubated. >> Incubated, right. >> And created that Renaissance-like experience. >> Yeah, deep learning. Some of those machine learning algorithms require you to have a distributed kind of framework to work in. And so I would argue that it's less of a consolidation, but it's more of an evolution of people going okay, there's distributed computing.
Do I need to do that on-premise in this Hadoop ecosystem, or can I do that in the cloud, or in a growing Spark ecosystem? But I would argue there's other things happening. >> I would agree with you. I love both areas. My snarky comment there was never a market to begin with, what I'm saying there is that the monetization of commanding the hill that everyone's fighting for was just one of many hills in a bigger field of hills. And so, you could be in a cul-de-sac of being your own champion of no paying customers. >> What you have-- >> John: Or a free open-source product. >> Unlike the dotcom era where most of those companies were in the public markets, and you could actually see proper valuations, most of the companies, the unicorns now, most are not public. So the valuations are really difficult to, and the valuation metrics are hard to come by. There are only a few of those companies that are in the public market. >> The cash story's right on. I think to Joel's point, it's easy to pivot in a market that's big and growing. Just 'cause you're in the wrong corner of the market, pivoting or vectoring into the value is easier now than it was 10 years ago. Because, one, if you have a unicorn situation, you have cash in the bank. So they're flush with cash. Your runway's so far out, you can still do your thing. If you're a startup, you can get time to value pretty quickly with the cloud. So again, I still think it's very healthy. In my opinion, I kind of think you guys have good analysis on that point. >> I think we're going to see some really cool stuff happen working together, and especially from what I'm seeing from IBM, in the fact that in the IT crowd, there is a behavioral change that's happening that Hadoop opened the door to. That we're starting to see more and more IT professionals walk through. In the sense that, Hadoop has opened the door to not thinking of data as a liability, but actually thinking about data differently as an asset.
And I think this is where this market does have an opportunity to continue to grow as long as we don't get carried away with trying to solve all of the old problems that we solved for on-premise data management. Like if we do that, then we're just, then there will be a consolidation. >> Metadata is a huge issue. I think that's going to be a big deal. And on the M and A, my feeling on the M and A is that, you got to buy something of value, so you either have revenue, which means customers, and/or intellectual property. So, in a market of open source, it comes back down to the valuation question. If you're IBM or Oracle or HP, they can pivot too. And they can be agile. Now slower agile, but you know, they can literally throw some engineers at it. So if there's no customers and IP, they can replicate, >> Exactly. >> That product. >> And we're seeing IBM do that. >> They don't know what they're buying. My whole point is if there's nothing to buy. >> I think it depends on, ultimately it depends on where we see people deriving value, and clearly in WANdisco, there's a huge amount of value that we're seeing our customers derive. So I think it comes down to that, and there is a lot of IP there, and there's a lot of IP in a lot of these companies. I think it's just a matter of widening their view, and I think WANdisco is probably the earliest to do this frankly. Was to recognize that for them to succeed, it couldn't just be about Hadoop. It actually had to expand to talk about cloud and talk about other data environments, right? >> Well, congratulations on the OEM deal. IBM, great name, Big Replicate. Love it, fantastic name. >> We're excited. >> It's a great product, and we've been following you guys for a long time, David. Great product, great energy. So I'm sure there's going to be a lot more deals coming on your. Good strategy is OEM strategy thing, huh? >> Oh yeah. >> It reduces sales cost. >> Gives us tremendous operational leverage.
Getting 4,000, 5,000-- >> You get a great partner in IBM. They know the enterprise, great stuff. This is theCUBE bringing all the action here at Hadoop. IBM OEM deal with WANdisco all happening right here on theCUBE. Be back with more live coverage after this short break.

Published Date : Jul 1 2016


