Jagane Sundar, WANdisco | CUBEConversation, January 2019


 

>> Hello everyone, welcome to this CUBE Conversation here in Palo Alto, California. I'm John Furrier, host of theCUBE. I'm here with Jagane Sundar, CTO, chief technology officer of WANdisco. Great to see you again. Thanks for coming on. >> Thank you for having me, John. >> So I want to talk about the technology behind WANdisco, and we've had many conversations. For the folks watching, go to our YouTube channel and you'll see the evolution of those conversations over, I think, eight, nine years now we've been chatting. What a level-up you guys are at now with cloud: big announcements around multi-cloud, live data in particular. The technology is the gift that keeps giving for WANdisco. You guys are continuing to take territory, now in a big way with cloud. Big growth, a lot of changes, a lot of hires. What's going on? >> So, as you well know, WANdisco stands for wide area network distributed computing, and the value of the wide area network aspect is really shining through now, because nobody goes to the cloud saying, I'm going to put it in one data center. It's always multiple regions, multiple data centers in each region. Suddenly the problem of keeping your data consistent across multiple cloud vendors, or from on-prem to cloud, becomes a real challenge. We stepped in. We had something that was a good solution for small users, small data, but we developed it into something that's fantastic for large data volumes, and people are running into the problem. The biggest problem that IT providers have is that data scientists do not respect data that's not consistent. If you look at a replica of data and you're not sure whether it's exactly accurate or not, the data scientist who spent all his time building algorithms to predict some model is going to look at it and go, that data's not quite right, I'm not going to look at it.
So if you use an inconsistent tool or an inadequate tool to replicate your data, you have the problem that nobody is going to respect the replicas. Everybody's going to go back to the source of truth. We solved that problem elegantly and accurately. >> State the problem specifically. Is it the integrity of the data? What is the specific problem statement that you guys solve with technology? >> Let me give you an example. You have notifications that come out of cloud object stores when an object is placed into the store or deleted from the store. That's best-effort delivery: if there are logjams in the mechanism used to deliver them, some notifications may be dropped. The problem with using that notification mechanism to replicate your data is that over a period of time, say you have two or three petabytes of data and you're replicating it over a month or a month and a half, you'll find that maybe 0.1 percent of your data is not quite accurate anymore. So the value of the replica is essentially zero. >> Like a leaky pipe, basically. >> Indeed, if you have a leaking pipe, then the value is just totally gone. >> You need to have integrity end to end. All right, let's get back to some of the things I want to ask, because I think it's fascinating; I've been following your story for years. You had a point solution across multiple data centers. You had the replication, active-active, great for data centers. So disaster recovery: not mission-critical, but certainly critical. >> Correct, depending on how critical the mission was. >> Then comes cloud. You mentioned wide area networks, and you go back to the old days when I was breaking into the business. That's when they had, you know, dial-up modems and pagers, not even cell phones, which were just starting. Wide area networks were a really complicated beast, and all the best resources worked on expensive bandwidth. You had remote offices and you had campus networking then. So wide area networking went through that phase one. >> Correct.
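The drift Jagane describes, where best-effort notifications quietly desynchronize a replica from its source, can be illustrated with a small simulation. This is a hypothetical sketch, not WANdisco code; the drop rate and object count are assumptions chosen to mirror his 0.1 percent figure:

```python
import random

def replicate_via_notifications(events, drop_rate, rng):
    """Best-effort delivery: each change notification may be silently dropped,
    so the replica falls behind without anyone noticing."""
    replica = {}
    for key, value in events:
        if rng.random() >= drop_rate:   # notification delivered
            replica[key] = value
        # else: notification lost -- this object never reaches the replica
    return replica

rng = random.Random(42)
events = [(f"obj-{i}", i) for i in range(100_000)]
source = dict(events)

replica = replicate_via_notifications(events, drop_rate=0.001, rng=rng)
stale = sum(1 for k in source if replica.get(k) != source[k])
print(f"{stale} of {len(source)} replicated objects are stale")
```

Even a tiny per-event loss rate leaves the replica measurably wrong after enough traffic, which is exactly why nobody trusts it as a source of truth.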
Now we're living in a WAN all the time. Cloud is WAN, wide area. >> Correct, cloud is WAN. But there are subtle aspects that people miss all the time. If you go to store an object in Amazon S3, for example, you pick a region. If it were a completely wide-area-distributed entity, why would you need to pick a region? The truth is, each cloud vendor hides a number of region-specific or local-area-network-specific aspects of their services. DynamoDB runs in one region, across two or three availability zones in that region. If you want to replicate that data further, you don't really have much help from the cloud vendors themselves. So you need to parse the truth from what is offered. What you will find is that the WAN is still a very challenging problem for a lot of these data replication problems. >> Talk about the wide area network challenges in the modern era we're living in, which is cloud computing. You mentioned some of the nuances around regions and availability zones. Basically, the cloud grew up as building blocks, and the plumbing underneath is essentially a hybrid of certain techniques in networking: local area networks, VLANs, tunneling, all this stuff, NATs, routers. So it's obviously plumbing. What's different now? It's important to take that to the next level, because, you know, there are arguments saying, hey, GDPR, I might want to have certain regions be smarter, right? So you're starting to see a level-up that Amazon and others are going to; Google in particular talks about this a lot, as does Microsoft. What's that next level of WAN, where the plumbing is upgraded? >> So the problem really has to be stated in terms of your data architecture. If you look at your data and figure out that you need this set of data to be available for your business-critical applications, then the problem turns into:
I need replicas of this data in this region and in other regions, perhaps in two different cloud vendor locations, because you don't want to be tied down to the availability of one cloud vendor. Then the problem turns into: how do you hide the complexity of replicating and keeping this data consistent from the users of the data, the data scientists, the application authors, and so on? That's where we step in. We have a transparent replication solution that fits into the plumbing. It's often offered by the IT folks as part of their cloud offering or as part of their hybrid offering. The application developers don't really need to worry about those things. A specific example would be Hive tables that a user is building in one data center. An IT professional from that organization can buy our replication software, and that table will be available in multiple data centers and multiple regions, available for both read and write. The user does not need to do anything or even be aware of it. So if you have requirements such as GDPR, where this data has to stay here but the summarized data can be available across all of these regions, then we can solve the problem elegantly for you without any application rewiring or re-authoring. >> Talk about the technology that makes all this happen. Again, this has been a key part of your success at WANdisco. I always loved the name, wide area network; I was a big wide area network fan, did that in my early days configuring router tables. It was hardcore back then. Distributed systems at large scale are now part of the cloud. All of us who grew up in the computer science days had to think about systems architecture at scale, and we're actually living it now. >> Correct. >> So talk about the technology. What specifically do you guys have that's your technology, and talk about the impact on the scale piece. I think that's the real key technology piece. >> Indeed.
The core of our technology is enhancements to, and a superior implementation of, an algorithm called Paxos. Now, Paxos itself is the only mathematically proven algorithm for keeping replicas consistent across multiple machines, multiple regions, multiple data centers. The alternatives, such as Raft and the ZooKeeper protocol, are all compromises for the sake of ease of implementation. Now, we don't fear the cost of implementation. We spent many years doing the research, so we have a fantastic implementation of Paxos, extended for use over wide area networks without any special hardware. I mention the without-any-special-hardware piece because Google Spanner, which is one of our primary competitors, has an implementation that needs their own specific network and hardware. So the value of... >> Because they're tied to the clocks, atomic clocks, actually, in their infrastructure; their timings are all synchronized. So it's only within Google Cloud? >> Exactly. It could not even be made available to customers of Google Cloud. That was a feature they added recently, but it's rolling out in a very limited fashion. >> They inherited that from their large scale. >> Correct. >> Google, yes, which is Bigtable, Spanner. These are awesome products. >> These are awesome products, but they're very specific... >> Tailored for Google. >> Yes, they're great in the Google environment. They're not so great outside of Google. Now, we have technology that lets you run this across Google's cloud and Microsoft's cloud and Amazon's cloud. The value of this is that you have truly cloud-neutral solutions. You don't need to worry about vendor lock-in, you don't need to worry about availability problems in one of the cloud vendors, and then you can scale your solution. You can go in with an approach such that when the virtual machines or compute resources in one cloud vendor are really inexpensive, we'll use that; when they're very expensive, we'll move our workloads to other locations.
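Jagane's point about Paxos can be made concrete with a toy single-decree round. This is a minimal in-memory sketch of the classic protocol for illustration only, not WANdisco's hardened wide-area implementation; the class and function names here are assumptions:

```python
class Acceptor:
    def __init__(self):
        self.promised = -1       # highest ballot number promised
        self.accepted = None     # (ballot, value) previously accepted, or None

    def prepare(self, ballot):
        """Phase 1: promise not to accept lower ballots; report any prior value."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Phase 2: accept the value unless a higher ballot was promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False

def propose(acceptors, ballot, value):
    """One Paxos round: succeeds only with a majority in both phases."""
    majority = len(acceptors) // 2 + 1
    promises = [a.prepare(ballot) for a in acceptors]
    granted = [prior for ok, prior in promises if ok]
    if len(granted) < majority:
        return None                      # a higher ballot is already in play
    prior = [p for p in granted if p is not None]
    if prior:
        value = max(prior)[1]            # must adopt highest-ballot prior value
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None

acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, 1, "replica-update-A"))   # chosen: replica-update-A
print(propose(acceptors, 2, "replica-update-B"))   # still replica-update-A
```

The second call shows the safety property Jagane alludes to: once a value is chosen by a majority, later proposers rediscover and re-propose it, so all replicas converge on the same decree.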
You can think up architectures like that, with our solution underpinning your replication. >> Right. Again, I'm going to ask you the technical questions; I love these conversations, getting down and dirty under the hood. So Joel Horowitz, your new CMO, was on, and David Richards, your CEO, talked about the same thing. Moving data around is the key value, and that's tied right into the legacy of your IP: moving data from point A to point B with integrity. But the world's also moving to identify scenarios where I'm going to move compute rather than the data, because people have recognized that moving data is hard; you've got latency and costs in bandwidth. So, two schools of thought, not mutually exclusive. When do you pick one? >> Okay, absolutely, they're not mutually exclusive, because there are data availability needs that define some replication scenarios, and there are compute needs that can be more flexible. If you have the ability to, say, have data in Amazon's cloud and in Microsoft's cloud, you may want to use some Amazon-specific tools for specific compute scenarios, at the same time use Microsoft tools for other scenarios, or perhaps use open source tools like Hadoop in either one of those clouds. Those are all mechanisms that work perfectly well, but at the core you have to figure out your data architecture. If you can live with your data in one region or in one data center, clearly that's what you should do. But if you cannot have that data be unavailable, you do have to replicate it. At that point, you should consider replicating to a different cloud vendor, because availability is a concern with all these vendors. >> So two things I hear you saying: one, availability is a driver. The other one is user preference. >> Yes. >> Why not have people who know Microsoft tools and Microsoft software work in the Microsoft framework, with someone else using something else in another cloud? The same data can live in both places. You guys make that happen?
Is that what you're saying? >> Exactly. >> That's a big deal. >> Absolutely. And we guarantee the consistency, a guarantee that you will not get from any other vendor. >> So this basically debunks the whole lock-in. Yes? Your solution essentially relieves this notion of lock-in. So me as a customer can say, hey, I'm in Amazon right now, we're all-in on Amazon, but I've got some temptation to go to Azure or Google. Why wouldn't I, if I have the ability to keep my data consistent? Is that what you're saying? >> That is exactly what I'm saying. You have the ability to experiment with different cloud vendors. You also have the ability to mitigate some of the cost aspects. If you're going to pay for copies in two different geographic locations, you might as well do it with two different cloud vendors; you have a richer set of applications and better availability. >> So for people who say data is the lock-in spec for cloud, it's kind of right, unless they use WANdisco, because in a sense it's your data that really moves with it. Your data stays there; that's kind of common sense. So it's not so much technical lock-in; there's no real technical lock-in, it's more operational lock-in with data, if you don't want to move. But if you're afraid of lock-in, you go with WANdisco. That's live data, multi-cloud. Is that it? >> That is live data multi-cloud: the ability to actually have active data sets that are available in different cloud vendor locations. >> Well, that's a killer app right there. How do you feel? You must feel pretty good. You know, you and I have talked many times. >> Yes. >> It's like you've been waiting for this moment. This is actually really wide area, a.k.a. cloud. It was a big data problem, which is only getting bigger. Replication is now the transport between clouds for anti-lock-in. This is the Holy Grail for WANdisco. >> It is the Holy Grail for the industry.
We've been talking about it for years now, and we feel completely redeemed. We feel that the industry has gotten to the point that they understand what we've been talking about. I feel very excited about the customer traction we're seeing, and watching our customers light up when we describe the attributes we bring. >> It's exciting, and just the risk management alone is a hedge. I mean, if I'm someone facing the cyber security challenges alone on data, you've got data sovereignty, compliance, never mind the productivity piece of it, which is pretty amazing. So you guys are changing the data equation. >> Indeed. Our most excited customers are CIOs, because of mitigating risk from things like cyber security. As you point out, you may have a breach in one cloud vendor; you can turn that off and use your replica on the other cloud vendor's side instantly. Those are comforts you do not get with other solutions. >> So we're having a love fest here. I love the whole multi-cloud data, anti-lock-in; I think that's a killer feature, and that will sell. But I'm going to say, okay, that's all good, and then get you on this one: security. No one has solved security yet. So if you've solved that, then you've pretty much got it all. So tell me the security story. >> So I'll start by saying, our biggest customer base is the financial industry: banking companies, insurance companies, health care. There is no industry in the world that's more security conscious than banking. >> What about the government? >> The government, perhaps. I mean, the banks are really security conscious. >> Money's money. >> Money is money, and they have a fiduciary responsibility, both to governments and to their customers. So we've catered to these customers for upwards of a decade now. Every technical decision we make has security as one of the focus items. >> You've been good on security. >> We lead the way in security when it comes to data. Yes. >> Encryption.
Is that what this is? >> It's encrypted on the wire. We support all of the data-at-rest encryption schemes. We support all of the cloud vendors' security mechanisms. We have a cross-cloud product, so the security problems are multiplied, and we take care of each of them specifically. So you can be confident that your data is secure. >> And wire-speed security, no overhead involved? >> No overhead involved at all. It's not measurable. >> So, congratulations on where you guys are; a lot more work to do. You're staffing up, hiring a lot of people. Talk about the talent you're hiring real quick, because attracting large-scale talent is also one indicator of the opportunity. The more I see, the more I think the positioning is phenomenal. Congratulations. Tell us about the hiring. >> As David mentioned a few minutes ago, we hired Joel from IBM for our marketing department as CMO, a wonderful hire. We've got Ronchi from the University of Denver, who left as head of that computer science department to come work for us; another amazing guy, terrific background. We've got another colleague, a UT Austin PhD, who's running engineering for us. We're so pleased to be able to hire talent at this level. As you well know, it's the people who make these jobs interesting and the products interesting. >> So what are some of the things those guys say when they get in and really get exposed? I mean, what would make someone quit a ten-year professor job at a university, which is pretty much retirement, to engage in a growing opportunity? What do they say? >> So the single theme that you'll find in all of this is very complex, unique technology that has been refined, and it's on the verge of exploding to probably something ten to one hundred times the size it is today.
People see that when we show them the value of what we've got and the market that we're taking this to, and they just get excited. >> Well, congratulations. You guys have certainly worked hard. It's been great to watch the entrepreneurial journey, getting into that growth stream and just the wins. You're getting back all that hard work, and the technology is phenomenal. Again, multi-cloud data, not worrying about where your data is, is going to give people some ease and let them rest at night, because that's the number one concern besides security. >> Absolutely. >> Jagane Sundar, CTO, chief technology officer of WANdisco, here inside theCUBE in Palo Alto. I'm John Furrier. Thanks for watching.

Published Date: Jan 23, 2019



Jagane Sundar, WANdisco | CUBEConversation, May 2018


 

(regal music) >> Hi, I'm Peter Burris, and welcome to another CUBE Conversation from our beautiful studios here in Palo Alto, California. Got another great guest today: Jagane Sundar, CTO of WANdisco. Jagane, welcome back to theCUBE. >> Good morning, Peter. >> So, Jagane, I want to talk about something, and I want you to help explicate for our clients what it actually means. There are two topics I want to discuss. We've done extensive research on both of them: one is this notion that we call plastic infrastructure, and the other, related, is something we call networks of data. Let's start with networks of data, because I think that's perhaps foundational for plastic infrastructure. If we look back at the history of computing, we've seen increasing decentralization of data. Yet today many people talk about data gravity and how the cloud is going to bring all data into the cloud. Our belief, however, is that there's a relationship between where data is located and the actions that have to be taken, and data locality has a technical reality to it. We think we're going to see more distribution of data, but in a way that nonetheless allows us to federate, to bring that data into structures that can ensure the data is valuable wherever it needs to be. When you think of the notion of networks of data, what does that make you think about? >> That's a very interesting concept, Peter. When you consider the cloud, and you talk about S3, for example, and buckets of objects, people automatically assume that it's a global storage system for objects. But if you scratch a little deeper under the surface, you'll find that each bucket is located in one region. If you want it available in other regions, you've got to set up something called cross-region replication, which replicates in an eventually consistent fashion. It may or may not get there in time.
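The cross-region behavior Jagane describes, where a write lands in one region and the replica catches up later, or perhaps not in time, can be sketched as a toy model. The class and method names here are hypothetical; real cross-region replication is asynchronous infrastructure, not a queue you pump by hand:

```python
import collections

class EventuallyConsistentReplica:
    """Primary region applies writes immediately; the replica region only
    applies them when the asynchronous replication pipeline runs."""
    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = collections.deque()  # in-flight replication events

    def put(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))   # queued, not yet replicated

    def pump(self):
        """Drain the replication pipeline (happens 'eventually')."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value

store = EventuallyConsistentReplica()
store.put("report.csv", b"v1")
print(store.replica.get("report.csv"))  # None: the replica has not caught up
store.pump()
print(store.replica.get("report.csv"))  # b'v1' once replication completes
```

The window between `put` and `pump` is exactly the "may or may not get there in time" gap: a reader hitting the replica during that window sees stale or missing data, which is what a strongly consistent replication protocol is designed to eliminate.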
So even in the cloud storage systems, there is a notion of locality of data. It's something that you have to pay attention to. Now, you hit the nail on the head when you said networks of data. What does that mean? Where does the data go? How is it used? Our own platform, Fusion, our platform for replication of data, is a strongly consistent platform which helps you conform to legal requirements on locality of data and many such things. It was built with exactly that in mind. Of course, we didn't quite call it that, but I like your way of describing it. >> So as we think then about where this is, the idea is: 40 years ago, ARPANET allowed us to create networks of devices in a relatively open, application-oriented way. The web allowed us to create networks of pages of content, but again, that content was highly stylized. More recently, social media has allowed us to create networks of identities. All very important stuff. Now as we start talking about digital business, and the fact that we want to be able to rearrange our data assets very quickly in response to new business opportunities, whether customer-experience- or operationally-oriented, this notion of networks of data allows us to think about the approach to doing that, so that the data can be in service to existing business opportunities, new business opportunities, and even available for future activities. So we're talking about creating networks out of these data sources, but as you said, to do that properly we need to worry about consistency, and we need to worry about cost. The platform for doing this, Fusion, is a good one. It's going to require over time, however, we think, some additional types of capabilities: the ability to understand patterns of data usage, the ability to stage data in advance and predictably, et cetera. Where do you think this goes as we start conceiving of networks of data as a fundamental value proposition for technology and business?
>> Sure. One of the first things that occurs to me when you talk about a network of data, if you consider that as parallel to a network of computers: you don't have a notion of things like read-only computers versus read-write computers. That's just silly. You want all computers to be roughly equal in the world. If you have a network of servers, a network of computers, any of them can read, any of them can write, and any of them can store. Now, our Fusion platform brings that capability to your definition of a network of data. What we call live data is the ability for you to store replicas of the data in different data centers around the world, with the ability to write to any of those locations. If one of the locations happens to go down, it's a non-event. You can continue writing and reading from the other locations. That truly makes the first step towards building this network of data that you're talking about feasible. >> But I want to build on that notion a little bit, because we are seeing increased specialization, for example AI, or GPUs. >> Sure. >> AI-specific processors. So even though we are still looking forward to general purpose, nonetheless we see some degree of specialization. But let me also take that notion of live data and say I expect we're going to see something similar. So for example, the same data set can be applied to multiple different classes of applications, where each application may take advantage of underlying hardware advantages, but you don't have a restriction on how you deploy it built into the data. Have I got that right? >> Absolutely. Our Fusion platform includes the capability to replicate across cloud vendors. You can replicate your storage between Amazon S3 and Azure Blob storage. Now, this is interesting because suddenly you may discover that Redshift is great for certain applications while Azure SQL DW is better for others.
We give you the freedom to invent new applications based on what location is best suited for that purpose. You've taken this concept of network of data, you've applied a consistent replication platform, now you have the ability to build applications in different worlds, in completely different worlds. And that's very interesting to us because if we look at data as the primary asset of any company, consider a company like Netflix, their data and the way they manage their data is the most important thing to that company. We bring the capability to distribute that data across different Cloud vendors, different storage systems, and run different applications. Perhaps you have a GPU heavy Cloud that maybe a GPU vendor offers. Replicate your data into that Cloud, and run your AI applications against that particular replica. We give you truly the freedom to invent new applications for your purpose. >> But very importantly, you are also providing, and I think this is essential, a certainty that there's consistency no matter how you do it. And I think that's the basis of the whole, the Paxos algorithms you guys are using. >> Exactly. The fundamental fact is this. Data scientists hate to deal with outdated data. Because all the work they're doing may be for no use if the data that they're applying it to is outdated, invalid, or partially consistent. We give you guarantees that the data is constantly updated, live data, it's completely consistent. If you ask the same question of two replicas of your data, you will get exactly the same answer. There is no other product in the industry today that can offer that guarantee. And that's important for our customers. >> Now building on the foundation, we're going to have to add some additional things to it. So pattern recognition, ML inside the tool. Is that on the drawing board? And I don't want you to go too far in the future, but is that kind of the future that you see too? 
>> We are a platform company with an excellent plug-in API. And one of the uses of our plug-in API, I'll give you a simple example, we have banking customers and they need to prevent credit card numbers from flying over the wire under certain circumstances. Our plug-in API enables them to do that. Applying an ML intelligence program into the plug-in API, again, a very simple development effort to do that. We are facilitating such capabilities. We expect third-party developers. We already have a host of third-party developers and companies building to our plug-in API. We expect that to be the vehicle for this. We won't claim expertise in ML, but there are plenty of companies that will do that on our platform. >> All right, so that leads to the second set of questions that I wanted to ask you about. We've defined what we call plastic infrastructure as a future for the industry. And to make sense of that, what we've done is we've said let's take a look at three phases of infrastructure, not based on the nature of the hardware, but based on the fundamental capabilities of the infrastructure. Static infrastructure is when we took an application, we wired it to a particular class of infrastructure. New load hits it, often you broke the infrastructure. Elastic infrastructure is the ability to be able to take a set of workloads and have it vary up and down, so that you can consume more and release the infrastructure so it has a kind of a rubber orientation. You hit it with a new load, it will deform for as long as you need it to, then it snaps back into shape. So you've predictability about what your costs are. We think that increasingly digital business is going to have to think about plastic infrastructure. The ability to very rapidly have the infrastructure deform in response to new loads, but persist that new shape, that new structure in response to how the load has impacted the business if in fact that is a source of value for the business. >> Sure. 
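Jagane's credit-card example can be sketched as a hypothetical plug-in. The interface shown here, a class with a `filter` method applied to each record before it crosses the wire, is purely an assumption for illustration; the actual Fusion plug-in API is not described in this conversation:

```python
import re

# Hypothetical record filter: 13-16 digits, optionally separated by
# spaces or hyphens, resembling a payment card number.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

class RedactCardNumbers:
    """Illustrative replication plug-in: inspect each outbound record and
    mask anything that looks like a card number before it leaves the site."""
    def filter(self, record: str) -> str:
        return CARD_RE.sub("[REDACTED]", record)

plugin = RedactCardNumbers()
print(plugin.filter("charge card 4111 1111 1111 1111 for $9"))
# -> charge card [REDACTED] for $9
```

The same hook point could just as easily host an ML classifier instead of a regular expression, which is the kind of third-party extension Jagane describes being built on the plug-in API.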
>> What do you think about that notion of plastic infrastructure? >> I love the way you describe it. In our own internal terminology we have this notion of live data and freedom to invent. What you've described is exactly that. The plastic infrastructure matches exactly with our notion of freedom to invent. Once you've solved the problem of making your data consistently available in different Clouds, different regions, different data centers, the next step of course is the freedom to invent new applications. You're going to throw experimental things at it. You're going to find that there is specific business intelligence that you can draw from this by virtue of a new application. Use it to make some critical decisions, improve profitability perhaps. That results in what you describe as plastic infrastructure. I really love that description by the way. Because we've gone from, the Cloud brought us elastic infrastructure, we've replicated, we've built a system that enables innovation and invention of new ideas. That's plastic infrastructure. I really like the idea that you're proposing. >> So as you think about this concept of plastic infrastructure, obviously there's a lot of changes that are going to take place in the industry. But Fusion in particular, by providing consistency, by increasing the availability, more importantly even the delivery of data where it's required, facilitates at that data level that notion of plasticity. >> Absolutely. The notion that you can throw brand new applications at it in a Cloud vendor of your choice, the fact that we can replicate across different Clouds, is important for plastic infrastructure. Perhaps there are certain applications that work better in one Cloud versus the other. You definitely want to try that out. And if that results in some real valuable applications, continue running it. So your definition that elastic becomes plastic infrastructure matches perfectly with that.
We love this notion that we take the CIO's problems of mundane data management away and introduce the capability to invent and innovate in their space. >> So let me give you a very, or let me ask you a very practical, simple question. Historically, the back-up and restore people and the application development people didn't spend a lot of time with each other, and that has created some tension. Are we now, because of our ability to do this live data, able to bring those two worlds more closely together so that developers can now think about building increasingly complex, increasingly rich applications? And at the same time ensure that the data that they're building and testing with is in fact very close to the live data that they're actually going to use. >> Absolutely. We do bridge that gap. We enable application developers to think of more complex, more sophisticated applications without actually worrying about the availability or the consistency of data. And the IT administrators and the CIO, who run operations that need to deliver that, have the confidence that they can in fact deliver it with the levels of consistency and availability that they need. >> So I'm going to give you the last word in this. We've talked a fair amount now about this notion of networks of data, and infrastructure plasticity. Where do you think this kind of matures over the course of the next four or five years? And what should your peer CTOs of large businesses that are thinking about these challenges of data management be focusing on? >> So the first thing that you have to acknowledge is that people need to stop thinking about machines and servers, and consider this as infrastructure that they acquire from different Cloud vendors. Different Cloud vendors because in fact there are going to be a few, a handful of good Cloud vendors that'll give you different capabilities.
Once you get to that conclusion, you need your data available in all of these different Cloud vendors, and perhaps in your on-prem location as well, with strong consistency. Our platform enables you to do that. Once you get to that point, you have the freedom to build new applications, build business-critical systems that can depend on the consistency and availability of data. That is your definition of plasticity and networks of data. I truly like that. >> Yeah, and so we, great, great summary. We would say that we agree with you, that increasingly the CIO, or the CDO, whoever it's going to be, has to focus on how do I increase returns on my business's data, and to do that they need to start thinking differently about their data, about their data assets, both now and in the future. Very, very important stuff. Jagane, thank you very much for being on the Cube. >> Thank you, Peter. >> And once again, I'm Peter Burris, and this has been a Cube conversation with Jagane Sundar, CTO of WANdisco. Thanks again. (regal music)
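The consistency guarantee that threads through the conversation above, ask the same question of two replicas and get exactly the same answer, is the defining property of replicated state machines, which Paxos-style consensus makes possible. Here is a toy sketch of that property only; it is not WANdisco's engine, and all names are invented. The consensus protocol's job (not shown) is to make every replica agree on one ordered log of operations; identical answers then follow mechanically:

```python
class Replica:
    """A toy key-value replica (hypothetical, for illustration only)."""
    def __init__(self):
        self.state = {}

    def apply(self, op):
        # Each op is ("put", key, value) or ("delete", key).
        kind, key, *rest = op
        if kind == "put":
            self.state[key] = rest[0]
        elif kind == "delete":
            self.state.pop(key, None)

# Consensus delivers this same ordered log to every site.
agreed_log = [
    ("put", "policy-42", "underwritten"),
    ("put", "policy-43", "pending"),
    ("delete", "policy-43"),
]

east, west = Replica(), Replica()
for op in agreed_log:        # same operations, same order, at every site
    east.apply(op)
    west.apply(op)

# "Ask the same question of two replicas ... exactly the same answer."
assert east.state == west.state
```

Everything difficult in a real system lives in the step this sketch skips: getting geographically separated sites to agree on that single log while any of them may fail.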

Published Date : May 17 2018


Jagane Sundar, WANdisco | CUBEConversation, May 2018


 

(intense orchestral music) >> Hi I'm Peter Burris, welcome to another CUBEConversation. Today we've got a special guest from WANdisco, Jagane Sundar, who's the CTO, Jagane, welcome to theCUBE again! >> Thanks Peter, happy to be here! >> So Jagane, we've got a lot to talk about today, WANdisco's doing a lot of new things, but clearly the industry is, itself, in the midst of a relatively important evolution. Now we at Wikibon and SiliconANGLE have been calling it the transformation to digital business. Everybody talks about this, but we've been pretty specific, we think that it boils down to how a company uses data as an asset, and the degree to which it's institutionalizing, or re-institutionalizing, work around those assets. How does WANdisco see this big transformation that we're in the midst of right now? >> So, you're exactly right, businesses are transforming from traditional means to a digital-based business, and the most important thing about that is the data. WANdisco is at the forefront of making your data available for your innovation. We start off with the basic use-cases, disaster recovery, that's a traditional problem that people have half-solved in many different ways, but we have the ability to solve that problem and take you to the next stage, which is what we call live data, where you don't worry about the availability or the location of your data anymore. Finally, we take you from that live data platform to a place where you can invent with your data, what we call the freedom to invent phase. Now, that's what you're calling the digital transformation, and there's great synergy between our two terminologies, that's an important aspect here. >> So let me unpack that a little bit, if I can.
So let's start with this notion of backup and restore, or disaster recovery. The historical orientation is: I have these very expensive assets, typically in the form of hardware, or maybe applications, and I have to ensure that I can back those assets up. So backup and restore used to be back up a device, back up a volume, back up whatever else it might be, and now it's moved to more of a backup of a virtual machine. I think we're talking about something different when we talk about your approach to backup and restore, we're really talking about backing up data assets, do I have that right? >> That is correct. You have gone from a place where you are backing up PCs and Macintoshes and cellphones, to a place where the digital assets of your company, the ones that are useful for analytics, are far more important. Now, a simple backup, where you take the contents in one data center and push it to another data center, is a half-solution to the problem. What we've come up with is this notion called live data. You have multiple data centers, some of them you own, they're on premise, some of them are Cloud vendor data centers, they definitely reside in different parts of the world. Your data also is generated in different parts of the world, now all of this data goes into this data system, this platform that we've built for you, and it's available under all circumstances. If a region of a Cloud vendor goes down or if your own data center goes down, that's a non-event, because that data is available in other data centers around the world. This gives you the flexibility to treat this as a live data platform. You can write data where you want, you can read and run analytics wherever you want. You've gone from backing up PCs and phones, to actually using your digital assets in a manner such that you can make critical business decisions based on that. Imagine an insurance company that's underwriting policies based on this digital data.
If the data's not available, you've got a full halt on the business, that's not acceptable. If the data is not available because a specific data center went down, you can't call a full stop to your business, you've got to make it available. Those are simple examples of how digital transformation is happening, and regular backup and DR are really inadequate to fuel your digital transformation. >> In fact, we like to think, we're advising our clients, that as they think about digital transformation and the role that data's playing, a digital business is not just backing up and restoring or sustaining or avoiding disasters associated with the data, they're really talking about backing up and restoring their entire business. That's kind of what we mean when we talk about disaster recovery, or backup and restore, in a digital business sense. And as you said, this notion of live data increases our ability to do that, but partly that requires a second kind of a step. By that I mean, most people think about storage, they think about where data's located, in terms of persisting the data. When we talk about this new approach, we're talking about ensuring that we can deliver the data. Restore takes on more importance, relative to backup, than it has before, would you agree with that? That really talking about live data is really about being able to restore data wherever it's needed.
So if you have a West Coast data center, and an East Coast data center, and a Midwest data center, and your West Coast data center happens to go down, none of the activities that you perform on your data will stop, you can continue writing your data to your Midwest and your East Coast data center, you continue writing and reading, running your applications against this data set. Now there wasn't a definition that the West Coast is primary and East Coast is backup. When a disaster strikes, we will cut over to the backup, we'll start using that, when the primary comes back, now we have to reconcile it, that's the traditional way of doing things, and it brings about some really bad attributes. Such as you need to have all your data pumped into one data center, that's counter to our philosophy. We believe that live data is where each of these replicas is equal, we build a platform for you where you can write to any of these, you can run your analytics against any of those. Once you get past that mental hurdle, what you've got is the freedom to innovate. You can look at it and go: I've got my data available everywhere, I can write to it, I can read from it, what can I do with this data? How can I quickly iterate so I can make more interesting business decisions, more relevant business decisions that will result in better business, profits and revenue. This interesting outcome is because you're now, not concerned about the availability of data or the primary, backup, and failover and failback, all those disappear from your radar. >> So let me build on that a little bit too, Jagane. 
So the way we would describe that is that a digital business must have those data assets, those crucial data assets, available so that they can be delivered to applications and new activities. So we think in terms of what we call data zones, where the idea is, you take a look at what your digital business value proposition is, what activities are essential to delivering on that value proposition, and then whether or not the data is in a zone proximate to that activity, so that activity can actually be executed. So that means, from a physical standpoint, it needs to be there, from a legal standpoint, from an intellectual property control standpoint, from a cost standpoint, but also from a consistency standpoint: you don't want dramatically different behaviors in your business just because the data that's over there is not consistent with the data that's over here. That's kind of what you guys are looking at. Now, going out a little bit, ultimately that means that this notion of deploying data so it serves your business now has to also include a futures orientation. That we want to choose technologies that give us high-value options on data futures as well. Is that what you mean by, effectively, freedom to invent?
So we've gone from a simpler primary-backup type of system to a live data platform. And finally, we've given you the freedom to invent, because you can now take a look at it and go, I can start building applications that are in the critical business path, because I'm confident of the availability of my data, and the fact that we comply with all regulatory requirements; things like aging out data after a certain number of months or days, we can help you do that really well with our platform. So yes, in fact the notion that data resides in different pools, in different areas, replicated consistently, available under all circumstances, enables business to think about their data in a completely different manner, up-level it. >> And satisfying physical, legal, intellectual property, and cost realities. >> Exactly. Those are all considerations that need to be addressed by this replication platform. >> So as we think about where customers are going with this, clearly they've started around this backup and restore, but it sounds like you guys are helping them today conceive of what it means to do backup and restore and analytics. That is a particularly sensitive issue for a lot of businesses right now that are trying to marry together data science and good practices associated with IT. How is that playing out, can you give us some insight into how customers are doing a better job of that? >> Sure. A global automaker that has acquired our replication software started off by using it for two very simple use-cases. They were looking at migrating from an older version of a data system to a newer version, we enabled them to do that without downtime, that was a clear win for us. The second thing they wanted to do was enable a disaster recovery type scenario.
Once we got to that stage, we showed them how easy it was for them to continue writing to what was originally, notionally, the backup system. That made about twice as much compute resource available for them, because their original notion was that the backup system would just be a backup system, nothing could be done on it. Light bulbs went off in our customer's head, they looked at it and went, I can continue writing here, even if my primary goes down, there's no real notion of a backup, there's no real notion of failover and failback. That opened their minds to a whole bunch of new ideas. Now, they are in a position to build some business-critical applications. Gone are the days when analytics meant you run a report once a week and send it off to the CIO. It's not that anymore, it's up-to-the-minute accuracy, people are making decisions, like insurance companies making underwriting decisions, and healthcare companies tracking the spread of diseases, based on up-to-the-minute information that they're getting. These are not once-weekly analytics applications anymore, these are truly businesses that are based on their digital data. >> So a fundamental promise of live data is that wherever the data is, the application is live? >> Jagane: Yes, absolutely. >> Alright, one more thing I think we want to talk about very quickly, Jagane, is that there are some differences in mindset that a CIO has to apply here. Again, the CIO used to look at the assets and say machines, the hardware, yes, and maybe the applications, and now, to really see the value of this, they have to think of this in terms of data being the asset. How are your customers starting to evolve that notion so that they see the problem differently? >> So, I think the first thing that happened was the Cloud, we can't take credit for that, of course, but it helped our cause a great deal because people looked at infrastructure with a completely different viewpoint.
They don't look at it as, I'm going to buy a server of this size to run my Oracle, that mentality went away, and people started looking at, I have to store my data here and I can run an elastic application on this, I can grow my resources on demand and surrender those resources back to the Cloud when I don't need them. We take that to the next step, we enable them to have consistent replicas of their data across multiple regions of Cloud vendors, and across different Cloud vendors. Suddenly they have the ability to do things like, I can run this analytics on Redshift here in Amazon really well, and I can use this same data to run it on Azure SQL DW there, which is a better application for this specific use-case. We've opened up the possibilities to them, such that they don't worry about what data they're going to use, or how many resources they're going to get; resources are truly elastic now, you can buy and surrender resources as per your demand, so it's opening up possibilities that they never had before. >> Excellent! Jagane Sundar, CTO of WANdisco, talking about live data, and the journey the customers are on to make themselves more fully digital businesses. >> Thanks, Peter. >> Once again this is Peter Burris from theCUBE, CUBEConversation with Jagane Sundar of WANdisco. (intense orchestral music)
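The "no primary, no backup, no failover" model Jagane describes in this conversation rests on majority-quorum commits: in Paxos-family protocols a write is durable once most replicas accept it, so a minority of sites can be down without halting anything. The function below illustrates only that quorum arithmetic, a toy sketch, not the actual protocol:

```python
def commit(acks: int, replicas: int) -> bool:
    """A write commits iff a strict majority of replicas acknowledged it."""
    return acks > replicas // 2

# Three equal data centers: West, Midwest, East.
assert commit(acks=3, replicas=3)       # all sites healthy
assert commit(acks=2, replicas=3)       # West down: writes still commit
assert not commit(acks=1, replicas=3)   # a single isolated site cannot
```

Because two of three sites can always make progress, there is nothing to "fail over" to and nothing to reconcile when the third site returns; it simply catches up on the agreed log.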

Published Date : May 17 2018


Jagane Sundar, WANdisco | AWS Summit SF 2018


 

>> Voiceover: Live from the Moscone Center, it's theCUBE. Covering AWS Summit San Francisco 2018. Brought to you by Amazon Web Services. >> Welcome back, I'm Stu Miniman and this is theCUBE's exclusive coverage of AWS Summit here in San Francisco. Happy to welcome back to the program Jagane Sundar, who is the CTO of WANdisco. Jagane, great to see you, how have you been? >> Well, been great Stu, thanks for having me. >> All right so, every show we go to now, data really is at the center of it, you know. I'm an infrastructure guy, you know, data is so much of the discussion here, here in the cloud in the keynotes, they were talking about it. IOT of course, data is so much involved in it. We've watched WANdisco from the days that we were talking about big data. Now it's you know, there's AI, there's ML. Data's involved, but tell us what is WANdisco's position in the marketplace today, and the updated role on data? >> So, we have this notion, this brand new industry segment called live data. Now this is more than just itty-bitty data or big data, in fact this is cloud-scale data located in multiple regions around the world and changing all the time. So you have East Coast data centers with data, West Coast data centers with data, European data centers with data, all of this is changing at the same time. Yet, your need for analytics and business intelligence based on that is across the board. You want your analytics to be consistent with the data from all these locations. That, in a sense, is the live data problem. >> Okay, I think I understand it but, you know, we're not talking about like, in the storage world there was like hot data, what's hot and cold data. And we talked about real-time data for streaming data and everything like that. But how do you compare and contrast, you know, you said global in scope, talked about multi-region, really talking distributed. From an architectural standpoint, what's enabling that to be kind of the discussion today? 
Is it the likes of Amazon and their global reach? And where does WANdisco fit into the picture? >> So Amazon's clearly a factor in this. The fact that you can start up a virtual machine in any part of the world in a matter of minutes and have data accessible to that VM in an instant changes the business of globally accessible data. You're not simply talking about a primary data center and a disaster recovery data center anymore. You have multiple data centers, the data's changing in all those places, and you want analytics on all of the data, not part of the data, not on the primary data center, how do you accomplish that, that's the challenge. >> Yeah, so drill into it a little bit for us. Is this a replication technology? Is this just a service that I can spin up? When you say live, can I turn it off? How do those kind of, when I think about all the cloud dynamics and levers? >> So it is indeed based on active-active replication, using a mathematically strong algorithm called Paxos. In a minute, I'll contrast that with other replication technologies, but the essence of this is that by using this replication technology as a service, so if you are going up to Amazon's web services and you're purchasing some analytics engine, be it Hive or Redshift or any analytics engine, and you want to have that be accessible from multiple data centers, be available in the face of data center or entire region failure, and the data should be accessible, then you go with our live data platform. >> Yeah so, we want you to compare and contrast. What I think about, you know, I hear active-active, speed of light's always a challenge. You know globally, you have inconsistency it's challenging, there's things like Google Spanner out there to look at those. You know, how does this fit compared to the way we've thought of things like replication and globally distributed systems in the past? >> Interesting question. 
So, ours is great for analytics applications, but something like Google Spanner is more like a MySQL database replacement that runs in multiple data centers. We don't cater to that sort of database-transaction type of application. We cater to analytics applications: batch, very fast streaming applications, enterprise data warehouse-type analytics applications, all of those. Now if you take a look inside and see what kind of replication technology will be used, you'll find that we're better than the other two different types. There are two different types of existing replication technologies. One is log shipping. The traditional Oracle GoldenGate type: ship the log once the change is made to the primary. The second is: take a snapshot and copy differences between snapshots. Both have their deficiencies. Snapshot of course is time-based, and it happens once in a while. You'll be lucky if you can get a one-day RPO with those sorts of things. Also, there's an interesting anecdote that comes to mind when I say that, because the Hadoop folks, in their HDFS, implemented a version of snapshot and snapdiff. The unfortunate truth is that it was engineered such that, if you have a lot of changes happening, the snapshot and snapdiff code might consume too much memory and bring down your NameNode. That's undesirable, now your backup facility just brought down your main data capability. So snapshot has its deficiencies. Log shipping is always active/passive. Contrast that with our technology of live data, where you can have multiple data centers filled with data. You can write your data to any of these data centers. It makes for a much more capable system. >> Okay, can you explain, how does this fit with AWS and can it live in multi-clouds, what about on-premises, the whole you know, multi and hybrid cloud discussion? >> Interesting, so the answer is yes. It can live in multiple regions within the same cloud, multiple regions within different clouds.
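The snapshot deficiency called out above can be made concrete. In a snapshot-and-diff scheme, only changes captured at snapshot time ever reach the backup, so the recovery point can never be better than the snapshot interval. A toy sketch, with dictionaries standing in for file-system snapshots:

```python
def snapdiff(old: dict, new: dict) -> dict:
    """Keys added or changed since the previous snapshot."""
    return {k: v for k, v in new.items() if old.get(k) != v}

snap_noon = {"a": 1}                    # last snapshot shipped to the backup
live_state = {"a": 1, "b": 2, "c": 3}   # writes that arrived since noon

# If the primary is lost before the next snapshot, the backup holds only
# snap_noon -- everything in this diff is gone. That window is the RPO.
lost = snapdiff(snap_noon, live_state)
assert lost == {"b": 2, "c": 3}
```

Continuous, consensus-ordered replication closes that window by shipping each agreed change as it happens rather than on a timer.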
It'll also bridge data that exists on your on-prem, Hadoop or other big data systems, or object store systems within Cloud, S3 or Azure, or any of the BLOB stores available in the cloud. And when I say this, I mean in a live data fashion. That means you can write to your on-prem storage, you can also write to your cloud buckets at the same time. We'll keep it consistent and replicated. >> Yeah, what are you hearing from customers when it comes to where their data lives? I know last time I interviewed David Richards, your CEO, he said the data lakes really used to be on premises, now there's a massive shift moving to the public clouds. Is that continuing, what's kind of the breakdown, what are you hearing from customers? >> So I cannot name a single customer of ours who is not thinking about the cloud. Every one of them has a presence on premise. They're looking to grow in the cloud. On-prem does not appear to be on a growth path for them. They're looking at growing in the cloud, they're looking at bursting into the cloud, and they're almost all looking at multi-cloud as well. That's been our experience. >> At the beginning of the conversation we talked about data. How are customers doing you know, exploiting and leveraging or making sure that they aren't having data become a liability for them? >> So there are so many interesting use cases I'd love to talk about, but the one that jumps out at me is a major auto manufacturer. Telematics data coming in from a huge number, hundreds of thousands, of cars on the road. They chose to use our technology because they can feed their West Coast car telematics into their West Coast data center, while simultaneously writing East Coast car data into the East Coast data center. We do the replication, we build the live data platform for them, they run their standard analytics applications, be it Hadoop-sourced or some other analytics applications, they get consistent answers. 
Whether you run the analytics application on the East Coast or the West Coast, you will get the same exact answer. That is very valuable, because if you are doing things like fault detection, you really don't want spurious detection because the data on the West Coast was not quite consistent and your analytics application was led astray. That's a great example. We also have another example with a top-three bank that has a regulatory concern where they need to operate out of their backup data centers, so-called backup data centers, once every three months or so. Now with live data, there is no notion of active data center and backup data center. All data centers are active, so this particular regulatory requirement is extremely simple for them to implement. They just run their queries on one of the other data centers and prove to the regulators that their data is indeed live. I could go on and on about a number of these. We also have a top-two retailer who has got such a volume of data that they cannot manage it in one Hadoop cluster. They use our technology to create the live data data lake. >> One of the challenges always, customers love the idea of global, but governance, compliance, things like GDPR pop up. Does that play into your world? Or is that a bit outside of what WANdisco sees? >> It actually turns out to be an important consideration for us, because if you think about it, when we replicate, the data flows through us. So we can be very careful about not replicating data that is not supposed to be replicated. We can also be very careful about making sure that the data is available in multiple regions within the same country, if that is the requirement. So GDPR does play a big role in the reason why many of our customers, particularly in the financial industry, end up purchasing our software. >> Okay, so this new term live data, are there any other partners of yours that are involved in this? As always, you want like a bit of an ecosystem to help build out a wave.
>> So our most important partners are the cloud vendors, and they're multi-region by nature. There is no idea of a single-data-center or single-region cloud, so Microsoft with Azure and Amazon with AWS are important partners of ours, and they're promoting our live data platform as part of their strategy of building huge hybrid data lakes. >> All right, Jagane, give us a little view looking forward. What should we expect to see from live data and WANdisco through the rest of 2018? >> Looking forward, we expect our footprint to grow in terms of the variety of applications we deal with: all the way from batch Pig scripts that used to run once a day, to Hive that runs maybe once every 15 minutes, to data warehouses that are almost instant and queryable by human beings, to streaming data that pours things into Kafka. We see the whole footprint of analytics databases growing. We see cross-capability, meaning perhaps replication from Amazon Redshift to an Azure SQL EDW. Those things are very interesting to us and to our customers, because some of them have strengths in certain areas and others have strengths in other areas. Customers want to exploit both of those. So we see ourselves as the glue for all world-scale analytics applications. >> All right, Jagane, I appreciate you sharing with us everything that's happening at WANdisco. This new idea of live data, we look forward to catching up with you and the team in the future and hearing more about the customers and everything on there. We'll be back with lots more coverage here from AWS Summit here in San Francisco. I'm Stu Miniman, you're watching theCUBE. (electronic music)
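The reason analysts only trust a replica when it gives the "same exact answer on either coast" can be made mechanical: compute an order-independent content hash of the dataset on each side and compare. The sketch below is a toy illustration of that idea, not WANdisco's actual mechanism; the in-memory "replicas" and record shapes are hypothetical.

```python
import hashlib
import json

def dataset_fingerprint(records):
    # Order-independent content hash: serialize each record canonically,
    # sort the serialized forms, and hash the result, so two replicas
    # can be compared cheaply regardless of physical layout.
    canonical = sorted(json.dumps(r, sort_keys=True) for r in records)
    return hashlib.sha256("\n".join(canonical).encode()).hexdigest()

# Two replicas of the same transactions in different "regions";
# physical order differs, but the content is identical.
east = [{"id": 1, "amount": 40}, {"id": 2, "amount": 75}]
west = [{"id": 2, "amount": 75}, {"id": 1, "amount": 40}]
assert dataset_fingerprint(east) == dataset_fingerprint(west)

# A replica that silently missed an update is detectably stale.
stale = [{"id": 1, "amount": 40}]
print(dataset_fingerprint(east) == dataset_fingerprint(stale))  # False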

Published Date : Apr 4 2018



Jagane Sundar, WANdisco | AWS re:Invent 2017


 

>> Announcer: Live from Las Vegas, it's theCube, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Welcome back to our live coverage. theCube here at AWS re:Invent 2017, our fifth year covering Amazon Web Services and their massive growth. I'm John Furrier, my co-host Lisa Martin. Here our next guest is the CTO of WANdisco, Jagane Sundar. Welcome back to theCube. >> Thank you, John. >> You guys are everywhere, WANdisco around the table in all these deals, so you guys have been doing extremely well with (indistinct talking) property. What's new? You got some news? >> Yes we do. We recently announced integration with Amazon's AWS Snowball device, which gives you the ability to migrate on-premises workloads into the Cloud without downtime, and the end result is a hybrid cloud environment where you have an actively writable environment on both sides. That's a unique capability; nobody else can do that today. >> What does it mean for AWS and their customers, 'cause they're very customer focused. What are you guys bringing to the table? >> We bring a whole lot of big data workloads, analytics workloads, IoT workloads into their Cloud. And the beauty of the cloud is that you may have a 20-node cluster on-premises, but you can run analytics with 1000 nodes up in the Cloud on demand and pay just for that use. We think it's a very powerful value proposition. >> Where are you seeing the most traction? We've been talking about the massive growth, the 18-billion-dollar annual run rate that AWS is at, and Andy's conversation with you, John, the other day: he said we haven't gotten that big on startups alone. So even some of the things like the advertising that AWS is now starting to do suggest they're going up the stack to the Enterprise and to the C-suites. Where are you guys seeing the most traction with AWS? Is it in the Enterprise space, is it in the startup space, both?
>> So somewhat because of our roots, what we're finding is that the large majority of big data customers and analytics customers from the last two, three years are all considering some form of addition of a cloud to their environment. If it's not a wholesale migration, it's a hybrid environment; it's the bursting-out-into-the-cloud type of use case. And what you're finding is that growth of on-premise big data and analytics systems is slowing down, because once you get to the Cloud, the plethora of tools you have and the facilities that the scale brings you are just unmatched. That's the trend we really see in the market. >> We've seen a lot of people go and use it in the marketplace. Juniper Networks, for instance, are seeing some activity at the network layer. Who would have thought a network player would get picked up in the Cloud? But this is what industrial-strength cloud looks like. You guys have the active-active. Where does that fit in for the customers who wanna leverage the apps and don't wanna worry about the networks? >> Exactly. The traditional model of thinking was: use the Cloud for backup. You have your on-premise stuff; the cheapest way to back it up is into the Cloud. But that's really just scratching the tip of the iceberg. Once you put your data up in the Cloud, and you have the ability to have it strongly, consistently replicated, then you can do amazing things from the Cloud. You can do a whole new analytics system. Perhaps you want to experiment with Spark in the Cloud and have Hive on-premise; that works very well. Now that both sides are actively writable, you can create partitions of your data that are dynamically generated and written to both sides. These are things that people did not consider. Once they stumble upon it, it just opens their mind to a whole new way of operating. >> Speaking of Spark, I've heard some rumors and rumblings in the developer community here that they're running Spark on Lambda. People are always hacking with new stuff.
So Lambda serverless, I think, is coming on. How does that relate to some of the things that are driving WANdisco? How do you relate to that? Does that help you? Does that hurt you guys? >> It helps us, the way we look at it. We're all about strong replication of storage. Lambda has no storage; you talk to underlying storage of some kind. It's S3, it's EBS volumes, whatever. So long as the storage comes through our system, any growth, any simple, easy way for applications to be written is hugely positive for us. >> What about the startups out there? We've seen a lot of startups really miss the mark. They misfired on the Cloud, and you've seen some stars that have played it well. They've got in the tornadoes, as we say. In fact, Geoffrey Moore, I think, is rewriting his book Inside the Tornado, which is a management paradigm. But there really seems to be a new business model. You guys are like evergreen at WANdisco because of your unique (indistinct talking) property. How are you guys working with that business model, and what are some of the things you're seeing with startups and companies who are trying to play the cloud but are misfiring? >> Right, so WANdisco, as you know, stands for Wide Area Network Distributed Computing, and the Cloud is like a huge bonus to it. It's all about the Wide Area Network. We are now consolidating a bunch of work in the cloud, but guess what? It's gonna go back out to the edge in some way, 'cause the edges are getting smarter. You need replication between those. We see a lot of that coming up in the next two, three, five years. IoT workloads and use cases all involve some amount of smart edge computing. We replicate between those really well. >> Lisa, we always talk about the trend is your friend. In your case, Cloud is your friend. >> Indeed, it is.
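The best-effort notification problem described at the start of the conversation can be made concrete with a toy simulation: a copier driven by droppable change notifications never converges on the source, while shipping the ordered change log itself reproduces it exactly. The drop rate, seed, and event shapes below are all made up for illustration; this is not any vendor's implementation.

```python
import random

def replicate_by_notification(events, drop_rate=0.01, seed=7):
    """Best-effort delivery: each change notification may be lost."""
    rng = random.Random(seed)
    replica = {}
    for key, value in events:
        if rng.random() >= drop_rate:  # a dropped notification is never retried
            replica[key] = value
    return replica

def replicate_by_log(events):
    """Ship the ordered change log itself: nothing is lost."""
    replica = {}
    for key, value in events:
        replica[key] = value
    return replica

# Build a source store from a stream of (key, value) change events.
source = {}
events = [(f"obj-{i}", i) for i in range(10_000)]
for key, value in events:
    source[key] = value

best_effort = replicate_by_notification(events)
consistent = replicate_by_log(events)

print(len(source) - len(best_effort))  # roughly 1% of objects silently missing
print(consistent == source)            # True
```

Over a few petabytes and a month of replication, that silently missing fraction is exactly the "point one percent" drift that makes data scientists abandon the replica and go back to the source of truth.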
The Cloud is all about wide area network computing, and we are the ones who can really replicate-- >> How does a customer know what to do when it comes down to getting involved with WANdisco? It's not obvious. Spell it out: why do they need you guys? When do you get involved? What specific things should be red flags to a potential customer, or a customer who says, I'm gonna go all in on the Cloud? Unpack that. >> Let me give you a simple example. Look at Amazon S3; it's a Cloud storage service. But do you know that it's actually on a per-region basis? When you create a bucket and put objects into the bucket, it's located in one region. If you want it replicated elsewhere, they have cross-region replication, which is an eventually consistent replication system that doesn't give you the consistent results that you want. If you have such a situation, employing our technology immediately gives you consistent replication, be it Cloud region to Cloud region, Cloud to Cloud, or on-premise to Cloud. The end result is: the minute you step into replication across the WAN, every solution out there doesn't do it consistently, and that's our core-- >> And that's your unique IP. >> Indeed, it is. >> Okay, so I'm seeing Amazon racing their rollout of regions. They got one coming in China, one in the Middle East. That's a big part of the strategy. Does that help you, or what does that do? >> Absolutely, it helps us a great deal, partly because customers now do not look at their applications as single-region applications. That doesn't fly anymore. The notion that my banking app cannot work because a data center went down is just not acceptable in the modern world anymore. The fact that we depend so much on these services means they need to be up all the time. More regions, more data replication. That's why we step in. >> So that sounds like a lot of symbiosis here. You talk about S3 and replication challenges. So tell us how WANdisco is actually helping AWS.
That's one example, but help us understand the symbiosis in your relationship with AWS. >> The best example I can give you is a large travel service company on the internet. They had a Hadoop infrastructure that was growing out of control. They wanted to manage costs by moving some workloads to Amazon, but didn't really know where to start, because you can't do such a thing as take a copy of the data, ship it off on a Snowball into the Cloud, and tell the users of that data: stop writing to it now, it's gonna be available in the Cloud a week, 10 days from now, and then you can start writing again. That's just not acceptable. This is a live data problem. The problem here is that you need to be able to ship out your data on Snowballs, continue to write to the on-premise storage, and when it shows up in the Cloud, start writing to that. Both are consistently replicated, and you have a proper hybrid Cloud environment. So this was a great bonus to them. As for AWS, they watched this, and they look at it as an easy way to move the vast majority of data from on-premise big data analytics systems. >> Have they been fuel to your fire, in the sense that they've been on this incredible acceleration of their innovation? As Andy Jassy has said many times to you, John, it's speed and customer focus. So how has their accelerated pace of innovation helped fuel WANdisco's, like you were saying, unique value? How have they really ignited that? >> So they started off with just plain Snowball two years ago. Last year they announced Snowball Edge, which is a much improved device. Now they have in the works the capability to do some compute on those boxes. That's very interesting to us. Now our services can reside on the Snowball. It arrives at a customer site, he plugs it in, turns it on: instant replication capabilities. Those are fueled both by Amazon's drive and extreme speed, and our own capabilities.
So Amazon is a wonderful partner for us, partly because their pace of innovation is quite amazing. >> Snowball, Snowmobile, it's gonna be a white Christmas for you guys. Business is good. >> Business is great. >> Okay, final question. What are the conversations you're having here this year? Share with us some of the quick conversations you're having in the hallways, meetings; Amazon's got execs, partners. >> So most of the conversations are about moving workloads from on-premise into the Cloud. I personally am very interested in IoT use cases, because I see the volume of data and the ability for us to do some interesting replication as being critical. That's where our focus is right now. >> Jagane Sundar, CTO of WANdisco. Big announcement, partnership with Amazon Web Services and Snowball replication, active-active. Great solution for replication. You got regions, across regions. Check out WANdisco. Thanks for coming by, great to see you again. Congratulations on all your success. This is theCube, live coverage day one. It's coming down to an end. The halls open, we got two more days of packed coverage on two Cube sets. Stay tuned for more, we got some great guests coming up, stay with us. (uptempo techno music)
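The Snowball scenario Jagane describes, bulk-copying the data while the on-premise store stays writable and then letting catch-up replication close the gap, reduces to a simple pattern. The sketch below is a toy model under assumed semantics (last-writer-wins on a key-value store), not the actual Fusion implementation:

```python
def migrate_with_live_writes(source, incoming_writes):
    """Bulk-copy a snapshot, keep accepting writes, then replay the delta.

    Toy model of the pattern described above: the source store stays
    writable for the whole migration, and the target catches up by
    replaying a change log of everything that landed mid-migration.
    """
    snapshot = dict(source)              # bulk copy shipped while source stays live
    change_log = []
    for key, value in incoming_writes:   # writes that land mid-migration
        source[key] = value
        change_log.append((key, value))
    target = dict(snapshot)              # snapshot arrives at the destination
    for key, value in change_log:        # catch-up replication closes the gap
        target[key] = value
    return target

source = {"a": 1, "b": 2}
late_writes = [("b", 20), ("c", 3)]      # users never stopped writing
target = migrate_with_live_writes(source, late_writes)
print(target == source)  # True: no write window was ever closed
```

The interesting engineering (which this sketch elides) is keeping that change log flowing continuously in both directions afterwards, so the two sides stay writable and consistent rather than the target going stale the day after cutover.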

Published Date : Nov 29 2017



Jagane Sundar & Pranav Rastogi | Big Data NYC 2017


 

>> Announcer: Live from Midtown Manhattan, it's theCUBE, covering Big Data, New York City, 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Okay, welcome back, everyone. Live in Manhattan, this is theCUBE's coverage of our fifth year doing Big Data, NYC; eighth year covering Hadoop World, which has now evolved into Strata Data, which is right around the corner. We're doing that in conjunction with that event. This is, again, where we have the thought leaders, we have the experts, we have the entrepreneurs and CEOs come in, of course. The who's who in tech. And my next two guests are Jagane Sundar, CUBE alumni, who was on yesterday, CTO of WANdisco, one of the hottest companies, most valuable companies in the space for their unique IP, and not a lot of people know what they're doing. So congratulations on that. But you're here with one of your partners, a company I've heard of, called Microsoft, also doing extremely well with the Azure Cloud. We've got Pranav Rastogi, who's a program manager for Microsoft Azure. You guys have an event going on as well, Microsoft Ignite, which has been creating a lot of buzz this year again. As usual, they have a good show, but this year the Cloud certainly has taken front and center. Welcome to theCUBE, and good to see you again. >> Thank you. >> Thank you. >> Alright, so talk about the partnership. You guys, Jagane, deal with all the Cloud guys. You're here with Microsoft. What's going on with Microsoft? Obviously they've been, if you look at the stock price, from 20-something to a complete changeover under the leadership of Satya Nadella. The company has mobilized. The Cloud has got traction, putting a dent in the universe. Certainly, Amazon feels a little bit of pain there. But, in general, a lot more work to do. What are you guys doing together? Share the relationship.
>> So, we just announced a product that's a one-click deployment in the Microsoft Azure Cloud of WANdisco's Fusion replication technology. So, if you've got some data assets, Hadoop or Cloud object stores, on-premise, and you want to create a hybrid or a Cloud environment with Azure, picture this: ours is the only way of doing Active/Active. >> Active/Active. And there is some stuff out there that's looking like Active/Active. DataPlane by Hortonworks. But it's not fully Active/Active. We talked a little bit about that yesterday. >> Jagane: Yes. >> Microsoft, you guys, what's interesting about these guys besides the Active/Active? It's a unique thing. It's an ingredient for you guys. >> Yes, the interesting thing for us is, the biggest problem that we think customers have from a big data perspective is, if you look at the landscape of the ecosystem in terms of open source projects that are available, it's very hard to figure out, a: How do I use this software? And b: How do I install it? So what we have done is created an experience in Azure HDInsight where you can discover these applications within the context of your cluster, and you can install these applications with a one-click install, which installs the application, configures it, and then you're good to go. We think that this is going to sort of increase the productivity of users trying to get sense out of big data. The key challenge we think customers have today is setting up some sort of hybrid environment: how do you connect your on-premise data to move it to the Cloud? And there are different use cases you can have; you can move parts of the data and you can experiment easily in the Cloud. So what we've done is, we've enabled WANdisco as an application on our HDInsight application platform, where customers can install it using a single-click deploy, connect it with the data that's sitting on-prem, and use the Active/Active feature to have both these environments running simultaneously, and they're in sync.
>> So one benefit is the one-click thing; that's on your side, right? You guys are enabling that. So, okay, I get that. That's totally cool. We'll get to that in a second. I want to kind of drill down on that. But what's the benefit to the customers that you guys are having? So, I'm a customer, I one-click, I want some WANdisco Active/Active. Why am I doing it? What does the Cloud change? How does your Cloud change that experience? >> One example of what's going to change is that in an on-premise environment you have a cluster running, but you're kind of limited in what you can do with the cluster, because you've already set up the number of nodes, and the workloads you're running are fairly finite. But what's happening in reality today is, lots of users, especially in the machine learning space, the AI space, and the analytics space, are using a lot of open source libraries and technologies, and they're using them on top of Hadoop and on top of Spark. However, experimenting with these technologies is hard on-prem because it's a locked environment. So we believe, with the Cloud, especially with the WANdisco offering on HDInsight, once you move the data you can start spinning up clusters, you can start installing more open source libraries, experiment, and you can shut down the clusters when you're done. So it's going to increase your efficiency, it's going to allow you to experiment faster, and it's going to reduce cost as well, because you don't have to have the cluster running all the time, and once you are done with your experimentation, you can decide which way you want to go. So, it's going to remove the-- >> Jagane, what's your experience with Azure? A lot of people have been, some people have been critical, and rightfully so. You guys are moving as fast as you can. You can only go as fast as you can, but the success of the Cloud has been phenomenal. You guys have done a great job with the Cloud.
Got to give you props on that. Your customers are benefiting, or Microsoft's customers are benefiting. How's the relationship? Are you getting more customers through these guys? Are you bringing customers from on-prem to the Cloud? How's the customer flow going? >> Almost all of our customers who have on-prem instances of Hadoop are considering the Cloud in one form or the other. Different Clouds have different strengths, as they've found-- >> Interviewer: And different technologies. >> Indeed. And Azure's strengths appear to be the HDInsight piece of it, and as Pranav just mentioned, the cool thing is, you can replicate into the Cloud, start up a 50-node Spark cluster today to run a query that may return results to you really fast. Now, remember, this is data that you can write to both in the Cloud and on-premise. It's kept consistent by our technology. Or tomorrow you may find that somebody tells you Hive with the new Tez enhancements is faster; sure, spin up a hundred-node Hive cluster in the Cloud. HDInsight supports that really well. You're getting consistent data, and your queries will respond much faster than on-premise. >> We've had Oliver Chu on before, with Hortonworks; obviously they're partnering there. HDInsight's been getting a lot of traction lately. Where's that going? We've seen some good buzz on that. Good people talking about it. What's the latest update on your end? >> HDInsight is doing really well. The customers love the ease of creating a cluster using just a few clicks, and the benefits that customers get; clusters are optimized for certain scenarios. So if you're doing data science, you can create a Spark cluster and install open source libraries. We have Microsoft R Server running on Spark, which is a unique offering from Microsoft, which lots of customers have appreciated.
You also have streaming scenarios that you can do using open source technologies; we have Apache Kafka running on our stack, which is becoming very popular from an ingestion perspective. Folks have been-- >> Has the Kubernetes craze come down to your group yet? Has it trickled down? It seems to be going crazy. You hired an amazing person from Google, Brendan Burns, who we've interviewed before. He's part of the original Kubernetes spec; he now works for Microsoft. What's the buzz on the Kubernetes container world there? >> In general, Microsoft Azure has seen great benefits out of it. We are seeing lots of traction in that space. For my role in particular, I focus more on the HDInsight big data space, which is kind of outside of the Kubernetes work. >> And your relationship is going strong with WANdisco? >> Pranav: Yes. >> Right. >> We just launched this offering, we announced it just about yesterday, and we're looking forward to getting customers onto the stack. >> That's awesome. What's your take on the industry right now? Obviously, the partnerships are becoming clearer as people can see there's (mumbles). You're starting to see the notion of infrastructure and services changing. More and more people want services, and then you've got the classic infrastructure, which looks like it's going to be hybrid. That's pretty clear, we see that. Services versus infrastructure: how should customers think about how they architect their environments, so they can take advantage of the Active/Active and also have a robust, clean organization from a personnel standpoint, without a lot of re-skilling going on, but yet get to a hybrid architecture? >> So, it depends; the Cloud gives you lots of options to meet customers where they are. Different customers have different kinds of requirements.
Customers who have specialized some of their applications will probably want to go more of an infrastructure route, but customers also love to have some of the PaaS benefits where, you know, I have a service running and I don't have to worry about the infrastructure: how does patching happen, how do OS updates happen, how does maintenance happen. They want to sort of rely on the Microsoft Azure Cloud provider to take care of it, so that they can focus on their application-specific logic, or business-specific logic, or analytical workloads, and worry about optimizing those parts of the application, because that is their core-- >> It's been great. I want to get your thoughts real quick. Share some color. What's going on inside Microsoft? Obviously, open source has become a really big part of the culture, even just at Ignite. More Linux news is coming. You guys have been involved in Linux. Obviously, open source with Azure; a ton of stuff, I know, is built in the Microsoft Cloud on open source. You're contributing now to Kubernetes, as I mentioned earlier. Seems to be a good cultural shift at Microsoft. What's the vibe on open source internally at Microsoft? Can you share just some anecdotal insight into what the vibe is like inside, around open source? >> The vibe has increased quite a lot around open source. As you rightly mentioned, just recently we announced SQL Server on Linux as well, at the Ignite conference. You can also deploy SQL Server in a Docker container, which is quite revolutionary if you think about how far we have come. Open source is so pervasive; it's used in almost all of these projects. Microsoft employees are contributing back to open source projects in terms of bug fixes, feature requests, and documentation updates.
It's a very, very active community, and by and large I think customers are benefiting a lot, because there are so many folks working together on open source projects and making them successful. And especially around the Azure stack, we also ensure that you can run these open source workloads natively in the Cloud. From an enterprise perspective, you get the best of both worlds: you get the latest innovations happening in open source, plus the reliability of the managed platform that Azure provides at enterprise scale. >> So again, obviously the Microsoft partnership is huge, all the Clouds as well. Where do you want to take the relationship with Microsoft? What happens next? You guys are just going to continue to do business; the one-click's nice, I have some questions on that. What happens next? >> So, I see our partnership becoming deeper. We see the value that HDInsight brings to the ecosystem, and all of that value is captured by the data. At the end of the day, if you have stale data, if you have data that you can't rely on, the applications are useless. So we see ourselves getting more and more deeply embedded in the system. We see ourselves as an essential part of the data strategy for Azure. >> Yeah, we see continuous integration as a development concept, and continuous analytics as a term that's being kicked around. We were talking yesterday about, here in theCUBE, real time. I want some data real time, and IT goes back, "Here it is, it's real time!" No, but the data's three weeks old. I mean, real time (laughs) is a word that doesn't just mean I get to see it really fast, a low-latency response. Well, that's not the data I want. I meant the data in real time, not you giving me a real-time query. So again, this brings up a mind shift in terms of the new way to do business in the Cloud and hybrid. It's changing the game.
As customers scratch their heads and try to figure out how to make their organizations more DevOps oriented, what advice do you guys have for the managers who are really getting behind it, really want to make change, who kind of have to herd the cats a little bit, and maybe break out security and put it in its own group? Or you come in and say, okay IT guys, we're going to change our operating model, even on-prem; we'll use some bursting to the Cloud; Azure's got 365 on there; a lot of coolness developing. What's the advice for the mindset of the change agents out there that are going to do the transformation? >> My advice would be, if you've done the same thing by hand over two times, it's time you automated it, but-- >> Interviewer: Two times?! >> Two times. >> No three rule? Three strikes you're out? >> You're saying two, contrarian. >> That's a careful statement. Because if you try automating something that you've never actually tried by hand, that's a disaster as well. A couple times, so you know how it's supposed to work. >> Interviewer: Get a good groove on it. >> Right, then you optimize, you automate, and then you turn the knobs. So, you try a hundred-node cluster; maybe that's going to be faster. Maybe after a certain point you don't get any improvements, so you know how to-- >> So take some baby steps, and one easy way to do it is to automate something that you've done. >> Jagane: Yes, exactly. >> That's almost risk-free, relatively speaking. Thoughts, advice to change agents out there. This is your industry hat on. You can take your Microsoft hat off. >> Baby steps. So you start small, you get familiar with the environment, and your toolsets are provided so that you get a consistent experience with what you were doing on-prem, and sort of in a hybrid space.
And the whole idea is as you get more comfortable, the benefits of the Cloud far outweigh any sort of cultural changes that need to happen-- >> Guys, thanks for coming on theCUBE, really appreciate it. Thoughts on Big Data NYC this week? What do you think? >> I think it's a conference that has a lot of Cloud hanging over it and people are scratching their heads. Including vendors, customers, everybody scratching their head, but there is a lot of Cloud in this conference, although this is not a Cloud conference. >> Yeah, they're trying to make it an AI conference. A lot of AI washing, certainly we're seeing that everywhere. But again, nothing wrong with hyping up AI. It's good for society. It really is cool, but still, talking about baby steps, AI is still not there. It seems like, for AI from when I got my CS degree in the 80's, not a lot of innovation; well, machine learning is getting better, but there's a long way to go on AI. Don't you think? >> Yes, you know a few of the announcements we've made this week are all about making it easier for developers to get started with AI and machine learning, and our whole hope is that these investments that we've done, and the Azure machine learning improvements and the companion app and the workbench, allow you to get started very easily with AI and machine learning models, and you can apply and build these models, do a CICD process and deploy these models, and be more effective in the space. >> Yeah and also the tooling market has kind of gotten out of control. We were just joking the other day, that there's this tool shed mindset, where everything is in the tool shed and people bought a hammer and turned it into a lawnmower. So it's like, you got to be careful which tools you have. Think about a platform. Think holistically, but if you take the baby steps and implement it, certainly it's there. My personal opinion, I think the Cloud is the equalizer. Cloud can bring compute power that changes what a tool was built for. 
Even go back six years, the tools that were out there even six years ago are completely changed by the impact of unlimited, potentially unlimited capacity horsepower. So, okay that resets a little bit. You agree? >> I do. I totally agree. >> Who wins, who loses on the reset? >> The Cloud is an equalizer, but there is a mindset shift that goes with that. Those who can adapt to the mindset shift will win. Those who can not and are still clinging to their old practices will have a hard time. >> Yeah, it's exciting. If you're still reinventing Hadoop from 2011 then you're probably not in good shape right now. >> Jagane: Not a good place to be. >> Using Hadoop is great for batch, but you can't make that be a lawnmower. That's my opinion. Okay, thanks for coming on. I appreciate it (laughs) You're smiling, you got something that you, no? >> Pranav: (laughs) Thank you so much for that comment. >> Yeah, tool sheds are out there, be careful. Guys do your job. Congratulations on your partnership, appreciate it. This is theCUBE, live in New York. More after this short break. We'll be right back.

Published Date : Sep 27 2017


Jagane Sundar, WANdisco | BigData NYC 2017


 

>> Announcer: Live from midtown Manhattan, it's theCUBE, covering BigData New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Okay welcome back everyone here live in New York City. This is theCUBE special presentation of our annual event with theCUBE and Wikibon Research called BigData NYC, it's our own event that we have every year, celebrating what's going on in the big data world now. It's evolving to all data, cloud applications, AI, you name it, it's happening. In the enterprise, the impact is huge for developers, the impact is huge. I'm John Furrier, cohost of the theCUBE, with Peter Burris, Head of Research, SiliconANGLE Media and General Manager of Wikibon Research. Our next guest is Jagane Sundar, who's the CTO of WANdisco, Cube alumni, great to see you again as usual here on theCUBE. >> Thank you John, thank you Peter, it's great to be back on theCUBE. >> So we've been talking the big data for many years, certainly with you guys, and it's been a great evolution. I don't want to get into the whole backstory and history, we covered that before, but right now is a really, really important time, we see you know the hurricanes come through, we see the floods in Texas, we've seen Florida, and Puerto Rico now on the main conversation. You're seeing it, you're seeing disasters happen. Disaster recovery's been the low hanging fruit for you guys, and we talked about this when New York City got flooded years and years ago. This is a huge issue for IT, because they have to have disaster recovery. But now it's moving more beyond just disaster recovery. It's cloud. What's the update from WANdisco? You guys have a unique perspective on this. >> Yes, absolutely. So we have capabilities to replicate between the cloud and Hadoop multi data centers across geos, so disasters are not a problem for us. And we have some unique technologies we use. 
One of the things we do is we can replicate in an active-active mode between different cloud vendors, between cloud and on-prem Hadoop, and we are the only game in town. Nobody else can do that. >> So okay let me just stop right there. When you say the only game in town I got a little skeptic here. Are you saying that nobody does active-active replication at all? >> That is exactly what I'm saying. We had some wonderful announcements from Hortonworks, they have a great product called the Dataplane. But if you dig deep, you'll find that it's actually an active-passive architecture, because to do active-active, you need this capability called the Paxos algorithm for resolving conflict. That's a very hard algorithm to implement. We have over 10 years' experience in that. That's what gives us our ability to do this active-active replication, between clouds, between on-prem and cloud. >> All right so just to take that a step further, I know we're having a CTO conversation, but the classic cliche is skate to where the puck is going to be. So you kind of didn't just decide one morning you're going to be the active-active for cloud. You kind of backed into this. You know the world spun in your direction, the puck came to you guys. Is that a fair statement? >> That is a very fair statement. We've always known there's tremendous value in this technology we own, and with the global infrastructure trends, we knew that this was coming. It wasn't called the cloud when we started out, but that's exactly what it is now, and we're benefiting from it. >> And the cloud is just a data center, it's just, you don't own it. (mumbles) Peter, what's your reaction to this? Because when he says only game in town, implies some scarcity. >> Well, WANdisco has a patent, and it actually is very interesting technology, if I can summarize very quickly. 
You do continuous replication based on writes that are performed against the database, so that you can have two writers and two separate databases and you guarantee that they will be synchronized at some point in time, because you guarantee that the writing of the logs and the messaging to both locations >> Absolutely. >> in order, which is a big issue. You guys put a stamp on the stuff, and it actually writes to the different locations with order guaranteed, and that's not the way most replication software works. >> Yes, that's exactly right. That's very hard to do, and that's the only way for you to allow your clients in different data centers to write to the same data store, whether it's a database, a Hadoop folder, whether it's a bucket in a cloud object store, it doesn't matter. The core fact remains, the Paxos algorithm is the only way for you to do active-active replication, and ours is the only Paxos implementation that can work over the-- >> John: And that's patented by you guys? >> Yes, it's patented. >> And so for someone to replicate that, they'd have to essentially reverse engineer it and have a little twist on it to get around the patents. Are you licensing the technology or are you guys hoarding it for yourselves? >> We have different ways of engaging with partners. We are very reasonable with that, and we work with several powerful partners. >> So you partner with the technology. >> Yes. >> But the key thing, John, in answer to your question is that it's unassailable. I mean there's no argument that, as companies move more towards a digital way of doing things, largely driven by what customers want, your data becomes more of an asset. As your data becomes more of an asset, you make money by using that data in more places, more applications and more times. 
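Peter's summary of ordered, log-based replication can be sketched in a few lines. This is a toy illustration of the general principle (deterministic replay of one agreed write order), not WANdisco's implementation; the keys and values are invented for the example.

```python
# Toy illustration of ordered replication: if every replica applies the
# SAME writes in the SAME agreed order, the replicas end up identical.

def apply_log(ordered_log):
    """Apply a totally ordered sequence of (key, value) writes to a fresh replica."""
    replica = {}
    for key, value in ordered_log:
        replica[key] = value
    return replica

# Two writers in different data centers each issue writes concurrently...
writes_dc1 = [("balance", 100), ("owner", "alice")]
writes_dc2 = [("balance", 250)]

# ...but a consensus layer (Paxos, in WANdisco's case) agrees on ONE global
# order before anything is applied. Here we simply pick an order; the point
# is that every replica replays the identical sequence.
agreed_order = [writes_dc1[0], writes_dc2[0], writes_dc1[1]]

replica_east = apply_log(agreed_order)
replica_west = apply_log(agreed_order)

assert replica_east == replica_west  # same order in, same state out
print(replica_east)  # {'balance': 250, 'owner': 'alice'}
```

All of the difficulty hidden inside `agreed_order` is getting every site to agree on that one sequence in the first place, which is where the consensus protocol comes in.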
That is possible with data, but the problem is you end up with consistency issues, and for certain applications it's not an issue; if you're basically reading data, it's not an issue. But the minute that you're trying to write on behalf of a particular business event or a particular value proposition, now you have a challenge, you are limited in how you can do it unless you have this kind of a technology. And so this notion of continuous replication in a world that's going to become increasingly dependent upon data, data that is increasingly distributed, data that you want to ensure has common governance and policy in place, technologies like WANdisco provides are going to be increasingly important to the overall way that a business organizes itself, institutes its work and makes sure it takes care of its data assets. >> Okay, so my next question then, thanks for the clarification, it's good input there and thanks for summarizing it like that, 'cause I couldn't have done that. But when we last talked, I always was enamored by the fact that you guys have the data center replication thing down. I always saw that as a great thing for you guys. Okay, I get that, that's an on-premise situation, you have active-active, good for disaster recovery, lots of use cases, people should be beating down your door 'cause you have a better mousetrap, I get that. Now how does that translate to the cloud? So take me through why the cloud now fits nicely with that same paradigm. >> So, I mean, these are industry trends, right. What we've found is that the cloud object stores are very, very cost effective and efficient, so customers are moving towards that. They're using their Hadoop applications but on cloud object stores. Now it's trivial for us to add plugins that enable us to replicate between a cloud object store on one side, and Hadoop on the other side. It could also be another cloud object store from a different cloud provider on the other side. 
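The plugin idea Jagane describes (one replication engine, many storage backends) can be sketched with toy adapters. The class and method names here are invented for illustration and are not WANdisco's actual API; the sketch only shows the shape of the idea: one agreed log of operations, translated per backend.

```python
# Hypothetical sketch of pluggable storage backends: the replication engine
# applies one ordered log of operations, and per-backend adapters translate
# them into whatever each store understands (HDFS-style paths, object keys).

class HdfsLikeBackend:
    """Stands in for a Hadoop file system; stores path -> bytes."""
    def __init__(self):
        self.files = {}
    def put(self, key, data):
        self.files["/data/" + key] = data
    def snapshot(self):
        # Normalize native paths back to logical keys for comparison.
        return {path.split("/")[-1]: data for path, data in self.files.items()}

class ObjectStoreBackend:
    """Stands in for an S3/WASB-style object store; stores object key -> bytes."""
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = data
    def snapshot(self):
        return dict(self.objects)

def replicate(ordered_log, backends):
    """Apply the agreed, totally ordered log to every backend via its adapter."""
    for key, data in ordered_log:
        for backend in backends:
            backend.put(key, data)

hdfs, s3 = HdfsLikeBackend(), ObjectStoreBackend()
log = [("events.csv", b"v1"), ("events.csv", b"v2"), ("model.bin", b"m1")]
replicate(log, [hdfs, s3])

# Despite different native layouts, both stores hold equivalent content.
assert hdfs.snapshot() == s3.snapshot() == {"events.csv": b"v2", "model.bin": b"m1"}
```

Adding a third storage system then means writing one more adapter, not changing the replication logic, which is the point of the plugin design being described.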
Once you have that capability, now customers are freed from lock-in from either a cloud vendor or a Hadoop vendor, and they love that, they're looking at it as another way to leverage their data assets. And we enable them to do that without fear of lock-in from any of these vendors. >> So on the cloud side, the regions have always been a big thing. So we've heard Amazon had a region down here, and they had to fix it. We saw at VMworld, VMware push their solution to only one western region. What's the geo landscape look like in the cloud? Does that relate to anything in your tech? >> So yes, it does relate, and one of the things that people forget is that when you create an Amazon S3 bucket, for example, you specify a region. Well, but this is the cloud, isn't it worldwide? Turns out that object store actually resides in one region, and you can use some shaky technologies like cross-region replication to eventually get the data to the other region. 
Because you can almost get kind of sucked into the complexities of it, and the nuances of cloud and everything as Peter laid out, it's pretty complex even as he simplified it. Who buys this? (laughs) I mean, who's the guy, is it the IT department, is it the ops guy, is it the facilities, who... >> So we sell to the IT departments, and they absolutely love the technology. But to go back to your initial statement, we have all these disasters happening, you know, hopefully people are all doing reasonably okay at the end of these horrible disasters, but if you're an enterprise of any size, it doesn't have to be a big enterprise, you cannot go back to your users or customers and say that because of a hurricane you cannot have access to your data. That's sometimes legally not allowed, and other times it's just suicide for a business >> And HPE in Houston, it's a huge plant down there. >> Jagane: Indeed. >> They got hit hard. >> Yep, in those sort of circumstances, you want to make sure that your data is available in multiple data centers spread throughout the world, and we give you that capability. >> Okay, what are some of the successes? Let's talk through now, obviously you've got the technology, I get that. Where's the stakes in the ground? Who's adopting it? I know you do a lot of biz dev deals. I don't know if they're actually OEM-type deals, or they're just licensing deals. Take us through to where your successes are with this technology. >> So, biz dev wise, we have a mix of OEM deals and licenses and co-selling agreements. The strong ones are all OEMs, of course. We have great partnerships with IBM, Amazon, Microsoft, just wonderful partnerships. The actual end customers, we started off selling mostly to the financial industry because they have a legal mandate, so they were the first to look into this sort of a thing. But now we've expanded into automobile companies. 
A lot of the auto companies are generating vast amounts of data from their cars, and you can't push all that data into a single data center, that's just not reasonable. You want to push that data into a single data store that's distributed across the world in just wherever the car is closest to. We offer that capability that nobody else can, so that we've got big auto manufacturers signed up, we've got big retailers signed up for exactly the same capability. You cannot imagine ingesting all that data into a single location. You want this replicated across, you want it available no matter what happens to any single region or a data center. So we've got tremendous success in retail, banking, and a lot of this is through partnerships again. >> Well congratulations, I got to ask, you know, what's new with you guys? Obviously you have success with the active-active. We'll dig into the Hortonworks things to check your comment around them not having it, so we'll certainly look with the Dataplane, which we like. We interviewed Rob Bearden. Love the announcement, but they don't have the active-active, we're going to document that, and get that on the record. But you guys are doing well. What's new here, what's in New York, what are some of your wins, can you just give a quick update on what's going on at WANdisco? >> Okay, so quick recap, we love the Hortonworks Dataplane as well. We think that we can build value into that ecosystem by building a plugin for them. And we love the whole technology. I have wonderful friends there as well. As for our own company, we see all of our, a lot of our business coming from cloud and hybrid environments. It's just the reality of the situation. 
You had, you know, 20 years ago, you had NFS, which was the great upender of all storage, but turned out to be very expensive, and then, 10 or seven years ago, you had HDFS come along, and that upended the cost model of NFS and SANs, which those industries were still working their way through. And now we have cloud object stores, which have upended the HDFS model, it's much more cost-efficient to operate using cloud object stores. So we will be there, we have replication products for that. >> John: And you're in the major clouds, you in Azure? >> Yes, we are in Azure. >> Google? >> Jagane: Yes, absolutely. >> AWS? >> AWS, of course. >> Oracle? >> Oracle, of course. >> So you got all the top four companies. >> We're in all of them. >> All right, so here's the next question is, >> And you're also in IBM stuff too. >> Yes, we're built tightly into IBM. >> So you've got a pretty strong legacy >> And a monopoly. >> On the mainframe. >> Like the fiber channel of replication. (John and Jagane laugh) That was a bad analogy. I mean it's like... Well, I mean fiber channel has only limited suppliers 'cause they have unique technology, it was highly important. 
So I got to ask you, you're a student of the industry, I know that and knowing you personally. What's been the success formula that keeps the winners around today, and what do people need to do going forward? 'Cause we've seen the train wreck, we've seen the dead bodies in the industry, we've kind of seen what's happened, there've been some survivors. Why did the current list of characters and companies survive, and what's the winning formula in your opinion to stay relevant as big data grows in a huge way from IoT to AI cloud and everything in between? >> I'll quote Stephen Hawking in this. Intelligence is the capability to adapt to changes. That's what keeps industries, that's what keeps companies, that what keeps executives around. If you can adapt to change, if you can see things coming, and adapt your core values, your core technology to that, you can offer customers a value proposition that's going to last a long time. >> And in a big data space, what is that adaptive key focus, what should they be focused on? >> I think at this point, it's extracting information from this volume of data, whether you use machine learning in the modern days, or whether it was simple hive queries, that's the value proposition, and making sure the data's available everywhere so you can do that processing on it, that remains the strength. >> So the whole concept of digital business suggests that increasingly we're going to see our assets rendered in some form as data. >> Yes. >> And we want to be able to ensure that that data is able to be where it needs to be when it needs to be there for any number of reasons. It's a very, very interesting world we're entering into. >> Peter, I think you have a good grasp on this, and I love the narrative of programming the world in real time. What's the phrase you use? It's real time but it's programming the world... Programming the real world. >> Yeah, programming the real world. 
>> That's a huge, that means something completely, it's not a tech, it's a not a speed or feed. >> Well the way we think about it, is that we look at IoT as a big information transducer, where information's in one form, and then you turn it into another form to do different kinds of work. And that big data's a crucial feature in how you take data from one form and turn it into another form so that it can perform work. But then you have to be able to turn that around and have it perform work back in the real world. There's a lot of new development, a lot of new technology that's coming on to help us do that. But any way you look at it, we're going to have to move data with some degree of consistency, we're still going to have to worry about making sure that if our policy says that that action needs to take place there, and that action needs to take place there, that it actually happens the way we want it to, and that's going to require a whole raft of new technologies. We're just at the very beginning of this. >> And active-active, things like active-active in what you're talking about really is about value creation. >> Well the thing that makes active-active interesting is, again, borrowing from your terms, it's a new term to both of us, I think, today. I like it actually. But the thing that makes it interesting is the idea that you can have a source here that is writing things, and you can have a source over there that are writing things, and as a consequence, you can nonetheless look at a distributed database and keep it consistent. >> Consistent, yeah. >> And that is a major, major challenge that's going to become increasingly a fundamental feature of our digital business as well. >> It's an enabling technology for the value creation and you call it work. >> Yeah, that's right. >> Transformation of work. Jagane, congratulations on the active-active, and WANdiscos's technology and all your deals you're doing, got all the cloud locked up. What's next? 
Well, are you going to lock up the edge? You're going to lock up the edge too, and the cloud. >> We do like this notion of the edge cloud and all the intermediate steps. We think that replicating data between those systems, or running consistent compute across those systems, is an interesting problem for us to solve. We've got all the ingredients to solve that problem. We will be on that. >> Jagane Sundar, CTO of WANdisco, back on theCUBE, bringing it down. New tech, a whole new generation of modern apps and infrastructure happening in distributed and decentralized networks. Of course theCUBE's got it covered for you, and more live coverage here in New York City for BigData NYC, our annual event, theCUBE and Wikibon here in Hell's Kitchen in Manhattan, more live coverage after this short break.

Published Date : Sep 27 2017


Jagane Sundar, WANdisco - BigDataNYC - #BigDataNYC - #theCUBE


 

>> Announcer: Live from New York, it's theCUBE covering BigData New York City 2016, brought to you by headline sponsors Cisco, IBM, Nvidia, and our ecosystem sponsors. Now here are your hosts, Dave Vellante and Peter Burris. >> Welcome back to theCUBE everybody. This is BigData NYC and we are covering it wall to wall, we've been here since Monday evening. We were with Nvidia, talking about deep learning, machine learning. Yesterday we had a full slate, we had eight data scientists up on stage yesterday, and then we covered the IBM event last night, the rooftop party. Saw David Richards there, hanging out with him, and wall to wall today and tomorrow. Jagane Sundar is here, he is the CTO of WANdisco, great to see you again Jagane. >> Thanks for having me Dave. >> You're welcome. It's been a while since you and I sat down, and I know you were on theCUBE recently at Oracle Headquarters, and I was happy to see you there and see the deals that are going on. You've got good stuff going on with IBM, good stuff going on with Oracle, the Cloud is eating the world as we sort of predicted and knew, even when everybody wanted to put their head in the sand, but you guys had to accommodate that, didn't you. >> We did, and if you remember us from a few years ago, we were very, very interested in the Hadoop space, but along the journey we realized that our replication platform is actually much bigger than Hadoop. And the Cloud is just a manifestation of that vision. We had this ability to replicate data, strongly consistent, across wide area networks in different data centers and across storage systems, so you can go from HDFS to a Cloud storage system like S3 or Azure WASB, and we will do it with strong consistency. And that turned out to be a bigger deal than actually providing just replication for the Hadoop platform. So we expanded beyond our initial Hadoop focus and now we're big in the Cloud. 
We replicate data to many Cloud providers, and customers use us for many use cases like disaster recovery, migration, active/active, Cloud bursting, all of those interesting use cases. >> So any time I get you on theCUBE I like to refresh the 101 for me and for the audience that may not be familiar with it, but you say strongly consistent, versus you hear the term eventual consistency, >> Jagane: Correct. >> What's the difference, why is the latter inadequate for the applications that you're serving? >> Right, so when people say eventually consistent, what they don't remember is that eventually consistent systems often have different data in the different replicas, and once in a while, once every five minutes or 15 minutes, they have to run an anti-entropy process to reconcile the differences, and entropy is total randomness, right, if you go back to your physics, high school physics. What you're really talking about is having random data and once every 10 minutes making it reconcile, and the reconciliation process is very messy, it's like last write wins, and the notion of time becomes important, how do you keep time accurate between those? Companies like Google have wonderful infrastructure where they have GPS and atomic clocks and they can do a better job, but for the regular enterprise user that's a hard problem, so often you get wrong data that's reconciled. So asking the same query you may get different answers in your different replicas. That's a bad sign, you want it consistent, so you can guarantee results. >> Dave: And you've done this with math, right? >> Exactly, our basis is an algorithm called Paxos, which was invented by a gentleman called Leslie Lamport back in '89, but it took many decades for that algorithm to be widely understood. Our own chief scientist spent over a decade developing it, adding enhancements to make it run over the wide area network. 
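For readers who want to see the shape of the algorithm Jagane names, here is a minimal single-decree Paxos sketch: in-memory, no networking, no failure handling, and greatly simplified compared to any production system, WANdisco's WAN-hardened implementation included. It illustrates the safety property that matters for replication: once a majority has accepted a value, later proposers are forced to re-propose that same value.

```python
# Minimal single-decree Paxos sketch (illustrative only).

class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number promised
        self.accepted = None        # (proposal_number, value) or None

    def prepare(self, n):
        # Phase 1b: promise not to accept lower-numbered proposals,
        # reporting any value already accepted.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    """Run one round: phase 1 (prepare), then phase 2 (accept). Returns the value or None."""
    majority = len(acceptors) // 2 + 1
    promises = [acc.prepare(n) for acc in acceptors]
    granted = [prior for ok, prior in promises if ok]
    if len(granted) < majority:
        return None
    # Safety rule: if any acceptor already accepted a value, propose the one
    # with the highest proposal number; a chosen value can never change.
    prior_accepts = [p for p in granted if p is not None]
    if prior_accepts:
        value = max(prior_accepts)[1]
    accepted = sum(acc.accept(n, value) for acc in acceptors)
    return value if accepted >= majority else None

acceptors = [Acceptor() for _ in range(3)]
chosen = propose(acceptors, n=1, value="replicate-to-us-east")
# A later, competing proposer is forced to re-propose the chosen value.
rechosen = propose(acceptors, n=2, value="replicate-to-eu-west")
print(chosen, rechosen)  # replicate-to-us-east replicate-to-us-east
```

A replication engine runs such an agreement once per slot in the write log, which is what yields the single global order of writes discussed in these interviews.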
The end result is a strongly consistent system, mathematically proven, that runs over the wide area network and is completely resistant to failures of all sorts. >> That allows you to create the same type of availability and data consistency as, you mentioned, Google with the atomic clocks, Spanner I presume. This is fascinating, I mean when the paper came out my eyes were bleeding reading it, but that's the type of capability that you're able to bring to enterprises, right? >> That's exactly right, we can bring similar capabilities across diverse networks. You can have regular networking gear, time synchronized by NTP, out in the Cloud, things running in a virtual machine where time drifts most of the time, people don't realize that VMs are pretty bad at keeping time, and all you get up in the Cloud is VMs. Across all those environments we can give you strongly consistent replication at the same quality that Google does with their hardware. So that's the value that we bring to the Fortune 500. >> So increasingly enterprises are recognizing that data has, I don't want to say intrinsic value, but data is a source of value in context all by itself. Independent of any hardware, independent of any software. That it's something that needs to be taken care of, and you guys have an approach for ensuring that important aspects of it are better taken care of. Not the least of which is that you can provide an option to a customer who may make a bad technology choice one day to make a better technology choice the next day and not be too worried about dead-ending themselves. I'm reminded of the old days when somebody who was negotiating an IBM mainframe deal would put an Amdahl coffee cup in front of IBM, or put an Oracle coffee cup in front of SAP. Do you find customers metaphorically putting a WANdisco coffee cup in front of those different options and saying these guys are ensuring that our data remains ours? 
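For readers who want the 101 behind the 101: the Paxos algorithm Jagane credits can be sketched in its simplest single-decree form. This is a toy, in-process sketch for intuition; the production systems he describes add leader election, multi-decree sequencing, and WAN-specific enhancements that are not shown here:

```python
# Minimal single-decree Paxos sketch. A value is "chosen" once a majority
# (quorum) of acceptors accepts it; later proposers are forced to re-propose
# the already-chosen value, which is the safety property that makes
# replicas strongly consistent.

class Acceptor:
    def __init__(self):
        self.promised = -1      # highest proposal number promised
        self.accepted = None    # (number, value) of last accepted proposal

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    """Run one Paxos round; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1
    # Phase 1: gather promises from a majority.
    promises = [a.prepare(n) for a in acceptors]
    granted = [acc for ok, acc in promises if ok]
    if len(granted) < quorum:
        return None
    # Safety rule: if any acceptor already accepted a value, adopt the
    # value from the highest-numbered accepted proposal.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2: ask acceptors to accept; succeed on a majority.
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= quorum else None

acceptors = [Acceptor() for _ in range(5)]
assert propose(acceptors, n=1, value="replica-set-A") == "replica-set-A"
# A competing proposer with a higher number still learns the chosen value:
assert propose(acceptors, n=2, value="replica-set-B") == "replica-set-A"
```

The key contrast with last-write-wins: agreement comes from quorum intersection, not from timestamps, so NTP-grade or drifting VM clocks cannot corrupt the outcome.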
>> Customers are a lot more sophisticated now, the scenarios that you pointed out are very very funny, but what customers come to us for is the exact same thing. The way they ask it is: I want to move to Cloud X, but I want to make sure that I can also run on Cloud Y, and I want to do it seamlessly without any downtime on my on-prem applications that are running. We can give them that. Not only are they building a disaster recovery environment, often they're experimenting with multiple Clouds at the same time, and may the better Cloud win. That puts a lot of competition and pressure on the actual Cloud applications they're trying. That's a manifestation, in modern Cloud terms, of the coffee-cup-in-front-of-the-competitor scenario that you just pointed out. Very funny, but this is how customers are doing it these days. >> So are you using, or are they starting to, obviously you are able to replicate with strong fidelity, strong consistency, large volumes of data. Are you starting to see customers, based on that capability, actually starting to redesign how they set up their technology plant? >> Absolutely. When customers were talking about hybrid Cloud, which was pretty well hyped a year or so ago, they basically had some data on-prem and some other data in the Cloud and they were doing stuff, but what we brought to them was the ability to have the same data both on-prem and in the Cloud. Maybe you had a weekly analytics job that took a lot of resources. You'd burst that out into the Cloud and run it up there, move the result of that analytics job back on-prem. You'd have it with strong consistency. The result is that true hybrid Cloud is enabled only when you have the same exact data available in all of your Cloud locations. We're the only company that can provide that, so we've got customers that are expanding their Cloud options because of the data consistency we offer. >> And those Cloud options are obviously increasing >> Jagane: They are. 
But there's also a recognition that, as we gain more experience with Cloud, some workloads are better suited than others as we move up there. Now Oracle with some of their announcements last week may start to push the envelope on that a little bit, but as you think about the need for moving large volumes of data with strong consistency, what types of applications do you think people are focusing on? Is it mainly big data, or are there other application styles or job types that you think are going to become increasingly important? >> So we've got much more than big data. One of the big sources of leads for us now is our capability to migrate NetApp filers up into the Cloud, and that has suddenly become very important. An example I'd like to give is a big financial firm that has all of its binaries and applications and user data in NetApp filers, while the actual data is in HDFS on-prem. They're moving their binaries from the NetApp up into a specific Cloud's equivalent of the filer, and the big data part of it from HDFS up into Cloud object store. We are the only platform that can deal with both in the strongly consistent manner that I've talked about, and we're a single replication platform, so that gives them the ability to make that sort of migration with very low risk. One of the attributes of our migration is that we do it with no downtime. You don't have to take your on-prem environment offline in order to do the migration, so they are doing that, and we see a lot of business from those sorts of migration efforts where people have data in NAS filers, people have data in other non-HDFS storage systems. We're happy to migrate all of those. Our replication platform approach, which we've taken in the last year and a half or so, is really paying off in that respect. >> And you couldn't do that with conventional migration techniques because it would take too long, you'd have to freeze the applications? 
A couple of things: one, you'd probably have to take the applications offline; second, you'd be using tools of the periodic-synchronization variety such as rsync, and anybody in devops or operations who's ever used rsync across the wide area network will tell you how bad that experience is. It really is a very bad experience. We've got the capability to migrate NetApp filer data without imposing a load on the NetApp on-prem, so we can do it without pounding the crap out of the NetApp server such that it can't offer service to its existing customers. Very low impact on the network configuration, application configuration. We can go in, start the migration without downtime; maybe it takes two, three days for the data to get up over there because of the WAN link. After that is done, you can start playing with it up in the Cloud. And you can cut over seamlessly, so there's no real downtime. That's the capability we've seen. >> But you've also mentioned one data type, binaries, they can't withstand error propagation. >> Jagane: Absolutely. >> And so being able to go to a customer and say you're going to have to move these a couple times over the course of the next n-months or years, as a consequence of the new technology that's now available, and we can do so without error propagation, is going to have a big impact on how well their IT infrastructure, their IT asset base, runs in five years. >> Indeed, indeed. That's very important. Having the ability to actually start the application, having the data in a consistent and true form so you can start, for example, the database and have it mount the actual data so you can use it up in the Cloud, those are capabilities that are very important to customers. >> So there's another application. If you think about it, you tend to handle more bulk movement, so the question I'm going to ask is: what is the low threshold in terms of specific types of data movement? Here's why I'm asking. 
IoT data is a data source, or a use-case, that often has the most stringent physical constraints possible. Time, speed of light, has an implication, but also very importantly, this notion of error propagation really matters. If you go from a sensor to a gateway to another gateway to another gateway, you will lose bits along the way if you're not very careful. >> Correct. >> And in a nuclear power plant, that doesn't work that way. >> Jagane: Yeah. >> Now we don't have to just look at a nuclear power plant as an example, but increasingly industrial IoT is starting to dramatically impact not just life-and-death circumstances but business success or failure. What types of smaller-batch use-cases do you guys find yourselves operating in, in places like IoT, where this notion of error control and strong consistency is so critical? >> So one of the most popular applications that use our replication is Spark and Spark Streaming, which as you can imagine is a big part of most IoT infrastructure. We can do replication such that you ingest into the closest data center; you go from your server or your car or whatever to the closest data center, you don't have to go multiple hops. We will take care of consistency from there on. What that gives you is the ability to say: I have 12 data centers with my IoT infrastructure running, one data center goes down, you don't have downtime at all. It's only the data that was generated inside that data center that's lost. All client machines connecting to that data center will simply connect to another data center, strong replication continues, and this gives you the ability to ingest at very large volumes while still maintaining consistency. IoT is a big deal for us, yes. >> We're out of time but I got a couple of last-minute questions if I may. So when you integrate with IBM, Oracle, what kind of technical issues do you encounter, what kind of integration do you have to do, is it lightweight, heavyweight, middleweight? 
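The closest-data-center ingestion pattern described above can be sketched as a simple client-side routing policy: write to the lowest-latency healthy ingest point, and let the replication layer (the consensus machinery, not shown) handle consistency from there. Data-center names and latencies below are invented for illustration:

```python
# Sketch of nearest-data-center ingestion with failover, as described in
# the IoT discussion above. The replication layer that keeps the data
# centers consistent is assumed and not modeled here.

def pick_ingest_target(latencies_ms, healthy):
    """Choose the lowest-latency data center that is currently up."""
    candidates = {dc: ms for dc, ms in latencies_ms.items() if dc in healthy}
    if not candidates:
        raise RuntimeError("no healthy data center reachable")
    return min(candidates, key=candidates.get)

# Hypothetical measured latencies from one client.
latencies = {"us-east": 12, "eu-west": 85, "ap-south": 140}

# Normal operation: the client ingests into the closest data center.
assert pick_ingest_target(latencies, healthy={"us-east", "eu-west", "ap-south"}) == "us-east"

# us-east goes down: clients simply reconnect to the next-closest center;
# only data generated inside the failed data center is at risk.
assert pick_ingest_target(latencies, healthy={"eu-west", "ap-south"}) == "eu-west"
```

The design point this illustrates: because replication is strongly consistent everywhere, the client's choice of ingest point is purely a latency/availability decision, never a correctness one.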
It's middleweight, I would say. IBM is a great example: they have a deep integration with our product, and some of the authentication technology they use was more advanced than what was available in open source at that time. We did a little bit of work, and they did a little bit of work, to make that work, but other than that, it's a pretty straightforward process. The end result is that they have a number of applications where this is a critical part of their infrastructure. >> Right, and then the roadmap. What can you tell us about what we should look for in the future, what kind of problems are you going to be solving? >> So we look at our platform as the best replication engine in the world. We're building an SDK, we expect custom plugins for other applications, we expect more high-speed streaming data such as IoT data, and we want to be the choice for replication. As for the plugins themselves, they're getting easier and easier to build, so you'll see wide coverage from us. >> Jagane, thanks so much for coming to theCUBE, always a pleasure to have you. >> Thank you for having me. >> You're welcome. Alright, keep it right there everybody, we'll be back to wrap. This is theCUBE, we're live from NYC. We'll be right back. (upbeat electronic music)

Published Date : Sep 29 2016

