Nipun Agarwal, Oracle | CUBE Conversation

(bright upbeat music) >> Hello everyone, and welcome to this special exclusive CUBE Conversation, where we continue our coverage of the trends of the database market. With me is Nipun Agarwal, who's the vice president of MySQL HeatWave and advanced development at Oracle. Nipun, welcome. >> Thank you Dave. >> I love to have technical people on theCUBE to educate, debate, inform, and we've extensively covered this market. We were all over the Snowflake IPO and at that time, I remember, I challenged organizations: bring your best people, because I want to better understand what's happening in database. After Oracle kind of won the database wars 20 years ago, database kind of got boring. And then it got really exciting with the big data movement, and all the not only SQL stuff coming out, and Hadoop and blah, blah, blah. And now it's just exploding. You're seeing huge investments from many of your competitors, VCs are trying to get in on the action. Meanwhile, as I've said many, many times, your chairman and head of technology, CTO, Larry Ellison, continues to invest to keep Oracle relevant. So it's really been fun to watch and I really appreciate you coming on. >> Sure thing. >> We have written extensively, we've talked to a lot of Oracle customers. You've got the leading mission-critical database in the world. Everybody from the Fortune 100, we evaluated what Gartner said about the operational databases. I think there's not a lot of question there. And we've written about that on Wikibon, about your converged database and the strategy there, and we're going to get into that. We've covered Autonomous Data Warehouse, Exadata Cloud at Customer, and then we just want to really try to get into your area, which has kind of caught our attention recently. And I'm talking about the MySQL Database Service with HeatWave. I love the name, I laugh. It was unveiled, I don't know, a few months ago. So Nipun, let's start the discussion today.
Maybe you can update our viewers on what HeatWave is? What's the overall focus with Oracle? And how does it fit into the Cloud Database Service? >> Sure Dave. So HeatWave is an in-memory query accelerator for the MySQL Database Service for speeding up analytic queries as well as long-running complex OLTP queries. And this is all done in the context of a single database, which is the MySQL Database Service. Also, all existing MySQL applications or MySQL-compatible tools and applications continue to work as is. So there is no change. And with HeatWave, Oracle is delivering the only MySQL service which provides customers with a single unified platform for both analytic as well as transaction processing workloads. >> Okay, so, we've seen open source databases in the cloud growing very rapidly. I mentioned Snowflake, I think Google's BigQuery gets some mention, we'll maybe talk more about Redshift later on, but what I'm wondering, well, let's talk about it now, how does the MySQL HeatWave service compare to MySQL-based services from other cloud vendors? I can get MySQL from others. In fact, I think we do. I think we run Wikibon on the LAMP stack. I think it's running on Amazon, so how does your service compare? >> No other vendor offers this differentiated solution with an open source database, namely, having a single database which is optimized both for transactional processing and analytics, right? So the example is MySQL. A lot of other cloud vendors provide a MySQL service, but MySQL has been optimized for transaction processing, so when customers need to run analytics they need to move the data out of MySQL into some other database for any analytics, right? So we are the only vendor which is now offering this unified solution for both transactional processing and analytics. That's the first point.
Second thing is, most of the vendors out there have taken open source databases and they're basically hosting them in the cloud. Whereas HeatWave has been designed from the ground up for the cloud, and it is 100% compatible with MySQL applications. And the fact that we have designed it from the ground up for the cloud, and have spent 100s of person-years of research and engineering, means that we have a solution which is very, very scalable, it's very optimized in terms of performance, and it is very inexpensive in terms of the cost. >> Are you saying, well, wait, are you saying that you essentially rewrote MySQL to create HeatWave but at the same time maintained compatibility with existing applications? >> Right. So we enhanced MySQL significantly and we wrote a whole bunch of new code, which is brand new code optimized for the cloud, in such a manner that yes, it is 100% compatible with all existing MySQL applications. >> What does it mean to be optimized for the cloud? I mean, I hear that and I say, okay, it's taking advantage of cloud-native. I hear kind of the buzzwords, cloud-first, cloud-native. What does it specifically mean from a technical standpoint? >> Right. So first, let's talk about performance. What we have done is that we have looked at two aspects. We have worked with shapes, for instance, the compute shapes which provide the best performance per dollar. So I'll give you a couple of examples. We have optimized for certain shapes. So, HeatWave is an in-memory query accelerator. So the cost of the system is dominated by the cost of memory. So we are working with chips which provide the cheapest cost per terabyte of memory. Secondly, we are using commodity cloud services in such a manner that it's optimized for both performance as well as performance per dollar. So, an example is, we are not using any locally-attached SSDs. We use ObjectStore because it's very inexpensive.
And then I guess at some point I will get into the details of the architecture. The system has been really, really designed for massive scalability. So as you add more compute, as you add more servers, the system continues to scale almost perfectly linearly. So this is what I mean in terms of being optimized for the cloud. >> All right, great. >> And furthermore, (indistinct). >> Thank you. No, carry on. >> Over the next few months, you will see a bunch of other announcements where we're adding a whole bunch of machine learning and data-driven automation, which we believe is critical for the cloud. So optimized for performance, optimized for the cloud, and machine learning-based automation, which we believe is critical for any good cloud-based service. >> All right, I want to come back and ask you more about the architecture, but you mentioned some of the others taking open source databases and shoving them into the cloud. Let's take the example of AWS. They have a series of specialized data stores for different workloads. Aurora is for OLTP, I actually think it's based on MySQL. Redshift, which is based on ParAccel. And I've asked Amazon about this, and their response actually kind of made sense to me. Look, we want the right tool for the right job, we want access to the primitives, because when the market changes we can change faster, as opposed to, if we start building bigger and bigger databases with more functionality, we're not as agile. So that kind of made sense to me. I know we, again, we use, I think I said, MySQL in Amazon, we're using DynamoDB, it works, that's cool. We're not huge. And we fully admit, and we've researched this, when you start to get big that starts to get maybe expensive. But what do you think about that approach and why is your approach better?
>> Right, we believe that there are multiple drawbacks of having different databases or different services, one optimized for transactional processing and one for analytics, and having to ETL between these different services. First of all, it's expensive because you have to manage different databases. Secondly, it's complex. From an application standpoint, applications now need to understand the semantics of two different databases. It's inefficient because you have to transfer data at some periodicity from one database to the other one. It's not secure because there are security aspects involved when you're transferring data, and also the identity of users in the two different databases is different. So the approach which has been taken by Amazons and such, we believe, is more costly, complex, inefficient and not secure. Whereas with HeatWave, all the data resides in one database, which is MySQL, and it can run both transaction processing and analytics. So in addition to all the benefits I talked about, customers can also make their decisions in real time because there is no need to move the data. All the data resides in a single database. So as soon as you make any changes, those changes are visible to customers for queries right away, which is not the case when you have different siloed specialized databases. >> Okay, there are a lot of ways to skin a cat, and what you just said makes sense. By the way, we were saying before, companies have taken off-the-shelf or open source databases and shoved them in the cloud. I have to give Amazon some props. They actually have done engineering on Aurora and Redshift. And they've got the engineering capabilities to do that.
But you can see, for example, in Redshift the way they handle separating compute from storage, it's maybe not as elegant as some of the other players like a Snowflake, for example, but they get there, and maybe it's a little bit more brute force, so I don't want to just make it sound like they're just hosting off-the-shelf in the cloud. But is it fair to say that there's like a crossover point? So in other words, if I'm smaller and I'm not doing a bunch of big, like us, I mean, it's fine. It's easy, I spin it up. It's cheaper than having to host my own servers. So presumably there's a sweet spot for that approach and a sweet spot for your approach. Is that fair or do you feel like you can cover a wider spectrum? >> We feel we can cover the entire spectrum, not wider, the entire spectrum. And we have benchmarks published which are actually available on GitHub for anyone to try. You will see that with this approach we have taken with the MySQL Database Service with HeatWave, we are faster, we are cheaper, without having to move the data. And the mileage, or the amount of improvement you will get, will surely vary. So if you have less data, the amount of improvement you will get may be, say, 100 times, right, or 500 times, at smaller data sizes. If you get to larger data sizes, this improvement amplifies to 1000 times or 10,000 times. And similarly for the cost, if the data size is smaller, the cost advantage you will have is less, maybe MySQL HeatWave is one third the cost. If the data size is larger, the cost advantage amplifies. So to your point, MySQL Database Service with HeatWave is going to be better for all sizes, but the amount of mileage or the amount of benefit you will get increases as the size of the data increases. >> Okay, so you're saying you've got better performance, better cost, better price performance. Let me just push back a little bit on this because, having been around for a while, I often see these performance and price comparisons.
And what often happens is a vendor will take the latest and greatest, the one they just announced, and they'll compare it to an N-1 or an N-2, running on old hardware. So are you normalizing for that, or is that the game you're playing here? I mean, how can you give us confidence that these are legitimate benchmarks in your GitHub repo? >> Absolutely. I'll give you a bunch of information. But let me preface this by saying that all of our scripts are available in the open, in the GitHub repo, for anyone to try, and we would welcome feedback otherwise. So we have taken, yes, the latest version of MySQL Database Service with HeatWave, we have optimized it, and we have run multiple benchmarks. For instance, TPC-H, TPC-DS, right? Because the amount of improvement a query will get depends upon the specific query, depends upon the predicates, it depends on the selectivity, so we just wanted to use standard benchmarks. So it's not the case that we are using certain classes of queries which, excuse me, benefit more. So, standard benchmarks. Similarly, for the other vendors or other services like Redshift, we have run benchmarks on the latest shapes of Redshift, the most optimized configuration which they recommend, running their scripts. So this is not something that, hey, we're just running out of the box. We have optimized Aurora, we have optimized (indistinct) to the best possible extent we can based on their guidelines, based on their latest release, and that's what we're talking about in terms of the numbers. >> All right. Please continue. >> Now, for some other vendors, if we get to the benchmark section, we'll talk about how we are comparing with other services, let's say Snowflake. Well there, there are issues in terms of you can't legally run Snowflake numbers, right? So there, we have looked at a report published by Gigaom,
and we are taking the numbers published by the Gigaom report for Snowflake, Google BigQuery and Azure Synapse numbers, right? So those, we have not run ourselves. But for AWS Redshift, as well as AWS Aurora, we have run the numbers and I believe these are the best numbers anyone can get. >> I saw that Gigaom report and I've got to say, Gigaom, sometimes I'm like, eh, but I've got to say that, I forget the guy's name, he knew what he was talking about. He did a good job, I thought. I was curious as to the workload. I always say, well, what's the workload? But I thought that report was pretty detailed. And Snowflake did not look great in that report. Oftentimes, and they've been marketing the heck out of it. I forget who sponsored it. It was sponsored content. But I remember seeing that and thinking, hmm. So, I think maybe for Snowflake that sweet spot is maybe not that performance, maybe it's the simplicity, and I think that's where they're making their mark. And most of their databases are small and a lot of read-only stuff. And so they've found a market there. But I want to come back to the architecture and really sort of understand how you've been able to get this range of both performance and cost you talked about. I thought I heard that you're optimizing the chips, you're using ObjectStore. You've got an architecture that's not using SSD, it's using ObjectStore. So is there caching there? I wonder if you could just give us some details of the architecture and tell us how you got to where you are. >> Right, so let me start off by saying what kind of numbers we are talking about, just to be clear what the improvements are. So if you take the MySQL Database Service with HeatWave in Oracle Cloud and compare it with a MySQL service in any other cloud, and if you look at smaller data sizes, say data sizes which are about half a terabyte or so, HeatWave is 400 times faster, 400 times faster.
And as you get to... >> Sorry. Sorry to interrupt. What are you measuring there? Faster in terms of what? >> Latency. So we take the 22 TPC-H queries, we run them on HeatWave, and we run the same queries on a MySQL service on any other cloud, at half a terabyte, and the performance in terms of latency is 400 times faster in HeatWave. >> Thank you. Okay. >> If you go to larger data sizes, then the other data point we're looking at is, say, something like 4 TB. There, we did two comparisons. One is with AWS Aurora, which is, as you said, they have taken MySQL. They have done a bunch of innovations over there and they are offering it as a premier service. So on 4 TB TPC-H, MySQL Database Service with HeatWave is 1100 times faster than Aurora. It is three times faster than the fastest shape of Redshift. So Redshift comes in different flavors, so I'm talking about dense compute 2, right? And again, looking at the most recommended configuration from Redshift. So 1100 times faster than Aurora, three times faster than Redshift, and at one third the cost. So this is where I just really want to point out that it is much faster and much cheaper. One third the cost. And then going back to the Gigaom report, there was a comparison done with Snowflake, Google BigQuery, Redshift, Azure Synapse. I won't go into the numbers here, but HeatWave was faster on both TPC-H as well as TPC-DS across all these products and cheaper compared to any of these products. So faster, cheaper on both the benchmarks across all these products. Now let's come to, like, what is the technology underneath? >> Great. >> So, basically there are three parts which you're going to see. One is improved performance, very good scale, and lower cost. So the first thing is that HeatWave has been optimized for the cloud. And when I say that, we talked about this a bit earlier. One is we are using the cheapest shapes which are available.
We're using the cheapest services which are available without having to compromise the performance, and then there is this machine learning-based automation. Now, underneath, in terms of the architecture of HeatWave, there are basically, I would say, four key things. First is, HeatWave is an in-memory engine, and the representation which we have in memory is a hybrid columnar representation which is optimized for vector processing. That's the first thing. And that's pretty much table stakes these days for anyone who wants to do in-memory analytics, except that it's hybrid columnar, which is optimized for vector processing. So that's the first thing. The second thing, which starts getting to be novel, is that HeatWave has a massively parallel architecture which is enabled by a massively partitioned architecture. So we take the data, we read the data from MySQL into the memory of HeatWave, and we massively partition this data. So as we're reading the data, we're partitioning the data based on the workload, the size of these partitions is such that each fits in the cache of the underlying processor, and then we're able to consume these partitions really, really fast. So that's the second bit, which is a massively parallel architecture enabled by a massively partitioned architecture. Then the third thing is that we have developed new state-of-the-art algorithms for distributed query processing. So for many of the workloads, we find that joins are the long pole in terms of the amount of time it takes. So we at Oracle have developed new algorithms for distributed join processing, and similarly for many other operators. And this is how we're able to consume this data, or process this data, which is in memory, really, really fast.
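The idea of splitting data into cache-sized partitions and joining partition by partition can be sketched roughly as follows. This is only an illustrative textbook-style partitioned hash join in Python, not HeatWave's algorithm (which the interview does not describe in detail); the function names, the row format, and the assumption that data spreads evenly across partitions are all hypothetical.

```python
from collections import defaultdict


def num_partitions(total_bytes, cache_bytes):
    """Partition count so each partition is at most cache_bytes,
    assuming (hypothetically) that rows spread evenly."""
    return max(1, -(-total_bytes // cache_bytes))  # ceiling division


def hash_partition(rows, key, n_parts):
    """Split rows into n_parts buckets by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts


def partitioned_join(left, right, key, n_parts):
    """Equi-join two row lists one partition pair at a time.

    Rows with equal keys always land in the same partition index,
    so each pair can be joined independently (and, in a real
    system, in parallel on different cores or nodes).
    """
    out = []
    left_parts = hash_partition(left, key, n_parts)
    right_parts = hash_partition(right, key, n_parts)
    for lp, rp in zip(left_parts, right_parts):
        table = defaultdict(list)  # small per-partition hash table
        for r in rp:
            table[r[key]].append(r)
        for l in lp:
            for r in table.get(l[key], []):
                out.append({**l, **r})
    return out
```

Because each partition pair is independent, the per-partition hash table stays small, which is the property that lets it fit in a processor cache when the partition count is chosen from the data size.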
And finally, what we have is that we have an eye for scalability, and we have designed algorithms such that there's a lot of overlap between compute and communication, which means that as you're sending data across various nodes, and there could be dozens of nodes or 100s of nodes, we're able to overlap the computation time with the communication time, and this is what gives us massive scalability in the cloud. >> Yeah, so, some hard-core database techniques that you've brought to HeatWave, that's impressive. Thank you for that description. Let me ask you, just to go on a quick aside. So, MySQL is open source, HeatWave is what? Is it like, open core? Is it open source? >> No, so, HeatWave is something which has been designed and optimized for the cloud. So it can't be open source. So, no, it's not open source. >> It is a service. >> It is a service. That's correct. >> So it's a managed service that I pay Oracle to host for me. Okay. Got it. >> That's right. >> Okay, I wonder if you could talk about some of the use cases that you're seeing for HeatWave, any patterns that you're seeing with customers? >> Sure, so we had the HeatWave service in limited availability for almost 15 months, and it's been about five months since we have gone GA. And there's a very interesting trend among our customers that we're seeing. The first one is, we are seeing many migrations from AWS, specifically from Aurora. Similarly, we are seeing many migrations from Azure MySQL, and migrations from Google. And the number one reason customers are coming is because of ease of use. Because they have their databases currently siloed. As you were talking about, some optimized for transactional processing, some for analytics. Here, what customers find is that in a single database, they're able to get very good performance, they don't need to move the data around, they don't need to manage multiple databases. So we are seeing many migrations from these services.
And the number one reason is reduced complexity, or ease of use. And the second one is much better performance and reduced costs, right? So that's the first thing. We are very excited and delighted to see the number of migrations we're getting. The second thing which we're seeing is, initially, when we had the service announced, we were targeting really towards analytics. But now what we are finding is, many of these customers, for instance, who have been running on Aurora, when they are moving to MySQL with HeatWave, they are finding that many of the OLTP queries as well are seeing significant acceleration with HeatWave. So now customers are moving their entire applications to HeatWave. So that's the second trend we're seeing. The third thing, and I think I kind of missed mentioning this earlier, one of the very key and unique value propositions we provide with the MySQL Database Service with HeatWave is that we provide a mechanism where, if customers have their data stored on premise, they can still leverage the HeatWave service by enabling MySQL replication. So they can have their data on premise, they can replicate this data to the Oracle Cloud, and then they can run analytics. So this deployment, which we are calling the hybrid deployment, is turning out to be very, very popular, because there are some customers who, for various reasons, compliance or regulatory reasons, cannot move the entire data to the cloud or migrate the data to the cloud completely. So this provides them a very good setup where they can continue to run their existing database, and when it comes to getting the benefits of HeatWave for query acceleration, they can set up this replication. >> And can I run that on any available server capacity, or is there an appliance to facilitate that? >> No, this is just standard MySQL replication. So if a customer is running MySQL on premise they can just turn on this replication.
We have obviously enhanced it to support this inbound replication between on-premise and Oracle Cloud, which is something which can be enabled as long as the source and destination are both MySQL. >> Okay, so I want to come back to this sort of idea of the architecture a little bit. I mean, it's hard for me to go toe to toe with you, I'm not an engineer, but I'm going to try anyway. So you've talked about OLTP queries. I always thought HeatWave was optimized for analytics. So I want to push on this notion, because people think of the converged database, and what you're talking about here with HeatWave, as sort of the Swiss army knife, which is great 'cause you've got a screwdriver and you've got a Phillips and a flathead and some scissors, but maybe they're not as good. They're not as good necessarily as the purpose-built tool. But you're arguing that this is best of breed for OLTP and best of breed for analytics, both in terms of performance and cost. Am I getting that right, or is this really a Swiss army knife where that flathead is really not as good as the big, long screwdriver that I have in my bag? >> Yes, so, you're getting it right, but I did want to make a clarification. HeatWave is definitely the accelerator for all your queries, all analytic queries and also the long-running complex transaction processing queries. So yes, HeatWave is the uber query accelerator engine. However, when it comes to transaction processing in terms of your insert statements, delete statements, those are still all done and served by the MySQL database. So all the transactions are still sent to the MySQL database and they're persisted there, it's the queries for which HeatWave is the accelerator. So what you said is correct. For all query acceleration, HeatWave is the engine. >> Makes sense. Okay, so if I'm a MySQL customer and I want to use HeatWave, what do I have to do? Do I have to make changes to my existing applications?
You implied earlier that, no, it just sort of plugs right in. But can you clarify that? >> Yes, there are absolutely no changes which any MySQL or MySQL-compatible application needs to make to take advantage of HeatWave. HeatWave is an in-memory accelerator and it's completely transparent to the application. So we have dozens and dozens of applications which have migrated to HeatWave, and they are seeing the same thing. Similarly for tools. So if you look at various tools which work for analytics, like Tableau, Looker, Oracle Analytics Cloud, all of them will work just seamlessly. And this is one of the reasons we had to do a lot of heavy lifting in the MySQL database itself. So the MySQL database engineering team has been very actively working on this. And one of the reasons is because we did the heavy lifting and we made enhancements to the MySQL optimizer and the MySQL storage layer to do the integration of HeatWave in such a seamless manner. So there is absolutely no change which an application needs to make in order to leverage or benefit from HeatWave. >> You said earlier, Nipun, that you're seeing migrations from, I think you said Aurora and Google BigQuery, you might've said Redshift as well. What kind of tooling do you have to facilitate migrations? >> Right, now, there are multiple ways in which customers may want to do this, right? So the first tooling which we have is, as I was talking about the replication or the inbound replication mechanism, customers can set up HeatWave in the Oracle Cloud and they can set up replication between their instances in their cloud and HeatWave. Second thing is we have various kinds of tools to facilitate the data migration in terms of fast ingestion.
So there are a lot of such customers we are seeing who are migrating, and we have a plethora of tools and applications, in addition to setting up this inbound replication, which is the most seamless way of getting customers started with HeatWave. >> So, I think you mentioned before, I have my notes, machine intelligence and machine learning. We've seen that with Autonomous Database, it's a big, big deal obviously. How does HeatWave take advantage of machine intelligence and machine learning? >> Yeah, and I'm probably going to be talking more about this in the future, but what we have already is that HeatWave uses machine learning to intelligently automate many operations. So we know that when there's a service being offered in the cloud, our customers expect automation. And there are a lot of vendors and a lot of services which do a good job in automation. One of the places where we're going to be very unique is that HeatWave uses machine learning to automate many of these operations. And I'll give you one such example, which is provisioning. Right now with HeatWave, when a customer wants to determine how many nodes are needed for running their workload, they don't need to make a guess. They invoke a provisioning advisor and this advisor uses machine learning to sample a very small percentage of the data. We're talking about, like, 0.1% sampling, and it's able to predict the amount of memory which this data is going to take with 95% accuracy. And based on that, it's able to make a prediction of how many servers are needed. So just a simple operation, the first step of provisioning, this is something which is done manually on any other service, whereas with HeatWave, we have a machine learning-based advisor. So this is an example of what we're doing. And in the future, we'll be offering many such innovations as a part of the MySQL Database and the HeatWave service.
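The sample-then-size flow described above can be sketched in a few lines of Python. This is a deliberately simplified illustration that uses plain extrapolation from a random sample in place of the machine learning model the speaker mentions; the function name, the per-row size input, and the per-node memory figure are all hypothetical, and only the 0.1% sampling rate comes from the interview.

```python
import math
import random


def estimate_nodes(row_sizes_bytes, node_memory_bytes,
                   sample_fraction=0.001, seed=0):
    """Estimate total in-memory size and node count from a small sample.

    row_sizes_bytes: hypothetical per-row in-memory sizes; a real
    advisor would look at the table data itself.
    Returns (estimated_total_bytes, nodes_needed).
    """
    n = len(row_sizes_bytes)
    if n == 0:
        return 0.0, 1
    rng = random.Random(seed)
    k = max(1, int(n * sample_fraction))   # 0.1% of rows by default
    sample = rng.sample(row_sizes_bytes, k)
    est_total = sum(sample) / k * n        # scale sample mean to all rows
    nodes = max(1, math.ceil(est_total / node_memory_bytes))
    return est_total, nodes
```

For example, `estimate_nodes(sizes, 512 * 2**30)` would size a cluster against an assumed 512 GB of usable memory per node; the accuracy of the estimate depends on how representative the sample is, which is presumably where a learned model helps over this naive mean.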
>> Well, I've got to say I was a skeptic, but I really appreciate you answering my questions. And a lot of people, when you made the acquisition and inherited MySQL, thought you were going to kill it, because they thought it would be competitive to Oracle Database. I'm happy to see that you've invested and figured out a way to, hey, we can serve our community and continue to be the steward of MySQL. So Nipun, thanks very much for coming to theCUBE. Appreciate your time. >> Sure. Thank you so much for the time, Dave. I appreciate it. >> And thank you for watching everybody. This is Dave Vellante with another CUBE Conversation. We'll see you next time. (bright upbeat music)

Published Date : Apr 28 2021



David Richards, WANdisco | AWS Summit 2017

>> Narrator: Live from Manhattan, it's theCUBE, covering AWS Summit New York City 2017, brought to you by Amazon Web Services. >> And welcome back to New York, here at AWS Summit. theCUBE continues our coverage of what's happening here in the Big Apple. I'm John Walls along with Stu Miniman, and with us is maybe not the most prolific CUBE guest of all time, but he's in the hall of fame. He really is a CUBE MVP for sure. It's good to have David Richards with us, the president, chairman, CEO of WANdisco. Good to see you, sir. >> It's a pleasure to be back again. It feels like home. >> It is like home. We need to get you your own microphone, I think, you know? >> David: I know it. I need my name on the back of the seat or something. >> This isn't quite a home game for you. All right, so you've got an office in Sheffield, England. >> David: Yeah. >> You've got an office out in the valley, Silicon Valley. We got ya right in the middle, I think. >> David: Yeah. >> Almost, don't we? So-- >> Exactly. >> We kind of split the difference for you this one. >> I always tell people I'm recolonizing the United States. I've been here for about 20 years. I can change the accent. >> Right. >> I'll get you all, eventually. >> All right, well, another year or two, we'll see how that works for ya. Big, big, I guess six, seven months for you, right? As far as some acquisitions you've done, some nice partnerships and arrangements you've done. >> Yes, as a business, we've really progressed well in the first half of the year. I've got to be a little bit careful. We've got results coming out September the sixth in London, but we did do a pre-announcement of a business update. We signed a record big data cloud contract with a very large bank for over four million dollars. That was our largest ever contract win. We signed a major retailer who we can't name, obviously, which is another sort of cloud ObjectStore on premises.
A big data win, and interestingly, we stopped burning cash and investors really like this kind of perfect storm of, 175%, 173% growth in our cloud big data revenue, booking, sorry, combined with a flat cost-base, which meant, first half of last year, burning five point four million dollars down to virtually zero, just $600,000 in the first half. So, investors really like that. We really like that, and it demonstrates that perfect storm of flat cost-base and growing sales. >> David, I'm curious, does working with Amazon, and your customers being on Amazon, does the speed and agility and everything like that contribute to that profitability? >> Well, Amazon kind of changes the game for all vendors, right? Because nobody, it used to be this sort of big four, five, six, whatever it is these days, consulting companies that had to implement ERP systems and all those complex applications. I don't necessarily think they're the people, they're not the go-to people anymore for cloud. So, it's down to uniqueness of technology. Amazon have got such a wide array, we were talking earlier about some of their announcements out today as they continue to go up the stack with applications and so on. So, it does lend itself very well to small vendors with sticky, unique intellectual property and unique products and services that are going to really thrive in this kind of cloud environment. So, we've really enjoyed working with Amazon, but we're also working with the other cloud vendors, as well, and I have to say, when we first saw the Snowmobile and the Snowball, well, actually, the Snowmobile, drive out on stage in New York, was it 12, 18 months ago? It's dog years, so everything goes seven times faster. >> John: Right, right, right. >> I was laughing. I was like, "How on Earth can you possibly use a truck to move data?" But a customer came to us, a prospect came to us the other day, he wanted to move a hundred petabytes of data. 
Now, if you're going to use the public internet to do that, that's going to take a hell of a long time. So, this idea of a mix between physical and digital data movement I think is, when moving to cloud, is actually fascinating. I think it's a really fascinating subject area. One that customers are definitely going to use. >> Yeah, you've got a great vantage point looking at customers' migrations. >> David: Yeah. >> It was actually something big in the keynote talking about, there are so many migrations out there that Amazon released AWS Migration Hub. So, obviously, physics is always a challenge, my legacy mindset. Customers, we heard a customer up onstage and it's usually not lift and shift maybe for the private cloud, but for public cloud, I usually, I need to rewrite, I need to do micro-services. What is the friction for customers, and how are you and Amazon and the other clouds helping customers work through those challenges? >> OK, so, just to take a step back and think about the problems that happen at hyper-scale data movement. So, small-scale data, gigabyte-scale data, the stuff that you typically see in a relational database, they're not particularly big problems. It's kind of minimal outage, press pause, move data, make it consistent, and you're done. You can have a sort of, a small outage, maybe 15 minutes or even a day to move data, but when it gets to hyper-scale, when it gets to petabyte-scale, multi-terabyte-scale data moves, that's when you have a problem, and that's really the problem that we solve. So, the idea that you can move data that's moving and changing without an interruption to service from on-premise to cloud and support a hybrid cloud topology for an elongated period of time is fascinating. 
I was listening at an investor conference to the CEO of VMware who was talking about, we're going to be in a situation of hybrid cloud for the next 20, 25 years because, overnight, not everybody can just repurpose every single application that they're running on-premise, whether it's in the mainframe application, or a relational data application, or wherever it is in the ERP application, and repurpose that in cloud overnight. So, we're going to have to gradually move and migrate those applications over. So, it's highly likely we're going to be in a hybrid cloud environment for the foreseeable future, and that's actually fantastic news for us. We're moving, as I said, at scale companies into cloud with transactional data, and nobody else can touch us in terms of the uniqueness of the IP, which is fantastic news for us. >> In terms of just big data in general, Stu has one use for it, I have a different use for it. It's going to live in a lot of different places. How are you responding to different needs within your clients and trying to make them more effective, make them more efficient? And yet, when you're dealing with more and more data, that's a big storm to handle. >> That's a great question. I went to speak a couple of months ago to a new customer of ours who is a major healthcare provider on the east coast, and I kind of said to him, "OK, you've had this Hadoop cluster for the past three years. Why are you calling us? Why now?" Which is the question that I always ask our customers. Why? What changed? Why are you doing this right now? And maybe for the past three years they've been putting legal data into the system. That's data, but who cares if you can't get access to it? We can move to telephone. We can move to e-mails. We can go into an archive, into a paper archive even, to find it, but the why now is that they're now putting patient record data, patient information with regulated SLA's into this system, and that really is our sweet spot. 
As you get to, remember that investment thesis, small-scale gigabyte outage is small outage, when you get into petabyte, exabyte-scale, when you've got data sets that are a thousand, a million times greater, it's linear to the quantum of data. That outage becomes a thousand or a million times greater. So, that's kind of intolerable. So, we love it when strategic applications, regardless of what the use case is, we could all have different, it might be patient data, it might be retail information, it might be banking data, it might be customer retention information, when those strategic applications move onto this hyper-scale infrastructure, you have to support RTO and RPO, and that's what we do. >> And is a byte a byte a byte? You have these thousands of needles in haystacks, right? How do you assign value to one as opposed to another? >> So, this is another great question and one that investors kind of ask me a lot. So, we used to model our business from kind of the ground up. So, we take the classic enterprise sales team, you have a sales and marketing organization that's quite large, you would multiply that by their quota and then multiply it by 66% because that's how many of them are going to be successful in selling product. Well, we completely threw that away when we launched WANdisco Fusion, our new technology, early 2016. Then, we moved to a channel-based approach. So, we have IBM, we have an OEM, 5,000 quota-carrying enterprise sales guys at IBM selling our products. That was a fantastic deal for us. We signed it in April 2016, and they've done the first half of this year, and made at least six million dollars in sales that we have also announced, and then, we've got strategic partnerships with Amazon, with Microsoft, with Google, and we model our business by those channels. So, we're not looking for needles in haystacks. 
We don't, we could never hire another, I mean, if we had to come into the market and say, "We need to go and hire 5,000 enterprise sales guys," we'd have to be raising, doing fund-raisers like Uber or something. We'd just be untenable. We couldn't do it. So, we have a product that lends itself very well to a channel-based approach, and that's working very nicely for us. So, we're not looking for, we're just looking for haystacks. Somebody else can go and find the needles. >> John: Find me and you, right? >> Right. >> David, how are your customers managing the pace of change these days? We've said Amazon is an example. It's like every day there's three new services coming out. Are they excited? Are they completely overwhelmed? What do you see these days? >> So, I think it's classic sort of product adoption lifecycle stuff. The sort of technical enthusiasts, they love all this change. The early-stage companies that are implementing this new cloud-based technology, ObjectStore technology and so on, they're managing very well. It's the later-stage companies you might go to and say, "ObjectStore," and they'll go, "What's ObjectStore? We're just getting our head around Hadoop, and Hive, and Pig, and all this other stuff that you were talking about three years ago," and sales guys go in there now and say, "Oh, no, no, no, don't worry about Hadoop. Nobody's going to run Hadoop in the cloud." It's like, "Well, that's what you told me three years ago." So, I think the market's certainly divided. I think you're going to see, as we move up the product adoption lifecycle, you're going to see lots and lots and lots of interesting moves happen. The companies that seem to be owning cloud, I think Alibaba is coming up really fast. We're seeing them doing some interesting things. Obviously, they've got dominance in the Chinese market. Amazon, first mover. Microsoft's future's dependent on cloud. 
So, they all have their different spin and different take on applications that they're going to run in cloud. I think there is, I think it's a bit like the cellphone industry. There's lots and lots of different plans, lots and lots of different confusing nomenclature, but that's going to settle out in the next couple of years, but there's unquestionably, if you look at the audience here today, unquestionably large-scale movement of applications and data to cloud. >> Well, we appreciate the time, as always. Great to see you. Another notch in your CUBE belt. (laughing) So, congratulations for that, and maybe you can settle in to New York for a day or two. You said your travels have had you flip-floppin' back and forth between England and here. So, maybe you can settle in for a day or two. >> Yeah, I need to replicate myself. I need to put myself in at least two different places at the same time. >> Live data replication right here. (laughing) All right, David, thanks for bein' with us. David Richards. >> Thank you. Thanks guys. >> Back with more here on theCUBE, we continue our coverage of AWS Summit from New York City right after this break. (upbeat music)

Published Date : Aug 14 2017

SUMMARY :

brought to you by Amazon Web Services. It's good to have David Richards with us, It's a pleasure to be back again. We need to get you your own microphone, I think, you know? I need my name on the back of the seat or something. All right, so you've got an office in Sheffield, England. You've got an office out in the valley, Silicon Valley. I can change the accent. As far as some acquisitions you've done, I've got to be a little bit careful. So, it's down to uniqueness of technology. One that customers are definitely going to use. Yeah, you've got a great vantage point I need to do micro-services. and that's really the problem that we solve. that's a big storm to handle. and I kind of said to him, because that's how many of them are going to be successful What do you see these days? on applications that they're going to run in cloud. and maybe you can settle in to New York for a day or two. I need to put myself in at least two different places All right, David, thanks for bein' with us. Thank you. we continue our coverage of AWS Summit from New York City

ENTITIES

Entity	Category	Confidence
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
Google	ORGANIZATION	0.99+
John	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
John Walls	PERSON	0.99+
Stu Miniman	PERSON	0.99+
April 2016	DATE	0.99+
London	LOCATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
David Richards	PERSON	0.99+
six	QUANTITY	0.99+
New York	LOCATION	0.99+
15 minutes	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
WANdisco	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
England	LOCATION	0.99+
New York City	LOCATION	0.99+
two	QUANTITY	0.99+
September	DATE	0.99+
175%	QUANTITY	0.99+
66%	QUANTITY	0.99+
a day	QUANTITY	0.99+
173%	QUANTITY	0.99+
Big Apple	LOCATION	0.99+
Earth	LOCATION	0.99+
VMware	ORGANIZATION	0.99+
seven months	QUANTITY	0.99+
early 2016	DATE	0.99+
five	QUANTITY	0.99+
three years ago	DATE	0.99+
AWS	ORGANIZATION	0.99+
Sheffield, England	LOCATION	0.99+
four million dollars	QUANTITY	0.98+
over four million dollars	QUANTITY	0.98+
United States	LOCATION	0.97+
three new services	QUANTITY	0.97+
zero	QUANTITY	0.97+
ObjectStore	ORGANIZATION	0.97+
today	DATE	0.97+
5,000	QUANTITY	0.97+
thousands of needles	QUANTITY	0.96+
about 20 years	QUANTITY	0.96+
AWS Summit	EVENT	0.96+
first	QUANTITY	0.95+
Silicon Valley	LOCATION	0.95+
Hive	ORGANIZATION	0.95+
first half	QUANTITY	0.95+
AWS Summit New York City 2017	EVENT	0.94+
AWS Summit 2017	EVENT	0.93+
sixth	DATE	0.92+
CUBE	ORGANIZATION	0.9+
first half of last year	DATE	0.89+
5,000 enterprise sales guys	QUANTITY	0.88+
a million times	QUANTITY	0.88+
couple of months ago	DATE	0.88+
$600,000	QUANTITY	0.87+
seven times	QUANTITY	0.87+
theCUBE	ORGANIZATION	0.86+
Snowmobile	ORGANIZATION	0.86+
two different places	QUANTITY	0.85+
a thousand	QUANTITY	0.85+
One	QUANTITY	0.85+
Pig	ORGANIZATION	0.85+
at least six million dollars	QUANTITY	0.84+
past three years	DATE	0.83+
four	QUANTITY	0.83+
