Guy Churchward, DataTorrent | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data, Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE. Our coverage of our event, Big Data SV, continues, this is our first day. We are down the street from the Strata Data Conference. Come by, we're at this really cool venue, the Forager Tasting Room. We've got a cocktail party tonight. You're going to hear some insights there as well as tomorrow morning. I am Lisa Martin, joined by my co-host, George Gilbert, and we welcome back to theCUBE, for I think the 900 millionth time, the president and CEO of DataTorrent, Guy Churchward. Hey Guy, welcome back! >> Thank you, Lisa, I appreciate it. >> So you're one of our regular VIPs. Give us the update on DataTorrent. What's new, what's going on? >> We actually talked to you a couple of weeks ago. We did a big announcement which was around 3.10, so it's a new release that we have. In all small companies, and we're a small startup, in the big data and analytics space, there is a plethora of features that I can reel off. But it actually comes down to something a little bit more fundamental. So in the last year... In fact, I think we chatted with you maybe six months ago. We've been looking very carefully at how customers purchase and what they want and how they execute against technology, and it's very, very different from what I expected when I came into the company about a year ago off the EMC role that I had. And so, although the features are there, there's a huge amount of underpinning around the experience that a customer would have around big data applications. I'm reminded of, I think it's Gartner that quoted that something like 80% of big data applications fail. And this is one of the things that we really wanted to look at. 
We have very large customers in production, and we did the analysis of what are we doing well with them, and why can't we do that en masse, and what are people really looking for? So that was really what the release was about. >> Let's elaborate on this a little bit. I want to drill into something where you said many projects, as we've all heard, have not succeeded. There's a huge amount of complexity. The terminology we use is, without tarring and feathering any one particular product, the open source community is kind of like, you're sort of harnessing a couple dozen animals and a zookeeper that works in triplicate... How does DataTorrent tackle that problem? >> Yeah, I mean, in fact I was desperately interested in writing a blog recently about using the word community after open source, because in some respects, there isn't a huge community around the open source movement. What we find is it's the du jour way in which we want to deliver technology, so I have a huge amount of developers that work on a thing called Apache Apex, which is a component in a solution, or in an architecture and in an outcome. And we love what we do, and we do the best we do, and it's better than anybody else's thing. But that's not an application, that's not an outcome. And what happens is, we kind of don't think about what else a customer has to put together, so then they have to go out to the zoo and pick loads of bits and pieces and then try to figure out how to stitch them all together as best they can. And that takes an inordinately long time. And, in general, people who love this love tinkering with technologies, and their projects never get to production. And large enterprises are used to sitting down and saying, "I need a bulletproof application. It has to be industrialized. I need a full SLA on the back of it. This thing has to have lights-out technology. And I need it quick." 
Because that was the other aspect: this market is moving so fast, and you look at things like digital economy or any other buzz term, but it really means that if you realize you need to do something, you're probably already too late. And therefore, you need it speedy, expedited. So the idea of being able to wait 12 months, or two years, for an application also makes no sense. So the arc of this is basically deliver an outcome, don't try and change the way in which open source is currently developed, because they're in components, but embrace them. And so what we did is we sort of looked at it and said, "Well what do people really want to do?" And it's big data analytics, and I want to ingest a lot of information, I want to enrich it, I want to analyze it, and I want to take actions, and then I want to go park it. And so, we looked at it and said, "Okay, so the majority of stuff we need is what we call a KASH stack, which is Kafka, Apache Apex, Spark and Hadoop, and then put complex compute on top." So you would have heard of terms like machine learning, and dimensional compute, so we have those modules. So we actually created an opinionated stack... Because otherwise you have a thousand to choose from and people get confused with choice. I equate it to the menu at a restaurant: there's two types of restaurants. You walk into one and you can turn pages and pages and pages and pages of stuff, and you think that's great, I got loads of choice, but the choice kind of confuses you. And also, there's only one chef at the back, and he can't cook everything well. So you know if he chooses the components and puts them together, you're probably not going to get the best meal. And then you go to restaurants that you know are really good, they generally give you one piece of paper and they say, "Here's your three entrees." And you know every single one of them. 
It's not a lot of choice, but at the end of the day, it's going to be a really good meal. >> So when you go into a customer... You're leading us to ask you the question, which is, you're selling the prix fixe tasting menu, and you're putting all the ingredients together. What are some of those solutions and then, sort of, what happens to the platform underneath? >> Yeah, so what you don't want to do is to take these flexible micro data services, which are open source projects, and hard-glue them together to create an application that then has no flexibility. Because, again, one of the myths that I used to assume is applications would last us seven to 10 years. But what we're finding in this space is this movement towards consumerization of enterprise applications. In other words, I need an app and I need it tomorrow because I'm competitively disadvantaged, but it might be wrong, so I then need to adjust it really quick. It's this idea of continual development, continual adjustment. But that flies in the face of all of this gluing and enterprise-ilities. And I want to base it on open source, and open source, by default, doesn't glue well together. And so what we did is we said okay, not only do you have to create an opinionated stack, and you do that because you want them all to scale into all industries, and they don't need a huge amount of choice, just pick best of breed. But you need to then put a sleeve around them so they all act as though they are a single application. And so we actually announced a thing called Epoxy. It's a bit of a riff on gluing, but it's called DataTorrent Epoxy. So we have, it's like a micro data service bus, and you can then interchange the components. For instance, right now, Apache Apex is the stream-based processing engine in that component. But if there's a better unit, we're quite happy to pull it out, chuck it away, and then put another one in. 
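That swap-the-engine idea can be sketched in a few lines. This is purely an illustrative toy in Python; the names here (Engine, run_pipeline, the two word-count engines) are made up for the example and are not DataTorrent's actual Epoxy API:

```python
# Toy sketch of a pipeline whose processing engine sits behind a stable
# interface, so one engine can be pulled out and another plugged in
# without touching the rest of the application. Hypothetical names only.

class Engine:
    """The swappable component: anything with a process(records) method."""
    def process(self, records):
        raise NotImplementedError

class WordCountEngine(Engine):
    def process(self, records):
        counts = {}
        for rec in records:
            for word in rec.split():
                counts[word] = counts.get(word, 0) + 1
        return counts

class UpperCaseWordCountEngine(Engine):
    """A drop-in replacement: same interface, different implementation."""
    def process(self, records):
        counts = {}
        for rec in records:
            for word in rec.upper().split():
                counts[word] = counts.get(word, 0) + 1
        return counts

def run_pipeline(engine, records):
    """Ingest/normalize, then hand off to the pluggable compute engine."""
    ingested = [r.strip() for r in records]   # ingest + normalize
    return engine.process(ingested)           # swappable compute

data = ["to be ", " or not to be"]
print(run_pipeline(WordCountEngine(), data)["to"])           # 2
print(run_pipeline(UpperCaseWordCountEngine(), data)["TO"])  # 2
```

The point of the sketch is only the sleeve: the application calls one stable interface, so the component behind it can be chucked away and replaced.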
This isn't a ubiquitous snap-on toolset, because, again, the premise is use open source, get the innovation from there. It has to be bulletproof, have enterprise-ilities, and move really fast. So those are the components I was working on. >> Guy, as CEO, I'm sure you speak with a lot of customers often. What are some of the buying patterns that you're seeing across industries, and what is some of the major business value that DataTorrent can help deliver to your customers? >> The buying patterns when we get involved, and I'm kind of breaking this down in a slightly different way, because we normally get involved when a project's in flight, one of the 80% that's failing, and in general, it's driven by a strategic business partner that has an agenda. And what you see is proprietary application vendors will say, "We can solve everything for you." So they put the tool in and realize it doesn't have the flexibility; it does have enterprise-ilities, but it can't adjust fast. And then you get the other type who say, "Well we'll go to a distro or we'll go to a general purpose practitioner, and they'll build an application for us." And they'll take open source components, but they'll glue it together with proprietary mush, and then that doesn't grow past that. And then you get the other ones, which is, "Well if I actually am not guided by anybody, I'll buy a bunch of developers, stick them in my company, and I've got control on that." But they fiddle around a lot. So we arrive in and, in general, they're in this middle process of saying, "I'm at a competitive disadvantage, I want to move forward and I want to move forward fast, and we're working on one of those three channels." The types of outcomes, we just, and back to the expediency of this, we had a telco come to us recently, and it was just before the iPhone X launched, and they wanted to do A/B testing on the launch on their platform. We got them up and running within three months. 
Subsequent to that launch, they then repurposed the platform and some of the components with some augmentation, and they've come out with three further applications. They've all gone into production. So the idea is then these fast cycles of micro data services being stitched together with the Epoxy resin type approach-- >> So faster time to value, lower TCO-- >> Exactly. >> Being able to meet their customers' needs faster-- >> Exactly, so it's outcome-based and time to value, and it's time to proof. Because this is, again, the thing that Gartner picked up on, is Hadoop's difficult, this market's complex and people kick the tires a lot. And I sort of joke with customers, "Hey if you want to obsess about components rather than the outcome, then your successor will probably come see us once you're out and your group's failed." And I don't mean that in an obnoxious way. It's not just DataTorrent that solves this same thing, but this is the movement, right? Deal with open source, get enterprise-ilities, get us up and running within a quarter or two, and then let us have some use and agile repurposing. >> Following on that, just to understand going in with a solution to an economic buyer, but then having the platform be reusable, is it opinionated and focused on continuous processing applications, or does it also address both the continuous processing and batch processing? >> Yeah, it's a good question. In general, you've got batch and you've got real time and streaming, and so we deal with data in motion, which is stream-based processing. A stream-based processing engine can deal with batch as well, but a batch engine cannot deal with streams. >> George: So you do both-- >> Yeah. >> And the idea being that you can have one programming model for both. >> Exactly. >> It's just a window, batch is just a window. 
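To make the batch-is-just-a-window point concrete, here is a toy sketch in Python, one programming model for both cases; this is illustrative only (not Apex code, and the data is made up): a tumbling window groups events into time buckets, and widening the window to cover the whole stream reproduces a batch job.

```python
# "Batch is just a window": one streaming aggregation handles both modes.
from collections import defaultdict

def windowed_sums(events, window_size):
    """events: (timestamp, value) pairs; returns {window_start: sum}."""
    sums = defaultdict(int)
    for ts, value in events:
        bucket = (ts // window_size) * window_size  # tumbling window start
        sums[bucket] += value
    return dict(sums)

events = [(0, 1), (5, 2), (12, 3), (19, 4)]

# Streaming view: 10-second tumbling windows.
print(windowed_sums(events, 10))    # {0: 3, 10: 7}

# "Batch" view: one window spanning the whole stream.
print(windowed_sums(events, 100))   # {0: 10}
```

Same code path either way, which is the sense in which a stream engine subsumes batch but not the reverse.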
>> And the other thing is, a myth bust, is for the last maybe eight plus years, companies assume that the first thing you do in big data analytics is collect all the data, create a data lake, and so they go in there, they ingest the information, they put it into a data lake, and then they poke the data lake posthumously. But the data in the data lake is, by default, already old. So the latency of sticking it into a data lake and then sorting it, and then basically poking it, means that if anybody deals with the data that's in motion, you lose. Because I'm analyzing as it's happening and then you would be analyzing it after at rest, right? So now the architecture of choice is ingest the information, use high performance storage and compute, and then, in essence, ingest, normalize, enrich, analyze, and act on data in motion, in memory. And then when I've used it, then throw it off into a data lake because then I can basically do posthumous analytics and use that for enrichment later. >> You said something also interesting where the DataTorrent customers, the initial successful ones sort of tended to be larger organizations. Those are typically the ones with skillsets to, if anyone's going to be able to put pieces together, it's those guys. Have you not... Well, we always expected big data applications, or sort of adaptive applications, to go mainstream when they were either packaged apps to take all the analysis and embed it, or when you had end to end integrated products to make it simple. Where do you think, what's going to drive this mainstream? >> Yeah, it depends on how mainstream you want mainstream. It's kind of like saying how fast is a fast car. If you want a contractor that comes into IT to create a dashboard, go buy Tableau, and that's mainstream analytics, but it's not. It's mainstream dashboarding of data. The applications that we deal with, by default, the more complex data, they're going to be larger organizations. 
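The architecture of choice Guy describes, ingest, normalize, enrich, analyze, and act on data in motion in memory, then park it in the data lake for posthumous analytics, can be sketched roughly as follows. This is an assumed, simplified Python illustration (the field names and the threshold are invented, not anything from DataTorrent):

```python
# Sketch of the data-in-motion flow: act on each event as it arrives,
# in memory, and only afterward park it in a "data lake" (here a list)
# for later, after-the-fact analytics. Illustrative names and values.

data_lake = []   # posthumous analytics happen here, later

def handle_event(raw, alerts):
    event = {"value": float(raw)}             # ingest + normalize
    event["flagged"] = event["value"] > 100   # enrich + analyze (toy rule)
    if event["flagged"]:                      # act while data is in motion
        alerts.append(event["value"])
    data_lake.append(event)                   # park it when we're done
    return event

alerts = []
for raw in ["12", "250", "99"]:
    handle_event(raw, alerts)

print(alerts)           # [250.0]
print(len(data_lake))   # 3
```

The ordering is the whole argument: the alert fires before the event ever reaches the lake, whereas a lake-first design would only see it at rest, already old.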
Don't misunderstand when I say, "We deal with these organizations." We don't have a professional services arm. We work very closely with people like HCL, and we do have a jumpstart team that helps people get there. But our job is to teach someone; it's like a kid with a bike and the training wheels, our job is to teach them how to ride the bike, and kick the wheels off, and step away. Because what we don't want to do is to put a professional services drip feed into them and just keep sucking the money out. Our job is to get them there. Now, we've got one company that's actually going to go live next month, and it's a kid tracker, you know, like a GPS one that you put on bags or with your kids, and it'll be realtime tracking for the school and also for the individuals. And they had absolutely zero Hadoop experience when we got involved with them. And so we've brought them up, we've helped them with the application, we've kicked the wheels off and now they're going to be sailing. I would say, in a year's time, they're going to be comfortable just ignoring us completely, and in the first year, there's still going to be some handholding and covering up a bruise as they fall off the bike every so often. But that's our job, it's IP, technology, all about outcomes and all about time to value. >> And from a differentiation standpoint, that ability to enable that self-service and kick off the training wheels, is that one of the biggest differentiators that you find DataTorrent has, versus the Tableaus and the other competitors on the market? 
Use open source, get enterprise-ilities, and have that level of agility. Nobody else is really doing that. The only other way to do that is to contract with someone like a Hortonworks or a Cloudera, and actually pay them a lot of money to build the application for you. And our job is really saying, "No, instead of you paying them on professional services, we'll give you the sleeve, we'll make it a little bit more opinionated, and we'll get you there really quickly, and then we'll let you go and set you free." And so that's one. We have a thing called the Application Factory. That's the snap-on toolset where they can literally go to a GUI and say, "I'm in the financial market, I want a fraud prevention application." And we literally then just self-assemble the stack, they can pick it up, and then put their input and output in. And then, as we move forward, we'll have partners who are building bespoke applications in verticals, and they will put them up on our website, so the customers can come in and download them. Everything is subscription software. >> Fantastic, I wish we had more time, but thanks so much for finding some time today to come by theCUBE, tell us what's new, and we look forward to seeing you on the show again very soon. >> I appreciate it, thank you very much. >> We want to thank you for watching theCUBE. Again, Lisa Martin with my co-host George Gilbert, we're live at our event, Big Data SV, in downtown San Jose, down the street from the Strata Data Conference. Stick around, George and I will be back after a short break with our next guest. (light electronic jingle)
Phu Hoang, DataTorrent Inc. | CUBEConversation
>> Narrator: From Palo Alto, California, it's CUBEConversations with John Furrier. >> Hello, welcome to our special CUBEConversation here in Palo Alto, California. I'm John Furrier, co-founder of SiliconANGLE Media and co-host of theCUBE. I'm here with Phu Hoang, who's the co-founder and chief strategy officer of DataTorrent. Great to see you again. Welcome back. >> Thank you so much, John. >> This CUBEConversation. So, you're now the chief strategy officer, which is code words for you were the CEO and you're the co-founder of the company. You bring in a pro, Guy Churchward, who we know very well, former EMC-er, real pro. It gives you a chance to kind of get down and dirty into the organization and get back to your roots and kind of look at the big picture. Great management team. Talk about what your background is, because I think I want to start there, because you have an interesting background. Former Yahoo executive, we've talked before. Take a minute to talk about your background. >> Yeah, sure. You know, I think I'm just one of those super lucky engineers. I got involved with Yahoo way early, in 1996. I think I was the fifth engineer or so. I stayed there for 12 years, ended up running close to 3,000 engineers, and had the chance to really experience the whole growth of the internet. We built out hundreds of sites worldwide, and the engineering team developed all of those websites throughout the world.
Doing Hadoop at large scale, honestly ... MapReduce written by Google, but the rest is, you guys were deploying a lot of that stuff. You had to deal with scale and write your own software for big data, before it was called big data. >> That's exactly right. It's interesting, because originally we thought that our job was really customer-facing website, and all of the data crunching and massaging that we would actually be able to use enterprise software to do that. Very quickly we learned at the pace of scale data that we were generating that we really couldn't use that software. We were kind of on our own, so we had to invent approaches to do that. The thing we knew a lot was commodity servers on racks. So, we ended up saying, "How do I solve this big data processing problem using that hardware?" It didn't happen overnight. It took many years of doing it right, doing it wrong, and fixing it. You start to iterate around how to do distributed processing across many hundreds of servers. >> It's interesting, Yahoo had the same situation. And ultimately Amazon ended up having, cause they were a pioneer. People dismissed Amazon web services. Like, "It's just hosting and bare metal on the cloud." Really what's interesting is that you guys were hardening and operationalizing big data. >> That's right. >> So, I got to ask you the question, cause this is more of a geeky computer science concept, but batch processing has been around since the mainframe, and that's become normal. Databases, et cetera, software. But now, over the past 8 years in particular, as big data and unstructured data has proliferated in massive scale, certainly now with internet of things you see it booming. This notion of real time data in motion. You have two paradigms out there, batch processing, which is well known and data in motion, which is essentially real time. Self-driving cars ... Evidence is everywhere, where this is going. Real time is not near real time. >> That's right. 
>> In nanoseconds, people want results. This is a fundamental data challenge. What's your thoughts on this and how does this relate to how big data will evolve for customers? >> I think you're exactly right. I think as big data came, and people were able to process data, and understand it better, and derive insights from it, very quickly for competitive reason, they find out that they want those insights sooner and sooner. They couldn't get it soon enough. So, you have those opposing trends of more and more data, yet at the same time, faster and faster insight. Where does that go? When you really come down to it, people don't really want to do batch processing. They do batch processing, cause that was the technology that they have. If they have their way, they don't want to just ... Information is coming into their business. Customers are interacting with their products constantly, 24 by 7. Those events, if you will, that are giving them insights are happening all the time. Except, for a long time, they store it into a file. They wait til midnight, and then they process it overnight. More and more there are now capabilities in memory distributed to do that processing as it comes in. That's one of the big motivations for forming DataTorrent. >> I want to get to DataTorrent in a minute, but I want to get some of these trends, cause I think they're important to kind of put together the two big pieces of the puzzle, if you will. One is, you mentioned batch processing in real time. The companies, historically, have built their infrastructure and their operations IT, and whatever, around that, how storage was procured and deployed. Now with IOT and the edge of the network becoming huge, it's a big deal. So, data in motion, it's pretty much well agreed upon amongst most of the smart people, this is a big issue. Let me throw a little wrench in the equation. Cloud computing kind of changes the security paradigm. 
There's no perimeter anymore, so there's no door you can secure, no firewall model. Once you get in, you're in. That's where we've seen a lot of attacks on ransomware and a lot of cyber attacks. The penetration is everywhere. Now there's APIs and everything. When you bring cloud into it, and you bring in the fact that you've got data in motion, what is the challenge for the customer? How do top architects get their arms around this? What's the solution? What's your vision on that? >> Well, I will start by saying it's a hard problem. I think you're absolutely right. I think we're still in the phase where the problems are very visible about how do you solve this. I think we're still, as an industry, figuring out how to solve it, cause you're right, the security issue ... Security is not this one point tool. It's an entire ecosystem process for doing that. The cloud opens up all of those opportunities for fraud and so on. It's still an ongoing challenge. I think the trend of memory becoming cheaper and cheaper, so that things are done more in memory and less in storage could actually help a bit on that. But overall, security internal, external processes are ... >> It's a moving train. >> Yeah, it's moving. Exactly. >> Let me ask you about the big other trend to throw on top of this. This is really kind of where you see a lot of the activity, although some will claim that the app store is not seeing as many apps now as they used to be. Certainly the enterprises, massive growth and application development. So, ready-made apps with DevOps and Cloud have built a whole culture of infrastructure as code, which is essentially saying that I'm going to build apps and make the infrastructure kind of invisible. You're seeing a lot of apps like that, called ready-made apps, however you want to call it. Those are the things. How are you guys at DataTorrent handling and supporting that trend? >> We are right smack in the middle of exactly that trend. 
One of the theses that we had was that big data's hard. Hadoop is hard. Hadoop is now 12 years old. Lots of people are using Hadoop, trying Hadoop, but it's still not something that is fully operationalized and easy for everybody. I think that part of that is big data's hard, distributed processing is hard, how to get all that working. There were two things we were focusing on. One was the real time thing. The other one was, how do we make this stuff a lot easier to use? So, we focus a lot on building tools on top of the open source engine, if you will, to kind of make it easy to use. The other one is exactly that, ready-made apps. As we continue to learn in working with our customers, and starting to see the patterns, we're putting, kind of, bigger functional blocks together, so that it's easier to kind of build these big data applications at this next layer. Machine learning, rule engines, what not. How do you piece that together in a way that is 80 percent done, so that the customer only has a little bit left, the last mile? >> So, you guys want to be the tooling for that? >> Yeah, I think so. I think you have to. This stuff, you have to kind of go through the whole six layers of what it takes to get the final business value out. You're not going to have the skillset to do it. The more we can abstract and get it to the top, the better. >> Every company's got their own DNA. Intel has Moore's Law. You're the co-founder of DataTorrent. What's the DNA of your company, as the founder? What do you try to instill into your employees and your culture that is the DNA that you want to be known for? >> Interesting. So, I started out sort of on the technical or product side. Actually, our DNA is all about ops. We think that, especially in big data, there's lots of ways to do prototypes and get some proof of concept going. Getting that to production, to run it 24 by 7, never lose data, that really has been hard. 
Our entire existence is around how to truly build 24 by 7, no-data-loss, fast applications. All of our engineers live and breathe how to do that well. >> Ops is consistent with stability. It's interesting, Silicon Valley's going through its own transformation around programmers and the role of entrepreneurship. It's interesting, in the enterprise, they always kind of were like, "Oh, no big deal." Because at the end of the day they need stuff to run at five nines. These are networks. The old saying that Mark Zuckerberg used to have is, "Move fast and break stuff." They've changed their tune to, "Move fast and be a hundred percent reliable." This is the trend that the enterprises will always put out there. How do companies stay and maintain that ops integrity and still be innovative without a lot of command and control, and compliance restrictions? How do they experiment with this data tsunami that's happening and maintain that integrity? >> My answer to that is, I think, as an industry, we have to build products and tools to allow for that. Some of that is processes inside a company, but I think a lot of that can be productized. The advances in that big data processing layer, and how to recover, get new containers, and do all the right things, allow for the application developer not to have to worry about many of those segments. I think the technology exists out there for tools to be developed to deal with a lot of that. >> I love talking with entrepreneurs, and you're the co-founder of DataTorrent. Talk about the journey you've been on from the beginning. You have a new CEO, which, as the founding CEO, lets you lighten the load up a little bit. It gets bigger, you've got to handle HR issues, things are happening. You're putting culture in place and trying to scale out and get into a groove. Certainly Uber could've taken a few tips from your playbook, as far as bringing in senior management. You did it at the right time. 
Talk about your journey, the company, and what people should know about DataTorrent. >> We're just a bunch of guys still trying to make a contribution to the industry. I think we saw an opportunity to really help people move towards big data, move towards real-time analytics, and really help them solve some really hairy problems that they have coming up with data. From a skillset perspective, personally, I think my particular strength was really about that initial vision. Being able to build out a set of capabilities, and maybe get a first set of half a dozen wins, and really prove the point. To make it into a machine that has all the right marketing tools, and business development tools, and so on, it was great to bring in someone like Guy, who has done that many, many times over, and has been super successful at that, to take us to the next level. >> Takes a lot of self-awareness, too. You probably had your moments where, should you stay on, be the CEO ... But what are you doing now? 'Cause you can get down and into the products. Are you doing a lot more product tinkering? Are you involved in the road map? What's your involvement day-to-day now? >> I love it, 'cause it's exactly what I enjoy most, which is really interacting with customers and users and continuing to hone in on the product-market fit. And continuing to understand, what are the pain points? What are the issues? And how can we solve them? All coming from not so much a services mentality, but a product mentality. >> And the cloud ops, too. That's a big area. So, what's the big problem that you solve for the customers? What's the big, hairy problem? >> Really easy: how to productize, how to operationalize this data pipeline that they have, so that they can truly be accepting the real live business data they're getting in, and it gives them the insight. >> Been a lot of talk about automation and AI lately. Obviously, it's a buzzword.
Wikibon just put out a report called True Private Cloud that shows automation is actually going after and replacing non-differentiated labor, which is really the racking and stacking of gear. Moving to higher-value work, there's actually going to be more employment on that side. Talk about the role of automation in the data world, because if you just think about the amount of data that companies like Facebook and Yahoo take in, you need machine learning. You need automation. What is the key to automation in a lot of these new emerging areas around large data sets? >> It's so funny, yesterday I was driving, and I was listening to a KQED segment, and they were talking about how, in its next phase, AI and machine learning is going to do sort of the first layer of all the reporting. So you actually have reporters doing much more sophisticated reporting, 'cause there's an AI layer that has a template of what are the questions to answer, and it can just spill out all the news for you. >> Paid by cryptocurrency. >> Yeah. I think machine learning and AI will be everywhere. We will continue to learn, and it will continue to get better at doing more and more things for us, so that we get to kind of play at that creative, disruptive layer, while it does all the menial tasks. I think it will touch every part of our civilization. The technology is getting incredible. The algorithms are incredible. The computing power to allow for that is growing exponentially. I think it's super interesting, and the engineers are super interested in it. Everything we do now revolves around it. When we talk about the analytics layer in real time, it's all about machine learning, scoring, rules, and all that. >> Great to have you here on the CUBEConversation. I'll give you the last word. Give a quick plug about DataTorrent. What should your customers know about you guys? Why should they call you? >> We're a company solely focused on bringing big data applications to production.
We focus on making sure that as long as you understand what you want to do with data, we can make it super fast, super reliable, super scalable. All that stuff. >> The co-founder of DataTorrent here on a CUBEConversation in Palo Alto. I'm John Furrier. Thanks for watching. (synth music)
Guy Churchward, DataTorrent | CUBEConversations
(upbeat electronic music) >> Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're having a CUBE Conversation in the Palo Alto studio, a little bit of a break from the crazy conference season, so we can have a little more intimate conversation without the madness of some of the shows. So we're really excited to have many-time CUBE alumnus Guy Churchward on. He's the president and CEO of DataTorrent. Guy, great to see you. >> Thank you, Jeff, 'preciate it. >> So how have you been surviving the crazy conference season? >> It's been crazy. This is very unusual. It's just calm and quiet and relaxed, and there's not people buzzing around, so it's different. >> So you've been at DataTorrent for a while now, so give us kind of the quick update, where you guys are, how things are moving along for you. >> Yeah, I mean, I kicked in about five months ago, so I think I'm just coming up to sort of five and a half, six months, so it's enough time to get my feet wet, understand whether I made a massive mistake or whether it's exciting. I'm still-- >> Jeff: Still here, you're wearing the T-shirt. >> Yeah, I'm pleased to say I'm still very excited about it. It's a great opportunity, and the space is just hot, hot. >> So you guys are involved in streaming data and streaming analytics, and you know, Hadoop was kind of the hot thing in big data, and really the focus has shifted now to streaming analytics. You guys are playing right in that space and have been for a while, but you're starting to make some changes and come at the problem from a slightly different twist. Give us an update on what you guys are up to. >> Yeah, I mean, so when I dropped into DataTorrent, obviously, it's real-time data analytics, based on stream processing or event processing.
So the idea is this: the traditional way of doing analytics, insight, and action is on data at rest, you know, sucking data into a data store and then poking at it on sort of a real-time analytics basis. And what the company decided to do, and again, this goes back to the founders, is to say, if you could take the insight and action piece and shift it left of the data store, in memory, and then literally garner the insight and action when an event happens, then that's obviously faster and quicker. And it was interesting, a client said to us recently that batch, or stream, or near real-time, or micro-batch, is sort of like real time for a person, 'cause a person can't think that fast. So the latency is a factor of that, but what we do is real time for a computer. So the idea here is that you literally have sub-second latency in response and actions and insight. But anyway, they built a toolkit, and they built a development platform, and it's completely extensible, and we've got a dozen customers on board, and they're in production, and people are running a billion events per second, so it's very cool. But there wasn't this repeatable business, and I think the deeper I got into it, you also look at it and you say, "Well, Hadoop isn't the easiest thing to deploy."
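The architectural shift being described, acting on each event in memory to the left of the data store, can be sketched in a few lines of Python. This is an illustrative toy, not DataTorrent or Apache Apex code; the function and field names are invented for the example.

```python
def process_event(event, state):
    """Act on the event as it arrives, before any data-store write."""
    state["total"] = state.get("total", 0) + event["amount"]
    if event["amount"] > 1000:        # insight and action at event time
        return ("alert", event["id"])
    return ("ok", event["id"])

state = {}
events = [{"id": 1, "amount": 250}, {"id": 2, "amount": 5000}]
results = [process_event(e, state) for e in events]
# the alert on event 2 fires inline, with no store-then-query round trip
```

The point of the sketch is only ordering: the action happens while the event is in flight, and persisting it for later analysis becomes optional rather than a prerequisite.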
So we thought, well, the thing to do is to really bring it to the masses. But again, if you bring a toolkit down, you're really saying, here's a toolkit and an opportunity, now go build the applications and see what you can do. What we figured is actually what you want to do is to say, no, let's just see if we can take Hadoop, and the complexity of it, out of the picture, and actually provide an end-to-end application. So we looked at each of the customers' current deployments and then figured out, can we actually industrialize that pipeline? In other words, take the open source components, ruggedize them, scale them, make sure that they stay up and they're fault-tolerant, 7x24, and then provide them as an application. So we're actually shifting our focus, I think, from just what's called the Apex platform, the stream-based processing platform, to an application factory, actually producing end-to-end applications.
So I said, "Well, but that's not 60 milliseconds." And he goes, "No, no, it is." And I said, "What are you measuring?" He goes, "Well, the customer basically puts an inquiry onto the data store." And so literally, what he's doing is a real-time query against stale data that's sitting inside of a data lake. But he swore blind. >> But it's fast though, right? >> And that's the thing, he's looking at it and saying, "Hey, well, I can get a really quick response." Well, I can as well. I mean, I can look at Google Earth and I can look at my house, and I can find out that my house is not real-time. And that's really what it was. So you then say to yourself, well, look, the whole security market is based around this technology. It's classic ETL, and it's basically get the data, suck it in, park it into a data store, and then poke at it. >> Right >> But that means that that latency, by just the sheer fact that you're taking the data in and you're normalizing it and dropping it into a data store, your latency's already out there. And so one of the applications that we looked at is around fraud, and specifically payment fraud and credit card fraud. And everything out there in the market today is basically detection, because of the latency. If you kind of think about it: credit card swipe, the transaction's happened, they catch the first one, they look at it and say, "Well, that's a bit weird." If another one of these comes up, then we know we've got fraud. Well, of course, what happens is they suck the data in, it sits inside a data store, they poke the data a little bit later, and they figure out, actually, it is fraud. But the second action has happened. So they detected fraud, but they couldn't prevent it, so everything out there is payment fraud detection, not prevention, because it's basically got that latency. So what we've done is we've said to ourselves, "No, we actually can prevent it."
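The distinction in that anecdote, a fast query against stored data versus true event-to-action latency, can be made concrete with a small timing sketch. The delays and data are illustrative stand-ins, not a real pipeline:

```python
import time

def run_pipeline(ingest_delay):
    """Return (query_latency, event_to_action) for a store-then-query design."""
    event_time = time.monotonic()        # the card swipe happens
    time.sleep(ingest_delay)             # stand-in for ETL into the data store
    query_start = time.monotonic()
    _ = {"card": "1234", "status": "?"}  # stand-in for a fast store lookup
    query_latency = time.monotonic() - query_start
    event_to_action = time.monotonic() - event_time
    return query_latency, event_to_action

q, e = run_pipeline(0.05)
# q is tiny (just the lookup); e includes the whole ingest delay --
# measuring only the query understates the real pipeline latency
```

In other words, the "60 milliseconds" the customer quoted was q, while the figure that matters for acting on an event is e.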
Because if you can move the insight and actions to the left-hand side of the data store, then as the event is happening, you literally can grab that card swipe and say, no, no, no, you don't do it anymore, you prevent it. So it's literally taking that whole market, in essence, from detection to prevention. And it's kind of fascinating, because there are other angles to this. There's a marketplace inside the credit card side that talks about card-not-present, and there's a thing called OmniChannel, and OmniChannel's interesting, 'cause most retailers have gone out there with their bricks-and-mortar infrastructure and architecture and data centers, and they've gone and acquired an online company. And so now they have these two different architectures, and if you imagine you've got to hop between the two, it kind of has gaps. And so the fraudsters will exploit OmniChannel because there's multiple different architectures around, right? So if you think about it, there's one side of saying, hey, if we can prevent that, so taking in a huge amount of data, having it talk, having a life cycle around it, and literally being able to detect and then prevent fraud before the fraudsters can actually figure out what to do, that's fantastic. And then on the plus side, you could take that same pipeline and that same application, and you can actually provide it to the retailers and say, well, what you'd want to do is things like, again, I wrote another blog on it, loyalty brand. You know, on the retail side, for instance, my wife, we shop like crazy, everybody does. I try not to, but let's say she's been on the Nordstrom site, and we've got a Nordstrom nearby. So Nordstrom has a cookie on their system and they can figure out what she's been doing. And she's surfing around, and she finds a dress she kind of likes, but she doesn't buy it because she doesn't want to spend the money. Now, I'm in Nordstrom's about four weeks later, and I'm literally buying a pair of socks.
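The prevention-versus-detection point can be sketched as an inline rule that runs on each swipe event and can veto the transaction before it completes. This is a toy illustration; the rule and field names are invented, not any real fraud model or DataTorrent API:

```python
def authorize(swipe, recent):
    """Decide at event time; a declined swipe never completes."""
    for prior in recent.get(swipe["card"], []):
        # toy rule: two swipes in different countries within 60 seconds
        if swipe["ts"] - prior["ts"] < 60 and prior["country"] != swipe["country"]:
            return "DECLINE"
    recent.setdefault(swipe["card"], []).append(swipe)
    return "APPROVE"

recent = {}
first = authorize({"card": "X", "ts": 0, "country": "US"}, recent)
second = authorize({"card": "X", "ts": 30, "country": "BR"}, recent)
# first == "APPROVE", second == "DECLINE": the fraudulent swipe is blocked
# inline, rather than flagged later from a data store
```

A store-then-query design would run the same rule minutes later, after the second swipe had already gone through, which is exactly the detection-only behavior being described.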
A card swipe happens, and because you've got this OmniChannel and you can connect the two, what they want to do is be able to turn around and say, "Oh, Guy, before we run this credit card, we noticed that your wife was looking at this dress. We know her birthday's coming up. And by the way, we've checked our store, and we've got it in the color and the size she wants, and if you want, we'll put it on the credit card." >> Don't tell her that, she already bought too much. She won't want you to get that dress. Nah, it's a great, it's a really interesting example, right? >> But it is that, and if you kind of think about it, this is where, when they say every second counts, it's like every millisecond counts. And so it really is machine-to-machine, real-time, and that's what we're providing. >> Well, that's the interesting thing, you know, a couple things just jump to mind as you're talking. One is, by going the application route, you're reducing the overhead for just pure talent that we keep hearing about. It's such a shortage in some of these big data applications, Hadoop specifically. So now you're delivering a bunch of that already packaged, to a degree, in an application, is that accurate? >> Yeah, I mean, I kind of look at the engineering talent inside an organization as like a triangle. At the very top, you have talented engineers that basically can hard-code, and that's really where our technology has sat traditionally. So we go to a large organization, and they have a hundred people dedicated to this sport. The challenge is, it means the small organizations who don't have that can't take advantage. And then at the base end, you have technologies like Tableau, you know, a GUI that can be used by an IT guy. And in the middle you've got this massive swath of engineering talent that literally isn't going to hard-code the analytics stuff and really can't do the Hadoop cluster.
But they want to basically get dangerous on this technology, and if you can take your top talent, bring that into that center, and then provide it at a cost economics that makes sense, then you're away. And that's really what we've seen. So our client base is going to go from the tens, twenties, and fifties into the thousands as you bring it down, and that's really, if you think about it, where Splunk kind of got their roots. Which is really: get an application, allow people to use it, execute against it, and then build that base up. >> It's ironic that you bring up Splunk, 'cause George Gilbert, one of our Wikibon analysts, loves to say that Splunk is the best imitation of Hadoop that was ever created. He thinks of it really as a Hadoop application as opposed to Splunk, because they're super successful. They found a great application. They've been doing a terrific job. But the other piece that you brought up that triggered my mind was really the machine-to-machine. And real-time is always an interesting topic. What is real time? I always think real time means in time to do something about it. That can be a wide spectrum depending on what you're actually doing. And the machine-to-machine aspect is really important because machines operate at a completely different level of speed. And time is very different for a machine-to-machine interaction than for trying to provide some insight to a human so they can start to make a decision. >> Yeah, I mean, you know, it was, again, one of those moments through the last five months as I was looking at it. There's a very popular technology in our space called Spark, Apache Spark. And it's successful and it's great in batch, and it's got micro-batch, and there's actually a thing called Spark Streaming, which is micro-batch. But in essence, it's about a second of latency, and so you look at it and you go, but what's in a second? You know what I mean? I mean, surely that's good enough.
And absolutely, it's good enough for some stuff. But, I mean, we joke about it with things like autonomous cars. If you have adaptive cruise control, you don't want that to run on batch, because that second is the difference between you slamming into a truck or not. If you have DHL doing drone deliveries to you, and you're actually measuring weather patterns against it, and correlating where you're going to fly, and how high and where, there's no way that you're going to run on a batch process. Batch is just so slow in comparison. We actually built an application, and it's a demo up on our web site. It's a live app, and when I sat down with the engineering team, I said, "Look, I need people to understand what real real-time does and the benefits of it." And what it's simply doing is shifting the analytics and actions from the right-hand side of where the data store is to the left-hand side. So you take out all of the latency of parking the data and then going to find the data. And what we did is we said, look, I want to do this really fairly. When you were a kid, there used to be games like Snap, you know, where you would turn the cards over and you'd go, snap, it's mine. So we just looked and said, "Okay, why don't we do something like that?" It's like fishing, you know, tickling fish: whoever sees the first fish, you grab it, it's yours. So we created an application that basically creates random numbers at a very, very huge speed, and we have three processes running, and whichever one sees a number first puts its hand up and says, "I got that." And if somebody else says, "I've got that," but they see a timestamp on the other one, they can't claim it. One wins, and the other two lose.
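The structural reason a per-event stream engine wins a race like that demo is simple: under micro-batching, an event waits for its batch interval to close before it can be processed, while a per-event engine handles it immediately. A back-of-the-envelope sketch, with illustrative numbers rather than benchmarks of any real engine:

```python
BATCH_INTERVAL = 1.0  # seconds, e.g. a one-second micro-batch

def microbatch_wait(offset_in_batch):
    """Seconds an event waits for its batch to close before processing."""
    return BATCH_INTERVAL - offset_in_batch

def streaming_wait(offset_in_batch):
    """A per-event engine starts processing immediately."""
    return 0.0

# an event arriving 0.2s into the interval waits ~0.8s under micro-batch,
# and not at all under per-event streaming
```

This is why the stream-mode process in the demo claims every number first: its latency floor is per-event processing time, while the micro-batch processes carry the batch interval as a built-in handicap.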
And I did it, and we optimized, basically, the Apache Apex code, which is ours, in stream mode; Apache Apex, believe it or not, in a micro-batch mode; and Spark Streaming, as fast as they can go, and we literally engineered the hell out of them to get them as fast as possible. And if you look at the results, it literally is a win every time for stream, and a loss every time for the other two. Now, from a speed perspective, the reality is, like I said, if I'm showing a dashboard to you, by the time you blink, all three have gotten you the data. It's immaterial, and this isn't knocking Spark. Our largest deployments all run on what we call, like, a cask-type architecture, which is basically Kafka, Apex, Spark. And we see this with Hadoop, it's always in there. So it's kind of this cache thing. So we like it for what it is, but where customers come unglued is where they try and force-fit a technology into the wrong space. And so again, you mentioned Splunk, these sort of waves of innovation. We find every client sitting there, going, "I want to get insight quicker." The amount of meetings that we're all in, where you sit there and go, "If I'd only known that then, I would've made a decision." And, you know, in the good old days, we worked on at-rest data. At-rest was really the kingdom of Splunk. If you think about it, we're now in the tail end of batch, which is really where Spark's done well. So Splunk and Spark are kind of there, and now you're into this real-time. So again, it's running at a fair pace, but the learning that we've had over the last few months is: toolkits are great, platforms are great, but to bring this out into mass adoption, you really need to make sure that you've provided a hardened application. So we see ourselves now as, you know, a real-time big data applications company, not just Apache Apex.
>> And when you look at the application space that you're going to attack, do you look at it kind of vertically, do you look at it functionally? You mentioned fraud as one of the earlier ones. How are you organizing yourself around the application space? >> Yeah, and so, the best way for me to describe it, and I want to spin it in a better way than this, but I'll tell you exactly how we've done it, which is: I've looked at what the customers currently have, and we have deployments in about a dozen big customers, and they're all different use cases. And then I've looked at it and said, what you really want to do is go to a market where people have a current problem, and also a vertical where they're prepared to pay for something, solving a problem where, if they give you money, they either make money quickly or they save money quickly. So it's actually-- >> So simple. (laughs) >> It would be much better if I said it in a pure way and I made some magical thing up, but in reality I'm looking and going, you've got to go where the hardest problems are. And right now, a few things like card-not-present, roaming abuse, and OmniChannel payment fraud, everybody is looking for something. Now, the challenge is the market's noisy there, and so what happens is everybody's saying, "But I've got it." >> That's what strikes me about the fraud thing: you would think that that's a pretty sophisticated marketplace in which to compete. So you clearly have to have an advantage to even get a meeting, I would imagine. >> Yeah, and again, we've tested the market. The market's pretty hard on the back of it. We've got an application coming out shortly, and we're actually doing design partnerships with a couple of big banks. But we don't want to be seen as just a fraud prevention company. (chuckles) I'll stay with fraud, myself.
But you kind of look and you say, look, there'll be a set of fraud applications, because there are only about half a dozen to be done; retail, like I mentioned, on things like the loyalty brand stuff. We have a number of companies that are using us for ad tech. So again, I can't mention the names. Actually, we've just published one: PubMatic is one of the ad tech organizations that's using our products. But we'll literally come out and harden that pipeline as well. So we're going to strut along, but instead of just saying, "Hey, we've solved absolutely everything," what I want to do is solve a problem for someone and then just move forward. You know, most of our customers have somewhere between three and five different applications that are up and running and in production. So once the platform's in, you know, then they see the value of it. But we really want to make sure that we're closer to the end result and to an outcome, because that's the du jour way that customers want to buy things now. >> Well, and they always have, right? Like you said, they've got a burning issue. You've either got to make money or save money. And if it's not a burning issue, it falls to the bottom of the pile, 'cause there's something else that's burning that they need to fix quickly. >> And the other thing, Jeff, is, and again, it's dirty laundry, but if you think about it, I go to an account and the account's got a fraud solution, and it's all right, but it's not doing what they want. But we come along with a platform and say, "We can do absolutely anything." And then they go, "Well, I've got this really difficult problem that no one's solved for me, but I'm not even sure if I've got a budget for it. Let's spend two years messing around with it." And that's no good, you know?
From a small company, you really want that traction event, so my thing is to just say, "No, what we want to do is go talk to John about John's problem," and say, "I can solve it better than the current one." And there is nothing in the market today, on the payment fraud side, that will provide prevention. It is all detection. So there's a unique value. The question is whether we can cut through the noise. >> All right, well, we look forward to watching the progress, and we'll check in again in five months or so. >> Thank you, Jeff, 'preciate it. >> Guy Churchward, he's from DataTorrent, president and CEO. Took over about five months ago and kind of changed the course a little bit. Exciting to watch, thanks for stopping by. >> Guy: Thank you. >> All right, Jeff Frick, you're watching theCUBE. See you next time. Thanks for watching. (upbeat electronic music)
Nathan Trueblood, DataTorrent | CUBEConversations
(techno music) >> Hey, welcome back everybody, Jeff Frick here with The CUBE. We're having a CUBE Conversation in the Palo Alto studio. It's a different kind of CUBE format, not in the context of a big show. Got a great guest here lined up who we just had on at a show recently. He's Nathan Trueblood, the vice president of product management for DataTorrent. Nathan, great to see you. >> Thanks for having me. >> We just had you on The CUBE at Hadoop, or DataWorks now, >> That's right. >> not Hadoop Summit anymore. So just a quick follow-up on that, we were just talking before we turned the cameras on. You said that was a pretty good show for you guys. >> Yeah, it was a really great show. In fact, as a software company, one of the things you really want to see at shows is a lot of customer flow and a lot of good customer discussions, and that's definitely what happened at DataWorks. It was also really good validation for us that everyone was coming and talking to us about, what can you do from a real-time analytics perspective? So that was also a good, strong signal that we're onto something in this marketplace. >> It's interesting, I heard your quote from somewhere that really the streaming, the real-time streaming in the big data space, is grabbing all the attention. Obviously we do Spark Summit. We did Flink Forward. So we're seeing more and more activity around streaming, and it's so logical, now that we have the compute horsepower, the storage horsepower, the networking horsepower, to enable something that we couldn't do very effectively before. Now it's opening up a whole different way to look at data. >> Yeah, it really is, and I think as someone who's been working in the tech world for a while, I'm always looking for simplifying ways to explain what this means. 'Cause people say streaming and real time and all of that stuff.
For us, what it really comes down to is this: the faster I can make a decision, or the closer to when something happens that I can make a decision, the greater my competitive advantage. And if you look at the whole big data evolution, it's always been moving towards, how quickly can we analyze this data so that we can respond to what it's telling us? And in many ways that means being more responsive to my customer. A lot of this came, of course, originally from very large-scale systems at some of the big internet companies like Yahoo, where Hadoop was born. But really it all comes down to: if I'm more responsive to my customer, I'm more competitive and I win. And I think what a lot of customers are saying across many different verticals is that real time means more responsiveness, and that means competitive advantage. >> Right, and we hear all the time about moving into a predictive model, and then even to a prescriptive model, where you're offloading a lot of the grunt work of the decision making, letting the machine do a lot more of that. So really it's the higher-value stuff that finally gets to the human at the end of the interaction who's got to make a judgment. >> That's exactly right, that's right. And so to me, all the buzz about streaming really represents the next evolution of where big data architecture has been going, which is moving away from a batch-oriented world into something where we're making decisions as close to the time of data creation as possible. >> So you've been involved not only in tech for a long time, but in Hadoop specifically and Big Data specifically. And one of the knocks, I remember the first time I ever heard about Hadoop, it was actually from Bill Schmarzo at EMC, the dean of Big Data. And I was talking to a friend about it and he goes, yeah, but what Bill didn't tell you: there's not enough people.
You know, Hadoop's got all this great promise, but there just aren't enough people for all the enterprises, at the individual company level, to implement this stuff. That's a huge part of the problem. And now you're at DataTorrent and, as we talked before, there's an interesting shift in strategy, going to really an application-focused strategy as opposed to more of a platform-focused strategy, so that you can help people at companies solve problems faster. >> That's right, we've definitely focused, especially recently, on more of an application strategy. But to peel that back a little bit: you need a platform, with all the capabilities that a platform has, to be able to deliver large-scale, operable streaming analytics. But customers aren't looking for platforms; they're looking for, please solve my business problem, give me that competitive advantage. I think it's a long-standing problem in technology, and particularly in Big Data, where you build a tremendous platform but there's only a handful of people who know how to actually construct the applications to deliver that value. And I think increasingly in big data, but also across all of tech, customers are looking for outcomes now, and the way for us to deliver outcomes is to deliver applications that run on our platform. So we've built a tremendous platform, and now we are working with customers and delivering applications for that platform, so that it takes a lot of the complexity out of the equation for them. And we kind of think of it like this: if in the past it required sort of an architect-level person to construct an application on our platform, now we're gearing towards a much larger segment of developers in the enterprise who are tremendously capable but don't have the deep Big Data experience they would need to build an application from scratch.
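The platform-versus-application split Nathan describes can be made concrete with a toy sketch. This is illustrative Python only, not DataTorrent's or Apache Apex's actual API: the "platform" is the generic machinery that pushes each event through a chain of operators, and an "application" is a prebuilt operator chain where the developer supplies only the business-specific configuration.

```python
# Toy sketch of the platform-versus-application split (illustrative only;
# not DataTorrent's or Apache Apex's real API).

# The "platform" part: generic machinery that pushes each event through a
# chain of operators as the event arrives.
def run_pipeline(events, operators):
    results = []
    for event in events:
        for op in operators:
            event = op(event)
            if event is None:      # an operator may drop the event
                break
        else:
            results.append(event)
    return results

# The "application" part: a prepackaged chain of operators, where the
# developer only supplies configuration (here, an alert threshold).
def build_alerting_app(threshold):
    parse = lambda raw: {"sensor": raw[0], "value": raw[1]}
    keep_hot = lambda ev: ev if ev["value"] > threshold else None
    tag = lambda ev: {**ev, "alert": True}
    return [parse, keep_hot, tag]

alerts = run_pipeline([("s1", 10), ("s2", 99), ("s3", 42)],
                      build_alerting_app(threshold=40))
```

Here the platform code never changes; a different "application" is simply a different operator chain handed to the same runner, which is the simplification being sold to developers who lack deep big data experience.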
>> And it's pretty interesting too, 'cause another theme we see over and over, especially around innovation, is the democratization of access to the data, and the democratization of the tools to access the data, so that anyone in the company, or a much greater set of individuals inside the company, has the opportunity to have a hypothesis, to explore the hypothesis, to come back with solutions. And so by removing this ivory tower, whether it's the data scientist or the super-smart engineer who's the only one with the capability to play with the data and the tools, that's really how you open up innovation: democratizing access and the ability to test and try things. >> That's right. To me, I look at it very simply: when you have large-scale adoption of a technology, usually it comes down to simplifying abstractions of one kind or another. And the big simplifying abstraction of Big Data is providing the ability to break up a huge amount of data and make some sense of it, using, of course, large-scale distributed computing. The abstraction we're delivering at DataTorrent now is building on all that stuff, on all those layers; we've obscured all of that, and now you can download with our software an application that produces an outcome. So for example, one of the applications we're shipping shortly is an Omni-Channel credit card fraud prevention application. Now, our customers in the past have already constructed applications like this on our platform. But now what we're doing, like you said, is democratizing access to those kinds of applications by providing an application that works out of the box. And that's a simplifying abstraction. Now, truthfully, there's still a lot of complexity in there, but we are providing the pattern, the foundational application, that the customer can then focus on customizing to their particular situation, their integrations, their fraud rules and so forth.
And so that just means getting you closer to that outcome much more quickly. >> Watching your video from DataWorks, one of the interesting topics you brought up is really speed, and how faster, better, cheaper, which is innovative for a little while, becomes the new norm. And as soon as you reset the bar on speed, then they just ask, well, can you go faster? So whether you went from a week to a day, or a day to an hour, there's just this relentless pressure to be able to get the data, analyze the data, make a decision, faster and faster and faster. And you've seen this just changing by leap years over time. >> Right, and I literally started my career in the days of ETL, extracting data from tape, data produced weeks or months ago, down to now, where we're analyzing data at volumes that were inconceivable and producing insight in less than a second, which is kind of mind-boggling. And I think the interesting thing that's happening when we think about speed, and I've had a few discussions with other folks about this, is they say, well, speed really only matters for some very esoteric applications. It's one of the things that people bring up. But no one has ever said, well, I wish my data was less fresh or my insight was not as current. And so when you start to look at the kinds of customers that want to bring real-time data processing and analytics, it turns out that nearly every vertical we look at has a whole host of applications where, if you could bring real-time analytics, you could be more responsive to what your customer's doing. >> Right, right. >> Right, and that can be, certainly that's the case in retail, but we see it in industrial automation and IoT. The way I think of IoT is as a way to sense what's going on in the world, bring that data in, get insight, and take action from it. And so real-time analytics is a huge part of that, which, you know, again, healthcare, insurance, banking, all these different places have use cases.
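The batch-to-streaming shift Nathan describes, insight in under a second instead of only after the whole batch lands, can be shown with a deliberately simplified sketch. This is hypothetical toy code, not a real engine; real systems add windowing, checkpointing, and distribution on top of the same idea.

```python
# Deliberately simplified contrast between batch and streaming handling of
# the same events (illustrative only; not a real streaming engine).
events = [("swipe", 1), ("swipe", 2), ("swipe", 3)]  # (kind, sequence number)

def batch_insight(evts):
    # Batch: nothing is known until the whole batch has landed and been
    # processed, so the first insight arrives only after the last event.
    return {"count": len(evts), "available_after_event": len(evts)}

def streaming_insights(evts):
    # Streaming: the running insight is updated as each event is created,
    # so a decision can already be taken after event 1.
    count, out = 0, []
    for _, seq in evts:
        count += 1
        out.append({"count": count, "available_after_event": seq})
    return out

batch = batch_insight(events)
stream = streaming_insights(events)
```

The point of the contrast is latency relative to data creation: both paths compute the same count, but only the streaming path has an answer available while the events are still arriving.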
And so what we're aiming to do at DataTorrent is make it easy for the businesses in those different verticals to really get the outcome they're looking for; not produce a platform and say, imagine what you could do, but produce an application that actually delivers on a particular problem they have. >> It's funny too, the speed equation. You saw it in Flash, to shift gears a little bit into the hardware space: people said, well, super low latency, super high volume transactions, financial services, that's the only benefit we're going to get from Flash. >> Right, yeah, we've had the same knock for real-time analytics. >> Same thing, right? But as soon as you put it in, there's all these second-order impacts, third-order impacts that nobody ever thought of, that that speed delivers, that aren't directly tied to that transactional speed, but that now enable you, because of that transactional speed, to do so many other things that you couldn't even imagine doing. And so that's why I think we see this pervasiveness of Flash. Why wouldn't you want Flash? I mean, why wouldn't you want to go faster? 'Cause there's so much upside. >> Yeah, so again, all of these innovations in IT come down to, how can I be more flexible and more responsive to changing conditions? More responsive to my customer, more flexible when it comes to changing business conditions and so forth. And so now, as we start to instrument the world and have technologies like machine learning and artificial intelligence, that all needs to be fed by data that is delivered as quickly as possible, so that it can be analyzed to make decisions in real time. >> So I wanted to shift gears a little bit, kind of back to the application strategies. So you said you had the first app that's going to be, (Jeff drowned out by Nathan) >> Yeah, so the first application, yes, it was fraud prevention. That's an important distinction, because the difference between detection and prevention is the competitive advantage of real time. What we deliver in DataTorrent is the ability to process massive amounts of data in very, very low time frames. Sub-second time frames. And so that's the kind of fundamental capability you need in order to do something like respond to some kind of fraud event. And what we see in the market is that fraud is becoming a greater and greater problem. The market itself is expanding. But I think, as we see, fraud is also evolving in terms of the ways it can take place, across e-commerce and point of sale and so forth. And so merchants and processors and everyone in the whole spectrum of that market are facing a massive problem, and an evolving problem. And so that's where we're focused. In one of our first, I would say, vertically oriented business applications, it's really easy to take in new sources of data with our application, but also to process all that data and then run it through a decision engine to decide if something is fraudulent or not, in a short period of time. So you need to be able to take in all that data to be able to make a good decision. And you need to be able to decide quickly, if it's going to matter. And you also need to have a really strong model for making decisions, so that you avoid things like false positives, which are as big a problem as the fraud you're preventing if you deliver a bad customer experience. And we've all had that experience as well, which is, your card gets shut down for what you think is a legitimate activity. >> It's just so ironic that false positives are the biggest problem with credit card fraud. >> Yeah, it's one of... yeah. >> You would think we would be thankful for a false positive, but all you hear, over and over, is about that false positive and the customer experience. The very thing that shows we're so good at it is the thing that really irks people.
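The decision-engine flow Nathan outlines, take in the signals, score the transaction quickly, and tune the engine so false positives don't wreck the customer experience, might look like this in miniature. The rules, weights, and field names here are hypothetical illustrations, not DataTorrent's shipped application.

```python
# Miniature fraud-scoring sketch (hypothetical rules, weights, and fields;
# not the real DataTorrent application). Each rule that fires adds to the
# score; the threshold trades fraud caught against false positives that
# lock out legitimate customers.
RULES = [
    (lambda t: t["amount"] > 5000, 0.5),          # unusually large amount
    (lambda t: t["country"] != t["home"], 0.3),   # out-of-home-country use
    (lambda t: t["merchant"] == "unknown", 0.4),  # unrecognized merchant
]

def decide(txn, threshold=0.7):
    score = sum(weight for rule, weight in RULES if rule(txn))
    return "block" if score >= threshold else "allow"

routine = decide({"amount": 120, "country": "US", "home": "US",
                  "merchant": "grocer"})
risky = decide({"amount": 9000, "country": "FR", "home": "US",
                "merchant": "unknown"})
```

Raising the threshold means fewer false positives but more missed fraud; lowering it does the reverse. That tuning knob is exactly the prevention-versus-customer-experience trade-off being described.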
>> Well, if you think about that, having an application that allows you to make better decisions more quickly, and prevent those false positives, and take care of fraud, is a huge competitive advantage for all the different players in that industry. And it's not just for the credit card companies, of course; it's for the whole spectrum of people, from the merchant all the way to the bank, that are trying to deal with this problem. And so that's why it's one of the applications we think of as a key example where we see a lot of opportunity. And certainly, people that are looking at credit card fraud have been thinking about this problem for a while. But there's the complexity, like we were discussing earlier, of finding the talent to be able to deliver these kinds of applications, and finding the technology that can actually scale to the processing volume. And so by delivering Omni-Channel fraud prevention as a Big Data application, that just puts our customers so much closer to the outcome that they want. And it makes it a lot easier to adopt. >> So let's shift gears a little bit. With your VP of Product hat on, and a huge, wide world of opportunity in front of you, we talked about IoT a little bit, obviously fraud, you've talked about Omni-Channel retail. How are you guys going to figure out where you want to go next? How are you prioritizing the world? And as you build up more of these applications, is it going to be vertically focused, horizontally focused? What are your thoughts as you start down the application journey? >> So, a few thoughts on that. Certainly one of the key indicators for me as a product manager, when I look at where to go next and what applications we should build next, comes down to, what signal are the customers giving us? As we mentioned earlier, we built a platform for real-time analytics and decision making, and one of the things that we see is broad adoption across a lot of different verticals.
So I mentioned industrial IoT and financial services fraud prevention and advertising technology, and, and, and. We have a company that we're working with in GPS geofencing. So the possibilities are pretty interesting. But when it comes to prioritizing those different applications, we also have to look at the economics involved, for the customer and for us. So certainly one of the reasons we chose fraud prevention is that the economics are pretty obvious for our customers. Some of these other things are going to take a little bit longer for the economics to show up when it comes to the applications. So you'll certainly see us focusing on vertically oriented business applications, because, again, the horizontals tend to be more like a platform, and that's not close enough to delivering an outcome for a customer. But it's worth noting, one of the things we see is that while we will deliver vertically oriented applications, oftentimes switching from one vertical app to another is really not a lot more than changing the kind of data we're analyzing and changing the decision engine. But the fundamental idea of processing data in a pipeline at very high volume, with fault tolerance and low latency, remains the same in every case. So we see a lot of opportunity, essentially, as we solve an application in one vertical, to re-skin it for another. >> So you could say you're tweaking the dials and tweaking the UI. >> Tweaking the data and the rules that you apply to that data. So if you think about Omni-Channel fraud prevention, well, it's not that big of a leap to look at healthcare fraud, or to look at all the other kinds of fraud in different verticals that you might see. >> Do you ever see that you'll potentially break out the algorithm? I forget which term we're at, but people are talking about algorithms as a service. Or is that too much? Does there need to be a little bit more packaging?
>> No, I mean, I think there will be cases where we will have an algorithm out of the box that provides some basics for the decision support. But as we see a huge market springing up around AI and machine learning and machine scoring and all of that, there's a whole industry growing up around, essentially, we provide you the best way to deliver that algorithm, or that decision engine, that you train on your data and so forth. So that's certainly an area where we're looking from a partnership perspective. We already today partner with some of the AI vendors for what I would say are some custom applications that customers have deployed. But you'll see more of that in our applications coming up in the future. As far as algorithms as a service, I think that's already here, in the form of being able to query against some kind of AI with a question, you know, essentially a model, and then getting an answer back. >> Right. Well, Nathan, exciting times, and your Big Data journey continues. >> It certainly does, thanks a lot Jeff. >> Thanks Nathan Trueblood from DataTorrent. I'm Jeff Frick, you're watching The CUBE, we'll see you next time, thanks for watching. (techno music)
SUMMARY :
Jeff Frick sits down with Nathan Trueblood of DataTorrent for a CUBE Conversation in the Palo Alto studio, following a strong showing at DataWorks. Trueblood frames streaming as the natural progression of big data: the closer a decision is made to the time of data creation, the greater the competitive advantage. He describes DataTorrent's shift from selling a platform to delivering outcome-focused applications, beginning with an Omni-Channel credit card fraud prevention application that stresses prevention over detection while guarding against the false positives that damage customer experience. Looking ahead, he explains that vertically oriented applications will be prioritized by customer economics, that the underlying high-volume, fault-tolerant pipeline can be reused across verticals by changing the data and the decision engine, and that partnerships with AI vendors will supply decision engines for future applications.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff Frick | PERSON | 0.99+ |
Bill Schmarzo | PERSON | 0.99+ |
Jeff | PERSON | 0.99+ |
Nathan Trueblood | PERSON | 0.99+ |
Nathan | PERSON | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
EMC | ORGANIZATION | 0.99+ |
a week | QUANTITY | 0.99+ |
Bill | PERSON | 0.99+ |
DataTorrent | ORGANIZATION | 0.99+ |
first app | QUANTITY | 0.99+ |
Data Works | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
first application | QUANTITY | 0.99+ |
a day | QUANTITY | 0.99+ |
less than a second | QUANTITY | 0.99+ |
second order | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
Hadoop | ORGANIZATION | 0.97+ |
today | DATE | 0.97+ |
third order | QUANTITY | 0.97+ |
an hour | QUANTITY | 0.97+ |
Big Data | ORGANIZATION | 0.96+ |
first | QUANTITY | 0.96+ |
first time | QUANTITY | 0.95+ |
Flash | TITLE | 0.94+ |
Hadoop | PERSON | 0.92+ |
Hadoop | TITLE | 0.91+ |
weeks | DATE | 0.85+ |
one vertical | QUANTITY | 0.83+ |
Hadoop Summit | EVENT | 0.81+ |
The CUBE | ORGANIZATION | 0.79+ |
one of the applications | QUANTITY | 0.77+ |
Flink | ORGANIZATION | 0.72+ |
Omni-Channel | ORGANIZATION | 0.72+ |
UDI | ORGANIZATION | 0.7+ |
Summit | EVENT | 0.66+ |
CUBE | ORGANIZATION | 0.57+ |
CUBEConversations | ORGANIZATION | 0.47+ |
Spark | ORGANIZATION | 0.46+ |
months | QUANTITY | 0.43+ |
Jeff Bettencourt, DataTorrent & Nathan Trueblood, DataTorrent - DataWorks Summit 2017
>> Narrator: Live from San Jose, in the heart of Silicon Valley, it's The Cube, covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Welcome back to The Cube. We are live on day two of the DataWorks Summit, from the heart of Silicon Valley. I am Lisa Martin, my co-host is George Gilbert. We're very excited to be joined by our next guests from DataTorrent. We've got Nathan Trueblood, VP of Product. Hey, Nathan. >> Hi. >> Lisa: And the man who gave me my start in high tech 12 years ago, the SVP of Marketing, Jeff Bettencourt. Welcome, Jeff. >> Hi, Lisa, good to see ya. >> Lisa: Great to see you, too. So, SVP of Marketing, tell us: who is DataTorrent, what do you guys do, and what are you doing in the big data space? >> Jeff: So, DataTorrent is all about real-time streaming. It's really a different paradigm for handling information as it comes from the different sources that are out there. So you think big IoT, you think all of these different new things that are creating pieces of information. It could be humans, it could be machines, sensors, whatever it is. And taking that in real time, rather than traditionally just putting it in a data lake and later coming back to investigate the data you stored. So, we started back around 2011, founded by people from the early days of Yahoo. And we're pioneers in Hadoop, with Hadoop YARN. This is one of the guys here, too. And so we're all about building real-time analytics for our customers, making sure that they can get business decisions done in real time, as the information is created. And Nathan will talk a little bit about what we're doing on the application side as well, building these hardened application pipelines for our customers to help them get started faster. >> Lisa: Excellent. >> So, alright, let's turn to those real-time applications.
Umm, my familiarity with DataTorrent started probably about five years ago, I think, and at that point I don't think there was so much talk about streaming; it was more like, you know, real-time data feeds. But now, I mean, streaming is sort of the center of gravity of big data. >> Nathan: Yeah. >> So, tell us how someone who's building apps should think about the two solution categories, how they complement each other, and what sort of applications we can build now that we couldn't build before. >> So, I think the way I look at it is not so much as two different things that complement each other; streaming analytics and real-time data processing is really just a natural progression of where big data has been going. So, you know, when we were at Yahoo and we were running Hadoop at scale, the first thing on the scene was simply the ability to produce insight out of a massive amount of data. But then there was this constant pressure: well, okay, now we've produced that insight in a day, can you do it in an hour? You know, can you do it in half an hour? And particularly at Yahoo, at the time that Amol, our CTO, and I were there, there was just constant pressure of, can you produce insight from a huge volume of data more quickly? And so we kind of saw, at that time, two major trends. One was that we were reaching a limit of where you could go with the Hadoop and batch architecture of that time, and so a new approach was required. And that's really what was the foundation of the Apache Apex project, and of DataTorrent the company: simply realizing that a new approach was required, because the more that Yahoo, or other businesses, can take information from the world around them and take action on it as quickly as possible, the more competitive they're going to be. So I'd look at streaming as really just a natural progression.
Where now it's possible to get insight and take action on data as close to the time of data creation as possible, and if you can do that, then you're going to be competitive. And so we see this coming across a whole bunch of different verticals. So that's how I look at it: not so much complementary, as a trend in where big data is going. Now, the kinds of things that weren't possible before this are, you know, the kinds of applications where you can take insight, whether it's from IoT or from sensors or from retail, all the things that are going on. Whereas before, you would land this in a data lake, do a bunch of analysis, produce some insight, maybe change your behavior, but ultimately you weren't being as responsive as you could be to customers. So now, what we are seeing, and why I think the center of mass has moved into real time and streaming, is that now it's possible to, you know, give the customer an offer the second they walk into a store, based on what you know about them and their history. This was always something that the internet properties were trying to move towards, but now we see that same technology being made available across a whole bunch of different verticals, a whole bunch of different industries. And that's why, you know, when you look at Apex and DataTorrent, we're involved not only in things like adtech, but in industrial automation and IoT, and we're involved in, you know, retail and customer 360, because in every one of these cases, insurance, finance, security and fraud prevention, it's a huge competitive advantage if you can get insight and make a decision close to the time of the data creation. So I think that's really where the shift is coming from. And then the other thing I would mention here is that a big thrust of our company, and of Apache Apex, is this: we saw streaming was going to be something that everyone was going to need.
The other thing we saw from our experience at Yahoo was that really getting something to work at a POC level, showing that something is possible with streaming analytics, is only a small part of the problem. Being able to put something into production at scale, and run a business on it, is a much bigger part of the problem. And so we put into both the Apache Apex project, as well as into our product, the ability to not only get insight out of this data in motion, but to be able to put that into production at scale. And that's why we've had quite a few customers who have put our product in production at scale and have been running that way, you know, in some cases for years. And so that's another sort of key area where we're forging a path, which is: it's not enough to do a POC and show that something is possible. You have to be able to run a business on it. >> Lisa: So, talk to us about where DataTorrent sits within a modern data architecture. You guys are kind of playing in, integrated in, a couple of different areas. Walk us through what that looks like. >> So, in terms of a modern data architecture, part of it is what I just covered, in that we're moving from a batch world to a streaming world, where the notion of batch is not going away, but now, when you have a streaming application, that's something that's running all the time, 24/7; there's no concept of batch. Batch is really more a concept of how you are processing data through that streaming application. So what we're seeing in the modern data architecture is that, you know, typically you have people taking data, extracting it, and eventually loading it into some kind of a data lake, right? What we're doing is shifting left of the data lake. You know, analyzing information when it's created.
Produce insight from it, take action on it, and then, yes, land it in the data lake. But once you land it in the data lake, all of the purposes of what you're doing with that data have shifted. You know, we're producing insight and taking action to the left of the data lake, and then we use that data lake to do things like train, you know, your machine learning model, which we're then going to use to the left of the data lake. Use the data lake to do slicing and dicing of your data, to better understand what kinds of campaigns you want to run, things like that. But ultimately, you're using the real-time portion of this to take those campaigns and then measure the impact you're having on your customers, in real time. >> So, okay, 'cause that was going to be my follow-up question, which is, there does seem to be a role for a historical repository, for richer context. >> Nathan: Absolutely. >> And you're acknowledging that. Like, does the low-latency analytics happen first, and then you store it off for a richer model, you know, later? >> Nathan: Correct. >> Umm. So, there are a couple of things then that seem to be like requirements, next steps, which is: if you're doing the modeling, the research model, in the cloud, how do you orchestrate its distribution towards the sources of the real-time data? In other words, if you do training up in the cloud, where you have the biggest data, or the richest data, is DataTorrent or Apex a part of the process of orchestrating the distribution and coherence of the models that should be at the edge, or closer to where the data sources are? >> So, I guess there's a couple of different ways we can think about that problem. You know, we have customers today who are essentially providing, into the streaming analytics application, the models that have been trained on the data from the data lake. And part of the approach we take in Apex and DataTorrent is that you can reload and be changing those models all the time.
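Reloading a model without taking the application down, as Nathan describes, can be illustrated with a toy long-running processor. This is a sketch only, and it leaves out the fault-tolerance and state machinery a real engine such as Apex provides: because the current model is looked up on every event, swapping it mid-stream changes behavior without a restart.

```python
# Toy illustration of in-flight model reload (not Apex's real mechanism;
# real systems also handle checkpointing and failover).
class StreamProcessor:
    def __init__(self, model):
        self.model = model            # current decision model or rule set

    def reload(self, new_model):
        """Swap in a retrained model without stopping the event loop."""
        self.model = new_model

    def process(self, event):
        return self.model(event)      # the model is re-read on every event

model_v1 = lambda value: "flag" if value > 100 else "ok"  # initial rules
model_v2 = lambda value: "flag" if value > 50 else "ok"   # stricter retrain

proc = StreamProcessor(model_v1)
before = proc.process(75)   # scored under v1
proc.reload(model_v2)       # deployed mid-stream, no restart
after = proc.process(75)    # the same value, now scored under v2
```

The same swap works whether the "model" is a machine-learned scorer trained in the data lake or just a rules engine, which is the point being made: training happens to the left or right of the stream, but the decision is taken inside it.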
So, our architecture is such that it's full tolerant it stays up all the time so you can actually change the application and evolve it over time. So, we have customers that are reloading models on a regular basis, so that's whether it's machine learning or even just a rules engine, we're able to reload that on a regular basis. The other part of your question, if I understood you, was really about the distribution of data. And the distribution of models, and the distribution of data and where do you train that. And I think that you're going to have data in the cloud, you're going to have data on premises, you're going to have data at the edge, again, what we allow customers to do, is to be able to take and integrate that data and make decisions on it, regardless kind of where it lives, so we'll see streaming applications that get deployed into the cloud. But they may be synchronized in some portion of the data, to on premises or vis versa. So, certainly we can orchestrate all of that as part of an overall streaming application. >> Lisa: I want to ask Jeff, now. Give us a cross section of your customers. You've got customers ranging from small businesses, to fortune 10. >> Jeff: Yep. >> Give us some, kind of used cases that really took out of you, that really showcased the great potential that DataTorrent gives. >> Jeff: So if you think about the heritage of our company coming out of the early guys that were in Yahoo, adtech is obviously one that we hit hard and it's something we know how to do really really well. So, adtech is one of those things where they're constantly changing so you can take that same model and say, if I'm looking at adtech and saying, if I applied that to a distribution of products, in a manufacturing facility, it's kind of all the same type of activities, right? I'm managing a lot of inventory, I'm trying to get that inventory to the right place at the right time and I'm trying to fill that aspect of it. 
So that's kind of where we started, but we've got customers in the financial sector, right, that are really looking at instantaneous types of transactions that are happening. And then how do you apply knowledge and information to that while you're bringing that source data in, so that you can make decisions? Some of those decisions have people involved with them and some of them are just machine based, right, so you take the people equation out. We have this funny thing that Guy Churchward, our CEO, talks about, called the do loop, and the do loop is where the people come in: how do we remove people from that do loop and really make it easier for companies to act and prevent? So then if you take that aspect of it, we've got companies in the publishing space. We've got companies in the IOT space, so they're doing inventory management, stuff like that. So we go from, you know, medium-sized customers all the way up to very very large enterprises. >> Lisa: You're really turning a variety of industries into tech companies, because they have to be these days. >> Nathan: Right, well, and one other thing I would mention there, which is important, especially as we look at big data and a lot of customer concern about complexity. You know, I mentioned earlier the challenge of not just coming up with an idea but being able to put that into production. So, one of the other big areas of focus for DataTorrent, as a company, is that not only have we developed a platform for streaming analytics and applications, but we're starting to deliver applications that you can download and run on our platform that deliver an outcome to a customer immediately. So, increasingly, as we see different verticals and different applications, we turn those into applications we can make available to all of our customers that solve business problems immediately.
One of the challenges for a long time in IT is simply how do you eliminate complexity, and there's no getting away from the fact that this is big data, and these are complex systems. But to drive mass adoption, we're focused on how we can deliver outcomes for our customers as quickly as possible, and the way to do that is by making applications available across all these different verticals. >> Well you guys, this has been so educational. We wish you guys continued success here. It sounds like you're really being quite disruptive in and of yourselves, so if you haven't heard of them, DataTorrent.com, check them out. Nathan, Jeff, thanks so much for giving us your time this afternoon. >> Great, thanks for the opportunity. >> Lisa: We look forward to having you back. You've been watching theCUBE, live from day two of the DataWorks Summit, from the heart of Silicon Valley. For my co-host George Gilbert, I'm Lisa Martin. Stick around, we'll be right back. (upbeat music)
Guy Churchward & Phu Hoang, DataTorrent Inc. | Mobile World Congress 2017
(techno music) >> Announcer: Live, from Silicon Valley, it's "the Cube," covering Mobile World Congress 2017. Brought to you by Intel. >> Okay, welcome back everyone. We're here live in Palo Alto, California, covering Mobile World Congress, which is happening in Spain right now. In Barcelona it's gettin' close to bedtime, or, if you're a night owl, you're out hittin' the town, because Barcelona stays out very late, or you're just finishing your dinner. Of course, we'll bring you all theCube coverage here: news analysis, commentary, and of course, reaction to all the big mega-trends. And our next two guests are Guy Churchward, who is the President and CEO of DataTorrent, formerly of EMC. You probably recognize him from theCube, from EMC World, the many times he's been on. Cube alumni. And Phu Hoang, who's the co-founder and Chief Strategy Officer of DataTorrent. Co-founder, one of the founders. Also one of the early, early Yahoo engineers. I think he was the fourth engineer at Yahoo. Going way back to the 90s. Built that to a large scale. And Yahoo is credited with the invention of Hadoop, and many other great big data things. And we all know Yahoo was data-full. Guys, welcome to theCube's special coverage. Great to see you. >> Thank you so much. >> So I'm psyched that you guys came in, because, two things. I want to talk about the new opportunity at DataTorrent, and get some stories around the large-scale experience that you guys have dealing with data. 'Cause you're in the middle of where this is intersecting with Mobile World Congress. Right now, Mobile World Congress is on a collision course between cloud-ready, classic enterprise network architectures and consumer, all happening at the same time. And data, with internet of things, is that going to be at the center of all the action? So, (laughing) these are not devices. So, that's the core theme. So, Guy, I want to get your take on, what attracted you to DataTorrent? What was the appeal of the opportunity?
>> You mean, why am I here, why have I just arrived? >> I've always been data-obsessed. You know this. From the days of running the storage business and data protection, and before that I was doing data analytics and security forensics. And if you look at, as you said, whether it's big data, or cloud, and the emergence of IOT, one thing's for sure, for me: it was never about big data, as in a big blob of stuff. It was all about small data sprawl. And the world's just getting more diverse by the second, and you can see that at Mobile World, right? The challenge you then have is, companies, they need to analyze their business. In other words, data analytics. About 30 years ago, when I was working for BEA Systems, I remember meeting a general of the army. And he said the next war will be won in the data center, not on the battlegrounds. And so you really understand-- >> He's right about that. >> Yeah. And you have to be very, very close. So in other words, companies have started to obsess about what I call the do loop. And that really means, when data is created, then ingesting the data, and getting insight from the data, and then actioning on that. And it's that do loop. And what you want to do is squeeze that down into a sub-second. And if you can run your analytics at the pace of your business, then you're in good shape. If you can't, you lose. And that means from a security perspective, or you're not going to win the bids, in any shape or form. That's not a business-- >> John: So speed is critical. >> Yeah, and people say speed and accuracy, because what you don't want to do is run really really fast and fall off a cliff. So you really need to make sure that speed is there and accuracy is there.
In the good old days, when I was running security forensics, you could either do complex event processing, which was a very small amount of information coming in and then querying it like crazy, or things like log management, where you would store data at rest and then look at it afterwards. But now, with all the technology catching up, whether that's the disk space that you get, and the storage and the processing, and things like Hadoop with the clustering, you break that paradigm. You can collect all the information from a business and process it before you land the data, and then get the insight out of it, and then action it. So that was my thing, of looking and saying, look, this whole thing's going to happen. And last year-- >> And at large scale, too. I mean, what you're talking about on the theoretical side makes a lot of sense, but also putting that into large scale is even more challenging. >> Yeah, when I was going through the process, you know, dating, to see whether this was a company that made sense, I chatted with one of our investors. And they're also a customer. And I said, why did you choose DataTorrent? And they said, "We tested everything in production, we tested all the competitive products out there, and we broke everything except DataTorrent. And actually, we tested you in production up to a billion events per second, and you didn't break. And we believe that that quantity is something that you need as a stepping stone to move forward." >> And what use cases does that fit for? Just give me some anecdotal (snaps fingers) billion transactions. At that speed, what are some use cases that really take advantage of that? >> They were masters in what I would call the industrialization of IT. So in other words, once you get into things like turbines, wind generation, train parts. We're going to be, very very soon, looking out of a window and seeing-- >> John: So is it flow data? Is it the speed of the flow?
Is it the feed of all the calculations, or both? >> It's a bit of both. And what I'll do is I'll give Phu a chance, otherwise we'll end up chatting about it. >> John: Phu, come on, you're the star. (laughing) When you founded this company, you had a background at Yahoo, which you built from scratch, but that was a first-mover opportunity, Web 1.0, as they say. That evolved up, and then everyone used Yahoo Finance. Everyone used Yahoo Search as a directory early on. And then everything just got bigger and bigger and bigger, and then you had to build your own stuff with Hadoop. >> Yeah. >> So you lived it. The telcos don't have the same problem. They actually got backed into the data, from being in the voice business, and then the data business. The data came after the voice. So what's the motivation behind DataTorrent? Tell us a little bit more. >> It's exactly what you say, actually. Going through the 12 years at Yahoo, we really learned big data the hard way, making mistakes month after month about how to do this thing right. We didn't have the money, and then we found out that, actually, the proprietary, off-the-shelf systems that we thought were available really couldn't do the job. So we had to invent our own technology to deal with the kind of data processing that we had. At some point, Yahoo had a billion users using Yahoo at any given point in time, right? And the amount of impressions, the amount of clicks, the amount of activity that a billion users have on the system. And all of the log files that you have to process to understand what's going on. On the other side of that, we needed to be able to understand all of those activities in order to sell to our advertisers. Slice and dice behaviors and users, and so on. We didn't have the technology to do that. The only thing we knew how to do was to have these racks of cheap servers that we were using to serve webpages.
And we turned to that and said, this is what we're going to need to do to solve these big data problems. And so the idea of, okay, we need to take this big problem and divide it into smaller pieces so that we can run on these cheap servers, sort of became the core tenet of how we do distributed processing, which became Hadoop at the end of the day, right? >> You had big data come in because you were, big data-full, as we say. You weren't building software to solve someone else's problem. You had your own problem, you had a lot of data. You were full with data. >> Exactly. >> Had to go on a data diet, to your point. (crosstalk) >> And no one to turn to. >> And no one to turn to. >> All right. So let's spin this around for Mobile World Congress. 'Cause the big theme is, obviously, we all know what a device is. In fact, we just released here on theCube early this morning, Peter Burris pre-announced our new research initiative called IOTP, which stands for Internet Of Things And People. And so now you add the complexity of people devices, whether that's going to be some sort of watch, phones, anything around them. That adds to the industrial aspect of turbines and whatnot. Internet of Things is a new edge architecture. So the data tsunami coming, besides the challenges of telcos to provision these devices, is going to be very challenging. So the question I want to ask you guys is, how do you see this evolving? Because you have, certainly, connectivity. You know, low latency, small little data coming from the windmills or whatever, versus big high-density bandwidth, mobility. And then you've got network core issues, right. So what's this going to look like? Where does the data piece fit in? Because all aspects of this have data. What are your thoughts on this architecture? Tell us about your impressions, and the conversations you've had. >> First of all, I think data will exist everywhere: on the fringe, in the middle, at the center.
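The divide-and-conquer idea Phu describes above, splitting one big problem into small pieces that run on racks of cheap servers and then merging the partial results, is the core of what became MapReduce and Hadoop. A toy single-process sketch of that shape (illustrative only, not Yahoo's actual code):

```python
from collections import Counter

def map_phase(chunk):
    # each "server" counts words in its own slice of the logs
    return Counter(chunk.split())

def reduce_phase(partials):
    # merge the per-server partial counts into the global answer
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

log_lines = ["user click ad", "user click page", "ad click"]
# in real life the map phase runs in parallel across many machines
partials = [map_phase(line) for line in log_lines]
counts = reduce_phase(partials)
```

Because each map task only ever sees its own chunk, adding capacity is just adding more cheap machines, which is exactly the economic trick Phu is pointing at.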
And there's going to be data analytics and processing in every path of that. The challenge will be to figure out what part of the processing you put on the fringe and what part you put at the center. And I think that's a fluid thing that is going to be constantly changing. Going back to the telcos: we've had a number of conversations with telcos. And yes, we're helping them right now with their current set of issues around capacity management and billing, all those things. But they are also looking to the next step in their business. They're making all this money from provisioning, but they know they sit on top of this massive amount of really valuable data from their customers. Every cellphone is sending them all of this data. And so there's a huge opportunity for them to monetize, or really produce value back to their customers. And that could come in the form of offers to customers. But now you're talking about massive analytics targeting. That is also real-time, because if you're sending an offer to someone at a particular location, if you do that slowly, or in batch, and you give them an offer 10 minutes later, they're no longer where they are. They're 10 minutes away, right?
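That staleness point is easy to make concrete. A hypothetical freshness gate on offer delivery (the 60-second window is an assumed number, not from the interview): a batch pipeline that surfaces the location event 10 minutes later fails the check, while a sub-second streaming pipeline passes it.

```python
OFFER_WINDOW_SECONDS = 60  # assumed: customer is still near the store

def should_send_offer(event_time, now):
    """Only act on a location event while it is still fresh."""
    return (now - event_time) <= OFFER_WINDOW_SECONDS

# sub-second streaming latency: the offer goes out
streaming_ok = should_send_offer(event_time=1000.0, now=1000.5)
# 10-minute batch latency: the customer has moved on, drop the offer
batch_ok = should_send_offer(event_time=1000.0, now=1600.0)
```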
Not just what they can get, the telcos. It's going to be important. So how do you guys see the integration aspect, how we, top of the first inning, national anthem going on. I mean, where are we in this integration? There's a pregame, or, what inning are we in on this? >> Yeah, we're definitely not on the home run on it. I think our friend, and your friend Steve Manly, I sat down with him, and I gave him a brief, you know, what we were doing, and he was blown away by the technology and the opportunity, but he was certainly saying, but the challenge is the diversity of the data types. And then where they're going to be. Autonomic cars. You know each manufacturer will tell the car behind it, what it just experienced, but the question is, when will a Tesla tell a Range Rover, or tell a BMW? So you have actually -- >> They're different platforms, just different stats, it's a nightmare. >> Right. So in other words, >> And trackability. And whether it's going to be open APIs, whether it's technologies like Kafka. But the integration of that, and making sure that you can do transformation and then normalize it and drive it forward. It's kind of interesting, you know. You mentioned the telco space, and do they understand it. In some respects, what Phu went through with Yahoo, in other words, you go to a webpage, you pull it up, it knows you because of a cookie and it figures out, and then sells advertising to you on that page. Now think about you as a location, and you're walking past a Starbucks, and they want to sell you a coffee for ten cents less than they would normally do. They need to know you're there then. And this is the thing, and this is why real-time is going to be so critical. And similarly, like you said, you look out the window and you see DHL, or UPS, or FedEx drones out the window. You not only have an insight issue. You also have a security issue, you have a compliance issue, you have a locational issue. >> I think you're onto something. 
And I think I actually had this talk today with Steve Manly EMC World last year, around time series data. So this is interesting. Everyone wants to store everything, but it actually might not be worth anything anymore. If the drone is delivering your package, or whatever realtime data is in realtime, it's really important right there in realtime, or near realtime. It might not be worth anything after. But yet a purchase at a store, at a time, might be worth knowing that as a record to pull in. You get what I'm saying? So there's a notion of data that's interesting. >> And I think, and again, Phu's the expert. I'm still running up onto it. It's just a pet hobby, an obsession of mine. But the market has this term ETL. In other words, Extract, Transform, Land. Or load. But in essence, it's always talked about in that (mumbles) batch. In other words, I get the data, transform it, drop it, and then I have a look at it. We're going upside-down. So the idea now is to actually extract, transform, insight, action, then landing. So in other words, get the value at the fresh data, before it's the data late. Because if you set the data late, by default, it's actually stale. And actually, then there's the fascination of saying, if you're delivering realtime data to a person, you can't think fast enough to actually make a live decision. So therefore, you've almost got any information that comes to you, has to tier out. So it comes to a process. You get that fresh use of it, and then it drops into a data lake. And so I think there's using both, but I think what will you see in the market, and, again, you've experienced the disk flash momentum that happened last year. You're going to see that from a data source from at-rest, advanced, to real-time data streams on our applications next year. So I think the issue is, the formative year, and back to your, you know, get it right, get the integration, but make sure your APIs are there, talking to the right technologies. 
I think everything's going to be exciting this year and new and fresh and people really want to do it. Next year is going to be the year where you're going to see an absolute changing of the guards. >> And then also the SLA requirements, they'll start to get into this when you start looking at integration. >> You're absolutely right. Actually, the SLA part is actually very very important here. Because, as you move analytics from this back world, where it has, you do it once a day, and if it dies, it's okay, you just do it again. To where it is now continuous, 24 by 7, giving you insight continuously about your business, your people, your services, and so on. Then all of a sudden, it has to have the same characteristics as your business. Which is, it's 24 by 7, it can never go down, it can never lose data. So, all of a sudden you're putting tremendous requirements on an analytics system, which has, all the way from the beginning of history 'til now, been a very relaxed batch thing, to all of a sudden being something that is enterprise-grade, 24 by 7. And I think that that's actually where it's going to be the toughest nut to crack. >> So tell about some of the things that you've learned. And pretend for a second, let's pretend that you, as a co-founder at Data Torrent, and Guy, and you are teamed up. You guys run this telco. Let's just make one up, Verizon. Or AT&T, or pick one. And you sit there saying, okay, you've got the keys to the kingdom. And you can do whatever you want (laughing). You can be Donald Trump, or you can be whoever you want. You can fire everybody, or you can pick it over and run it. What would you do? You know you've got IOT. So this is business model innovation opportunities. I want you to put the technical hat on, plus knowing what you know around the business model opportunities. What do you do? You know IOT's an opportunity. Amazon is going after that heavily. Do you bolt a cloud together? Do you go after Amazon? Do you co-op with Amazon? 
Do you co-integrate? Do you grab the IOT? Do you use the data? I mean, given where we are today, what's the best move if we were consulting with this. >> You know, I will be the last person to be talking about giving advice to a telco. But since we are, we own our own telco here, and then we're pretending, I would say the following. IOT is going to happen, right? Earlier, when I say a billion people, that's just human beings. Once you now talk about censoring, you can program how many times they can send you data per second, then the growth in volume is immense, right? I think there's a huge opportunity, as a telco, in terms of the data that they have available and the insight that they could have about what's going on. That is not easy. I don't think that, as a telco, in the current DNA of a telco, I can go ahead and do all that analytics and really open up my business to the data insight layer. I would partner, and find a way-- >> Well, we're consulting, we're going to sit around and say hey, what do we have? We have relationship with the consumer, big marketing budgets. We can talk to them directly, we have access to their device. >> But you'll bifurcate the business. We're in the boardroom here, this is nothing more than that. But I would look at it and say look, you've got a consumer business, the same as in IOT. There's really, for me, there's three parts of IOT. There is the bit that I love which, you can geek out, which is basically the consumer market, which, there's no money in for a large-scale tenant, right, enterprise. And then you have the industrialization of IOT, which is I've got a leaky pipe, and I want a hardened device, ruggedized, which is wifi, so, now as a telco, I could create a IOT cloud, that allows me to put these devices out there, and in fact, I use Arlo, the little cameras. And they've got one now, where I can basically float it with its own cellular signal. So it's its own cellphone. That's a great use of IOT for that. 
And then you step to the consumer side of, I've got a cellphone, and then what I'll do is literally, in essence, riff off what Yahoo did in the early days and say, I'm now the new browser. The person's the browser. So in other words, follow the location, follow where he is, and then basically do locational-based advertising. >> By the way, you have to license the patent from our earlier guest, he'll say will he leak, 'cause he's got th6e patent on personal firewall for personal server. He's built a mobile personal server. >> Yeah. >> But this is the opportunity around wireless. Why I love the confusion, but the opportunity around wireless right now is, you can get bandwidth at high capacity. You have millimeter wave four, that doesn't go through walls, but you have other diverse frequencies and spectrum for instance, you can blend it all together to have that little drip signal, if you will, going into the cloud from the leaky pipe. Or if you need turbine, full-fat pipe, you maybe go somewhere. So, I think this is an interesting opportunity. >> And they're going to end up watching the data centers as well. There's still the gamut of saying our customer is going to continue to support their own data centers, or are there going to be one to a hundred data centers out there? And then how does selling a manufacturer or a telco play into that, and do they want to be that guy or not? >> Guy, Phu, thanks for coming in. I want to give you guys a chance to put a plug in for Data Torrent. Thanks for sharing some great commentary on the industry. So, what's up with you guys? Give us the update. Are you hiring? You growing? What are you guys doing? Customers? What's the update? Technology, innovations? >> So we've got a release coming out tomorrow which is a momentum release. I can't talk too much about the numbers, but in essence, from a fact base, we have a thing called a patchy apex. So it's open sourced, so you can use our product for free. 
But that's growing like gangbusters. From a top-level project, that's actually the fastest-growing one, and it's only been out for seven months. We just broke through 50,000 users on it. From our product, we're doing very well on the back of it. So we actually have subscription for the production side. >> So revenue is a subscription model. >> Yeah, and we meet both sides. So in other words, for the engineer who writes it, you've got the open source. And then when you put it into production, from the operations side, you can then license our products to enable you to manage an easy-- >> So when it gets commercialized, you pay as you go, when you use it. >> And you don't have to, if you don't want to. You've got all the tools to do it. But, we focus for our products group of, time to value, total cost of ownership. We're trying to bring Hadoop and real scale, realtime streaming to the masses. So what's the technology innovation? What's the disruptive enabler for you guys? >> I think we talked about it, right? You've got two really competing trends going on here. On one side, data is getting more and more and more massive. So it's going to take longer and longer to process it. Yet at the other side, business wants to be able to get data, have insight, and take action sub-second. So how do you get both at the same time? That's really the magic of the technology. >> Thanks for coming in. Great to meet you, Phu. I'd love to talk about the old Yahoo days, a total throwback, Web 1.0, a great time in history, pre-bubble bursting. Greatness happening in the valley and all around the world, and I remember those days clearly. Guy, great to see you. Congratulations on your new CEO committee. And great to have you on theCube. This is theCube bringing the coverage, and commentary, and reaction of Mobile World Congress here, in California. As everyone goes to bed in Barcelona, we're just gettin' down to the end of our day here in the afternoon in California. 
Be right back with more after this short break. (techno music)