Blaine Mathieu, VANTIQ | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's The Cube, presenting Big Data, Silicon Valley. Brought to you by Silicon Angle Media and its ecosystem partners. >> Welcome back to The Cube. Our continuing coverage of our event, Big Data SV, continues. I am Lisa Martin, joined by Peter Burris. We're in downtown San Jose at a really cool place called Forager Tasting and Eatery. Come down, hang out with us today as we have continued conversations around all things big data and everything in between. This is our second day here and we're excited to welcome to The Cube the CMO of VANTIQ, Blaine Mathieu. Blaine, great to meet you, great to have you on the program. >> Great to be here, thanks for inviting me. >> So, VANTIQ, you guys are up the street in Walnut Creek. What do you guys do, what are you about, what makes VANTIQ different? >> Well, in a nutshell, VANTIQ is a so-called high productivity application development platform to allow developers to build, deploy, and manage so-called event driven real time applications, the kind of applications that are critical for driving many of the digital transformation initiatives that enterprises are trying to get on top of these days. >> Digital transformation, it's a term that can mean so many different things, but today, it's essential for companies to be able to compete, especially enterprise companies, with newer companies that are more agile, more modern. But if we peel apart digital transformation, there are so many elements that are essential. How do you guys help companies, enterprises, say, evolve their application architectures that might currently not be able to support an actual transformation to a digital business? >> Well, I think that's a great question, thank you. I think the key to digital transformation is really a lot around the concept of real time, okay. The reason Uber is disrupting or has disrupted the taxi industry is the old way of doing it was somebody called a taxi and then they waited 30 minutes for a taxi to show up and then they told the taxi where to go and hopefully they got there. Whereas Uber turned that into a real time business, right? You called, you pinged something on your phone. They knew your location. They knew the location of the driver. They matched those up, brought 'em together in real time. Already knew where to bring you to and ensured you had the right route and that location. All of this data flowing, all of these actions being taken in real time. The same thing applies to a disruptor like Netflix, okay? In the old days, Blockbuster used to send you, you know, a leaflet in the mail telling you what the new movies are. Maybe it was personalized for you. Probably not. No, Netflix knows who you are instantly, gives you that information, again, in real time based on what you've done in the past, and is able to deliver the movie also, in real time, pretty well. Every disruptor you look at around digital transformation is taking a business or a process that was done slowly and impersonally and making it happen in real time. Unfortunately, enterprise applications and the architectures, as you said a second ago, that are being used in most applications today weren't designed to enable these real time use cases. A great example is Salesforce. So, Salesforce is a pretty standard, what you'd call a request application. So, you make a request, a person, generally, makes a request of the system, the system goes into a database, queries that database, finds information, and then returns it back to the user.
And that whole process could take, you know, significant amounts of time, especially if the right data isn't in the database at the time and you have to go request it or find it or create it. A new type of application needs to be created that's not fundamentally database centric, but is able to take these real time data streams coming in from devices, from people, from enterprise systems, process them in real time, and then take an action. >> So, let's pretend I'm a CEO. >> Yeah. >> One of the key things you said, and I want you to explain it better, is event. What is an event and how does that translate into a digital business decision? >> This notion of complex event processing (CEP) has been around in technology for a long time and yet, it surprises me still, a lot of folks we talk to, CEOs, have never heard of the concept. And it's very simple really. An event is just something that happens in the context of business. That's as complex and as simple as it is. An event could be a machine increasing in temperature by one degree, a car moving from one location to another. It could be an enterprise system, like an ERP system, you know, approving a PO. It could be a person pressing a button on a mobile device. Or it could be an IOT device putting off a signal about the state of a machine. Increasingly, we're getting a lot of events coming from IOT devices. So, really, any particularly interesting business situation, or a change in a situation, that happens is an event. And increasingly, as you know, IOT, augmented reality, AI and machine learning, autonomous vehicles, all these new real time technologies are spinning off more and more events, streams of these events coming off in rapid fashion, and we have to be able to do something about them. >> Let me take a crack at it and you tell me if I've got this right. Historically, applications have been defined in terms of processes and so, in many respects, there was a very concrete, discrete, well established program, a set of steps that were performed, and then the transaction took place. An event, it seems to me, is, yeah, we generally described it, but it changes in response to the data. >> Right, right. >> So, an event is kind of like an outside-in, driven by data. >> Right, right. >> System response, whereas your traditional transaction processing is an inside-out, driven by a sequence of programmed steps, and that decision might have been made six years ago. So, the event is what's happening right now informed by data, versus a transaction, a traditional transaction, is much more, what did we decide to do six years ago, and it just gets sustained. Have I got that right? >> That's right. Absolutely right, or six hours ago or even six minutes ago, which might seem, wow, six minutes, that's pretty good, but take a use case for a field service agent trying to fix a machine or an air conditioner on top of a building. In today's world now, that air conditioner has hundreds of sensors that are putting off data about the state of that air conditioner in real time. A service tech has the ability to, while the machine is still putting off that data, be able to make repairs and changes and fixes, again, in the moment, see how that is changing the data coming off the machine, and then continue to make the appropriate repairs in collaboration with a smart system or an application that's helping them.
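To make the request-versus-event distinction concrete, here is a minimal sketch of the event-driven pattern Blaine describes. All names are hypothetical illustrations, not VANTIQ's actual API: handlers react the moment an event arrives, instead of waiting for a person to query a database.

```python
from dataclasses import dataclass
from typing import Callable
import time

@dataclass
class Event:
    """Something that happens in the context of business: a sensor
    reading, a button press, an ERP approval."""
    source: str
    kind: str
    payload: dict
    timestamp: float

class EventBus:
    """Minimal publish/subscribe dispatcher: handlers run as events
    arrive, not when a user makes a request."""
    def __init__(self) -> None:
        self.handlers: dict[str, list[Callable[[Event], None]]] = {}

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self.handlers.setdefault(kind, []).append(handler)

    def publish(self, event: Event) -> None:
        for handler in self.handlers.get(event.kind, []):
            handler(event)  # act in the moment, no database round-trip

bus = EventBus()
bus.subscribe("temp_reading",
              lambda e: print("dispatch service tech") if e.payload["celsius"] > 90 else None)
bus.publish(Event("rooftop-hvac-17", "temp_reading", {"celsius": 94}, time.time()))
```

The air conditioner example above is exactly this shape: hundreds of sensors publish readings, and the smart system reacts per event while the tech is still on the roof.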
>> That's how you identify patterns about what the problem is, versus some of the old ways where we had a recipe of, you know, steps that you went through in the call center. >> Right, right. And the customer is getting more and more frustrated. >> They got their clipboard out and had the 52 steps they followed to see, oh, that didn't work, now the next step. No, data can help us do that much more efficiently and effectively if we're able to process it in real time. >> So, in many respects, what we're really talking about is an application world, or a world looking forward, where the applications, which historically have been very siloed, process driven, move to a world where the application function is much more networked together, and the output of one application is having a significant impact through data on the performance of an application somewhere else. That seems like it's got the potential to be an extremely complex fabric. (laughing) So, do I wait until I figure all that out (laughing) and then I start building it? Or do I, I mean, how do I do it? Do I start small and create and grow into it? What's the best way for people to start working on this? >> Well, you're absolutely right. Building these complex, geeking out a little bit, you know, asynchronous, non-blocking, so-called reactive applications, that's the concept that we've been using in computer science for some time, is very hard, frankly. Okay, it's much easier to build computing systems that process things step one, step two, step three, in order, but if you have to build a system that is able to take real time inputs or changes at any point in the process at any time and go in a different direction, it's very complex. And computer scientists have been writing applications like this for decades. It's possible to do, but it isn't possible to do at the speed that companies now want to transform themselves, right? By the time you spec out an application and spend two years writing it, your business competitors have already disrupted you. The requirements have already changed. You need to be much more rapid and agile. And so, the secret sauce to this whole thing is to be able to write these transformative applications or create them, not even write, is actually the wrong word to use, to be able to create them. >> Generate them. >> Yeah, generate them in a way which is very fast, does not require a guru level developer in reactive Java or some super low level code that you'd have to use to otherwise do it, so that you can literally have business people help design the applications, conceptually build them almost in real time, get them out into the market, and then be able to modify them as you need to, you know, on the fly. >> If I can build on that for just one second. So, it used to be we had this thing called computer-assisted software engineering. >> (laughs) Right, right. >> We were going to operate at this very, very high level language. It's kind of-- But then, we would use code and build the code, and the two of them were separated, and so the minute that we deployed, somebody would go off and maintain it and the whole thing would break. >> Right, right. >> Do you have that problem? >> No, well, that's exactly right. So, the old, you know, the previous way of doing it was about really modeling an application, maybe visually, drag and drop, but then fundamentally, you created a bunch of code, and then your job, as you said, was to maintain and deploy and manage it.
>> Try to sustain some connection back up to that beautiful visual model. >> And you probably didn't, because that was too much. That was too much work, so forget about the model after that. Instead, what we're able to do these days is to build the applications visually, you know, really for the most part with either super low code or, in many cases, no code, because we have the ability to abstract away a lot of the complexity. A lot of the complex code that you'd have to write, we can represent that, okay, with these logical abstractions, create the applications themselves, and then continue to maintain, add to, and modify the application using the exact same structure. You're not now stuck with 20,000 lines of code that you have to edit. You're continuing to run and maintain the application just the way you built it, okay. We've now got to the place in computer science where we can actually do these things. We couldn't do them, you know, 20 years ago with CASE, but we can absolutely do them now. >> So, I'm hearing from a customer internal perspective a lot of operational efficiencies that VANTIQ can drive. Let's look now from a customer's perspective. What are the business impacts you're able to make? You mentioned the word reactive a minute ago when you were talking about applications, but do you have an example where VANTIQ has enabled a customer, a business, to be proactive and be able to identify through, you know, complex event processing, what their customers are doing, to be able to deliver relevant messages and really drive revenue, drive profit? >> Right, right. So many, you know, so many great examples. And I mentioned field service a few minutes ago. I've got a lot of clients in that doing this real time field service using these event processing applications. One that I want to bring up right now is one of the largest global shoe manufacturers, actually, that's a client of VANTIQ. I, unfortunately, can't say the name right now 'cause they want to keep what they're doing under wraps, but we all definitely know the company. And they're using this to manage the security, primarily, around their real time global supply chain. So, they've got a big challenge with companies in different countries redirecting shipments of their shoes, selling them on the gray market, at different prices than what are allowed in different regions of the world. And so, through both sensorizing the packages, the barcode scanning, and the enterprise systems bringing all that data together in real time, they can literally tell in the moment if a package is redirected to the wrong region, or if literally a shoe or a box of shoes is being sold where it shouldn't be sold, at the wrong price. They used to get a monthly report on the activities and then they would go and investigate what happened last month. Now, their fraud detection manager is literally sitting there getting this in real time, saying, oh, Singapore sold a pallet of shoes that they should not have been able to sell five minutes ago. Call up the guy in Singapore and have him go down and see what's going on and fix that issue. That's pretty powerful when you think about it. >> Definitely, so like reduction in fraud, or increase in fraud detection. Sounds like, too, there's a potential for a significant amount of cost savings to the business, not just meeting the external customer needs, but from a cost perspective, reduction.
Not just some probable TCO savings, but in operational expenses. >> For sure, although I would say most of the digital transformation initiatives, when we talk to CEOs and CIOs, they're not focused as much on cost savings as they're focused on, A, avoiding being disrupted by the next interesting startup, B, creating new lines of business, new revenue streams, finding a way to do something differently, dramatically better than they're currently doing it. It's not only about optimizing or squeezing some cost out of their current application. This thing that we are talking about, I guess you could say it's an improvement on their current process, but really, it's actually something they just weren't even really doing before. Just a totally different way of doing fraud detection and managing their global supply chain that they just fundamentally weren't even doing. And now, of course, they're looking at many other use cases across the company, not just in supply chain, but, you know, smart manufacturing, so many use cases. Your point about savings, though, there's, you know, what value does the application itself bring? Then, there's the question of what does it cost to build and maintain and deploy the application itself, right? And, again, with these new visual development tools, they're not modeling tools, you're literally developing the application visually. You know, I've been in so many scenarios where we talked to large enterprises. You know, we talk about what we're doing, like we talk about right now, and they say, okay, we'd love to do a POC, proof of concept. We want to allocate six months for this POC, like normally you would probably do for building most enterprise applications. And we inevitably say, well, how about Friday? How about we have the POC done by Friday? And, you know, we get the Germans laugh, you know, laugh uncomfortably, and we go away and deliver the POC by Friday because of how much different it is to build applications this way versus writing low level Java or C-sharp code and sticking together a bunch of technologies and tools, 'cause we abstract all that away. And, you know, the eyes drop open and the mouth drops open and it's incredible what modern technology can do to radically change how software is being developed. >> Wow, big impact in a short period of time. That's always a nice thing to be able to deliver. >> It is, it is to-- It's great to be able to surprise people like that. >> Exactly, exactly. Well, Blaine, thank you so much for stopping by, sharing what VANTIQ is doing to help companies be disruptive, and for sharing those great customer examples. We appreciate your time. >> You're welcome. Appreciate the time. >> And for my co-host, Peter Burris, I'm Lisa Martin. You're watching The Cube's continuing coverage of our event, Big Data SV Live from San Jose, down the street from the Strata Data Conference. Stick around, we'll be right back with our next guest after a short break. (techy music)

Published Date: Mar 8, 2018


Wikibon Research Meeting | Systems at the Edge


 

>> Hi, I'm Peter Burris and welcome once again to Wikibon's weekly research meeting on theCUBE. (funky electronic music) This week we're going to discuss something that we actually believe is extremely important. And if you listen to the recent press announcements this week from Dell EMC, the industry increasingly is starting to believe it's important. And that is, how are we going to build systems that are dependent upon what happens at the edge? The past 10 years have been dominated by the cloud. How are we going to build things in the cloud? How are we going to get data to the cloud? How are we going to integrate things in the cloud? While all those questions remain very relevant, increasingly, the technology's becoming available, the systems and the design elements are becoming available, and the expertise is now more easily brought together so that we can start attacking some extremely complex problems at the edge. A great example of that is the popular notion of what's happening with automated driving. That is a clear example of huge design requirements at the edge. Now to understand these issues, we have to be able to generalize certain attributes of the differences in the resources, whether they be hardware or software, but increasingly, especially from a digital business transformation standpoint, the differences in the characteristics of the data. And that's what we're going to talk about this week. How are different types of data, data that's generated at the edge, data that's generated elsewhere, going to inform decisions about the classes of infrastructure that we're going to have to build and support as we move forward with this transformation that's taking place in the industry? So to kick it off, Neil Raden, I want to turn to you. What are some of those key data differences, and what taxonomically do we regard as what we call primary, secondary, and tertiary data? Neil. >> Well, primary data comes in from sensors. It's a little bit different than anything we've ever seen in terms of doing analytics. Now I know that operational systems do pick up primary data, credit card transactions, something like that. But scanner data, not scanner data, I mean sensor data, is really designed for analysis. It's not designed for record keeping. And because it's designed for analysis, we have to have a different way of treating it than we do other things. If you think about a data lake, everything that falls into that data lake has come from somewhere else, it's been used for something else. But this data is fresh, and that requires that we really have to treat it carefully. Now, the retention and stewardship of that requires a lot of thought. And I don't think industry has really thought that through a great deal. But look, sensor data is not new, it's been around for a long time. But what's different now is the volume and the lack of latency in it. But any organization that wants to get involved in it really needs to be thinking about what's the business purpose of it. If you're just going into IOT, as we call it generically, to save a few bucks, you might as well not bother. It really is something that will change your organization. Now, what do we do with this data is a real problem, because for the most part, these sensors are going to be remote, and that means they're going to generate a lot of data. So what do we do with it? Do we reduce it at the site? That's been one suggestion.
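A minimal sketch of what that at-the-site reduction could look like, assuming simple windowed summary statistics (an illustrative choice, not any particular vendor's scheme); note that the raw samples are discarded, which is exactly the concern Neil raises next.

```python
import statistics

def reduce_window(samples: list[float]) -> dict:
    """Collapse one window of raw sensor samples into a small summary
    record before transmission. Anything not summarized here is lost."""
    return {
        "count": len(samples),
        "mean": statistics.fmean(samples),
        "min": min(samples),
        "max": max(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }

# A thousand raw readings become one record; the exact shape of any
# anomaly inside the window is gone.
window = [20.1 + 0.01 * i for i in range(1000)]
print(reduce_window(window))
```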
There's an issue that any model for reduction could conceivably lose data that may be important somewhere down the line. Can the data be reconstituted through metadata or some sort of reverse algorithms? You know, perhaps. Those are the things we really need to think about. My humble opinion is the software and the devices need to be a single unit. And for the most part, they need to be designed by vendors, not by individual ITs. >> So David Floyer, let's pick up on that. Software and devices as a single unit, designed more by vendors who have specific domain expertise, turned into solutions and presented to business. What do you think? >> Absolutely, I completely concur with that. The initial attempts at using the sensors and connecting to the sensors were very simple things, like, for example, the Nest, the thermostats. And that's worked very well. But if you look at it over time, the processing for that has gone into the home, into your Apple TV device or your Alexa or whatever it is. So, that's coming down and now it's getting even closer to the edge. In the future, our proposition is that it will get even closer, and then those will be put together as solutions, all types of solutions that are appropriate to the edge, that will be taking not just one sensor but multiple sensors, collecting that data together, just like in the autonomous car, for example, where you take the lidars and the radars and the cameras etcetera. We'll be taking that data, we'll be analyzing it, and we'll be making decisions based on that data at the edge. And vendors are going to play a crucial role in providing these solutions to IT and to the OT and to many other parts. And a large value will be in the expertise that they will develop in this area. >> So as a rule of thumb, when I was growing up and learned to drive, I was told always keep five car lengths between you and whatever's in front of you at whatever speed you're traveling. What you just described, David, is that there will be sensors and there will be processing that takes place in that automated car that isn't using that type of rule of thumb, but knows something about tire temperature, and therefore the coefficient of friction on the tires, knows something about the brakes, knows what the stopping power needs to be at that speed, and therefore what buffer needs to be between it and whatever else is around it. >> Absolutely. >> This is no longer a rule of thumb, this is physics and deep understanding of what it's going to require to stop that car. >> And on top of that, what you'll also want to know, outside from your car, is what type of car is in front of you? Is that an autonomous car, or is that somebody being driven by Peter? In which case, you have 10 lengths behind you. >> But that's not going to be primary data. Is that what we mean by secondary data? >> No, that's still primary, because you're going to set up a connection between you and that other car. That car is going to tell you, I'm primary to you, that's primary data. >> Here's what I mean, correct, it's primary data, but from a standpoint that the car in that case is emitting a signal, right? So even though to your car it's primary data, one of the things from a design standpoint that's interesting is that car is now transmitting a digital signal about its state that's relevant to you, so that you can combine that >> Correct. inside, effectively, a gateway inside your car. >> Yes.
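Peter's point that the buffer is physics rather than a rule of thumb can be made concrete with the standard braking relation d = v²/(2µg). The numbers below are purely illustrative, not any real vehicle's parameters:

```python
def stopping_distance_m(speed_kmh: float, mu: float, reaction_s: float = 0.1) -> float:
    """Distance covered during system reaction time, plus braking
    distance v^2 / (2 * mu * g). The friction coefficient mu is what
    tire-temperature and brake sensors help estimate."""
    g = 9.81             # gravitational acceleration, m/s^2
    v = speed_kmh / 3.6  # convert km/h to m/s
    return v * reaction_s + v ** 2 / (2 * mu * g)

print(round(stopping_distance_m(100, mu=0.9)))  # warm tires, dry road: ~46 m
print(round(stopping_distance_m(100, mu=0.4)))  # cold tires or wet road: ~101 m
```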
>> So there's external information that is in fact digital coming in, combining with the sensors about what's happening in your car. Have I got that right? >> Absolutely. That to me is the sort of secondary one, and then you've got the tertiary data, which is the big picture about the traffic conditions >> Routes. and the weather and the routes and that sort of thing, which is at that much higher cloud level, yes. >> So David Vellante, we always have to make sure, as we have these conversations. We've talked a bit about this data, we've talked a little bit about the classes of work that's going to be performed at the different levels. How do we ensure that we sustain the business problem in this conversation? >> So, I mean, I think Wikibon's done some really good work on describing what this sort of data model looks like, from edge devices where you have primary data, to the gateways where you're doing aggregated data, to the cloud where maybe the serious modeling occurs. And my assertion would be that the technology to support that elongating and increasingly distributed data model has been maturing for a decade, and the real customer challenge is not just technical, it's really understanding a number of factors, and I'll name some. Where in the distributed data value chain are you going to differentiate? And how does the data that you're capturing in that data pipeline contribute to monetization? What are the data sources, who has access to that data, how do you trust that data, and interpret it, and act on it with confidence? There are significant IP ownership and data protection issues. Who owns the data? Is it the device manufacturer, is it the factory, etcetera. What's the business model that's going to allow you to succeed? What skill sets are required to win? And really importantly, what's the shape of the ecosystem that needs to form to go to market and succeed? These are the things that I think customers are really struggling with that I talk to. >> Now, the one thing I'd add to that, and I want to come back to it, is the idea of who is ultimately bonding the solution, because this is going to end up in a court of law. But let's come to this IP issue, George. Let's talk about how local data is going to enter into the flow of analytics, and that question of who owns data, because that's important, and then have the question about some of the ramifications and liabilities associated with this. >> Okay well, just on the IP protection and the idea that a vendor has to take sort of whole product responsibility for the solution. That vendor is probably going to be dealing with multiple competitors when they're sort of enabling, say, self-driving cars or other, you know, edge or smaller devices. The key thing is that a vendor will say, you know, the customer keeps their data and the customer gets the insights from that data. But that data is informing, in the middle, a black box, an analytic black box. It's flowing through it, that's where the insights come out, on the other side. But the data changes that black box as it flows through it. So, that is something where, you know, when the vendor provides a whole solution to Mercedes, that solution will be better when they come around to BMW. And the customers should make sure that what BMW gets the benefit of goes back to Mercedes. That's on the IP thing. I want to add one more thing on the tertiary side, which is, when you're close to the edge, it's much more data intensive.
When we've talked about the reduction in data and the real-time analytics, at the tertiary level it's going to be more where time is a bigger factor and you're essentially running a simulation, it's more compute intensive. And so you're doing optimizations of the model, and those flow back as context to inform both the gateway and the edge. >> David Floyer, I want to turn it to you. So we've talked a little bit about the characteristics of the data, a great list from Dave Vellante about some of the business considerations, and we will get very quickly in a second to some of the liability issues 'cause that's going to be important. But take us through what George just said about the tertiary elements. Now we've got all the data laid out, how is that going to map to the classes of devices? And we'll then talk a bit about some of the impacts on the industry. What's it going to look like? >> So if we take the primary edge first, and you take that as a unit, you'll have a number of sensors within that. >> So just to be clear, this is data about the real world that's coming into the system to be processed? >> Yes. So it'll have, for example, cameras. If we take a simple example of making sure that bad people don't get into your site: you'll have a camera there which will do facial recognition. They'll have a badge of some sort, so you'll read that badge, you may want to take their weight, you may want to have an infrared sensor on them so that you can tell their exact distance. So, a whole set of sensors that the vendor will put together for the job of ensuring you don't get bad guys in there. And what you're ensuring is that bad guys don't get in there, that's obviously one, very important, and also, that you don't go and- >> Stop good guys from going in. stop good guys from going in there. So those are the two characteristics >> The false-positive problem. the false-positives. Those are the two things you're trying to design that- >> At the primary edge. at the primary edge. And there's a mass amount of data going into that, which is only going to be reduced to very, very little data coming up to the next level, which is: this guy came here, these were his characteristics, he didn't look well today, maybe you should see a nurse, or whatever other information you can gather from that, will go up to that secondary level. And then that'll also be a record off to HR maybe, about who has arrived there or what time they arrived, and to the manufacturing systems about who is there and who has those skills to do a particular job. There are multiple uses of that data which can then be used for differentiation or whatever else from that secondary layer into local systems, and then equally they can be pushed up to the higher level, which is, how much power should we be generating today, what are the higher levels. >> We now have 4,000 people in the building, air conditioning therefore is going to look like this, or it could be combined with other types of data, like over time we're going to need new capacity, or payroll, or whatever else it might be. >> And each level will have its own type of AI. So you've got AI at the edge, which is to produce a specific result, and then there's AI to optimize at the secondary level, and then AI to optimize bigger things at the tertiary level. >> So we're going to talk more about some of the AI next week, but for right now we're talking about classes of devices that are high performance, high bandwidth, cheap, constrained, proximate to the event. >> Yep.
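A minimal sketch of that three-tier flow for the door example, with every name hypothetical: the primary edge consumes a mass of sensor data and emits one tiny decision event, the secondary gateway records and routes it, and only aggregates reach the tertiary tier.

```python
def face_match(frame: bytes, badge_id: str) -> bool:
    return True  # stand-in for a real facial recognition model

def primary_edge(frame: bytes, badge_id: str, weight_kg: float) -> dict:
    """Runs next to the door: megabytes of sensor data in, one small
    decision event out."""
    admitted = face_match(frame, badge_id) and 40 < weight_kg < 200
    return {"badge": badge_id, "admitted": admitted}

def secondary_gateway(event: dict, hr_feed: list, mfg_feed: list) -> None:
    """Gateway tier: record the event and route it to local systems."""
    hr_feed.append({"badge": event["badge"], "arrived": True})
    if event["admitted"]:
        mfg_feed.append(event["badge"])  # who is on site, for scheduling

def tertiary_summary(events: list) -> dict:
    """Cloud tier: only aggregates, e.g. for HVAC and capacity planning."""
    return {"headcount": sum(1 for e in events if e["admitted"])}
```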
>> Gateways that are capable of taking that information and starting to synthesize it for the business, for other business types of things, and then tertiary systems, true private cloud for example, although we may have very sizable things at the gateway as well, >> There will be true private clouds. that are capable of integrating data in a more broad way. What's the impact on the industry? Are we going to see IT firms roll in and control this sweeping, (man chuckles) as Neil said, trillions of new devices? Is this all going to be Intel? Is it all going to be, you know, looking like clients and PCs? >> My strong advice is that the devices themselves will be done by extreme specialists in those areas, and they will need a set of very deep technology understanding of the devices themselves, the sensors themselves, the AI software relevant to that. Those are the people that are going to make money in that area. And you're much better off partnering with those people and letting them solve the problems, and you solve, as Dave said earlier, the ones that can differentiate you within your processes, within your business. So yes, leave that to other people is my strong advice. And from an IT point of view, just don't do it yourself. >> Well the gateway, sounds like you're suggesting, the gateway is where that boundary's going to be. >> Yes. That's where the boundary is. >> And the IT technologies may increasingly go down to the edge, but it's not clear that the IT vendor expertise goes down to the edge >> Correct. at the same degree. >> Correct. >> So, Neil, let's come back to you. When we think about this arrangement of data, you know, how the use cases are going to play out, and where the vendors are, we still have to address this fundamental challenge that Dave Vellante brought up. Who's going to end up being responsible for this? Now you've worked in insurance, what does that mean from an overall business standpoint? What kinds of failure rates are we going to accommodate? How is this going to play out? What do you think? >> Well, I'd like to point out that I worked in insurance 30 years ago. (men chuckling) >> Male Voice: I didn't want to date ya, Neil. (men chuckling) >> Yeah, the old Reliable Life Insurance Company. Anyway, one of the things David was just discussing sounded a lot to me like complex event processing. And I'm wondering where the logical location for event processing needs to be, because it needs some prior data to do CEP, you have to have something to compare it against. But if you're pushing it all back to the tertiary level, there's going to be a lot of latency. And the whole idea of CEP was, you know, right now. So, that I'm a little curious about. But I'm sorry, what was your question? >> Well no, let's address that. So CEP, David, I agree. But I don't want to turn this into a general discussion on CEP. It's got its own set of issues. >> It's clear there have got to be complex models created. And those are going to be created in a large environment, almost certainly in a tertiary type environment. And those are going to be created by the vendors of those particular problem solvers at the primary edge. To a large extent, they're going to provide solutions in that area. And they're going to have to update those. And so, they are going to have to have lots and lots of test data for themselves, and maybe some companies will provide test data, if it's convenient for those, for a fee or whatever it is, to those vendors.
But the primary model itself is going to be in the tertiary level, and that's going to be pushed down to the primary level itself. >> I'm going to make an assertion here that the, the way I think about this, Neil, is that the data coming off at the primary level is going to be the sensor data, the sensor said it was good. Then that is recorded as an event, we let somebody in the building. And that's going to be a key feature of what happens at the secondary level. I think a lot of complex processing is likely to end up at that secondary level. >> Absolutely. >> Then the data gets pushed up to the tertiary level and it becomes part of an overall social understanding of the business, it's behavior data. So increasingly, what did we do as a consequence of letting this person in the building? Oh, we tried to stop him. That's going to be more of the behavioral data that ends up at the tertiary level, and we'll still do complex event processing there. It's going to be interesting to see whether or not we end up with CEP directly in the sensor tower. Might under certain circumstances, but that's a cost question though. So let me now turn it, in the last few minutes here, Neil, back to you. At the end of the day, we've seen for years the question of how much security is enough security? And businesses said, "Oh, I want to be 100% secure." And sometimes the CISO said, "We got that. You gave me the money, we've now made you 100% secure." But we know it's not true. The same thing is going to exist here. How much fidelity is enough fidelity down at the edge? How do we ensure that business decisions can be translated into design decisions that lead to an appropriate and optimized overall approach to the way the system operates? From a business standpoint back, what types of conversations are going to take place in the boardroom that the rest of the organization's going to have to translate into design decisions? >> You know, boy, bad actors are going to be bad actors. I don't think you can do anything to eliminate it. The best you can do is use the best processes and the best techniques to keep it from happening and hope for the best. I'm sorry, that's all I can really say about it. >> There's quite a lot of work going on at the moment from Arm, in particular. They've got a security device capability. So, there's a lot of work going on in that very space. What's obviously interesting from an IT perspective is how do you link the different security systems, both from an Arm point of view and then from an x86 as you go further up the chain. How are they going to be controlled and how's that going to be managed? That's going to be a big IT issue. >> Yeah, I think the transmission is the weak point. >> Male Voice: What do you mean by that, Neil? >> Well, the data has to flow across networks, that would be the easiest place for someone to intercept it and, you know, do something nefarious. >> Right, yeah, so that's purely the security thing. I was trying to use that as an analogy. So, at the end of the day, the business is going to have to decide: how much data do we have to capture off the edge to ensure that we have the kinds of models we want, so that we can realize the specificity of actions and behaviors that we want in our business? That's partly a technology question, partly a cost question. Different sensors are able to operate at different speeds, for example. But ultimately, we have to be able to bring that list of decisions or business issues that Dave Vellante raised down to some of the design questions.
But it's not going to be throw a $300 microprocessor at everything. There's going to be very, very concrete decisions that have to take place. So, George, do you agree with that? >> Yes, two issues though. One, there's the existing devices that can't get re-instrumented, that already have their software, hardware stack. >> There's a legacy in place? >> Yes. But there's another thing, which is, some of the most advanced research that's been going on that produced much of today's distributed computing and big data infrastructure, like the Berkeley Analytics lab, and, say, their contributions to Spark and related technologies. They're saying we have to throw everything out and start over for secure real-time systems. That you have to build from hardware all the way up. In other words, you're starting from the sand to re-think something that's secure and real-time, that you can't layer it on. >> So very quickly, David, that's a great point, George. Building on what George has said very quickly, the primary responsibility for bonding the behavior or the attributes of these devices is going to be with the vendor. >> Of creating the solution? >> Correct. >> That's going to be the primary responsibility. But obviously from an IT point of view, you need to make sure that that device is doing the job that's important for your business, not too much, not too little, is doing that job, and that you are able to collect the necessary data from it that is going to be of value to you. So that's a question of qualification of the devices themselves. >> Alright so, David Vellante, Neil Raden, David Floyer, George Gilbert, action item round. I want one action item from you guys from this conversation. Keep it quick, keep it short, keep it to the point. David Floyer, what's your action item? >> So my action item is don't go into areas that you don't need to. You do not need to become experts, IT in general does not need to become experts at the edge itself. Rely on partners, rely on vendors to do that, unless of course you're one of those vendors. In which case, you'll need very, very deep knowledge. >> Or you choose that that's where your value stream, your differentiation, is going to be, which means you just became one of those vendors. >> Yes, exactly. >> George Gilbert. >> I would build on that, and I would say that if you look at the skills required to build these full stack solutions, there's data science, there's application development, there's the analytics. Very few of those solutions are going to have skills all in one company. So the go-to-market model for building these is going to be something that, at least at this point in time, we're going to have to look to combinations like IBM working with sort of supply chain masters. >> Good. Neil Raden, action item. >> The question is not necessarily one of technology, because that's going to evolve. But I think as an organization, you need to look at it from this end, which is: would employing this create a new business opportunity for us? Something we're not already doing. Or number two, change our operations in some significant way. Or number three, you know, the old Red Queen thing, we have to do it to keep up with the competition. >> Male Voice: David Vellante, action item. >> Okay, well look, at the risk of sounding trite, you got to start the planning process from the customer on in, and so often people don't.
You got to understand where you're going to add value for customers, and constructing an external and internal ecosystem that can really juice that value creation. >> Alright, fantastic guys. So let me quickly summarize. This week on the Wikibon Friday research meeting on theCUBE, we discussed a new way of thinking about data characteristics that will inform system design and the business value that's created. We observe that data is not all the same when we think about these very complex, highly distributed, and decentralized systems that we're going to build. There's a difference between primary data, secondary data, and tertiary data. Primary data is data that is generated from real world events or measurements and then turned into signals that can be acted upon very proximate to that real world set of conditions. A lot of sensors will be there, a lot of processing will be moved down there, and a lot of actuators and actions will take place without referencing other locations within the cloud. However, we will see circumstances where the events that are taken, or the decisions that are taken on those events, will be captured in some sort of secondary tier that will then record something about the characteristics of the actions and events that were taken, and then summarized and pushed up to a tertiary tier, where that data can then be further integrated with other attributes and elements of the business. The technology to do this is broadly available but not universally successfully applied. We expect to see a lot of new combinations of edge-related devices to work with primary data. That is going to be a combination of currently successful firms in the OT or operational technology world, most likely in partnership with a lot of other vendors that have demonstrated significant expertise and understanding of the problems, especially the business problems, associated with the fidelity of what happens at the edge. The IT industry is going to approach very aggressively and very close to this at that secondary level, through gateways and other types of technologies. And even though we'll see IT technology continue to move down to the primary level, it's not clear exactly how vendors will be able to follow that. More likely, we'll see the adoption of IT approaches to doing things at the primary level by vendors that have the domain expertise in how that level works. We will, however, see significant amounts of interesting true private cloud and public cloud data end up at the tertiary level, ending up with whole new sets of systems that are going to be very important from an administration and management standpoint, because they have to work within the context of the fidelity of this overall system together. The final point we want to make is that these are not technology problems by themselves. While significant technology problems are on the horizon, about how we think about handling this distribution of data, managing it appropriately, and our ability, ultimately, to present the appropriate authority at different levels within that distributed fabric to ensure the proper working condition in a way that nonetheless we can recreate if we need to, these are, at bottom, fundamentally business problems. They're business problems related to who owns the intellectual property that's being created, they're business problems related to what level in that stack do I want to show my differentiation to my customers, and they're business problems from a liability and legal standpoint as well.
The action item is: all firms will in one form or another be impacted by the emergence of the edge as a dominant design consideration for their infrastructure, but also for their business. A taxonomy that looks at three classes of data, primary, secondary, and tertiary, will help businesses sort out who's responsible, what partnerships I need to put in place, what technologies am I going to employ, and very importantly, what overall business exposure I'm going to accommodate as I think ultimately about the nature of the processing and business promises that I'm making to my marketplace. Once again, this has been the Wikibon Friday research meeting here on theCUBE. I want to thank all the analysts who were here today, but especially thank you for paying attention and working with us. And by all means, let's hear those comments back about how we're doing and what you think about this important question of different classes of data driven by different needs of the edge. (funky electronic music)

Published Date: Oct 13, 2017


Yuanhao Sun, Transwarp Technology - BigData SV 2017 - #BigDataSV - #theCUBE


 

>> Announcer: Live from San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017. (upbeat percussion music) >> Okay, welcome back everyone. Live here in Silicon Valley, San Jose, is the Big Data SV, Big Data Silicon Valley, in conjunction with Strata Hadoop, this is theCUBE's exclusive coverage. Over the next two days, we've got wall-to-wall interviews with thought leaders, experts breaking down the future of big data, future of analytics, future of the cloud. I'm John Furrier with my co-host George Gilbert with Wikibon. Our next guest is Yuanhao Sun, who's the co-founder and CTO of Transwarp Technologies. Welcome to theCUBE. You were on theCUBE previously, 166 days ago, I noticed. But now you've got some news. So let's get the news out of the way. What are you guys announcing here, this week? >> Yes, so we are announcing 5.0, the latest version of Transwarp Hub. So in this version, we would call it probably a revolutionary product, because the first one is we embedded Kubernetes in our product, so we will allow people to isolate different kinds of workloads, using Docker and containers, and we also provide a scheduler to better support mixed workloads. And the second is, we are building a set of tools that allow people to build their warehouse, and then migrate from existing or traditional data warehouses to Hadoop. And we are also providing people the capability to build a data mart, actually. It allows you to interactively query data. So we built a column store in memory and on SSD, and we totally rewrote the whole SQL engine. That is a very tiny SQL engine that allows people to query data very quickly. And so today that tiny SQL engine is about five to ten times faster than Spark 2.0. And we also allow people to build cubes on top of Hadoop. And then, once the cube is built, the SQL performance, like the TPC-H performance, is about 100 times faster than the existing database, or existing Spark 2.0. So it's super-fast. And actually we found a Paralect customer, so they replaced their Teradata with our software, to build a data mart. And we already migrated, say, 100 reports from their Teradata to our product. So the promise is very good. And the third one is we are providing tools for people to build machine learning pipelines, and we are leveraging TensorFlow, MXNet, and also Spark for people to visualize the pipeline and to build the data mining workflows. So this is kind of like data science tools, it's very easy for people to use. >> John: Okay, so take a minute to explain, 'cus that was great, you got the performance there, that's the news out of the way. Take a minute to explain Transwarp, your value proposition, and when people engage you as a customer. >> Yuanhao: Yeah so, people choose our product, and the major reason is our compatibility with Oracle, DB2, and Teradata SQL syntax, because, you know, they have built a lot of applications on those databases, so when they migrate to Hadoop, they don't want to rewrite the whole program, so our compatibility, SQL compatibility, is a big advantage to them, so this is the first one. And we also support full ANSI SQL and distributed transactions on Hadoop, so that a lot of applications can be migrated to our product with few modifications or without any changes. So this is our first advantage. The second is because we are providing even the best streaming engine, that is actually derived from Spark. So we apply this technology to IOT applications.
You know, with IOT, pretty soon they need very low latency, but they also need very complicated models on top of streams. So that's why we are providing full SQL support and machine learning support on top of streaming events. And we are also using event-driven technology to reduce the latency to five to ten milliseconds. So this is the second reason people choose our product. And then today we are announcing 5.0, and I think people will find more reasons to choose our product. >> So you have the compatibility SQL, you have the tooling, and now you have the performance. So kind of the triple threat there. So what are the customers saying, when you go out and talk with your customers, what's the view of the current landscape for customers? What are they solving right now, what are the key challenges and pain points that customers have today? >> We have customers in more than 12 vertical segments, and in different verticals they have different pain points, actually. Take one example: in financial services, the main pain point for them is to migrate existing legacy applications to Hadoop. You know, they have accumulated a lot of data, and the performance is very bad using legacy databases, so they need high performance Hadoop and Spark to speed up the performance, like reports. But in another vertical, like in logistics and transportation and IOT, the pain point is to find a very low latency streaming engine. At the same time, they need a very complicated programming model to write their applications. And in another example, like in the public sector, they actually need a very complicated and large scale search engine. They need to build analytical capability on top of the search engine, so they can search the results and analyze the results at the same time. >> George: Yuanhao, as always, whenever we get to interview you on theCUBE, you toss out these gems, sort of like, you know, diamonds, like big rocks that under millions of years and incredible pressure have been squeezed down into these incredibly valuable, kind of, you know, valuable sort of minerals with lots of goodness in them, so I need you to unpack that diamond back into something that we can make sense out of, or, I should say, that's more accessible. You've done something that none of the Hadoop distro guys have managed to do, which is to build databases that are not just decision support, but can handle OLTP, can handle operational applications. You've done the streaming, you've done what even Databricks can't do without even trying any of the other stuff, which is getting the streaming down to an event at a time. Let's step back from all these amazing things, and tell us: what was the secret sauce that let you build a platform this advanced? >> So actually, we are driven by our customers, and we do see the trends, people are looking for better solutions. You know, there is a lot of pain to set up a Hadoop cluster to use the Hadoop technology. So that's why we found it's very meaningful and also very necessary for us to build a SQL database on top of Hadoop. Quite a lot of customers on the FS side, they asked us to provide ACID so the transaction capability can be put on top of Hadoop, because they have to guarantee the consistency of their data. Otherwise they cannot use the technology. >> At the risk of interrupting, maybe you can tell us why others have built the analytic databases on top of Hadoop, to give the familiar SQL access, and obviously have a desire also to have transactions next to it, so you can inform a transaction decision with the analytics.
>> George: At the risk of interrupting, maybe you can tell us why. Others have built analytic databases on top of Hadoop to give familiar SQL access, and obviously there's a desire also to have transactions next to them, so you can inform a transaction decision with the analytics. One of the questions is: how did you combine the two capabilities? I mean, it only took Oracle like 40 years. >> Yuanhao: Right. So actually, our transaction capability is only for analytics. This OLTP capability is not for short transactional applications; it's for data-warehouse kinds of workloads. >> George: Okay, so when you're ingesting. >> Yuanhao: Yes. When you're ingesting, when you modify your data in batch, you have to guarantee consistency; that's the OLTP capability. But we are also building another distributed storage and distributed database that will provide full OLTP capability, meaning you can run concurrent transactions on that database; we are still developing that software right now. Today our product provides the distributed transaction capability for people to actually build their warehouse. Quite a lot of people believe a data warehouse does not need transaction capability, but we found a lot of people modify the data in their warehouse. They are loading data continuously, and tables like the CRM tables, the customer information, can change over time. Every day people need to update or change the data; that's why we have to provide transaction capability in the data warehouse. >> George: Okay. And then tell us also, 'cus the streaming problem is, you know, we're told that roughly two thirds of Spark deployments use streaming as a workload, and the biggest knock on Spark is that it can't process one event at a time, you've got to do a little batch. Tell us some of the use cases that can take advantage of doing one event at a time, and how you solved that problem. >> Yuanhao: Yeah, so the first use case we encountered is anti-fraud, or fraud detection, in FSI. Whenever you swipe your credit card, the bank needs to tell you whether the transaction is fraud or not within a few milliseconds. But if you are using Spark Streaming, it will usually take 500 milliseconds, so the latency is too high for that kind of application. That's why we have to provide per-event, event-driven processing to detect the fraud, so that we can interrupt the transaction within a few milliseconds. So that's one kind of application. The others come from IoT; we already put our streaming framework into a large manufacturing factory. They have to detect malfunctions in their equipment in a very short time, otherwise it may explode. If you are using Spark Streaming, when you submit your application it will take you hundreds of milliseconds, and by the time you finish your detection it usually takes a few seconds, so that is too long for this kind of application. That's why we need a low-latency streaming engine. But you might say it's okay to use Storm or Flink, right? The problem we found is that customers need a very complicated programming model: they are going to solve equations on the streaming events, they need to do FFT transformations, and they are also asking to run linear regression or a neural network on top of the events. That's why we provide a SQL interface and embed CEP capability into our streaming engine, so that you can use patterns to match the events and send alerts. >> George: So, SQL to get a set of events, and maybe join some, and the complex event processing, CEP, to say: does this fit a pattern I'm looking for? >> Yuanhao: Yes.
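The interview doesn't show Transwarp's actual streaming or CEP syntax, so here is a generic, event-at-a-time sketch of the idea: every card swipe is scored the moment it arrives, with no micro-batch to wait for, and a toy pattern rule raises the alert. The event fields, window, and rule are all invented for illustration.

```python
# Generic sketch of event-at-a-time CEP (not Transwarp's engine or syntax):
# each event is handled as it arrives, so the decision latency is the cost
# of one function call, not the width of a micro-batch window.
import time
from collections import defaultdict, deque

WINDOW_SECS = 60
recent = defaultdict(deque)  # card_id -> deque of (timestamp, country)

def on_event(card_id, country, ts):
    """Score one swipe the instant it arrives; return an alert or None."""
    history = recent[card_id]
    while history and ts - history[0][0] > WINDOW_SECS:
        history.popleft()  # evict swipes that fell out of the time window
    alert = None
    # Pattern: the same card used from two countries within the window.
    if any(c != country for _, c in history):
        alert = f"possible fraud: card {card_id} in {history[-1][1]}, now {country}"
    history.append((ts, country))
    return alert

now = time.time()
print(on_event("c42", "US", now))       # None
print(on_event("c42", "BR", now + 10))  # possible fraud: card c42 in US, now BR
```

A production CEP engine expresses the rule declaratively (a pattern over a stream, as Yuanhao describes with SQL) rather than as hand-written state handling, but the latency argument is the same.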
>> George: Okay. And so, with the lightweight OLTP and any other new projects you're looking at, tell us perhaps the new use cases they'd be appropriate for. >> Yuanhao: Yeah, so that's a future product, actually. We are going to solve the problem of large-scale OLTP transactions. You know, in China there is such a large population that in the public sector, or in banks, they need to build highly scalable transaction systems that can support very high concurrent transactions at the same time; that's why we are building this kind of technology. In the past, people just divided transactions across multiple databases, like multiple Oracle instances or multiple MySQL instances. The problem is, if the application is simple, you can very easily divide the transactions over multiple database instances; but if the application is very complicated, especially when an ISV has already written the application against Oracle or another traditional database, it already depends on the transaction system. That's why we have to build the same kind of transaction system, so that we can support their legacy applications but scale to hundreds of nodes and to millions of transactions per second. >> George: On the transactional stuff? >> Yuanhao: Yes. >> George: Just correct me if I'm wrong, I know we're running out of time, but I thought Oracle only scales out when you're doing decision-support work, not when you're doing OLTP; that it can maybe stretch to ten nodes or something like that. Am I mistaken? >> Yuanhao: They can scale to 16, or at most 32, nodes. >> George: For transactional work? >> Yuanhao: For transactional work, but that's the theoretical limit. You know, Google F1 and Google Spanner can scale to hundreds of nodes, but the latency is higher than Oracle, because you have to use a distributed protocol to communicate among multiple nodes. >> George: On Google? >> Yuanhao: Yes. >> George: The latency is higher on the Google side? >> 'Cus it has to go like all the way to Europe and back. >> George: Oracle or Google latency, you said? >> Yuanhao: Google. Because if you are using a two-phase commit protocol, you have to broadcast your request to multiple nodes and then wait for the feedback, so you have much higher latency, but it's necessary to maintain consistency. In a distributed OLTP database the latency is usually higher, but the concurrency is also much higher, and the scalability is much better. >> George: So that's a problem where you've stretched beyond what Oracle's done. >> Yuanhao: Yes. Customers can tolerate the higher latency, but they need to scale to millions of transactions per second; that's why we have to build a distributed database. >> George: Okay, for this reason we're going to have to have you back for, like, maybe five or ten consecutive segments, you know, maybe starting tomorrow.
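To unpack the two-phase-commit trade-off Yuanhao describes, here is a minimal in-process sketch: the coordinator needs one round trip to every participant to gather votes, and a second round trip to commit, which is where the extra latency of systems like F1 and Spanner comes from. The classes are teaching stand-ins, not any real database's API.

```python
# Minimal two-phase commit sketch: consistency across shards costs two
# full rounds of coordination before the transaction is durable.
class Participant:
    def __init__(self, name):
        self.name, self.staged, self.data = name, None, {}

    def prepare(self, key, value):   # phase 1: stage the write and vote
        self.staged = (key, value)   # written ahead, not yet visible
        return True                  # "yes" vote (a real node may vote no)

    def commit(self):                # phase 2: make the write visible
        key, value = self.staged
        self.data[key] = value
        self.staged = None

    def abort(self):
        self.staged = None

def two_phase_commit(participants, key, value):
    # Phase 1: every node must vote yes -- one network round trip each.
    if all(p.prepare(key, value) for p in participants):
        for p in participants:       # Phase 2: a second round trip each.
            p.commit()
        return True
    for p in participants:           # Any "no" vote aborts everywhere.
        p.abort()
    return False

shards = [Participant(f"shard-{i}") for i in range(3)]
assert two_phase_commit(shards, "account:7", 100)
print([p.data for p in shards])      # all three shards agree
```

With real networks, each round trip is a wide-area latency hit, but because independent transactions coordinate in parallel, throughput can still reach the millions per second Yuanhao mentions.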
>> John: We're going to have to get you back for sure. Final question for you: what are you excited about, from a technology standpoint, in the landscape? You look at open source, you're working with Spark, you mentioned Kubernetes, you have microservices, all the cloud. What are you most excited about right now in terms of new technology that's going to help simplify and scale, with low latency, the databases, the software? 'Cus you've got IoT, you've got autonomous vehicles, you have all this data. What are you excited about? >> Yuanhao: So actually, we have already solved those problems with this technology, but I think the most exciting thing is that we see two trends. The first trend is that it's very exciting to see more computation frameworks coming out, like the AI frameworks: TensorFlow, MXNet, Torch, and tons of other machine learning frameworks. They are solving different kinds of problems, like facial recognition from video and images, or human-computer interaction using voice and audio. So that's very exciting. We also find it very exciting to combine these technologies together, and that's why we are using Kubernetes. We didn't use YARN, because it cannot support TensorFlow or other frameworks; but if you are using containers, and if you have a good scheduler, you can schedule any kind of computation framework. So we find it very interesting to have these new frameworks, and we can combine them to solve different kinds of problems. >> John: Thanks so much for coming on theCUBE. It's an operating-system world we're living in now, and it's a great time to be a technologist. Certainly the opportunities are out there, and we're breaking it down here inside theCUBE, live in Silicon Valley, with the best tech executives, thought leaders, and experts. I'm John Furrier with George Gilbert. We'll be right back with more after this short break. (upbeat percussive music)
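A closing note on Yuanhao's scheduling point: the sketch below is a toy first-fit scheduler, not Kubernetes' actual algorithm, but it illustrates why a container scheduler can host TensorFlow, MXNet, and a SQL engine side by side where YARN could not. To the scheduler, every framework is just an image plus resource requests; the images and sizes here are hypothetical.

```python
# Toy first-fit container scheduler: the scheduler never inspects the
# framework inside the image, so heterogeneous workloads mix freely.
nodes = [{"name": "node-1", "cpu": 16, "mem": 64},
         {"name": "node-2", "cpu": 8,  "mem": 32}]

jobs = [  # hypothetical mixed workloads, all just "containers" here
    {"image": "tensorflow/tensorflow:1.0", "cpu": 8, "mem": 16},
    {"image": "transwarp/sql-engine:5.0",  "cpu": 4, "mem": 32},
    {"image": "mxnet/python:latest",       "cpu": 4, "mem": 8},
]

def schedule(jobs, nodes):
    placements = []
    for job in jobs:
        # First-fit: pick the first node with enough spare cpu and memory.
        node = next((n for n in nodes
                     if n["cpu"] >= job["cpu"] and n["mem"] >= job["mem"]),
                    None)
        if node is None:
            placements.append((job["image"], "unschedulable"))
            continue
        node["cpu"] -= job["cpu"]   # reserve the resources on that node
        node["mem"] -= job["mem"]
        placements.append((job["image"], node["name"]))
    return placements

for image, node in schedule(jobs, nodes):
    print(f"{image} -> {node}")
```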

Published Date : Mar 14 2017
