Search Results for Big Data Silicon Valley:

Big Data Silicon Valley 2018 Recap


 

>> Dave: Good morning everybody and welcome to Big Data SV. >> Come down, hang out with us today as we have continued conversations. >> Will this trend, this Big Data trend, solve the problems that decision support and business intelligence couldn't solve. We're going to talk about that today. Gentlemen, welcome to theCUBE. (energetic rock music) >> Dave: We're setting up for the digital business era. >> What do people really want to do? And it's big data analytics. I want to ingest a lot of information. I want to enrich it. I want to analyze it and I want to take actions and then I want to go park it. >> Leveraging everything that is open source to build models and put models in production. >> We talk a little bit like it's Google Docs for your data. >> So I no longer have to send daily data dumps to partners. They can simply query the data themselves. >> We've taken the two approaches of enterprise analytics and self-services and tried to create a scenario where you kind of get the best of both worlds. >> The epicenter of this whole data management has to move to cloud. >> It saves you a lot of time and effort. You can focus on more strategic projects. >> Do you agree it's kind of bifurcated. There's the Spotifys, and the Ubers, and the AirBnBs that are crushing it and then there's a lot of traditional enterprises that are still stovepipe and struggling. >> Marketing people, operational people, finance people, they need data to do their jobs. Their jobs are becoming more data-driven but they're not necessarily data people. >> They're depending on the vendor landscape to provide them with an entry level set of tools. >> Don't make me work harder and add new staff. Solve the problem. >> Yeah, it's all about solving problems. >> A lot more on machine learning now and artificial intelligence and frankly a lot of discussion around ethics. >> Data governance, it is in fact a business imperative. >> Marketers want all the customer data they can get, right? 
But there's social security numbers, PII-- Who should be able to see and use what because if this data is used inappropriately then it can cause a lot of problems. >> Creating that visibility is very important. >> The biggest casualty is going to be their customer relationship if they don't do this because most companies don't know their customers fully. >> The key that digital transformation is really a lauder on the concept of real time. >> If anybody deals with the data that's in motion, you lose because I'm analyzing as it's happening and then you would be analyzing after at rest. >> Speed is so important these days and the new companies that are grasping data aggressively, putting it somewhere where they can make decisions on it on a day-to-day basis, they're winning. >> Come on down, be part of our audience. We also have a great party tonight where you can network with some of our experts and analysts. (energetic rock music) >> Our expectation is that as the tooling gets better, we will see more people be able to present themselves truly as capable of doing this, and that will accelerate the process. >> To me, one of the first things a CDO has to do is understand how a company gets value out of its data. >> You can either run away from that data and say, look, I'm going to not, I'm going to bury my head in the sand, I'm going to be a business, I'm just going to forget about that data stuff and that's certainly a way to go. Right? It's a way to go away. >> It's easy to get overwhelmed for companies, you have to pick somewhere, right? >> You don't have to go sit in the basement for a year having something that is 'the thing', the unicorn in the business, it's small quick wins. >> We're not afraid of makin' mistakes. If we provision infrastructure and we don't get it right the first time, we just change it. >> That's something that we would just never be able to do previously in a data center. 
>> When companies get started with the right first project they can build on that success and invest more, whereas if you're not experimenting and trying things and moving, you're never going to get there. >> Dave: Thanks for watching, everybody. This is theCUBE. We're live from Big Data SV. >> And we're clear. Thank you. (audience applauds)

Published Date : Mar 12 2018

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Ubers | ORGANIZATION | 0.99+
AirBnBs | ORGANIZATION | 0.99+
Spotifys | ORGANIZATION | 0.99+
tonight | DATE | 0.99+
today | DATE | 0.98+
Google Docs | TITLE | 0.98+
both worlds | QUANTITY | 0.98+
Big Data | ORGANIZATION | 0.98+
first time | QUANTITY | 0.97+
one | QUANTITY | 0.97+
first project | QUANTITY | 0.97+
two approaches | QUANTITY | 0.97+
first | QUANTITY | 0.95+
a year | QUANTITY | 0.93+
2018 | DATE | 0.92+
Big Data SV | ORGANIZATION | 0.8+
Valley | TITLE | 0.65+
Silicon | LOCATION | 0.6+
Big Data | TITLE | 0.46+

Jerry Chen, Greylock | AWS re:Invent 2019


 

>> Narrator: Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >> Well, welcome back, everyone, to theCUBE's live coverage in Las Vegas for AWS re:Invent. It's theCUBE's 10th year of operations, it's our seventh AWS re:Invent, and every year it gets better and better, and every year we've had theCUBE at re:Invent, Jerry Chen has been on as a guest. He's a VIP, Jerry Chen, now a general partner at Greylock, tier one, one of the leading global venture capital firms in Silicon Valley. Jerry, you've been on the journey with us the whole time. >> I guess I'm your good luck charm. >> (laughs) Well, keep it going. Keep on changing the game. So, thanks for coming on. >> Jerry: Thanks for having me. >> So, now that you're a seasoned partner now at Greylock. You got a lot of investments under your belt. How's it going? >> It's great, I mean look, every single year, I look around the landscape thinking, "What else could be coming? What if we surprise this year?" What's the new trends? What both macro-trends, also company trends, like, who's going to buy who, who's going to go public? Every year, it just gets busier and busier and bigger and bigger. >> All these new categories are emerging with this new architecture. I call it Cloud 2.0, maybe next gen Cloud, whatever you want to call it, it's clear visibility now into the fact that DevOps is working, Cloud operations, large scale operations with Cloud is certainly a great value proposition. You're seeing now multiple databases, pick the tool, I think Jassy got that right in his keynote, I believe that, but now the data equation comes over the top. So, you got DevOps infrastructure as code, you got data now looking like it's going to go down that same path of data as code where developers don't have to deal with all the different nuances of how data's stored, how it's handled, where is it, warm or cold or in Glacier.
So, developers still don't have that yet today. Seems to be an area of Amazon. What's your take on all this? >> I think you saw, so what drove DevOps? Speed, right? It's basically how developers and operations, merging of two groups. So, we're seeing the same trend, DataOps, right? How data engineers and data scientists can now have the same speed developers had for the past 10 years, DataOps. So, A, what does that mean? Give me the menu of what I want like, Goldilocks, too big, too small, just right. Too hot, too cold, just right. Like, give me the storage tier, the data tier, the size I want, the temperature I want and the speed I want. So, you're seeing DataOps give the same kind of Goldilocks treatment as developers. >> And in terms of like Cloud evolution again, you've seen the movie from the beginning at VMware, now through Amazon, seventh year. What jumps out at you, what do you look at as squinting through the trend lines and the fashion of the features, it still seems to be the same old game, compute, memory, storage and software. >> Well I mean, compute, memory, storage, those are the atomic building blocks of compute, right? So, regardless of services, these high level frameworks, deep down, you still have compute, networking and storage. So, that's the building blocks but I think we're seeing, 10th year of re:Invent, this kind of, it's not one size fits all but this really big fat long tail, small instances, micro-instances, serverless, big instances for like jumbo VMs, bare metal, right? So, you're seeing not one architecture but folks can kind of pick and choose, buy compute by the drip, the drop, or buy compute by the whole VM or whole server full. >> And a lot of people are like, the builders love that. Amazon owns the builder market. I mean, if anyone who's doing a startup, they pretty much start on Amazon.
It's the most robust, you pick your tools, you build, but Steve Mullaney was just on before us, says, "Enterprise don't want power tools, they're going to cut their hand off." (laughs) Right so, Microsoft's been winning with this approach of consumable Cloud and it's a nice card to play because they're not yet there with capabilities with Amazon, so it's a good call, they got an Enterprise sales force. Microsoft playing a different game than AWS because they have to. >> Sure I mean, what's football now, you have a running game, you need a passing game, right? So, if you can't beat them with the running game, you go with a passing game and so, Amazon has kind of like the fundamental building blocks or power tools for the builders. There's a large segment of population out there that don't want that level of building blocks but they want us a little bit more prescriptive. Microsoft's been around Enterprise for many many years, they understand prescriptive tools and architectures. So, you're going to become a little bit more prefab, if you will. Here's how you can actually construct the right application, ML apps, AI apps, et cetera. Let me give you the building blocks at a higher level abstraction. >> So, I want to get your take on value creations. >> Jerry: Sure. >> So, if it's still early (mumbles), it's took a lot more growth, you start to see Jassy even admit that in his keynotes that he said quote, "There are two types of developers and customers. People want the building blocks or people who want solutions." Or prefab or some sort of more consumable. >> More prescriptive, yeah. >> So, I think Amazon's going to start going that way but that being said, there's still opportunities for startups. You're an investor, you invest in startups. Where do you see opportunities? If you're looking at the startup landscape, what is the playbook? How should you advise startups?
Because ya know, have the best team or whatever but you look at Amazon, it's like, okay, they got large scale. >> Jerry: Yeah. >> I'm going to be a little nervous. Are they going to eat my lunch? Do I take advantage of them? Do I draft off them? There are white spaces as vertical markets exploding that are available. What's your view on how startups should attack the wealth creation opportunity, value creation? >> There, I mean, Amazon's creating a new market, right? So, you look at their list of many services. There's just like 175 services out there, which is basically too many for any one company to win every single service. So, but you look at that menu of services, each one of those services themselves can be a startup or a collection of services can be a startup. So, I look at that as a roadmap for opportunity of companies can actually go in and create value around AI, around data, around security, around observability because Amazon's not going to naturally win all of those markets. What they do have is distribution, right? They have a lot of developer mind share. So, if you're a startup, you play one of three themes. So like, one is how do I pick one area and go deep for IP, right? Like, cheaper, better, faster, own some IP and though, they're going to execute better and that's doable over and over again in different markets. Number two is, we talked about this before, there's not going to be a one Cloud wins all, Amazon's clearly in the lead, they have won most of the Cloud, so far, but it'll be a multi-Cloud world, it'll be On Premise world. So, how do I play a multi-Cloud world, is another angle, so, go deep in IP, go multi-Cloud. Number three is this end to end solution, kind of prescriptive. Amazon can get you 80% of the way there, 70% of the way there but if you're like, an AI developer, you're a CMO, you're a marketing developer, you kind of want this end to end solution.
So, how can I put together a full suite of tools from beginning to end that can give me a product that's a better experience. So, either I have something that's a deeper IP play, a seam between multiple Clouds, or give it end to end solutions around a problem and solve that one problem for our customer. >> And in most cases, the underlay is Amazon or Azure. >> Or Google or Ali Cloud or On Premises. Not going away any time soon, right? And so, how do I create a single fabric, if you will, that looks similar? >> I want to riff with you in real time here on theCUBE around data. So, data scale is obviously a big discussion that's starting to happen now, data tsunami, we've heard that for years. So, there's two scale benefits, horizontal scale with data and then vertical specialism, vertical scale or ya know, using AI machine learning in apps, having data, so, how do you view that? What's your reaction to the notion of creating the horizontal scale value and vertical specialism value? >> Both are a great place for startups, right? They're not mutually exclusive but I think if you go horizontal, the amount of data being created by your applications, your infrastructure, your sensors, time series data, ridiculously large amount, right? And that's not going away any time soon. I recently did investment in Chronosphere, 'cause you guys covered over at KubeCon a few weeks ago, that's talking about metrics and observability data, time series data. So, they're going to handle that horizontal amount of data, petabytes and petabytes, how can we query this quickly, deeply with a lot of insight? That's one play, right? Cheaper, better, faster at scale. The next play, like you said, is vertical. It's how do I own data or slice the data with more context than I know I was going to have? We talked about the virtuous cycle of data, right? Just the system of intelligence, as well.
If I own a set of data, be it healthcare, government or self-driving car data, that no one else has, I can build a solution end to end and go deep and so either pick a lane or pick a geography, you can go either way. It's hard to do both, though. >> It's hard for startup. >> For a startup. >> Any big company. >> Very few companies can do two things well, startups especially, succeed by doing one thing very well. >> I think my observation is that I think looking at Amazon, is that they want the horizontal and they're leaving offers on the table for our startups, the vertical. >> Yeah, if you look at their strategy, the lower level Amazon gets, the more open-sourced, the more ubiquitous you try to be for containers, serverless, networking, S3, basic substrates, so, horizontal horizontal, low price. As you get higher up from like, deep mind like, AI technologies, perception, prediction, they're getting a little bit more specialized, right? As you see these solutions around retail, healthcare, voice, so, the higher up in the stack, they can build more narrow solutions because like any startup of any product, you need the right wedge. What's the right wedge in the customers? At the base level of developers, building blocks, ubiquitous. For solutions marketing, healthcare, financial services, retail, how do I find a fine point wedge? >> So, the old Venture business was all enamored with consumers over the years and then, maybe four years ago, Enterprise got hot. We were lowly Enterprise guys where no one-- >> Enterprise has been hot forever in my mind, John but maybe-- >> Well, first of all, we've been hot on Enterprise, we love Enterprise but then all of a sudden, it just seemed like, oh my God, people had an awakening like, and there's real value to be had. The IT spend has been trillions and the stats are roughly 20 or so percent, yet to move to the Cloud or this new next gen architecture that you're investing companies in. So, a big market...
that's an investment thesis. So, a huge enterprise market, Steve Mullaney of Aviatrix called it a thousand foot wave. So, there's going to be a massive enterprise money... big bag of money on the table. (laughs) A lot of re-transformations, lot of reborn on the Cloud, lot of action. What's your take on that? Do you see it the same way because look how they're getting in big time, Goldman Sachs on stage here. It's a lot of cash. How do you think it's going to be deployed and who's going to be fighting for it? >> Well, I think, we talked about this in the past. When you look to make an investment, as a startup founder or as a VC, you want to pick a wave bigger than you, bigger than your competitors. Right so, on the consumer side, ya know, the classic example, you're Instagram fighting Facebook and photo sharing, you pick the mobile first wave, iPhone wave, right, the first mobile native photo sharing. If you're fighting Enterprise infrastructure, you pick the Cloud data wave, right? You pick the big data wave, you pick the AI waves. So, first as a founder startup, I'm looking for these macro-waves that I see not going away any time soon. So, moving from batch data to streaming real time data. That's a wave that's happening, that's inevitable. Dollars are flowing from slower batch databases to streaming real time analytics. So, Rockset, one of the investments we talked about, they're riding that wave from going batch to real time, how to do analytics and SQL on real time data. Likewise, time series, you're going from like, ya know, batch data, slow data to massive amounts of time series data, Chronosphere, playing that wave. So, I think you have to look for these macro-waves of Cloud, which anyone knows but then, you pick these small wavelets, if that's a word, like a wavelet or a smaller wave within a wave that says, "Okay, I'm going to pick this one trend."
Ride it as a startup, ride it as an investor and because that's going to be more powerful than my competitors. >> And then, get inside the wave or inside the tornado, whatever metaphor. >> We're going to torch the metaphors but yeah, ride that wave. >> All right, Jerry, great to have you on. Seven years of CUBE action. Great to have you on, congratulations, you're VIP, you've been with us the whole time. >> Congratulations to you, theCUBE, the entire staff here. It's amazing to watch your business grow in the past seven years, as well. >> And we soft launch our CUBE 365, search it, it's on Amazon's marketplace. >> Jerry: Amazing. >> SaaS, our first SaaS offering. >> I love it, I mean-- >> John: No Venture funding. (laughs) Ya know, we're going to be out there. Ya know, maybe let you in on the deal. >> But now, like you broadcast the deal to the rest of the market. >> (laughs) Jerry, great to have you on. Again, great to watch your career at Greylock. Always happy to have ya on, great commentary, awesome time, Jerry Chen, Venture partner, general partner of Greylock. So keep coverage, breaking down the commentary, extracting the signal from the noise here at reInvent 2019, I'm John Furrier, back with more after this short break. (energetic electronic music)

Published Date : Dec 4 2019

ENTITIES

Entity | Category | Confidence
Steve Malaney | PERSON | 0.99+
Jerry Chen | PERSON | 0.99+
Jerry | PERSON | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Silicon Valley | LOCATION | 0.99+
70% | QUANTITY | 0.99+
80% | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
John | PERSON | 0.99+
two groups | QUANTITY | 0.99+
Las Vegas | LOCATION | 0.99+
175 services | QUANTITY | 0.99+
Goldman Sachs | ORGANIZATION | 0.99+
one | QUANTITY | 0.99+
10th year | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Greylock | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
Intel | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
Both | QUANTITY | 0.99+
Jassy | PERSON | 0.99+
both | QUANTITY | 0.99+
two types | QUANTITY | 0.99+
DevOps | TITLE | 0.99+
one problem | QUANTITY | 0.99+
seventh year | QUANTITY | 0.98+
two things | QUANTITY | 0.98+
reInvent | EVENT | 0.98+
Aviation | ORGANIZATION | 0.98+
four years ago | DATE | 0.98+
Seven years | QUANTITY | 0.98+
two scale | QUANTITY | 0.97+
CUBEcon | ORGANIZATION | 0.97+
iPhone | COMMERCIAL_ITEM | 0.97+
one company | QUANTITY | 0.97+
three themes | QUANTITY | 0.97+
today | DATE | 0.96+
reInvent 2019 | EVENT | 0.96+
Instagram | ORGANIZATION | 0.96+
ChronoSphere | TITLE | 0.95+
Azure | ORGANIZATION | 0.95+
each one | QUANTITY | 0.95+
Facebook | ORGANIZATION | 0.94+
Rocksett | PERSON | 0.94+
this year | DATE | 0.93+
Google | ORGANIZATION | 0.92+
Number three | QUANTITY | 0.92+
Number two | QUANTITY | 0.92+
one thing | QUANTITY | 0.92+
trillions | QUANTITY | 0.92+
20 | QUANTITY | 0.92+
Venture | ORGANIZATION | 0.92+
Cloud | TITLE | 0.92+
single fabric | QUANTITY | 0.88+
one area | QUANTITY | 0.87+
Cloud 2.0 | TITLE | 0.87+

Keynote Analysis | Actifio Data Driven 2019


 

>> From Boston, Massachusetts. It's theCUBE. Covering Actifio 2019 Data Driven. (upbeat techno music) Brought to you by Actifio. >> Hello everyone and welcome to Boston and theCUBE's special coverage of Actifio Data Driven 19. I'm Dave Vellante. Stu Miniman is here. We've got a special guest, John Furrier is in the house from Palo Alto. Guys, theCUBE, we love to go out on the ground, you know, we go deep. We're here at this data theme, right? We were there in the early days, John, you called me up and said, "Get your butt here, we're going to cover the first Hadoop World". And since then things have moved quite fast. Everybody thought, you know, Hadoop Big Data was going to take over the world. Nobody even uses that term anymore, right? It's kind of, now it's AI, and machine intelligence, and blockchain, and everything else. So what do you think is happening? Did the early Big Data days fail? You know, Frank Gens this morning called it the experimentation phase. >> I mean, I don't really think Frank has a good handle on what's going on in my opinion, cause I think it's not an experimentation, it's real. That was a wave that was essentially the beginning of, not an experimentation, of realization and reality that data, unstructured data in particular, was real and relevant. Hadoop looked good off the tee, middle of the fairway as we say, but the thing about the Hadoop ecosystem is that it validated big data. Every financial institution jumped on it. Everyone who knew anything about data or had data issues or had a lot of data, knew the value. It's just that the apparatus to build via Hadoop was too expensive. In comes Cloud computing at scale, so, as Cloud was accelerating, you look at the Amazon Web Services Revenue Chart you can almost see the D mark where the inflection point is on the hockey stick of Amazon's revenue numbers. And that is the point in time where Hadoop was on the declining of failure. Hortonworks sold to Cloudera.
Cloudera's earnings are at an all-time low. A lot of speculation of their entire strategy, and their venture-backed company went public, but bet the ranch to be the next data warehouse. That wasn't the business model. The data business was a completely new industry, completely being re-transformed, and, far from experimentation, it is real and definitely growing like a weed, but changing because of the underpinning infrastructure dynamics of Cloud Native, Microservices, and that's only going to get highly accelerated and the people who talk about context of industry like Frank, are going to be off. Their predictions will be off because they don't really see the new picture clear enough, in my opinion, >> So, >> I think he's off. >> So it's not so much of a structural change like it was when we went from, you know, mainframes to PCs, it's more of a sort of flow, evolution into this new area which is being driven, powered by new technologies, we talk about blockchain, machine intelligence and other things. >> Well, I mean, the make up of companies that were building quote, "Big Data Solutions", were trying to build an apparatus or mechanisms to solve big data problems, but none of them actually had the big data problem. None of them were full of data. None of them had a lot of data. The ones that had problems were the financial institutions, the credit card companies, the people who were doing a lot of large scale, um, with Google, Facebook, and some of the hyperscalers. They were actually dealing with the data tsunami themselves, so the practitioners ended up driving it. You guys at Wikibon, we pointed this out on theCUBE many times, that the value was going to come from the practitioners not the suppliers of so called technology. So, you know, the Clouderas of the world who thought Hadoop would be relevant and growing as a technology were right on one side, on the other side of the coin was the Cloud decimation of that sector.
Cloud computing just completely blew away that Hadoop market because you didn't have to hire a PhD, you didn't have to hire specialty skills to stand up Hadoop clusters. You could actually throw it in the Cloud and get agile quickly, and get value out of data very very quickly. That has been real, it has not been an experiment. There's been new case studies, new companies born, new brands, so it's not an experiment, it is reality, and it's only going to get more real every day. >> And I'd add of course now you've got, you mentioned Cloudera and Hortonworks, you also got MapR reeling. Stu, let's talk about Actifio. So they coined the term Copy Data Management, they created the category, of course they do a lot of backup, I mean, everybody in this space does a lot of backup. And then you saw the Silicon Valley companies come in. Particularly Cohesity and Rubrik, you know, to a lesser extent you got some other guys like Zerto and Druva, but it was really those two companies, Cohesity and Rubrik, they raised more money in their D round than Actifio has since inception. But yet Actifio keeps, you know, plodding along, growing, you know, word is they're profitable, you know, they're not like this really sectioned very East Coast versus kind of West Coast mentality. What's your take on what's going on? >> Yeah, so, Dave right, you look at the early days of Actifio and you say great, Copy Data Management, I have all these copies of data, how do I reduce my cost, get greater utilization than I have and leverage the data? I love the title of the show here, Data Driven. You know, we know at the center of digital transformation if you can't become data driven, like the CMO Brian Reagan got up on stage to talk about that industrialization of data. How am I going along that journey being this, I collected data versus now, you know, data, you know, is the reason that I make decisions, how I make decisions, I get smarter.
The Cloud of course is a huge enabler of this, there's all these services that I can instantly access to be able to get greater insight, and move along with that environment, and if you look underneath all of these backup companies, it's really how I can change that data into business value and drive my business, the metadata underneath and all those pieces, not just the wonky storage and technical solutions that make things better, and I get a faster ROI. It's that data at the core of what we do and how do I get that as a business to accelerate. Because we know IT needs to be able to respond back to the business and data needs to be that rocket fuel. >> Is it the case of data haves and data have-nots? I mean, Amazon has data >> I mean, you're right-- >> and Facebook has data. >> We're talking about Actifio, you brought that up, okay, on this segment, on the inside segment, which is cool, they're here at the event, but they have a good opportunity but they also, they got some challenges. I mean, the thing about Actifio is, to my earlier point, which side of the wave are they on? Are they out too much out front with virtualization and Amazon, the Cloud will take them away, or are they riding the Cloud wave, making that an enabler? And I think what really I like about Actifio is because they have a lot of virtualization capabilities, the question is can they scale that, Stu, to containers and microservices, because the real opportunity in this market, in my opinion, is going to build on the virtualization trend, and make container aware, microservices capabilities because if they don't, then that would be a telltale sign. Now either way it's a hot M&A market right now, so I think being in the market, horse on the track as you say. You look at the Tableau-Salesforce deal, monster numbers, we are in clearly a hot IPO market and a major roll up market on the M&A side.
I think clearly there's two types of companies, old and new, and that is really what people are looking at, are they part of the old guard, are they the new guard. So, you know, this to me is going to be a telltale sign of what they do next, can they make the data driven value proposition, you articulated Stu, actually a reality. It's going to come from the technology underneath. >> Well I think it's a really interesting point you're making because, Stu as you probably know, Amazon announced the Amazon backup service right, and you talked about the backup guys and they're like, "Ah yeah it's backup, but it really doesn't do recovery, it's really not that robust". Part of me says, "Uh oh"... >> Watch out. >> "You better move fast", because Amazon has stated, "Hey if you don't move fast we're going to just keep gobbling", and you've seen Amazon do this. What are your thoughts on that? Can these specialists, can they survive, John's talking about M&A. Can the market support all these guys along with the big, you know, traditional guys like Veritas, and Dell EMC, and IBM and Commvault? >> Right, well so Actifio started very much in the data center. They were before this Cloud wave really took off. It's really only in the last year that they've been SaaS-ifying their product. So the question is, does that underlying IP, which wasn't tied to hardware, but, you know, sat at really more of, you know, reminded us of those storage virtualization battles that we talked about for years, Dave, but now they are going in the Cloud. They've got all the partnerships in the Cloud, but they are competing against those new vendors that you talked about like Cohesity and Rubrik out there, and there's big money chasing this environment. So, you know, I want to talk to the customers here and find out, you know, where they are using them, and especially some of those first customers using this--. >> Well they clearly need a Cloud play cause that's clearly where the action is.
But if you look at what's going on with Amazon, Azure, and Google you see a lot of on premises, Stu, because that's where the customers are. So just because the customers are currently not migrating their existing workloads to the Cloud doesn't mean it's not going to happen. So I think there's an opportunity for any company like Actifio, who may or may not be on the curve on the tech side, one little misfire on a tech bet could cripple the company and also make the company. There's a lot of high risk, reward ratio. How they handle containers. How they build on virtualizations. Virtualization is going to be part of the future with Cloud. These are the kind of the dynamics that are going to be in play, and they got some time on their hands because the on premises growth is because the clients are trying to figure out what to do and they're not going to be migrating, lifting, and shifting workloads all off to the Cloud. New will be Cloud based, but enterprises have proven why we are in multi-Cloud and hybrid-Cloud conversation, that... The enterprise on premises is not going away anytime soon. >> I want to ask you guys, John you specifically, about this sort of new Silicon Valley growth model and how companies are achieving escape velocity. When you and I made our first trip to Barcelona, I was having dinner with David Scott who was the CEO of 3PAR and he said to me, when I came to 3PAR the board said, "Hey we're willing to invest 30 million dollars in this company". And David Scott said to them, "I need way more, I need 80 million dollars". Today 80 million dollars is nothing. You saw, you know, Pure Storage hit escape velocity, was just throwing money at the problem, and growing. You're seeing Cohesity-- >> Well you can debate that. I mean, if you have to build a rocket ship, hit critical mass and you want to fund that, you're going to need an enterprise.
However, there's arguments on the south side that you can actually get a flywheel effect going early with less capital. So again, that's 3PAR-- >> But so that's my point. >> Well so that's 3PAR, that was 2009. >> So, yeah that was early days so that's ancient history. But software is generally supposed to be a capital efficient market, yet these companies are raising many hundreds and hundreds of millions, you know, half a billion dollar raises and they are putting it largely in promotion. Is that the new model, is that sustainable, in your view? >> Well I think you're conflating capital market dynamics with viable companies to invest in. I think there's a robust seed and Series A market, but the Series A market in Silicon Valley is, you know, 15 to 25 million, it used to be 3 to 5. So the dynamics are changing on funding. There's just not enough companies, horses on the track, to deploy capital at tranches of 30, 50, 80 million. So the capital markets are clearly going to have the money available so it's a market for the startups and the broke companies. That's separate from actually winning. So you've got Slack going public this week, you have other companies who have built a business on a SaaS flywheel, and then everything else is gravy in terms of the go to market, they got a couple hundred million. I think Slack got close to a billion dollars in cash that they've raised. So they're flooded with cash, they'll never spend it all. So there are some companies that can achieve success like that. Others have to buy market share, they got to push and build out a sales force, and it's going to be a function of the role of customer, customization, specialism, and whatnot. But with AI and machine learning there's more efficiencies coming in so I think the modern company can do more with less. >> What do you think of the ride sharing IPOs, Uber and Lyft, are you a bull? Do you like 'em or do you think it's just, they're losing too much money and can't sustain it?
>> I was thinking about that this morning after looking at the article in the Wall Street Journal and our coverage on SiliconANGLE. You look at Zoom Communications, I like models that actually can take a simple concept and an existing mature market and disrupt it by being Cloud efficient and completely SaaS and data driven. That is an example of success. That to me, Zoom Communications and Zscaler, another company that we talk to, these are companies that were built with a specific value proposition that made the product and they were targeting mature markets with leaders in it. Video conferencing, Webex, Citrix, Zoom came out of nowhere, optimized on a simple value proposition, used Cloud scale and data, and crushed it. Uber, Lyft, little bit different issue. They're losing money but I would bet on the long term that that is going to be the use case for how people will have transportation. I think that's the long game and I think that without regulatory kind of pressure, without, there's regulatory issues, that's really the big risk. But I believe that Uber and Lyft absolutely will be long brands and just like Facebook was early on, although they threw off a lot of cash, those guys are building for penetration, and that's where the funding matters. Penetration is critical. Now they're the standard, and people really don't take taxis anymore, but they're really using the ride sharing. And you get the scooters, you get the bikes, they're all sequencing into these adjacent markets which drains more cash but builds the brand, builds the footprint. >> Well that's what I want to ask you. So people compare the early Uber, Lyft, taxi, ride sharing to Amazon selling books, but there's all these other adjacencies. You have a thought on this? >> Well, just, you know, right, Uber Eats is a huge opportunity for that environment and autonomous vehicles everybody talks about, but it's still quite a ways out.
So there are a lot of different- >> Scooters are the same, we're in San Diego, there are 8 gazillion scooters. >> San Diego had fun, you know, going around on their electronic scooters, boy, talk about the gig economy, they pay people at night, to like go pay by the recharge you do on that, what is the future of work, >> Yeah, that's a great point. >> and how can we have that-- >> Uber's going to look a lot like Amazon. You subsidize the front end retail side of the business, but look at the data that they throw up. Uber's data that they're gathering on, not only customer behavior, but just mapping services, 3-D mapping is going to be huge, so you've got these cars that are essentially bots on the road, providing massive mapping and traffic analysis. So you're going to start to see data driven, like Actifio's slogan here, be a big part of all design decisions and value proposition from any company out there. And if they're not data driven I think they're going to be toast. >> Probably could because there's that data and that machine learning underneath, that can optimize, you know, where the people are, how I use the system, such a huge wave that we're watching. >> How about one last topic which is heavily data driven, it's Facebook. Facebook is obviously a data driven company, the Facebook crypto play, I love it, I love Facebook. I'm a bull on Facebook, I think it's been beat up. I think, two billion users is hard to replicate, but what's your thoughts on their crypto play? >> Well it's kind of a middle finger to the United States of America but it's a great catalyst for the international market because crypto needed a whale to come in and bring all those users in. Bad timing, in my mind, for Facebook, because given all the anti-trust and regulatory conversations, what better way to show your threat to the world order than when you say we're going to run a banking system with a collection of international companies.
I think the US is going to look at this and say, "Oh my God! They can't even be trusted to handle personal information and we're going to now let them run a banking system? Run monetary, basically World Bank equivalent infrastructure?" No frickin way! I think this is going to be a major road to home. I think Facebook has to really make this an ecosystem play if they want to make it work, that's their telegraphic move they're saying, "Hey we want to do for the community but we got our own wallet and we got our own network". But they bring a lot to the table so it's going to be a really interesting dynamic to see the coalescing around Facebook because they could make the market. Look what Instagram did to Snapchat. They literally killed the company, took all their users. That is what's going to happen in the digital money economy when Facebook brings billions of users user experience with money. What happened with Snapchat with Instagram is going to happen to the World Bank if this continues. >> Where do you stand on the government breaking up big tech? >> So Dave, you know, you look in these companies, it's not easy to pull those apart. I don't think our government understands how most of big tech works. You know, take Amazon and AWS, that's one company underneath it. You know, Facebook, Microsoft. You know, Microsoft went through all these issues. Question Dave, we've had lots of debates on Twitter you know, are they breaking the law, are they not doing trust? I have some trust issues with Facebook myself, but most of the big companies up there I don't think the anti-trust kicks in, I don't think it makes sense to pull them apart. >> Stu, the Facebook story and the YouTube story are simply this, they have been hiding under the platform rules, of the Digital Millennium Copyright Act, and they are an editing platform so you can't sue them. Okay, once they become a publisher they could be sued. Just like CNN, Fox News, and everybody else. And we're publishers.
So they've been hiding behind the platform. That gig is up. They're going to have to address are you a platform or are you a publisher? You're making editing decisions around what users can see with software, you are essentially editing the feed, that is a publisher role, with that becomes responsibility, and then obviously regulatory. >> Well Facebook is conflicted right now. They're trying to figure out which side of the fence to go on. >> No no no! They want one side! The platform side! They're making billions of dollars! >> Yeah but so they're making decisions about you know, which content to show and whether they monetize it. And when it's controversial content, they'll turn down the ads a little bit but they won't completely eliminate it sometimes. >> So, Dave, the only thing that the partisans in politics seem to agree on though is that big tech has too much power. You know, what's your take on that? >> Well so I think that if they are breaking the law then they should be moderated. But I don't think the answer is to go, Elizabeth Warren-style, hard after them and break them up. I think you got to start with okay, because you break these companies up what's going to happen is they're going to be worth more, it's going to be AT&T all over again. >> While you guys were at Cisco Live, we covered this at the Amazon Web Services Public Sector Summit. The real issue in government, Stu, is there's too much tech for bad on the PR side, and there's not enough tech for good. Tech is not bad, tech is good. There's not enough promotion around the apps around there. There's real venture funds being created to promote tech for good. That's going to be where the tide will turn. When does the tech industry start doing good stuff, not bad stuff. >> All right we've got to wrap. John, thanks for sitting in. Thank you for watching. Be right back, we're here at Actifio Data Driven 2019. From Boston this is theCUBE, be right back. (upbeat techno music)

Published Date : Jun 19 2019

SUMMARY :

Brought to you by Actifio. So what do you think is happening? but bet the ranch to be the next data warehouse. like it was when we went from, you know, mainframes to PCs, that the value was going to come from the practitioners But yet Actifio keeps, you know, plodding along, and how do I get that as a business to accelerate. I mean, the thing about Actifio is, to my earlier point, and you talked about the backup guys and they're like, Can the market support all these guys along with the and find out, you know, where they are using them, and they're not going to be migrating, lifting, I want to ask you guys, John you specifically, I mean, If you have to build a rocket ship, of millions, you know, half a billion dollar raises So the capital markets are clearly going to have and they were targeting mature markets with leaders in it. So people compare the early Uber, Lift, Taxi, Ride sharing Well, just, you know, right, Uber Eats is a huge Scooters are the same, we're in San Diego, there are but look at the data that they throw up. that can optimize, you know, where the people are, the Facebook crypto play, I love it, I love Facebook. I think this is going to to be a major road to home. but most of the big companies up there and they are an editing platform so you can't sue them. side of the fence to go on. you know, which content to show So, Dave, the only thing that the partisans in politics I think you got to start with okay, There's not enough promotion around the apps around there. Thank you for watching.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Microsoft | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
Uber | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
John Furrier | PERSON | 0.99+
Facebook | ORGANIZATION | 0.99+
Elizabeth Warren | PERSON | 0.99+
3PAR | ORGANIZATION | 0.99+
Combol | ORGANIZATION | 0.99+
Stu Miniman | PERSON | 0.99+
15 | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
David Scott | PERSON | 0.99+
Palo Alto | LOCATION | 0.99+
San Diego | LOCATION | 0.99+
Veritas | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
Frank | PERSON | 0.99+
Brian Regan | PERSON | 0.99+
30 million dollars | QUANTITY | 0.99+
Barcelona | LOCATION | 0.99+
Frank Genus | PERSON | 0.99+
80 million dollars | QUANTITY | 0.99+
AT&T | ORGANIZATION | 0.99+
World Bank | ORGANIZATION | 0.99+
Cloudera | ORGANIZATION | 0.99+
3 | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
Cohesity | ORGANIZATION | 0.99+
2009 | DATE | 0.99+
CNN | ORGANIZATION | 0.99+
YouTube | ORGANIZATION | 0.99+
Webex | ORGANIZATION | 0.99+
Zscaler | ORGANIZATION | 0.99+
30 | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
two types | QUANTITY | 0.99+
Digital Millennium Copyright Act | TITLE | 0.99+
Citrix | ORGANIZATION | 0.99+
billions of dollars | QUANTITY | 0.99+
Today | DATE | 0.99+
Lift | ORGANIZATION | 0.99+
Actifio | ORGANIZATION | 0.99+
two companies | QUANTITY | 0.99+
8 gazillion scooters | QUANTITY | 0.99+
first trip | QUANTITY | 0.99+
Fox News | ORGANIZATION | 0.99+
United States of America | LOCATION | 0.99+
Boston, Massachusetts | LOCATION | 0.99+
Actifio 2019 | TITLE | 0.98+

Lewis Kaneshiro & Karthik Ramasamy, Streamlio | Big Data SV 2018


 

(upbeat techno music) >> Narrator: Live, from San Jose, it's theCUBE! Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV, everybody. My name is Dave Vellante and this is theCUBE, the leader in live tech coverage. You know, this is our 10th big data event. When we first started covering big data, back in 2010, it was Hadoop, and everything was a batch job. About four or five years ago, everybody started talking about real time and the ability to affect outcomes before you lose the customer. Lewis Kaneshiro is here. He's the CEO of Streamlio and he's joined by Karthik Ramasamy who's the chief product officer. They're both co-founders. Gentlemen, welcome to theCUBE. My first question is, why did you start this company? >> Sure, we came together around a vision that enterprises need to access the value around fast data. And so as you mentioned, enterprises are moving out of the slow data era, and looking for a fast data value to their data, to really deliver that back to their users or their use cases. And so, coming together around that idea of real time action what we did was we realized that enterprises can't all access this data with projects right now that are not meant to work together, that are very difficult, perhaps, to stitch together. So what we did was create an intelligent platform for fast data that's really accessible to enterprises of all sizes. What we do is we unify the core components to access fast data, which is messaging, compute and stream storage, accessing the best of breed open-source technology that's really open-sourced out of Twitter and Yahoo! >> It's a good thing. I was going to ask why does the world need another, you know, streaming platform, but Lewis kind of touched on it, 'cause it's too hard. It's too complicated, so you guys are trying to simplify all that.
>> Yep, the reason mainly we wanted to simplify it is because, based on all our experiences at Twitter and Yahoo!, one of the key aspects was to simplify it so that it's consumable by regular enterprises, because companies like Twitter and Yahoo! can afford the talent and the expertise in order to do these real time platforms. But when it goes to normal enterprises, they don't have access to the expertise and the cost that they might have to incur. So, because of that we wanted to use these open-source projects that Twitter and Yahoo! provided, combine them, and make sure that you have a simple, easy, drag and drop kind of interface, so that it's easily consumable for any enterprise. Essentially, what we are trying to do is reduce the (mumbles) for enterprises for real time, for all enterprises. >> Dave: Yeah, enterprises will pay up... >> Yes. >> For a solution. The companies that you used to work for, they all gladly throw engineering at the problem. >> Yeah. >> Sure. >> To save time, but most organizations, they don't have the resources and so. Okay, so how does it, would it work prior to Streamlio? Maybe take us through sort of how a company would attack this problem, the complexities of what they have to deal with, and what life is like with you guys. >> So, current state of the world is it's a fragmented solution, today. So the state of the world is where you take multiple pieces of different projects and you'd assemble them together in formats so that you can do (mumbles) right? So the reason why people end up doing that is each of these big data projects that people use was designed for a completely different purpose. Like messaging is one, and compute is another one, and the third one is storage.
So, essentially what we have done as a company is to simplify this aspect by integrating these well-known, best-of-breed projects. For messaging we use something called Apache Pulsar, for compute we use something called Apache Heron, from Twitter, and similarly for storage, for real time storage, we use something called Apache BookKeeper, and we unify them, so that, under the hood, it may be three systems, but, as a user, when you are using it, it serves or functions as a single system. So you install the system, and ingest your data, express your computation, and get the results out, in one single system. >> So you've unified or converged these functions. If I understand it correctly, we were talking off camera a little bit, the team, Lewis, that you've assembled actually developed a lot of these, or hugely committed to these open-source projects, right? >> Absolutely, co-creators of each of the projects and what that allows us to do is to really integrate, at a deep level, each project. For example, Pulsar is actually a pub/sub system that is built on BookKeeper, and BookKeeper, in our minds, is a pure list best-of-breed stream storage solution. So, fast and durable storage. That storage is also used in Apache Heron to store state. So, as you can see, enterprises, rather than stitching together multiple different solutions for queuing, streaming, compute, and storage, now have one option that they can install in a very small cluster, and operationally it's very simple to scale up. We simply add nodes if you get data spikes. And what this allows is enterprises to access new and exciting use cases that really weren't possible before. For example, machine learning model deployment to real time. So I'm a data scientist and what I found is in data science, you spend a lot of time training models in batch mode.
It's a legacy type of approach, but once the model is trained, you want to put that model into production in real time so that you can deliver that value back to a user in real time. Let's call it an under-two-second SLA. So, that has been a great use case for Streamlio because we are a ready made intelligent platform for fast data, for ML/AI deployment. >> And the use cases are typically stateful and you're persisting data, is that right? >> Yes, use cases, it can be used for stateless use cases also, but the key advantage that we bring to the table is stateful storage. And since we ship along with the storage (mumbles) stateful storage becomes much easier because of the fact that it can be used to store a real intermediate state of the computation or it can be used for the staging (mumbles) data when it spills over from what the memory is it's automatically stored to disk or you can even keep the data for as long as you want so that you can unlock the value later after the data has been processed for the fast data. You can access the lazy data later, in time. >> So give us the run-down on the company, funding, you know, VCs, head count. Give us the basics. >> Sure, we raised a Series A from Lightspeed Venture Partners, led by John Vrionis and Sudip Chakrabarti. We've raised seven and a half million and emerged from stealth back in August. That allowed us to ramp up our team to 17, now, mainly engineers, in order to really have a very solid product, but we launched post rev, prelaunch and some of our customers are really looking at geo-replication across multiple data centers and so active-active geo-replication is an open-source feature in Apache Pulsar, and that's been a huge draw, compared to some other solutions that are out there. As you can see, this theme of simplifying architecture is where Streamlio sits, so unifying queuing and streaming allows us to replace a number of different legacy systems. So that's been one avenue to help growth.
The other, obviously, is on the compute piece. As enterprises are finding new and exciting use cases to deliver back to their users, the compute piece needs to scale up and down. We also announced Pulsar Functions, which is stream-native compute that allows very simple function computation in native Python and Java, so you spin up the Apache Pulsar cluster or Streamlio platform, and you simply have compute functionality. That allows us to access edge use cases, so IoT is huge, kind of exciting POCs for us right now where we have connected car examples that don't need heavyweight schedule or deployment at the edge. It's Pulsar, Pulsar Functions. What that allows us to do are things like fraud detection, anomaly detection at the edge, model deployment at the edge, interpolation, observability, and alerts. >> And, so how do you charge for this? Is it usage based? >> Sure. What we found is enterprises are more comfortable on a per node basis, simply because we have the ambition to really scale up and help enterprises really use Streamlio as their fast data platform across the entire enterprise. We found that having a per data charge rate actually would limit that growth, and so per node and shared architecture. So, we took an early investment in optimizing around Kubernetes. And so, as enterprises are adopting Kubernetes, we are the most simple installation on Kubernetes, so on-prem, multicloud, at the edge. >> I love it, so I mean for years we've just been talking about the complexity headwinds in this big data space. We certainly saw that with Hadoop. You know, Spark was designed to certainly solve some of those problems, but. Sounds like you're doing some really good work to take that further. Lewis and Karthik, thank you so much for coming on theCUBE. I really appreciate it. >> Thanks for having us, Dave. >> All right, thank you for watching. We're here at Big Data SV, live from San Jose. We'll be right back. (techno music)
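
The Pulsar Functions idea mentioned in the interview above (a simple function applied to each message, without a heavyweight scheduler) can be sketched in Python. This is a minimal illustration under stated assumptions, not Streamlio's code: it follows the plain-function style that Pulsar Functions, as far as we know, accepts for Python (one input in, one result out), while the JSON message format, the field names car_id and speed_kph, and the SPEED_LIMIT_KPH threshold are all hypothetical and chosen only to mirror the connected-car alerting example.

```python
# Sketch of an edge alerting function written as a plain Python callable:
# it receives one message and returns one result.
# ASSUMPTIONS (not from the interview): messages are JSON strings with
# "car_id" and "speed_kph" fields, and 130 km/h is the alert threshold.
import json

SPEED_LIMIT_KPH = 130.0  # hypothetical alert threshold


def process(input):
    """Return the reading annotated with an alert flag if it is too fast."""
    reading = json.loads(input)          # decode one incoming message
    speed = float(reading["speed_kph"])  # normalize the speed to a float
    return json.dumps({
        "car_id": reading["car_id"],
        "speed_kph": speed,
        "alert": speed > SPEED_LIMIT_KPH,  # True only above the threshold
    })
```

Because the function is ordinary Python with no framework imports, it can be exercised locally by calling process with a JSON string before it is deployed anywhere; how it would be registered on a cluster is outside what the interview describes.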

Published Date : Mar 9 2018

SUMMARY :

brought to you by SiliconANGLE Media and the ability to affect outcomes And so as you mentioned, enterprises are moving out so you guys are trying to simplify all that. and the cost benefits that they might have to reincur. The companies that you used to work for, and what life is like with you guys. so that you can do (mumbles) right? the team, Lewis, that you've assembled so that you can deliver that value so that you can unlock the value later you know, VCs, head count. the compute piece needs to scale up and down. And, so how do you charge for this? have the ambition to really scale up and help enterprises Lewis and Karthik, thank you so much for coming on theCUBE. All right, thank you for watching.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Karthik Ramasamy | PERSON | 0.99+
Karthik | PERSON | 0.99+
Lewis Kaneshiro | PERSON | 0.99+
Dave | PERSON | 0.99+
San Jose | LOCATION | 0.99+
Lightspeed Venture Partners | ORGANIZATION | 0.99+
John Vrionis | PERSON | 0.99+
Lewis | PERSON | 0.99+
2010 | DATE | 0.99+
August | DATE | 0.99+
three systems | QUANTITY | 0.99+
Streamlio | ORGANIZATION | 0.99+
Yahoo! | ORGANIZATION | 0.99+
each | QUANTITY | 0.99+
Twitter | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Java | TITLE | 0.99+
first question | QUANTITY | 0.99+
Sudip Chakrabarti | PERSON | 0.99+
one option | QUANTITY | 0.99+
Python | TITLE | 0.99+
both | QUANTITY | 0.99+
seven and a half million | QUANTITY | 0.99+
17 | QUANTITY | 0.98+
each project | QUANTITY | 0.98+
third one | QUANTITY | 0.98+
Kubernetes | TITLE | 0.98+
single system | QUANTITY | 0.98+
first | QUANTITY | 0.96+
Pulsar | TITLE | 0.96+
Streamlio | TITLE | 0.96+
Spark | TITLE | 0.94+
Bookkeeper | TITLE | 0.94+
one | QUANTITY | 0.93+
one single system | QUANTITY | 0.92+
theCUBE | ORGANIZATION | 0.91+
today | DATE | 0.91+
Big Data SV 2018 | EVENT | 0.9+
Apache | ORGANIZATION | 0.89+
Silicon Valley | LOCATION | 0.89+
SLA | TITLE | 0.89+
one avenue | QUANTITY | 0.89+
Series A | OTHER | 0.88+
five years ago | DATE | 0.86+
Big Data | EVENT | 0.85+
About four | DATE | 0.85+
Big Data SV | EVENT | 0.82+
IOT | TITLE | 0.81+
Poser | TITLE | 0.75+
Big Data SV | ORGANIZATION | 0.71+
10th big | QUANTITY | 0.67+
Apache Heron | TITLE | 0.65+
under two second | QUANTITY | 0.62+
data | EVENT | 0.61+
Streamlio | PERSON | 0.54+
event | QUANTITY | 0.48+
Hadoop | TITLE | 0.45+
Krem | TITLE | 0.32+

Jaspreet Singh, Druva & Jake Burns, Live Nation | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting: Big Data Silicon Valley. Brought to you by SiliconANGLE Media, and its ecosystem partners. >> Welcome back, everyone, we're here live at San Jose for Big Data SV, Big Data Silicon Valley. I'm John Furrier, cohost of theCUBE. We're here with two great guests, Jaspreet Singh, founder and CEO of Druva, and Jake Burns, VP of Cloud Services of Live Nation Entertainment. Welcome to theCUBE, so what's going on with Cloud? Apps are out there, backup, recovery, what's going on? >> So, we went all in with AWS, and in late 2015 and through 2016 we moved all of our corporate infrastructure into AWS, and I think we're a little bit unique in that situation, so in terms of our posture, we're 100% Cloud. >> John: Jaspreet, what's going on with you guys in the Cloud, because we've talked about this before, with a lot of the apps in the cloud, backup is really important. What's the key thing that you guys are doing together with Live Nation? >> Sure, so I think the notion of data is now pretty much everywhere. The data was captured, controlled in the data center, now it's getting decentralized, getting into apps and ecosystems, and software and services deployed either at the edge or in the Cloud. As the data gets more and more decentralized, the notion of data management, be it backup, be it discovery, anything, has to get more and more centralized. And we strongly believe the epicenter of this whole data management has to move to Cloud. So, Druva is a SaaS-based provider for data management. And we work with Live Nation to protect the apps not just in the data center. But, also at the edge and also the Cloud data center. The applications deployed in the Cloud, be it Live Nation or Ticketmaster. >> And what are some of the workloads you guys are backing up? That's with Druva. >> Yeah so, it's pretty much all corporate, IT applications. You know, typical things you'd find in any IT shop really.
So, you know, we have our financial systems and we have some of our smaller ticketing systems and you know, corporate websites. Things of that nature. So, it's like we have 120 applications that are running and it's just really kind of one of everything. >> We were talking before we came on camera about the history of computing and the Cloud has obviously changed the game. How would you compare the Cloud as a trend relative to operationalizing the role of data and obviously GDPR, ransomware. These are things that, now with the perimeter gone, there's worries. So now, how do you guys look at the Cloud? So Jake, I will start with you. If you can compare and contrast, where we have come from and where we are going. Role of the Cloud. Significant, primary, expanding. How would you compare that? And how would you talk to someone who says, "Hey, I'm still in the data center world"? What's going on with Cloud? >> Well, yeah, it's significant and it's expanding, both. And you know, it's really transforming the way we do business. So you know just from a high level, things like shortening the time to market for applications, going from three to six months just to get a proof of concept started to today, you know, in the Cloud. Being able to innovate really by trying things trying to... we try 20 different things, decide what works, what doesn't work. And at very low cost. So, it allows us to really do things that just weren't possible before. So, also, we move more quickly because, you know, we're not afraid of making mistakes. If we provision infrastructure and we don't get it right the first time, we just change it. You know, that's something that we would just never be able to do previously in the data center. So to answer your question, everything is different. >> And the as-a-service model's been kind of key. Is the consumption on your end different like I mean radically different?
Like give an example of like how much time would be saved or taken to use the other, traditional approaches. >> Oh for sure. You know, the role of IT has completely changed because you know, instead of worrying about nuts and bolts and servers and storage arrays and data centers. You know, we could really focus on the things that are important to the business. You know, those things delivering results for the business. So, bringing value, bringing applications online and trying things that are going to help you know, us do business rather than focusing on all the minutiae. All that stuff's now been outsourced to Cloud providers. So, really, we kind of have a similar head count and staff. But, we are focused on things that bring value rather than things that are just kind of frivolous. >> Jaspreet, you guys have been a very successful startup growing rapidly. The Cloud's been a good friend, the trend is your friend with the Cloud. >> What's different operationally that you guys are tapping into? What's that tail wind for Druva that's making you guys successful? And is it the ease of use? Is it the ease of consumption? Is it the tech? What's the secret to success with Druva? >> Sure, so, we believe cloud is a very big business transformation trend more than a technology trend. It's how you consume service with a fixed SLA, with a fixed service agreement across the globe. So, it's ease of consumption. It's simplicity of use. It's orchestration. It's cost control. All those things. So, our promise to our customers is the complexity of data management, backups, archives, data protection, which is a risk mitigation project. You know, can be completely abstracted by a simple service. For example, you know, Live Nation consumes the Druva service through Amazon Marketplace. So, think about consuming a critical service like data management through simplicity of marketplace, pay as you go, as you consume the service. Across the globe.
In the US, in Australia, and Europe. And also, it helps vendors like us to innovate better. Because we have a controlled environment to understand how different customers are using the service and be able to orchestrate better security posture, better threat prevention, better cost control. DevOps. So, it improves the posture of the service being offered and helps the customer consume it. >> You both are industry veterans by today's standards unless you're like 24 doing some of the cryptocurrency stuff that, you know, doesn't know the old IT baggage. How would you guys view the multi-Cloud conversation? Because we hear that all the time. Multi-Cloud has come up so many times. What does it mean? Jake, what does multi-Cloud actually mean? Is it the same workload across multiple Clouds? Is it the fact that there are multiple Clouds? Certainly, there will be multiple Clouds. But, so, help us digest what that even means these days. >> Yeah, that's a great question and it's a really interesting topic. Multi-Cloud is one of those things where, you know, there's so many benefits to using more than one Cloud provider. But, there are also a lot of pitfalls. So, people really underestimate the difference in the technology and the complexity of managing the technology when you change Cloud providers. I'm talking primarily about infrastructure service providers like Amazon Web Services. So, you know, I think there's a lot of good reasons to be multi-Cloud to get the best features out of different providers, to not have, you know, the risk of having all your data in one place with one vendor. But, you know, it needs to be done in such a way where you don't take that hit in overhead and complexity and you know, I think that's kind of a prohibitive barrier for most enterprises. >> And what are the big pitfalls that you see? Is it mainly underestimating the stack complexity between them or is it more of just operational questions? I mean what are the pitfalls that you've observed? 
>> Yeah, so, moving from like a typical IT data center environment to a public Cloud provider like AWS. You're essentially asking all your technical staff to start speaking in a new language. Now if you were to introduce a second Cloud provider to that environment, now you're asking them to learn a third language as well. And that's a lot to ask. So, you really have two scenarios where you can make that work today without using a third party. And that's ask all of your staff to know both and that's just not feasible. Or have two tech teams. One for each Cloud platform. That's really not something businesses want to do. So, I think the real answer is to rely on a third party that can come in and abstract one of those Cloud complexities, well, one of those Cloud providers, out. So, you don't have to directly manage it. And in that way, you can get the benefit of being multi-Cloud, that data protection of being multi-Cloud. But, not have to introduce that complexity to your environment. >> To provide some abstraction layer. Some sort of software approach. >> Yeah, like for example, if you have your primary systems in AWS, and you use software like Druva Phoenix to back up your data and you put that data into a second Cloud provider. You don't have to have an account with that second Cloud provider. You don't have to have the risk associated with that or the complexity associated with that, which I think is a very >> And that's where you're looking for differentiation. We look at vendors and say, hey, don't make me work harder. >> Right. >> And add new staff. Solve the problem. >> Yeah, it's all about solving problems right? And that's why we're doing this. >> So, Druva, talk about this thing, because we talked about it earlier. To me, it could be, oh, we're on Azure. Well, they have Office 365, of course they're going to have Microsoft. A lot of people have a lot going on in AWS. 
So, maybe we're not yet in the world where you can actually provision the same workload across Clouds. It would be nice to have that someday if it was seamless. But, I think that might be the nirvana. But at the end of the day, an enterprise might have Office 365 and some Azure. But, I've got mostly Amazon over here that I'm doing a lot of development on and doing DevOps, and I'm on-prem. How do you talk to that? Because that's like you got to back up Office 365, you got to do the on-prem thing, you got to do the Amazon thing. How do you guys solve that problem? What's the conversation? >> Absolutely. I think over time we believe best of breed will win. So, people will deploy different types of cloud for different workloads, be it SaaS, hosted IaaS, or platform like PaaS. When they do that, when they host multiple services, software to deploy services, I think it's hard to control where the data will go. What we can orchestrate or anybody can orchestrate is centralizing the data management part of it. So, Druva has the best posture, has the best coverage across multiple heterogeneous Clouds, you know: services like Office 365, Box, or Salesforce, or platforms like S3 or DynamoDB through our product called Apollo, or hosted platforms like what Live Nation is using through our Phoenix product line. So getting the breadth of coverage, consistency of policies on a single platform is what will make enterprises adopt what's best out there without worrying about how you build abstraction for data management. >> Jake, what's the biggest thing you see with people who are moving to the Cloud for the first time? What are they struggling with? Is it the idea that there's no perimeter? Is it staff training? I mean, as people move from test/dev or start to put production in the Cloud, what are some of the critical things they should think about? >> Yeah, there are so many of them. 
But first, really, it's just getting buy-in, you know, from your technical staff because, you know, in an enterprise environment you bring in a Cloud provider and it's very easily framed as if we're just being outsourced, right? So, I think getting past that barrier first and really getting through to folks and letting them know that really this is good for you. This is not bad for you. You're going to be learning a new skill, very valuable skill, and you're going to be more effective at your job. So, I think that's the first thing. After that, once you start moving to the Cloud, then, the thing that becomes apparent very quickly is cost control. So, you know, the thing with public Cloud is you know, before, with the traditional data center, you had this really kind of narrow range of what IT could cost. Now we have this huge range. And yes, it can be cheaper than it was before. But, it can also be far more expensive than it was before. >> So, services sprawl or just not paying attention? Both? >> Well, essentially you're giving your engineers a blank check. So, you need to have some governance and, you know, you really need to think about things that you didn't have to think about before. You're paying for consumption. So, you really have to watch your consumption. >> So, take me through the mental model of deduplication in the Cloud. Because I'm trying to like visualize it or grok it a little bit. Okay, so, the Cloud is out there, data's everywhere. And do I move the compute to the data? How does the backup and recovery and data management work? And does dedup change with Cloud? Because some people think I got my dedup already and I'm on premise. I've been doing these old solutions. How does dedup specifically change in the Cloud or does it? >> One, scale changes. You're looking at, you know, the best dedup systems, if you look historically, you know, were 100 terabyte, 200 terabyte dedup indexes, Data Domain. 
The scale changes, you know, customers expect massive scale in Cloud. Our largest customer had 10 petabytes in a single dedup index. It's a 100x scale difference compared to what traditional systems could do. Number two, you could create a quality of service which is not really bound by a fixed, you know, algorithm like variable length or whatever. So, you can optimize dedup very clearly for the right workload. The right dedup for the right workload. So, you may dedup Office 365 differently than your VMware instances, compared to your Oracle databases or your Endpoint workload. So, that as-a-service business model helps you create a custom, tailored solution for the right data, and bring the scale. You don't have the complexity of scale, but you get the benefit of scale, all, you know, simply managed in the cloud. >> Jake, what's it like working with Druva? What's the benefit that they bring to you guys? >> Yeah, so, specifically around backups for our enterprise systems, you know, that's a difficult challenge to solve natively in the Cloud. Especially if you're going to be limited to using Cloud native tools. So, it's really a perfect use case for a third party provider. You know, people don't think about this much but in the old days, in the data center, you know, our backups went offsite into a vault. They were on tapes. It was very difficult for us to lose those or for them to be erased accidentally or even intentionally. Once you go into the Cloud, especially if you're all in with the Cloud, like we are. Everything is easier. And so, accidents are easier also. You know, deleting your data is easier. So, you know, what we really want and what a lot of enterprises want. >> And security too is a potential >> Absolutely, yeah. And so, what we want is we want to get some of that benefit, you know, back that we had from that inefficiency that we had beforehand. We love all the benefits of the Cloud. But, we want to have our data protected also. 
So, this is a great role for a company like Druva to come in and offer a product like Phoenix and say, you know, we're going to handle your backups for you essentially. So, we're going to put it in a safe place. We're going to secure it for you. And we're going to make sure it's secure for you. And doing it as software as a service, like Druva does with Phoenix, I think is the absolute right way to go. It's exactly what you need. >> Well, congratulations Jake Burns, Vice President of Cloud Services. >> Thank you. >> At Live Nation Entertainment. Jaspreet Singh, CEO of Druva, great to have you on. Congratulations on your success. >> Thank you. >> Inside the tornado called Cloud computing. A lot more stuff coming. More CUBE coverage coming up after this short break. Be right back. (electronic music)
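
Editor's illustration: the deduplication exchange above (a single dedup index, with dedup tuned per workload) can be sketched in a few lines. This is a minimal toy, not Druva's implementation. The class name DedupIndex, the backup helper, and the per-workload chunk sizes are all hypothetical, and it uses simple fixed-size chunks chosen per workload rather than any particular production algorithm.

```python
import hashlib


class DedupIndex:
    """Toy content-addressed chunk store: identical chunks are stored once."""

    def __init__(self):
        # Maps SHA-256 hex digest -> chunk bytes (the "single dedup index").
        self.chunks = {}

    def put(self, data, chunk_size):
        """Split data into fixed-size chunks, store unseen ones, return the recipe."""
        recipe = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first copy of a given chunk consumes storage.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in recipe)


# Hypothetical tuning table: "the right dedup for the right workload".
CHUNK_SIZE_BY_WORKLOAD = {"office365": 4096, "vmware": 65536, "oracle": 8192}


def backup(index, workload, data):
    """Back up data using the chunk size assumed for its workload."""
    return index.put(data, CHUNK_SIZE_BY_WORKLOAD[workload])
```

Backing up the same bytes twice returns the same recipe and adds no new chunks to the index, which is the storage saving the speakers are describing.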

Published Date : Mar 9 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Jake Burns | PERSON | 0.99+
Jaspreet Singh | PERSON | 0.99+
Europe | LOCATION | 0.99+
John | PERSON | 0.99+
John Furrier | PERSON | 0.99+
Live Nation Entertainment | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
US | LOCATION | 0.99+
AWS | ORGANIZATION | 0.99+
Jake | PERSON | 0.99+
Australia | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
100x | QUANTITY | 0.99+
three | QUANTITY | 0.99+
San Jose | LOCATION | 0.99+
One | QUANTITY | 0.99+
Jaspreet | PERSON | 0.99+
Office 365 | TITLE | 0.99+
one | QUANTITY | 0.99+
Live Nation | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
Druva | ORGANIZATION | 0.99+
200 terabyte | QUANTITY | 0.99+
first | QUANTITY | 0.99+
120 applications | QUANTITY | 0.99+
Both | QUANTITY | 0.99+
100% | QUANTITY | 0.99+
100 terabyte | QUANTITY | 0.99+
second | QUANTITY | 0.99+
Phoenix | ORGANIZATION | 0.99+
two scenarios | QUANTITY | 0.99+
late 2015 | DATE | 0.98+
six months | QUANTITY | 0.98+
first time | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
Ticketmaster | ORGANIZATION | 0.98+
2016 | DATE | 0.98+
10 perabyte | QUANTITY | 0.98+
two great guests | QUANTITY | 0.97+
S3 | TITLE | 0.97+
Cloud | TITLE | 0.97+
one vendor | QUANTITY | 0.97+
GDPR | TITLE | 0.97+
single platform | QUANTITY | 0.96+
Oracle | ORGANIZATION | 0.96+
Big Data SV | ORGANIZATION | 0.96+
Azure | TITLE | 0.95+
365 | QUANTITY | 0.95+
today | DATE | 0.94+
20 different things | QUANTITY | 0.94+
Big Data Silicon Valley | ORGANIZATION | 0.94+
Druva Phoenix | TITLE | 0.93+
Druva | TITLE | 0.93+
one place | QUANTITY | 0.93+
Cloud Services | ORGANIZATION | 0.92+
more than one Cloud | QUANTITY | 0.91+
two tech teams | QUANTITY | 0.91+
first thing | QUANTITY | 0.89+
DB | TITLE | 0.89+

Jacques Nadeau, Dremio | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV in San Jose. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante and this is day two of our wall-to-wall coverage. We've been here most of the week, had a great event last night, about 50 or 60 of our CUBE community members were here. We had a breakfast this morning where the Wikibon research team laid out its big data forecast, the eighth big data forecast and report that we've put out, so check that out online. Jacques Nadeau is here. He is the CTO and co-founder of Dremio. Jacques, welcome to theCUBE, thanks for coming on. >> Thanks for having me here. >> So we were talking a little bit about what you guys do. Three year old company. Well, let me start. Why did you co-found Dremio? >> So, it was a very simple thing I saw, so, over the last ten years or so, we saw a regression in the ability for people to get at data, so you see all these really cool technologies that came out to store data. Data lakes, you know, SQL systems, all these different things that make developers very agile with data. But what we were also seeing was a regression in the ability for analysts and data consumers to get at that data because the systems weren't designed for analysts, they were designed for data producers and developers. And we said, you know what, there needs to be a way to solve this. We need to be able to empower people to be self-sufficient again at the data consumption layer. >> Okay, so you solved that problem how? You called it a self-service data platform. >> Yeah, yeah, so self-service data platform and the idea is pretty simple. It's that, no matter where the data is physically, people should be able to interact with a logical view of it. And so, we talk a little bit like it's Google Docs for your data. 
So people can go into the system, they can see the different data sets that are available to them, collaborate around those, create changes to those that they can then share with other people in the organization, always dealing with the logical layer and then, behind the scenes, we have physical capabilities to interact with all the different system we interact with. But that's something that business users shouldn't have to think as much about and so, if you think about how people interact with data today, it's very much about copies. So every time you want to do something, typically you're going to make a copy. I want to reshape the data, I make a copy. I want to make it go faster, I make a copy. And those copies are very, very difficult for people to manage and they could have mixed the business meaning of data with the physical, I'm making copies to make them faster or whatever. And so our perspective is that, if you can separate away the physical concerns from the logical, then business users have a much more, much more likelihood to be able to do something self-service. >> So you're essentially virtualizing my corpus of data, independent of location, is that right, I mean-- >> It's part of what we do, yeah. No, it's part of what we do. So, the way we look at it is, is kind of several different components to try to make something self-service. It starts with, yeah, virtualize or abstract away the details of the physical, right? But then, on top of that, expose a very, sort of a very user-friendly interface that allows people to sort of catalog and understand the different things, you know, search for things that they want to interact with, and then curate things, even if they're non-technical users, right? So the goal is that, if you talk to sort of even large internet companies in the Valley, it's very hard to even hire the amount of data engineering that you need to satisfy all the requests of your end-users of data. 
And so the goal of Dremio is basically to figure out different tools that can provide a non-technical experience for getting at the data. So that's sort of the start of it but then the second step is, once you've got access to this thing and people can collaborate and sort of deal with the data, then you've got these huge volumes of data, right? It's big data and so how do you make that go faster? And then we have some components that deal with, sort of, speed and acceleration. >> So maybe talk about how people are leveraging this capability, this platform, what the business impact is, what have you seen there? >> So a lot of people have this problem, which is, they have data all over the place and they're trying to figure out "How do I expose this to my end-users?" And those end-users might be analysts, they might be data scientists, they might be product managers that are trying to figure out how their product is working. And so, what they're doing today is they're typically trying to build systems internally to provide these capabilities. And so, for example, we're working with a large auto manufacturer. And they've got a big initiative where they're trying to make the data that they have, they have huge amounts of data across all sorts of different parts of the organization and they're trying to make that available to different data consumers. Now, of course, there's a bunch of security concerns that you need to have around that, but they just want to make the data more accessible. And so, what they're doing is they're using Dremio to figure out ways to, basically, catalog all the data below, expose that to the different users, applying lots of different security rules around that, and then create a bunch of reflections, which make the things go faster as people are interacting with the things. >> Well, what about the governance factor? I mean, you heard this in the Hadoop world years ago. 
"Ah, we're going to make, we're going to harden Hadoop, we're going to..." and really, there was no governance and it became more and more important. How do you guys handle that? Do you partner with people? Is it up to the customer to figure that out? Do you provide that? >> It's several different things, right? It's a complex ecosystem, right? So it's a combination of things. You start with partnering with different systems to make sure that you integrate well with those things. So the different things that control some parts of credentials inside the systems all the way down to "What are the file system permissions?", right? "What are the permissions inside of something like Hive and the metastore there?" And then other systems on top of that, like Sentry or Ranger are also exposing different credentialing, right? And so we work hard to sort of integrate with those things. On top of that, Dremio also provides a full security model inside of the sort of virtual space that we work. And so people can control the permissions, the ability to access or edit any object inside of Dremio based on user roles and LDAP and those kinds of things. So it's, it's kind of multiple layers that have to be working together. >> And tell me more about the company. So founded three years ago, I think a couple of raises, >> Yep >> who's backing you? >> Yeah, yeah, yeah, so we founded just under three years ago. We had great initial investors, in Redpoint and Lightspeed, so two great initial investors and we raised about 15 million on that round. And then we actually just closed a B round in January of this year and we added Norwest to the portfolio there. >> Awesome, so you're now in the mode of, I mean, they always say, you know, software is such a capital-efficient business but you see software companies raising, you know, 900 million dollars and so, presumably, that's to compete, to go to market and, you know, differentiate with your messaging and branding. 
Is that sort of what the, the phase that you're in now? You kind of developed a product, it's technically sound, it's proven in the marketspace and now you're scaling the, the go-to-market, is that right? >> That's exactly right. So, so we've had a lot of early successes, a lot of Fortune 100 companies using Dremio today. For example, we're working with TransUnion. We're working with Intel. We actually have a great relationship with OVH, which is the third-largest hosting company in the world, so a lot of great, Daimler is another one. So working with a lot of great companies, seeing sort of great early success with the product with those companies, and really looking to say "Hey, we're out here." We've got a booth for the first time at Strata here and we're sort of letting people know about, sort of, a better way, or easier way, for people to deal with data >> Yeah. >> A happier way. >> I mean, it's a crowded space, right? There's a lot of tools out there, a lot of companies. I'm interested in how you sort of differentiate. Obviously simplification is a part of that, the breadth of your capabilities. But maybe, in your words, you could share with me how you differentiate from the competition and how you break out from the noise. >> Yeah, yeah, yeah, so it's, you're absolutely right, it's a very crowded space. Everybody's using the same words and that makes it very hard for people to understand what's going on. And so, what we've found is very simple is that typically we will actually, the first meeting we deal with a customer, within the first 10 minutes we'll demo the product. Because so many technologies are technologies, not, they're not products and so you have to figure out how to use the product. You've got to figure out how you would customize it for your certain use-case. 
And what we've found with our product is, by making it very, very simple, people start, the light goes on in a very short amount of time and so, we also do things on our website so that you can see, in a couple of minutes, or even less than that, little animations that sort of give you a sense of what it's about. But really, it's just "Hey, this is a product which is about...", there's this light bulb that goes on, it's great. And you figure this out over the course of working with different customers, right? But there's this light bulb that goes on for people that are so confused by all the things that are going on and if we can just sit down with them, show them the product for a few minutes, all of a sudden they're like "Wait a minute, I can use this", right? So you're frequently talking to buyers that are not the most technical parts of the organization initially, and so most of the technologies they look at are technologies that are very difficult to understand and they have to look to others to try to even understand how it would fit into their architecture. With Dremio, we have customers that have installed it and gotten up, and within an hour or two, started to see real value. And that sort of excitement happens even in the demo, with most people. >> So you kind of have this bifurcated market. Since the big data meme, everybody says they're data-driven and you've got a bifurcated market in that, you've got the companies that are data-driven and you've got companies who say they're data-driven but really aren't. Who are your customers? Are they in both? Are they predominantly in the data-driven side? Are they predominantly in the trying to be data-driven? >> Well, I would say that they all would say that they're data-driven. >> Yeah, everyone, who's going to say "Well, we're not data-driven." >> Yeah, yeah, yeah. So I would say >> We're dead. 
>> I would say that everybody has data and they've got some ways that they're using it well and other places where they feel like they're not using it as well as they should. And so, I mean, the reason that we exist is to make it so it's easier for people to get value out of data, and so, if they were getting all the value they think they could get out of data, then we probably wouldn't exist and they would be fully data-driven. So I think that everybody, it's a journey and people are responding well to us, in part, because we're helping them down that journey. >> Well, the reason I asked that question is that we go to a lot of shows and everybody likes to throw out the digital transformation buzzword and then use Uber and Airbnb as an example, but if you dig deeper, you see that data is at the core of those companies and they're now beginning to apply machine intelligence and they're leveraging all this data that they've built up, this data architecture that they built up over the last five or 10 years. And then you've got this set of companies where all the data lives in silos and I can see you guys being able to help them. At the same time, I can see you helping the disruptors, so how do you see that? I mean, in terms of your role, in terms of affecting either digital transformations or digital disruptions. >> Well, I'd say that in either case, so we believe in a very sort of simple thing, which is that, so going back to what I said at the beginning, which is just that I see this regression in terms of data access, right? And so what happens is that, if you have a tightly-coupled system between two layers, then it becomes very difficult for people to sort of accommodate two different sets of needs. And so, the change over the last 10 years was the rise of the developer as the primary person for controlling data and that brought a huge amount of great things to it but analysis was not one of them. 
And there's tools that try to make that better but that's really the problem. And so our belief is very simple, which is that a new tier needs to be introduced between the consumers and the, and the producers of data. And that, and so that tier may interact with different systems, it may be more complex or whatever, for certain organizations, but the tier is necessary in all organizations because the analysts shouldn't be shaken around every time the developers change how they're doing data. >> Great. John Furrier has a saying that "Data is the new development kit", you know. He said that, I don't know, eight years ago and it's really kind of turned out to be the case. Jacques Nadeau, thanks very much for coming on theCUBE. Really appreciate your time. >> Yeah. >> Great to meet you. Good luck and keep us informed, please. >> Yes, thanks so much for your time, I've enjoyed it. >> You're welcome. Alright, thanks for watching everybody. This is theCUBE. We're live from Big Data SV. We'll be right back. (bright music)
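
Editor's illustration: the idea discussed in this interview, that consumers work with a logical view while the physical data stays where it is and is not copied, can be sketched as follows. This is a hedged toy and not Dremio's design or API. LogicalView, where, select, and rows are hypothetical names, and the "physical source" is just a callable returning rows as dictionaries.

```python
def _apply(step, rows):
    """Apply one recorded step (a filter or a column projection) lazily."""
    kind, arg = step
    if kind == "filter":
        return (row for row in rows if arg(row))
    return ({col: row[col] for col in arg} for row in rows)


class LogicalView:
    """A named view over a source; it stores steps, never a copy of the data."""

    def __init__(self, name, source, steps=()):
        self.name = name
        self._source = source  # zero-argument callable returning dict rows
        self._steps = tuple(steps)

    def where(self, predicate, name=None):
        """Derive a new view that keeps only rows matching the predicate."""
        step = ("filter", predicate)
        return LogicalView(name or self.name, self._source, self._steps + (step,))

    def select(self, *columns, name=None):
        """Derive a new view that exposes only the named columns."""
        step = ("project", columns)
        return LogicalView(name or self.name, self._source, self._steps + (step,))

    def rows(self):
        """Read the source now and run the recorded steps over it."""
        rows = iter(self._source())
        for step in self._steps:
            rows = _apply(step, rows)
        return list(rows)
```

Because a derived view only records its steps, a change in the underlying source shows up the next time rows() is called, and no intermediate copy has to be managed or kept in sync.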

Published Date : Mar 9 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Jacques Nadeau | PERSON | 0.99+
Daimler | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Norwest | ORGANIZATION | 0.99+
Intel | ORGANIZATION | 0.99+
Wikibon | ORGANIZATION | 0.99+
TransUnion | ORGANIZATION | 0.99+
Jacque | PERSON | 0.99+
San Jose | LOCATION | 0.99+
OVH | ORGANIZATION | 0.99+
Lightspeed | ORGANIZATION | 0.99+
second step | QUANTITY | 0.99+
Uber | ORGANIZATION | 0.99+
two layers | QUANTITY | 0.99+
Airbnb | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Google Docs | TITLE | 0.99+
Red Point | ORGANIZATION | 0.99+
Strata | ORGANIZATION | 0.99+
60 | QUANTITY | 0.98+
900 million dollars | QUANTITY | 0.98+
three years ago | DATE | 0.98+
eight years ago | DATE | 0.98+
two | QUANTITY | 0.98+
Dremio | PERSON | 0.98+
first 10 minutes | QUANTITY | 0.98+
last night | DATE | 0.98+
about 15 million | QUANTITY | 0.97+
theCUBE | ORGANIZATION | 0.97+
first time | QUANTITY | 0.97+
Dremio | ORGANIZATION | 0.97+
Big Data SV | ORGANIZATION | 0.96+
an hour | QUANTITY | 0.96+
two great initial investors | QUANTITY | 0.95+
today | DATE | 0.93+
first meeting | QUANTITY | 0.93+
this morning | DATE | 0.92+
two different sets | QUANTITY | 0.9+
third | QUANTITY | 0.88+
Big Data | ORGANIZATION | 0.87+
SQL | TITLE | 0.87+
10 years | QUANTITY | 0.87+
CUBE | ORGANIZATION | 0.87+
years ago | DATE | 0.86+
Silicon Valley | LOCATION | 0.86+
January of this year | DATE | 0.84+
Dremio | TITLE | 0.84+
Three year old | QUANTITY | 0.81+
last 10 years | DATE | 0.8+
Sentry | ORGANIZATION | 0.77+
one of them | QUANTITY | 0.75+
about 50 | QUANTITY | 0.75+
day two | QUANTITY | 0.74+
Ranger | ORGANIZATION | 0.74+
SV | EVENT | 0.7+
last ten years | DATE | 0.68+
eighth big | QUANTITY | 0.68+
Data | ORGANIZATION | 0.66+
Big | EVENT | 0.65+
couple of minutes | QUANTITY | 0.61+
CTO | PERSON | 0.56+
one | QUANTITY | 0.55+
last | DATE | 0.52+
100 companies | QUANTITY | 0.52+
under | DATE | 0.51+
five | QUANTITY | 0.5+
2018 | DATE | 0.5+
Hive | TITLE | 0.42+

Steve Wilkes, Striim | Big Data SV 2018


 

>> Narrator: Live from San Jose it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to San Jose everybody, this is theCUBE, the leader in live tech coverage and you're watching Big Data SV, my name is Dave Vellante. In the early days of Hadoop everything was batch oriented. About four or five years ago the market really started to focus on real time and streaming analytics to try to really help companies affect outcomes while things were still in motion. Steve Wilkes is here, he's the co-founder and CTO of a company called Striim, a firm that's been in this business for around six years. Steve welcome to theCUBE, good to see you. Thanks for coming on. >> Thanks Dave it's a pleasure to be here. >> So tell us more about that, you started about six years ago, a little bit before the market really started talking about real time and streaming. So what led you to that conclusion that you should co-found Striim way ahead of its time? >> It's partly our heritage. So the four of us that founded Striim, we were executives at GoldenGate Software. In fact our CEO Ali Kutay was the CEO of GoldenGate Software. So when we were acquired by Oracle in 2009, after having to work for Oracle for a couple years, we were trying to work out what to do next. And GoldenGate was replication software right? So it's moving data from one place to another. But customers would ask us in customer advisory boards, that data seems valuable, it's moving. Can you look at it while it's moving and analyze it while it's moving, get value out of that moving data? And so that was kind of set in our heads. And then we were thinking about what to do next, that was kind of the genesis of the idea. 
So the concept around Striim when we first started the company was we can't just give people streaming data, we need to give them the ability to process that data, analyze it, visualize it, play with it and really truly understand the data. As well as being able to collect it and move it somewhere else. And so the goal from day one was always to build a full end-to-end platform that did everything customers needed to do for streaming integration and analytics out of the box. And that's what we've done after six years. >> I got to ask a really basic question, so you're talking about your experience at GoldenGate moving data from point A to point B and somebody said well why don't we put that to work. But is that change data or was it static data? Why couldn't I just analyze it in place? >> GoldenGate works on change data. >> Okay so that's why, there were changes going through. Why wait until it hits its target, let's do some work in real time and learn from that, get greater productivity. And now you guys have taken that to a new level. That new level being what? Modern tools, modern technologies? >> A platform built from the ground up to be inherently distributed, scalable, reliable with exactly-once processing guarantees. And to be a complete end-to-end platform. There's a recognition that the first part of being able to do streaming data integration or analytics is that you need to be able to collect the data right? And while change data capture from databases is the way to get data out of databases in a streaming fashion, you also have to deal with files and devices and message queues and anywhere else the data can reside. So you need a large number of different data collectors that all turn the enterprise data sources into streaming data. And similarly if you want to store data somewhere you need a large collection of target adapters that deliver to things. Not just on-premises but also in the cloud. 
So things like Amazon S3 or the cloud databases like Redshift and Google BigQuery. So the idea was really that we wanted to give customers everything they need and that everything they need isn't trivial. It's not just, well we take Apache Kafka and then we stuff things into it and then we take things out. Pretty often, for example, you need to be able to enrich data and that means you need to be able to join streaming data with additional context information, reference data. And that reference data may come from a database or from files or somewhere else. So you can't call out to the database and maintain the speeds of streaming data. We have customers that are doing hundreds of thousands of events per second. So you can't call out to a database for every event and ask for records to enrich it with. And you can't even do that with an external cache because it's just not fast enough. So we built in an in-memory data grid as part of our platform. So you can join streaming data with the context information in real time without slowing anything down. So when you're thinking about doing streaming integration, it's more than just moving data around. It's the ability to process it and get it in the right form, to be able to analyze it, to be able to do things like complex event processing on that data. And also to be able to visualize it and play with it is an essential part of the whole platform. >> So I wanted to ask you about end-to-end. I've seen a lot of products from larger, maybe legacy companies that will say it's end-to-end but what it really is, is cobbled-together pieces that they bought in and then, this is our end-to-end platform, but it's not unified. Or I've seen others, "Well we've got an end-to-end platform," oh really, can I see the visualization? "Well we don't have visualization, we use this third party for visualization." So convince me that you're end-to-end. >> So our platform when you start with it you go into a UI, you can start building data flows. 
Those data flows start from connectors, we have all the connectors that you need to get your enterprise data. We have wizards to help you build those. And so now you have a data stream. Now you want to start processing that, we have SQL-based processing so you can do everything from filtering, transformation, aggregation, enrichment of data. If you want to load reference data into memory you use a cache component to drag that in, configure that. You now have data in-memory you can join with your streams. If you want to now take the results of all that processing and write it somewhere, use one of our target connectors, drag that in so you've got a data flow that's getting bigger and bigger, doing more and more processing. So now you're writing some of that data out to Kafka, oh I'm going to also add in another target adapter to write some of it into Azure Blob Storage and some of it's going to Amazon Redshift. So now you have a much bigger data flow. But now you say okay well I also want to do some analytics on that. So you take the data stream, you build another data flow that is doing some aggregation over windows, maybe some complex event processing, and then you use that dashboard builder to build a dashboard to visualize all of that. And that's all in one product. So it literally is everything you need to get value immediately. And you're right, the big vendors they have multiple different products and they're very happy to sell you consulting to put them all together. Even if you're trying to build this from open source and you know, organizations try and do that, you need five or six major pieces of open source, a lot of supporting libraries, and a huge team of developers to just build a platform that you can start to build applications on. And most organizations aren't software platform companies, they're finance companies, oil and gas companies, healthcare companies. 
And they really want to focus on solving business problems and not on reinventing the wheel by building a software platform. So we can just go in there and say look; value immediately. And that really, really helps. >> So what are some of your favorite use cases, examples, maybe customer examples that you can share with me? >> So one of the great examples, one of my customers they have a lot of data in an HP NonStop system. And they needed to be able to get visibility into that immediately. And this was like order processing, supply chain, ERP data. And it would've taken a very large amount of time to do analytics directly on the HP NonStop. And finding resources to do that is hard as well. So they needed to get the data out and they needed to get it into the appropriate place. And they recognized that you use the right technology to ask the right question. So they wanted some of it in Hadoop so they could do some machine learning on that. They wanted some of it to go into Kafka so they could get real-time analytics. And they wanted some of it to go into HBase so they could query it immediately and use that for reference purposes. So they utilized us to do change data capture against the HP NonStop, deliver that data stream out immediately into Kafka and also push some of it into HDFS and some of it into HBase. So they immediately got value out of that, because then they could also build some real-time analytics on it. It would send out alerts if things were taking too long in their order processing system. And allowed them to get visibility directly into their process that they couldn't get before with much fewer resources and more modern technologies than they could have used before. So that's one example. >> Can I ask you a question about that? So you talked about Kafka, HBase, you talk about a lot of different open source projects. You've integrated those or you've got entries and exits into those? >> So we ship with Kafka as part of our product. 
It's an optional messaging bus. So, our platform has two different ways of moving data around. We have a high-speed, in-memory only message bus and that works at almost network speed and it's great for a lot of different use cases. And that is what backs our data streams. So when you build a data flow, you have streams in between each step, that is backed by an in-memory bus. Pretty often though, in use cases, you need to be able to potentially rewind data for recovery purposes or have different applications running at different speeds and that's where a persistent message bus like Kafka comes in but you don't want to use a persistent message bus for everything because it's doing I/O and it's slowing things down. So you typically use that at the beginning, at the sources, especially things like IoT where you can't rewind into them. Things like databases and files, you can rewind into them and replay and recover but IoT sources, you can't do that. So you would push that into a Kafka-backed stream and then subsequent processing is in-memory. So we have that as part of our product. We also have Elastic as part of our product for results storage. You can switch to other results storage but that's our default. And we have a few other key components that are part of our product but then on the periphery, we have adapters that integrate with a lot of the other things that you mentioned. So we have adapters to read and write HDFS, Hive, HBase, across Cloudera, Hortonworks, even MapR. So we have the MapR versions of the file system and MapR Streams and MapR-DB and then there's lots of other more proprietary connectors like CDC from Oracle, and SQL Server, and MySQL and MariaDB. And then database connectors for delivery to virtually any JDBC compliant database. >> I took you down a tangent before you had a chance. You were going to give us another example. We're pretty much out of time but if you can briefly share either that or the last word, I'll give it to you. 
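
The distinction Steve draws between an in-memory bus and a persistent, rewindable bus can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `InMemoryStream` and `ReplayableStream` are hypothetical names, they use plain Python containers, and they do not represent Striim's or Kafka's actual APIs. The point is the behavioral difference: once an in-memory event is consumed it is gone, whereas a log with offsets can be re-read from a checkpoint.

```python
from collections import deque


class InMemoryStream:
    """Hypothetical fast hand-off between steps; consumed events are gone."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        self._queue.append(event)

    def consume(self):
        # Return the next event, or None when nothing is waiting.
        return self._queue.popleft() if self._queue else None


class ReplayableStream:
    """Hypothetical append-only log; readers keep an offset and can rewind."""

    def __init__(self):
        self._log = []

    def publish(self, event):
        self._log.append(event)
        return len(self._log) - 1  # offset of the event just written

    def read_from(self, offset):
        # Replay everything from the offset onward; the log is not modified.
        return list(self._log[offset:])


# A source that cannot be re-queried (for example a sensor) goes into the log.
sensor_stream = ReplayableStream()
for reading in (21.5, 21.7, 22.0):
    sensor_stream.publish(reading)

# After a failure, processing restarts from the last checkpointed offset.
replayed = sensor_stream.read_from(1)

# Downstream steps hand events over in memory, with no way to re-read them.
handoff = InMemoryStream()
handoff.publish("event-a")
first = handoff.consume()
second = handoff.consume()  # None: the event was already consumed
```

The trade-off matches the one described in the interview: the replayable log costs storage and I/O, so in this sketch it would sit only at sources that cannot be rewound, with the cheaper in-memory hand-off used between later steps.
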
>> I think the last word would be that that is one example. We have lots and lots of other types of use cases that we do including things like: migrating data from on-premises to the cloud, being able to distribute log data, and being able to analyze that log data, being able to do in-memory analytics and get real-time insights immediately and send alerts. It's a very comprehensive platform but each one of those use cases is very easy to develop on its own and you can do them very quickly. And of course as the use case expands within a customer, they build more and more and so they end up using the same platform for lots of different use cases within the same account. >> And how large is the company? How many people? >> We are around 70 people right now. >> 70 people and you're looking for funding? What rounds are you in? Where are you at with funding and revenue and all that stuff? >> Well I'd have to defer to my CEO for those questions. >> All right, so you've been around for what, six years you said? >> Yeah, we've had a number of rounds of funding. We had initial seed funding then we had the investment by Summit Partners that carried us through for a while. Then subsequent investment from Intel Capital, Dell EMC, Atlantic Bridge. And that's where we are right now. >> Good, excellent. Steve, thanks so much for coming on theCUBE, really appreciate your time. >> Great, it's awesome. Thank you Dave. >> Great to meet you. All right, keep it right there everybody, we'll be back with our next guest. This is theCUBE. We're live from BigData SV in San Jose. We'll be right back. (techno music)
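
The enrichment step discussed earlier in the interview, joining each streaming event against reference data held in memory instead of calling out to a database per event, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the field names, and the use of a plain dictionary as the cache. It is not Striim's in-memory data grid, only the general idea of loading reference data once and doing a constant-time lookup per event.

```python
def load_reference_cache(rows, key):
    """Load reference rows once into a dict keyed for constant-time lookup."""
    return {row[key]: row for row in rows}


def enrich(events, cache, key):
    """Yield each event merged with its reference row, if one exists."""
    for event in events:
        context = cache.get(event[key], {})
        # Event fields win on a name clash, so the original values are kept.
        yield {**context, **event}


# Hypothetical reference data (would normally come from a database or file).
customers = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": "APAC"},
]

# Hypothetical stream of order events.
orders = [
    {"order_id": "A", "customer_id": 2, "amount": 40.0},
    {"order_id": "B", "customer_id": 9, "amount": 5.0},
]

cache = load_reference_cache(customers, "customer_id")
enriched = list(enrich(orders, cache, "customer_id"))
```

In this sketch an event with no matching reference row (order "B") simply passes through without the extra fields; a real pipeline would have to decide whether to drop, hold, or flag such events.
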

Published Date : Mar 9 2018

SUMMARY :

Brought to you by SiliconANGLE Media the market really started to focus So what led you to that conclusion So it's moving data from one place to another. I got to ask a really basic question, And now you guys have taken that to a new level. and that means you need to be able to So I wanted to ask you about end-to-end. So our platform when you start with it And they needed to be able to get visibility So you talked about Kafka, HBase, So when you build a data flow, you have streams We're pretty much out of time but if you can briefly to develop on their own and you can do them very quickly. And that's where we are right now. really appreciate your time. Thank you Dave. Great to meet you.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Steve Wilks | PERSON | 0.99+
Steve | PERSON | 0.99+
2009 | DATE | 0.99+
Steve Wilkes | PERSON | 0.99+
five | QUANTITY | 0.99+
Intel Capital | ORGANIZATION | 0.99+
GoldenGate Software | ORGANIZATION | 0.99+
Ali Kutay | PERSON | 0.99+
Oracle | ORGANIZATION | 0.99+
hundreds | QUANTITY | 0.99+
GoldenGate | ORGANIZATION | 0.99+
Kafka | TITLE | 0.99+
San Jose | LOCATION | 0.99+
Stream | ORGANIZATION | 0.99+
MySQL | TITLE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Atlantic Bridge | ORGANIZATION | 0.99+
six years | QUANTITY | 0.99+
Steam | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
MapR | TITLE | 0.99+
HP | ORGANIZATION | 0.99+
four | QUANTITY | 0.99+
70 People | QUANTITY | 0.99+
Dell EMC | ORGANIZATION | 0.99+
MariaDB | TITLE | 0.99+
Striim | PERSON | 0.99+
SQL | TITLE | 0.99+
one | QUANTITY | 0.98+
each step | QUANTITY | 0.98+
Summit Partners | ORGANIZATION | 0.98+
two different ways | QUANTITY | 0.97+
first part | QUANTITY | 0.97+
around six years | QUANTITY | 0.97+
around 70 people | QUANTITY | 0.96+
HBase | TITLE | 0.96+
one example | QUANTITY | 0.96+
theCUBE | ORGANIZATION | 0.95+
BigData SV | ORGANIZATION | 0.94+
Big Data | ORGANIZATION | 0.92+
Hadoop | TITLE | 0.92+
one product | QUANTITY | 0.92+
each one | QUANTITY | 0.91+
six major pieces | QUANTITY | 0.91+
About four | DATE | 0.91+
CVC | TITLE | 0.89+
first | QUANTITY | 0.89+
about six years ago | DATE | 0.88+
day one | QUANTITY | 0.88+
Elastic | TITLE | 0.87+
Silicon Valley | LOCATION | 0.87+
Windows | TITLE | 0.87+
five years ago | DATE | 0.86+
S3 | TITLE | 0.82+
JDBC | TITLE | 0.81+
Azure | TITLE | 0.8+
CEO | PERSON | 0.79+
one place | QUANTITY | 0.78+
Redshift | TITLE | 0.76+
Autumn | ORGANIZATION | 0.75+
second | QUANTITY | 0.74+
thousands | QUANTITY | 0.72+
Big Data SV 2018 | EVENT | 0.71+
couple years | QUANTITY | 0.71+
Google | ORGANIZATION | 0.69+

Praveen Kankariya, Impetus | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media, and its ecosystem partners. (electronica flourish) >> We're back at Big Data SV. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante. Praveen Kankariya is here. He's the CEO of a company called Impetus. The company's been around the Big Data space since before Hadoop, even. Praveen, thanks for coming back on theCUBE, good to see you. >> Thank you, Dave. >> So, as I said in the open, you've seen a lot. You kind of really got into the Big Data space in 2007, seen it blow through the Hadoop, you know, sort of batch world into the real time world, seen the data management headwinds. From your perspective, you know, what kind of problems are you solving today in the Big Data world? >> So I can go into the details of what we are doing, but at a high level, we are helping companies converge to a singular, enterprise-wide data model. 'Cause I think that is a crisis in the Fortune 500 today, and there'll be haves and have-nots. >> Dave: What do you mean a crisis? >> I routinely run into companies who do not have their data model stitched. So they know the same customer, they know me by five different handles, and they don't have it figured out, that I'm the same guy. So, that I think is a major problem. So I think the C-suite is, they would not like to hear this, but they are flying partially blind. >> I have a theory on this, but I want to hear yours-- >> Sure. >> Why is that such a big problem? >> So, the most efficient business in the world is a one-man business, because everything is flowing in the same brain. The moment you hire your first employee, you start having communication breakdowns. And now these companies have hundreds and thousands of employees. Hundreds of thousands of employees. There's a lot of breakdown. There are airlines that, when I'm upgraded to first class, are offering me an economy-plus seat when I go to check in. 
That's ... they're turning me off, and they're losing an opportunity to, real opportunity to upsell something else to me. So. >> Okay, well, so let's bring this into the world of digital transformation. Everybody talks about those buzzwords, so let's try to put some sort of meat on that bone. If you look at the top five companies by market cap, Amazon, Apple, Facebook, Google. I'm missing somebody. Anyway, they're big. 500 billion, 700 billion dollars. They're all sort of what we would call data-driven. What does that mean? Data is at the core of their enterprise. A lot of the companies you're talking about, human expertise is the core of their enterprise, and they've got data that's sort of in silos, surrounding it. >> Praveen: Yes, yes. >> Is that an accurate description? >> That's-- >> And how can you help close that gap? >> So they have data in silos, and even that data in silos is not being used at velocity, with velocity. That data is, you know, it's taking much longer for them to even clean up that data, get access to that data, derive insights from that data. >> Dave: Right. >> So there's a lot of sluggishness, overall. >> Dave: So how do you help? >> How do we help? Great question. We help in many different ways. So we actually, so my company provides solutions. So we have some, a few products of our own, and then we work with all kinds of product companies. But we're about solving a problem, so with the customers we engage with, we actually solve a problem, so that there's a business outcome before we walk out. That's the big difference. We're not here to just sell the next sexy platform, or this or that, you know. We're not just here to excite the developers. >> So, maybe you could give me some of your favorite examples of where you've helped some of your clients. >> So there's one fairly large company, it's a household name around the world. And we have helped them create a single source of truth using a Big Data infrastructure. 
This has about six and a half thousand feeds of data coming in, continuously. Some continuously, some every few minutes, every few hours, whatnot. But then all their data is stitched together, and it's got guardrails, there's full governance. So, and now this platform is available to every business unit, to run their own applications. There's a set of APIs, they go in and develop their own applications. So shadow IT is being promoted in this environment. It's not being looked down upon. >> So it's not sitting in one box, presumably, it's distributed throughout the organization? >> It is distributed. And you know, there are some, you know, as long as you stay within the governance structure, you can derive, you know, somebody wants a graph database, they can derive a graph database from this massive, fully-connected data set, which is an enterprise-wide data set. >> Don't you see some of the challenges as cultural, as well? There are some industries that might say, or some executives that say, "Well, you know my industry, healthcare is an example, really hasn't been disrupted. We're maybe insulated from that." I feel as though that's somewhat risky thinking, and it's easy to maybe sit back and say, "Well, I'm going to wait, see what happens." What are your thoughts on that? >> Look at the data. The week Jeff Bezos announced that he is tying up with JPMC and Warren Buffett, some of the largest healthcare companies, and I'm talking of Fortune 10 companies, they lost about 20% of their market cap that week. So, you don't have to listen to me. Listen to the markets. >> Well, that's true. We see what happens in grocery, see what happens in... We haven't really seen, as I say, the disruption in healthcare, financial services, but it's all data, and that changes the equation. So why, let's see, not why. How when, if you get to this, so it sounds like step one is to get that sort of single data model across the organization, but there's other steps. 
You got to figure out how to monetize the data, not necessarily by selling it, but how data contributes to the monetization of the company. You got to make it accessible, you got to make it of high quality, you've got to get the right skill sets. So there's a lot to it, and more than just the technology. Maybe you could talk about that. >> So the way, I would like to preach, if I'm allowed to-- >> Dave: Please, it's theCUBE... (laughs) >> No, no, I mean, I don't mean here, but if any CEO was listening to me, what I would like to tell them is, just create a vision of your ultimate connected data model. And then start looking at how do you converge on that vision. It may not happen in one day, one week, one year. It's going to take time, and you know, every business is in flight, so they have to operate continuously, but they have to keep gravitating. And the biggest casualty is going to be their customer relationship if they don't do this. Because most companies don't know their customers fully. I mean, that little example of the airline which was showing me, flashing an ad for economy seats, premium economy seats when I'm already in first class, they don't know me. Some part of that company doesn't know me. So they're not able to service me well. Here now they lost an opportunity to monetize, but I think from another perspective, they lost an opportunity to really offer me something which would've made my flight way more comfortable. >> Well. >> So. >> Then you wonder if that's the dynamic that you encountered, what's the speed to market, the agility of that organization? They're hampered in their ability to, whether it's roll out new apps, identify new data sources, create new products for the customers. Have you seen, what kind of impacts have you seen within your customers? You gave the example before, of that sort of single data model, the single version of the truth. What business impacts have you been able to effect for your customers? 
>> So, there, I mean I can go on giving you anecdotes from my observations, my front row observations into these companies. >> Yeah, it'd be good to have some kind of proof points, right? Our audience would love to hear that. >> So, you know there's a company not too far from here. They've stitched every click stream, right to product usage data. To support data, to every marketing email opened. And they can tell who's buying, what happened, what is their support experience, who's upgrading, who's upgrading faster because they had a positive support experience, or not. So everything is tied. Any direction you want to look into your customer space, you can go and get visibility from every perspective you can think of. That's customer 360. We worked with a credit card company where they had a massive rules engine, which had been developed over generations to report fraud, to catch fraud, while a transaction's being processed. We actually, once they got all their data together, we could apply a massive machine learning engine. And we started learning from customers' own behavior, so we completely discarded the rules engine, and now we have a learning system which is flagging fraudulent transactions. So they managed to cut down their false positives tremendously, and in turn reduced inconvenience. It used to be embarrassing for me to give out a card and get it declined in front of a customer. >> So, as I said at the top, you've seen sort of the evolution of this whole Big Data meme before it was called Big Data. What are the things that may be exciting you? We seem to be entering a new era we call digital. There's a cognitive era, AI, machine intelligence. What do you see that's exciting, and real? >> So number one, so I like to divide this space into two parts, the whole space of data analytics. There's the data plumbing, which we call data management, and whatnot. I have to plumb all my data together. Only then I can feed this data into my AI models. 
Now I can do it in my silos today, but for me to do it at a global level for my entire corporation, I need it all stitched together. And then, of course, these models are very real. My son, my 22-year old son is using TensorFlow for some little startup that he's cooking. And it took him just a month to pick it up and start applying it. So why can't our large companies do so? And in turn, bring down the cost of services, cost of products, the velocity of delivering those things to us, and make life better. >> So, the barriers to technology deployment are getting lower. >> And this is all feasible, Dave, right now. >> Yeah. >> You know, I mean, this is all, this was a dream 10 years ago. If somebody had said, you know, for an old corporation to stitch all its data, "What're you talking about? It's not going to happen." But now, this is possible, and it's feasible. It's not going to require, make a massive hole in their budgets. >> But don't you think it's also table stakes to compete over the next 10 years? >> It is, it is table stakes. It's actually kind of late, from my perspective. If I had to go invest in the market, I mean, I would invest in companies who have their data act together. >> Yeah, yeah. So, what's the, how do you tell, when a company has its data act together? When you walk into a prospect, how do you know, what do you see, what're the characteristics of somebody who has that act together? >> It's hard for me to give you a few characteristics, but you know, you can tell what is the mandate they're operating under, if there are clear mandates. Because, for most companies, this is lost because of turf battle. This whole battle is lost due to turf issues. And the moment you see senior executives working together, with a massive willingness to bring everything together. You know, they'll have different turfs, and they're willing to contribute data, and bring it together. 
That's a phenomenally positive sign, because once that happens, then every large company has the wherewithal to go hire 50 data scientists, or work with all kinds of companies, including mine, to get data science help. >> Yeah, it comes back to the culture, doesn't it? >> Yes, absolutely. >> All right, Praveen, we have to leave it right there. Thanks very much for coming back in theCUBE. >> Thank you Dave, thank you. Thank you for the opportunity. >> You're very welcome. All right, keep it right there, everybody. This is theCUBE. We're live from the Forager in San Jose, Big Data SV. We'll be right back. (electronica flourish)
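
The "five different handles" problem Praveen raises early in the interview, where one customer appears under several unconnected identities, is at its simplest a grouping problem: records that share any identifier belong to the same person. The sketch below is one common way to approach it (a union-find over shared identifier values), not a description of what Impetus builds; the record layout, the `handle`, `email` and `phone` field names, and the exact-match rule are all assumptions for illustration. Real identity resolution typically also needs fuzzy matching and governance rules.

```python
def stitch_identities(records):
    """Group record handles that share any identifier value (exact match)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (field, value) -> index of the first record carrying it
    for index, record in enumerate(records):
        for field, value in record.items():
            if field == "handle" or value is None:
                continue
            owner = first_seen.setdefault((field, value), index)
            parent[find(index)] = find(owner)  # merge the two groups

    groups = {}
    for index, record in enumerate(records):
        groups.setdefault(find(index), []).append(record["handle"])
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records from separate systems describing the same person.
records = [
    {"handle": "web-123", "email": "p@example.com", "phone": None},
    {"handle": "loyalty-9", "email": "p@example.com", "phone": "555-0100"},
    {"handle": "support-77", "email": None, "phone": "555-0100"},
    {"handle": "web-456", "email": "q@example.com", "phone": None},
]

stitched = stitch_identities(records)
```

Note that `web-123` and `support-77` share no identifier directly; they end up in the same group only because `loyalty-9` links them, which is why a transitive grouping structure is used here rather than pairwise comparison.
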

Published Date : Mar 9 2018

SUMMARY :

Brought to you by SiliconANGLE Media, Praveen, thanks for back in theCUBE, good to see you. You kind of really got into the Big Data space in 2007, So I can go into the details of what we are doing, that I'm the same guy. because everything is flowing in the same brain. Data is at the core of their enterprise. That data is, you know, it's taking much longer for them We're not here to just sell the next sexy platform, So, maybe you could give me to every business unit, And you know, there're are some, you know, and it's easy to maybe sit back say, So, you don't have to listen to me. So there's a lot to it, and more than just the technology. Dave: Please, it's theCUBE... It's going to take time, and you know, if that's the dynamic that you encountered, So, there, I mean I can go on giving you anecdotes Yeah, it'd be good to have So they managed to cut down We seem to be entering a new era we call digital. So number one, so I like to divide this space So, the barriers to technology deployment It's not going to require, If I had to go invest in the market, So, what's the, how do you tell, It's hard for me to give you a few characteristics, All right, Praveen, we have to leave it right there. Thank you for the opportunity. We're live from the Forager in San Jose, Big Data SV.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Amazon | ORGANIZATION | 0.99+
Apple | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
Dave | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Facebook | ORGANIZATION | 0.99+
Jeff Bezos | PERSON | 0.99+
2007 | DATE | 0.99+
Praveen Kankariya | PERSON | 0.99+
JPMC | ORGANIZATION | 0.99+
one week | QUANTITY | 0.99+
Praveen | PERSON | 0.99+
Impetus | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
one box | QUANTITY | 0.99+
one year | QUANTITY | 0.99+
two parts | QUANTITY | 0.99+
one day | QUANTITY | 0.99+
50 data scientists | QUANTITY | 0.99+
first employee | QUANTITY | 0.99+
San Jose | LOCATION | 0.99+
five different handles | QUANTITY | 0.98+
10 years ago | DATE | 0.98+
Big Data SV | ORGANIZATION | 0.98+
today | DATE | 0.98+
700 billion dollars | QUANTITY | 0.98+
about 20% | QUANTITY | 0.97+
about six and a half thousand feeds | QUANTITY | 0.97+
Big Data | ORGANIZATION | 0.96+
single | QUANTITY | 0.96+
five companies | QUANTITY | 0.96+
Impetus | PERSON | 0.96+
one-man | QUANTITY | 0.96+
theCUBE | ORGANIZATION | 0.95+
one | QUANTITY | 0.95+
22-year old | QUANTITY | 0.94+
step one | QUANTITY | 0.94+
single source | QUANTITY | 0.93+
single version | QUANTITY | 0.92+
2018 | DATE | 0.91+
next 10 years | DATE | 0.87+
first class | QUANTITY | 0.86+
hundreds and | QUANTITY | 0.86+
Hundreds of thousands of employees | QUANTITY | 0.85+
Silicon Valley | LOCATION | 0.85+
Buffet | PERSON | 0.84+
a month | QUANTITY | 0.83+
Fortune | ORGANIZATION | 0.82+
500 billion | QUANTITY | 0.81+
10 companies | QUANTITY | 0.76+
Hadoop | TITLE | 0.69+
hours | QUANTITY | 0.69+
employees | QUANTITY | 0.68+
Hadoop | LOCATION | 0.68+
week | DATE | 0.68+
360 | QUANTITY | 0.6+
Forager | ORGANIZATION | 0.56+
Fortune 500 | ORGANIZATION | 0.56+
Warren | ORGANIZATION | 0.54+
thousands | QUANTITY | 0.53+
TensorFlow | TITLE | 0.51+
minutes | QUANTITY | 0.5+
every | QUANTITY | 0.49+
SV | EVENT | 0.48+

David Abercrombie, Sharethrough & Michael Nixon, Snowflake | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data, Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hi, I'm George Gilbert, and we are broadcasting from the Strata Data Conference, we're right around the corner at the Forager Tasting Room & Eatery. We have this wonderful location here, and we are very lucky to have with us Michael Nixon, from Snowflake, which is a leading cloud data warehouse. And David Abercrombie from Sharethrough which is a leading ad tech company. And between the two of them, they're going to tell us some of the most advanced use cases we have now for cloud-native data warehousing. Michael, why don't you start with giving us some context for how on a cloud platform one might rethink a data warehouse? >> Yeah, thank you. That's a great question because let me first answer it from the end-user, business value perspective, when you run a workload on a cloud, there's a certain level of expectation you want out of the cloud. You want scalability, you want unlimited scalability, you want to be able to support all your users, you want to be able to support the data types, whatever they may be that come into your organization. So, there's a level of expectation that one should expect from a service point of view once you're in a cloud. So, a lot of the technologies that were built up to this point have been optimized for on-premises types of data warehousing where perhaps that level of service, concurrency and unlimited scalability was not really expected but, guess what? Once it comes to the cloud, it's expected. So those on-premises technologies aren't suitable in the cloud, so for enterprises and, I mean, companies, organizations of all types from finance, banking, manufacturing, ad tech as we'll have today, they want that level of service in the cloud. And so, those technologies will not work, and so it requires a rethinking of how those architectures are built. 
And it requires being built for the cloud. >> And just to, alright, to break this down and be really concrete, some of the rethinking. We separate compute from storage, which is a familiar pattern that we've learned in the cloud but we also then have to have this sort of independent elasticity between-- >> Yes. >> Storage and the compute, and then Snowflake's taken it even a step further where you can spin out multiple compute clusters. >> Right. >> Tell us how that works and why that's so difficult and unique. >> Yeah, you know, that's taking us under the covers a little bit, but what makes our infrastructure unique is that we have a three-layer architecture. We separate, just as you said, storage from the compute layer, from the services layer. And that's really important because as I mentioned before, you want unlimited capacity, unlimited resources. So, if you scale compute in today's world of on-premises MPP, what that really means is that you have to bring the storage along with the compute because compute is tied to the storage so when you scale the storage along with the compute, usually that involves a lot of burden on the data warehouse manager because now they have to redistribute the data and that means redistributing keys, managing keys if you will. And that's a burden, and by the reverse, if all you wanted to do was increase storage but not the compute, because compute was tied to storage, why, you have to buy these additional compute nodes, and that might add to the cost when, in fact, all you really wanted to pay for was additional storage. So, by separating those, you keep them independent, and so you can scale storage apart from compute and then, once you have your compute resources in place, the virtual warehouses that you're talking about that have completed the job, you spun them up, it's done its job, and you take it down, guess what? 
You can release those resources, and of course, in releasing those resources, basically you can cut your cost as well because, for us, it's pure usage-based pricing. You only pay for what you use, and that's really fantastic. >> Very different from the on-prem model where, as you were saying, compute and storage are tied together, so. >> Yeah, let's think about what that means architecturally, right? So if you have an on-premises data warehouse, and you want to scale your capacity, chances are you'll have to have that hardware in place already. And having that hardware in place already means you're paying that expense, and so you may pay for that expense six months prior to needing it. Let's take a retailer example. >> Yeah. >> You're gearing up for a peak season, which might be Christmas, and so you put that hardware in place sometime in June. You'll always put it in in advance because, why? You have to bring up the environment, so you have to allow time for implementation or, if you will, deployment to make sure everything is operational. >> Okay. >> And then what happens is when that peak period comes, you can't expand in that capacity. But what happens once that peak period is over? You paid for that hardware, but you don't really need it. So, our vision is, or the vision we believe you should have when you move workloads to the cloud is, you pay for those when you need them. >> Okay, so now, David, help us understand, first, what was the business problem you were trying to solve? And why was Snowflake, you know, sort of uniquely suited for that? >> Well, let me talk a little bit about Sharethrough. We're ad tech, at the core of our business we run an ad exchange, where we're doing programmatic trading with the bids, with the real-time bidding spec. The data is very high in volume, with 12 billion impressions a month, that's a lot of bids that we have to process, a lot of bid requests. 
The way it operates, the bids and the bid responses in programmatic trading are encoded in JSONs, so our ad exchange is basically exchanging messages in JSON with our business partners. And the JSONs are very complicated, there's a lot of richness and detail, such that the advertisers can decide whether or not they want to bid. Well, this data is very complicated, very high-volume. And in advertising, like any business, we really need to have good analytics to understand how our business is operating, how our publishers are doing, how our advertisers are doing. And it all depends upon this very high-volume, very complex JSON event data stream. So, Snowflake was able to ingest our high-volume data very gracefully. The JSON parsing techniques of Snowflake allow me to expose the complicated data structure in a way that's very transparent and usable to our analysts. Our use of Snowflake has replaced clunkier tools where the analysts basically had to be programmers, writing programs in Scala or something to do an analysis. And now, because we've transparently and easily exposed the complicated structures within Snowflake in a relational database, they can use good old-fashioned SQL to run their queries. Literally, an afternoon's analysis is now a five-minute query. >> So, let me, as I'm listening to you describe this. We've had various vendors telling us about these workflows in the sort of data prep and data science tool chain. It almost sounds to me like Snowflake is taking semi-structured or complex data and it's sort of unraveling it and, normalizing is kind of an overloaded term, but it's making it business-ready, so you don't need as much of that manual data prep. >> Yeah, exactly, you don't need as much manual data prep, or you don't need as much expertise. 
For instance, Snowflake's JSON capabilities, in terms of drilling down the JSON tree with dot path notation, or expanding nested objects, are very expressive, very powerful, but still your typical analyst or your BI tool certainly wouldn't know how to do that. So, in Snowflake, we sort of have our cake and eat it too. We can have our JSONs with their full richness in our database, but yet we can simplify and expose the data elements that are needed for analysis, so that an analyst, their first day on the job, they can get right to work and start writing queries. >> So let me ask you about, a little more about the programmatic ad use case. So if you have billions of impressions per month, I'm guessing that means you have quite a few times more, in terms of bids, and then there's the, you know once you have, I guess a successful one, you want to track what happens. >> Correct. >> So tell us a little more about that, what that workload looks like, in terms of, what analytics you're trying to perform, what you're tracking? >> Yeah, well, you're right. There's different steps in our funnel. The impression request expands out by a factor of a dozen as we send it to all the different potential bidders. We track all that data, the responses come back, we track that, we track our decisions and why we selected the bidder. And then, once the ad is shown, of course there's various beacons and tracking things that fire. We'd have to track all of that data, and the only way we could make sense out of our business is by bringing all that data together. And in a way that is reliable, transparent, and visible, and also has data integrity. That's another thing I like about the Snowflake database, that it's a good old-fashioned SQL database where I can declare my primary keys, I can run QC checks, I can ensure the high data integrity that is demanded by BI and other sorts of analytics. 
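The idea of drilling into a JSON tree with a dot path and expanding nested objects into flat rows can be sketched in a few lines of Python. This is only an illustration of the concept, not Snowflake's implementation or Sharethrough's schema: the event layout, the field names, and the helpers `get_path` and `flatten_bids` are all hypothetical.

```python
import json


def get_path(doc, path):
    """Follow a dot-separated path such as 'request.device.geo.country'
    through nested dicts; return None if any step is missing."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def flatten_bids(event):
    """Expand the nested list of bids into one flat row per bid,
    copying a few request-level fields onto every row."""
    rows = []
    for bid in event.get("bids", []):
        rows.append({
            "request_id": get_path(event, "request.id"),
            "country": get_path(event, "request.device.geo.country"),
            "bidder": bid.get("bidder"),
            "price": bid.get("price"),
        })
    return rows


# Hypothetical bid event, invented purely for illustration.
raw = """
{"request": {"id": "r-1", "device": {"geo": {"country": "US"}}},
 "bids": [{"bidder": "dsp-a", "price": 1.25},
          {"bidder": "dsp-b", "price": 0.90}]}
"""
event = json.loads(raw)
rows = flatten_bids(event)
for row in rows:
    print(row)
```

Once events are flattened into rows like these, an analyst can work with ordinary columns instead of navigating the nested structure by hand.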
>> What would be, as you continue to push the boundaries of the ad tech service, what's some functionality that you're looking to add, with Snowflake as your partner, either that's in there now that you still need to take advantage of or things that you're looking to in the future? >> Well, moving forward, of course, it's very important for us to be able to quickly gauge the effectiveness of new products. The ad tech market is fast-changing, there's always new ways of bidding, new products that are being developed, new ways for the ad ecosystem to work. And so, as we roll those out, we need to be able to quickly analyze, you know, "Is this thing working or not?" You know, kind of an agile environment, pivot or prove it. Does this feature work or not? So, having all the data in one place makes possible that very quick assessment of the viability of a new feature, new product. >> And, dropping down a little under the covers for how that works, does that mean, like you still have the base JSON data that you've absorbed, but you're going to expose it with different schemas or access patterns? >> Yeah, indeed. For instance, we make use of the SQL schemas, roles, and permissions internally where we can have the different teams have their own domain of data that they can expose internally, and looking forward, there's the Sharehouse feature of Snowflake that we're looking to implement with our partners, where, rather than sending them data, like a daily dump of data, we can give them access to their data in our database through this top layer that Michael mentioned, the services layer, which essentially allows me to create a view and grant select on it to another customer. So I no longer have to send daily data dumps to partners or have some sort of API for getting data. They can simply query the data themselves so we'll be implementing that feature with our major partners. 
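The "create a view, then let the partner query it" pattern can be illustrated with Python's built-in sqlite3 module. This is a sketch under loud assumptions: the table, column, and partner names are hypothetical, and SQLite has no roles or GRANT statement, so the access-control half of the pattern (granting select on the view to another account) is not shown and would be handled by the warehouse's own permission system.

```python
import sqlite3

# In-memory database standing in for the shared warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE impressions (partner TEXT, day TEXT, impressions INTEGER)"
)
conn.executemany(
    "INSERT INTO impressions VALUES (?, ?, ?)",
    [
        ("partner_a", "2018-03-01", 1200),
        ("partner_a", "2018-03-02", 1500),
        ("partner_b", "2018-03-01", 800),
    ],
)

# A view that exposes only one partner's rows. In a warehouse with roles,
# select on this view would then be granted to that partner.
conn.execute(
    """
    CREATE VIEW partner_a_impressions AS
    SELECT day, impressions
    FROM impressions
    WHERE partner = 'partner_a'
    """
)

# The partner queries the view directly instead of receiving a daily dump.
partner_rows = conn.execute(
    "SELECT day, impressions FROM partner_a_impressions ORDER BY day"
).fetchall()
print(partner_rows)
```

The point of the design is that the data stays in one place: the view defines what a partner may see, and the partner runs queries against it on demand.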
>> I would be remiss in not asking at a data conference like this, now that there's the tie-in with Qubole and Spark integration and machine learning, is there anything along that front that you're planning to exploit in the near future? >> Well, yeah, Sharethrough, we're very experimental, playful, we're always examining new data technologies and new ways of doing things, but now with Snowflake as sort of our data warehouse of curated data, I've got two petabytes of referential integrity data, and that is reliable. We can move forward into our other analyses and other uses of data knowing that we have captured every event exactly once, and we know exactly where it fits in a business context, in a relational manner. It's clean, good data integrity, reliable, accessible, visible, and it's just plain old SQL. (chuckles) >> That's actually a nice way to sum it up. We've got the integrity that we've come to expect and love from relational databases. We've got the flexibility of machine-oriented data, or JSON. But we don't have to give up the query engine, and then now you have more advanced features, analytic features that you can take advantage of coming down the pipe. >> Yeah, again we're a modern platform for the modern age, that's basically cloud-based computing. With a platform like Snowflake in the backend, you can now move those workloads that you're accustomed to to the cloud and have the environment that you're familiar with, and it saves you a lot of time and effort. You can focus on more strategic projects. >> Okay, well, with that, we're going to take a short break. This has been George Gilbert, we're with Michael Nixon of Snowflake, and David Abercrombie of Sharethrough listening to how the most modern ad tech companies are taking advantage of the most modern cloud data warehouses. And we'll be back after a short break here at the Strata Data Conference, thanks. (quirky music)

Published Date : Mar 9 2018


ENTITIES

Entity | Category | Confidence
David | PERSON | 0.99+
George Gilbert | PERSON | 0.99+
David Abercrombie | PERSON | 0.99+
Michael Nixon | PERSON | 0.99+
Michael | PERSON | 0.99+
June | DATE | 0.99+
two | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
Scala | TITLE | 0.99+
first | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
five-minute | QUANTITY | 0.99+
Snowflake | TITLE | 0.99+
Christmas | EVENT | 0.98+
Strata Data Conference | EVENT | 0.98+
three-layer | QUANTITY | 0.98+
first day | QUANTITY | 0.98+
a dozen | QUANTITY | 0.98+
two petabytes | QUANTITY | 0.97+
Sharethrough | ORGANIZATION | 0.97+
JSON | TITLE | 0.97+
SQL | TITLE | 0.96+
one place | QUANTITY | 0.95+
six months | QUANTITY | 0.94+
Forager Tasting Room & Eatery | ORGANIZATION | 0.91+
today | DATE | 0.89+
Snowflake | ORGANIZATION | 0.87+
Spark | TITLE | 0.87+
12 billion impressions a month | QUANTITY | 0.87+
Machine Learning | TITLE | 0.84+
Big Data | ORGANIZATION | 0.84+
billions of impressions | QUANTITY | 0.8+
CuBOL | TITLE | 0.79+
Big Data SV 2018 | EVENT | 0.77+
once | QUANTITY | 0.72+
theCUBE | ORGANIZATION | 0.63+
JSONs | TITLE | 0.61+
times | QUANTITY | 0.55+

Satyen Sangani, Alation | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to theCUBE, I'm Lisa Martin with John Furrier. We are covering our second day of our event Big Data SV. We've had some great conversations, John, yesterday, today as well. Really looking at Big Data, digital transformation, Big Data, plus data science, lots of opportunity. We're excited to welcome back to theCUBE an alumnus, Satyen Sangani, the co-founder and CEO of Alation. Welcome back! >> Thank you, it's wonderful to be here again. >> So you guys finished up your fiscal year at the end of December 2017, and we're in the first quarter of 2018. You guys had some really strong results, really strong momentum. >> Yeah. >> Tell us what's going on at Alation, how are you pulling this momentum through 2018? >> Well, I think we have had an enterprise-focused business historically, because we solve a very complicated problem for very big enterprises, and so, in the last quarter we added customers like American Express, PepsiCo, Roche. And with huge expansions from our existing customers, some of whom, over the course of a year, I think went 12x from an initial base. And so, we found some just incredible momentum in Q4 and for us that was a phenomenal cap to a great year. >> What about the platform you guys are doing? Can you just take a minute to explain what Alation does again just to refresh where you are on the product side? You mentioned some new accounts, some new use cases. >> Yeah. >> What's the update? Take a minute, talk about the update. >> Absolutely, so, you certainly know, John, but Alation's a data catalog, and a data catalog, essentially, you can think of it as Yelp or Amazon for data and information inside of the enterprise. 
So if you think about how many different databases there are, how many different reports there are, how many different BI tools there are, how many different APIs there are, how many different algorithms there are, it's pretty dizzying for the average analyst. It's pretty dizzying for the average CIO. It's pretty dizzying for the average chief data officer. And particularly, inside of Fortune 500s where you have hundreds of thousands of databases. You have a situation where people just have too much signal or too much noise, not enough signal. And so what we do is we provide this Yelp for that information. You can come to Alation as a catalog. You can do a search on revenue 2017. You'll get all of the reports, all of the dashboards, all of the tables, all of the people that you might need to be able to find. And that gives you a single place of reference, so you can understand what you've got and what can answer your questions. >> What's interesting is, first of all, I love data. We're data driven, we're geeks on data. But when I start talking to folks that are outside the geek community or nerd community, you say data and they go, "Oh," because they cringe and they say, "Facebook." They see the data issues there. GDPR, data nightmare, where's the store, you got to manage it. And then, people are actually using data, so they're realizing how hard (laughs) it is. >> Yeah. >> How much data do we have? So it's kind of like a trough of disillusionment, if you will. Now they got to get their hands on it. They've got to put it to work. >> Yeah. >> And they know that. So, it's now becoming really hard (laughs) in their mind. This is business people. >> Yeah. >> They have data everywhere. How do you guys talk to that customer? Because, if you don't have quality data, if you don't have data you can trust, if you don't have the right people, it's hard to get it going. >> Yeah. >> How do you guys solve that problem and how do you talk to customers? 
>> So we talk a lot about data literacy. There is a lot of data in this world and that data is just emblematic of all of the stuff that's going on in this world. There's lots of systems, there's lots of complexity and the data, basically, just is about that complexity. Whether it's weblogs, or sensors, or the like. And so, you can either run away from that data, and say, "Look, I'm going to not, I'm going to bury my head in the sand. I'm going to be a business. I'm just going to forget about that data stuff." And that's certainly a way to go. >> John: Yeah. >> It's a way to go away. >> Not a good outlook. >> I was going to say, is that a way of going out of business? >> Or, you can basically train, it's a human resources problem fundamentally. You've got to train your people to understand how to use data, to become data literate. And that's what our software is all about. That's what we're all about as a company. And so, we have a pretty high bar for what we think we do as a business and we're this far into that. Which is, we think we're training people to use data better. How do you learn to think scientifically? How do you go use data to make better decisions? How do you build a data driven culture? Those are the sorts of problems that I'm excited to work on. >> Alright, now take me through how you guys play out in an engagement with the customer. So okay, that's cool, you guys can come in, we're getting data literate, we understand we need to use data. Where are you guys winning? Where are you guys seeing some visibility, both in terms of the traction of the usage of the product, the use cases? Where is it kind of coming together for you guys? >> Yeah, so we literally, we have a mantra. I think any early stage company basically wins because they can focus on doing a couple of things really well. And for us, we basically do three things. We allow people to find data. We allow people to understand the data that they find. 
And we allow them to trust the data that they see. And so if I have a question, the first place I start is, typically, Google. I'll go there and I'll try to find whatever it is that I'm looking for. Maybe I'm looking for a Mediterranean restaurant on 1st Street in San Jose. If I'm going to go do that, I'm going to do that search and I'm going to find the thing that I'm looking for, and then I'm going to figure out, out of the possible options, which one do I want to go to. And then I'll figure out whether or not the one that has seven ratings is the one that I trust more than the one that has two. Well, data is no different. You're going to have to find the data sets. And inside of companies, there could be 20 different reports and there could be 20 different people who have information, and so you're going to trust those people through having context and understanding. >> So, trust, people, collaboration. You mentioned some big brands that you guys added towards the end of calendar 2017. How do you facilitate these conversations with maybe the chief data officer. As we know, in large enterprises, there's still a lot of ownership over data silos. >> Satyen: Yep. >> What is that conversation like, as you say on your website, "The first data catalog designed for collaboration"? How do you help these organizations as large as Coca-Cola understand where all the data are and enable the human resources to extract values, and find it, understand it, and trust it? >> Yeah, so we have a very simple hypothesis, which is, look, people fundamentally have questions. They're fundamentally curious. So, what you need to do as a chief data officer, as a chief information officer, is really figure out how to unlock that curiosity. Start with the most popular data sets. Start with the most popular systems. Start with the business people who have the most curiosity and the most demand for information. And oh, by the way, we can measure that. Which is the magical thing that we do. 
>> So we can come in and say, "Look, we look at the logs inside of your systems to know which people are using which data sets, which sources are most popular, which areas are hot." Just like a social network might do. And so, just like you can say, "Okay, these are the trending restaurants." We can say, "These are the trending data sets." And that curiosity allows people to know, what data should I document first? What data should I make available first? What data do I improve the data quality over first? What data do I govern first? And so, in a world where you've got tons of signal, tons of systems, it's totally dizzying to figure out where you should start. But what we do is, we go to these chief data officers and say, "Look, we can give you a tool and a catalyst so that you know where to go, what questions to answer, who to serve first." And you can use that to expand to other groups in the company. >> And this is interesting, a lot of people, you mentioned social networks, use data to optimize for something, and in the case of Facebook, they use my data to target ads for me. You're using data to actually say, "This is how people are using the data." So you're using data for data. (laughs) >> That's right. >> So you're saying-- >> Satyen: We're measuring how you can use data. >> And that's interesting because, I hear a lot of stories like, we bought a tool, we never used it. >> Yep. >> Or people didn't like the UI, just kind of falls on the side. You're looking at it and saying, "Let's get it out there and let's see who's using the data." And then, are you doubling down? What happens? Do I get a little star, do I get a reputation point, am I being flagged to HR as a power user? How are you guys treating that gamification in this way? It's interesting, I mean, what happens? Do I become like-- >> Yeah, so it's funny because, when you think about search, how do you figure out that something's good? 
So what Google did is, they came along and they said, "We've got PageRank." What we're going to do is we're going to say, "The pages that are the best pages are the ones that people link to most often." Well, we can do the same thing for data. The data sources that are the most useful ones are the ones that are used most often. Now on top of that, you can say, "We're going to have experts put ratings," which we do. And you can say people can contribute knowledge and reviews of how this data set can be used. And people can contribute queries and reports on top of those data sets. And all of that gives you this really rich graph, this rich social graph, so that now when I look at something it doesn't look like Greek. It looks like, "Oh, well I know Lisa used this data set, and then John used it and so at least it must answer some questions that are really intelligent about the media business or about the software business. And so that can be really useful for me if I have no clue as to what I'm looking at." >> So the problem that you-- >> It's on how you demystify it through the social connections. >> So the problem that you solve, if I hear you correctly, is that you make it easy to get the data. So there's some ease of use piece of it, >> Yep. >> cataloging. And then as you get people using it, this is where you take the data literacy and go into operationalizing data. >> Satyen: That's right. >> So this seems to be the challenge. So, if I'm a customer and I have a problem, the profile of your target customer or who your customers are, people who need to expand and operationalize data, how would you talk about it? >> Yeah, so it's really interesting. We talk about, one of our customers called us, sort of, the social network for nerds inside of an enterprise. And I think for me that's a compliment. (John laughing) But what I took from that, and when I explained the business of Alation, we start with those individuals who are data literate. 
The data scientists, the data engineers, the data stewards, the chief data officer. But those people have the knowledge and the context to then explain data to other people inside of that same institution. So in the same way that Facebook started with Harvard, and then went to the rest of the Ivies, and then went to the rest of the top 20 schools, and then ultimately to mom, and dad, and grandma, and grandpa. We're doing the exact same thing with data. We start with the folks that are data literate, we expand from there to a broader audience of people that don't necessarily have data in their titles, but have curiosity and questions. >> I like that on the curiosity side. You spent some time up at Strata Data. I'm curious, what are some of the things you're hearing from customers, maybe partners? Everyone used to talk about Hadoop, it was this big thing. And then there was a creation of data lakes, and swampiness, and all these things that are sort of becoming more complex in an organization. And with the rise of myriad data sources, the velocity, the volume, how do you help an enterprise understand and be able to catalog data from so many different sources? Is it that same principle that you just talked about in terms of, let's start with the lowest hanging fruit, start making the impact there and then grow it as we can? Or is it that an enterprise needs to be competitive and move really, really quickly? I guess, what's the process? >> How do you start? >> Right. >> What do people do? >> Yes! >> So it's interesting, what we find is multiple ways of starting with multiple different types of customers. And so, we have some customers that say, "Look, we've got a big, we've got Teradata, and we've got some Hadoop, and we've got some stuff on Amazon, and we want to connect it all." And those customers do get started, and they start with hundreds of users, in some cases, they start with thousands of users day one, and they just go Big Bang. 
And interestingly enough, we can get those customers enabled in a matter of weeks or months to go do that. We have other customers that say, "Look, we're going to start with a team of 10 people and we're going to see how it grows from there." And, we can accommodate either model or either approach. From our perspective, you just have to have the resources and the investment corresponding to what you're trying to do. If you're going to say, "Look, we're going to have two dollars of budget, and we're not going to have the human resources, and the stewardship resources behind it," it's going to be hard to do the Big Bang. But if you're going to put the appropriate resources up behind it, you can do a lot of good. >> So, you can really facilitate the whole go big or go home approach, as well as the let's start small, think fast approach. >> That's right, and we always, actually ironically, recommend the latter. >> Let's start small, think fast, yeah. >> Because everybody's got a bigger appetite than they do the ability to execute. And what's great about the tool, and what I tell our customers and our employees all day long is, there's only one metric I track. So year over year, for our business, we basically grow in accounts, net of churn, by 55%. Year over year, and that's actually up from the prior year. And so from my perspective-- >> And what does that mean? >> So what that means is, the same customer gave us 55 cents more on the dollar than they did the prior year. Now that's best in class for most software businesses that I've heard. But what matters to me is not so much that growth rate in and of itself. What it means to me is this, that nobody's come along and says, "I've mastered my data. I understand all of the information side of my company. Every person knows everything there is to know." That's never been said. 
So if we're solving a problem where customers are saying, "Look, we get, and we can find, and understand, and trust data, and we can do that better this year than we did last year, and we can do it even more with more people," we're going to be successful. >> What I like about what you're doing is, you're bringing an element of operationalizing data for literacy and for usage. But you're really bringing this notion of a humanizing element to it. Where you see it in security, you see it in emerging ecosystems. Where there's a community of data people who know how hard it is and was, and it seems to be getting easier. But the tsunami of new data coming in, IOT data, whatever, and new regulators like GDPR. These are all more surface area problems. But there's a community coming together. How have you guys seen your product create community? Have you seen any data on that, 'cause it sounds like, as people get networked together, the natural outcome of that is possibly usage you attract. But is there a community vibe that you're seeing? Is there an internal collaboration where they sit, they're having meet ups, they're having lunches. There's a social aspect and a human aspect. >> No, it's human, no, it's amazing. So in really subtle but really, really powerful ways. So one thing that we do for every single data source or every single report that we document, we just put who are the top users of this particular thing. So really subtly, day one, you're like, "I want to go find a report. I don't even know where to go inside of this really mysterious system." Post-Alation, you're able to say, "Well, I don't know where to go, but at least I can go call up John or Lisa," and say, "Hey, what is it that we know about this particular thing?" And I didn't have to know them. I just had to know that they had this report and they had this intelligence. So by just discovering people and who they are, you pick up on what people can know. 
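The usage-based ranking described in this conversation (which data sets are trending, and who the top users of each one are) can be sketched as simple counting over a query log. This is a minimal illustration, not Alation's actual method: the log format, the user and data set names, and the helpers `rank_datasets` and `top_users` are all hypothetical.

```python
from collections import Counter

# Hypothetical query log: one (user, dataset) pair per query that was run.
query_log = [
    ("ana", "sales.revenue_2017"),
    ("ana", "sales.revenue_2017"),
    ("ana", "sales.revenue_2017"),
    ("ben", "sales.revenue_2017"),
    ("ben", "sales.revenue_2017"),
    ("chi", "sales.revenue_2017"),
    ("ben", "sales.churn"),
    ("chi", "sales.churn"),
]


def rank_datasets(log):
    """Return (dataset, query_count) pairs, most-used data set first."""
    counts = Counter(dataset for _, dataset in log)
    return counts.most_common()


def top_users(log, dataset, n=2):
    """Return the n users who queried the given data set most often."""
    users = Counter(user for user, ds in log if ds == dataset)
    return [user for user, _ in users.most_common(n)]


print(rank_datasets(query_log))
print(top_users(query_log, "sales.revenue_2017"))
```

A newcomer who lands on an unfamiliar data set can then see both how heavily it is used and which colleagues to ask about it.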
>> So people are the new Google results, so you mentioned Google PageRank, which is web pages and relevance. You're taking a much more people approach to relevance. >> Satyen: That's right. >> To the data itself. >> That's right, and that builds community in very, very clear ways, because people have curiosity. Other people are the mechanism by which they satisfy that curiosity. And so that community builds automatically. >> They pay it forward, they know who to ask for help. >> That's right. >> Interesting. >> That's right. >> Last question, Satyen. The tag line, first data catalog designed for collaboration, is there a customer that comes to mind to you as really one that articulates that point exactly? Where Alation has come in and really kicked open the door, in terms of facilitating collaboration. >> Oh, absolutely. I was literally, this morning talking to one of our customers, Munich Reinsurance, largest reinsurance customer or company in the world. Their chief data officer said, "Look, three years ago, we started with 10 people working on data. Today, we've got hundreds. Our aspiration is to get to thousands." We have three things that we do. One is, we actually discover insights. It's actually the smallest part of what we do. The second thing that we do is, we enable people to use data. And the third thing that we do is, drive a data driven culture. And for us, it's all about scaling knowledge, to centers in China, to centers in North America, to centers in Australia. And they've been doing that at scale. And they go to each of their people and they say, "Are you a data black belt, are you a data novice?" It's kind of like skiing. Are you a blue diamond or a black diamond? >> Always ski in pairs. (laughs) >> That's right. >> And they do ski in pairs. 
And what they end up ultimately doing is saying, "Look, we're going to train all of our workforce to become better, so that in three, 10 years, we're recognized as one of the most innovative insurance companies in the world." Three years ago, that was not the case. >> Process improvement at a whole other level. My final question for you is, for the folks watching or the folks that are going to watch this video, that could be a potential customer of yours, what are they feeling? If I'm the customer, what smoke signals am I seeing that say, I need to call Alation? What are some of the things that you've found that would tell a potential customer that they should be talkin' to you guys? >> Look, I think that they've got to throw out the old playbook. And this was a point that was made by some folks at a conference that I was at earlier this week. But they basically were saying, "Look, the DLNA's PlayBook was all about providing the right answer." Forget about that. Just allow people to ask the right questions. And if you let people's curiosity guide them, people are industrious, and ambitious, and innovative enough to go figure out what they need to go do. But if you see this as a world of control, where I'm going to just figure out what people should know and tell them what they're going to go know, that's going to be a pretty poor career to go choose because data's all about, sort of, freedom and innovation and understanding. And we're trying to push that along. >> Satyen, thanks so much for stopping by >> Thank you. >> and sharing how you guys are helping organizations, enterprises unlock data curiosity. We appreciate your time. >> I appreciate the time too. >> Thank you. >> And thanks John! >> And thank you. >> Thanks for co-hosting with me. For John Furrier, I'm Lisa Martin, you're watching theCUBE live from our second day of coverage of our event Big Data SV. Stick around, we'll be right back with our next guest after a short break. (upbeat music)

Published Date : Mar 9 2018

SUMMARY :

brought to you by SiliconANGLE Media Satyen Sangani, the co-founder and CEO of Alation. So you guys finish up your fiscal year how are you pulling this momentum through 2018. in the last quarter we added customers like What about the platform you guys are doing? Take a minute, talk about the update. And that gives you a single place of reference, you got to manage it. So it's kind of like a tropic disillusionment, if you will. And they know that How do you guys talk to that customer? And so, you can either run away from that data, Those are the sorts of problems that I'm excited to work on. Where is it kind of coming together for you guys? and I'm going to find the thing that I'm looking for, that you guys added towards the end of calendar 2017. And oh, by the way, we can measure that. a lot of people you mentioned social networks, I hear a lot of stories like, we bought a tool, And then, are you doubling down? And all of that gives you this really rich graph, It's on how you demystify it So the problem that you solve, And then as you get people using it, and operationalize data, how would you talk about it? and the context to then explain data the volume, how do you help an enterprise understand have the resources and the investment corresponding to So, you can really facilitate the whole recommend the latter. than they do the ability to execute. What it means to me is this, that nobody's come along the natural outcome of that is possibly usage you attract. And I didn't have to know them. So people of the new Google results, And so that community builds automatically. is there a customer that comes to mind to And the third thing that we do is, And what they end up ultimately doing is saying, that they should be talkin' to you guys? And if you let people's curiosity guide them, and sharing how you guys are helping organizations, Thanks for co-hosting with me.

ENTITIES

Entity | Category | Confidence
PepsiCo | ORGANIZATION | 0.99+
Lisa Martin | PERSON | 0.99+
Satyen Sangani | PERSON | 0.99+
John | PERSON | 0.99+
American Express | ORGANIZATION | 0.99+
Alation | ORGANIZATION | 0.99+
Roche | ORGANIZATION | 0.99+
Satyen | PERSON | 0.99+
thousands | QUANTITY | 0.99+
Lisa | PERSON | 0.99+
55 cents | QUANTITY | 0.99+
Australia | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Coca-Cola | ORGANIZATION | 0.99+
2018 | DATE | 0.99+
10 people | QUANTITY | 0.99+
three | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
hundreds | QUANTITY | 0.99+
Yelp | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
China | LOCATION | 0.99+
Harvard | ORGANIZATION | 0.99+
Facebook | ORGANIZATION | 0.99+
two | QUANTITY | 0.99+
Today | DATE | 0.99+
2017 | DATE | 0.99+
55% | QUANTITY | 0.99+
second day | QUANTITY | 0.99+
North America | LOCATION | 0.99+
Google | ORGANIZATION | 0.99+
today | DATE | 0.99+
two dollars | QUANTITY | 0.99+
20 different people | QUANTITY | 0.99+
yesterday | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
last year | DATE | 0.99+
three years ago | DATE | 0.99+
first | QUANTITY | 0.99+
second thing | QUANTITY | 0.99+
One | QUANTITY | 0.99+
one | QUANTITY | 0.99+
first quarter of 2018 | DATE | 0.99+
20 different reports | QUANTITY | 0.99+
three things | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
last quarter | DATE | 0.98+
DLNA | ORGANIZATION | 0.98+
third thing | QUANTITY | 0.98+
Three years ago | DATE | 0.98+
each | QUANTITY | 0.98+
single | QUANTITY | 0.98+
both | QUANTITY | 0.98+
1st Street | LOCATION | 0.98+
Big Bang | EVENT | 0.98+
this year | DATE | 0.98+
Strata Data | ORGANIZATION | 0.97+
12 X | QUANTITY | 0.97+
GDPR | TITLE | 0.97+
seven ratings | QUANTITY | 0.96+
Alation | PERSON | 0.95+
this morning | DATE | 0.95+
Big Data SV 2018 | EVENT | 0.94+
first data | QUANTITY | 0.94+
Teradata | ORGANIZATION | 0.93+
10 years | QUANTITY | 0.93+

Ian Swanson, DataScience.com | Big Data SV 2018


 

(royal music) >> Announcer: John Cleese. >> There's a lot of people out there who have no idea what they're doing, but they have absolutely no idea that they have no idea what they're doing. Those are the ones with the confidence and stupidity who finish up in power. That's why the planet doesn't work. >> Announcer: Knowledgeable, insightful, and a true gentleman. >> The guy at the counter recognized me and said... Are you listening? >> John Furrier: Yes, I'm tweeting away. >> No, you're not. >> I tweet, I'm tweeting away. >> He is kind of rude that way. >> You're on your (bleep) keyboard. >> Announcer: John Cleese joins the Cube alumni. Welcome, John. >> John Cleese: Have you got any phone calls you need to answer? >> John Furrier: Hold on, let me check. >> Announcer: Live from San Jose, it's the Cube, presenting Big Data Silicon Valley, brought to you by Silicon Angle Media and its ecosystem partners. (busy music) >> Hey, welcome back to the Cube's continuing coverage of our event, Big Data SV. I'm Lisa Martin with my co-host, George Gilbert. We are down the street from the Strata Data Conference. This is our second day, and we've been talking all things big data, cloud data science. We're now excited to be joined by the CEO of a company called Data Science, Ian Swanson. Ian, welcome to the Cube. >> Thanks so much for having me. I mean, it's been an awesome two days so far, and it's great to wrap up my trip here on the show. >> Yeah, so, tell us a little bit about your company, Data Science, what do you guys do? What are some of the key opportunities for you guys in the enterprise market? >> Yeah, absolutely. My company's called datascience.com, and what we do is we offer an enterprise data science platform where data scientists get to use all the tools they love in all the languages, all the libraries, leveraging everything that is open source to build models and put models in production. 
Then we also provide IT the ability to be able to manage this massive stack of tools that data scientists require, and it all boils down to one thing, and that is, companies need to use the data that they've been storing for years. It's about, how do you put that data into action. We give the tools to data scientists to get that data into action. >> Let's drill down on that a bit. For a while, we thought if we just put all our data in this schema-on-read repository, that would be nirvana. But it wasn't all that transparent, and we recognized we have to sort of go in and structure it somewhat, help us take the next couple steps. >> Ian: Yeah, the journey. >> From this partially curated data sets to something that turns into a model that is actionable. >> That's actually been the theme in the show here at the Strata Data Conference. If we went back years ago, it was, how do we store data. Then it was, how do we not just store and manage, but how do we transform it and get it into a shape that we can actually use it. The theme of this year is how do we get it to that next step, the next step of putting it into action. To layer onto that, data scientists need to access data, yes, but then they need to be able to collaborate, work together, apply many different techniques, machine learning, AI, deep learning, these are all techniques of a data scientist to be able to build a model. But then there's that next step, and the next is, hey, I built this model, how do I actually get it in production? How does it actually get used? Here's the shocking thing. I was at an event where there's 500 data scientists in the audience, and I said, "Stand up if you worked on a model for more than nine months "and it never went into production." 90% of the audience stood up. That's the last mile that we're all still working on, and what's exciting is, we can make it possible today. >> Wanting to drill down into the sort of, it sounds like there's a lot of choice in the tools. 
But typically, to do a pipeline, you either need well established APIs that everyone understands and plugs together with, or you need an end to end sort of single vendor solution that becomes the sort of collaboration backbone. How are you organized, how are you built? >> This might be self-serving, but datascience.com, we have enterprise data science platform, we recommend a unified platform for data science. Now, that unified platform needs to be highly configurable. You need to make it so that that workbench, you can use any tool that you want. Some data scientists might want to use a hammer, others want to be able to use a screwdriver over here. The power is how configurable, how extensible it is, how open source you can adopt everything. The amazing trends that we've seen have been proprietary solutions going back decades, to now, the rise of open source. Every day, dozens if not hundreds of new machine learning libraries are being released every single day. We've got to give those capabilities to data scientists and make them scale. >> OK, so the, and I think it's pretty easy to see how you would have incorporate new machine learning libraries into a pipeline. But then there's also the tools for data preparation, and for like feature extraction and feature engineering, you might even have some tools that help you with figuring out which algorithm to select. What holds all that together? >> Yeah, so orchestrating the enterprise data science stack is the hardest challenge right now. There has to be a company like us that is the glue, that is not just, do these solutions work together, but also, how do they collaborate, what is that workflow? What are those steps in that process? There's one thing that you might have left out, and that is, model deployment, model interpretation, model management. >> George: That's the black art, yeah. >> That's where this whole thing is going next. 
That was the exciting thing that I heard in terms of all these discussion with business leaders throughout the last two days is model deployment, model management. >> If I can kind of take this to maybe shift the conversation a little bit to the target audience. Talked a lot about data scientists and needing to enable them. I'm curious about, we just talked with, a couple of guests ago, about the chief data officer. How, you work with enterprises, how common is the chief data officer role today? What are some of the challenges they've got that datascience.com can help them to eliminate? >> Yeah, the CIO and the chief data officer, we have CIOs that have been selecting tools for companies to use, and now the chief data officer is sitting down with the CEO and saying, "How do we actually drive business results?" We work very closely with both of those personas. But on the CDO side, it's really helping them educate their teams on the possibilities of what could be realized with the data at hand, and making sure that IT is enabling the data scientists with the right tools. We supply the tools, but we also like to go in there with our customers and help coach, help educate what is possible, and that helps with the CDO's mission. >> A question along that front. We've been talking about sort of empowering the data scientist, and really, from one end of the modeling life cycle all the way to the end or the deployment, which is currently the hardest part and least well supported. But we also have tons of companies that don't have data science trained people, or who are only modestly familiar. Where do, what do we do with them? How do we get those companies into the mainstream in terms of deploying this? >> I think whether you're a small company or a big company, digital transformation is the mandate. 
Digital transformation is not just, how do I make a taxi company become Uber, or how do I make a speaker company become Sonos, the smart speaker, it's how do I exploit all the sources of my data to get better and improved operational processes, new business models, increased revenue, reduced operation costs. You could start small, and so we work with plenty of smaller companies. They'll hire a couple data scientists, and they're able to do small quick wins. You don't have to go sit in the basement for a year having something that is the thing, the unicorn in the business, it's small quick wins. Now we, my company, we believe in writing code, trained, educated, data scientists. There are solutions out there that you throw data at, you push a button, it gets an output. It's this magic black box. There's risk in that. Model interpretation, what are the features it's scoring on, there's risk, but those companies are seeing some level of success. We firmly believe, though, in hiring a data science team that is trained, you can start small, two or three, and get some very quick wins. >> I was going to say, those quick wins are essential for survivability, like digital transformation is essential, but it's also, I mean, to survival at a minimum, right? >> Ian: Yes. >> Those quick wins are presumably transformative to an enterprise being able to sustain, and then eventually, or ideally, be able to take market share from their competition. >> That is key for the CDO. The CDO is there pitching what is possible, he's pitching, she's pitching the dream. In order to be able to help visualize what that dream and the outcome could be, we always say, start small, quick wins, then from there, you can build. What you don't want to do is go nine months working on something and you don't know if there's going to be outcome. A lot of data science is trial and error. This is science, we're testing hypotheses. 
There's not always an outcome that's to be there, so small quick wins is something we highly recommend. >> A question, one of the things that we see more and more is the idea that actionable insights are perishable, and that latency matters. In fact, you have a budget for latency, almost, like in that short amount of time, the more sort of features that you can dynamically feed into a model to get a score, are you seeing more of that? How are the use cases that you're seeing, how's that pattern unfolding? >> Yeah, so we're seeing more streaming data use cases. We work with some of the biggest technology companies in the world, so IoT, connected services, streaming real time decisions that are happening. But then, also, there are so many use cases around org that could be marketing, finance, HR related, not just tech related. On the marketing side, imagine if you're customer service, and somebody calls you, and you know instantly the lifetime value of that customer, and it kicks off a totally new talk track, maybe get escalated immediately to a new supervisor, because that supervisor can handle this top tier customer. These are decisions that can happen real time leveraging machine learning models, and these are things that, again, are small quick wins, but massive, massive impact. It's about decision process now. That's digital transformation. >> OK. Are you seeing patterns in terms of how much horsepower customers are budgeting for the training process, creating the model? Because we know it's very compute intensive, like, even Intel, some people call it, like, high performance compute, like a supercomputer type workload. How much should people be budgeting? Because we don't see any guidelines or rules of thumb for this. >> I still think the boundaries are being worked out. There's a lot of great work that Nvidia's doing with GPU, we're able to do things faster on compute power. 
But even if we just start from the basics, if you go and talk to a data scientist at a massive company where they have a team of over 1,000 data scientists, and you say to do this analysis, how do you spin up your compute power? Well, I go walk over to IT and I knock on the door, and I say, "Set up this machine, set up this cluster." That's ridiculous. A product like ours is able to instantly give them the compute power, scale it elastically with our cloud service partners or work with on-prem solutions to be able to say, get the power that you need to get the results in the time that's needed, quick, fast. In terms of the boundaries of the budget, that's still being defined. But at the end of the day, we are seeing return on investment, and that's what's key. >> Are you seeing a movement towards a greater scope of integration for the data science tool chain? Or is it that at the high end, where you have companies with 1,000 data scientists, they know how to deal with specialized components, whereas, when there's perhaps less of, a smaller pool of expertise, the desire for end to end integration is greater. >> I think there's this kind of thought that is not necessarily right, and that is, if you have a bigger data science team, you're more sophisticated. We actually see the same sophistication level of 1,000 person data science team, in many cases, to a 20 person data science team, and sometimes inverse, I mean, it's kind of crazy. But it's, how do we make sure that we give them the tools so they can drive value. Tools need to include collaboration and workflow, not just hammers and nails, but how do we work together, how do we scale knowledge, how do we get it in the hands of the line of business so they can use the results. It's that that is key. >> That's great, Ian. I also like that you really kind of articulated start small, quick wins can make massive impact. 
We want to thank you so much for stopping by the Cube and sharing that, and what you guys are doing at Data Science to help enterprises really take advantage of the value that data can really deliver. >> Thanks so much for having datascience.com on, really appreciate it. >> Lisa: Absolutely. George, thank you for being my co-host. >> You're always welcome. >> We want to thank you for watching the Cube. I'm Lisa Martin with George Gilbert, and we are at our event Big Data SV on day two. Stick around, we'll be right back with our next guest after a short break. (busy music)

Published Date : Mar 8 2018

SUMMARY :

Those are the ones with the confidence and stupidity and a true gentleman. The guy at the counter recognized me and said... Announcer: John Cleese joins the Cube alumni. brought to you by Silicon Angle Media We are down the street from the Strata Data Conference. and it's great to wrap up my trip here on the show. and it all boils down to one thing, and that is, the next couple steps. to something that turns into a model that is actionable. and the next is, hey, I built this model, that becomes the sort of collaboration backbone. how open source you can adopt everything. OK, so the, and I think it's pretty easy to see Yeah, so orchestrating the enterprise data science stack in terms of all these discussion with business leaders a couple of guests ago, about the chief data officer. and making sure that IT is enabling the data scientists empowering the data scientist, and really, having something that is the thing, or ideally, be able to take market share and the outcome could be, we always say, start small, the more sort of features that you can dynamically in the world, so IoT, connected services, customers are budgeting for the training process, get the power that you need to get the results Or is it that at the high end, We actually see the same sophistication level and sharing that, and what you guys are doing Thanks so much for having datascience.com on, George, thank you for being my co-host. and we are at our event Big Data SV on day two.

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Ian Swanson | PERSON | 0.99+
George | PERSON | 0.99+
Ian | PERSON | 0.99+
Lisa | PERSON | 0.99+
Uber | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Silicon Angle Media | ORGANIZATION | 0.99+
John | PERSON | 0.99+
John Cleese | PERSON | 0.99+
500 data scientists | QUANTITY | 0.99+
90% | QUANTITY | 0.99+
dozens | QUANTITY | 0.99+
Nvidia | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
20 person | QUANTITY | 0.99+
Data Science | ORGANIZATION | 0.99+
nine months | QUANTITY | 0.99+
1,000 person | QUANTITY | 0.99+
two | QUANTITY | 0.99+
two days | QUANTITY | 0.99+
more than nine months | QUANTITY | 0.99+
second day | QUANTITY | 0.99+
1,000 data scientists | QUANTITY | 0.99+
three | QUANTITY | 0.99+
Big Data SV | EVENT | 0.99+
over 1,000 data scientists | QUANTITY | 0.99+
Cube | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
Strata Data Conference | EVENT | 0.98+
one | QUANTITY | 0.98+
Intel | ORGANIZATION | 0.98+
Sonos | ORGANIZATION | 0.98+
one thing | QUANTITY | 0.97+
a year | QUANTITY | 0.96+
today | DATE | 0.95+
day two | QUANTITY | 0.95+
this year | DATE | 0.94+
single | QUANTITY | 0.92+
Big Data SV 2018 | EVENT | 0.88+
DataScience.com | ORGANIZATION | 0.87+
hundreds of new machine learning libraries | QUANTITY | 0.86+
lot of people | QUANTITY | 0.83+
decades | QUANTITY | 0.82+
every single day | QUANTITY | 0.81+
years ago | DATE | 0.77+
last two days | DATE | 0.76+
datascience.com | ORGANIZATION | 0.75+
one end | QUANTITY | 0.7+
years | QUANTITY | 0.67+
datascience.com | OTHER | 0.65+
couple steps | QUANTITY | 0.64+
Big Data | EVENT | 0.64+
couple of guests | DATE | 0.57+
couple | QUANTITY | 0.52+
Silicon Valley | LOCATION | 0.52+
things | QUANTITY | 0.5+
Cube | TITLE | 0.47+

Ziya Ma, Intel | Big Data SV 2018


 

>> Live from San Jose, it's theCUBE! Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE. Our continuing coverage of our event, Big data SV. I'm Lisa Martin with my co-host George Gilbert. We're down the street from the Strata Data Conference, hearing a lot of interesting insights on big data. Peeling back the layers, looking at opportunities, some of the challenges, barriers to overcome but also the plethora of opportunities that enterprises alike have that they can take advantage of. Our next guest is no stranger to theCUBE, she was just on with me a couple days ago at the Women in Data Science Conference. Please welcome back to theCUBE, Ziya Ma. Vice President of Software and Services Group and the Director of Big Data Technologies from Intel. Hi Ziya! >> Hi Lisa. >> Long time, no see. >> I know, it was just really two to three days ago. >> It was, well and now I can say happy International Women's Day. >> The same to you, Lisa. >> Thank you, it's great to have you here. So as I mentioned, we are down the street from the Strata Data Conference. You've been up there over the last couple days. What are some of the things that you're hearing with respect to big data? Trends, barriers, opportunities? >> Yeah, so first it's very exciting to be back at the conference again. The one biggest trend, or one topic that's hit really hard by many presenters, is the power of bringing the big data system and data science solutions together. You know, we're definitely seeing in the last few years the advancement of big data and advancement of data science or you know, machine learning, deep learning truly pushing forward business differentiation and improve our life quality. So that's definitely one of the biggest trends. Another thing I noticed is there was a lot of discussion on big data and data science getting deployed into the cloud. What are the learnings, what are the use cases? 
So I think that's another noticeable trend. And also, there were some presentations on doing the data science or having the business intelligence on the edge devices. That's another noticeable trend. And of course, there were discussion on security, privacy for data science and big data so that continued to be one of the topics. >> So we were talking earlier, 'cause there's so many concepts and products to get your arms around. If someone is looking at AI and machine learning on the back end, you know, we'll worry about edge intelligence some other time, but we know that Intel has the CPU with the Xeon and then this lower power one with Atom. There's the GPU, there's ASICs, FPGAs, and then there are these software layers you know, with higher abstraction layer, higher abstraction level. Help us put some of those pieces together for people who are like saying, okay, I know I've got a lot of data, I've got to train these sophisticated models, you know, explain this to me. >> Right, so Intel is a real solution provider for data science and big data. So at the hardware level, and George, as you mentioned, we offer a wide range of products from general purpose like Xeon to targeted silicon such as FPGA and other ASIC chips like Nervana. And also we provide adjacencies like networking the hardware, non-volatile memory and mobile. You know, those are the other adjacent products that we offer. Now on top of the hardware layer, we deliver fully optimized software solutions stack from libraries, frameworks, to tools and solutions. So that we can help engineers or developers to create AI solutions with greater ease and productivity. For instance, we deliver the Intel-optimized Math Kernel Library. It leverages the latest instruction set, which gives significant performance boosts when you are running your software on Intel hardware. We also deliver frameworks like BigDL for Spark and big data type of customers if they are looking for deep learning capabilities. 
We also optimize some popular open source deep learning frameworks like Caffe, like TensorFlow, MXNet, and a few others. So our goal is to provide all the necessary solutions so that at the end our customers can create the applications, the solutions that they really need to address their biggest pain points. >> Help us think about the maturity level now. Like, we know that the very most sophisticated internet service providers who are sort of all over this machine learning now for quite a few years. Banks, insurance companies, people who've had this. Statisticians and actuaries who have that sort of skillset are beginning to deploy some of these early production apps. Where are we in terms of getting this out to the mainstream? What are some of the things that have to happen? >> To get it to mainstream, there are so many things we could do. First I think we will continue to see the wide range of silicon products but then there are a few things Intel is pushing. For example, we're developing this Intel Nervana graph compiler that will encapsulate the hardware integration details and present a consistent API for developers to work with. And this is one thing that we hope that we can eventually help the developer community with. And also, we are collaborating with the end user. Like, from the enterprise segment. For example, we're working with the financial services industry, we're working with a manufacturing sector and also customers from the medical field. And online retailers, trying to help them to deliver or create the data science and analytics solutions on Intel-based hardware or Intel optimized software. So that's another thing that we do. And we're seeing actually very good progress in this area. Now we're also collaborating with many cloud service providers. For instance, we work with some of the top seven cloud service providers, both in the U.S. 
and also in China to democratize the, not only our hardware, but also our libraries and tools, BigDL, MKL, and other frameworks and libraries so that our customers, including individuals and businesses, can easily access those building blocks from the cloud. So definitely we're working from different factors. >> So last question in the last couple of minutes. Let's kind of vibe on this collaboration theme. Tell us a little bit about the collaboration that you're having with, you mentioned customers in some highly regulated industries, as an example. But a little bit to understand what's that symbiosis? What is Intel learning from your customers that's driving Intel's innovation of your technologies and big data? >> That's an excellent question. So Lisa, maybe I can start by sharing a couple of customer use cases. What kind of a solution that we help our customer to address. I think it's always wise not to start a conversation with the customer on technology that you deliver. You want to understand the customer's needs first. And then so that you can provide a solution that really addresses their biggest pain point rather than simply selling technology. So for example, we have worked with an online retailer to better understand their customers' shopping behavior and to assess their customers' preferences and interests. And based upon that analysis, the online retailer made different product recommendations and maximized its customers' purchase potential. And it drove up the retailer's sales. You know, that's one type of use case that we have worked. We also have partnered with the customers from the medical field. Actually, today at the Strata Conference we actually had somebody highlighting, we had a joint presentation with UCSF where we helped the medical center to automate the diagnosis and grading of meniscus lesions. And so today actually, that's all done manually by the radiologist but now that entire process is automated. 
The result is much more accurate, much more consistent, and much more timely. Because you don't have to wait for the availability of a radiologist to read all the 3D MRI images. And that can all be done by machines. You know, so those are the areas that we work with our customers, understand their business need, and give them the solution they are looking for. >> Wow, the impact there. I wish we had more time to dive into some of those examples. But we thank you so much, Ziya, for stopping by twice in one week to theCUBE and sharing your insights. And we look forward to having you back on the show in the near future. >> Thanks, so thanks Lisa, thanks George for having me. >> And for my co-host George Gilbert, I'm Lisa Martin. We are live at Big Data SV in San Jose. Come down, join us for the rest of the afternoon. We're at this cool place called Forager Tasting and Eatery. We will be right back with our next guest after a short break. (electronic outro music)

Published Date : Mar 8 2018

SUMMARY :

brought to you by SiliconANGLE Media some of the challenges, barriers to overcome What are some of the things that you're So that's definitely one of the biggest trends. on the back end, So at the hardware level, and George, as you mentioned, What are some of the things that have to happen? and also customers from the medical field. So last question in the last couple of minutes. customers from the medical field. And we look forward to having you We will be right back with our

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
UCSF | ORGANIZATION | 0.99+
George | PERSON | 0.99+
Lisa | PERSON | 0.99+
San Jose | LOCATION | 0.99+
China | LOCATION | 0.99+
Ziya Ma | PERSON | 0.99+
U.S. | LOCATION | 0.99+
International Women's Day | EVENT | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Ziya | PERSON | 0.99+
one week | QUANTITY | 0.99+
today | DATE | 0.99+
twice | QUANTITY | 0.99+
First | QUANTITY | 0.99+
Strata Data Conference | EVENT | 0.99+
one topic | QUANTITY | 0.98+
Spark | TITLE | 0.98+
both | QUANTITY | 0.98+
Intel | ORGANIZATION | 0.98+
one thing | QUANTITY | 0.98+
three days ago | DATE | 0.98+
Women in Data Science Conference | EVENT | 0.97+
Strata Conference | EVENT | 0.96+
first | QUANTITY | 0.96+
BigDL | TITLE | 0.96+
TensorFlow | TITLE | 0.96+
one type | QUANTITY | 0.95+
two | DATE | 0.94+
MXNet | TITLE | 0.94+
Caffe | TITLE | 0.92+
theCUBE | ORGANIZATION | 0.91+
one | QUANTITY | 0.9+
Software and Services Group | ORGANIZATION | 0.9+
Forager Tasting and Eatery | ORGANIZATION | 0.88+
Vice President | PERSON | 0.86+
Big Data Technologies | ORGANIZATION | 0.84+
seven cloud service providers | QUANTITY | 0.81+
last couple days | DATE | 0.81+
Atom | COMMERCIAL_ITEM | 0.76+
Silicon Valley | LOCATION | 0.76+
Big Data SV 2018 | EVENT | 0.74+
a couple days ago | DATE | 0.72+
Big Data SV | ORGANIZATION | 0.7+
Xeon | COMMERCIAL_ITEM | 0.7+
Nervana | ORGANIZATION | 0.68+
Big Data | EVENT | 0.62+
last | DATE | 0.56+
data | EVENT | 0.54+
case | QUANTITY | 0.52+
3D | QUANTITY | 0.48+
couple | QUANTITY | 0.47+
years | DATE | 0.47+
Nervana | TITLE | 0.45+
Big | ORGANIZATION | 0.32+

Blaine Mathieu, VANTIQ | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's The Cube, presenting Big Data, Silicon Valley. Brought to you by Silicon Angle Media and its ecosystem partners. >> Welcome back to The Cube. Our continuing coverage of our event, Big Data SV continues. I am Lisa Martin joined by Peter Burris. We're in downtown San Jose at a really cool place called Forager Tasting and Eatery. Come down, hang out with us today as we have continued conversations around all things big data, everything in between. This is our second day here and we're excited to welcome to The Cube the CMO of VANTIQ, Blaine Mathieu. Blaine, great to meet you, great to have you on the program. >> Great to be here, thanks for inviting me. >> So, VANTIQ, you guys are up the street in Walnut Creek. What do you guys do, what are you about, what makes VANTIQ different? >> Well, in a nutshell, VANTIQ is a so called high productivity application development platform to allow developers to build, deploy, and manage so called event driven real time applications, the kind of applications that are critical for driving many of the digital transformation initiatives that enterprises are trying to get on top of these days. >> Digital transformation, it's a term that can mean so many different things, but today, it's essential for companies to be able to compete, especially enterprise companies with newer companies that are more agile, more modern. But if we peel apart digital transformation, there's so many elements that are essential. How do you guys help companies, enterprises, say, evolve their application architectures that might currently not be able to support an actual transformation to a digital business? >> Well, I think that's a great question, thank you. I think the key to digital transformation is really a lot around the concept of real time, okay. 
The reason Uber is disrupting or has disrupted the taxi industry is the old way of doing it was somebody called a taxi and then they waited 30 minutes for a taxi to show up and then they told the taxi where to go and hopefully they got there. Whereas Uber turned that into a real time business, right? You called, you pinged something on your phone. They knew your location. They knew the location of the driver. They matched those up, brought 'em together in real time. Already knew where to bring you to and ensured you had the right route to that location. All of this data flowing, all of these actions have been taken in real time. The same thing applies to a disruptor like Netflix, okay? In the old days, Blockbuster used to send you, you know, a leaflet in the mail telling you what the new movies are. Maybe it was personalized for you. Probably not. Now, Netflix knows who you are instantly, gives you that information, again, in real time based on what you've done in the past and is able to give you, deliver the movie also, in real time pretty well. Every disruptor you look at around digital transformation is bringing a business or a process that was done slowly and impersonally to make it happen in real time. Unfortunately, enterprise applications and the architectures, as you said a second ago, that are being used in most applications today weren't designed to enable these real time use cases. A great example is Salesforce. So, Salesforce is a pretty standard, what you'd call a request-response application. So, you make a request, a person, generally, makes a request of the system, the system goes into a database, queries that database, finds information and then returns it back to the user. And that whole process could take, you know, significant amounts of time, especially if the right data isn't in the database at the time and you have to go request it or find it or create it. 
A new type of application needs to be created that's not fundamentally database centric, but it's able to take these real time data streams coming in from devices, from people, from enterprise systems, process them in real time and then take an action. >> So, let's pretend I'm a CEO. >> Yeah. >> One of the key things you said, and I want you to explain it better, is event. What is event? What is an event and how does that translate into a digital business decision? >> This notion of complex event processing, CEP, has been around in technology for a long time and yet, it surprises me still a lot of folks we talk to, CEOs, have never heard of the concept. And, it's very simple really. An event is just something that happens in the context of business. That's as complex and as simple as it is. An event could be a machine increases in temperature by one degree, a car moves from one location to another location. It could be an enterprise system, like an ERP system, you know, approves a PO. It could be a person pressing a button on a mobile device. All of those, or it could be an IoT device putting off a signal about the state of a machine. Increasingly, we're getting a lot of events coming from IoT devices. So, really, any particular interesting business situation or a change in a situation that happens is an event. And increasingly, driven, as you know, by IoT, by augmented reality, by AI and machine learning, by autonomous vehicles, all these new real time technologies are spinning off more and more events, streams of these events coming off in rapid fashion and we have to be able to do something about them. >> Let me take a crack at it and you tell me if I've got this right. That, historically, applications have been defined in terms of processes and so, in many respects, there was a very concrete, discrete, well established program, set of steps that were performed and then the transaction took place. 
An event, it seems to me, is, yeah, we generally describe it, but it changes in response to the data. >> Right, right. >> So, an event is kind of like an outside in driven by data. >> Right, right. >> System response, whereas, your traditional transaction processing is an inside out driven by a sequence of programmed steps, and that decision might have been made six years ago. So, the event is what's happening right now informed by data versus a transaction, traditional transaction is much more, what did we decide to do six years ago and it just gets sustained. Have I got that right? >> That's right. Absolutely right or six hours ago or even six minutes ago, which might seem wow, six minutes, that's pretty good, but take a use case for a field service agent trying to fix a machine or an air conditioner on top of a building. In today's world now, that air conditioner has hundreds of sensors that are putting off data about the state of that air conditioner in real time. A service tech has the ability to, while the machine is still putting off that data, be able to make repairs and changes and fixes, again, in the moment, see how that is changing the data coming off the machine, and then, continue to make the appropriate repairs in collaboration with a smart system or an application that's helping them. >> That's about identifying patterns about what the problem is, versus some of the old ways where we had a recipe of, you know, steps that you went through in the call center. >> Right, right. And the customer is getting more and more frustrated. >> They got their clipboard out and had the 52 steps they followed to see oh that didn't work, now the next step. No, data can help us do that much more efficiently and effectively if we're able to process it in real time. 
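To make the contrast between request-driven and event-driven processing a little more concrete, here is a minimal Python sketch of the air conditioner example. It is only an illustration under stated assumptions: the `Event` and `EventBus` names, the `ac-unit-7` source, and the 40 degree threshold are all hypothetical and are not taken from VANTIQ's platform. The point it shows is that work is triggered by the arrival of data, rather than by a user asking a database a question.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """Something that happened: where it came from, what kind it is, and a value."""
    source: str
    kind: str
    value: float


class EventBus:
    """Routes each incoming event to the handlers registered for its kind."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self._handlers[kind].append(handler)

    def publish(self, event: Event) -> None:
        # The arriving event drives the work; nothing polls a database.
        for handler in self._handlers[event.kind]:
            handler(event)


alerts: List[str] = []
MAX_TEMP_C = 40.0  # hypothetical threshold, chosen only for the example


def on_temperature(event: Event) -> None:
    """React the moment a reading crosses the threshold."""
    if event.value > MAX_TEMP_C:
        alerts.append(f"{event.source}: {event.value:.1f} C exceeds {MAX_TEMP_C:.1f} C")


bus = EventBus()
bus.subscribe("temperature", on_temperature)

# Simulated sensor stream from one (hypothetical) air conditioner.
for reading in (38.5, 39.5, 40.5):
    bus.publish(Event(source="ac-unit-7", kind="temperature", value=reading))
```

After the loop runs, `alerts` holds a single entry for the 40.5 degree reading. A real system would add asynchronous delivery, persistence, and many event kinds, which is the complexity the conversation goes on to discuss.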
>> So, in many respects, what we're really talking about is an application world or a world looking forward where the applications, which historically have been very siloed, process driven, to a world where the application function is much more networked together and the application, the output of one application is having a significant impact through data on the performance of an application somewhere else. That seems like it's got the potential to be an extremely complex fabric. (laughing) So, do I wait until I figure all that out (laughing) and then I start building it? Or do I, I mean, how do I do it? Do I start small and create and grow into it? What's the best way for people to start working on this? >> Well, you're absolutely right. Building these complex, geeking out a little bit, you know, asynchronous, non-blocking, so called reactive applications, that's the concept that we've been using in computer science for some time, is very hard, frankly. Okay, it's much easier to build computing systems that process things step one, step, two, step three, in order, but if you have to build a system that is able to take real time inputs or changes at any point in the process at any time and go in a different direction, it's very complex. And, computer scientists have been writing applications like this for decades. It's possible to do, but that isn't possible to do at the speed that companies now want to transform themselves, right? By the time you spec out an application and spend two years writing it, your business competitors have already disrupted you. The requirements have already changed. You need to be much more rapid and agile. And so, the secret sauce to this whole thing is to be able to write these transformative applications or create them, not even write is actually the wrong word to use, to be able to create them. >> Generate them. 
>> Yeah, generate them in a way which is very fast, does not require a guru-level developer in reactive Java or some super low level code that you'd have to use to otherwise do it, so that you can literally have business people help design the applications, conceptually build them almost in real time, get them out into the market, and then be able to modify them as you need to, you know, on the fly. >> If I can build on that for just one second. So, it used to be we had this thing called computer-assisted software engineering. >> (laughs) Right, right. >> We were going to operate this very very high level language. It's kind of-- But then, we would use code and build a code and the two of them were separated and so the minute that we deployed, somebody would go off and maintain and the whole thing would break. >> Right, right. >> Do you have that problem? >> No, well, that's exactly right. So, the old, you know, the old, the previous way of doing it was about really modeling an application, maybe visually, drag and drop, but then fundamentally, you created a bunch of code and then your job, as you said after, was to maintain and deploy and manage. >> Try to sustain some connection back up to that beautiful visual model. >> And you probably didn't because that was too much. That was too much work, so forget about the model after that. Instead, what we're able to do these days is to build the applications visually, you know, really for the most part with either super low code or, in many cases, no code because we have the ability to abstract away a lot of the complexity, a lot of the complex code that you'd have to write, we can represent that, okay, with these logical abstractions, create the applications themselves, and then continue to maintain, add to, modify the application using the exact same structure. You're not now stuck on, now you're stuck with 20,000 lines of code that you have to, that you have to edit. 
You're continuing to run and maintain the application just the way you built it, okay. We've now got to the place in computer science where we can actually do these things. We couldn't do them, you know, 20 years ago with CASE, but we can absolutely do them now. >> So, I'm hearing from a customer internal perspective a lot of operational efficiencies that VANTIQ can drive. Let's look now from a customer's perspective. What are the business impacts you're able to make? You mentioned the word reactive a minute ago when you were talking about applications, but do you have an example where you've, VANTIQ, has enabled a customer, a business, to be more, to be proactive and be able to identify through, you know, complex event processing, what their customers are doing to be able to deliver relevant messages and really drive revenue, drive profit? >> Right, right. So many, you know, so many great examples. And, I mentioned field service a few minutes ago. I've got a lot of clients in that doing this real time field service using these event processing applications. One that I want to bring up right now is one of the largest global shoe manufacturers, actually, that's a client of VANTIQ. I, unfortunately, can't say the name right now 'cause they want to keep what they're doing under wraps, but we all definitely know the company. And they're using this to manage the security, primarily, around their real time global supply chain. So, they've got a big challenge with companies in different countries redirecting shipments of their shoes, selling them on the gray market, at different prices than what are allowed in different regions of the world. 
And so, through both sensorizing the packages, the barcode scanning, the enterprise systems bringing all that data together in real time, they can literally tell in the moment if something is be-- If a package is redirected to the wrong region or if literally a shoe or a box of shoes is being sold where it shouldn't be sold at the wrong price. They used to get a monthly report on the activities and then they would go and investigate what happened last month. Now, their fraud detection manager is literally sitting there getting this in real time, saying, oh, Singapore sold a pallet of shoes that they should not have been able to sell five minutes ago. Call up the guy in Singapore and have him go down and see what's going on and fix that issue. That's pretty powerful when you think about it. >> Definitely, so like reduction in fraud or increase in fraud detection. Sounds like, too, there's a potential for a significant amount of cost savings to the business, not just meeting the external customer needs, but from a, from a cost perspective reduction. Not just some probably TCO, but in operational expenses. >> For sure, although, I would say most of the digital transformation initiatives, when we talk to CEOs and CIOs, they're not focused as much on cost savings, as they're focused on A, avoiding being disrupted by the next interesting startup, B, creating new lines of business, new revenue streams, finding out a way to do something differently dramatically better than they're currently doing it. It's not only about optimizing or squeezing some cost out of their current application. This thing that we are talking about, I guess you could say it's an improvement on their current process, but really, it's actually something they just weren't even really doing before. Just a total different way of doing fraud detection and managing their global supply chain that they just fundamentally weren't even doing. 
And now, of course, they're looking at many other use cases across the company, not just in supply chain, but, you know, smart manufacturing, so many use cases. Your point about savings, though, there's, you know, what value does the application itself bring? Then, there's the question of what does it cost to build and maintain and deploy the application itself, right? And, again, with these new visual development tools, they're not modeling tools, you're literally developing the application visually. You know, I've been in so many scenarios where we talked to large enterprises. You know, we talk about what we're doing, like we talk about right now, and they say, okay, we'd love to do a POC, proof of concept. We want to allocate six months for this POC, like normally you would probably do for building most enterprise applications. And, we inevitably say, well, how about Friday? How about we have the POC done by Friday? And, you know, we get the Germans laugh, you know, laugh uncomfortably and we go away and deliver the POC by Friday because of how much different it is to build applications this way versus writing low level Java or C-sharp code and sticking together a bunch of technologies and tools 'cause we abstract all that away. And, you know, the eyes drop open and the mouth drops open and it's incredible what modern technology can do to radically change how software is being developed. >> Wow, big impact in a short period of time. That's always a nice thing to be able to deliver. >> It is, it is to-- It's great to be able to surprise people like that. >> Exactly, exactly. Well, Blaine, thank you so much for stopping by, sharing what VANTIQ is doing to help companies be disruptive and for sharing those great customer examples. We appreciate your time. >> You're welcome. Appreciate the time. >> And for my co-host, Peter Burris, I'm Lisa Martin. 
You're watching The Cube's continuing coverage of our event, Big Data SV Live from San Jose, down the street from the Strata Data Conference. Stick around, we'll be right back with our next guest after a short break. (techy music)
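The shoe manufacturer example from this conversation, where scans are checked as they arrive instead of in a monthly report, can be sketched in a few lines of Python. This is a toy illustration, not the customer's or VANTIQ's actual logic: the `ScanEvent` type, the package IDs, the region codes, and the price floors are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class ScanEvent:
    package_id: str
    region: str   # where the barcode was scanned
    price: float  # sale price observed at the scan


# Hypothetical reference data: the region each package may be sold in,
# and the minimum allowed price per region.
ALLOWED_REGION: Dict[str, str] = {"pkg-001": "EU", "pkg-002": "SG"}
MIN_PRICE: Dict[str, float] = {"EU": 90.0, "SG": 80.0}


def detect_violations(scans: Iterable[ScanEvent]) -> Iterator[str]:
    """Yield an alert as soon as a scan breaks a rule, one event at a time."""
    for scan in scans:
        expected = ALLOWED_REGION.get(scan.package_id)
        if expected is None:
            yield f"{scan.package_id}: unknown package"
        elif scan.region != expected:
            yield f"{scan.package_id}: scanned in {scan.region}, expected {expected}"
        elif scan.price < MIN_PRICE[scan.region]:
            yield f"{scan.package_id}: sold at {scan.price:.2f}, below the {scan.region} floor"
```

Because `detect_violations` is a generator, each alert is produced as soon as its scan is consumed, which mirrors the "five minutes ago" rather than "last month" framing. Feeding it a live stream instead of a list would not change the function.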

Published Date : Mar 8 2018

SUMMARY :

Brought to you by Silicon Angle Media the CMO of VANTIQ, Blaine Mathieu. So, VANTIQ, you guys are up the street in Walnut Creek. for driving many of the digital transformation that might currently not be able to support and the architectures, as you said a second ago, One of the key things you said, in the context of business. in response to the data. So, an event is kind of like an outside in So, the event is what's happening right now and changes and fixes, again, in the moment, of the old ways was where we had recipe of, you know, And the customer is getting more and more frustrated. they followed to see oh that didn't work, and the application, the output of one application And so, the secret sauce to this whole thing to modify them as you need to, you know, on the fly. So, it used to be we had this thing and so the minute that we deployed, So, the old, you know, the old, Try to sustain just the way you built it, okay. but do you have an example where you've, that they should not have been able to sell to the business, not just meeting and deliver the POC by Friday because to be able to deliver. It's great to be able to surprise people Well, Blaine, thank you so much for stopping by, Appreciate the time. down the street from the Strata Data Conference.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Blaine | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
Singapore | LOCATION | 0.99+
Uber | ORGANIZATION | 0.99+
two years | QUANTITY | 0.99+
Netflix | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
VANTIQ | ORGANIZATION | 0.99+
Blaine Mathieu | PERSON | 0.99+
20,000 lines | QUANTITY | 0.99+
30 minutes | QUANTITY | 0.99+
two | QUANTITY | 0.99+
Silicon Angle Media | ORGANIZATION | 0.99+
52 steps | QUANTITY | 0.99+
Walnut Creek | LOCATION | 0.99+
six months | QUANTITY | 0.99+
Java | TITLE | 0.99+
one degree | QUANTITY | 0.99+
Friday | DATE | 0.99+
second day | QUANTITY | 0.99+
last month | DATE | 0.99+
one second | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
six years ago | DATE | 0.98+
both | QUANTITY | 0.98+
Strata Data Conference | EVENT | 0.98+
Big Data SV Live | EVENT | 0.98+
One | QUANTITY | 0.98+
The Cube | ORGANIZATION | 0.98+
today | DATE | 0.98+
one | QUANTITY | 0.98+
20 years ago | DATE | 0.98+
Big Data SV 2018 | EVENT | 0.97+
six hours ago | DATE | 0.97+
six minutes ago | DATE | 0.97+
five minute ago | DATE | 0.97+
a minute ago | DATE | 0.96+
hundreds of sensors | QUANTITY | 0.95+
The Cube | TITLE | 0.94+
Blockbuster | ORGANIZATION | 0.91+
few minutes ago | DATE | 0.89+
step one | QUANTITY | 0.89+
step three | QUANTITY | 0.85+
Forager Tasting and Eatery | ORGANIZATION | 0.85+
decades | QUANTITY | 0.84+
six minutes | QUANTITY | 0.84+
C | TITLE | 0.83+
Big Data | ORGANIZATION | 0.81+
one location | QUANTITY | 0.78+
one application | QUANTITY | 0.77+
second ago | DATE | 0.71+
CEP | ORGANIZATION | 0.53+
big | ORGANIZATION | 0.52+
Germans | PERSON | 0.51+
techy | ORGANIZATION | 0.41+
Data | EVENT | 0.31+

Matt Maccaux, Dell EMC | Big Data SV 2018


 

>> Male Narrator: Live from San Jose, it's theCube. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCube's continuing coverage of our event, Big Data SV in downtown San Jose. I'm Lisa Martin, my co-host is Dave Vellante. Hey Dave. >> Hey Lisa, how's it going? >> Good. >> Doing a great job here, by the way. >> Well thank you, sir. >> Keeping the trains going. >> Yeah. >> Well done. >> We've had a really interesting couple of days, we started here yesterday interviewing lots of great guys and gals on Big Data and everything in between. Lots of different topics there, opportunities, challenges, digital transformation, how can customers really evolve on this journey? We're excited to welcome back to theCube, one of our distinguished alumni, Matt Maccaux, the Global Big Data Practice Lead from Dell EMC. Welcome back. >> Well thanks for having me, appreciate it, it's a pleasure to be here. >> Yeah, so lots of stuff going on. We've been here, as I mentioned, we're down the street from the Strata Data Conference and we've had a lot of great conversations, very educational, informative. You've been with the whole Dell EMC family for a while now. We'd love to get your perspective on, kind of, what's going on from your team's standpoint. What are you seeing in the enterprises with respect to Big Data and being able to really leverage data across the business as a value driver and a revenue generator? >> Yeah, it's interesting that what we see across the business in terms of, especially in the big enterprises, there, many organizations, even the more mature ones, are still struggling to get that extra dollar, that extra level of monetization out of their data assets. Everyone talks about monetizing data and using data, treating it as an asset, but organizations are struggling with that, not because of the technology, the technology's been put in, they've ramped up their teams, their skills. 
What we tend to see inhibiting this digital transformation growth is process. It's organizational strife and it's not looking to best practices, even within their own organization, where they're doing things like DevOps. So, why would we treat the notion of creating a data model any different than we would regular application development? Well, organizations still carry that weight, that inertia, they still treat Big Data and analytics like they do the data warehouse, and the most effective organizations are starting to incorporate that agile methodology and agile thinking, no snowflakes, infrastructure as code, these concepts of quickly, rapidly, and repeatedly doing these things, those are the organizations that are really starting to pull away from their competitors in industry. So, Dell EMC, our consulting group and our product lines are all there to support that transformation journey by taking those best practices from DevOps and DataOps and bringing that to the analytical space. >> Do you think that companies, Matt, have a pretty good sense as to how applications that they develop are going to affect, create value, creating value is, let's simplify it, increasing revenue or cutting cost? Generally people can predict the impact, they can write a business case around it. My observation was that certainly in the early days of so-called Big Data, people really didn't have an understanding as to the relationship between their data and that value, and so, many companies mistakenly thought, "Well I need to figure out how to sell my data," versus understand how data affects monetization. I wonder if you could comment on that and how has that progressed throughout the years? >> Yeah, that's a good point, we, from a consulting practice, used to do a lot of, what we call, proof of values, where organizations, after they kicked the tires and covered some use cases, we took them through a very slow, methodical business case ROI analysis. 
You're going to spend this much on infrastructure, you're going to hire these people, you're going to take this data, and poof, you're going to make this much money, you're going to save this much money. Well, we're doing less and less of that these days because organizations have a good feel for where they want to go and the potential upside for doing this. Where they now tend to struggle is, "Well, how do I actually get there?" "There's still a lot of tools and a lot of technologies and which is right for my business?" "What is the right process and how do I build that consensus in the organization?" And so, from a business consulting perspective, we're doing less of the ROI work and more of the governance, the sort of, governance work by aligning stakeholders, getting those repeatable patterns and architectures in place to help organizations take that first few wins and then scale it. >> Where do you see the action these days? I mean there's some high-profile use cases, obviously getting people to click on ads, Big Data has helped with that, fraud detection has come such a long way in the last 10 years, ya know, no doubt, certainly risk assessment, ya know, from the financial services industry. Those are the obvious ones, where else do you see Big Data analytics changing the world, if you will? >> Yeah, so I'd say those static or batch-type workloads are well understood. That, hey, is there fraud on transactions that occurred yesterday or last night? What is the customer score, lifetime value score for customer? Where we see more trends in the enterprise space is streaming. So, what can we catch in real time and help our people make real time decisions? So, and that is dealing with unstructured data. So, I've got a call center and I'm listening to the voice that's coming in, putting some sentiment analysis on that and then providing a score or script to the customer call agent in real time. 
And those, sort of, streaming use cases, whether it's images or voice, that, I think, is the next paradigm for use cases that organizations want to tackle. 'Cause if you can prevent a customer from leaving in real time, right, say, you know what, it sounds like you're upset, what if we did X to help retain you, it's going to be significant. All these organizations have a good idea of the cost it takes to acquire a new customer and the cost of losing a customer, so if they can put that intelligence in upstream, they no longer have to spend so much money trying to capture new customers 'cause they can focus on the ones they have. So, I think that, sort of, time between customer and streaming is where the next set of, I think, money's to be found. >> So customer experience is critical for businesses in any organization, I'm wondering, kind of, what the juxtaposition is of businesses going, "Yes, we have to be able "to do things in real time, in enterprise, "we have to be agile, yet we have, in order "to really facilitate a really effective, relevant, "timely customer experience, many departments "and organizations in a business need access to data." From a political perspective, how does Dell EMC, how does your consulting practice help an enterprise be able to start opening up these barriers internally to be able to enable data sharing so that they can drive and take advantage of things like real-time streaming to ultimately improve the customer experience, revenue, et cetera? >> Yeah, it's going to sound really trite, but the first step is getting everyone in a room and talking about what good looks like, what are the low-hanging... And everyone's going to agree on those use cases, they're going to say, "These are the things we have to do," right, "We want to lose fewer customers, we want to..." You know, whatever the case may be, so everyone will agree on that. So, the politics don't come into play there. So, "Well, what data do we require for that?" 
"Okay, well, we've got all this data, great, "no disagreement there." Well, where is the data located? Who's the owner or the steward of that data? And now, who's going to be responsible for monetizing that? And that's where we tend to see the breakdown because when these things cross the line of business and customer always crosses the line of business, you end up with turf wars. And so this, the emergence of the Chief Data Officer, who's responsible for the policy and the prioritization and the ownership of these things is such a key role now, that, and it's not a CIO responsible for data, it is a business aligned executive reporting to the chief, CEO, COO, CFO. Again, business alignment, that tends to be the decision maker or at least the thing that solves for those conflicts across those BUs. And when that happens, then we see real change. But, if there's not that role or that person that can put that line in the sand and say, "This is how we're going to do it," you end up with that political strife and then you end up with silos of information or point solutions across the enterprise and it doesn't serve anyone. >> What are you seeing in terms of that CDO role? I mean, initially the Chief Data Officer was really within regulated businesses, financial services, healthcare, government. And then you've seen it permeate, ya know, to more mainstream. Do you see that role as having legs? A lot of people have questioned that role. What Chief Digital Officer, Chief Data Officer is encroaching on the CIO territory? I'm inferring from your comments that you're optimistic about that role going forward. >> I am, as long as it's well-defined as having unique capabilities that's different than the CIO. Again, I think the first generation of Chief Data Officers were very CIO-liked or CIO-for-data and that's when you ended up with the turf wars. And then it was like, "Okay, well this is "what we're doing." 
But then you had someone who was sort of a peer for infrastructure and so, it just didn't seem to work out. And so, now we're seeing that role being redefined, it's less about the technology and the tools and the infrastructure, and it's more about the policies, the consistency, the architectures. >> You know I'd observe, I wonder if we can talk about this for a little bit, it's the CDO role. To me, one of the first things a CDO has to do is understand how a company gets value out of its data, and if it's a for-profit company, what's the monetization, where does that come from? Not selling the data, as we were talking about earlier. And then there is what data, what data, where are, what data architecture, data sources, how do we give access to that? And then quality, data quality seems to be something that they worry about. And then skills, not, none, no technology in here. And then somehow they're going to form relationships with the line of business and it's simultaneous to figuring that out. Does that seem like a reasonable framework for the CDO's job? >> It does, and you could call them Chief Data Governance Officer, I mean, it really falls under the umbrella of governance. It's about standards and consistency, but also these policies of, there are finite resources, whether we're talking people or compute. What do you do when there's not enough resources and more demand? How do you prioritize the things that the business does? Well, do you have policies and matrices that say, "Okay, well, is it material, actionable, timely?" "Then yes, then we'll proceed with this." "No, it doesn't pass." And it doesn't have to be about money. However the organization judges itself is what it should be based on. 
So, whether we're talking non-profit, we helped a school system recently better align kids with schedules and also learning abilities by sitting them next to each other in classes, there's no profit in that other than the education of children, so every organization judges itself or measures itself a little differently, but it comes back to those KPIs. What are your KPIs, how does that align to business initiatives? And then everything should flow from there. Now, I'm not saying it's easy work. Data governance is the hardest thing to do in this space and that's why I think so few organizations take it on 'cause it's a long, slow process and, ya know, you should've started 10 years ago on it and if you haven't, it feels like this mountain that is really high to climb. >> What you're saying is outcome driven. >> Yeah. >> Independent of the types of organizations. I want to talk about innovation, I've been asking a lot of people this week, do you feel like Big Data, ya know, the meme of Big Data that was created eight, 10 years ago, do you feel like it lived up to its promises? >> That's a loaded question. I think if you were to ask the back office enterprises, I would say yes. In terms of customers feeling it, probably not, because when you use an Uber app to hail a cab and pay $3.75 to go across town, it feels like a quality of life, but you don't know that that's a data-driven decision. As a consumer, your average consumer, you probably don't feel that. As you're clicking through Amazon and they know, sort of, the goods that you need, or the fact that they know what you're going to need and they've got it in a warehouse that they can get to you later that day, it doesn't feel like a Big Data solution, it just feels like, "Hey, the people I'm doing business with, they know me better." People don't really understand that that's a Big Data and analytics concept, so, has it lived up to the hype? 
Externally, I think the perception is that it has not, but the businesses that really get it, feel that absolutely it has. That's 'cause you, do you agree it's kind of bifurcated? >> Matt Maccaux: Yeah, it is. >> The Spotifys and the Ubers and the Airbnbs that are crushing it and then there's a lot of traditional enterprises that are still stovepiped and struggling. >> Yeah, it's funny, when we talk to customers, we've got our introductory PowerPoints, right, it always talks about the new businesses and the old businesses and, and I'm finding that that doesn't play very well anymore with enterprise customers. They're like, "We're never going to be the Uber "of our industry, it's not going to happen "if I'm a Fortune 100 legacy, it's not going to happen." "What I really want to do, though, "is help my customers or make more money here, "I'm not going to be the Uber, it's just not going to happen." "We're not the culture, we're not the, we're not set up "that way, we have all of this technical legacy stuff, "but I really want to get more value out of my data, "how do I do that?" And so that message resonates. >> Isn't that in some ways, though, how do you feel about this, is it a recipe for disruption, where that's not going to happen, but something could happen where somebody digitizes your business? >> Yes, absolutely, if there are organizations, if you're in the Fortune 500 and you are not worried about someone coming along and disrupting you, then you are probably not doing the right job. I would be kept awake every night, whether it was financial services or industrial manufacturing. >> Dave Vellante: Grocery. >> Nobody thought that the taxis, who the hell would come in and disrupt the cab industry? Ya got to hire all these people, the cars are junk, the customer experience is awful. 
Well, someone has come along and there's been an industry related to this, now they have their bumps in the road, so are they going to be disrupted again or what's the next level of disruption? But, I think it is technology that fuels that, but it's also the cultural shift as part of that, which is outside the technologies, the socioeconomic trends that I think drive that, as well. >> But even, ya know, and we've got just a few seconds left, the cultural shift internally. It sounds like, from what you're describing, if an enterprise is going to recognize, "I'm not going to compete with an Uber or an Airbnb or a Netflix, but I've got to be able to compete with my existing peers of enterprise organizations," the CDO role sounds like it's a matter of survivability. >> Yes. >> Without putting that in place, you can't capitalize on the value of data monetized and et cetera. Well guys, I wish we had more time 'cause I think we're opening a can of worms here, but Dave, Matt thanks so much for having this conversation. Thank you for stopping by. >> Thanks for having me here, it was a real pleasure. >> Likewise. We want to thank you for watching theCube. We are continuing our coverage of our event, Big Data SV in downtown San Jose. For Dave Vellante, my co-host, I'm Lisa Martin. Stick around, we'll be right back with our next guest after a short break. (upbeat music)

Published Date : Mar 8 2018

SUMMARY :

brought to you by SilconANGLE Media Welcome back to theCube's continuing coverage by the way. We're excited to welcome back to theCube, it's a pleasure to be here. We'd love to get your perspective on, and bringing that to the analytical space. applications that they develop are going to affect, and more of the governance, the sort of, Those are the obvious ones, where else do you see the cost it takes to acquire a new customer these barriers internally to be able Again, business alignment, that tends to be I mean, initially the Chief Data Officer and the infrastructure, and it's more about To me, one of the first things a CDO has to do Data governance is the hardest thing to do Independent of the types or the fact that they know what you're going to need The Spotify's and the Ubers and the Airbnb's and the old businesses and, and I'm finding then you are probably not doing the right job. their bumps in the road, so are they going to be "or a Netflix, but I've got to be able to compete that in place, you can't capitalize We want to thank you for watching theCube.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Dave | PERSON | 0.99+
Matt Maccaux | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Matt | PERSON | 0.99+
$3.75 | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
Spotify | ORGANIZATION | 0.99+
Lisa | PERSON | 0.99+
Airbnb | ORGANIZATION | 0.99+
yesterday | DATE | 0.99+
Uber | ORGANIZATION | 0.99+
Dell EMC | ORGANIZATION | 0.99+
last night | DATE | 0.99+
SilconANGLE Media | ORGANIZATION | 0.99+
Ubers | ORGANIZATION | 0.99+
first step | QUANTITY | 0.99+
10 years ago | DATE | 0.99+
Netflix | ORGANIZATION | 0.99+
one | QUANTITY | 0.98+
eight, | DATE | 0.97+
this week | DATE | 0.96+
Strata Data Conference | EVENT | 0.96+
Big Data SV 2018 | EVENT | 0.96+
Male Narrator | TITLE | 0.95+
first generation | QUANTITY | 0.94+
Big Data | ORGANIZATION | 0.93+
San Jose | LOCATION | 0.92+
Silicon Valley | LOCATION | 0.87+
theCube | ORGANIZATION | 0.83+
theCube | TITLE | 0.83+
Big Data | TITLE | 0.8+
DevOps | TITLE | 0.79+
last 10 years | DATE | 0.79+
Big Data SV | EVENT | 0.73+
first things | QUANTITY | 0.72+
Live from | TITLE | 0.7+
first few | QUANTITY | 0.7+
days | QUANTITY | 0.64+
CDO | TITLE | 0.62+
500 | QUANTITY | 0.59+
couple | QUANTITY | 0.59+
100 | QUANTITY | 0.55+

Octavian Tanase, NetApp | Big Data SV 2018


 

>> Announcer: Live from San Jose it's The Cube presenting Big Data, Silicon Valley brought to you by SiliconANGLE Media and its ecosystem partners. >> Good morning. Welcome to The Cube. We are on day two of our coverage of our event, Big Data SV. I'm Lisa Martin with my cohost Dave Vellante. We're down the street from the Strata Data Conference. This is The Cube's tenth big data event and we had a great day yesterday learning a lot from myriad guests on very different nuances of big data journey where things are going. We're excited to welcome back to The Cube an alumni, Octavian Tanase, the Senior Vice President of Data ONTAP from NetApp. Octavian, welcome back to The Cube. >> Glad to be here. >> So you've been at the Strata Data Conference for the last couple of days. From a big data perspective, what are some of the things that you're hearing, in terms of from a customer's perspective on what's working, what challenges, opportunities? >> I'm very excited to be here and learn about the innovation of our partners in the industry and share with our partners and our customers what we're doing to enable them to drive more value out of that data. The reality is that data has become the 21st Century gold or oil that powers the business and everybody's looking to apply new techniques, a lot of times machine learning, deep learning, to draw more value out of the data, make better decisions and compete in the marketplace. >> Octavian, you've been at NetApp now eight years and I've been watching NetApp, as we were talking about offline, for decades and I've seen the ebb and flow and this company has transformed many, many times. The latest, obviously cloud came in, flash came into play and then you're also going through a major transition in the customer base to clustered ONTAP. You seemed to negotiate that. NetApp is back, thriving, stock's up. What's happening at NetApp? What's the culture like these days? Give us the update. 
>> I think we've been very fortunate to have a CEO like George Kurian, who has been really focused on helping us do basically fewer things better, really focus on our core business, simplify our operations and continue to innovate and this is probably the area that I'm most excited about. It's always good to make sure that you accelerate the business, make it simpler for your customers and your partners to do business with you, but what you have to do is innovate. We are a product company. We are passionate about innovation. I believe that we are innovating with more pace than many of the startups in the space so that's probably the most exciting thing that has been part of our transformation. >> So let's talk about big data. Back in the day if you had a big data problem you would buy a big Unix box, maybe buy some Oracle licenses, try to put all your data into that box and that became your data warehouse. The brilliance of Hadoop was hey we can leave the data where it is. There's too much data to put into the box so we're going to bring five megabytes of code to a petabyte of data. And the other piece of it is CFOs loved it, because we're going to reduce the cost of our expensive data warehouse and we're going to buy off-the-shelf components: white box servers and off-the-shelf disk drives. We're going to put that together and life will be good. Well as things matured, the old client-server days, it got very expensive, you needed enterprise grade. So where does NetApp fit into that equation, because originally big storage companies like NetApp, they weren't part of the equation? Has that changed? >> Absolutely. One of the things that has enabled that transformation, that change is we made a deliberate decision to focus on software defined and making sure that the ONTAP operating system is available wherever data is being created: on the edge in an IoT device, in the traditional data center or in the cloud. 
So we are in the unique position to enable analytics, big data, wherever those applications reside. One of the things that we've recently done is we've partnered with IDC and what the study, what the analysis has shown is that deploying an analytics, Hadoop or NoSQL type of solution on top of NetApp is half the cost of DAS. So when you consider the cost of servers, the licenses that you're going to have to pay for, these commercial implementations of Hadoop as well as the storage and the data infrastructure, you are much better off choosing NetApp than a white box type of solution. >> Let's unpack that a little bit, because if I infer correctly from what you said normally you would say the operational costs are going to be dramatically lower, it's easier to manage a professional system like a NetApp ONTAP, it's integrated, great software, but am I hearing you correctly, you're saying the acquisition costs are actually less than if I'm buying white box? A lot of people are going to be skeptical about that, say Octavian no way, it's cheaper to buy white box stuff. Defend that statement. >> Absolutely. If you're looking at the whole solution that includes the server and the storage, what NetApp enables you to do if you're running the solution on top of ONTAP you reduce the need for so many servers. If you reduce that number you also reduce the licensing cost. Moreover, if you actually look at the core value proposition of the storage layer there, DAS typically makes three copies of the data. We don't. We are very greedy and we're making sure that you're using shared storage and we are applying a bunch of storage efficiency techniques to further compress, compact that data for world class storage efficiency. >> So cost efficiency is obviously a great benefit for any company when they're especially evolving, from a digital perspective. What are some of the business level benefits? You mentioned speed a minute ago. 
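Illustrative sketch (not part of the conversation): the "three copies" point above can be turned into simple capacity arithmetic. The dataset size and the 2:1 data-reduction ratio below are hypothetical values chosen only to show the calculation; they do not come from the interview or from the IDC study, and the sketch ignores RAID or parity overhead on the shared-storage side.

```python
def raw_capacity_tb(usable_tb, copies=1, efficiency_ratio=1.0):
    """Raw capacity (TB) needed to hold usable_tb of data.

    copies: number of full copies kept (3 models the "three copies"
        mentioned for DAS in the conversation).
    efficiency_ratio: assumed data-reduction ratio from compression and
        compaction (1.0 means no reduction). Hypothetical parameter.
    """
    return usable_tb * copies / efficiency_ratio


# Hypothetical 100 TB dataset.
das_tb = raw_capacity_tb(100, copies=3)
shared_tb = raw_capacity_tb(100, copies=1, efficiency_ratio=2.0)

print(f"DAS with three copies: {das_tb} TB raw")
print(f"Shared storage, assumed 2:1 reduction: {shared_tb} TB raw")
```

Under these assumptions the replicated layout needs six times the raw capacity, which is only one input to the overall cost comparison discussed above (servers and licenses are the others).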
What is Data ONTAP and even ONTAP in the cloud enabling your enterprise customers to achieve at the business level, maybe from faster time to market, identifying with machine learning and AI new products? Give me an example of maybe a customer that you think really articulates the value that ONTAP in the cloud can deliver. >> One of the things that's really important is to have your data management capability, wherever the data is being produced, so ONTAP being consumed either as a VM or a service ... I don't know if you've seen some of the partnerships that we have with AWS and Azure. We're able to offer the same rich data management capabilities, not only in the traditional data center, but in the cloud. What that really enables customers to do is to simplify and have the same operating system, the same data management platform for both the second platform traditional applications as well as for the third platform applications. I've seen a company like Adobe be very successful in deploying their infrastructure, their services not only on prem in their traditional data center, but using ONTAP Cloud. So we have more than about 1,500 customers right now that have adopted ONTAP in the AWS cloud. >> What are you seeing in terms of the adoption of flash and I'm particularly interested in the intersection of flash adoption and the developer angle, because we've seen, in certain instances, certain organizations are able to share data off of flash much more efficiently than you would be, for instance, off of a spinning disk? Have you seen a developer impact in your customer base? >> Absolutely I think most customers initially have adopted flash, because of high throughput and low latency. I think over time customers really understood and identified with the overall value proposition in cost of ownership in flash that it enables them to consolidate multiple workloads in a smaller footprint. 
So that enables you to then reduce the cost to operate that infrastructure and it really gives you a range of applications that you can deploy that you were never able to do that. Everybody's looking to do in place, in line analytics that now are possible, because of this fast media. Folks are looking to accelerate old applications in which they cannot invest anymore, but they just want to run faster. Flash also tends to be more reliable than traditional storage, so customers definitely appreciate that fewer things could go wrong so overall the value proposition of flash, it's all encompassing and we believe that in the near future flash will be the defacto standard in everybody's data center, whether it's on prem or in the cloud. >> How about backup and recovery in big data? We obviously, in the enterprise, very concerned about data protection. What's similar in big data? What's different and what's NetApp's angle on that? >> I think data protection and data security will never stop being important to our customers. Security's top of mind for everybody in the industry and it's a source of resume changing events, if you would, and they're typically not promotions. So we have invested a tremendous deal in certifications for HIPAA, for FIPS, we are enabling encryption, both at rest and in flight. We've done a lot of work to make sure that the encryption can happen in software layer, to make sure that we give the customers best storage class efficiency and what we're also leveraging is the innovation that ONTAP has done over many years to protect the data, replicate its snapshots, peering the data to the cloud. These are techniques that we're commonly using to reduce the cost of ownership, also protect the data the customers deploy. >> So security's still a hot topic and, like you said, it probably always will be, but it's a shared responsibility, right? So customers leveraging NetApps safe or on prem hybrid also using Azure or AWS, who's your target audience? 
If you're talking to the guys and gals that are still managing storage are you also having the CSO or the security guys come in, the gals, to understand we've got this deployment in Azure or AWS so we're going to bring in ONTAP to facilitate this? There's a shared responsibility of security. Who's at the table, from your perspective, in your customers that you need to help understand how they facilitate true security? >> It's definitely been a transformative event where more and more people in IT organizations are involved in the decisions that are required to deploy the applications. There was a time when we would talk only to the storage admin. After a while we started talking to the application admin, the virtualization admin and now you're talking to the line of business who has that vested interest to make sure that they can harness the power of the data in their environment. So you have the CSO, you have the traditional infrastructure people, you have the app administration and you have the app owner, the business owner that are all at the table that are coming and looking to choose the best of breed solution for their data management. >> What are the conversations like with your CXO, executives? Everybody talks about digital transformation. It's kind of an overused term, but there's real substance when you actually peel the onion. What are you seeing as NetApp's role in effecting digital transformations within your customer base? >> I think we have a vision of how we can help enterprises take advantage of the digital transformation and adopt it. I think we have three tenets of that vision. Number one is we're helping customers harness the power of the cloud. Number two, we're looking to enable them to future proof their investments and build the next generation data center. And number three, nobody starts with a fresh slate so we're looking to help customers modernize their current infrastructure through storage. We have a lot of expertise in storage. 
We've helped, over time, customers time and again adopt disruptive technologies in nondisruptive ways. We're looking to adopt these technologies and trends on behalf of our customers and then help them use them in a seamless safe way. >> And continue their evolution to identify new revenue streams, new products, new opportunities and even probably give other lines of business access to this data that they need to understand is there value here, how can we harness it faster than our competitors, right? >> Absolutely. It's all about deriving value out of the data. I think earlier I called it the gold of the 21st Century. This is a trend that will continue. I believe there will be no enterprise or center that won't focus on using machine learning, deep learning, analytics to derive more value out of the data to find more customer touch points, to optimize their business to really compete in the marketplace. >> Data plus AI plus cloud economics are the new innovation drivers of the next 10, 20 years. >> Completely agree. >> Well Octavian thanks so much for spending time with us this morning sharing what's new at NetApp, some of the visions that you guys have and also some of the impact that you're making with customers. We look forward to having you back on the program in the near future. >> Thank you. Appreciate having the time. >> And for my cohost Dave Vellante I'm Lisa Martin. You're watching The Cube live on day two of coverage of our event, Big Data SV. We're at this really cool venue, Forager Tasting Room. Come down here, join us, get to hear all these great conversations. Stick around and we'll be right back with our next guest after a short break. (electronic music)

Published Date : Mar 8 2018

SUMMARY :

brought to you by SiliconANGLE Media We're down the street from the Strata Data Conference. in the customer based to clustered ONTAP. that you accelerate the business, Back in the day if you had a big data problem and making sure that the ONTAP operating system A lot of people are going to be skeptical about that, that includes the server and the storage, that ONTAP in the cloud can deliver. that have adopted ONTAP in the AWS cloud. to operate that infrastructure and it really gives you We obviously, in the enterprise, peering the data to the cloud. that you need to help understand that are required to deploy the applications. What are the conversations like with your CXO, executives? and build the next generation data center. out of the data to find more customer touch points, are the new innovation drivers of the next 10, 20 years. We look forward to having you back on the program Appreciate having the time. get to hear all these great conversations.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
George Kurian | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Octavian Tanase | PERSON | 0.99+
Adobe | ORGANIZATION | 0.99+
Octavian | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
eight years | QUANTITY | 0.99+
San Jose | LOCATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
NetApp | TITLE | 0.99+
Hadoop | TITLE | 0.99+
five megabytes | QUANTITY | 0.99+
Oracle | ORGANIZATION | 0.99+
second platform | QUANTITY | 0.99+
21st Century | DATE | 0.99+
HIPAA | TITLE | 0.99+
Strata Data Conference | EVENT | 0.99+
yesterday | DATE | 0.99+
ONTAP | TITLE | 0.99+
The Cube | TITLE | 0.99+
IDC | ORGANIZATION | 0.98+
both | QUANTITY | 0.98+
One | QUANTITY | 0.98+
Unix | COMMERCIAL_ITEM | 0.98+
NetApp | ORGANIZATION | 0.97+
The Cube | ORGANIZATION | 0.97+
Silicon Valley | LOCATION | 0.96+
ONTAP Cloud | TITLE | 0.95+
more than about 1,500 customers | QUANTITY | 0.95+
NetApps | TITLE | 0.93+
Big Data SV | EVENT | 0.93+
Big Data SV 2018 | EVENT | 0.93+
day two | QUANTITY | 0.93+
Forager Tasting Room | LOCATION | 0.88+
NoSQL | TITLE | 0.87+
Azure | ORGANIZATION | 0.86+
third platform applications | QUANTITY | 0.81+
a minute ago | DATE | 0.81+
Number two | QUANTITY | 0.8+
Senior Vice President | PERSON | 0.79+
three tenants | QUANTITY | 0.78+
decades | QUANTITY | 0.74+
a petabyte of data | QUANTITY | 0.73+
tenth big | QUANTITY | 0.71+
Number one | QUANTITY | 0.71+
three copies | QUANTITY | 0.7+
this morning | DATE | 0.69+
number three | QUANTITY | 0.68+
ONTAP | ORGANIZATION | 0.67+
Data ONTAP | ORGANIZATION | 0.64+
event | QUANTITY | 0.64+
Net App | TITLE | 0.64+
10 | QUANTITY | 0.64+
half | QUANTITY | 0.6+
flash | TITLE | 0.58+
much | QUANTITY | 0.58+
Big Data | EVENT | 0.57+
years | QUANTITY | 0.55+

Sastry Malladi, FogHorn | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partner. (upbeat electronic music) >> Welcome back to The Cube. I'm Lisa Martin with George Gilbert. We are live at our event, Big Data SV, in downtown San Jose down the street from the Strata Data Conference. We're joined by a new guest to theCUBE, Sastry Malladi, the CTO of FogHorn. Sastry, welcome to theCUBE. >> Thank you, thank you, Lisa. >> So FogHorn, cool name, what do you guys do, who are you? Tell us all that good stuff. >> Sure. We are a startup based in Silicon Valley right here in Mountain View. We started about three years ago, three plus years ago. We provide edge computing intelligence software for edge computing or fog computing. That's how our company name got started is FogHorn. Particularly for our industrial IoT sector. All of the industrial guys, whether it's transportation, manufacturing, oil and gas, smart cities, smart buildings, any of those different sectors, they use our software to predict failure conditions in real time, or do condition monitoring, or predictive maintenance, any of those use cases and successfully save a lot of money. Obviously in the process, you know, we get paid for what we do. >> So Sastry... GE popularized this concept of IIoT and the analytics and, sort of the new business outcomes you could build on it, like Power by the Hour instead of selling a jet engine. >> Sastry: That's right. >> But there's... Actually we keep on, and David Floyer did some pioneering research on how we're going to have to do a lot of analytics on the edge for latency and bandwidth. What's the FogHorn secret sauce that others would have difficulty with on the edge analytics? >> Okay, that's a great question. Before I directly answer the question, if you don't mind, I'll actually even describe why that's even important to do that, right? 
So a lot of these industrial customers, if you look at, because we work with a lot of them, the amount of data that's produced from all of these different machines is terabytes to petabytes of data, it's real. And it's not just the traditional digital sensors but there are video, audio, acoustic sensors out there. The amount of data is humongous, right? It's not even practical to send all of that to a Cloud environment and do data processing, for many reasons. One is obviously the connectivity, bandwidth issues, and all of that. But the two most important things are cyber security. None of these customers actually want to connect these highly expensive machines to the internet. That's one. The second is the lack of real-time decision making. What they want to know, when there is a problem, they want to know before it's too late. We want to notify them that a problem is occurring so that they have a chance to go fix it and optimize their asset that is in question. Now, existing solutions do not work in this constrained environment. That's why FogHorn had to invent that solution. >> And tell us, actually, just to be specific, how constrained an environment you can operate in. >> We can run in about less than 100 to 150 megabytes of memory, single-core to dual-core of CPU, whether it's an ARM processor, an x86 Intel-based processor, almost literally no storage because we're a real-time processing engine. Optionally, you could have some storage if you wanted to store some of the results locally there but that's the kind of environment we're talking about. Now, when I say 100 megabytes of memory, it's like a quarter of a Raspberry Pi, right? And even in that environment we have customers that run dozens of machine learning models, right? And we're not talking -- >> George: Like an ensemble. >> Like an anomaly detection, a regression, a random forest, or a clustering, or a gamut, some of those. 
Now, if we get into more deep learning models, like image processing and neural net and all of that, you obviously need a little bit more memory. But what we have shown, we could still run, one of our largest smart city buildings customer, elevator company, runs in a Raspberry Pi on millions of elevators, right? Dozens of machine learning algorithms on top of that, right? So that's the kind of size we're talking about. >> Let me just follow up with one question on the other thing you said, with, besides we have to do the low-latency locally. You said a lot of customers don't want to connect these brownfield, I guess, operations technology machines to the internet, and physically, I mean there was physical separation for security. So it's like security, Bill Joy used to say "Security by obscurity." Here it's security by -- >> Physical separation, absolutely. Tell me about it. I was actually coming from, if you don't mind, last week I was in Saudi Arabia. One of the oil and gas plants where we deployed our software, you have to go to five levels of security even to get there. It's a multibillion dollar plant and refining the gas and all of that. Completely offline, no connectivity to the internet, and we installed, in their existing small box, our software, connected to their live video cameras that are actually measuring the stuff, doing the processing and detecting the specific conditions that we're looking for. >> That's my question, which was if they want to be monitoring. So there's like one low level, really low hardware low level, the sensor feeds. But you could actually have a richer feed, which is video and audio, but how much of that, then, are you doing the, sort of, inferencing locally? Or even retraining, and I assume that since it's not the OT device, and it's something that's looking at it, you might be more able to send it back up the Cloud if you needed to do retraining? >> That's exactly right. 
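Illustrative sketch (not part of the conversation): the idea of real-time anomaly detection with a small memory footprint and almost no storage can be shown with a constant-memory streaming detector. This uses Welford's online mean/variance algorithm as a stand-in technique; it is not FogHorn's method, and the class name, threshold, warm-up length and sample readings are all hypothetical.

```python
import math


class StreamingAnomalyDetector:
    """Flags readings far from the running mean using O(1) memory.

    Running mean and variance are maintained with Welford's online
    algorithm, so no history of readings is stored.
    """

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # allowed distance, in standard deviations
        self.warmup = warmup        # readings to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value is anomalous relative to earlier readings."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) > self.threshold * std:
                anomalous = True
        # Fold the reading into the running statistics
        # (anomalies are included too, to keep the sketch simple).
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return anomalous


detector = StreamingAnomalyDetector()
readings = [20.0, 20.1, 19.9, 20.2, 19.8, 20.0, 20.1, 19.9, 20.0, 20.1, 35.0, 20.0]
flags = [detector.update(r) for r in readings]
print(flags)
```

The detector keeps three numbers per sensor regardless of how long it runs, which is the property that matters on a device with around 100 megabytes of memory.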
So the way the model works is particularly for image processing because you need, it's a more complex process to train and create a model. You could create a model offline, like in a GPU box, an FPGA box and whatnot. Import and bring the model back into this small little device that's running in the plant, and now the live video data is coming in, the model is inferencing the specific thing. Now there are two ways to update and revise the model: incremental revision of the model, you could do that if you want, or you can send the results to a central location. Not internet, they do have local, in this example for example a PIDB, an OSS PIDB, or some other local service out there, where you have an opportunity to gather the results from each of these different locations and then consolidate and retrain the model, put the model back again. >> Okay, the one part that I didn't follow completely is... If the model is running ultimately on the device, again and perhaps not even on a CPU, but a programmable logic controller. >> It could, even though programmable controllers also typically have some shape of CPU there as well. These days, most of the PLCs, programmable controllers, have either an ARM-based processor or an x86-based processor. We can run either one of those too. >> So, okay, assume you've got the model deployed down there, for the, you know, local inferencing. Now, some retraining is going to go on in the Cloud, where you have, you're pulling in the richer perspective from many different devices. How does that model get back out to the device if it doesn't have the connectivity between the device and the Cloud? >> Right, so if there's strictly no connectivity, so what happens is once the model is regenerated or retrained, they put a model in a USB stick, it's a low attack. USB stick, bring it to the PLC device and upload the model. >> George: Oh, so this is sort of how we destroyed the Iranian centrifuges. >> That's exactly right, exactly right. 
But you know, some other environments, even though it's not connectivity to the Cloud environment, per se, but the devices have the ability to connect to the Cloud. Optionally, they say, "Look, I'm the device that's coming up, do you have an upgraded model for me?" Then it can pull the model. So in some of the environments it's super strict where there is absolutely no way to connect this device, you put it in a USB stick and bring the model back here. Other environments, device can query the Cloud but Cloud cannot connect to the device. This is a very popular model these days because, in other words imagine this, an elevator sitting in a building, somebody from the Cloud cannot reach the elevator, but an elevator can reach the Cloud when it wants to. >> George: Sort of like a jet engine, you don't want the Cloud to reach the jet engine. >> That's exactly right. The jet engine can reach the Cloud if it wants to, when it wants to, but the Cloud cannot reach the jet engine. That's how we can pull the model. >> So Sastry, as a CTO you meet with customers often. You mentioned you were in Saudi Arabia last week. I'd love to understand how you're leveraging and engaging with customers to really help drive the development of FogHorn, in terms of being differentiated in the market. What are those, kind of bi-directional, symbiotic customer relationships like? And how are they helping FogHorn? >> Right, that's actually a great question. We learn a lot from customers because we started a long time ago. We did an initial version of the product. As we begin to talk to the customers, particularly that's part of my job, where I go talk to many of these customers, they give us feedback. Well, my problem is really that I can't even do, I can't even give you connectivity to the Cloud, to upgrade the model. I can't even give you sample data. How do you do that modeling, right? 
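Illustrative sketch (not part of the conversation): the two update paths described above, a device that asks the Cloud for a newer model while the Cloud never initiates a connection, and a fully offline device that receives a model from removable media, can be modeled as below. Every class and method name is hypothetical and the registry is an in-memory stand-in; nothing here represents FogHorn's actual software or any real API.

```python
from dataclasses import dataclass


@dataclass
class ModelRelease:
    version: int
    payload: bytes


class ModelRegistry:
    """Stand-in for a cloud-side service. It only answers requests;
    it holds no reference to any device and never calls out to one."""

    def __init__(self):
        self._latest = None

    def publish(self, release):
        self._latest = release

    def latest_after(self, version):
        """Return the newest release if it is newer than version, else None."""
        if self._latest is not None and self._latest.version > version:
            return self._latest
        return None


class EdgeDevice:
    def __init__(self, registry=None):
        self.registry = registry  # None models a fully air-gapped device
        self.model_version = 0
        self.model = b""

    def _install(self, release):
        self.model = release.payload
        self.model_version = release.version

    def check_for_update(self):
        """Device-initiated pull: ask whether an upgraded model exists."""
        if self.registry is None:
            return False
        release = self.registry.latest_after(self.model_version)
        if release is None:
            return False
        self._install(release)
        return True

    def load_from_media(self, release):
        """Offline path: a model carried to the device on removable media."""
        if release.version > self.model_version:
            self._install(release)
            return True
        return False


registry = ModelRegistry()
elevator = EdgeDevice(registry)
before_publish = elevator.check_for_update()   # nothing published yet
registry.publish(ModelRelease(1, b"weights-v1"))
after_publish = elevator.check_for_update()    # device pulls version 1
repeat_check = elevator.check_for_update()     # already up to date

air_gapped = EdgeDevice()
sideloaded = air_gapped.load_from_media(ModelRelease(1, b"weights-v1"))
```

The asymmetry is in the references: the device holds a handle to the registry, but the registry has no way to address a device, which mirrors "the elevator can reach the Cloud, but the Cloud cannot reach the elevator."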
And sometimes they say, "You know what, we are not technical people, help us express the problem, the outcome, give me tools that help me express that outcome." So we created a bunch of what we call OT tools, operational technology tools. How we distinguish ourselves in this process, from the traditional Cloud-based vendor, the traditional data science and data analytics companies, is that they think in terms of computer scientists, computer programmers, and expressions. We think in terms of industrial operators, what can they express, what do they know? They don't really necessarily care about, when you tell them, "I've got an anomaly detection data science machine learning algorithm", they're going to look at you like, "What are you talking about? I don't understand what you're talking about", right? You need to tell them, "Look, this machine is failing." What are the conditions in which the machine is failing? How do you express that? And then we translate that requirement, or that into the underlying models, underlying VEL expressions, VEL or CEP expression language. So we learned a ton from user interface, capabilities, latency issues, connectivity issues, different protocols, a number of things that we learn from customers. >> So I'm curious with... More of the big data vendors are recognizing data in motion and data coming from devices. And some, like Hortonworks DataFlow NiFi has a MiNiFi component written in C++, really low resource footprint. But I assume that that's really just a transport. It's almost like a collector and that it doesn't have the analytics built in -- >> That's exactly right, NiFi has the transport, it has the real-time transport capability for sure. What it does not have is this notion of that CEP concept. How do you combine all of the streams, everything is a time series data for us, right, from the devices. Whether it's coming from a device or whether it's coming from another static source out there. 
How do you express a pattern, a recognition pattern definition, across these streams? That's where our CEP comes in the picture. A lot of these seemingly similar software capabilities that people talk about, don't quite exactly have, either the streaming capability, or the CEP capability, or the real-time, or the low footprint. What we have is a combination of all of that. >> And you talked about how everything's time series to you. Is there a need to have, sort of an equivalent time series database up in some central location? So that when you subset, when you determine what relevant subset of data to move up to the Cloud, or you know, on-prem central location, does it need to be the same database? >> No, it doesn't need to be the same database. It's optional. In fact, we do ship a local time series database at the edge itself. If you have a little bit of a local storage, you can down sample, take the results, and store it locally, and many customers actually do that. Some others, because they have their existing environment, they have some Cloud storage, whether it's Microsoft, it doesn't matter what they use, we have connectors from our software to send these results into their existing environments. >> So, you had also said something interesting about your, sort of, tool set, as being optimized for operations technology. So this is really important because back when we had the Net-Heads and the Bell-Heads, you know it was a cultural clash and they had different technologies. 
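Illustrative sketch (not part of the conversation): a pattern definition across time-series streams, and downsampling results before storing them locally, can be shown in a few lines. This is plain Python, not VEL or any FogHorn API; the sensor names, limits, window length and sample events are hypothetical.

```python
from collections import deque


def detect_pattern(events, window_s=10.0, temp_limit=90.0, vib_limit=5.0):
    """Yield timestamps at which a high temperature reading and a high
    vibration reading have both occurred within window_s seconds.

    events: iterable of (timestamp_s, sensor_name, value) tuples, ordered
    by time and merged from two sensor streams.
    """
    recent_hot = deque()  # timestamps of recent over-limit temperature readings
    recent_vib = deque()  # timestamps of recent over-limit vibration readings
    for ts, sensor, value in events:
        # Expire readings that have fallen out of the time window.
        for queue in (recent_hot, recent_vib):
            while queue and ts - queue[0] > window_s:
                queue.popleft()
        if sensor == "temperature" and value > temp_limit:
            recent_hot.append(ts)
            if recent_vib:
                yield ts
        elif sensor == "vibration" and value > vib_limit:
            recent_vib.append(ts)
            if recent_hot:
                yield ts


def downsample(samples, bucket_s=60.0):
    """Average (timestamp_s, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start_s, mean_value) sorted by time.
    """
    buckets = {}
    for ts, value in samples:
        start = (ts // bucket_s) * bucket_s
        total, n = buckets.get(start, (0.0, 0))
        buckets[start] = (total + value, n + 1)
    return [(start, total / n) for start, (total, n) in sorted(buckets.items())]


events = [
    (0.0, "temperature", 85.0),
    (2.0, "vibration", 6.1),
    (30.0, "temperature", 95.0),
    (34.0, "vibration", 5.8),
    (60.0, "vibration", 7.0),
]
alerts = list(detect_pattern(events))
summary = downsample([(0.0, 1.0), (30.0, 3.0), (60.0, 5.0)])
```

Only the downsampled summary and the alert timestamps would need to be stored or forwarded, which is the kind of subset selection the question about a central time-series database is getting at.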
Even if the boss says, "Go ahead and go do this project", if the operator on the floor doesn't understand what you're talking about, because that person is in charge of operating that machine, it doesn't quite work like that. So you need to work bottom up as well, to convince them that you are indeed actually solving their pain point. So the way we start, rather than trying to tell them what capabilities we have as a product, or what we're trying to do, the first thing we ask is what is their pain point? "What's your problem? What is the problem you're trying to solve?" Some customers say, "Well I've got yield, a lot of scrap. Help me reduce my scrap. Help me to operate my equipment better. Help me predict these failure conditions before it's too late." That's how the problem starts. Then we start asking them, "Okay, what kind of data do you have, what kind of sensors do you have? Typically, do you have information about under what circumstances you have seen failures versus not seeing failures out there?" So in the process of interrogation we begin to understand how they might actually use our software and then we tell them, "Well, here, use our software to predict that." And, sorry, I want 30 more seconds on that. The other thing is that, typically in an IT environment, because I came from that too, I've been in this position for 30 plus years, IT, OT and all of that, we don't right away talk about CEP, or expressions, or analytics, we don't talk about that. We talk about, look, you have this bunch of sensors, we have OT tools here, drag and drop your sensors, express the outcome that you're trying to look for, what is the outcome you're trying to look for, and then we derive behind the scenes what it means. Is it analytics, is it machine learning, is it something else, and what is it? So that's kind of how we approach the problem. 
Of course, sometimes you do, surprisingly, occasionally run into very technical people. With those people we can right away talk about, "Hey, you need these analytics, you need to use machine learning, you need to use expressions" and all of that. That's kind of how we operate. >> One thing, you know, that's becoming clearer is I think this widespread recognition that there's data intensive and low latency work to be done near the edge. But what goes on in the Cloud is actually closer to simulation and high-performance compute, if you want to optimize a model. So not just train it, but maybe have something that's prescriptive that says, you know, here's the actionable information. As more of your data is video and audio, how do you turn that into something where you can simulate a model, that tells you the optimal answer? >> Right, so this is actually a good question. From our experience, there are models that require a lot of data, for example, video and audio. There are some other models that do not require a lot of data for training. I'll give you an example of the customer use cases that we have. There's one customer in a manufacturing domain, where they've been seeing a lot of finished goods failures, there's a lot of scrap and the problem then was, "Hey, predict the failures, reduce my scrap, save the money", right? Because they've been seeing a lot of failures every single day, we did not need a lot of data to train and create a model for that. So, in fact, we just needed one hour's worth of data. We created a model, put the thing in, and we have reduced, completely eliminated, their scrap. There are other kinds of models, video models, where we can't do that at the edge, so we require, for example, some video files or simulated audio files, take them to an offline model, create the model, and see whether it's accurately predicting based on the real-time video coming in or not. So it's a mix of what we're seeing between those two. 
>> Well Sastry, thank you so much for stopping by theCUBE and sharing what it is that you guys at FogHorn are doing, what you're hearing from customers, how you're working together with them to solve some of these pretty significant challenges. >> Absolutely, it's been a pleasure. Hopefully this was helpful, and yeah. >> Definitely, very educational. We want to thank you for watching theCUBE, I'm Lisa Martin with George Gilbert. We are live at our event, Big Data SV in downtown San Jose. Come stop by Forager Tasting Room, hang out with us, learn as much as we are about all the layers of big data digital transformation and the opportunities. Stick around, we will be back after a short break. (upbeat electronic music)
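
The two edge techniques Sastry describes in this conversation, down-sampling time series locally and expressing a pattern across several sensor streams, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not Vel and not FogHorn's API, and the function names, sensor names, and thresholds are all hypothetical.

```python
from collections import deque
from statistics import mean


def downsample(readings, bucket_seconds):
    """Average (timestamp, value) readings into fixed-width time buckets.

    Hypothetical helper: stands in for the idea of shrinking raw sensor data
    at the edge before storing it locally or forwarding it to the cloud.
    """
    buckets = {}
    for ts, value in readings:
        bucket_start = ts - (ts % bucket_seconds)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def detect_failure_pattern(events, temp_limit, vib_limit, window_seconds):
    """Return timestamps where a hot reading and a high-vibration reading
    occur within window_seconds of each other.

    events: iterable of (timestamp, sensor_name, value), ordered by timestamp.
    The sensor names "temperature" and "vibration" are assumptions.
    """
    recent_hot = deque()  # timestamps of recent over-limit temperature readings
    recent_vib = deque()  # timestamps of recent over-limit vibration readings
    alerts = []
    for ts, sensor, value in events:
        # Drop readings that have fallen out of the time window.
        for queue in (recent_hot, recent_vib):
            while queue and ts - queue[0] > window_seconds:
                queue.popleft()
        if sensor == "temperature" and value > temp_limit:
            recent_hot.append(ts)
            if recent_vib:
                alerts.append(ts)
        elif sensor == "vibration" and value > vib_limit:
            recent_vib.append(ts)
            if recent_hot:
                alerts.append(ts)
    return alerts
```

The point of the sketch is the shape of the problem: the operator states a condition ("the machine is hot and shaking at about the same time"), and the software turns it into a rule evaluated over merged, time-ordered streams.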

Published Date : Mar 8 2018

Daniel Raskin, Kinetica | Big Data SV 2018


 

>> Narrator: Live, from San Jose, it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (mellow electronic music) >> Welcome back to theCUBE, on day two of our coverage of our event, Big Data SV. I'm Lisa Martin, my co-host is Peter Burris. We are down the street from the Strata Data Conference, we had a great day yesterday, and a great morning already, really learning and peeling back the layers of big data, challenges, opportunities, next generation. We're welcoming back to theCUBE an alumnus, the CMO of Kinetica, Dan Raskin. Hey Dan, welcome back to theCUBE. >> Thank you, thank you for having me. >> So, I'm a messaging girl, look at your website, the insight engine for the extreme data economy. Tell us about the extreme data economy, and what is that, what does it mean for your customers? >> Yeah, so it's a great question, and, from our perspective, we sit, we're here at Strata, and you see all the different vendors kind of talking about what's going on, and there's a little bit of word spaghetti out there that makes it really hard for customers to think about how big data is affecting them today, right? And so, what we're actually looking at is the idea that the world's changed. That big data from five years ago doesn't necessarily address all the use cases today. If you think about what customers are going through, you have more users, devices, and things coming on, there's more data coming back than ever before, and it's not just about creating the data-driven business, and building these massive data lakes that turn into data swamps, it's really about how do you create the data-powered business. 
So when we're using that term, we're really trying to call out that the world's changed, that, in order for businesses to compete in this new world, they have to think about how to take data and create core IP that differentiates, how do I use it to affect the omnichannel, how do I use it to deal with new things in the realm of banking and Fintech, how do I use it to protect myself against disruption in telco, and so, the extreme data economy is really this idea that you have business in motion, more things coming online than ever before, how do I create a data strategy, where data is infused in my business, and creates core IP that helps me maintain category leadership or grow. >> So as you think about that challenge, there's a number of technologies that come into play. Not least of which is the industry, while it's always to a degree been driven by what hardware can do, that's moderated a bit over time, but today, in many respects, a lot of what is possible is made possible by what hardware can do, and what hardware's going to be able to do. We've been using similar AI algorithms for a long time. But we didn't have the power to use them! We had access to data, but we didn't have the power to acquire and bring it in. So how is the relationship between your software, and your platform, and some of the new hardware that's becoming available, starting to play out in a way of creating value for customers? >> Right, so, if you think about this in terms of this extreme data concept, and you think about it in terms of a couple of things, one, streaming data, just massive amounts of streaming data coming in. Billions of rows that people want to take and translate into value. >> And that data coming from-- >> It's coming from users, devices, things, interacting with all the different assets, more edge devices that are coming online, and the Wild West essentially. 
You look at the world of IoT and it's absolutely insane, with the number of protocols, and device data that's coming back to a company, and then you think about how do you actually translate this into real-time insight. Not near real-time, where it's taking seconds, but true millisecond response times where you can infuse this into your business, and one of our whole premises about Kinetica is the idea of this massive parallel compute. So the idea of not using CPUs anymore, to actually drive the powering behind your intelligence, but leveraging GPUs, and if you think about this, a CPU has 64 cores, 64 parallel things that you can do at a time, a GPU can have up to 6,000 cores, 6,000 parallel things, so it's kind of like lizard brain versus modern brain. How do you actually create this next generation brain that has all these neural networks, for processing the data, in a way that you couldn't. And then on top of that, you're using not just the technology of GPUs, you're trying to operationalize it. So how do you actually bring the data scientist, the BI folks, the business folks all together to actually create a unified operational process, and the underlying piece is the Kinetica engine and the GPU used to do this, but the power is really in the use cases of what you can do with it, and how you actually affect different industries. >> So can you elaborate a little bit more on the use cases, in this kind of game changing environment? >> Yeah, so there's a couple of common use cases that we're seeing, one that affects every enterprise is the idea of breaking down silos of business units, and creating the customer 360 view. How do I actually take all these disparate data feeds, bring them into an engine where I can visualize concepts about my customer and the environment that they're living in, and provide more insight? 
So if you think about things like Whole Foods and Amazon merging together, you now have this power of, how do I actually bridge the digital and physical world to create a better omnichannel experience for the user, how do I think about things in terms of what preferences they have, personalization, how to actually pair that with sensor data to affect how they actually navigate in a Whole Foods store more efficiently, and that's affecting every industry, you could take that to banking as well and think about the banking omnichannel, and ATMs, and the digital bank, and all these Fintech upstarts that are working to disrupt them. A great example for us is the United States Postal Service, where we're actually looking at all the data, the environmental data, around the US Postal Service, we're able to visualize it in real-time, we're able to affect the logistics of how they actually navigate through their routes, we're able to look at things like postal workers separating out of their zones, and potentially kicking off alerts around that, so effectively making the business more efficient. But, we've moved into this world where we always used to talk about brick and mortar going to cloud, we're now in this world where the true value is how you bridge the digital and physical world, and create more transformative experiences, and that's what we want to do with data. So it could be logistics, it could be omnichannel, it could be security, you name it. It affects every single industry that we're talking about. >> So I got two questions, what is Kinetica's contribution to that, and then, very importantly, as a CMO, how are you thinking about making sure that the value that people are creating, or can create with Kinetica, gets more broadly diffused into an ecosystem. 
>> Yeah, so the power that we're bringing is the idea of how to operationalize this in a way where again, you're using your data to create value, so, having a single engine where you're collecting all of this data, massive volumes of data, terabytes upon terabytes of data, enabling it where you can query the data, with millisecond response times, and visualize it, with millisecond response times, run machine learning algorithms against it to augment it, you still have that human ability to look at massive sets of data, and do ad hoc discovery, but can run machine learning algorithms against that and complement it with machine learning. And then the operational piece of bringing the data scientists into the same platform that the business is using, so you don't have data recency issues, is a really powerful mix. The other piece I would just add is the whole piece around data discovery, you can't really call it big data if, in order to analyze the data, you have to downsize and downsample to look at a subset of data. It's all about looking at the entire set. So that's where we really bring value. >> So, to summarize very quickly, you are providing a platform that can run very, very fast, in a parallel system, and in memory in these parallel systems, so that large amounts of data can be acted upon. >> That's right. >> Now, so, the next question is, there's not going to be a billion people that are going to use your tool to do things, how are you going to work with an ecosystem and partners to get the value that you're able to create with this data, out into the enterprise. >> It's a great question, and probably the biggest challenge that I have, which is, how do you get above the word spaghetti, and just get into education around this. And so I think the key is getting into examples, of how it's affecting the industry. 
So don't talk about the technology, and streaming from Kafka into a GPU-powered engine, talk about the impact to the business in terms of what it brings in terms of the omnichannel. You look at something like Japan in the 2020 Olympics, and you think about that in terms of telco, and how are the mobile providers going to be able to take all the data of what people are doing, and relate that to ad-tech, to relate that to customer insight, to relate that to new business models of how they could sell the data, that's the world of education we have to focus on, is talk about the transformative value it brings from the customer perspective, the outside-in as opposed to the inside-out. >> On that educational perspective, as a CMO, I'm sure you meet with a lot of customers, do you find that you might be in this role of trying to help bridge the gaps between different roles in an organization, where there's data silos, and there's probably still some territorial culture going on? What are you finding in terms of Kinetica's ability to really help educate and maybe bring more stakeholders, not just to the table, but kind of build a foundation of collaboration? >> Yeah, it's a really interesting question because I think it means, not just for Kinetica, but all vendors in the space, have to get out of their comfort zone, and just stop talking speeds and feeds and scale, and in fact, when we were looking at how to tell our story, we did an analysis of where most companies were talking, and they were focusing a lot more on the technical aspirations that developers sell, which is important, you still need to court the developer, you have community products that they can download, and kick the tires with, but we need to extend our dialogue, get out of our customer comfort zone, and start talking more to CIOs, CTOs, CDOs, and that's just reaching out to different avenues of communication, different ways of engaging. 
And so, I think that's kind of a core piece that I'm taking away from Strata, is we do a wonderful job of speaking to developers, we all need to get out of our comfort zone and talk to a broader set of folks, so business folks. >> Right, 'cause that opens up so many new potential products, new revenue streams, on the marketing side being able to really target your customer base audience, with relevant, timely offers, to be able to be more connected. >> Yeah, the worst scenario is talking to an enterprise around the wonders of a technology that they're super excited about, but they don't know the use case that they're trying to solve, start with the use case they're trying to solve, start with thinking about how this could affect their position in the market, and work on that, in partnership. We have to do that in collaboration with the customers. We can't just do that alone, it's about building a partnership and learning together around how you use data in a different way. >> So as you imagine, the investments that Kinetica is going to make over the next few years, with partners, with customers, what do you hope Kinetica will be in 2020? >> So, we want it to be that transformative engine for enterprises, we think we are delivering something that's quite unique in the world, and, you want to see this on a global basis, affecting our customer's value. I almost want to take us out of the story, and if I'm successful, you're going to hear wonderful enterprise companies across telco, banking, and other areas just telling their story, and we happen to be the engine behind it. >> So you're an ingredient in their success. >> Yes, a core ingredient in their success. >> So if we think about over the course of the next technology, set of technology waves, are they any particular applications that you think you're going to be stronger in? 
So I'll give you an example, do you envision that Kinetica can have a major play in how automation happens inside infrastructure, or how developers start seeing patterns in data, imagine how those assets get created. Where are some of the kind of practical, but not really, or rarely talked about applications that you might find yourselves becoming more of an ingredient because they themselves become ingredients to some of these other big use cases? >> There are a lot of commonalities that we're starting to see, and the interesting piece is the architecture that you implement tends to be the same, but the context of how you talk about it, and the impact it has tends to be different, so, I already mentioned the customer 360 view? First and foremost, break down silos across your organization, figure out how do you get your data into one place where you can run queries against it, you can visualize it, you can do machine learning analysis, that's a foundational element, and, I have a company in Asia called Lippo that is doing that in their space, where all of a sudden they're starting to glean things they didn't know about their customer before to create, doing that ad hoc discovery, so that's one area. The other piece is this use case of how do you actually operationalize data scientists, and machine learning, into your core business? So, that's another area that we focus on. There are simple entry points, things like Tableau Acceleration, where you put us underneath the existing BI infrastructure, and all of a sudden, you're a hundred times faster, and now your business folks can sit at the table, and make real-time business decisions, where in the past, if they clicked on certain things, they'd have to wait to get those results. Geospatial visualization's a no-brainer, the idea of taking environmental data, pairing it with your customer data, for example, and now learning about interactions. 
And I'd say the other piece is more innovation driven, where we would love to sit down with different innovation groups in different verticals and talk with them about, how are you looking to monetize your data in the future, what are the new business models, how do things like voice interaction affect your data strategy, what are the different ways you want to engage with your data, so there's a lot of different realms we can go to. >> One of the things you said as we wrap up here, that I couldn't agree with more, is, the best value articulation I think a brand can have, period, is through the voice of their customer. And being able to be, and I think that's one of the things that Paul said yesterday is, defining Kinetica's success based on the success of your customers across industry, and I think it really doesn't get more objective than a customer who has, not just from a developer perspective, maybe improved productivity, or workforce productivity, but actually moved the business forward, to a point where you're maybe bridging the gaps between the digital and physical, and actually enabling that business to be more profitable, open up new revenue streams because this foundation of collaboration has been established. >> I think that's a great way to think about it-- >> Which is good, 'cause he's your CEO. >> (laughs) Yes, that sustains my job. But the other piece is, I almost get embarrassed talking about Kinetica, I don't want to be the car salesman, or the vacuum salesman, that sprinkles dirt on the floor and then vacuums it up, I'd rather us kind of fade to the behind the scenes power where our customers are out there telling wonderful stories that have an impact on how people live in this world. To me, that's the best marketing you can do, is real stories, real value. >> Couldn't agree more. 
Well Dan, thanks so much for stopping by, sharing what Kinetica is doing, some of the things you're hearing, and how you're working to really build this foundation of collaboration and enablement within your customers across industries. We look forward to hearing the kind of cool stuff that happens with Kinetica, throughout the rest of the year, and again, thanks for stopping by and sharing your insights. >> Thank you for having me. >> I want to thank you for watching theCUBE, I'm Lisa Martin with my co-host Peter Burris, we are at Big Data SV, our second day of coverage, at a cool place called the Forager Tasting Room, in downtown San Jose, stop by, check us out, and have a chance to talk with some of our amazing analysts on all things big data. Stick around though, we'll be right back with our next guest after a short break. (mellow electronic music)
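
Dan's comparison earlier in this conversation, 64 parallel operations on a CPU versus up to 6,000 on a GPU, can be turned into a back-of-the-envelope calculation. The sketch below is deliberately idealized: it assumes every lane handles exactly one row per round and that all lanes are equally fast, which real hardware does not guarantee (memory bandwidth and per-core speed differ widely). The function name and the one-million-row workload are hypothetical.

```python
def passes_needed(work_items, parallel_lanes):
    """Rounds required if each lane processes one item per round.

    Uses ceiling division so a partially filled final round still counts.
    """
    return -(-work_items // parallel_lanes)


rows = 1_000_000                          # hypothetical workload size
cpu_passes = passes_needed(rows, 64)      # lane count quoted for a CPU
gpu_passes = passes_needed(rows, 6_000)   # upper lane count quoted for a GPU
print(cpu_passes, gpu_passes)
```

Under these assumptions the same scan takes 15,625 rounds on 64 lanes and 167 rounds on 6,000 lanes, which is the intuition behind running queries over a full dataset instead of a down-sampled subset.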

Published Date : Mar 8 2018

Chris Selland, Unifi Software | Big Data SV 2018


 

>> Voiceover: Live from San Jose, it's The Cube. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to The Cube, our continuing coverage of our event, Big Data SV. We're on day two of this event. I'm Lisa Martin, with George Gilbert. We had a great day yesterday learning a lot and really peeling back the layers of big data, looking at it from different perspectives, from challenges to opportunities. Joining us next is one of our Cube alumni, Chris Selland, the VP of Strategic Alliances from Unifi Software. Chris, great to meet you, welcome back! >> Thank you Lisa, it's great to be here. I have to say, as an alumnus and a many time speaker, this venue is spectacular. Congratulations on the growth of The Cube, and this is an awesome venue. I've been on The Cube a bunch of times and this is as nice as I've ever seen it. >> Yeah, this is pretty cool. >> Onward and upward. This place is great. Isn't it cool? >> It really is. This is our 10th Big Data event, we've had five now in San Jose, we'll do our fifth one in New York City in the fall, and it's always interesting because we get the chance, George and I, and the other hosts, to really look at what is going on from different perspectives in the industry of big data. So before we kind of dig into that, tell us a little bit about Unifi Software, what do you guys do, what is unique and differentiating about Unifi. >> Sure, yeah, so I joined Unifi a little over a year ago. You know, I was attracted to the company because it really, I think, is aligned with where the market is going, and Peter talked this morning, Peter Burris was talking this morning about networks of data. Unifi is fundamentally a data catalog and data preparation platform, kind of combined or unified together. So, you know, so people say, "What do you do?" We're a data catalog with integrated data preparation. 
And the idea behind that, to go to Peter's, you know, mention of networks of data, is that data is becoming more and more distributed in terms of where it is, where it lives, where it sits. This idea of we're going to put everything in the data warehouse, and then we're going to put everything in the data lake, well, in reality, some of the data's in the warehouse, some of the data's in the lake, some of the data's in SaaS applications, some of the data's in blob storage. And where is all of that data, what is it, and what can I do with it, that's really the fundamental problem that we solve. And, by the way, solve it for business people, because it's not just data scientists anymore, it's really going out into the entire business community now, you know, marketing people, operations people, finance people, they need data to do their jobs. Their jobs are becoming more data driven, but they're not necessarily data people. They don't know what schemas are, or joins are, but they know, "I need better data to be able to do my job more effectively." So that's really what we're helping with. >> So, Chris, this is, it's kind of interesting, if you distill, you know, the capability down to the catalog and the prep-- >> Chris: Yep. >> So that it's ready for a catalog, but that sort of thing is, it's like investment in infrastructure, in terms of like building the highway system, but there're going to be, you know, for those early highways, there's got to be routes that you, a reason to build them out. What are some of those early use cases that justify the investment in data infrastructure? >> There absolutely are, I mean, and by the way, those routes don't go away, those routes, you know, just like cities, right? New routes get built on top of them. So we're very much, you know, about, there's still data sitting in mainframes and legacy systems and you know, that data is absolutely critical for many large organizations. 
We do a lot of work in banking and financial services, and healthcare. They're still-- >> George: Are there common use cases that they start with? >> A lot of times-- >> Like, either by industry or just cross-sectional? >> Well, it's interesting, because, you know, analysts like yourselves have tended to put data catalog, which is a relatively new term, although some other big analyst firm that's having another conference this week, they were telling us recently that, starts with a "G," right? They were telling us that data catalog is now the number one search term they're getting. But it's been, by many analysts, also kind of lumped in, lumped in's the wrong word, but incorporated with data governance. So traditionally, governance, another word that starts with "G," it's been the term. So, we often, we're not a traditional data governance platform, per se, but cataloging data has to have a foundation of security and governance. You know, think about what's going on in the world right now, both in the court of law and the court of public opinion, things like GDPR, right? So GDPR sort of says any customer data you have needs to be managed a certain way, with a certain level of sensitivity, and then there's other capabilities you need to open up to customers, like the right to be forgotten, so that means I need to have really good control, first of all, knowledge of, control over, and governance over my customer data. I talked about all those business people before. Certainly marketers are a great example. Marketers want all the customer data they can get, right? But there's social security numbers, PII, who should be able to see and use what? Because, if this data is used inappropriately, then it can cause a lot of problems. So, IT kind of sits in a-- they want to enable the business, but at the same time, there's a lot of risk there. 
So, anyway, going back to your question, you know, the catalog market is kind of evolved out of the governance market with more of a focus on kind of, you know, enabling the business, but making sure that it's done in a secure and well-governed way. >> George: Guard rails. >> Yes, guard rails, exactly, good way to say it. So, yep, that's good, I said about 500 words, and you distilled it to about two, right? Perfect, yep. >> So, in terms of your role in strategic alliances, tell us a little about some of the partnerships that Unifi is forging, to help customers understand where all this data is, to your point earlier, the different lines of business that need it to drive, identify where's their value, and drive the business forward, can actually get it. >> Absolutely, well, certainly to your point, our customers are our partners, and we can talk about some of them. But also, strategic alliances, we work very closely with a number of, you know, larger technology companies, Microsoft is a good example. We were actually part of the Microsoft Accelerator Program, which I think they've now rebranded Microsoft for Startups, but we've really been given tremendous support by that team, and we're doing a lot of work to, kind of, we're to some degree cloud agnostic, we support AWS, we support Azure, we support Google Cloud, but we're doing a lot of our development also on the Azure cloud platform. But you know, customers use all of the above, so we need to support all of the above. So Microsoft's a very close partner of ours. Another, I'll be in two weeks, and we've got some interesting news pending, which unfortunately I can't get into today, but maybe in a couple weeks, with Adobe. 
We're working very closely with them on their marketing cloud, their experience cloud, which is what they call their enterprise marketing cloud, which obviously, big, big focus on customer data, and then we've been working with a number of organizations and the sort of professional services system integration. We've had a lot of success with a firm called Access Group. We announced the partnership with them about two weeks ago. They've been a great partner for us, as well. So, you know, it's all about an ecosystem. Making customers successful is about getting an ecosystem together, so it's a really exciting place to be. >> So, Chris, it's actually interesting, it sounds like there's sort of a two classic routes to market. One is essentially people building your solution into theirs, whether it's an application or, you know, >> Chris: An enabling layer. >> Yes. >> Chris: Yes. >> Even higher layer. But with corporate developers, you know, it's almost like we spent years experimenting with these data lakes. But they were a little too opaque. >> Chris: Yes. >> And you know, it's not just that you provide the guard rails, but you also provide, sort of some transparency-- >> Chris: Yes. >> Into that. Have you seen a greater success rate within organizations who curate their data lakes, as opposed to those who, you know, who don't? >> Yes, absolutely. I think Peter said it very well in his presentation this morning, as well. That, you know, generally when you see data lake, we associate it with Hadoop. There are use cases that Hadoop is very good for, but there are others where it might not be the best fit. 
Which, to the earlier point about networks of data and distributed data, so companies that have, or organizations that have approached Hadoop with a "let's use it for what it's good for," as opposed to "let's just dump everything in there and figure it out later," and there have been a lot of the latter, but the former have done, generally speaking, a lot better, and that's what you're seeing. And we actually use Hadoop as a part of our platform, at least for the data preparation and transformation side of what we do. We use it as an enabling technology, as well. >> You know, it's funny, actually, when you talk about, as Peter talked about, networks of data versus centralized repositories. Scott Gnau, CTO of Hortonworks, was on yesterday, and he was talking about how he had originally come from Teradata, and that they had tried to do work, that he had tried to push them in the direction of recognizing that not all the analytic data was going to be in Teradata, you know, but they had to look more broadly with Hadapt, and I forgot what the rest of, you know-- >> Chris: Right, Aster, and-- >> Aster, yeah. >> Chris: Yes, exactly, yep. >> But what was interesting is that Hortonworks was moving towards the "we believe everything is going to be in the data lake," but now, with their data plane service, they're talking about, you know, "We have to give you visibility and access." You mediate access to data everywhere. >> Chris: Right. >> So maybe help, so for folks who aren't, like, all bought into Hortonworks, for example, how much, you know, explain how you work relative to data plane service. >> Well, you know, maybe I could step back and give you a more general answer, because I agree with that philosophically, right? 
That, as I think we've been talking about here, with the networks of data, that goes back to my prior statement that there's, you know, there's different types of data platforms that have different use cases, and different types of solutions should be built on top of them, so things are getting more distributed. I think that, you know, Hortonworks, like every company, has to make the investments that are, as we are, making their customers successful. So, using Hadoop, and Hortonworks is one of our supported Hadoop platforms, we do work with them on engagements, but you know, it's all about making customers successful, ultimately. It's not about a particular product, it's about, you know, which data belongs in which location, and for what use case and what purpose, and then at the same time, when we're taking all of these different data sets and data sources, and cataloging them and preparing them and creating our output, where should we put that and catalog that, so we can create kind of a continuous improvement cycle, as well, and for those types-- >> A flywheel. >> A flywheel, exactly, continuous improvement flywheel, and for those types of purposes, you know, that's actually great use case for, you know, Hortonworks, Hadoop. That's a lot of what we typically use it for. We can actually put the data any place our customers define, but that's very often what we do with it, and then, but doing it in a very structured and organized way. As opposed to, you know, a lot of the early Hadoop, and not specific to any particular distro that went bad, were, it was just like, "Let's just dump it all into Hadoop because it's cheaper." You know, "Let's, 'cause it's cheaper than the warehouse, so let's just put it all in there, and we'll figure what to do with it later." That's bad, but if you're using it in a structured way, it can be extremely useful. 
At the same point, and at the same time, not everything that's going to go there belongs there, if you're being thoughtful about it. So you're seeing a lot more thoughtfulness these days, which is good. Which is good for customers, and it's good for us in the vendor side. Us, Hortonworks, everybody, so. >> So is there, maybe you can tell us of the different approaches to, like, the advantage of integrating the data prep with the catalog service, because as soon as you're done with data prep it's visible within the catalog. >> Chris: Absolutely, that's one, yep. >> When, let's say when people do derive additional views into the data, how are they doing that in a way that then gets also registered back in the catalog, for further discovery? >> Yeah, well, having the integrated data preparation which is a huge differentiator for us, there are a lot of data catalog products out there, but our huge differentiator, one of them, is the fact that we have integrated data preparation. We don't have to hand off to another product, so that, as you said, gives us the ability to then catalog our output and build that flywheel, that continuous improvement flywheel, and it also just basically simplifies things for customers, hence our name. So, you know, it really kind of starts there. I think I, the second part of your question I didn't really, rewind back on that for me, it was-- >> Go ahead. >> Well, I'm not sure I remember it, right now, either. >> We all need more coffee. >> Exactly, we all need more coffee. >> So I'll ask you this last question, then. >> Yes, please. >> What are, so here we are in March 2018, what are you looking forward to, in terms of momentum and evolution of Unifi this year? 
>> Well, a lot of it, and tying into my role, I mentioned we will be at Adobe Summit in two weeks, so if you're going to be at Adobe Summit, come see us there, some of the work that we're doing with our partner, some of the events we're doing with people like Microsoft and Access, but really it's also just customer success, I mean, we're seeing tremendous momentum on the customer side, working with our customers, working with our partners, and again, as I mentioned, we're seeing so much more thoughtfulness in the market, these days, and less talk about, you know, the speeds and feeds, and more around business solutions. That's really also where our professional services, system integration partners, many of whom I've been with this week, really help, because they're building out solutions. You know, GDPR is coming in May, right? And you're starting to really see a groundswell of, okay, you know, and that's not about, you know, speeds and feeds. That's ultimately about making sure that I'm compliant with, you know, this huge regulatory environment. And at the same time, the court of public opinion is just as important. You know, we want to make sure that we're doing the right thing with data. Spread it throughout organization, make ourselves successful and make our customers successful. So, it's a lot of fun. >> That's, fun is good. >> Exactly, fun is good. >> Well, we thank you so much, Chris, for stopping back by The Cube and sharing your insights, what you're hearing in the big data industry, and some of the momentum that you're looking forward to carrying throughout the year. >> It's always a pleasure, and you, too. So, love the venue. >> Lisa: All right. >> Thank you, Lisa, thank you, George. >> Absolutely. We want to thank you for watching The Cube. You're watching our coverage of our event, Big Data SV, hashtag BigDataSV, for George, I almost said George Martin. For George Gilbert. >> George: I wish. >> George R.R., yeah. 
You would not be here if you were George R.R. Martin. >> George: No, I wouldn't. >> That was a really long way to say thank you for watching. I'm Lisa Martin, for this George. Stick around, we'll be right back with our next guest. (techno music)

Published Date : Mar 8 2018



Yuanhao Sun, Transwarp | Big Data SV 2018


 

>> Announcer: Live, from San Jose, it's The Cube (light music) Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media, and its ecosystem partners. >> Hi, I'm Peter Burris and welcome back to Big Data SV, The Cube's, again, annual broadcast of what's happening in the big data marketplace here at, or adjacent to Strata here in San Jose. We've been broadcasting all day. We're going to be here tomorrow as well, over at the Forager eatery and place to come meander. So come on over. Spend some time with us. Now, we've had a number of great guests. Many of the thought leaders that are visiting here in San Jose today were on the big data marketplace. But I don't think any has traveled as far as our next guest. Yuanhao Sun is the CEO of Transwarp. Come all the way from Shanghai. Yuanhao, it's once again great to see you on The Cube. Thank you very much for being here. >> Good to see you again. >> So Yuanhao, the Transwarp as a company has become extremely well known for great technology. There's a lot of reasons why that's the case, but you have some interesting updates on how the technology's being applied. Why don't you tell us what's going on? >> Okay, so, recently we announced the first audited TPC-DS benchmark result. Our product, called Inceptor, that is, SQL engine on top of Hadoop. We already add quite a lot of features, like distributed transactions, like a full SQL support. So that it can mimic, like Oracle or the mutual, and also traditional database features so that we can pass the whole test. This engine is also scalable, because it's distributed, scalable. So the large benchmark, like TPC-DS. It starts from 10 terabytes. SQL engine can pass the test without much trouble. >> So I know that there have been other firms that have claimed to pass TPC-DS, but they haven't been audited. What does it mean to say you're audited? 
I'd presume that as a result, you've gone through some extremely stringent and specific tests to demonstrate that you can actually pass the entire suite. >> Yes, actually, there is a third party auditor. They already audit our test process and its results for the past six, uh, five months. So it is fully audited. The reason why we can pass the test is because, actually, there's two major reasons. For traditional databases, they are not scalable to process large datasets. So they could not pass the test. For (mumbles) vendors, because the SQL engine, the features are not rich enough to pass all the tests. You know, there are several steps in the benchmark, and the SQL queries, there are 99 queries, the syntax is not supported by all Hadoop vendors yet. And also, the benchmark requires you to update the data, after the queries, and then we run the queries for multiple concurrent users. That means you have to support distributed transactions. You have to make the updated data consistent. For Hadoop vendors, the SQL engine on Hadoop. They haven't implemented the distributed transaction capabilities. So that's why they failed to pass the benchmark. >> So I had the honor of traveling to Shanghai last year and going and speaking at your user conference and was quite impressed with the energy that was in the room as you announced a large number of new products. You've been very focused on taking what open source has to offer but adding significant value to it. As you said, you've done a lot with the SQL interfaces and various capabilities of SQL on top of Hadoop. Where is Transwarp going with its products today? How is it expanding? How is it being organized? How is it being used? >> We group these products into three categories, including big data, cloud, AI and the machine learning. So there are three categories. The big data, we upgrade the SQL engine, the stream engine, and we have a set of tools called Transwarp Studio to help people to streamline the big data operations. 
And the second product line is data cloud. We call it Transwarp Data Cloud. So this product is going to be released in early May this year. So this product, we build this product on top of Kubernetes. We provide Hadoop as a service, data science as a service, AI as a service to customers. It allows people to create multiple tenants. And the tenants are isolated by network, storage, CPU. They are free to create clusters, spinning them up or turning them off. So it can also scale to hundreds of hosts. So this is the, I think this is the first time we implement, like, network isolation and storage persistency in Kubernetes. So that it can support HDFS and all Hadoop components. And because it is elastic, just like cloud computing, but we run on bare metal, people can consolidate the data, consolidate the applications in one place. Because all applications and Hadoop components are containerized, that means, they are Docker images. We can spin up very quickly and scale to a larger cluster. So this data cloud product is very interesting for large companies, because they usually have a small IT team. But they have to provide a (mumbles), and a machine learning capability to larger groups, like one thousand people. So they need a convenient way to manage all these big data clusters. And they have to isolate the resources. Even they need a billing system. So this product is, we already have a few big names in China, like China Post, PetroChina, and Secret of Source Channel. So they are already applying this data cloud for their internal customers. >> And China has a, has a few people, so I presume that, you know, China Post for example, is probably a pretty big implementation. >> Yes so, they have a, but the IT team is, like less than 100 people, but they have to support thousands of users. So that's why they, you usually would deploy one cluster for each application, right, but today, for large organizations, they have lots of applications. 
They hope to leverage big data capability, but a very small team, IT team, cannot support so many applications. So they need a convenient way like a, just like when you put Hadoop on public cloud. We provide a product that allows you to provide a Hadoop service in private cloud on bare metal machines. So this is the second product category. And the third is the machine learning and artificial intelligence. We provide a data science platform, a machine learning tool, that is, interactive tools that allow people to create the machine learning pipelines and models. We even implemented some automatic modeling capability that allows you to, to do feature engineering automatically or semi-automatically and to select the best algorithms for you so that the machine learning can be, so everyone can be at Los Angeles. So they can use our tool to quickly create models. And we also have some pre-built models for different industries, like financial services, like banks, security companies, even IoT. So we have different pre-built machine learning models for them. We just need to modify the template, then apply the machine learning models to the applications very quickly. So that probably like a lesson, for example, for a bank customer, they just use it to deploy a model in one week. This is very quick for them. Otherwise, in the past, they have a company to build that application, to develop such models. It usually takes several months. Today it is much faster. So today we have three categories, big data, cloud and machine learning. >> Peter Burris: Machine learning and AI. >> And so three products. >> And you've got some very, very big implementations. So you were talking about a couple of banks, but we were talking, before we came on, about some of the smart cities. >> Yuanhao Sun: Right. >> Kinds of things that you guys are doing at enormous scale. >> Yes, so we deploy our streaming product for more than 300 cities in China. So this cluster is like connected together. 
So we use streaming capability to monitor the traffic and send the information from city to the central government. So all the, the sort of central repository. So whenever illegal behavior on the road is detected, that information will be sent to the policeman, or the central repository within two seconds. Whenever you are seen by the camera in any place in China, the alerts will be sent out within two seconds. >> So the bad behavior is detected. It's identified as the location. The system also knows where the nearest police person is. And it sends a message and says, this car has performed something bad. >> Yeah and you should stop that car in the next station or in the next crossroad. Today there are tens of thousands of policemen. They depend on this system for their daily work. >> Peter Burris: Interesting. >> So, just a question on, it sounds like one of your, sort of nearest competitors, in terms of, let's take the open source community, at least the APIs, and in their case open source, Huawei. Have there been customers that tried to do a POC with you and with Huawei, and said, well it took four months using the pure open source stuff, and it took, say, two weeks with your stack having, being much broader and deeper? Are any examples like that? >> There are quite a lot. We have more market share, like in financial services, we have about 100 bank users. So if we take all banks into account, for them they already use Hadoop. So we, our market share is above 60%. >> George Gilbert: 60. >> Yeah, in financial services. We usually do POC and, like run benchmarks. They are real workloads and usually it takes us three days or one week. They can find, we can speed up their workload very quickly. For Bank of China, they migrated their Oracle workload to our platform. And they test our platform and the Huawei platform too. So the first thing is they cannot migrate the whole Oracle workload to open source Hadoop, because of the missing features. 
We are able to support all these workloads with very minor modifications. So the modification takes only several hours. And we can finish the whole workload within two hours, but originally they take, usually take Oracle more than one day, >> George Gilbert: Wow. >> more than ten hours to finish the workload. So it is very easy to see the benefits quickly. >> Now that you have a streaming product also with that same SQL interface. Are you going to see a migration of applications that used to be batch to more near real time or continuous, or will you see a whole new set of applications that weren't done before, because the latency wasn't appropriate? >> For streaming applications, real time cases they are mostly new applications, but if we are using Storm API or Spark Streaming API, it is not so easy to develop your applications. And another issue is once you detect one new rule, you had to add those rules dynamically to your cluster. So to add to your printer, they do not have so much knowledge of writing Scala code. They only know how to configure. Probably they are familiar with SQL. They just need to add one SQL statement to add a new rule. So that they can. >> In your system. >> Yeah, in our system. So it is much easier for them to program streaming applications. And for those customers who they don't have real time applications, they hope to do, like a real time data warehousing. They collect all this data from websites, from their sensors, like PetroChina, an oil company, the large oil company. They collect all the (mumbles) information directly to our streaming product. In the past, they just accredit to Oracle and around the dashboard. So it only takes hours to see the results. But today, the application can be moved to our streaming product with only a few modifications, because they are all SQL statements. And this application becomes the real time. They can see the real time dashboard results in several seconds. 
>> So Yuanhao, you're number one in China. You're moving more aggressively to participate in the US market. What's the, last question, what's the biggest difference between being number one in China, the way that big data is being done in China versus the way you're encountering big data being done here, certainly in the US, for example? Is there a difference? >> I think there are some differences. Some are the same, customers usually request a POC. But in China, they usually, I think they focus more on the results. They focus on what benefit they can gain from your product. So we have to prove it to them. So we have to help them to migrate applications to see the benefits. I think in the US, they focus more on technology than Chinese customers. >> Interesting, so they're more on technology here in the US, more on the outcome in China. Once again, Yuanhao Sun, CEO of Transwarp, thank you very much for being on The Cube. >> Thank you. >> And I'm Peter Burris with George Gilbert, my co-host, and we'll be back with more from Big Data SV, in San Jose. Come on over to the Forager, and spend some time with us. And we'll be back in a second. (light music)

Published Date : Mar 8 2018



Ron Bodkin, Google | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our event Big Data SV. I'm Lisa Martin, joined by Dave Vellante and we've been here all day having some great conversations really looking at big data, cloud, AI, machine learning from many different levels. We're happy to welcome back to theCUBE one of our distinguished alumni, Ron Bodkin, who's now the Technical Director of Applied AI at Google. Hey Ron, welcome back. >> It's nice to be back Lisa, thank you. >> Yeah, thanks for coming by. >> Thanks Dave. >> So you have been a friend of theCUBE for a long time, you've been in this industry and this space for a long time. Let's take a little bit of a walk down memory lane, your perspectives on big data, Hadoop and the evolution that you've seen. >> Sure, you know so I first got involved in big data back in 2007. I was VP of Engineering at a startup called Quantcast in the online advertising space. You know, we were using early versions of Hadoop to crunch through petabytes of data and build data science models and I saw a huge opportunity to bring those kind of capabilities to the enterprise. You know, we were working with early Hadoop vendors. Actually, at the time, there was really only one commercial vendor of Hadoop, it was Cloudera and we were working with them and then you know, others as they came online, right? So back then we had to spend a lot of time explaining to enterprises what was this concept of big data, why Hadoop as an open source could be interesting, what did it mean to build a data lake? And you know, we always said look, there's going to be a ton of value around data science, right? Putting your big data together and collecting complete information and then being able to build data science models to act in your business. 
So you know, the exciting thing for me is you know, now we're at a stage where many companies have put those assets together. You've got access to amazing cloud scale resources like we have at Google to not only work with great information, but to start to really act on it because you know, kind of in parallel with that evolution of big data was the evolution of the algorithms as well as the access to large amounts of digital data that's propelled, you know, a lot of innovation in AI through this new trend of deep learning that we're invested heavily in. >> I mean the epiphany of Hadoop when I first heard about it was bringing, you know, five megabytes of code to a petabyte of data as sort of the bromide. But you know, the narrative in the press has really been well, they haven't really lived up to expectations, the ROI has been largely a reduction on investment and so is that fair? I mean you've worked with practitioners, you know, all your big data career and you've seen a lot of companies transform. Obviously Google as a big data company is probably the best example of one. Do you think that's a fair narrative or did the big data hype fail to live up to expectations? >> I think there's a couple of things going on here. One is, you know, that the capabilities in big data have varied widely, right? So if you look at the way, for example, at Google we operate with big data tools that we have, they're extremely productive, work at massive scale, you know, with large numbers of users being able to slice and dice and get deep analysis of data. It's a great setup for doing machine learning, right? That's why we have things like BigQuery available in the cloud. You know, I'd say that what happened in the open source Hadoop world was it ended up settling in on more of the subset of use cases around how do we make it easy to store large amounts of data inexpensively, how do we offload ETL, how do we make it possible for data scientists to get access to raw data? 
I don't think that's as functional as what people really had imagined coming out of big data. But it still served a useful function complementing what companies were already doing at their warehouse, right? So I'd say those efforts to collect big data and to make them available have really set the stage for analytic value both through better building of analytic databases but especially through machine learning. >> And there's been some clear successes. I mean, one of them obviously is advertising, Google's had a huge success there. But much more, I mean fraud detection, you're starting to see health care really glom on. Financial services have been big on this, you know, maybe largely for marketing reasons but also risk, you know, for sure, so there's been some clear successes. I've likened it to, you know, before you got to paint, you got to scrape and you got to, you put in caulking and so forth. And now we're in a position where you've got a corpus of data in your organization and you can really start to apply things like machine learning and artificial intelligence. Your thoughts on that premise? >> Yeah, I definitely think there's a lot of truth to that. I think some of it was, there was a hope, a lot of people thought that big data would be magic, that you could just dump a bunch of raw data without any effort and out would come all the answers. And that was never a realistic hope. There's always a level of you have to at least have some level of structure in the data, you have to put some effort in curating the data so you have valid results, right? So it's created a set of tools to allow scaling. You know, we now take for granted the ability to have elastic data, to have it scale and have it in the cloud in a way that just wasn't the norm even 10 years ago. It's like people were thinking about very brittle, limited amounts of data in silos was the norm, so the conversation's changed so much, we almost forget how much things have evolved. 
>> Speaking of evolution, tell us a little bit more about your role with applied AI at Google. What was the genesis of it and how are you working with customers for them to kind of leverage this next phase of big data and applying machine learning so that they really can identify, well monetize content and data and actually identify new revenue streams? >> Absolutely, so you know at Google, we really started the journey to become an AI-first company early this decade, a little over five years ago. We invested in the Google X team, you know, Jeff Dean was one of the leaders there, sort of to invest in, hey, these deep learning algorithms are having a big impact, right? Fei-Fei Li, who's now the Chief Scientist at Google Cloud was at Stanford doing research around how can we teach a computer to see and catalog a lot of digital data for visual purposes? So combining that with advances in computing with first GPUs and then ultimately we invested in specialized hardware that made it work well for us. The massive-scale TPU's, right? That combination really started to unlock all kinds of problems that we could solve with machine learning in a way that we couldn't before. So it's now become central to all kinds of products at Google, whether it be the biggest improvements we've had in search and advertising coming from these deep learning models but also breakthroughs, products like Google Photos where you can now search and find photos based on keywords from intelligence in a machine that looks at what's in the photo, right? So we've invested and made that a central part of the business and so what we're seeing is as we build up the cloud business, there's a tremendous interest in how can we take Google's capabilities, right, our investments in open source deep learning frameworks, TensorFlow, our investments in hardware, TPU, our scalable infrastructure for doing machine learning, right? We're able to serve a billion inferences a second, right? 
So we've got this massive capability we've built for our own products that we're now making available for customers and the customers are saying, "How do I tap into that? "How can I work with Google, how can I work with "the products, how can I work with the capabilities?" So the applied AI team is really about how do we help customers drive these 10x opportunities with machine learning, partnering with Google? And the reason it's a 10x opportunity is you've had a big set of improvements where models that weren't useful commercially until recently are now useful and can be applied. So you can do things like translating languages automatically, like recognizing speech, like having automated dialog for chat bots or you know, all kinds of visual APIs like our AutoML API where engineers can feed up images and it will train a model specialized to their need to recognize what you're looking for, right? So those types of advances mean that all kinds of business process can be reconceived of, and dramatically improved with automation, taking a lot of human drudgery out. So customers are like "That's really "exciting and at Google you're doing that. "How do we get that, right? "We don't know how to go there." >> Well natural language processing has been amazing in the last couple of years. Not surprising that Google is so successful there. I was kind of blown away that Amazon with Alexa sort of blew past Siri, right? And so thinking about new ways in which we're going to interact with our devices, it's clearly coming, so it leads me into my question on innovation. What's driven in your view, the innovation in the last decade and what's going to drive innovation the next 10 years? >> I think innovation is very much a function of having the right kind of culture and mindset, right? So I mean for us at Google, a big part of it is what we call 10x thinking, which is really focusing on how do you think about the big problem and work on something that could have a big impact? 
I also think that you can't really predict what's going to work, but there's a lot of interesting ideas and many of them won't pan out, right? But the more you have a culture of failing fast and trying things and at least being open to the data and give it a shot, right, and say "Is this crazy thing going to work?" That's why we have things like Google X where we invest in moonshots but that's where, you know, throughout the business, we say hey, you can have a 20% project, you can go work on something and many of them don't work or have a small impact but then you get things like Gmail getting created out of a 20% project. It's a cultural thing that you foster and encourage people to try things and be open to the possibility that something big is on your hands, right? >> On the cultural front, it sounds like in some cases depending on the enterprise, it's a shift, in some cases it's a cultural journey. The Google on Google story sounds like it could be a blueprint, of course, how do we do this? You've done this but how much is it a blueprint on the technology capitalizing on deep learning capabilities as well as a blueprint for helping organizations on this cultural journey to be actually being able to benefit and profit from this? >> Yeah, I mean that's absolutely right Lisa that these are both really important aspects, that there's a big part of the cultural journey. In order to be an AI-first company, to really reconceive your business around what can happen with machine learning, it's important to be a digital company, right? To have a mindset of making quick decisions and thinking about how data impacts your business and activating in real time. So there's a cultural journey that companies are going through. How do we enable our knowledge workers to do this kind of work, how do we think about our products in a new way, how do we reconceive, think about automation? 
There's a lot of these aspects that are cultural as well, but I think a big part of it is, you know, it's easy to get overwhelmed for companies but it's like you have to pick somewhere, right? What's something you can do, what's a true north, what's an area where you can start to invest and get impact and start the journey, right? Start to do pilots, start to get something going. What we found, something I've found in my career has been when companies get started with the right first project and get some success, they can build on that success and invest more, right? Whereas you know, if you're not experimenting and trying things and moving, you're never going to get there. >> Momentum is key, well Ron, thank you so much for taking some time to stop by theCUBE. I wish we had more time to chat but we appreciate your time. >> No, it's great to be here again. >> See ya. >> We want to thank you for watching theCUBE live from our event, Big Data SV in San Jose. I'm Lisa Martin with Dave Vellante, stick around we'll be back with our wrap shortly. (relaxed electronic jingle)

Published Date : Mar 8 2018

SUMMARY :

brought to you by Silicon Angle Media We're happy to welcome back to theCUBE So you have been a friend of theCUBE for a long time, and then you know, others as they came online, right? was bringing, you know, five megabytes of code One is, you know, that the capabilities and you can really start to apply things like There's always a level of you have to at What was the genesis of it and how are you We invested in the Google X team, you know, been amazing in the last couple of years. we invest in moonshots but that's where, you know, on this cultural journey to be actually but I think a big part of it is, you know, Momentum is key, well Ron, thank you We want to thank you for watching theCUBE live

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Ron Bodkin | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
2007 | DATE | 0.99+
Jeff Dean | PERSON | 0.99+
Ron | PERSON | 0.99+
Dave | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Lisa | PERSON | 0.99+
Silicon Angle Media | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
Fei-Fei Li | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
20% | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Hadoop | TITLE | 0.99+
five megabytes | QUANTITY | 0.99+
Siri | TITLE | 0.99+
theCUBE | ORGANIZATION | 0.99+
QuantCast | ORGANIZATION | 0.99+
10x | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Google X | ORGANIZATION | 0.98+
first project | QUANTITY | 0.97+
Silicon Valley | LOCATION | 0.97+
Gmail | TITLE | 0.97+
Big Data | ORGANIZATION | 0.97+
first | QUANTITY | 0.96+
One | QUANTITY | 0.96+
10 years ago | DATE | 0.95+
BigQuery | TITLE | 0.94+
early this decade | DATE | 0.94+
last couple of years | DATE | 0.94+
Big Data SV | EVENT | 0.94+
Alexa | TITLE | 0.94+
Big Data SV 2018 | EVENT | 0.93+
Cloudera | ORGANIZATION | 0.91+
last decade | DATE | 0.89+
Google Cloud | ORGANIZATION | 0.87+
over five years ago | DATE | 0.85+
first company | QUANTITY | 0.82+
10x opportunities | QUANTITY | 0.82+
one commercial | QUANTITY | 0.81+
next 10 years | DATE | 0.8+
first GPUs | QUANTITY | 0.78+
Big Data Hadoop | TITLE | 0.68+
AutoML | TITLE | 0.68+
Google X | TITLE | 0.63+
Applied | ORGANIZATION | 0.62+
a second | QUANTITY | 0.61+
petabytes | QUANTITY | 0.57+
petabyte | QUANTITY | 0.56+
billion inferences | QUANTITY | 0.54+
TensorFlow | TITLE | 0.53+
Stanford | ORGANIZATION | 0.51+
Google Photos | ORGANIZATION | 0.42+

Seth Dobrin, IBM | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined by, next, one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was about in the fall at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank and we spent a lot of time talking about governance. Still doing a lot with governance, especially with GDPR coming up. But really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. So it's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space. 
And so this is set up from the perspective of we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data scientist on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do that all themselves. So I think it absolutely is a team sport. And it really takes a data engineer or someone with deep expertise in there, that also understands machine-learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists because then you get a lot of data scientists, right? People who aren't really data scientists, and so if you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important because most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental. 
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today has been what we've been doing since the first statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today I did during my PhD 20 years ago, right? It was just with a different perspective. And you applied those types of, they were more static, right? So I would build a model to predict something, and it was only for that. I really didn't apply it beyond, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were looking for years ago is all the same, we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry. >> I'm interested in it from a skillset standpoint. You were saying how you're advertising to bring on these roles. I was at the Women in Data Science Conference with theCUBE just a couple of days ago, and we hear so much excitement about the role of data scientists. It's so horizontal. 
People have the opportunity to make impact in policy change, healthcare, etc. So the hard skills, the soft skills, mathematician, what are some of the other elements that you would look for or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician but someone that has communication skills, collaboration, empathy, what are some of those, openness, to not lead data down a certain, what do you see as the right mix there of a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise. A good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case because any problem worth solving is not one model, any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down in discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that. Right? And so having these soft skills to be able to tease that out in a way, and really help people understand that their way of thinking about this may or may not be right. And doing that in a way that's not offensive. And there's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We talked about in the past, the clock's ticking; in May the fines go into effect. 
The relationship between data science, machine learning, GDPR, is it going to help us solve this problem? This is a nightmare for people. And many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before. How important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact does GDPR have on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? So it really requires consent to be understandable and very explicit. And it allows people to be able to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. Traditional machine learning, that's not too much of a problem. You can look at the features and say these features, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable AI, or XAI, becomes really, really important. And there were some talks earlier today at Strata about how you apply machine learning, traditional machine learning to interpret your deep learning or black box AI. So that's really going to be important, those two things, in terms of how they affect data science. >> Well, you mentioned the black box. I mean, do you think we'll ever resolve the black box challenge? Or is it really that people are just going to be comfortable that what happens inside the box, how you got to that decision is okay? >> So I'm inherently both cynical and optimistic. 
(chuckles) But I think there's a lot of things we looked at five years ago and we said there's no way we'll ever be able to do them that we can do today. And so while I don't know how we're going to get to be able to explain this black box as XAI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). But you know it's a dog. >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time, and thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update in the next few months from you. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE live from our event Big Data SV down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching, stick around, we'll be right back after a short break.
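The explainability point in the conversation above, that with traditional machine learning you can say one feature contributed 20% and another 50%, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not anything the speakers or IBM described: it assumes a plain linear model whose score is a weighted sum of its inputs, and the `contribution_shares` helper, the feature names, and every number are hypothetical.

```python
# Illustrative sketch only: the helper, feature names, weights, and inputs
# are made-up assumptions, not figures or methods from the interview.

def contribution_shares(weights, values):
    """Return each feature's share of a linear model's score for one input.

    For a linear model, score = sum(weight * value). Comparing the absolute
    size of each term with the total absolute size gives a simple breakdown
    of what drove the decision, which can be shown to a person or an auditor.
    """
    terms = {name: weights[name] * values[name] for name in weights}
    total = sum(abs(term) for term in terms.values())
    if total == 0:
        # No term contributes anything, so every share is zero.
        return {name: 0.0 for name in terms}
    return {name: abs(term) / total for name, term in terms.items()}


# Hypothetical model with three features and one hypothetical input row.
weights = {"income": 5.0, "tenure": 2.0, "late_payments": -3.0}
values = {"income": 1.0, "tenure": 1.0, "late_payments": 1.0}

shares = contribution_shares(weights, values)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%}")
```

A deep network has no per-feature term to read off like this, which is why the conversation turns to using traditional machine learning to interpret black-box models.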

Published Date : Mar 8 2018

SUMMARY :

brought to you by SiliconANGLE Media Welcome back to theCUBE's continuing coverage Always fun being with you guys. Yeah, so last time you were chatting But really started to ramp up my team So it's all of our software franchise. What's the mixture of-- and other companies do data science in the enterprise. that you just mentioned? And to do that, you need decision optimization. So in the early days of when we, And so that's really the difference I think any good statistician People have the opportunity to make impact there's going to be a series of sprints to deliver that. in the time we have remaining. And so what does that mean when you build a model Or is it really that people are just going to be comfortable ever be able to do them that we can do today. I mean, somebody said to me the other day, But you know it's a dog. But you know it's a dog. I love that you just brought up dogs as we're ending. Good seeing you again. We want to thank you for watching theCUBE

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Seth | PERSON | 0.99+
Dave | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Seth Dobrin | PERSON | 0.99+
20% | QUANTITY | 0.99+
50% | QUANTITY | 0.99+
TD Bank | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
two people | QUANTITY | 0.99+
tomorrow | DATE | 0.99+
IBM Analytics | ORGANIZATION | 0.99+
two things | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
one model | QUANTITY | 0.99+
five years | QUANTITY | 0.98+
20 years ago | DATE | 0.98+
Big Data SV | EVENT | 0.98+
five years ago | DATE | 0.98+
GDPR | TITLE | 0.98+
theCUBE | ORGANIZATION | 0.98+
one | QUANTITY | 0.98+
Strata Data Conference | EVENT | 0.97+
today | DATE | 0.97+
first statistics | QUANTITY | 0.95+
CUBE | ORGANIZATION | 0.94+
Women in Data Science Conference | EVENT | 0.94+
both | QUANTITY | 0.94+
Chief Data Officers Summit | EVENT | 0.93+
Big Data SV 2018 | EVENT | 0.93+
couple of days ago | DATE | 0.93+
years | DATE | 0.9+
Forager Eatery | ORGANIZATION | 0.9+
first | QUANTITY | 0.86+
Watson | TITLE | 0.86+
Officers Summit | EVENT | 0.74+
Data Officer | PERSON | 0.73+
SV | EVENT | 0.71+
President | PERSON | 0.68+
Strata | TITLE | 0.67+
Big Data | ORGANIZATION | 0.66+
earlier today | DATE | 0.65+
Silicon Valley | LOCATION | 0.64+
years | QUANTITY | 0.6+
Chief | EVENT | 0.44+
Kaggle | ORGANIZATION | 0.43+

Maribel Lopez, Lopez Research | Big Data SV 2018


 

>> Narrator: Live, from San Jose. It's theCUBE. Presenting Big Data, Silicon Valley. Brought to you by SiliconAngle Media, and its ecosystem partners. >> Welcome back to theCUBE, we are live in San Jose, at our event, Big Data SV. I'm Lisa Martin. And we are down the street from the Strata Data Conference. We've had a great day so far, talking with a lot of folks from different companies that are all involved in the big data unraveling process. I'm excited to welcome back to theCUBE one of our distinguished alumni, Maribel Lopez, the founder and principal analyst at Lopez Research. Welcome back to theCUBE. >> Thank you. I'm excited to be here. >> Yeah, so you've been, the Strata conference started a couple days ago. What are some of the trends and things that you're hearing that are really kind of top of mind for not just the customers that are attending, the companies that are creating or are trying to create solutions around this big data challenge and opportunity? >> Yeah absolutely, I mean I think we talked a lot about data in the years past. How do you gather the data? How do you store the data? How you might want to process the data? This year seems to be all about how do I make something interesting happen with the data? How do I make an intelligent insight? How do I cure prostate cancer? How do I make sure I can classify images? It's a really different show, and we've also changed some of the terminology, a lot more on machine learning now, and artificial intelligence, and frankly a lot of discussion around ethics. So it's been very interesting. >> Data ethics you mean? >> Data ethics; how do we do privacy? How do we maintain the right level of data so that we don't have bias in our data? How do we get diversity and inclusion going? Lots of really interesting, powerful human topics, not just about the data. >> I love that the human topics especially where you know AI and ML come into play. You talked, data diversity. 
On bias, we were just at the Women in Data Science conference a couple of days ago talking to a lot of female leaders in data science, computer science, both in academia as well as in industry. And one of the interesting topics about the gender disparity, is the fact that that is limiting the analyses on data in terms of, there may be a few perspectives looking on it. So there's an inherent bias there. So that's one issue, and I'd like to get your thoughts on that. Another is with that thought, lack of thought diversity, I guess I would say going into analyzing the data, companies might be potentially limiting themselves on the types of products that they can create, how to monetize the data and actually drive new revenue streams. On the thought diversity, we'll start there. What are some of the things that you're hearing, and what are some of your recommendations for your clients on how to get some of that bias out of data analysis? >> Yes it's interesting. One is trying to find multiple sources of data. So there's data that you have and that you own. But there is a wide range of openly available data now. There's some challenges around making sure that that data is clean before you integrate it with your data. But basically, diversifying your data sources with third party data is one big thing that we're talking about. In previous analytical generations, I think we talked a lot about how to have a hypothesis, and you were trying to prove a hypothesis. And now I think we're trying to be a little more open and looser, and not really lead the data anywhere per se, but try to find the right patterns and correlations in the data. And then just awareness in general. Like we don't believe we're biased. But if we have data that's biased that gets put into the system. So we have to really be thoughtful about what we put into the system. So I think that those three things combined have really changed the way people are looking at it. 
And there's a lot of awareness now around that. Because we assume at some point, the machines might be making certain decisions for us. And we want to make sure that they have the best information to do that. And that they don't limit our opportunities as a society. >> Where are companies in terms of the clients that you see, culturally in terms of embracing the openness? 'Cause you're right! From a scientific method perspective. People go into, I'm going to hypothesize this because I think I'm going to find this. And maybe wanting the data to say this. Where are companies, we'll say enterprises, in becoming culturally more open to not leading the data somewhere and bringing up bias? >> Well, there are two interesting things here, right? I think there are some people that have gone down the data route for a while now, sort of the industry leading companies. They're in this mindset now trying to make sure they don't lead the data, they don't create biases in the data. They have ways to explain how the data and the analysis of the learning came about, not just for regulation, but so that they can make sure they've ethically done the right thing. But then I think there's the other 95 percent of companies that they're not even there yet. They don't know that this is a problem yet. So they're still dealing with the "I've got to pull in the data." "I've got to do something with it." They don't even know what they want to do with it let alone if it's biased or not. So we're not quite at the leading the witness point there with a lot of organizations. >> But that's something that you expect to see maybe down the road. >> I'm hoping we'll get ahead of it. I'm really hoping that we'll get ahead of it. >> It's a good positive outlook on it, yeah? >> I think that, I think because the real analysis of the data problem in a big machine learning, deep learning way is so new, and the people are actually out seeking guidance, that there is an opportunity to get ahead of it. 
The second thing that's happening is, people don't have data scientists, right? So they don't necessarily have the people that can code this. So what they're doing now, is they're depending on the vendor landscape to provide them with an entry level set of tools. So if you're Microsoft, if you're Google, if you're Amazon, you're trying very hard to make sure that you're giving tools that have the right ethics in them, and that can help kickstart people's Machine Learning efforts. So I think that's going to be a real win for us. And we talked a lot today at the Strata conference about how, oh you don't have enough images, you can't do that. Or you don't have enough data, you can't do that. Or you don't have enough data scientists. And some of what came back is that, some of the best and the brightest have coded some things that you can start to use to kickstart that will get you to a better place than you ever could have started with yourself. So that was pretty exciting, you know. Transfer learning as an example of taking, you know, ImageNet from Google and some algorithms, and using those to take your images and try to figure out if somebody has Alzheimer's or not. Encode things Alzheimer's or not characteristic. So, very cool stuff, very exciting and nice to see that we've got some minds working on this for us. >> Yeah, definitely. When you're meeting with clients that don't have a data scientist, or chief analytics officer? Sounds like a lot of the technologies need to or some have built in sort of enablement for a different data citizen within a company. If you're talking to clients that don't have a data scientist or data science team, who are your constituents there? Where are companies that don't maybe have that skill gap? Who do they go to in their organization to start evaluating the data that they have to get to know what and start to understand what their potential is? >> Yeah, there's a couple of places people go. 
They go to their business decision analytics people. So the people that were working with their BI dashboards, for example. The second place they go is to the cloud computing guys, cuz we're hearing a lot about cloud computing and maybe I can buy some of the stuff from the cloud. I'm just going to roll up and get all my machine learning in the cloud, right? So we're not there yet. So the biggest thing that I talk to people about right now is, what are the realities around Machine Learning and AI? We've made tremendous progress but, you know, you read the newspaper, and something is going to get rid of your job, and AI's going to take over the world, and we're kind of far from that reality. First of all it's very dystopian and negative. But even if it weren't that, you know, what you can do today is not that. So there's a lot of stages in between. So the first thing is just trying to get people comfortable with, no, you can't just buy one product, and throw in some data, and you've got everything you need. >> Right. >> We're not there yet. But we're getting closer. You can add some components, you can get some new information, you could do some new correlations. So just getting a reality and grounding of where we are, and that we have a lot of opportunity, and that it's moving very fast. That's the other thing. >> Right. >> IT leaders are used to evaluating once a year, evaluating once every couple of years. These things are moving in monthly increments. Like really huge changes in product categories. So you kind of have to keep on top of it to make sure you know what's available to you. >> Right. And if they don't they miss out on not only the ability to monetize data streams, but essentially going out of business. Because somebody will come in, maybe more nimble and agile, and be able to do it faster. >> Yeah. And we already saw those with the digital native companies that started born in the cloud companies, we used to call them. 
Well, now, everybody can be using the cloud. So the question then is like what's the next wave of that? The next wave of that is around understanding how to use your data, understanding how to get third-party data, and being able to rapidly make decisions and change models based on that. >> One of the things that's interesting about big data is, you know, it was a big buzzword, and it seems to be becoming less of a buzzword now. Gartner even was saying, I think the number was 85 percent of big data projects, and I think that's more in test environments, fail. And I often say, "Failure in a lot of cases is not a bad effort." Because it spawns genesis of new products, new ideas, et cetera. But when you're talking with clients who go, alright, we've embraced Hadoop, we've got this big data lake, now it's turning really swampy. We don't know-- >> We've got lakes, we've got oceans, we've got ponds. Yeah. >> Right. What's the conversation there where you're helping a customer clean that swamp up, get broader visibility across their datasets and enable different lines of business. Not just, you know, the BI folks or the cloud folks or IT. But marketing, logistics, sales. What's that conversation like to clean up the swamp and do more enablement for visibility? >> I think one of the things that we got really hung up on was, you know, creating a data ocean, right? We're going to bring everything all in one place, it's going to be this one massive data source. >> It sounded great. >> It's going to be awesome. And this is not the reality of the world, right? So I think the first thing in the cleaning up that we have to do, is being able to figure out what's the source of truth for any given dataset that somebody needs. So you see 15 salespeople walk in and they all have different versions of the data. That shouldn't happen. >> Right. >> So we need to get to the point where they know where the source of truth is for that data. The second is sort of governance around the data. 
We spent a lot of time dumping the data but not a lot of time in terms of getting governance around who can access it, what they can do with it, for how long they could have access to it. Is it just internal? Is it internal and external? So I think that's the second thing around like harassing and haranguing the swamps, and the lakes and the ponds, right? And then assuming that you do that, I think the other thing is, you know, if you have a hammer everything looks like a nail. Well, in reality, you know, when you construct things you have nails, you have screws, you have bolts, right? And picking the right tool for the job is something that the IT leadership has to work with. And the only way that they get that right is to work very closely with the different lines of business so they can understand the problem. Because the business leader knows the problem, they don't know the solution. If you put them together, which we've talked about forever, frankly. But now I think we're seeing more imperatives for those two to work closely together. And sometimes it's even driven by security, just to make sure that the data isn't leaking into other places or that it's secure and that they've met regulatory compliance. So we're in a much better space than we were two, three, five years ago cuz we're thinking about the real problems now. Not just how do you collect it, and how do you store it. But how do we actually make it an actionable, manageable set of solutions. >> Exactly, and make it work for the business. Well Maribel, I wish we had more time, but thank you so much for stopping by theCUBE, sharing the insights that you've seen. Not just at a conference, but also with your clients. >> Thank you. >> We want to thank you for watching theCUBE. Again, I'm Lisa Martin, live from Big Data SV, in Downtown San Jose. Get involved in the conversation #BigDataSV. Come see us at the Forager Eatery & Tasting Room, and I'll be right back with our next guest. (upbeat music)

Published Date : Mar 8 2018


ENTITIES

Entity | Category | Confidence
Lisa Martin | PERSON | 0.99+
Maribel | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Maribel Lopez | PERSON | 0.99+
San Jose | LOCATION | 0.99+
Google | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
15 salespeople | QUANTITY | 0.99+
SiliconAngle Media | ORGANIZATION | 0.99+
85 percent | QUANTITY | 0.99+
95 percent | QUANTITY | 0.99+
Gartner | ORGANIZATION | 0.99+
one issue | QUANTITY | 0.99+
two | QUANTITY | 0.99+
today | DATE | 0.99+
one | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
both | QUANTITY | 0.98+
Strata Data Conference | EVENT | 0.98+
Big Data SV | ORGANIZATION | 0.98+
second thing | QUANTITY | 0.98+
one product | QUANTITY | 0.98+
first thing | QUANTITY | 0.98+
three things | QUANTITY | 0.97+
once a year | QUANTITY | 0.97+
second | QUANTITY | 0.96+
This year | DATE | 0.96+
One | QUANTITY | 0.96+
First | QUANTITY | 0.96+
theCUBE | ORGANIZATION | 0.96+
Downtown San Jose | LOCATION | 0.96+
Strata | EVENT | 0.94+
two interesting things | QUANTITY | 0.94+
five years ago | DATE | 0.94+
Big Data | ORGANIZATION | 0.9+
couple days ago | DATE | 0.87+
couple of days ago | DATE | 0.85+
once | QUANTITY | 0.78+
#BigDataSV | ORGANIZATION | 0.75+
one place | QUANTITY | 0.75+
second place | QUANTITY | 0.75+
every couple of years | QUANTITY | 0.75+
Forager | LOCATION | 0.7+
Data | ORGANIZATION | 0.69+
Narrator: Live | TITLE | 0.69+
wave | EVENT | 0.68+
years past | DATE | 0.66+
three | QUANTITY | 0.66+
Alzheimer | OTHER | 0.66+
Big | EVENT | 0.65+
Hadoop | TITLE | 0.64+
Big Data SV | EVENT | 0.59+
Eatery & Tasting Room | ORGANIZATION | 0.57+
Lopez Research | ORGANIZATION | 0.55+
SV 2018 | EVENT | 0.54+
thing | QUANTITY | 0.53+
Lopez | ORGANIZATION | 0.49+

Kunal Agarwal, Unravel Data | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCube! Presenting Big Data: Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (techno music) >> Welcome back to theCube. We are live on our first day of coverage at our event BigDataSV. I am Lisa Martin with my co-host George Gilbert. We are at this really cool venue in downtown San Jose. We invite you to come by today, tonight for our cocktail party. It's called Forager Tasting Room and Eatery. Tasty stuff, really, really good. We are down the street from the Strata Data Conference, and we're excited to welcome to theCube a first-time guest, Kunal Agarwal, the CEO of Unravel Data. Kunal, welcome to theCube. >> Thank you so much for having me. >> So, I'm a marketing girl. I love the name Unravel Data. (Kunal laughs) >> Thank you. >> Two year old company. Tell us a bit about what you guys do and why that name... What's the implication there with respect to big data? >> Yeah, we are an application performance management company. And big data applications are just very complex. And the name Unravel is all about unraveling the mysteries of big data and understanding why things are not performing well and not really needing a PhD to do so. We're simplifying application performance management for the big data stack. >> Lisa: Excellent. >> So, so, um, you know, one of the things that a lot of people are talking about with Hadoop, originally it was this cauldron of innovation. Because we had the "let a thousand flowers bloom" in terms of all the Apache projects. But then once we tried to get it into operation, we discovered there's a... >> Kunal: There's a lot of problems. (Kunal laughs) >> There's an overhead, there's a downside to it. >> Maybe tell us, tell us why you both need to know, you need to know how people have done this many, many times. >> Yeah. >> How you need to learn from experience and then how you can apply that even in an environment where someone hasn't been doing it for that long. 
>> Right. So, if I back up a little bit. Big data is powerful, right? It's giving companies an advantage that they never had, and data's an asset to all of these different companies. Now they're running everything from BI, machine learning, artificial intelligence, IOT, streaming applications on top of it for various reasons. Maybe it is to create a new product, to understand the customers better, etc. But as you rightly pointed out, when you start to implement all of these different applications and jobs, it's very, very hard. It's because big data is very complex. With that great power comes a lot of complexity, and what we started to see is a lot of companies, while they want to create these applications and provide that differentiation to their company, they just don't have enough expertise as well in house to go and write good applications, maintain these applications, and even manage the underlying infrastructure and cluster that all these applications are running on. So we took it upon ourselves where we thought, Hey, if we simplify application performance management and if we simplify ongoing management challenges, then these companies would run more big data applications, they would be able to expand their use cases, and not really be fearful of, Hey, we don't know how to go and solve these problems. Do we actually rely on our system that is so complex and new? And that's the gap that Unravel fills, which is we monitor and manage not only one component of the big data ecosystem, but like you pointed out, it's a, it's a full zoo of all of these systems. You have Hadoop, and you have Spark, and you have Kafka for data ingestion. You may have some NoSQL systems and newer MPP platforms as well. So the vision of Unravel is really to be that one place where you can come in and understand what's happening with your applications and your system overall and be able to resolve those problems in an automatic, simple way. 
>> So, all right, let's start at the concrete level of what a developer might get out of >> Kunal: Right. >> something that's wrapped in Unravel and then tell us what the administrator experiences. >> Kunal: Absolutely. So if you are a big data developer you've got in a business requirement that, Hey, go and make this application that understands our customers better, right? They may choose a tool of their liking, maybe Hive, maybe Spark, maybe Kafka for data ingestion. And what they'll do is they'll write an app first in dev, in their dev environment or the QA environment. And they'll say, Hey, maybe this application is failing, or maybe this application is not performing as fast as I want it to, or even worse that this application is starting to hog a lot of resources, which may slow down my other applications. Now to understand what's causing these kind of problems today developers really need a PhD to go and decipher them. They have to look at tons of raw logs, metrics, configuration settings and then try to stitch the story up in their head, trying to figure out what is the effect, what is the cause? Maybe it's this problem, maybe it's some other problem. And then do trial and error to try, you know, to solve that particular issue. Now what we've seen is big data developers come in a variety of flavors. You have the hardcore developers who truly understand Spark and Hadoop and everything, but then 80% of the people submitting these applications are data scientists or business analysts, who may understand SQL, who may know Python, but don't necessarily know what distributed computing and parallel processing and all of these things really are, and where can inefficiencies and problems really lie. 
So we give them this one view, which will connect all of these different data sources and then tell them in plain English, this is the problem, this is why this problem happened, and this is how you can go and resolve it, thereby getting them unstuck and making it very simple for them to go in and get the performance that they're getting. >> So, these, these, um, they're the developers up front and you're giving them a whole new, sort of, toolchain or environment to solve the operational issues. >> Kunal: Right. >> So that, if it's DevOps, really dev is much more self-sufficient. >> Yes, yes, I mean, all companies want to run fast. They don't want to be slowed down. If you have a problem today, they'll file a ticket, it'll go to the operations team, you wait a couple of days to get some more information back. That just means your business has slowed down. If things are simple enough where the application developers themselves can resolve a lot of these issues, that'll get the business unstuck and get them moving on further. Now, to the other point which you were asking, which is what about the operations and the app support people? So, Unravel's a great tool for them too because that helps them see what's happening holistically in the cluster. How are other applications behaving with each other? It's usually a multitenant, multiapplication environment that these big data jobs are running on. So, are my apps slowing down George's apps? Am I stealing resources from your applications? More so, not just about an individual application issue itself. So Unravel will give you visibility into each app, as well as the overall cluster to help you understand cluster-wide problems. >> Love to get at, maybe peel apart your target audience a little bit. You talked about DevOps. But also the business analysts, data scientists, and we talk about big data. 
Data has such tremendous power to fuel a company and, you know, like you said, use it to create and deliver new products. Are you talking with multiple audiences within a company? Do you start at DevOps and they bring in their peers? Or do you actually start, maybe, at the Chief Data Officer level? What's that kind of entrance for Unravel? >> So the word I use to describe this is DataOps, instead of DevOps, right? So in the older world you had developers, and you had operations people. Over here you have a data team and operations people, and that data team can comprise of the developers, the data scientists, the business analysts, etc., as well. But you're right. Although we first target the operations role because they have to manage and monitor the system and make sure everything is running like a well-oiled machine, they are now spreading it out to the end-users, meaning the developers themselves, saying, "Don't come to me for every problem. Look at Unravel, try to solve it here, and if you cannot, then come to me." This is all, again, improving agility within the company, making sure that people have the necessary tools and insights to carry on with their day. >> Sounds like an enabler, >> Yeah, absolutely. >> That operations would push down to the DevOps, the developers themselves. >> And even the managers and the CDOs, for example, they want to see their ROI that they're getting from their big data investments. They want to see, they have put in these millions of dollars, have got an infrastructure and these services set up, but how are we actually moving the needle forward? Are there any applications that we're actually putting in business, and is that driving any business value? So we will be able to give them a very nice dashboard helping them understand what kind of throughput are you getting from your system, how many applications were you able to develop last week and onboard to your production environment? 
And what's the rate of innovation that's really happening inside your company on those big data ecosystems? >> It sort of brings up an interesting question on two prongs. One is the well-known, but inexact number about how many big data projects, >> Kunal: Yeah, yeah. >> I don't know whether they fail or didn't pay off. So there's going in and saying, "Hey, we can help you manage this because it was too complicated." But then there's also the, all the folks who decided, "Well, we really don't want to run it all on-prem. We're not going to throw away everything we did there, but we're going to also put a lot of new investment >> Kunal: Exactly, exactly. >> in the cloud." Now, Wikibon has a term for that, which is true private cloud, which is when you have the operational processes that you use in the public cloud and you can apply them on-prem. >> Right. >> George: But there's not many products that help you do that. How can Unravel work...? >> Kunal: That's a very good question, George. We're seeing the world move more and more to a cloud environment, or I should say an on-demand environment where you're not so bothered about the infrastructure and the services, but you want Spark as a dial tone. You want Kafka as a dial tone. You want a machine-learning platform as a dial tone. You want to come in there, you want to put in your data, and you want to just start running it. Unravel has been designed from the ground up to monitor and manage any of these environments. So, Unravel can solve problems for your applications running on-premise and similarly all the applications that are running on cloud. Now, on the cloud there are other levels of problems as well so, of course, you'd have applications that are slow, applications that are failing; we can solve those problems. 
But if you look at a cloud environment, a lot of these now provide you an autoscaling capability, meaning, Hey, if this app doesn't run in the amount of time that we were hoping it to run, let's add extra hardware and run this application. Well, if you just keep throwing machines at the problem, it's not going to solve your issue. Now, it doesn't decrease the time that it will take linearly with how many servers that you're actually throwing in there, so what we can help companies understand is what is the resource requirement of a particular application? How should we be intelligently allocating resources to make sure that you're able to meet your time SLAs, your constraints of, here I need to finish this within x number of minutes, but at the same time be intelligent about how much cost you're spending over there. Do you actually need 500 containers to go and run this app? Well, you may have needed 200. How do you know that? So, Unravel will also help you get efficient with your run, not just faster, but also can it be a good multitenant citizen, can it use limited resources to actually run these applications as well? >> So, Kunal, some of the things I'm hearing from a customer's standpoint that are potential positive business outcomes are internal: performance boost. >> Kunal: Yeah. >> It also sounds like, sort of... productivity improvements internally. >> And then also the opportunity to have the insight to deliver new products, but even I'm thinking of, you know, helping make a retailer, for example, be able to do more targeted marketing, so the business outcomes and the impact that Unravel can make really seem to have pretty strong internal and external benefits. >> Kunal: Yes. >> Is there a favorite customer story, (Kunal laughs) don't have to mention names, that you really think speaks to your capabilities? >> So, 100%. Improving performance is a very big factor of what Unravel can do. 
Decreasing costs by improving productivity, by limiting the amount of resources that you're using, is a very, very big factor. Now, amongst all of these companies that we work with, one key factor is improving reliability, which means, Hey, it's fine that he can speed up this application, but sometimes I know the latency that I expect from an app, maybe it's a second, maybe it's a minute, depending on the type of application. But what businesses cannot tolerate is this app taking five x amount more time today. If it's going to finish in a minute, tell me it'll finish in a minute and make sure it finishes in a minute. And this is a big use case for all of the big data vendors because a lot of the customers are moving from Teradata, or from Vertica, or from other relational databases, on to Hortonworks or Cloudera or Amazon EMR. Why? Because it's one tenth the amount of cost for running these workloads. But, all the customers get frustrated and say, "I don't mind paying 10 x more money, but because over there it used to work. Over here, there are just so many complications, and I don't have reliability with these applications." So that's a big, big factor of, you know, how we actually help these customers get value out of the Unravel product. >> Okay, so, um... A question I'm, sort of... why aren't there so many other Unravels? >> Kunal: Yeah. (Kunal laughs) >> From what I understood from past conversations. >> Kunal: Yeah. >> You can only really build the models that are at the heart of your capabilities based on tons and tons of telemetry >> Kunal: Yeah. >> that cloud providers or, or, sort of, internet scale service providers have accumulated in that, because they all have sort of a well-known set of configurations and well-known kind of topology. In other words, there're not a million degrees of freedom on any particular side that you can, you have a well-scoped problem, and you have tons of data. So it's easier to build the models. 
So who, who else could do this? >> Yeah, so the difference between Unravel and other monitoring products is Unravel is not a monitoring product. It's an intelligent performance management suite. What that means is we don't just give you graphs and metrics and say, "Here is all the raw information, you go figure it out." Instead, we have to take it a step further where we are actually giving people answers. In order to develop something like that, you need full stack information; that's number one. Meaning information from applications all the way down to infrastructure and everything in between. Why? Because problems can lie anywhere. And if you don't have that full stack info, you're blind-siding yourself, or limiting the scope of the problems that you can actually search for. Secondly is, like you were rightly pointing out, how do I create answers from all this raw data? So you have to think like how an expert with big data would think, which is if there is a problem what are the kinds of checks, balances, places that that person would look into, and how would that person establish that this is indeed the root cause of the problem today? And then, how would that person actually resolve this particular problem? So, we have a big team of scientists, researchers. In fact, my co-founder is a professor of computer science at Duke University who has been researching database optimization techniques for the last decade. We have about 80 plus publications in this area, Starfish being one of them. We have a bunch of other publications, which talk about how do you automate problem discovery, root cause analysis, as well as resolution, to get best performance out of these different databases? And you're right. A lot of work has gone on the research side, but a lot of work has gone in understanding the needs of the customers. 
So we worked with some of the biggest companies out there, which have some of the biggest big data clusters, to learn from them, what are some everyday, ongoing management challenges that you face, and then taking that problem to our datasets and figuring out, how can we automate problem discovery? How can we proactively spot a lot of these errors? I joke around and I tell people that we're big data for big data. Right? All these companies that we serve, they are gathering all of this data, and they're trying to find patterns, and they're trying to find, you know, some sort of an insight with their data. Our data is system generated data, performance data, application data, and we're doing the exact same thing, which is figuring out inefficiencies, problems, cause and effect of things, to be able to solve it in a more intelligent, smart way. >> Well, Kunal, thank you so much for stopping by theCube >> Kunal: Of course. >> And sharing how Unravel Data is helping to unravel the complexities of big data. (Kunal laughs) >> Thank you so much. Really appreciate it. >> Now you're a Cube alumni. (Kunal laughs) >> Absolutely. Thanks so much for having me. >> Kunal, thanks. >> Yeah, and we want to thank you for watching the Cube. I'm Lisa Martin with George Gilbert. We are live at our own event BigData SV in downtown San Jose, California. Stick around. George and I will be right back with our next guest. (quiet crowd noise) (techno music)

Published Date : Mar 8 2018


ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Kunal Agarwal | PERSON | 0.99+
George | PERSON | 0.99+
Kunal | PERSON | 0.99+
Lisa | PERSON | 0.99+
80% | QUANTITY | 0.99+
Hortonworks | ORGANIZATION | 0.99+
100% | QUANTITY | 0.99+
Vertica | ORGANIZATION | 0.99+
Unravel Data | ORGANIZATION | 0.99+
Teradata | ORGANIZATION | 0.99+
today | DATE | 0.99+
500 containers | QUANTITY | 0.99+
One | QUANTITY | 0.99+
Two year | QUANTITY | 0.99+
two prongs | QUANTITY | 0.99+
last week | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
tonight | DATE | 0.99+
200 | QUANTITY | 0.99+
first day | QUANTITY | 0.99+
San Jose | LOCATION | 0.99+
Spark | TITLE | 0.99+
Cloudera | ORGANIZATION | 0.99+
each app | QUANTITY | 0.99+
Python | TITLE | 0.98+
a minute | QUANTITY | 0.98+
English | OTHER | 0.98+
one | QUANTITY | 0.98+
Duke University | ORGANIZATION | 0.98+
five | QUANTITY | 0.98+
Kafka | TITLE | 0.98+
Hadoop | TITLE | 0.98+
BigData SV | EVENT | 0.97+
first-time | QUANTITY | 0.97+
Strata Data Conference | EVENT | 0.97+
one key factor | QUANTITY | 0.96+
millions of dollars | QUANTITY | 0.95+
about 80 plus publications | QUANTITY | 0.95+
SQL | TITLE | 0.95+
DevOps | TITLE | 0.94+
first | QUANTITY | 0.94+
BigDataSV | EVENT | 0.94+
tons and tons | QUANTITY | 0.94+
both | QUANTITY | 0.94+
Unravel | ORGANIZATION | 0.93+
Secondly | QUANTITY | 0.91+
million degrees | QUANTITY | 0.91+
San Jose, California | LOCATION | 0.91+
Hive | TITLE | 0.91+
last decade | DATE | 0.91+
Unravel | TITLE | 0.9+

Guy Churchward, DataTorrent | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data, Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE. Our continuing coverage of our event, Big Data SV, continues, this is our first day. We are down the street from the Strata Data Conference. Come by, we're at this really cool venue, the Forager Tasting Room. We've got a cocktail party tonight. You're going to hear some insights there as well as tomorrow morning. I am Lisa Martin, joined by my co-host, George Gilbert, and we welcome back to theCUBE, for I think the 900 millionth time, the president and CEO of DataTorrent, Guy Churchward. Hey Guy, welcome back! >> Thank you, Lisa, I appreciate it. >> So you're one of our regular VIP's. Give us the update on DataTorrent. What's new, what's going on? >> We actually talked to you a couple of weeks ago. We did a big announcement which was around 3.10, so it's a new release that we have. In all small companies, and we're a small startup, in the big data and analytic space, there is a plethora of features that I can reel through. But it actually makes something a little bit more fundamental. So in the last year... In fact, I think we chatted with you maybe six months ago. We've been looking very carefully at how customers purchase and what they want and how they execute against technology, and it's very very different to what I expected when I came into the company about a year ago off the EMC role that I had. And so, although the features are there, there's a huge amount of underpinning around the experience that a customer would have around big data applications. I'm reminded of, I think it's Gartner that quoted that something like 80% of big data applications fail. And this is one of the things that we really wanted to look at. 
We have very large customers in production, and we did the analysis of what are we doing well with them, and why can't we do that en masse, and what are people really looking for? So that was really what the release was about. >> Let's elaborate on this a little bit. I want to drill into something where you said many projects, as we've all heard, have not succeeded. There's a huge amount of complexity. The terminology we use is, without tarring and feathering any one particular product, the open source community is kind of like, you're sort of harnessing a couple dozen animals and a zookeeper that works in triplicate... How does DataTorrent tackle that problem? >> Yeah, I mean, in fact I was desperately interested in writing a blog recently about using the word community after open source, because in some respects, there isn't a huge community around the open source movement. What we find is it's the du jour way in which we want to deliver technology, so I have a huge amount of developers that work on a thing called Apache Apex, which is a component in a solution, or in an architecture and in an outcome. And we love what we do, and we do the best we do, and it's better than anybody else's thing. But that's not an application, that's not an outcome. And what happens is, we kind of don't think about what else a customer has to put together, so then they have to go out to the zoo and pick loads of bits and pieces and then try to figure out how to stitch them all together in the best they can. And that takes an inordinately long time. And, in general, people who love this love tinkering with technologies, and their projects never get to production. And large enterprises are used to sitting down and saying, "I need a bulletproof application. It has to be industrialized. I need a full SLA on the back of it. This thing has to have lights out technology. And I need it quick."
Because that was the other thing, as an aspect, is this market is moving so fast, and you look at things like digital economy or any other buzz term, but it really means that if you realize you need to do something, you're probably already too late. And therefore, you need it speedy, expedited. So the idea of being able to wait for 12 months, or two years for an application, also makes no sense. So the arc of this is basically deliver an outcome, don't try and change the way in which open source is currently developed, because they're in components, but embrace them. And so what we did is we sort of looked at it and said, "Well what do people really want to do?" And it's big data analytics, and I want to ingest a lot of information, I want to enrich it, I want to analyze it, and I want to take actions, and then I want to go park it. And so, we looked at it and said, "Okay, so the majority of stuff we need is what we call a KASH stack, which is Kafka, Apache Apex, Spark and Hadoop, and then put complex compute on top." So you would have heard of terms like machine learning, and dimensional compute, so we have their modules. So we actually created an opinionated stack... Because otherwise you have a thousand to choose from and people get confused with choice. I equate it to going into a menu at a restaurant, there's two types of restaurants, you walk into one and you can turn pages and pages and pages and pages of stuff, and you think that's great, I got loads of choice, but the choice kind of confuses you. And also, there's only one chef at the back, and he can't cook everything well. So you know if he chooses the components and puts them together, you're probably not going to get the best meal. And then you go to restaurants that you know are really good, they generally give you one piece of paper and they say, "Here's your three entrees." And you know every single one of them.
It's not a lot of choice, but at the end of the day, it's going to be a really good meal. >> So when you go into a customer... You're leading us to ask you the question which is, you're selling the prix fixe tasting menu, and you're putting all the ingredients together. What are some of those solutions and then, sort of, what happens to the platform underneath? >> Yeah, so what you don't want to do is to take these flexible, microdata services, which are open source projects, and hard glue them together to create an application that then has no flexibility. Because, again, one of the myths that I used to assume is applications would last us seven to 10 years. But what we're finding in this space is this movement towards consumerization of enterprise applications. In other words, I need an app and I need it tomorrow because I'm competitively disadvantaged, but it might be wrong, so I then need to adjust it really quick. It's this idea of continual development, continual adjustment. But that flies in the face of all of this gluing and enterprise-ilities. And I want to base it on open source, and open source, by default, doesn't glue well together. And so what we did is we said okay, not only do you have to create an opinionated stack, and you do that because you want them all to scale into all industries, and they don't need a huge amount of choice, just pick best of breed. But you need to then put a sleeve around them so they all act as though they are a single application. And so we actually announced a thing called Epoxy. It's a bit of a riff on gluing, but it's called DataTorrent Epoxy. So we have, it's like a microdata service bus, and you can then interchange the components. For instance, right now, Apache Apex is this stream-based processing engine in that component. But if there's a better unit, we're quite happy to pull it out, chuck it away, and then put another one in.
This isn't a ubiquitous snap-on toolset, because, again, the premise is use open source, get the innovation from there. It has to be bulletproof and enterprise-ility and move really fast. So those are the components I was working on. >> Guy, as CEO, I'm sure you speak with a lot of customers often. What are some of the buying patterns that you're seeing across industries, and what is some of the major business value that DataTorrent can help deliver to your customers? >> The buying patterns when we get involved, and I'm kind of breaking this down into a slightly different way, because we normally get involved when a project's in flight, one of the 80% that's failing, and in general, it's driven by a strategic business partner that has an agenda. And what you see is proprietary application vendors will say, "We can solve everything for you." So they put the tool in and realize it doesn't have the flexibility, it does have enterprise-ility, but it can't adjust fast. And then you get the other type who say, "Well we'll go to a distro or we'll go to a general purpose practitioner, and they'll build an application for us." And they'll take open source components, but they'll glue it together with proprietary mush, and then that doesn't then grow past. And then you get the other ones, which is, "Well if I actually am not guided by anybody, I'll buy a bunch of developers, stick them in my company, and I've got control on that." But they fiddle around a lot. So we arrive in and, in general, they're in this middle process of saying, "I'm at a competitive disadvantage, I want to move forward and I want to move forward fast, and we're working on one of those three channels." The types of outcomes, we just, and back to the expediency of this, we had a telco come to us recently, and it was just before the iPhone X launched, and they wanted to do A/B testing on the launch on their platform. We got them up and running within three months.
Subsequent to that launch, they then repurposed the platform and some of the components with some augmentation, and they've come out with three further applications. They've all gone into production. So the idea is then these fast cycles of microdata services being stitched together with the Epoxy resin type approach-- >> So faster time to value, lower TCO-- >> Exactly. >> Being able to get to meet their customers' needs faster-- >> Exactly, so it's outcome-based and time to value, and it's time to proof. Because this is, again, the thing that Gartner picked up on, is Hadoop's difficult, this market's complex and people kick the tires a lot. And I sort of joke with customers, "Hey if you want to obsess about components rather than the outcome, then your successor will probably come see us once you're out and your group's failed." And I don't mean that in an obnoxious way. It's not just DataTorrent that solves this same thing, but this is the movement, right? Deal with open source, get enterprise-ilities, get us up and running within a quarter or two, and then let us have some use and agile repurposing. >> Following on that, just to understand going in with a solution to an economic buyer, but then having the platform be reusable, is it opinionated and focused on continuous processing applications, or does it also address both the continuous processing and batch processing? >> Yeah, it's a good answer. In general, and again Gatekeeper, you've got batch and you've got realtime and streaming, and so we deal with data in motion, which is stream-based processing. A stream-based processing engine can deal with batch as well, but a batch engine cannot deal with streams. >> George: So you do both-- >> Yeah >> And the idea being that you can have one programming model for both. >> Exactly. >> It's just a window, batch is just a window.
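The "batch is just a window" remark can be illustrated with a small sketch. This is not DataTorrent or Apache Apex code; the function name `tumbling_windows` and the sample events are hypothetical, chosen only to show how one programming model might cover both cases: a stream is cut into fixed-size windows, and a batch job is the same computation with a single unbounded window.

```python
from collections import defaultdict


def tumbling_windows(events, size=None):
    """Sum (timestamp, value) events per fixed-size window.

    size=None is treated as one unbounded window, i.e. a classic batch job.
    This helper is illustrative and not part of any real streaming library.
    """
    windows = defaultdict(list)
    for ts, value in events:
        # Every event lands in exactly one window, keyed by its window index.
        key = 0 if size is None else ts // size
        windows[key].append(value)
    return {key: sum(values) for key, values in sorted(windows.items())}


# Hypothetical events: (timestamp in seconds, amount)
events = [(0, 5), (3, 1), (7, 2), (12, 4)]

streaming_totals = tumbling_windows(events, size=5)  # one total per 5-second window
batch_total = tumbling_windows(events)               # one window over everything
```

The per-window totals always add up to the batch total, which is the sense in which a stream engine can subsume batch while the reverse does not hold.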
>> And the other thing is, a myth bust, is for the last maybe eight plus years, companies assume that the first thing you do in big data analytics is collect all the data, create a data lake, and so they go in there, they ingest the information, they put it into a data lake, and then they poke the data lake posthumously. But the data in the data lake is, by default, already old. So the latency of sticking it into a data lake and then sorting it, and then basically poking it, means that if anybody deals with the data that's in motion, you lose. Because I'm analyzing as it's happening and then you would be analyzing it after at rest, right? So now the architecture of choice is ingest the information, use high performance storage and compute, and then, in essence, ingest, normalize, enrich, analyze, and act on data in motion, in memory. And then when I've used it, then throw it off into a data lake because then I can basically do posthumous analytics and use that for enrichment later. >> You said something also interesting where the DataTorrent customers, the initial successful ones sort of tended to be larger organizations. Those are typically the ones with skillsets to, if anyone's going to be able to put pieces together, it's those guys. Have you not... Well, we always expected big data applications, or sort of adaptive applications, to go mainstream when they were either packaged apps to take all the analysis and embed it, or when you had end to end integrated products to make it simple. Where do you think, what's going to drive this mainstream? >> Yeah, it depends on how mainstream you want mainstream. It's kind of like saying how fast is a fast car. If you want a contractor that comes into IT to create a dashboard, go buy Tableau, and that's mainstream analytics, but it's not. It's mainstream dashboarding of data. The applications that we deal with, by default, the more complex data, they're going to be larger organizations. 
Don't misunderstand when I say, "We deal with these organizations." We don't have a professional services arm. We work very closely with people like HCL, and we do have a jumpstart team that helps people get there. But our job is to teach someone, it's like a kid with a bike and the training wheels, our job is to teach them how to ride the bike, and kick the wheels off, and step away. Because what we don't want to do is to put a professional services drip feed into them and just keep sucking the money out. Our job is to get them there. Now, we've got one company who actually are going to go live next month, and it's a kid tracker, you know like a GPS one that you put on bags and with your kids, and it'll be realtime tracking for the school and also for the individuals. And they had absolutely zero Hadoop experience when we got involved with them. And so we've brought them up, we've helped them with the application, we've kicked the wheels off and now they're going to be sailing. I would say, in a year's time, they're going to be comfortable to just ignore us completely, and in the first year, there's still going to be some handholding and covering up a bruise as they fall off the bike every so often. But that's our job, it's IP, technology, all about outcomes and all about time to value. >> And from a differentiation standpoint, that ability to enable that self service and kick off the training wheels, is that one of the biggest differentiators that you find DataTorrent has, versus the Tableaus and the other competitors on the market? >> I don't want to say there's no one doing what we're doing, because that will sound like we're doing something odd. But there's no one doing what we're doing. And it's almost like Tesla. Are they an electric car or are they a platform? They've spurred an industry on, and Uber did the same thing, and Lyft's done something and Airbnb has. And what we've noticed is customers' buying patterns are very specific now.
Use open source, get up their enterprise-ilities, and have that level of agility. Nobody else is really doing that. The only people that will do that is, you contract with someone like Hortonworks or a Cloudera, and actually pay them a lot of money to build the application for you. And our job is really saying, "No, instead of you paying them on professional services, we'll give you the sleeve, we'll make it a little bit more opinionated, and we'll get you there really quickly, and then we'll set you free." And so that's one. We have a thing called the Application Factory. That's the snap on toolset where they can literally go to a GUI and say, "I'm in the financial market, I want a fraud prevention application." And we literally then just self assemble the stack, they can pick it up, and then put their input and output in. And then, as we move forward, we'll have partners who are building bespoke applications in verticals, and they will put them up on our website, so the customers can come in and download them. Everything is subscription software. >> Fantastic, I wish we had more time, but thanks so much for finding some time today to come by theCUBE, tell us what's new, and we look forward to seeing you on the show again very soon. >> I appreciate it, thank you very much. >> We want to thank you for watching theCUBE. Again, Lisa Martin with my co-host George Gilbert, we're live at our event, Big Data SV, in downtown San Jose, down the street from the Strata Data Conference. Stick around, George and I will be back after a short break with our next guest. (light electronic jingle)
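The flow described in this interview (ingest, normalize, enrich, analyze and act on data in motion, then park it in a data lake) can be sketched as a chain of small stages. This is a minimal illustration under stated assumptions, not DataTorrent's product or API: every function name, the record fields, the `profiles` lookup, and the fraud threshold are hypothetical. Each stage is a generator, so records are handled one at a time in memory and only reach the "lake" after they have been acted on.

```python
import json


def ingest(lines):
    """Parse raw JSON lines as they arrive (hypothetical input format)."""
    for line in lines:
        yield json.loads(line)


def normalize(records):
    """Bring fields into a consistent shape before any analysis."""
    for r in records:
        yield {"user": r["user"].strip().lower(), "amount": float(r["amount"])}


def enrich(records, profiles):
    """Attach reference data; `profiles` stands in for any lookup source."""
    for r in records:
        yield {**r, "tier": profiles.get(r["user"], "unknown")}


def analyze(records, threshold):
    """Flag records while they are still in motion (toy fraud rule)."""
    for r in records:
        yield {**r, "suspicious": r["amount"] > threshold}


def act_and_park(records, lake):
    """Act on flagged records first, then park everything for later analysis."""
    alerts = []
    for r in records:
        if r["suspicious"]:
            alerts.append(r["user"])
        lake.append(r)  # the list `lake` stands in for a data lake
    return alerts


raw = ['{"user": " Ana ", "amount": "120.5"}', '{"user": "bo", "amount": "9"}']
lake = []
stages = analyze(enrich(normalize(ingest(raw)), {"ana": "gold"}), threshold=100)
alerts = act_and_park(stages, lake)
```

Because each stage only depends on the shape of the records passing through it, any one of them could be swapped for a different implementation, which is the interchangeability the Epoxy description points at.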

Published Date : Mar 8 2018

SUMMARY :

presenting Big Data, Silicon Valley, brought to you and we welcome back to theCUBE, So you're one of our regular VIP's. and we did the analysis of what are we doing well with them, I want to drill into something where you said many projects, So the idea of being able to wait for 12 months, So when you go into a customer... And so what we did is we said okay, not only do you have What are some of the buying patterns that you're seeing And then you get the other ones, which is, And I sort of joke with customers, "Hey if you want to and so we deal with data in motion, And the idea being that you can have one and then you would be analyzing it after at rest, right? or when you had end to end integrated products and now they're going to be sailing. and actually pay them a lot of money to build and we look forward to seeing you We want to thank you for watching theCUBE.


ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
two years | QUANTITY | 0.99+
George | PERSON | 0.99+
12 months | QUANTITY | 0.99+
Uber | ORGANIZATION | 0.99+
AirBNB | ORGANIZATION | 0.99+
Lisa | PERSON | 0.99+
Tesla | ORGANIZATION | 0.99+
80% | QUANTITY | 0.99+
two types | QUANTITY | 0.99+
Gartner | ORGANIZATION | 0.99+
Hortonworks | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
iPhone X | COMMERCIAL_ITEM | 0.99+
DataTorrent | ORGANIZATION | 0.99+
seven | QUANTITY | 0.99+
Guy Churchward | PERSON | 0.99+
tomorrow morning | DATE | 0.99+
Lyft | ORGANIZATION | 0.99+
last year | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
six months ago | DATE | 0.99+
next month | DATE | 0.99+
three months | QUANTITY | 0.99+
both | QUANTITY | 0.99+
one | QUANTITY | 0.98+
EMC | ORGANIZATION | 0.98+
first day | QUANTITY | 0.98+
tonight | DATE | 0.98+
Silicon Valley | LOCATION | 0.98+
tomorrow | DATE | 0.98+
one chef | QUANTITY | 0.98+
10 years | QUANTITY | 0.98+
one piece | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
Cloudera | ORGANIZATION | 0.97+
three entrees | QUANTITY | 0.97+
Strata Data Conference | EVENT | 0.97+
first thing | QUANTITY | 0.97+
first year | QUANTITY | 0.96+
single application | QUANTITY | 0.96+
today | DATE | 0.95+
couple of weeks ago | DATE | 0.95+
telco | ORGANIZATION | 0.95+
900 millionth time | QUANTITY | 0.95+
one company | QUANTITY | 0.94+
HCL | ORGANIZATION | 0.94+
a quarter | QUANTITY | 0.94+
DataTorret | ORGANIZATION | 0.93+
three channels | QUANTITY | 0.93+
two | QUANTITY | 0.92+
Big Data SV | EVENT | 0.92+
Big Data SV 2018 | EVENT | 0.91+
three further applications | QUANTITY | 0.86+
Apex | TITLE | 0.84+
a year | QUANTITY | 0.82+
Tableau | ORGANIZATION | 0.81+
Hadoop | PERSON | 0.81+
about a year ago | DATE | 0.8+
couple dozen animals | QUANTITY | 0.8+
product | QUANTITY | 0.78+
eight plus years | QUANTITY | 0.77+
Apache | ORGANIZATION | 0.76+
agile | TITLE | 0.76+
Guy | PERSON | 0.73+
Epoxy | ORGANIZATION | 0.71+
Tableau | TITLE | 0.71+
DataTorrent | PERSON | 0.7+
around 3.10 | DATE | 0.69+
Spark | TITLE | 0.68+
restaurants | QUANTITY | 0.66+
Gatekeeper | TITLE | 0.66+
model | QUANTITY | 0.63+

Matthew Baird, AtScale | Big Data SV 2018


 

>> Announcer: Live from San Jose. It's theCUBE, presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media, and its ecosystem partners. (techno music) >> Welcome back to theCUBE, our continuing coverage on day one of our event, Big Data SV. I'm Lisa Martin with George Gilbert. We are down the street from the Strata Data Conference. We've got a great, a lot of cool stuff going on. You can see the cool set behind me. We are at Forager Tasting Room & Eatery. Come down and join us, be in our audience today. We have a cocktail event tonight, who doesn't want to join that? And we have a nice presentation tomorrow morning of Wikibon's 2018 Big Data Forecast and Review. Joining us next is Matthew Baird, the co-founder of AtScale. Matthew, welcome to theCUBE. >> Thanks for having me. Fantastic venue, by the way. >> Isn't it cool? >> This is very cool. >> Yeah, it is. So, talking about Big Data, you know, Gartner says, "85% of Big Data projects have failed." I often say failure is not a bad F word, because it can spawn the genesis of a lot of great business opportunities. Data lakes were big a few years ago, turned into swamps. AtScale has this vision of Data Lake 2.0, what is that? >> So, you're right. There have been a lot of failures, there's no doubt about it. And you're also right that is how we evolve, and we're a Silicon Valley based company. We don't give up when faced with these things. It's just another way to not do something. So, what we've seen and what we've learned through our customers is they need to have a solution that is integrated with all the technologies that they've adopted in the enterprise. And it's really about, if you're going to make a data lake, you're going to have data on there that is the crown jewels of your business. How are you going to get that in the hands of your constituents, so that they can analyze it, and they can use it to make decisions?
And how can we, furthermore, do that in a way that supplies governance and auditability on top of it, so that we aren't just sending data out into the ether and not knowing where it goes? We have a lot of customers in the insurance, health insurance space, and with financial customers, where the data absolutely must be managed. I think one of the biggest changes is around that integration with the current technologies. There's a lot of movement into the Cloud. The new data lake is kind of focused more on these large data stores, where it was HDFS with Hadoop. Now it's S3, Google's object storage, and Azure ADLS. Those are the sorts of things that are backing the new data lake I believe. >> So if we take these, where the Data Lake Store didn't have to be something that's an open source HDFS implementation, it could even be just through an HDFS API. >> Matthew: Yeah, absolutely. >> What are some of the, how should we think about the data sources and feeds, for this repository, and then what is it on top that we need to put to make the data more consumable? >> Yeah, that's a good point. S3, Google Object Storage, and Azure, they all have a characteristic of, they are large stores. You can store as much as you want. They're generally on the clouds, and in open source, on-prem, the software for landing the data exists, for streaming the data and landing it, but the important thing there is it's cost-effective. S3 is a cost-effective storage system. HDFS is a mostly cost-effective storage system. You have to manage it, so it has a slightly higher cost, but the advice has been, get it to the place you're going to store it. Store it in a unified format. You get a halo effect when you have a unified format, and I think the industry is coalescing around...
I'd probably say Parquet's in the lead right now, but once Parquet can be read by, let's take Amazon for instance, can be read by Athena, can be read by Redshift Spectrum, it can be read by their EMR, now you have this halo effect where your data's always there, always available to be consumed by a tool or a technology that can then deliver it to your end users. >> So when we talk about Parquet, we're talking about a columnar serialization format, >> Matthew: Yes. >> but there's more on top of that that needs to be layered, so that you can, as we were talking about earlier, combine the experience of a data warehouse, and the curated >> Absolutely >> data access where there's guard rails, >> Matthew: Yes >> and it's simple, versus sort of the wild west, but where I capture everything in a data lake. How do you bring those two together? >> Well, specifically for AtScale, we allow you to integrate multiple data access tools in AtScale, and then we use the appropriate tool to access the data for the use case. So let me give you an example, in the Amazon case, Redshift is wonderful for accessing interactive data, which BI users want, right? They want fast queries, sub-second queries. They don't want to pay to have all the raw data necessarily stored in Redshift 'cause that's pretty expensive. So they have this Redshift Spectrum, it's sitting in S3, that's cost effective. So when we go and we read raw data to build these summary tables, to deliver the data fast, we can read from Spectrum, we can put it all together, drop it into Redshift, a much smaller volume of data, so it has faster characteristics for being accessed. And it delivers it to the user that way. We do that in Hadoop when we access via Hive for building aggregate tables, but Spark or Impala is a much faster interactive engine, so we use those. As I step back and look at this, I think the Data Lake 2.0, from a technical perspective is about abstraction, and abstraction's sort of what separates us from the animals, right?
It's a concept where we can pack a lot of sophistication and complexity behind an interface that allows people to just do what they want to do. You don't know how, or maybe you do know how a car engine works, I don't really, kind of, a little bit, but I do know how to press the gas pedal and steer. >> Right. >> I don't need to know these things, and I think the Data Lake 2.0 is about, well I don't need to know how Sentry, or Ranger, or Atlas, or any of these technologies work. I need to know that they're there, and when I access data, they're going to be applied to that data, and they're going to deliver me the stuff that I have access to and that I can see. >> So a couple things, it sounded like I was hearing abstraction, and you said really that's kind of the key, that sounds like a differentiator for AtScale, is giving customers that abstraction they need. But I'm also curious from a data value perspective, you talked about Redshift from an expense perspective. Do you also help customers gain abstraction by helping them evaluate value of data and where they ought to keep it, and then you give them access to it? Or is that something that they need to do, kind of bring to the table? >> We don't really care, necessarily, about the source of the data, as long as it can be expressed in a way that can be accessed by whatever engine it is. Lift and shift is an example. There's a big move to move from Teradata or from Netezza into a Cloud-based offering. People want to lift it and shift it. It's the easiest way to do this. Same table definitions, but that's not optimized necessarily for the underlying data store. Take BigQuery for example, BigQuery's an amazing piece of technology. I think there's nothing like it out there in the market today, but if you really want BigQuery to be cost-effective, and perform and scale up to concurrency of... one of our customers is going to roll out about 8,000 users on this.
You have to do things in BigQuery that are BigQuery-friendly. The data structures, the way that you store the data, repeated values, those sorts of things need to be taken into consideration when you build your schema out for consumption. With AtScale they don't need to think about that, they don't need to worry about it, we do it for them. They drop the schema in the same way that it exists on their current technology, and then behind the scenes, what we're doing is we're looking at signals, we're looking at queries, we're looking at all the different ways that people access the data naturally, and then we restructure those summary tables using algorithms and statistics, and I think people would broadly call it ML type approaches, to build out something that answers those questions, and adapts over time to new questions, and new use cases. So it's really about, imagine you had the best data engineering team in the world, in a box, they're never tired, they never stop, and they're always interacting with what the customers really want, which is "Now I want to look at the data this way". >> It sounds actually like what you're talking about is you have a whole set of sources, and targets, and you understand how they operate, but when I say you, I mean your software. And so that you can take data from wherever it's coming in, and then you apply, if it's machine learning or whatever other capabilities to learn from the access methods, how to optimize that data for that engine. >> Matthew: Exactly. >> And then the end users have an optimal experience and it's almost like the data migration service that Amazon has, it's like, you give us your Postgres or Oracle database, and we'll migrate it to the cloud. It sounds like you add a lot of intelligence to that process for decision support workloads. >> Yes. >> And figure out, so now you're going to...
It's not Postgres to Postgres, but it might be Teradata to Redshift, or S3, that's going to be accessed by Athena or Redshift, and then let's put that in the right format. >> I think you sort of hit something that we've noticed is very powerful, which is if you can set up, and we've done this with a number of customers, if you can set up at the abstraction layer that is AtScale, on your on-prem data, literally in, say hours, you can move it into the Cloud, obviously you have to write the ETL to move it into the Cloud, but once it's in the Cloud you take the same AtScale instance, you re-point it at that new data source, and it works. We've done that with multiple customers, and it's fast and effective, and it lets you actually try out things that you may not have had the agility to do before because there's differences in how the SQL dialects work, there's differences in, potentially, how the schema might be built. >> So a couple things I'm interested in, I'm hearing two A-words, that abstraction that we've talked about a number of times, you also mentioned adaptability. So when you're talking with customers, what are some of the key business outcomes they need to drive, where adaptability and abstraction are concerned, in terms of like cost reduction, revenue generation. What are some of those C-suite business objectives that AtScale can help companies achieve? >> So looking at, say, a customer, a large retailer on the East Coast, everybody knows the stores, they're everywhere, they sell hardware. They have a 20-terabyte cube that they use for day-to-day revenue analytics. So they do period over period analysis. When they're looking at stores, they're looking at things like, we just tried out a new marketing approach... I was talking to somebody there last week about how they have these special stores where they completely redo one area and just see how that works. They have to be able to look at those analytics, and they run those for a short amount of time.
So if your window for getting data, refreshing data, building cubes, which in the old world could take a week, you know my co-founder at Yahoo, he had a week and a half build time. That data is now two weeks old, maybe three weeks old. There might be bugs in it-- >> And the relevance might be, pshh... >> And the relevance goes down, or you can't react as fast. I've been at companies where... Speed is so important these days, and the new companies that are grasping data aggressively, putting it somewhere where they can make decisions on it on a day-to-day basis, they're winning. And they're spending... I was at a company that was spending three million dollars on pay-per-click data, a month. If you can't get data every day, you're on the wrong campaigns, and everything goes off the rails, and you only learn about it a week later, that's 25% of your spend, right there, gone. >> So the biggest thing, sorry George, it really sounds to me like what AtScale can facilitate for probably customers in any industry is the ability to truly make data-driven business decisions that can really directly affect revenue and profit. >> Yes, and in an agile format. So, you can build-- >> That's the third A; agile, adaptability, abstraction. >> There ya go, the three A's. (Lisa laughs) We had the three V's, now we have the three A's. >> Yes. >> The fact that you're building a curated model, so in retail the calendars are complex. I'm sure everybody that uses Tableau is good at analyzing data, but they might not know what your rules are around your financial calendar, or around the hierarchies of your product. There's a lot of things that happen where you want an enterprise group of data modelers to build it, bless it, and roll it out, but then you're a user, and you say, wait, you forgot x, y, and z, I don't want to wait a week, I don't want to wait two weeks, three weeks, a month, maybe more.
I want that data to be available in the model an hour later 'cause that's what I get with Tableau today. And that's where we've taken the two approaches of enterprise analytics and self-service, and tried to create a scenario where you get the best of both worlds. >> So, we know that an implication of what you're telling us is that insights are perishable, and latency is becoming more and more critical. How do you plan to work with streaming data where you've got a historical archive, but you've got fresh data coming in? But fresh could mean a variety of things. Tell us what some of those scenarios look like. >> Absolutely, I think there's two approaches to this problem, and I'm seeing both used in practice, and I'm not exactly sure, although I have some theories on which one's going to win. In one case, you are streaming everything into, sort of a... like I talked about, this data lake, S3, and you're putting it in a format like Parquet, and then people are accessing it. The other way is to access the data where it is. Maybe it's already in, this is a common BI scenario, you have a big data store, and then you have a dimensional data store, like Oracle has your customers, Hadoop has machine data about those customers accessing on their mobile devices or something. If there was some way to access those data without having to move the Oracle stuff into the big data store, that's a Federation story that I think we've talked about in the Bay Area for a long time, or around the world for a long time. I think we're getting closer to understanding how we can do that in practice, and have it be tenable. You don't move the big data around, you move the small data around. For data coming in from outside sources it's probably a little bit more difficult, but it is kind of a degenerate version of the same story. 
I would say that streaming is gaining a lot of momentum, and with what we do, we're always mapping, because of the governance piece that we've built into the product, we're always mapping where did the data come from, where did it land, and how did we use it to build summary tables. So if we build five summary tables, 'cause we're answering different types of questions, we still need to know that it goes back to this piece of data, which has these security constraints, and these audit requirements, and we always track it back to that, and we always apply those to our derived data. So when you're accessing these automatically ETLed summary tables, it just works the way it is. So I think that there are two ways that this is going to expand and I'm excited about Federation because I think the time has come. I'm also excited about streaming. I think they can serve two different use cases, and I don't actually know what the answer will be, because I've seen both in customers, it's some of the biggest customers we have. >> Well Matthew thank you so much for stopping by, and four A's, AtScale can facilitate abstraction, adaptability, and agility. >> Yes. Hashtag four A's. >> There we go. I don't even want credit for that. (laughs) >> Oh wow, I'm going to get five more followers, I know it! (George laughs) >> There ya go! >> We want to thank you for watching theCUBE, I am Lisa Martin, we are live in San Jose, at our event Big Data SV, I'm with George Gilbert. Stick around, we'll be back with our next guest after a short break. (techno music)
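
The governance idea Matthew describes above, where every derived summary table is traced back to its source data and inherits that data's security constraints and audit requirements, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not AtScale's implementation: the names SourceTable, SummaryTable, allowed_roles and audited are hypothetical, and the rule that a role must be permitted on every source is one plausible policy among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTable:
    """A raw table and the governance rules attached to it (hypothetical model)."""
    name: str
    allowed_roles: frozenset
    audited: bool = False


@dataclass(frozen=True)
class SummaryTable:
    """A derived aggregate that keeps a reference to every source it was built from."""
    name: str
    sources: tuple

    @property
    def allowed_roles(self) -> frozenset:
        # Assumed policy: a role may read the summary only if it may read
        # every source table that fed into it.
        if not self.sources:
            return frozenset()
        roles = self.sources[0].allowed_roles
        for src in self.sources[1:]:
            roles = roles & src.allowed_roles
        return roles

    @property
    def audited(self) -> bool:
        # If any source carries an audit requirement, the derived table does too.
        return any(src.audited for src in self.sources)

    def can_read(self, role: str) -> bool:
        return role in self.allowed_roles


# Example: two sources with different constraints feed one summary table.
sales = SourceTable("sales", frozenset({"analyst", "finance"}), audited=True)
customers = SourceTable("customers", frozenset({"finance"}))
summary = SummaryTable("revenue_by_customer", (sales, customers))
```

Because the constraints are computed from the lineage rather than copied at build time, tightening the rules on a source table takes effect on every summary built from it.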

Published Date : Mar 7 2018

SUMMARY :

Brought to you by SiliconANGLE Media, We are down the street from the Strata Data Conference. Thanks for having me. because it can spawn the genesis that is the crown jewels of your business. So if we take these, that can then deliver it to your end users. and the curated and it's simple, versus sort of the wild west, And it delivers it to the user that way. and they're going to deliver me the stuff and then you give them access to it? The data structures, the way that you store the data, And so that you can take data and it's almost like the data migration service but it might be Teradata to Redshift, and it lets you actually try out things they need to drive, and just see how that works. And the relevance goes down, or you can't react as fast. is the ability to truly make data-driven business decisions Yes, and in an agile format. We had the three V's, now we have the three A's. where you get the best of both worlds. How do you plan to work with streaming data and then you have a dimensional data store, and four A's, AtScale can facilitate abstraction, Yes. I don't even want credit for that. We want to thank you for watching theCUBE,

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Matthew | PERSON | 0.99+
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Matthew Baird | PERSON | 0.99+
George | PERSON | 0.99+
San Jose | LOCATION | 0.99+
Yahoo | ORGANIZATION | 0.99+
three weeks | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
25% | QUANTITY | 0.99+
Gardner | PERSON | 0.99+
two approaches | QUANTITY | 0.99+
Oracle | ORGANIZATION | 0.99+
two weeks | QUANTITY | 0.99+
Redshift | TITLE | 0.99+
S3 | TITLE | 0.99+
three million dollars | QUANTITY | 0.99+
two ways | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
one case | QUANTITY | 0.99+
85% | QUANTITY | 0.99+
last week | DATE | 0.99+
a month | QUANTITY | 0.99+
Century | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
a week | QUANTITY | 0.99+
BigQuery | TITLE | 0.99+
both | QUANTITY | 0.99+
20-terabyte | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
a week and a half | QUANTITY | 0.99+
a week later | DATE | 0.99+
Data Lake 2.0 | COMMERCIAL_ITEM | 0.99+
two | QUANTITY | 0.99+
tomorrow morning | DATE | 0.99+
AtScale | ORGANIZATION | 0.99+
Atlas | ORGANIZATION | 0.99+
Bay Area | LOCATION | 0.98+
Lisa | PERSON | 0.98+
ParK | TITLE | 0.98+
Tableau | TITLE | 0.98+
five more followers | QUANTITY | 0.98+
an hour later | DATE | 0.98+
Ranger | ORGANIZATION | 0.98+
Netezza | ORGANIZATION | 0.98+
tonight | DATE | 0.97+
today | DATE | 0.97+
both worlds | QUANTITY | 0.97+
about 8,000 users | QUANTITY | 0.97+
theCUBE | ORGANIZATION | 0.97+
Strata Data Conference | EVENT | 0.97+
one | QUANTITY | 0.97+
Big Data SV 2018 | EVENT | 0.97+
Teradata | ORGANIZATION | 0.96+
AtScale | TITLE | 0.96+
Big Data SV | EVENT | 0.93+
East Coast | LOCATION | 0.93+
Hadoop | TITLE | 0.92+
two different use cases | QUANTITY | 0.92+
day one | QUANTITY | 0.91+
one area | QUANTITY | 0.91+