
Search Results for Venn diagram:

Dr. Tim Wagner & Shruthi Rao | Cloud Native Insights


 

(upbeat electronic music) >> Narrator: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE conversation! >> Hi, I'm Stu Miniman, your host for Cloud Native Insights. When we launched this series, one of the things we wanted to talk about was that we're not just using cloud as a destination, but really enabling new ways of thinking, being able to use the innovations underneath the cloud, and that if you use services in the cloud, you're not necessarily locked into a solution or unable to move forward. And that's why I'm really excited to welcome to the program the co-founders of Vendia. First we have Dr. Tim Wagner, he is the co-founder and CEO of the company, as well as generally known in the industry as the father of Serverless from his work on AWS Lambda, and his co-founder, Shruthi Rao, she is the chief business officer at Vendia, and also came from AWS, where she worked on blockchain solutions. Tim, Shruthi, thanks so much for joining us. >> Thanks for having us here, Stu. Great to join the show. >> All right, so Shruthi, actually if we could start with you, because before we get into Vendia coming out of stealth, you know, a really interesting technology space, you and Tim both learned a lot from working with customers in your previous jobs, so why don't we start with you. Blockchain of course had a lot of learnings, a lot of things that people don't understand about what it is and what it isn't, so give us a little bit about what you've learned and how that led toward what you and Tim and the team are doing with Vendia. >> Yeah, absolutely, Stu! One, the most important thing that we've all heard of was this great gravitational pull towards blockchain in 2018 and 2019. Well, I was one of the founders and early adopters of blockchain from the Bitcoin and Ethereum space, all the way back from 2011 onwards. And at AWS I started Amazon Managed Blockchain and launched Quantum Ledger Database, two services in the blockchain category. What I learned there was, no surprise, there was a gold rush to blockchain from many customers. I personally talked to over 1,092 customers when I ran Amazon Managed Blockchain for the last two years. And I found that customers were looking at solving this dispersed data problem. Most of my customers had invested in IoT and edge devices, and these devices were gathering massive amounts of data, and on the flip side they had also invested quite a bit of effort in AI and ML and analytics to crunch this data and give them intelligence. But guess what, this data existed in multiple parties, in multiple clouds, in multiple technology stacks, and they needed a mechanism to get this data from wherever it was into one place so they could use that AI, ML, and analytics investment, and they wanted all of this to be done in real time, and so they gravitated towards blockchain. But blockchain had quite a bit of limitations: it was not scalable, and it didn't work with the existing stack that you had. It forced enterprises to adopt this new technology and an entirely new type of infrastructure. It didn't work cross-cloud unless you hired expensive consultants or did it yourself, and it required these specialized developers. For all of these reasons, we've seen the majority of POCs just dying on the vine and never reaching their production potential. So, that is when I realized that the problem to be solved was not a trust problem; the problem was dispersed data across multiple clouds and multiple stacks.
Sometimes it was even a multiple-parties problem. And that's when Tim and I started talking about how we could bring all of the nascent qualities of Lambda and Serverless, use all of the features of blockchain, and build something together. And he has an interesting story of his own, right? >> Yeah. Yeah, Shruthi, if I could, I'd like to get a little bit of that. So, first of all for our audience, if you're watching this, you probably want to hit pause, you know, go search Tim, go watch a video, read his Medium post about the past, present, and future of Serverless. But Tim, I'm excited. You and I have talked in the past, but we're finally getting you on theCUBE program. >> Yeah! >> You know, I've looked through my career, and my background is infrastructure, and the role of infrastructure we know is always just to support the applications and the data that run the business, that's what is important! Even when you talk about cloud, it is the applications, you know, the code, and the data that are important. So, it's not that, you know, okay, I've got near infinite compute capacity, it's the new things that I can do with it. That's a comment I heard in one of your sessions. You talked about how one of the most fascinating things about Serverless was just the new creativity that it inspired in people, and I loved that it wasn't just unlocking developers to say, okay, I have new ways to write things, but even people that weren't traditional coders, like lots of people in marketing, who were like, "I can start with this and build something new." So, I guess the question I have for you is, you know, we had this idea of Platform as a Service, and even when things like containers launched, we were trying to get close to that atomic unit of the application, and it was often talked about, well, do I want it for portability? Is it for ease of use? So, you've been wrangling and looking at this (Tim laughing) from a lot of different ways. So, with that as a starting point, you know, what did you see the last few years with Lambda, and, you know, help connect this up to where Shruthi just left off her bit of the story. >> Absolutely. You know, the great story, the great success of the cloud is this elimination of undifferentiated heavy lifting, you know, from getting rid of having to build out a data center, to all the complexity of managing hardware. And that first wave of cloud adoption was just phenomenally successful at that. But as you say, the real thing businesses wrestle with are applications, right? It's ultimately about the business solution, not the hardware and software on which it runs. So, the very first time I sat down with Andy Jassy to talk about what would eventually become Lambda, you know, one of the things I said was, look, if we want to get 10x the number of people to come and, you know, be in the cloud and be successful, it has to be 10 times simpler than it is today. You know, if step one is hire an amazing team of distributed engineers to turn a server into a fault-tolerant, scalable, reliable business solution, that's going to be fundamentally limiting. We have to find a way to put that in a box, give that capability, you know, to people, without having them go hire that team and build that out in the first place. And so that kind of started this journey for compute; we're trying to solve the problem of making compute as easy to use as possible.
You know, take some code, as you said, even if you're not a diehard programmer or backend engineer, maybe you're just a full-stack engineer who loves working on the front end but the backend isn't your focus, and turn that into something that is as scalable, as robust, as secure as what somebody who has spent their entire career working on that would build. And that was the promise of Serverless, you know, outside of the specifics of any one cloud. Now, the challenge of course when you talk to customers, you know, is that you always heard the same two considerations. One is, I love the idea of Lambda, but it's AWS; maybe I have multiple departments or business partners, or need to kind of work on multiple clouds. The other challenge is, fantastic for compute, but what about data? You know, you've kind of left me with half the solution; you've made my compute super easy to use, can you make my data equally easy to use? And so, you know, obviously part of the genesis of Vendia is going and tackling those pieces of this, giving all that promise and ease of use of Serverless, now with a model for replicated state and data, and one that can cross accounts, machines, departments, clouds, companies, as easily as it scales on a single cloud today. >> Okay, so you covered quite a bit of ground there, Tim. If you could just unpack that a little bit, because you're talking about state cutting across environments. What is it that Vendia is bringing, and how does that tie into solutions like, you know, Lambda as you mentioned, but other clouds or even potentially on-premises solutions? So what is, you know, the IP, the code, the solution that Vendia's offering? >> Happy to! So, let's start with the customer problem here. The thing that every enterprise, every company, frankly, wrestles with is that in the modern world they're producing more data than ever: IoT, digital journeys, you know, mobile, edge devices. More data coming in than ever before, and at the same time, more data getting consumed than ever before with deep analytics, supply chain optimization, AI, ML. So, even more consumers of ever more data. The challenge, of course, is that data isn't always inside a company's four walls. In fact, we've heard 80% or more of that data actually lives outside of a company's control. So, step one to doing something like AI or ML isn't even just picking a product or selecting a technology, it's getting all of your data back together again, so that's the problem that we set out to solve with Vendia, and we realized that, you know, and kind of part of the genesis for the name here, you know, Vendia comes from Venn diagram. So, part of that need to bring code and data together across companies, across tech stacks, means the ability to solve some of these long-standing challenges. And we looked at the two sort of big movements out there, two that, you know, we've obviously both been involved in. One of them was Serverless, which has an amazing ability to scale, but is single account, single cloud, single company. The other one is blockchain and distributed ledgers, which manages to run across parties, across clouds, across tech stacks, but doesn't have a great mechanism for scalability; it's really a single-box deployment model, and obviously there are a lot of limitations with that. So, our technology, and kind of our insight and breakthrough here, was bringing those two things together by solving the problems in each of them with the best parts of the other.
So, reimagine the blockchain as a cloud data implementation built entirely out of Serverless components that have all of the scale, the cost efficiencies, the high utilization, all of the ease of deployment that something like Lambda has today, and at the same time, you know, bring state to Serverless. Give things like Lambda and the equivalents on other clouds a simple, easy, built-in model so that applications can have multicloud, multi-account state at all times, rather than turning that into a complicated DIY project. So, that was our insight here, and, you know, frankly where a lot of the interesting technology for us is, is in turning those centralized services, a centralized version of Serverless compute or a Serverless database, into a multi-account, multicloud experience. And so that's where we spent a lot of time and energy trying to build something that gives customers a great experience. >> Yeah, so I've got plenty of background in customers that, you know, have the "information silos," if you will, so we know with unstructured data, you know, so much of it is not searchable, I can't leverage it. Shruthi, maybe it might make sense, you know, what would you say are some of the top things your early customers are saying? You know, I have this pain point that's pointing me in your direction; what was leading them to you? And how does the solution help them solve that problem? >> Yeah, absolutely! One of our design partners, our lead design partner, is this automotive company, a premier automotive company, and their end goal is to track car parts for warranty recall issues. So, they want to track every single part that goes into a particular car, and there are about 30 to 35,000 parts in each of these cars, all the way from the manufacturing floor to when the car is sold, and to when that particular part is eventually replaced, towards the end of the lifecycle of that part. So for this, they have put together a small test group of their partners: a couple of the parts manufacturers, their second-tier partners, the National Highway Safety Administration is part of this group, and also a couple of dealers and service centers. Now, if you just look at this group of partners, you will see that some of these parties have high-technology backgrounds, just like the auto manufacturer themselves or the parts manufacturers, and there are low-modality or low IT-competency partners such as the service centers, for whom desktop PCs are literally the extent of their IT competency. Now, the majority of these are on multiple clouds. This particular auto customer is on AWS, the manufacturers are on Azure, another one is on GCP. Now, they all have to share these large files between each other, making sure that there is some transparency and that business rules are applicable. For example, two partners who make the same parts or similar parts cannot see each other's data. Most of the participants cannot see the PII data that is not applicable to them; only the service center can see that. The National Highway Safety Administration has read access, not write access. A lot of that needed to be done, and their alternatives before they started using Vendia were either to use point-to-point APIs, which was very expensive, very cumbersome; it works for a finite, small set of parties, but it does not scale as you add more participants into this particular network.
And the second option for them was blockchain, which they did use; they used Hyperledger Fabric and they used private Ethereum to see how this works, but the scalability: with private Ethereum it's about 14 to 15 transactions per second, and with Hyperledger Fabric it taps out at 100, or 150 on a good day, transactions per second of throughput, and that's just not useful. All of these are always-on systems, they're not Serverless, so just provisioning capacity, our customers said, took them two to three weeks per participant. So, it's just not a scalable solution. With Vendia, what we delivered to them was this virtual data lake, where the sources of this data are on multiple clouds, on multiple accounts owned by multiple parties, but all of that data is shared in a virtual data lake with all of the permissions, with all of the logging, with all of the security, PII, and compliance. Now, this particular auto manufacturer and the National Highway Safety Administration can run their ML algorithms to gain intelligence off of it, and start to understand patterns, so when certain parts go bad, or what's the propensity of a certain manufacturing unit producing faulty parts, and so on, and so forth. This really shows you this concept of unstructured data being shared between parties that are not, you know, connected with each other, when there are data silos. But I'd love to follow this up with another example of, you know, democratization; democratization is very important to Vendia. When Tim launched Lambda and founded the AWS Serverless movement as a whole, one very important thing happened: it lowered the barrier to entry for a new wave of businesses that could just experiment and try out new things; if it failed, they scrapped it, and if it worked, they could scale it out. And that was possible because of the entry point, because of the pay-per-use model, and the architecture itself, and our vision and mission for Vendia is that Vendia fuels the next generation of multi-party connected distributed applications. My second design partner is actually a non-profit in the animal welfare industry. Their mission is to maintain a no-kill goal for dogs and cats in the United States. And the number one reason for overpopulation of dogs and cats in the shelters is dogs and cats lost during natural disasters, like the hurricane season. And when that happens, and when, let's say, your dog gets lost and you want to find the dog, the ID or the chip reading is not reliable, so they want to search for this through pictures. But we also know that if you look at a picture of a dog, four people can come up with four different breed names, and this particular non-profit has 2,500 plus partners across the U.S., and they're all low to no IT modality, some of them have higher IT competency, and there's huge turnover because of volunteer employees. So, what we did for them was come up with a mechanism where they could connect with all 2,500 of these participants very easily, in a very cost-effective way, and get all of the pictures of all of the dogs in all these repositories into one data lake, so they can run some kind of a dog facial recognition algorithm on it and identify where my lost dog is in minutes, as opposed to the days it used to take before. So, you see a very large customer with very sophisticated IT competency use this, and also a non-profit being able to use this. And they were both able to get to this outcome in days, not the months or years it would take with blockchain, but just under a few days, so we're very excited about that.
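The sharing rules Shruthi describes (competing suppliers blind to each other's fields, PII visible only to service centers, the regulator read-only) amount to a per-party permission model over one shared record. As a purely hypothetical sketch, not Vendia's actual data model or API, with invented party names, fields, and helper functions, those rules might be expressed and enforced roughly like this:

```python
# Hypothetical illustration only: party names, fields, and rules are invented,
# not Vendia's actual data model or API.
SHARED_RECORD = {
    "part_id": "P-1001",
    "manufacturer": "parts_maker_a",
    "install_vin": "VIN123",              # which car the part went into
    "owner_contact": "jane@example.com",  # PII: only service centers may read
    "warranty_status": "active",
}

# Per-party access rules, roughly matching the rules described above.
ACCESS_RULES = {
    "auto_oem":         {"read": {"part_id", "manufacturer", "install_vin", "warranty_status"}, "write": True},
    "parts_maker_a":    {"read": {"part_id", "warranty_status"}, "write": True},
    "parts_maker_b":    {"read": {"part_id", "warranty_status"}, "write": True},   # cannot see a competitor's fields
    "service_center":   {"read": {"part_id", "owner_contact", "warranty_status"}, "write": True},
    "safety_regulator": {"read": {"part_id", "manufacturer", "warranty_status"}, "write": False},  # read-only
}

def view_for(party: str, record: dict) -> dict:
    """Return only the fields a given party is allowed to see."""
    allowed = ACCESS_RULES[party]["read"]
    return {k: v for k, v in record.items() if k in allowed}

def can_write(party: str) -> bool:
    """Whether a party may propose updates to the shared record."""
    return ACCESS_RULES[party]["write"]

if __name__ == "__main__":
    print(view_for("safety_regulator", SHARED_RECORD))  # no PII for the read-only regulator
    print(can_write("safety_regulator"))                # False
```

The point of the sketch is only that each participant sees a filtered projection of the same shared data, which is what lets a single "virtual data lake" serve parties with very different privileges.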
>> Thank you so much for the examples. All right, Tim, before we get to the end, I wonder if you could take us under the hood a little bit here. My understanding is the solution that you talk about, it's universal apps, or what you call "unis" -- >> Tim: Unis? (laughs) >> I believe, so if I saw that right. Give me a little bit of compare and contrast, if you will. Obviously there's been a lot of interest in what Kubernetes has been doing. We've been watching closely, you know, there's connections between what Kubernetes is doing and Serverless with the Knative project. When I saw the first video talking about Vendia, you said, "We're serverless, and we're containerless underneath." So, help us understand, because at, you know, a super high level, some of the multicloud and making-things-very-flexible messaging sounds very similar. So you know, how is Vendia different, and why do you feel your architecture helps solve this really challenging problem? >> Sure, sure, awesome! You know, look, one of the tenets that we had here was that things have to be as easy as possible for customers, and if you think about the way somebody walks up today to an existing database system, right? They say, "Look, I've got a schema, I know the shape of my data," and a few minutes later they can get a production database. Now, it's single user, single cloud, single consumer there, but it's a very fast, simple process that doesn't require having code, hiring a team, et cetera, and we wanted Vendia to work the same way. Somebody can walk up with a JSON schema, hand it to us, and five minutes later they have a database, only now it's a multiparty database that's decentralized, so it runs across multiple platforms, multiple clouds, you know, multiple technology stacks instead of being single user. So, that's kind of goal one, to make that as easy to use as possible. The other key tenet, though, is we don't want to be the least common denominator of the cloud. One of the challenges with saying everyone's going to deploy their own servers, they're going to run all their own software, they're all going to co-deploy a Kubernetes cluster, is that, as Shruthi was saying, first, for anyone for whom that's a challenge, if you don't have a whole IT department wrapped around you, that's a difficult proposition to get started on, no matter how amazing that technology might be. The other challenge with it, though, is that it locks you out; sort of the inverse of a lock-in process, right, is the lock-out process. It locks you out of some of the best and brightest things the public cloud providers have come up with, and we wanted to empower customers, you know, to pick best of breed. Maybe they want to go use IBM Watson, maybe they want to use a database on Google, and at the same time they want to ingest IoT on AWS, and they want it all to work together, and they want all of that to be seamless, not something where they have to recreate an experience over, and over, and over again on three different clouds. So, that was our goal here in producing this. What we designed as an architecture was decentralized data storage at the core of it. So, think about all the precepts you hear with blockchain; they're all there, they all just look different. So, we use a NoSQL database to store data so that we can scale that easily. We still have a consensus algorithm, only now it's a high-speed, serverless, cloud-function-based mechanism.
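Tim goes on below to note that instead of Solidity smart contracts, this kind of validation logic lives in ordinary cloud functions. As a minimal, hypothetical sketch of that idea (the event shape, field names, and rules here are invented for illustration and are not Vendia's actual interface), a Lambda-style function that approves or rejects a proposed update to a shared parts record might look like this:

```python
# Hypothetical sketch of the "cloud function instead of a smart contract" idea.
# The event shape, field names, and rules are invented for illustration; they
# are not Vendia's actual API.
import json

REQUIRED_FIELDS = {"part_id", "manufacturer", "warranty_status"}

def validate_part_record(record: dict) -> None:
    """Business rules that a Solidity contract might otherwise encode."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if record["warranty_status"] not in {"active", "recalled", "replaced"}:
        raise ValueError("unknown warranty_status")

def handler(event, context):
    """Lambda-style entry point: approve or reject a proposed shared-data update."""
    record = json.loads(event["body"]) if isinstance(event.get("body"), str) else event
    try:
        validate_part_record(record)
    except ValueError as err:
        return {"statusCode": 400, "body": json.dumps({"accepted": False, "reason": str(err)})}
    # In a real system, an accepted record would be replicated to all parties here.
    return {"statusCode": 200, "body": json.dumps({"accepted": True, "part_id": record["part_id"]})}

if __name__ == "__main__":
    print(handler({"part_id": "P-1001", "manufacturer": "parts_maker_a", "warranty_status": "recalled"}, None))
```

The design point is that the same rules a smart contract would enforce can be written in any language the team already knows and run on pay-per-use serverless infrastructure.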
You know, instead of smart contracts, you write things in a cloud function like Lambda, so no more learning Solidity; now you can use any language you want. So, we changed how we think about that architecture, but many of those ideas that got people really excited about blockchain, its capabilities and the vision for the future, are still alive and well, they've just been implemented in a way that's far more practical and effective for the enterprise. >> All right, so what environments can I use today for your solution? Shruthi talked about customers spanning across some of the clouds, so what's available kind of today, and what's on the roadmap in the future? Will this include, beyond, you know, maybe the top five or six hyperscalers, does it just require Serverless underneath? So, will things that are in a customer's own data center eventually be supported? >> Absolutely. So, what we're doing right now is having people sign up for our preview release, so in the next few weeks we're going to start turning that on for early access to developers. That early access program will be multi-account, focused on AWS, and then at the end of summer we'll be doing our GA release, which will be multicloud, so we'll actually be able to operate across multiple clouds, multiple cloud services, on different platforms. But even from day one, we'll have API support in there. So, if you've got a service, it could even be running on a mainframe, it could be on-prem; if it's API-based you can still interact with the data, and still get the benefits of the system. So, developers, please start signing up, you can go find more information on vendia.net, and we're really looking forward to getting some of that early feedback and hearing more from the people that we're the most excited to have start building these projects. >> Excellent, what a great call to action to get the developers and users in there. Shruthi, if you could just give us the last bit, you know, the thing that's been fascinating, Tim, when I look at the Serverless movement, you know, I've talked to some amazing companies that were two or three people (Tim laughing) out of their basement, and they created a business, and they're like, "Oh my gosh, I got VC funding," and it's usually sub $10,000,000. So, I look at your team, and I'd heard, Tim, you're the primary coder on the team. (Tim laughing) And when it comes to the seed funding, it's, you know, compared to many startups, a small number. So, Shruthi, give us a little bit, if you could, of the speeds and feeds of the company, your funding, and any places that you're hiring. >> Yeah, we are definitely hiring, let me start from there! (Tim laughing) We're hiring for developers, and we are also hiring for solution architects, so please go to vendia.net, we have all the roles listed there, we would love to hear from you! And the second one, funding, yes. Tim is our main developer and solutions architect here, and look, the Serverless movement really helped quite a few companies, including us, to build this and bring this to market at record speed, and we're very thankful that Tim and AWS took a stand, you know, back in 2013, 2014, to bring this to market and democratize this. I think when we brought this new concept to our investors, they saw what this could be. It's not an easy concept to understand in the first wave, but when you understand the problem space, you see that the opportunity is pretty endless.
And I'll say this for our investors, on behalf of our investors: they saw a real founder-market fit between us. We're literally the two people who have launched and run businesses for both Serverless and blockchain at scale, so that's what they thought was very attractive, and then look, it's Tim and I, and we're looking to hire 8 to 10 folks, and I think we have gotten to a space where we're making a meaningful difference to the world, and we would love for more people to join us, join this movement, and democratize this big dispersed-data problem and solve for it. And help us create more meaning from the data that our customers and companies worldwide are creating. We're very excited, and we're very thankful to all of our investors for being deeply committed to us and having conviction in us. >> Well, Shruthi and Tim, first of all, congratulations -- >> Thank you, thank you. >> Absolutely looking forward to, you know, watching the progress going forward. Thanks so much for joining us. >> Thank you, Stu, thank you. >> Thanks, Stu! >> All right, and definitely tune in to our regular conversations on Cloud Native Insights. I'm your host Stu Miniman, and I'm looking forward to hearing more about your Cloud Native Insights! (upbeat electronic music)

Published Date : Jul 2 2020


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Shruthi | PERSON | 0.99+
Tim | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
2018 | DATE | 0.99+
2014 | DATE | 0.99+
two | QUANTITY | 0.99+
Two | QUANTITY | 0.99+
80% | QUANTITY | 0.99+
Shruthi Rao | PERSON | 0.99+
2019 | DATE | 0.99+
National Highway Safety Administration | ORGANIZATION | 0.99+
two partners | QUANTITY | 0.99+
National Highway Safety Administration | ORGANIZATION | 0.99+
2011 | DATE | 0.99+
2013 | DATE | 0.99+
8 | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
second option | QUANTITY | 0.99+
10 times | QUANTITY | 0.99+
Stu | PERSON | 0.99+
Vendia | ORGANIZATION | 0.99+
Stu Miniman | PERSON | 0.99+
Palo Alto | LOCATION | 0.99+
Andy Jassy | PERSON | 0.99+
United States | LOCATION | 0.99+
U.S. | LOCATION | 0.99+
10x | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
Tim Wagner | PERSON | 0.99+
two people | QUANTITY | 0.99+
vendia.net | OTHER | 0.99+
two services | QUANTITY | 0.99+
first video | QUANTITY | 0.99+
One | QUANTITY | 0.99+
2,500 plus partners | QUANTITY | 0.99+
each | QUANTITY | 0.99+
first | QUANTITY | 0.99+
both | QUANTITY | 0.99+
five minutes later | DATE | 0.99+
today | DATE | 0.98+
100 | QUANTITY | 0.98+
IBM | ORGANIZATION | 0.98+
First | QUANTITY | 0.98+
over 1,092 customers | QUANTITY | 0.98+
three people | QUANTITY | 0.98+
two things | QUANTITY | 0.98+
Amazon | ORGANIZATION | 0.98+
150 | QUANTITY | 0.98+
AWS Lambda | ORGANIZATION | 0.98+

Noam Shendar, Zadara Storage & Dave Elliott, Google - CUBE Conversation - #theCUBE


 

Hi, Jeff here with theCUBE. We are in the studio in Palo Alto, theCUBE studio offices, for a CUBE conversation, talking about storage, enterprise storage, cloud, and all the things that are keeping us up at night, and the excitement that we see every day out in the field. So we're really excited to be joined by Noam Shendar, who comes in all the time for Zadara Storage, and you brought in a special guest: Dave Elliott, global product lead for storage, Google Cloud Platform. Welcome, Dave. Thanks for having me. So we'll get right into it, big announcement, why don't you tell us all about it? Sure, we're super excited to announce that we're now connected up to the Google Cloud Platform. Our customers know already that we've connected to the other two major providers, Amazon Web Services and Microsoft Azure, and we've been working diligently on what we think is the most exciting addition to that, and that's Google Cloud Platform. It's available right now, it's immediately available. It means that any customer of Google Cloud Platform can take advantage of our award-winning enterprise storage as a service, connected directly to virtual machines at Google Cloud. So it clearly, you know, completes the trifecta, which we know, you know, is the power right now in the public cloud. But what's special about Google Cloud Platform that you can get that the other providers don't offer to your customers? Google Cloud is special because we all know Google, the search engine company, and because Google has been doing this for so long, they have the existing infrastructure. So Google has global data centers, with the networks connecting those global data centers, such that customers, wherever they are, can connect through hundreds of edge locations for low-latency access to the cloud. So whereas with the other major clouds the customer has to be physically close to the actual cloud location for optimal latency, Google Cloud customers have far more flexibility in terms of location, and we think this will do even more to get more people on board the cloud with their enterprise applications. Okay, so Dave, what I want to know is, are you going to paint the Zadara racks Google colors? I think you've got a Google bike out front, the yellow, the green, or red, right? So what does this mean for Google? Obviously you guys have storage, you have massive amounts of storage; we all have lots of stuff on our Google, on our personal Google storage, as well as, you know, a little of theCUBE's storage. So what does this mean for your customers? So really, to put it in context, as enterprises move to the cloud, there are different requirements from maybe pure-play startups. So we've had fantastic success, we've been in the cloud business now for, I think, nine years, but as we mature as a business, and as customers mature and make it clear that they want to move more and more workloads to the cloud, their requirements still look significantly, in many cases, like the requirements from the old days, right, from the legacy vendors. I'm a storage guy, and I know there are certain requirements around SLAs, performance, the ability to move between clouds, from on-prem to the cloud, and so as Google matures, as our customers ask for this type of functionality, we are able to meet those requirements by working with Zadara. So this really gives you that, yes, we have that checkbox, when you're getting into hardcore enterprise guys who may be new to the cloud or, you know, who want that comfort level, right, they're going to have all the stuff they had before, but now, exactly,
it's going to be sitting in you guys' infrastructure. It's two use cases, right? It's the traditional consumer customer, the enterprise customer, who today has their workloads running on-prem in their own private data centers or colos, and it's the customers today that are running on our compute instances, who love our compute instances, but perhaps are holding back from moving some of those more delicate workloads to the cloud. So it's two general use cases, right? Because that's really the point: it's not just about storage for the customer, the storage is an enabler for their applications. All right, so this really opens up a whole other set of applications for the other Google services; it doesn't really compete directly with the storage. Exactly, exactly, that's it. So as customers move, you know, the really interesting things happen as customers move those workloads to the cloud; they can then take advantage of, you know, layered-on other services, like data analytics and learning and things like that, and so really everything I think about this relationship is about helping enterprises move, migrate, more and more workloads to the cloud in a more seamless way. Right, so it's kind of a good news, bad news for you, you know: the good news is now you're partnering with Google, the bad news is they have a lot of reach and distribution. Are you guys ready? You know, what's the impact on your business now, having this humongous partner distribution network, potential new client network? Are you guys ready to support that? What kind of new challenges does that present to you guys? It'll present growth challenges, but we're ready. So what we've done is made sure that we have the support infrastructure in place and also the sales infrastructure in place, so if customers need help prior to the sale, during the sale, or after the sale, we have different teams that handle this, and we have partners as well. That's a big change for us in the last couple of years, switching from a fully direct model to a model that's now sixty percent partner driven, and that number is growing. Those partners are helping us especially with integration; the customer may need storage, but they may also need to deploy other applications in Google Cloud, and those partners put it all together and provide a single, let's use the positive expression, a single back to pat, a single, de facto, sort of, throat to choke. Yes, your own champagne, as they like to say. Exactly. And one of the things we talked about before we turned on the cameras that you were excited about is really the distribution of Google, and specifically Google access points, because latency is real. As Grace Hopper said, you know, the speed of light is just too damn slow, so you really need those access points to get to the compute and the storage to make cloud work the way you want it to work. Exactly. For example, a common request is to connect remote clients to a central storage repository. So with most clouds that's limited by the public network and the latencies and the hops that come along with that. With Google's peering capabilities, pretty much everybody is closer to the compute at Google than at other clouds, and we know this because when we do a search and we get the answer back in 0.0012 seconds, right, a big piece of that is the latency between us and whatever Google location is doing that search for us. The compute benefits from that, the Google Cloud benefits from that, and therefore our customers do, too.
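As a rough, back-of-the-envelope illustration of the latency point being made here (the distances and the fiber propagation speed below are generic approximations, not measurements from Google, Zadara, or theCUBE), physical distance alone puts a floor under round-trip time, which is why nearby peering and edge locations matter:

```python
# Back-of-the-envelope illustration (not measurements from Google or Zadara):
# why being physically close to a cloud edge/peering location matters.
SPEED_IN_FIBER_KM_S = 200_000  # light in optical fiber travels at roughly 2/3 of c

def round_trip_ms(distance_km: float) -> float:
    """Ideal round-trip propagation delay over fiber, ignoring hops and queuing."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S * 1000

for label, km in [("same-metro edge location", 50),
                  ("cross-country data center", 4000),
                  ("intercontinental data center", 10000)]:
    print(f"{label:>28}: ~{round_trip_ms(km):.1f} ms minimum, before any routing hops")
```

Real-world latency adds routing hops, queuing, and processing on top of that floor, which is the gap that peering and edge access points are meant to shrink.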
Right, and David, I presume this is just one of a number of steps within Google Cloud's, you know, kind of pursuit of the enterprise, and moving more, you know, enterprise customers and enterprise workloads into the Google Cloud Platform. Yeah, that's one hundred percent accurate. I mean, we continue to build out the organization, we're building out our partnerships, and of course the products, to better meet the needs of larger enterprises that have just unique needs. I will thank you for pointing out that differentiator on the network; it's something that I think people intuitively understand, but, you know, the depth and breadth of our networking expertise has been unbelievable. As opposed to most cloud vendors, we want to get the data, we want to get the bits onto our network as quickly as possible, because we keep it on our network, because we're so efficient at being able to move the bits from point A to point B, and I think that's really the big differentiator, why you see, you know, such better response, such lower latency, and it's not just about the latency, it's also about the predictability of it. And so just to clarify, so once it gets into the Google network, wherever that point of access is, then it's contained within the Google network, right? Right, so we've innovated around networking since our early days; things like OpenFlow and software-defined networking are things that have, you know, great genesis inside of Google, and so for us to move that data onto our network is just, again, faster and more reliable for our customers and for our own data, right? We're able to leverage our own infrastructure for our own services; I think we have seven services now with over a billion customers, and just out of sheer necessity we've had to innovate in and around networking. Right, we've done a couple of peering shows, where, you know, it just reinforces the fact that you want to get off the public backbone as quickly as you can, depending on however you need to communicate with either your own internal stuff or with somebody else. And then is it just kind of a signal that you guys will be bringing in other products, obviously not necessarily a competing software-defined product, but just other kinds of enterprise-type solutions to offer your customers in pursuit of this kind of ongoing enterprise path? Yeah, I think that there are a lot of similarities, if I would draw the Venn diagram, between what a high-performance, successful customer like Snapchat, one of our larger customers, or Spotify, what they require, and what some of the larger enterprises, or even smaller enterprises with just very, very large, you know, compute-intensive, storage-intensive requirements, are. So there is a Venn diagram, there's a lot of overlap, and we continue to leverage our own internal investments in things like live migration of compute instances, things around innovative pricing to really drive home the low cost of cloud and the agility of cloud, things like customizable VMs, so you only get the actual machine that you need. So we're going to continue to innovate around that and be able to make sure that both enterprise customers and, you know, sort of the high-flying startups still have the ability to take advantage of it. Right, right. And is that new information for you guys that you can leverage? Because clearly, with something like Snapchat, which uses a massive amount of data, you have massive growth rates. I mean,
you have these... We talk a lot about the consumerization of IT in terms of the experience of interacting with an application on your phone, which you want to be like when you interact with Snapchat, although I can never figure Snapchat out, left swipe, right? But you can start to use some of those lessons that you guys have learned in your broader application experience to bring to bear with your solution, as well as for your customers. Exactly, so we have two goals in this relationship. One is to help the existing customers of the Google Cloud Platform do more; that means take the existing applications, and maybe they can benefit in terms of better performance and reliability, so do more, and also bring new applications into the Google Cloud Platform. Maybe the customer moved some early applications over into the cloud but left others on-prem; we'd like to see those move into the cloud as well. And then the remaining goal is to move customers who are not at all in the cloud into Google Cloud. In the end, by providing these capabilities, we think that's the last impediment. The customer may sit there and say, yes, the compute capabilities are fantastic, I trust them, and the network capabilities, we just talked about them, they're world-leading, but storage, I'm not sure I have what I need. I don't know if I have the IOPS that I need, I don't have the uptime that I need, or even the protocol support, or features like disaster recovery, snapshots, et cetera. But now they're there, so there's no reason not to go. Yeah, it's an exciting time, so congratulations, really a big announcement, obviously tremendous infrastructure by partnering with Google. I mean, I don't know that there's anything quite like it, developed over all these years. I talked to Google the other day, you know, 65,000 people, like, wow, it's not the little startup that we think of over in Mountain View anymore. And congratulations, Dave, on really making an aggressive move on the enterprise, really putting a flag in the ground, if you will. Well, I think we're just at the beginning, I think the next several years, in fact the next decade or so, is going to be a pretty exciting time. All right, well, thanks for stopping by the Palo Alto offices. Thank you, thank you. All right, Dave Elliott, Noam Shendar, Jeff Rick, you're watching theCUBE, we'll catch you next time, thanks for watching.

Published Date : Nov 3 2016

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

Entity | Category | Confidence
Jeff | PERSON | 0.99+
David | PERSON | 0.99+
sixty percent | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
Dave Elliott | PERSON | 0.99+
Dave | PERSON | 0.99+
seven services | QUANTITY | 0.99+
amazon | ORGANIZATION | 0.99+
Jeff Rick | PERSON | 0.99+
Grace Hopper | PERSON | 0.99+
over a billion customers | QUANTITY | 0.99+
two goals | QUANTITY | 0.99+
hundreds | QUANTITY | 0.99+
Palo Alto | LOCATION | 0.99+
Joseph | PERSON | 0.99+
nine years | QUANTITY | 0.99+
Mountain View | LOCATION | 0.99+
Spotify | ORGANIZATION | 0.99+
today | DATE | 0.98+
one hundred percent | QUANTITY | 0.97+
google | ORGANIZATION | 0.97+
Noam Shendar | PERSON | 0.97+
snapchat | ORGANIZATION | 0.97+
two use cases | QUANTITY | 0.97+
Google Cloud | TITLE | 0.97+
one | QUANTITY | 0.97+
65,000 | QUANTITY | 0.95+
next decade | DATE | 0.94+
single | QUANTITY | 0.9+
two major providers | QUANTITY | 0.89+
gnomes | PERSON | 0.83+
last couple of years | DATE | 0.83+
Layton | ORGANIZATION | 0.83+
four | QUANTITY | 0.82+
Zadara Storage | ORGANIZATION | 0.82+
dave elliott gnome | PERSON | 0.81+
Google cloud | TITLE | 0.8+
google cloud | TITLE | 0.78+
both enterprise | QUANTITY | 0.76+
next several years | DATE | 0.76+
sadara | ORGANIZATION | 0.72+
dave elliott | PERSON | 0.71+
couple | QUANTITY | 0.69+
1 2 seconds | QUANTITY | 0.66+
google cloud | TITLE | 0.64+
cube | TITLE | 0.63+
Google | TITLE | 0.62+
google | TITLE | 0.61+
Venn diagram | TITLE | 0.6+
microsoft azure | ORGANIZATION | 0.59+
Google cloud | TITLE | 0.58+
cloud | TITLE | 0.58+
Darra | COMMERCIAL_ITEM | 0.58+
piers | ORGANIZATION | 0.57+
edge | QUANTITY | 0.57+
snap | TITLE | 0.57+
steps | QUANTITY | 0.56+
Venn diagram | TITLE | 0.46+
number | QUANTITY | 0.38+

Next-Generation Analytics Social Influencer Roundtable - #BigDataNYC 2016 #theCUBE


 

>> Narrator: Live from New York, it's the Cube, covering Big Data New York City 2016. Brought to you by headline sponsors, CISCO, IBM, NVIDIA, and our ecosystem sponsors, now here's your host, Dave Valante. >> Welcome back to New York City, everybody, this is the Cube, the worldwide leader in live tech coverage, and this is a Cube first, we've got a nine person, actually eight person panel of experts, data scientists all. I'm here with my co-host, James Cubelis, who has helped organize this panel of experts. James, welcome. >> Thank you very much, Dave, it's great to be here, and we have some really excellent brain power up there, so I'm going to let them talk. >> Okay, well thank you again-- >> And I'll interject my thoughts now and then, but I want to hear them. >> Okay, great, we know you well, Jim, we know you'll do that, so thank you for that, and I appreciate you organizing this. Okay, so what I'm going to do with our panelists is ask you to introduce yourselves. I'll introduce you, but tell us a little bit about yourself, and talk a little bit about what data science means to you. A number of you started in the field a long time ago, perhaps data warehouse experts before the term data science was coined. Some of you started probably after Hal Varian said it was the sexiest job in the world. (laughs) So think about how data science has changed, and what it means to you. We're going to start with Greg Piateski, who's from Boston. A Ph.D., KDnuggets, Greg, tell us about yourself and what data science means to you. >> Okay, well thank you Dave and thank you Jim for the invitation. Data science in a sense is the second oldest profession. I think people have this built-in need to find patterns, and whatever we find we want to organize the data, but we do it well on a small scale and we don't do it well on a large scale, so really, data science takes our need and helps us organize what we find, the patterns that we find that are really valid and useful and not just random; I think this is a big challenge of data science. I actually started in this field before the term data science existed. I started as a researcher and organized the first few workshops on data mining and knowledge discovery, and the term data mining became less fashionable, became predictive analytics, now it's data science, and it will be something else in a few years. >> Okay, thank you, Eves Mulkearns, Eves, I of course know you from Twitter. A lot of people know you as well. Tell us about your experiences and what data science means to you. >> Well, data science to me is, if you take the two words, the data and the science: the science holds a lot of expertise and skills, it's statistics, it's mathematics, it's understanding the business and putting that together with the digitization of what we have. It's not only the structured data or the unstructured data that you store in the database and try to get out and understand what is in there, but even video that is coming in, and then trying to find, like Greg already said, the patterns in there and bringing value to the business, looking from a technical perspective but still linking that to the business insights. You can do that on a technical level, but then you don't know yet what you need to find, or what you're looking for. >> Okay great, thank you. Craig Brown, Cube alum. How many people have been on the Cube actually before? >> I have. >> Okay, good. I always like to ask that question.
So Craig, tell us a little bit about your background and, you know, data science, how has it changed, what's it all mean to you? >> Sure, so I'm Craig Brown, I've been in IT for almost 28 years, and that was obviously before the term data science, but I've evolved: I started out as a developer and evolved through the data ranks, as I call it, working with data structures, working with data systems, data technologies, and now we're working with data pure and simple. Data science to me is an individual or team of individuals that dissect the data, understand the data, help folks look at the data differently than just the information that, you know, we usually use in reports, and get more insights on how to utilize it and better leverage it as an asset within an organization. >> Great, thank you Craig, okay, Jennifer Shin? Math is obviously part of being a data scientist. You're good at math, I understand. Tell us about yourself. >> Yeah, so I'm a senior principal data scientist at the Nielsen Company. I'm also the founder of 8 Path Solutions, which is a data science, analytics, and technology company, and I'm also on the faculty in the Master of Information and Data Science program at UC Berkeley. So math is part of it, I teach statistics for data science actually this semester, and I think for me, I consider myself a scientist primarily, and data science is a nice day job to have, right? Something where there's industry need for people with my skill set in the sciences, and data gives us a great way of being able to communicate sort of what we know in science in a way that can be used out there in the real world. I think the best benefit for me is that now that I'm a data scientist, people know what my job is, whereas before, maybe five ten years ago, no one understood what I did. Now, people don't necessarily understand what I do now, but at least they understand kind of what I do, so it's still an improvement. >> Excellent. Thank you Jennifer. Joe Caserta, you're somebody who started in the data warehouse business, and saw that snake swallow a basketball and grow into what we now know as big data, so tell us about yourself. >> So I've been doing data for 30 years now, and I wrote the Data Warehouse ETL Toolkit with Ralph Kimball, which is the best-selling book in the industry on preparing data for analytics, and with the big paradigm shift that's happened, you know, for me the past seven years has been, instead of preparing data for people to analyze to make decisions, now we're preparing data for machines to make the decisions, and I think that's the big shift from data analysis to data analytics and data science. >> Great, thank you. Miriam, Miriam Fridell, welcome. >> Thank you. I'm Miriam Fridell, I work for Elder Research, we are a data science consultancy, and I came to data science sort of through a very circuitous route. I started off as a physicist, went to work as a consultant and software engineer, then became a research analyst, and finally came to data science. And I think one of the most interesting things to me about data science is that it's not simply about building an interesting model and doing some interesting mathematics, or maybe wrangling the data, all of which I love to do, but it's really the entire analytics lifecycle, and the value that you can actually extract from data at the end, and that's one of the things that I enjoy most: seeing a client's eyes light up, or a "wow, I didn't really know we could look at data that way, that's really interesting."
I can actually do something with that, so I think that, to me, is one of the most interesting things about it. >> Great, thank you. Justin Sadeen, welcome. >> Absolutely, thank you, thank you. So my name is Justin Sadeen, I work for Morph EDU, an artificial intelligence company in Atlanta, Georgia, and we develop learning platforms for non-profit and private educational institutions. I'm a Marine Corps veteran turned data enthusiast, and so what I think about data science is that it's the intersection of information, intelligence, and analysis, and I'm really excited about the transition from big data into smart data, and that's what I see data science as. >> Great, and last but not least, Dez Blanchfield, welcome mate. >> Good day. Yeah, I'm the one with the funny accent. So data science for me is probably the funniest job I've ever had to describe to my mom. I've had quite a few different jobs, and she's never understood any of them, and this one she understands the least. I think a fun way to describe what we're trying to do in the world of data science and analytics now is that it's the equivalent of high altitude mountain climbing. It's like the extreme sport version of the computer science world, because we have to be this magical unicorn of a human that can understand plain English problems from the C-suite down and then translate them into code, either solo or as teams of developers. And so there's this black art where we're expected to be able to transmogrify from something that we just say in plain English, I would like to know X, and we have to go and figure it out, so there's this neat extreme sport view I have of rushing down the side of a mountain on a mountain bike and just dodging rocks and trees and things occasionally, because invariably, we do have things that go wrong, and they don't quite give us the answers we want. But I think we're at an interesting point in time now with the explosion in the types of technology that are at our fingertips, and the scale at which we can do things now. Once upon a time we would sit at a terminal and write code and just look at data and watch it in columns, and then we ended up with spreadsheet technologies at our fingertips. Nowadays it's quite normal to instantiate a small high-performance distributed cluster of computers, effectively a supercomputer in a public cloud, and throw some data at it and see what comes back. And we can do that on a credit card. So I think we're at a really interesting tipping point now where this coinage of data science needs to be slightly better defined, so that we can help organizations who have weird and strange questions that they want to ask, tell them solutions to those questions, and deliver on them in, I guess, a commodity deliverable. I want to know xyz and I want to know it in this time frame and I want to spend this much amount of money to do it, and I don't really care how you're going to do it. And there are so many tools we can choose from and so many platforms we can choose from, it's this little black art of computing, if you'd like; we're effectively making it up as we go in many ways, so I think it's one of the most exciting challenges that I've had, and I think I'm pretty sure I speak for most of us in that we're lucky that we get paid to do this amazing job. That we get to make up on a daily basis, in some cases. >> Excellent, well okay. So we'll just get right into it. I'm going to go off script-- >> Do they have unicorns down under? I think they have some strange species, right?
>> Well we put the pointy bit on the back. You guys have it on the front. >> So I was at an IBM event on Friday. It was a chief data officer summit, and I attended what was called the Data Divas' breakfast. It was a women in tech thing, and one of the CDOs, she said that 25% of chief data officers are women, which is much higher than you would normally see in the profile of IT. As it happens, 25% of our panelists are women. Is that common? Miriam and Jennifer, is that common for the data science field? Or is this a higher percentage than you would normally see-- >> James: Or a lower percentage? >> I think certainly for us, we have hired a number of additional women in the last year, and they are phenomenal data scientists. I don't know that I would say, I mean I think it's certainly typical that this is still a male-dominated field, but I think like many male-dominated fields, physics, mathematics, computer science, that is slowly changing and evolving, and I think certainly, that's something that we've noticed in our firm over the years at our consultancy, as we're hiring new people. So I don't know if I would say 25% is the right number, but hopefully we can get it closer to 50. Jennifer, I don't know if you have... >> Yeah, so I know at Nielsen we have actually more than 25% of our team is women, at least the team I work with, so there seems to be a lot of women who are going into the field. Which isn't too surprising, because with a lot of the issues that come up in STEM, one of the reasons why a lot of women drop out is because they want real world jobs and they feel like they want to be in the workforce, and so I think this is a great opportunity, with data science being so popular, for these women to actually have a job where they can still maintain that engineering and science background that they learned in school. >> Great, well Hillary Mason, I think, was the first data scientist that I ever interviewed, and I asked her what are the sort of skills required, and the first question that we wanted to ask, I just threw other women in tech in there, 'cause we love women in tech, is about this notion of the unicorn data scientist, right? It's been put forth that the skill sets required to be a data scientist are so numerous that it's virtually impossible to have a data scientist with all those skills.
That we're kind of expected to go ahead of the organization and try and take the challenges we're faced with today and see what's going to come around the corner. And so we're like the little flag-bearers, if you'd like, in many ways of this is where we're at today, tell me where I'm going to be tomorrow, and try and predict the day after as well. It is very much becoming a team sport though. But I think the concept of data science being a unicorn has come about because the coinage hasn't been very well defined, you know; if you were to ask 10 people what a data scientist is, you'd get 11 answers, and I think this is a really challenging issue for hiring managers and C-suites when they say I want data science, I want big data, I want an analyst. They don't actually really know what they're asking for. Generally, if you ask for a database administrator, it's a well-described job spec, and you can just advertise it and some 20 people will turn up and you interview to decide whether you like the look and feel and smell of 'em. When you ask for a data scientist, there are 20 different definitions of what that one data science role could be. So we don't initially know what the job is, we don't know what the deliverable is, and we're still trying to figure that out, so yeah. >> Craig, what about you? >> So from my experience, when we talk about data science, we're really talking about a collection of experiences with multiple people. I've yet to find, at least from my experience, a data science effort with a lone wolf. So you're talking about a combination of skills, and no one individual needs to have all that makes a data scientist a data scientist, but you definitely have to have the right combination of skills amongst a team in order to accomplish the goals of a data science team. So from my experiences and from the clients that I've worked with, we refer to the data science effort as a data science team. And I believe that's very appropriate to the team sport analogy. >> For us, we look at a data scientist as a full stack web developer, a jack of all trades; I mean they need to have a multitude of backgrounds, coming from a programmer, from an analyst. You can't find one subject matter expert, it's very difficult. And if you're able to find a subject matter expert, you know, through the lifecycle of product development you're going to require that individual to interact with a number of other members from your team who are analysts, and then you just end up, well, training this person to be, again, a jack of all trades, so it comes full circle. >> I own a business that does nothing but data solutions, and we've been in business 15 years, and the transition over time has been going from being a conventional-wisdom-run company with a bunch of experts at the top to becoming more of a data-driven company using data warehousing and BI, but now the trend is absolutely analytics driven. So if you're not becoming an analytics-driven company, you are going to be behind the curve very, very soon, and it's interesting that IBM is now coining the phrase of a cognitive business. I think that is absolutely the future. If you're not a cognitive business from a technology perspective, and an analytics-driven perspective, you're going to be left behind, that's for sure.
So in order to stay competitive, you know, you need to really think about data science, think about how you're using your data, and I also see that what's considered the data expert has evolved over time too, where it used to be just someone really good at writing SQL, or someone really good at writing queries in any language, but now it's becoming more of an interdisciplinary practice where you need the soft skills and you also need the hard skills, and that's why I think there's more females in the industry now than ever. Because you really need to have a really broad breadth of experience that really wasn't required in the past. >> Greg Piateski, you have a comment? >> So there are not too many unicorns in nature or as data scientists, so I think organizations that want to hire data scientists have to look for teams, and there are a few unicorns like Hillary Mason or maybe Usama Fayyad, but they generally tend to start companies and it's very hard to retain them as data scientists. What I see is another evolution, automation, and you know, steps like IBM Watson, the first such platform, are eventually a great advance for data scientists in the short term, but what's likely to happen in the longer term is kind of more and more of those skills becoming subsumed by the machine learning layer within the software. How long will it take, I don't know, but I have a feeling that the paradise for data scientists may not be very long-lived. >> Greg, I have a follow-up question to what I just heard you say. When a data scientist, let's say a unicorn data scientist, starts a company, as you've phrased it, and the company's product is built on data science, do they give up being a data scientist in the process? It would seem that they become a data scientist of a higher order if they've built a product based on that knowledge. What are your thoughts on that? >> Well, I know a few people like that, so I think maybe they remain data scientists at heart, but they don't really have the time to do the analysis and they really have to focus more on strategic things. For example, today actually is the birthday of Google; 18 years ago Larry Page and Sergey Brin wrote a very influential paper back in the '90s about PageRank. Have they remained data scientists? Perhaps a very very small part of them, but that's not really what they do, so I think those unicorn data scientists could quickly evolve to having to look, really, for teams to capture those skills. >> Clearly they come to a point in their career where they build a company based on teams of data scientists and data engineers and so forth, which relates to the topic of team data science. What is the right division of roles and responsibilities for team data science? >> Before we go on, Jennifer, did you have a comment on that? >> Yeah, so I guess I would say for me, when data science came out and there was, you know, the Venn diagram that came out about all the skills you were supposed to have? I took a very different approach than all of the people who I knew who were going into data science. Most people started interviewing immediately, they were like this is great, I'm going to get a job. I went and learned how to develop applications, and learned computer science, 'cause I had never taken a computer science course in college, and made sure I trued up that one part where I didn't know these things or have the skills from school, so I went in headfirst and just learned it, and now I actually have a lot of technology patents as a result of that.
So to answer Jim's question, actually: I started my company about five years ago. It originally started out as a consulting firm slash data science company, then it evolved, and one of the reasons I went back into the industry, and now I'm at Nielsen, is because you really can't do the same sort of data science work when you're actually doing product development. It's a very very different sort of world. You know, when you're developing a product you're developing a core feature or functionality that you're going to offer clients and customers, so I think definitely you really don't get to have that wide range of sort of looking at 8 million models and testing things out. That flexibility really isn't there as your product starts getting developed. >> Before we go into the team sport, the hard skills that you have, are you all good at math? Are you all computer science types? How about math? Are you all math? >> What were your GPAs? (laughs) >> David: Anybody not math oriented? Anybody not love math? You don't love math? >> I love math, I think it's required. >> David: So math yes, check. >> You dream in equations, right? You dream. >> Computer science? Do I have to have computer science skills? At least the basic knowledge? >> I don't know that you need to have formal classes in any of these things, but I think certainly as Jennifer was saying, if you have no skills in programming whatsoever and you have no interest in learning how to write SQL queries or R or Python, you're probably going to struggle a little bit. >> James: It would be a challenge. >> So I think yes, I have a Ph.D. in physics, I did a lot of math, it's my love language, but I think you don't necessarily need to have formal training in all of these things. I think you need to have a curiosity and a love of learning, so even without the formal training, you still want to learn, and however you gain that knowledge, I think, is fine. But yeah, if you have no technical interests whatsoever, and don't want to write a line of code, maybe data science is not the field for you. Even if you don't do it every day. >> And statistics as well? You would put that in that same general category? How about data hacking? You've got to love data hacking, is that fair? Eaves, you have a comment? >> Yeah, I think so. While we've been discussing that, for me the most important part is that you have a logical mind and you have the capability to absorb new things and the curiosity you need to dive into that. While I don't have an education in IT or whatever, I have a background in chemistry, and those things that I learned there I apply to information technology as well. And beyond the part where you say, okay, I'm a tech-savvy guy, I'm interested in the tech part of it, you need to speak that business language, and if you can do that crossover and understand what other skill sets or parts of the roles are telling you, I think the communication in that aspect is very important. >> I'd like to throw in just something really quickly, and I think there's an interesting thing that happens in IT, particularly around technology. We tend to forget that we've actually solved a lot of these problems in the past. If we look at history, if we look around the Second World War, and Bletchley Park in the UK, where you had a very similar experience as humans to what we're having currently around the whole issue of data science, there was an interesting challenge with the Enigma and the Shark code, right?
And there was a bunch of men put in a room and told, you're mathematicians and you come from universities, and you can crack codes, but they couldn't. And so what they ended up doing was running these ads, and putting out challenges; they actually put, I think it was crossword puzzles, in the newspaper, and this deluge of women came out of all kinds of different roles, without math degrees, without science degrees, but who could solve problems, and they were thrown at the challenge of cracking codes, and invariably, they did the heavy lifting on a daily basis, converting messages from one format to another, so that the very small team at the end could actually get to play with the sexy piece of it. And I think we're going through a similar shift now with what we refer to as data science in the technology and business world, where the people who are doing the heavy lifting aren't necessarily what we'd think of as the traditional data scientists, and so, there have been some unicorns and we've championed them, and they're great. But I think the shift is going to be to accountants, actuaries, and statisticians who understand the business, and come from an MBA-style background, who can learn the relevant pieces of math and models that we need to apply to get the data science outcome. I think we've already been here, we've solved this problem, we've just got to learn not to try and reinvent the wheel, 'cause the media hypes this whole thing of data science as exciting and new, but we've been here a couple of times before, and there's a lot to be learned from that, in my view. >> I think we had Joe next. >> Yeah, so I was going to say that data science is a funny thing. To use the word science is kind of a misnomer, because there is definitely a level of art to it, and I like to use the analogy: when Michelangelo would look at a block of marble, everyone else looked at the block of marble and saw a block of marble. He looks at a block of marble and he sees a finished sculpture, and then he figures out, what tools do I need to actually realize my vision? And I think data science is a lot like that. We hear a problem, we see the solution, and then we just need the right tools to do it, and I think in consulting, and data science in particular, it's not so much what we know out of the gate, but how quickly we learn. And I think everyone here, what makes them brilliant, is how quickly they can learn any tool that they need to see their vision get accomplished. >> David: Justin? >> Yeah, I think you make a really great point. For me, I'm a Marine Corps veteran, and the reason I mention that is 'cause I work with two veterans who are problem solvers. And I think that's what data scientists really are in the long run, problem solvers, and you mentioned a great point that, yeah, I think just problem solving is the key. You don't have to be a subject matter expert, just be able to take the tools and intelligently use them. >> Now when you look at the whole notion of team data science, what is the right mix of roles, like role definitions, within a high-quality or high-performing data science team? Now IBM, with, of course, our announcement of Project DataWorks and so forth, we're splitting the role division in terms of data scientist versus data engineer versus application developer versus business analyst. Is that the right breakdown of roles?
Or what would the panelists recommend in terms of understanding what kind of roles make sense within, like I said, a high-performing team that's looking to develop applications that depend on data, machine learning, and so forth? Anybody want to? >> I'll tackle that. So the teams that I have created over the years, the data science teams that I brought into customer sites, have a combination of developer capabilities; some of them are IT developers, but some of them were developers of things other than applications. They designed buildings, they did other things with their technical expertise besides building technology. The other piece besides the developer is the analytics, and analytics can be taught as long as they understand how algorithms work and the code behind the analytics, in other words, how we are analyzing things, and from a data science perspective, we are leveraging technology to do the analyzing through the tool sets, so ultimately as long as they understand how the tool sets work, then we can train them on the tools. Having that analytic background is an important piece. >> Craig, is it easier to, I'll go to you in a moment Joe, is it easier to cross-train a data scientist to be an app developer, than to cross-train an app developer to be a data scientist, or does it not matter? >> Yes. (laughs) And not the other way around. It depends on the-- >> It's easier to cross-train a data scientist to be an app developer than-- >> Yes. >> The other way around. Why is that? >> Developing code can be as difficult as the tool set one uses to develop code. Today's tool sets are very user-friendly, whereas it's very difficult to teach a person to think along the lines of developing code when they don't have any idea of the aspects of code, of building something. >> I think it was Joe, or you next, or Jennifer, who was it? >> I would say that one of the reasons for that is a data scientist will probably know if the answer's right after you process the data, whereas a data engineer might be able to manipulate the data but may not know if the answer's correct. So I think that is one of the reasons why having a data scientist learn the application development skills might be an easier path than the other way around. >> I think Miriam had a comment? Sorry. >> I think that what we're advising our clients to do is to not think the old way: before data science and before analytics became so required by companies to stay competitive, it was more of a waterfall, you have a data engineer build a solution, you know, then you throw it over the fence and the business analyst would have at it. Where now, it must be agile, and you must have a scrum team where you have the data scientist and the data engineer and the project manager and the product owner and someone from the chief data office all at the table at the same time and all accomplishing the same goal. Because all of these skills are required, collectively, in order to solve this problem, and it can't be done daisy-chained anymore; it has to be a collaboration. And that's why I think Spark is so awesome, because you know, Spark is a single interface that a data engineer can use, a data analyst can use, and a data scientist can use. And now, with what we've learned today, having a data catalog on top so that the chief data office can actually manage it, I think is really going to take Spark to the next level. >> James: Miriam?
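To make the panelist's point about Spark as a single interface concrete, here is a minimal PySpark sketch; it is illustrative only, and the file path, column names, and view name are hypothetical, not anything referenced by the panel. The same DataFrame is touched three ways: an engineering-style cleanup, an analyst-style SQL query, and a data-scientist-style MLlib model.

```python
# A minimal sketch of one Spark DataFrame shared across three roles.
# The parquet path and columns are hypothetical, for illustration only.
from pyspark.sql import SparkSession, functions as F
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("shared-interface-sketch").getOrCreate()

# Data engineer: load and clean the raw data once.
events = (spark.read.parquet("/data/events.parquet")
          .dropna(subset=["region", "units", "price"])
          .withColumn("revenue", F.col("units") * F.col("price")))

# Data analyst: query the very same DataFrame with plain SQL.
events.createOrReplaceTempView("events")
spark.sql("SELECT region, SUM(revenue) AS total FROM events GROUP BY region").show()

# Data scientist: feed the same DataFrame to an MLlib model.
assembler = VectorAssembler(inputCols=["units", "price"], outputCol="features")
model = LinearRegression(featuresCol="features", labelCol="revenue").fit(
    assembler.transform(events))
print(model.coefficients)

spark.stop()
```

The design point is simply that all three roles work against one shared abstraction, which is what makes the scrum-team collaboration described above practical.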
>> I wanted to comment on your question to Craig about whether it's harder to teach a data scientist to build an application or vice versa, and one of the things that we have worked on a lot in our data science team is incorporating a lot of best practices from software development: agile, scrum, that sort of thing, and I think particularly with a focus on deploying models, that we don't just want to build an interesting data science model, we want to deploy it and get some value. You need to really incorporate these processes from someone who might know how to build applications, and that, I think, for some data scientists can be a challenge, because one of the fun things about data science is you get to get into the data, and you get your hands dirty, and you build a model, and you get to try all these cool things, but then when the time comes for you to actually deploy something, you need deployment-grade code in order to make sure it can go into production on the client's side and be useful, for instance. So I think that there's an interesting challenge on both ends, but one of the things I've definitely noticed with some of our data scientists is it's very hard to get them to think in that mindset, which is why you have a team of people, because everyone has different skills and you can mitigate that. >> Dev-ops for data science? >> Yeah, exactly. We call it insight ops, but yeah, I hear what you're saying. Data science is becoming increasingly an operational function as opposed to strictly exploratory or developmental. Did someone else have a comment? Dez? >> One of the things I was going to mention, one of the things I like to do when someone gives me a new problem is take all the laptops and phones away. And we just end up in a room with a whiteboard. And developers find that challenging sometimes, so I had this one line where I said to them, don't write the first line of code until you actually understand the problem you're trying to solve, right? And I think where the data science focus has changed the game, for organizations who are trying to get some systematic, repeatable process that they can throw data at and just keep getting answers, no matter what the industry might be, is that developers will come with a particular mindset on how they're going to codify something without necessarily getting the full spectrum and understanding the problem in the first place. What I'm finding is the people that come at data science tend to have more of a hacker ethic. They want to hack the problem, they want to understand the challenge, and they want to be able to get it down to plain-English, simple phrases, and then apply some algorithms and then build models, and then codify it, and so most of the time we sit in a room with whiteboard markers just trying to build a model in a graphical sense and make sure it's going to work and that it's going to flow, and once we can do that, we can codify it. I think when you come at it from the other angle, from the developer ethic, and you're like, I'm just going to codify this from day one, I'm going to write code, I'm going to hack this thing out and it's just going to run and compile, often you don't truly understand what you're trying to get to at the end point, and you can just spend days writing code, and I think someone made the comment that sometimes you don't actually know whether the output is actually accurate in the first place. So I think there's a lot of value being provided from the data science practice
of understanding the problem in plain English at a team level: so what am I trying to do from the business consulting point of view? What are the requirements? How do I build this model? How do I test the model? How do I run a sample set through it? Train the thing and then make sure what I'm going to codify actually makes sense in the first place, because otherwise, what are you trying to solve in the first place? >> Wasn't it Einstein who said if I had an hour to solve a problem, I'd spend 55 minutes understanding the problem and five minutes on the solution, right? It's exactly what you're talking about. >> Well I think, I will say, getting back to the question, the thing with building these teams that I think a lot of times people don't talk about is that engineers are actually very very important for data science projects and data science problems. For instance, if you were just trying to prototype something or just come up with a model, then data science teams are great; however, if you need to actually put that into production, the code that the data scientist has written may not be optimal, so as we scale out, it may actually be very inefficient. At that point, you kind of want an engineer to step in and actually optimize that code, so I think it depends on what you're building, and that kind of dictates what kind of division you want among your teammates, but I do think that a lot of times the engineering component is really undervalued out there. >> Jennifer, it seems that the data engineering function, data discovery and preparation and so forth, is becoming automated to a greater degree, but if I'm listening to you, I don't hear that data engineering as a discipline is becoming extinct in terms of a role that people can be hired into. You're saying that there's a strong ongoing need for data engineers to optimize the entire pipeline to deliver the fruits of data science in production applications, is that correct? So they play that very much operational role as the backbone for... >> So I think a lot of times businesses will go to a data scientist to build a better model, to build a predictive model, but that model may not be something that you really want to implement out there when there's like a million users coming to your website, 'cause it may not be efficient, it may take a very long time, so I think in that sense it is important to have good engineers; your whole product may fail, you may build the best model, it may have the best output, but if you can't actually implement it, then really what good is it? >> What about calibrating these models? How do you go about doing that and sort of testing that in the real world? Has that changed over time? Or is it... >> So one of the things that I think can happen, and we found this with one of our clients, is when you build a model, you do it with the data that you have, and you try to use a very robust cross-validation process to make sure that it's robust and it's sturdy, but one thing that can sometimes happen is, after you put your model into production, there can be external factors, societal or whatever, things that have nothing to do with the data that you have or the quality of the data or the quality of the model, which can actually erode the model's performance over time. So as an example, we think about cell phone contracts, right?
Those have changed a lot over the years, so maybe five years ago the type of data plan you had might not be the same as it is today, because a totally different type of plan is offered, so if you're building a model on that to, say, predict who's going to leave and go to a different cell phone carrier, the validity of your model over time is going to completely degrade based on nothing that you put into the model or the data that was available, so I think you need to have this sort of model management and monitoring process to take these factors into account and then know when it's time to do a refresh. >> Cross-validation matters even at one point in time; for example, there was an article in the New York Times recently where they gave the same data set to five different data scientists, this is survey data for the upcoming presidential election, and the five different data scientists came to five different predictions. They were all high-quality data scientists, and the cross-validation showed a wide variation about who was on top, whether it was Hillary or whether it was Trump, so that shows you that even at any point in time, cross-validation is essential to understand how robust the predictions might be. Does somebody else have a comment? Joe? >> I just want to say that this even drives home the importance of having the scrum team for each project and having the engineer and the data scientist, the data engineer and data scientist, working side by side, because it is important that whatever we're building, we assume will eventually go into production, and we used to have it that, in the data warehousing world, you'd get the data out of the systems, out of your applications, you'd do analysis on your data, and the nirvana was maybe that data would go back to the system, but typically it didn't. Nowadays, the applications are dependent on the insight coming from the data science team. The behavior of the application and the personalization and individual experience for a customer are highly dependent on it, so it has to be; you asked is data science part of the dev-ops team, absolutely, now it has to be. >> Whose job is it to figure out the way in which the data is presented to the business? Where's the sort of presentation, the visualization plan; is that the data scientist's role? Does that depend on whether or not you have that gene? Do you need a UI person on your team? Where does that fit? >> Wow, good question. >> Well usually that's the output. I mean, once you get to the point where you're visualizing the data, you've created an algorithm or some sort of code that produces that to be visualized, so at the end of the day the customers can see what all the fuss is about from a data science perspective. But it's usually post the data science component. >> So do you run into situations where you can see it and it's blatantly obvious, but it doesn't necessarily translate to the business? >> Well there's an interesting challenge with data, and we throw the word data around a lot, and I've got this fun line I like throwing out there: if you torture data long enough, it will talk. So the challenge then is to figure out when to stop torturing it, right? And it's the same with models, and so I think in many other parts of organizations, we'll take something, if someone's doing a financial report on performance of the organization and they're doing it in a spreadsheet, they'll get two or three peers to review it, and validate that they've come up with a working model and the answer actually makes sense.
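A minimal sketch of the cross-validation point just made: the same data set, scored with k-fold cross-validation, can show enough fold-to-fold spread that equally competent modelers reasonably reach different conclusions. The data here is synthetic and the model choice is arbitrary, purely for illustration.

```python
# Sketch: k-fold cross-validation as a robustness check on one data set.
# Synthetic, noisy data stands in for something like election survey data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           flip_y=0.2, random_state=0)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

# A wide spread across folds is a warning to treat any single point
# prediction with skepticism, which is the panel's point above.
print("fold accuracies:", np.round(scores, 3))
print("mean %.3f, std %.3f" % (scores.mean(), scores.std()))
```

The same idea extends to the model-drift concern raised earlier: re-scoring the deployed model on fresh data at regular intervals and comparing against these baseline scores is one simple way to know when a refresh is due.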
And I think we're rushing so quickly at doing analysis on data that comes to us in various formats and at high velocity that I think it's very important for us to actually stop and do peer reviews of the models and the data and the output as well, because otherwise we start making decisions very quickly about things that may or may not be true. It's very easy to get the data to paint any picture you want, and you gave the example of the five different attempts at that thing, and I have this shoot-out thing as well where I'll take a team, I'll get two different people to do exactly the same thing in completely different rooms, and come back and challenge each other, and it's quite amazing to see the looks on their faces when they're like, oh, I didn't see that, and then go back and do it again, and then just keep iterating until we get to the point where they both get the same outcome. In fact there's a really interesting anecdote about when the UNIX operating system was being written, and a couple of the authors went away and wrote the same program without realizing the other was doing it, and when they came back, they actually had, line for line, the same piece of C code, 'cause they'd actually gotten to a truth, a perfect version of that program, and I think we need to often look at, when we're building models and playing with data, if we can't come at it from different angles and get the same answer, then maybe the answer isn't quite true yet, so there's a lot of risk in that. And it's the same with presentation, you know, you can paint any picture you want with the dashboard, but who's actually validating when the dashboard's painting the correct picture? >> James: Go ahead, please. >> There is a science, actually, behind data visualization: you know, if you're doing trending, it's a line graph; if you're doing comparative analysis, it's a bar graph; if you're doing percentages, it's a pie chart. Like, there is a certain science to it, it's not as much of a mystery as the novice thinks, but what makes it challenging is that, just like any presentation, you have to consider your audience. And your audience, whenever we're delivering a solution, either insight, or just data in a grid, we really have to consider who is the consumer of this data, and actually cater the visual to that person or to that particular audience. And that is part of the art, and that is what makes a great data scientist. >> The consumer may in fact be the source of the data itself, like in a mobile app, so you're tuning their visualization and then their behavior is changing as a result, and then the data on their changed behavior comes back, so it can be a circular process. >> So Jim, at a recent conference, you were tweeting about the citizen data scientist, and you got emasculated by-- >> I spoke there too. >> Okay. >> TWI on that same topic, I got-- >> Kirk Borne I hear came after you. >> Kirk meant-- >> Called foul, flag on the play. >> Kirk meant well. I love Claudia Imhoff too, but yeah, it's a controversial topic. >> So I wonder what our panel thinks of that notion, citizen data scientist. >> Can I respond about citizen data scientists? >> David: Yeah, please. >> I think this term was introduced by a Gartner analyst in 2015, and I think it's a very dangerous and misleading term.
I think definitely we want to democratize the data and have access for more people, not just data scientists but managers, BI analysts; but there is already a term for such people, we can call them business analysts, because it implies some training, some understanding of the data. If you use the term citizen data scientist, it implies that without any training you take some data and then you find something there, and I think, as Dez mentioned, we've seen many examples; it's very easy to find completely spurious, random correlations in data. So we don't want citizen dentists to treat our teeth or citizen pilots to fly planes, and if data's important, having citizen data scientists is equally dangerous, so I'm hoping that, I think actually Gartner did not use the term citizen data scientist in their 2016 Hype Cycle, so hopefully they will put this term to rest. >> So Gregory, you apparently are defining citizen to mean incompetent as opposed to simply self-starting. >> Well self-starting is very different, but that's not what I think the intention was. I think what we see in terms of data democratization is a big trend toward automation. There are many tools; for example there are many companies like Data Robot, and probably IBM, that have interesting machine learning capability aimed at automation, and I recently started a page on KDnuggets for automated data science solutions, and there are already 20 different solutions that provide different levels of automation. So full automation can maybe deliver some expertise, but it's very dangerous to have a partly automated tool and then at some point ask citizen data scientists to try to take the wheel. >> I want to chime in on that. >> David: Yeah, pile on. >> I totally agree with all of that. I think the comment I just want to quickly put out there is that the space we're in is a very young and rapidly changing world, and so what we haven't had yet is the time to stop and take a deep breath and actually define ourselves. So if you look at computer science in general, a lot of the traditional roles have sort of had 10 or 20 years of history, and so through the hiring process and the development of those spaces, we've actually had time to breathe and define what those jobs are, so we know what a systems programmer is, and we know what a database administrator is, but we haven't yet had a chance as a community to stop and breathe and say, well, what do we think these roles are? And so to fill that void, the media creates coinages, and I think this is the risk we've got now: the concept of a data scientist was just a term that was coined to fill a void, because no one quite knew what to call somebody who didn't come from a data science background if they were tinkering around with data science, and I think that's something that we need to sort of sit up and pay attention to, because if we don't own that and drive it ourselves, then somebody else is going to fill the void and they'll create these very frustrating concepts like citizen data scientist, which drive us all crazy. >> James: Miriam's next. >> So I wanted to comment, I agree with both of the previous comments, but in terms of a citizen data scientist, and whether you're a citizen data scientist or an actual data scientist, whatever that means, I think one of the most important things you can have is a sense of skepticism, right? Because you can get spurious correlations and it's like, wow, my predictive model is so excellent, you know?
And being aware of things like leaks from the future, right? This actually isn't predictive at all, it's a result of the thing I'm trying to predict, and so I think one thing that we try to do is, if something really looks too good, we need to go back in and make sure: did we not look at the data correctly? Is something missing? Did we have a problem with the ETL? And so I think that a healthy sense of skepticism is important to make sure that you're not taking a spurious correlation and trying to derive some significant meaning from it. >> I think there's a Dilbert cartoon that I saw that described that very well. Joe, did you have a comment? >> I think that in order for citizen data scientists to really exist, we do need to have more maturity in the tools that they would use. My vision is that the BI tools of today are all going to be replaced with natural language processing and searching, you know, just being able to open up a search bar and say give me sales by region, and to take that one step into the future even further, it should actually say what are my sales going to be next year, and it should trigger a simple linear regression, or be able to say which features of the televisions are actually affecting sales and do a clustering algorithm. You know, I think hopefully that will be the future, but I don't see anything of that today, and I think in order to have a true citizen data scientist, you would need to have that, and that is pretty sophisticated stuff. >> I think for me, the idea of a citizen data scientist, I can relate to that. For instance, when I was in graduate school, I started doing some research on FDA data. It was an open-source data set of about 4.2 million data points. Technically, when I graduated the paper was still not published, and so in some sense you could think of me as a citizen data scientist, right? I wasn't getting funding, I wasn't doing it for school, but I was still continuing my research, so I'd like to hope that with all the new data sources out there, there might be scientists, or people who were maybe kept out of a field, people who wanted to be in STEM and for whatever life circumstance couldn't be in it, that might be encouraged to actually go and look into the data and maybe build better models or validate information that's out there. >> So Justin, I'm sorry, you had one comment? >> It seems data science was termed before academia adopted formalized training for data science. But yeah, like Dez said, you can make data work for whatever problem you're trying to solve; whatever answer you see, you want the data to work around it, you can make it happen. And I kind of consider that like, in project management, data creep: you're so hyper-focused on a solution, you're trying to find the answer, that you create an answer that works for that solution, but it may not be the correct answer, and I think the crossover discussion works well for that case. >> So but the term comes up 'cause there's a frustration, I guess, right? That data science skills are not plentiful, and it's potentially a bottleneck in an organization. Supposedly 80% of your time is spent on cleaning data, is that right? Is that fair? So there's a problem. How much of that can be automated and when? >> I'll have a shot at that.
So I think there's a shift that's going to come about where we're going to move from centralized data sets to data at the edge of the network, and this is something that's happening very quickly now, where we can't just haul everything back to a central spot. When the internet of things actually wakes up, things like the Boeing 787 Dreamliner, that thing's got 6,000 sensors in it and produces half a terabyte of data per flight. There are 87,400 flights per day in domestic airspace in the U.S. That's 43.5 petabytes of raw data; now that's about three years' worth of disk manufacturing in total, right? We're never going to copy that across to one place, we can't process it, so I think the challenge we've got ahead of us is looking at how we're going to move the intelligence and the analytics to the edge of the network and pre-cook the data in different tiers, so have a look at the raw material we get, and boil it down to a slightly smaller data set, bring a metadata version of that back, and eventually get to the point where we've only got the very minimum data set and data points we need to make key decisions. Without that, we're already at the point where we have too much data and we can't munch it fast enough, and we can't spin up enough tin even if we switch the cloud on, and that's just this never-ending deluge of noise, right? And you've got that signal-versus-noise problem, so we're now seeing a shift where people are looking at how we move the intelligence back to the edge of the network, which we actually solved some time ago in the security space. You know, spam filtering: if an email hits Google on the west coast of the U.S. and they create a checksum for that spam email, it immediately goes into a database, and it never gets through on the opposite coast, because they already know it's spam. They recognize that email coming in, that's evil, stop it. So we've already fixed this in security with intrusion detection, we've fixed it in spam, so we now need to take that learning and bring it into business analytics, if you like, and see where we're finding patterns and behavior, and push that out to the edge of the network, so if I'm seeing a demand over here for tickets on a new sale of a show, I need to be able to see where else I'm going to see that demand and start responding to that before the demand comes about. I think that's a shift that we're going to see quickly, because we'll never keep up with the data-munching challenge and the volume's just going to explode. >> David: We just have a couple minutes. >> That does sound like a great topic for a future Cube panel, which is data science on the edge of the fog. >> I got a hundred questions around that. So we're wrapping up here. Just got a couple minutes. Final thoughts on this conversation, or any other pieces that you want to punctuate? >> I think one thing that's been really interesting for me being on this panel is hearing all of my co-panelists talking about common themes and things that we are also experiencing, which isn't a surprise, but it's interesting to hear how ubiquitous some of the challenges are, and also, at the announcement earlier today, some of the things that they're talking about and thinking about, we're also talking about and thinking about. So I think it's great to hear that we're all in different countries and different places, but we're experiencing a lot of the same challenges, and I think that's been really interesting for me to hear about. >> David: Great, anybody else, final thoughts?
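As an aside on Dez's spam-filtering example, here is a minimal sketch of the checksum idea: hash a known-bad message once, share only the fingerprint, and every other node can reject matching messages without re-analyzing them or shipping the raw data around. The messages and the in-memory set are hypothetical stand-ins for a real distributed fingerprint store.

```python
# Sketch: edge-side filtering by checksum; the shared set stands in for a
# distributed fingerprint database, and the messages are invented examples.
import hashlib

def fingerprint(message: str) -> str:
    # Normalize lightly, then hash; only this digest travels between sites.
    return hashlib.sha256(message.strip().lower().encode("utf-8")).hexdigest()

# One node sees a spam message and publishes its fingerprint.
known_bad = {fingerprint("You have won a free cruise, click here!")}

def accept(message: str, bad_fingerprints: set) -> bool:
    # Another node rejects anything whose digest is already known bad,
    # without ever receiving or re-processing the original message body.
    return fingerprint(message) not in bad_fingerprints

print(accept("You have won a free cruise, click here!", known_bad))   # False
print(accept("Agenda for tomorrow's analytics standup", known_bad))   # True
```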
>> To echo Dez's thoughts, we're never going to catch up with the amount of data that's produced, so it's about transforming big data into smart data. >> I could just say that with the shift from normal data, small data, to big data, the answer is automate, automate, automate, and we've been talking about advanced algorithms and machine learning for the science, for changing the business, but there also needs to be machine learning and advanced algorithms for the back room, where we're actually getting smarter about how we ingest data and how we fix data as it comes in. Because we can actually train the machines to understand data anomalies and what we want to do with them over time. And I think the further upstream we get with data correction, the less work there will be downstream. And I also think that the concept of being able to fix data at the source is gone, that's behind us. Right now the data that we're using to analyze to change the business, typically we have no control over. Like Dez said, it's coming from sensors and machines and the internet of things, and if it's wrong, it's always going to be wrong, so we have to figure out how to do that in our laboratory. >> Eaves, final thoughts? >> I think it's a mind shift, being a data scientist. If you look back at the time, why did you start developing or writing code? Because you liked to code, whatever, just for the sake of building a nice algorithm or a piece of software, or whatever, and now, I think, with the spirit of a data scientist, you're looking at a problem and saying, this is where I want to go. So you have more of a top-down approach than a bottom-up approach, and you have the big picture, and that is what you really need as a data scientist: just look across technologies, look across departments, look across everything, and then on top of that, try to apply as many skills as you have available, and that's the kind of unicorn that they're trying to look for, because it's pretty hard to find people with that wide vision of everything that is happening within the company. So you need to be aware of technology, you need to be aware of how a business is run and how it fits within a cultural environment, and you have to work with people, and all those things together, to my belief, make it very difficult to find those good data scientists. >> Jim? Your final thought? >> My final thought is that this is an awesome panel, and I'm so glad that you've come to New York, and I'm hoping that you all stay, of course, for the IBM DataFirst launch event that will take place this evening about a block over at Hudson Mercantile, so that's pretty much it. Thank you, I really learned a lot. >> I want to second Jim's thanks, really, great panel. Awesome expertise, really appreciate you taking the time, and thanks to the folks at IBM for putting this together. >> And I'm a big fan of most of you, all of you, on this session here, so it's great just to meet you in person, thank you. >> Okay, and I want to thank Jeff Frick for being a human curtain there with the sun setting here in New York City. Well thanks very much for watching. We are going to be across the street at the IBM announcement, we're going to be on the ground. We open up again tomorrow at 9:30 at Big Data NYC, Big Data Week, Strata plus Hadoop World. Thanks for watching everybody, that's a wrap from here. This is the Cube, we're out. (techno music)

Published Date : Sep 28 2016
