Jennifer Shin, 8 Path Solutions | Think 2018
>> Narrator: Live from Las Vegas, it's The Cube. Covering IBM Think 2018. Brought to you by IBM.
>> Hello everyone and welcome to The Cube here at IBM Think in Las Vegas, the Mandalay Bay. I'm John Furrier, the host of The Cube. We're here in this Cube studio as a set for IBM Think. My next guest is Jennifer Shin, who's the founder of 8 Path Solutions. Twitter handle Jenn, J-S-H-I-N. Great to see you. Thanks for joining me.
>> Yeah, happy to be here.
>> I'm glad you stopped by. I wanted to get your thoughts. You're a thought leader in the industry. You've been on multiple Cube panels. Thank you very much. And also a Cube alumni. You know, IBM with data at the center of the value proposition. The CEO's up on the stage today saying you've got data, you've got blockchain and you've got AI, which is the infrastructure of the future. And AI is the software of the future, data's in the middle. Dave and I were talking about that as the innovation sandwich. The data is being sandwiched between blockchain and AI, two super important things. And she also mentioned Moore's law. Faster, smaller, cheaper. Every six months doubling in speed and performance. And then Metcalfe's law, which is more of a network effect. Kind of teasing out token economics. You see kind of where the world's going. This is an interesting position from IBM. I like it. Is it real?
>> Well, it sounds very data sciency, right? You have the economics part, you have the networking. You have all these things in play. So I think it's very much in line with what you would expect if data science actually sustains (mumbles), which thankfully it has.
>> Yeah.
>> And I think the reality is, you know, we like to boil things down into nice, simple concepts, but in the real world, when you're actually figuring it all out, it's going to be multiple effects. It's going to be, you know, a lot of different things that interact.
>> And they kind of really teased out their cloud strategy in a very elegant way. I mean, they essentially said, 'Look, we're into the cloud and we're not going to try to.' They didn't say it directly, but they basically said it. We're not going to compete with Amazon head-to-head. We're going to let our offerings do the talking. We're going to use data and give customers choice with multi-cloud. How does that jibe for you? How does that work? Because at the end of the day I've got to have business logic. I need applications.
>> Yes.
>> You know, whether it's blockchains, cryptocurrency or apps. The killer app's now money.
>> Yep.
>> If no one's making any money.
>> Sure.
>> No commerce is being done.
>> Right. I mean, I think it makes sense. You know, Amazon has such a stronghold in the infrastructure part, right? Being able to store your data elsewhere and have it be cloud. I don't think that was really IBM's core business. You know, a lot of, I think, their business model was built around business and business relationships, and these days one of the great things about all these data technologies is that one company doesn't have to do all of it, right? You have partnerships and actual partners, so that, you know, one company does AI. You partner with another company that has data. And that way you can actually both make money, right? There's more than enough work to go around, and that much you can say having worked on data science teams, right? If I can offload some of my work to different divisions, fantastic. That'd be great. Saves us time. You get to market faster. You can build things quicker.
So I think that's one of the great things about what's happening with data these days, right? There's enough work to go around.
>> And it's beautiful too, because if you think about the concept that made cloud great, it's DevOps. Blockchain is an opportunity to use decentralization to take away a lot of inefficiencies. AI is also an automation opportunity to create value. So you've got inefficiencies on the blockchain side and AI to create value. Your thoughts and reaction to where that's going to go? You know, in light of the first death involving an Uber self-driving car. Again, historic yesterday, right? And so, you know, the reality is right there. We're not perfect.
>> Yeah.
>> But there's a path.
>> Well, so most of it's inefficiency out there. It's not the technology. It's all the people using technology, right? You broke the logic by putting in something you shouldn't have put in that data set, you know? The data's now dirty because you put in things that, you know, the developer didn't think you'd put in there. So the reality is we're going to keep making mistakes, and there will be more and more opportunities for new technologies to help, you know, clear that up.
>> So I was talking to Rob Thomas, GM of the analytics team. You know Rob, great guy. He's smart. He's also an executive, but he knows the tech. He and I were talking about this notion of data containers. So with Kubernetes now front and center as an orchestration layer for cloud and application workloads, IBM has an interesting announcement with this cloud private approach, where data is the central thing in it. Because you've got things like GDPR out there, and the regulatory environment's not going to get any easier. You've got blockchain crypto. That's a regulatory nightmare. We know GDPR. That's a total nightmare. So this is happening, right? So what should customers be doing, in your experience? Customers are scratching their heads. They don't want to make a wrong bet, but they need good data, good strategy. They need to do things differently. How do they get the best out of their data architecture, knowing that there are hurdles and potential blockers in front of them?
>> Well, so I think you want to be careful of what you select, and how much you're going to be indebted to that one service that you selected, right? So if you're not sure yet, maybe you don't want to invest all of your budget into this one thing you're not sure is going to be what you really want to be paying for in a year or two, right? So I think being really open about how you're going to plan for things long term, and thinking about where you can have some flexibility, whereas certain things you can't. For instance, if you're going to be in an industry that is going to be, you know, strict on regulatory requirements, right? Then you have less wiggle room than, let's say, an industry where that's not going to be an absolutely necessary part of your technology.
>> Let me ask you a question, and being kind of a historian, you know, one year is like seven dog years, or whatever the expression is, in the data space. It just seems like yesterday that Hadoop was going to save the world. So with that as kind of context, what are some technologies that just didn't pan out? Is the data lake working? You know, what didn't work, and what replaced it, if you can make an observation?
>> Well, so I think that's hard, because I think the way I understood technology is probably not the way everyone else did, right? I mean, you know, at the end of the day it just ends up being a way to store data, right?
And just being able to use, you know, more information, stored faster. But I'll tell you what I think is hilarious. I've seen people using Hadoop and then writing SQL queries the same way we did like ten-plus years ago, same inefficiencies, and they're not leveraging the fact that it's Hadoop, right? They're treating it like, I want to create eight million tables and then use joins. So they're not really using the technology. I think that's probably the biggest disappointment, that without that knowledge sharing, without education, you have people making the same mistakes you made when technology wasn't as efficient.
>> I mean, if you're a hammer, everything looks like a nail, I guess, if that's the expression.
>> Right.
>> On the exciting side, what are you excited about in technology right now? What are you looking at that's, you know, a next-20-mile stare of potential goodness that could be coming out of the industry?
>> So I think any time you have better science, better measurements. So measurement's huge, right? If you think about the media industry, right? Everyone's trying to measure. I think there was an article that came out about some of YouTube's failures around measurement, right? And I think in general, like, Facebook is, you know, very well known for measurement. That's going to be really interesting to see, right? What methodologies come out in terms of how well we can measure? I think another one will be, say, targeted advertising, right? That's another huge market that, you know, a lot of companies are going after. I think what's really going to be cool in the next few years is to see what people come up with, right? It's really the human ingenuity of it, right? We have the technology now. We have data engineers. What can we actually build? And how are we going to be able to partner to be able to do that?
Not just the financial sector but also the legal sector and then you know say, the math and algorithms and you know. Having that integration of being able to build cooler things for that reason. >> Yeah the math's certainly exciting. Machine learning, obviously that's well documented. The growth and success of what, and certainly the interests are there. You seeing Amazon celebrating all the time. I just saw Werner Vogels, the CTO. Talking about another SageMaker, a success. They're looking at machine learning that way. You got Google with TensorFlow. You've got this goodness in these libraries now that are in the community. It's kind of a perfect storm of innovation. What's new in the ML world that developers are getting excited about that companies are harnessing for value? You seeing anything there? Can you share some commentary on the current machine learning trends? >> So I think a lot of companies have gotten a little more adjusted to the idea of ML. At the beginning everyone was like, 'Oh this is all new.' They loved the idea of it but they didn't really know what they were doing, right? Right now they know a little bit more. I think in general everyone thinks deep learning is really cool, neural networks. I think what's interesting though is everyone's trying to figure out where's the line. What's the different between AI versus machine learning versus deep learning versus neural networks. I think it's a little bit fun for me just to see everyone kind of struggle a little bit and actually even know the terminology so we can have a conversation. So I think all of that, right? Just anything related to that you know, when do you TensorFlow? What do you use it for? And then also say, from Google right? Which parts do you actually send through an API? I mean that's some of the conversations I've been having with people in the business industry, like which parts do you send through an API. Which parts do you actually have in house versus you know, having to outsource out? >> And that's really kind of your thinking there is what, around core competencies where people need to kind of own it and really build a core competency and then outsource where its more a femoral invalue. Is there a formula, I guess to know when to bring it in house and build around? >> Right. >> What's your thoughts there? >> Well part of it, I think is scalability. If you don't have the resources or the time, right? Sometimes time. If you don't have the time to build it in house, it does make sense actually to outsource it out. Also if you don't think that's part of your core business, developing that within house do you're spending all that money and resources to hire the best data scientists, may not be worth it because in fact the majority of your actual sales is with the sale department. I mean they're the ones that actually bring in that revenue. So I think it's finding a balance of what investment's actually worth it. >> And sometimes personnel could leave and you could be a big problem, you know. Someone walks about the door, gets another job because its a hot commodity to be. >> That's actually one of the big complaints I've heard is that we spend all this time investing in certain young people and then they leave. I think part of this is actually that human factor. How do you encourage them to stay? >> Let's talk about you. How did you get here? School? Interests? Did you go off the path? Did you come in from another vector? 
How did you get into what you're doing now? And share a little bit about who you are.
>> Yeah, so I studied economics, mathematics, and creative writing as an undergrad, and statistics as a grad student. So, you know, kind of a perfect storm.
>> Natural math, bring it all together.
>> Yeah, but you know, it's funny, because I actually wrote about and talked about how data was going to be this big thing. This was like 2009, 2010, and people didn't think it was that important, you know? I was like, in the next three to five years mathematicians are going to be a hot hire. No one believed me. So I ended up going, 'Okay, well, the economy crashed.' I was in management consulting, in finance, private equity, hedge funds. Everyone swore, like, if you do this you're going to be set for life, right? You're on the path. You'll make money. And then the economy crashed. All the jobs went away. And I went, 'Maybe not the best career choice for me.' So I did what I did at companies. I looked at the market and I went, 'Where's the growth?' I saw tech had growth, and decided I'm going to pick up some skills I've never had before, learn to develop more. I mean, in the beginning I had no idea what an application development process was, right? I'm like, 'What does it mean to actually develop an application?' So the last few years I've really just been learning these things. What's really cool, though, is last year my patents went through and I was actually able to launch something with Box at their keynote. That was really awesome.
>> Awesome.
>> So I came a long way from, I think, having the academic knowledge, to being able to apply it, and then learning the technologies, and then developing the technologies, which is a cool thing.
>> Yeah, and that's a good path, because you came in with a clean sheet of paper. You didn't have any dogma of waterfall and all the technologies. So you kind of jumped in. Did you use like a cloud to build on? Was it Amazon? Was it?
>> Oh, that's funny too. Actually I do know legacy technology quite well, because I was in corporate America before. Yeah, so, like SQL. For instance, when I started working in data science, funny enough, we didn't call it data science. We just called it, like, whatever you call it, you know. There was no data science term at that point. You know, we didn't have that idea of whether to use R or Python. I mean, I've used R over ten years, but it was for statistics. It was never for, like, actual data science work. And then we used SQL in corporate America. When I was teaching, it was like in 2012. Around then, everyone swore that, no, no, they're going to be programmers. Got to know programming. To which I'm like, really? In corporate America, we're going to have programmers? I mean, think about how long it's going to take to get someone to learn any language. And of course, now everyone's learning. It's all SQL again, right? So.
>> Isn't it fun, like, when you see someone on Facebook or LinkedIn, 'Oh man, data's the new oil.' And then you say, 'Yeah, here's a blog post I wrote in 2009.'
>> Right. Yeah, exactly. Well, so funny enough, Ginni Rometty today was talking about exponential versus linear, and that's one of the things I've been saying over the last year, because, you know, you want exponential growth. Because linear, anyone can do. That's a tweet. That's not really growth.
>> Well, we value your opinion. You've been great on The Cube. Great to help us out on those panels, got a great view. What's going on with your company? What are you working on now?
What's exciting you these days?
>> Yeah, so one of the cool things we worked on, it's very much in line with what the IBM announcement was, so being smarter, right? So I developed some technology in the photo industry, digital asset management, as well as being able to automate the renaming of files, right? So you think about, you probably have a picture on your digital camera you never moved over because, you know, I remember the process. You open it, you rename it, you save it. You open the next one. It takes forever.
>> Sometimes it's the same number. I've got same-version files. It's a nightmare.
>> Exactly. So I basically automated that process of having all of that automatically renamed. So in the demo that I did, I had 120 photos renamed in less than two minutes, right? Just making it faster and smarter. So really developing technologies that you can actually use every day and leverage for things like photography, and some cooler stuff with OCR, which is the long-term goal. To be able to allow photographers to never touch the computer and have all of their clients' photos automatically uploaded, renamed and sent to the right locations instantly.
>> How did you get to start that app? Are you into photography or?
>> No.
>> More of, I've got a picture problem and I've got to fix it?
>> Well, actually it's funny. I had a photographer taking my picture and she showed me what she does, the process. And I went, 'This is not okay. You can do better than this.' So I can code, so I basically went to Python and went, 'Alright, I think this could work,' built a proof of concept and then decided to patent it.
>> Awesome. Well, congratulations on the patent. Final thoughts here about IBM Think? Overall sentiment of the show? Ginni's keynote. Did you get a chance to check anything out? What are the hallway conversations like? What are some of the things that you're hearing?
>> So I think there's a general excitement about what might be coming, right? So a lot of the people who are here are actually here to, I think, share notes. They want to know what everyone else is doing, so that's actually great. You get to see more people here who are actually interested in this technology. I think there are probably some questions about alignment, about where does everything fit. That seems to be a lot of the conversation here. It's much bigger this year, as I'm sure you've noticed, right? It's a lot bigger, so that's probably the biggest thing I've heard, like there are so many more people than we expected there to be.
>> I like the big-tent events. I'm a big fan of them. I think if I were going to be critical I would say they should do a business event and do a technical one under the same kind of theme, and bring more alpha geeks to the technical one and make this much more of a business conversation, because the business transformation seems to be the hottest thing here. But I want to get down in the weeds, you know? Get down and dirty, so I would like to see two. That's my take.
>> I think it's really hard to cater to both. Like, whenever I give a talk, I don't give a really nerdy talk to, say, a business crowd. I don't give a really business talk to a nerdy crowd, you know?
>> It's hard.
>> You just have to know, right? I think they both have a very different sensibility, so really, if you want to have a successful talk, generally you want both.
>> Jennifer, thanks so much for coming by and spending some time with The Cube. Great to see you. Thanks for sharing your insights. Jennifer Shin here inside The Cube at IBM Think 2018. I'm John Furrier, host of The Cube.
We'll be back with more coverage after this short break.
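Since Shin mentions building her proof of concept in Python, a minimal sketch of the kind of batch renaming she describes may be useful here. Her patented pipeline is not public, so everything in this sketch is an assumption for illustration: the naming scheme (client plus timestamp), the folder path, and the use of file modification time as a dependency-free stand-in for EXIF capture time.

```python
import datetime
from pathlib import Path

def rename_photos(folder: str, client: str) -> None:
    """Rename every JPEG in `folder` to <client>_<timestamp>_<seq>.jpg."""
    for seq, photo in enumerate(sorted(Path(folder).glob("*.jpg")), start=1):
        # File mtime stands in for EXIF capture time in this sketch.
        taken = datetime.datetime.fromtimestamp(photo.stat().st_mtime)
        stamp = taken.strftime("%Y%m%d_%H%M%S")
        photo.rename(photo.with_name(f"{client}_{stamp}_{seq:04d}.jpg"))

# Hypothetical usage: a 120-photo shoot renames in well under a second.
rename_photos("/photos/shoot_2018_03", "acme_wedding")
```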
Wrap | Machine Learning Everywhere 2018
>> Narrator: Live from New York, it's theCUBE. Covering machine learning everywhere. Build your ladder to AI. Brought to you by IBM.
>> Welcome back to IBM's Machine Learning Everywhere. Build your ladder to AI. Along with Dave Vellante, I'm John Walls here, wrapping up in New York City. Just about done with the programming here in Midtown. Dave, let's just take a step back. We've heard a lot, seen a lot, talked to a lot of folks today. First off, tell me, AI. We've heard some optimistic outlooks, some, I wouldn't say pessimistic, but some folks saying, 'Eh, hold off.' Not as daunting as some might think. So just your take on the artificial intelligence conversation we've heard so far today.
>> I think generally, John, that people don't realize what's coming. I think the industry in general, our industry, the technology industry, the consumers of technology, the businesses that are out there, they're steeped in the past. That's what they know. They know what they've done, they know the history, and they're looking at that as past equals prologue. Everybody knows that's not the case, but I think it's hard for people to envision what's coming, and what the potential of AI is. Having said that, Jennifer Shin is a near-term pessimist on the potential for AI, and rightly so. There are a lot of implementation challenges. But as we said at the open, I'm very convinced that we are now entering a new era. The Hadoop big data industry is going to pale in comparison to what we're seeing. And we're already seeing very clear glimpses of it. The obvious things are Airbnb and Uber, and the disruptions that are going on with Netflix and over-the-top programming, and how Google has changed advertising, and how Amazon is changing and has changed retail. But what you can see, and again, the best examples are Apple getting into financial services, moving into healthcare, trying to solve that problem. Amazon buying a grocer. The rumor that I heard about Amazon potentially buying Nordstrom, which my wife said is a horrible idea. (John laughs) But the fact that they can even think about doing that is a function of their being a digital-first company, built around data, and they can take those data models and apply them to different places. Who would have thought, for example, that Alexa would be so successful? That Siri is not so great?
>> Alexa's become our best friend.
>> And it came out of the blue. And it seems like Google has a pretty competitive piece there, but I can almost guarantee that doing this with our thumbs is not the way in which we're going to communicate in the future. It's going to be some kind of natural language interface that's going to rely on artificial intelligence and machine learning and the like. And so, I think it's hard for people to envision what's coming, other than the fast-forward where machines take over the world and Stephen Hawking and Elon Musk say, 'Hey, we should be concerned.' Maybe they're right, not in the next 10 years.
>> You mentioned Jennifer, we were talking about her and the influencer panel, and we've heard from others as well: it's a combination of human intelligence and artificial intelligence. That combination's more powerful than just artificial intelligence, and so there is a human component to this. So, for those who might be on the edge of their seat a little bit, or looking at this from a slightly more concerning perspective, maybe not the case. Maybe not necessary, is what you're thinking.
>> I guess at the end of the day, the question is, 'Is the world going to be a better place with all this AI? Are we going to be more prosperous, more productive, healthier, safer on the roads?' I am an optimist. I come down on the side of yes. I would not want to go back to the days where I didn't have GPS. That's worth it to me.
>> Can you imagine, right? If you did that now, you go back five years, just five years from where we are now, back to where we were. Waze was nowhere, right?
>> All the downside of these things, I feel, is offset by that. And I do think it's incumbent upon the industry to try to deal with the problems, especially with young people, the blue-light problem.
>> John: The addictive issue.
>> That's right. But I feel like those downsides are manageable, and the upsides are of enough value that society is going to continue to move forward. And I do think that humans and machines are going to continue to coexist, at least in the near to mid term, maybe even the reasonable long term. But the question is, 'What can machines do that humans can't do?' And 'What can humans do that machines can't do?' And the answer to that changes every year. It's like I said earlier: not too long ago, machines couldn't climb stairs. They can now, robots can climb stairs. Can they negotiate? Can they identify cats? Who would've imagined that all these cats on the Internet would've led to facial recognition technology? It's improving very, very rapidly. So, I guess my point is that that is changing very rapidly, and there's no question it's going to have an impact on society and an impact on jobs, and all those other negative things that people talk about. To me, the key is, how do we embrace that and turn it into an opportunity? And it's about education, it's about creativity, it's about having multiple talents and disciplines that you can tap. So we talked about this earlier: not just being an expert in marketing, but being an expert in marketing with digital as an understanding in your toolbox. So it's that two-tool star that I think is going to emerge. And maybe it's more than two tools. So that's how I see it shaping up. And the last thing is disruption. We talked a lot about disruption. I don't think there's any industry that's safe. Colin was saying, 'Well, certain industries that are highly regulated-' In some respects, I can see those taking longer. But I see those as the most ripe for disruption. Financial services, healthcare. Can't we solve the HIPAA challenge? We can't get access to our own healthcare information. Well, things like artificial intelligence and blockchain, we were talking off-camera about blockchain, those things, I think, can help solve the challenge of, maybe I can carry around my health profile, my medical records. I don't have access to them, it's hard to get them. So can things like artificial intelligence improve our lives? I think there's no question about it.
>> What about, on the other side of the coin, if you will, the misuse concerns? There are a lot of great applications. There are a lot of great services. As you pointed out, a lot of positive, a lot of upside here. But as opportunities become available and technology develops, you run the risk of somebody crossing the line for nefarious means. And there's a lot more at stake now because there's a lot more of us out there, if you will. So, how do you balance that?
>> There's no question that's going to happen. And it has to be managed.
But even if you could stop it, I would say you shouldn't, because the benefits are going to outweigh the risks. And again, the question we asked the panelists: 'How far can we take machines? How far can we go?' That's question number one. Number two is, 'How far should we go?' We're not even close to the 'should we go' yet. We're still on the 'How far can we go?' Jennifer was pointing out, I can't get my password reset 'cause I've got to call somebody. That problem will be solved.
>> So, you're saying it's more of a practical consideration now than an ethical one, right now?
>> Right now. And there are certainly still ethical considerations, don't get me wrong, but I see light at the end of the privacy tunnel. I see artificial intelligence as, well, analytics helping us solve credit card fraud and things of that nature. Autonomous vehicles are just fascinating, right? Both culturally, we talked about that, you know, we learned how to drive a stick shift. (both laugh) It's a funny story you told me.
>> Not going to worry about that anymore, right?
>> But it was an exciting time in our lives, so there's a cultural downside of that. I don't know what the highway death toll number is, but it's enormous. If cell phones caused that many deaths, we wouldn't be using them. So that's a problem that I think things like artificial intelligence and machine intelligence can solve. And then the other big thing that we talked about is, I see a huge gap between traditional companies and these born-in-the-cloud, born-data-oriented companies. We talked about the top five companies by market cap: Microsoft, Amazon, Facebook, Alphabet, which is Google, who am I missing?
>> John: Apple.
>> Apple, right. And those are very much data companies. Apple's got the data from the phones; Google, we know where they get their data, et cetera, et cetera. Traditional companies, however, their data resides in silos. Jennifer talked about this, Craig as well as Colin. Data resides in silos, it's hard to get to. It's a very human-driven business and the data is bolted on. With the companies that we just talked about, it's a data-driven business, and the humans have expertise to exploit that data, which is very important. So there's a giant skills gap in existing companies. There are data silos. The other thing we touched on is, where does innovation come from? Innovation drives value, drives disruption. So the innovation comes from data. He or she who has the best data wins. It comes from artificial intelligence, and the ability to apply artificial intelligence and machine learning. And I think something that we take for granted a lot, but it's cloud economics. And it's more than just, and one of the folks mentioned this in the interview, it's more than just putting stuff in the cloud. It's certainly managed services, that's part of it. But it's also economies of scale. It's marginal costs that are essentially zero. It's speed, it's low latency. It's, again, global scale. You combine those things, data, artificial intelligence, and cloud economics, and that's where the innovation is going to come from. And if you think about what Uber's done, what Airbnb has done, where Waze came from, they were picking and choosing from the best digital services out there, and then developing their own software on top of this, what my colleague Dave Misheloff calls the matrix. And, just to repeat, in that matrix the vertical dimension is industries.
The horizontal dimension is technology platforms: cloud, data, mobile, social, security, et cetera. They're building companies on top of that matrix. So, how you leverage the matrix is going to determine your future. Whether or not you get disrupted, whether you're the disruptor or the disruptee. It's not just about, and we talked about this at the open, cloud, SaaS, mobile, social, big data. They're kind of yesterday's news. It's now artificial intelligence, machine intelligence, deep learning, machine learning, cognitive. We're still trying to figure out the parlance. You can feel the changes coming. I think this matrix idea is very powerful, and how that gets leveraged in organizations ultimately will determine the levels of disruption. But every single industry is at risk. Because every single industry is going digital, and digital allows you to traverse industries. We've said it many times today. Amazon went from bookseller to content producer to grocer-
>> John: To grocer now, right?
>> To maybe high-end retailer. Content company, Apple with Apple Pay, and companies getting into healthcare, trying to solve healthcare problems. The future of warfare, you live in the Beltway. The future of warfare and cybersecurity are just coming together. One of the biggest issues I think we face as a country is we have fake news, we're seeing the weaponization of social media, as James Scott said on theCUBE. So, all these things are coming together that I think are going to make the last 10 years look tame.
>> Let's just switch over to the currency of AI, data. And we've talked to, Sam Lightstone today was talking about the database querying that they've developed with the Queryplex product. Some fascinating capabilities now that make it a lot richer, a lot more meaningful, a lot more relevant. And that seems to be, really, an integral step to making that stuff come alive and really making it applicable to improving your business. Because they've come up with some fantastic new ways to squeeze out the data that's relevant and get it to the user.
>> Well, think about what I was saying earlier about data as a foundational core and human expertise around it, versus what most companies are, which is human expertise with data bolted on, or data in silos. What was interesting about Queryplex, I think they called it, is it essentially virtualizes the data. Well, what does that mean? That means I can have data in place, but I can have access to that data. I can democratize that data, make it accessible to people so that they can become data-driven, with data as the core. Now, what I don't know, and I don't know enough, I just heard about it today, I missed that announcement, I think they announced it a year ago. He mentioned DB2, he mentioned Netezza. Most of the world is not on DB2 and Netezza, even though IBM customers are. I think they can get to Hadoop data stores and other data stores, I just don't know how wide that goes, what the standards look like. He joked about the standards as, the great thing about standards is-
>> There are a lot of 'em. (laughs)
>> There's always another one you can pick if this one fails. And he's right about that. So, that was very interesting. And so, this is again the question: can traditional companies close that machine learning, machine intelligence, AI gap? Close being, close the gap that the big five have created. And even the small guys, small guys like Uber and Airbnb, and so forth, but even those guys are getting disrupted. The Airbnbs and the Ubers, right?
Again, blockchain comes in and you say, 'Why do I need a trusted third party called Uber? Why can't I do this on the blockchain?' I predict you're going to see even those guys get disrupted. And I'll say something else. It's hard to imagine that a Google or a Facebook can be unseated. But I feel like we may be entering an era where this is their peak. Could be wrong, I'm an Apple customer. I don't know, I'm not as enthralled as I used to be. They've got billions in the bank. But is it possible that open source and blockchain and the citizen developer, the weekend and nighttime developers, can actually attack that engine of growth of the last 10 years, 20 years, and really break that monopoly? The Internet has basically become an oligopoly, where five companies, six companies, whatever, 10 companies kind of control things. Is it possible that open source software, AI, cryptography, all this activity could challenge the status quo? Being in this business as long as I have, things never stay the same. Leaders come, leaders go.
>> I just want to say, never say never. You don't know.
>> So, it brings it back to IBM, which is interesting to me. It was funny, I was asking Rob Thomas a question about disruption, and I think he misinterpreted it. I think he was thinking that I was saying, 'Hey, you're going to get disrupted by all these little guys.' IBM's been getting disrupted for years. They know how to reinvent. A lot of people criticize IBM, how many quarters they haven't had growth, blah, blah, blah, but IBM's made some big, big bets on the future. People criticize Watson, but it's going to be really interesting to see how all this investment that IBM has made is going to pay off. They were early on. People in the Valley like to say, 'Well, the Facebooks, and even Amazon, Google, they've got the best AI. IBM is not there with them.' But think about what IBM is trying to do versus what Google is doing. They're very consumer-oriented, solving consumer problems. Consumers have really led the consumerization of IT, that's true, but none of those guys are trying to solve cancer. So IBM is talking about some big, hairy, audacious goals. And I'm not as pessimistic as some others you've seen in the trade press; it's popular to do. So, bringing it back to IBM, I see IBM as trying to disrupt itself. The challenge IBM has is that it's got a lot of legacy software products that it's purchased over the years. And it's got to figure out how to get through those. So, things like Queryplex allow them to create abstraction layers. Things like Bluemix allow them to bring together their hundreds and hundreds of SaaS applications. That takes time, but I do see IBM making some big investments to disrupt themselves. They've got a huge analytics business. We've been covering them for quite some time now. They're a leader, if not the leader, in that business. So, their challenge is, 'Okay, how do we now apply all these technologies to help our customers create innovation?' What I like about the IBM story is they're not out saying, 'We're going to go disrupt industries.' Silicon Valley has a bifurcated disruption agenda. On the one hand, they're driving cloud, and SaaS, and mobile, and social, very disruptive technologies. On the other hand, is Silicon Valley going to disrupt financial services, healthcare, government, education? I think they have plans to do so. Are they going to be able to execute that dual disruption agenda?
Or are the consumers of AI and the doers of AI going to be the ones who actually do the disrupting? We'll see. I mean, Uber's obviously disrupted taxis, a Silicon Valley company. Is that too much to ask Silicon Valley to do? That's going to be interesting to see. So, my point is, IBM is not trying to disrupt its customers' businesses, and it can point to Amazon trying to do that. Rather, it's saying, 'We're going to enable you.' So it could be really interesting to see what happens. You're down in DC. Jeff Bezos spent a lot of time there at the Washington Post.
>> We just want the headquarters, that's all we want. We just want the headquarters.
>> Well, to the point, if you've got such a growing company monopoly, maybe you should set up an HQ2 in DC.
>> Three of the 20, right, for a DC base?
>> Yeah, he was saying the other day that maybe we should think about enhancing, he didn't call it social security, but the government, essentially, helping people plan for retirement and the like. I heard that and said, 'Whoa, is he basically telling us he's going to put us all out of jobs?' (both laugh) So, if I'm a customer of Amazon's, that's kind of scary. So, one of the things they should absolutely do is spin out AWS. I think that helps solve that problem. But, back to IBM, Ginni Rometty was very clear at the World of Watson conference, the inaugural one, that we are not out trying to compete with our customers. I would think that resonates with a lot of people.
>> Well, to be continued, right? Next month, back with IBM again?
>> Yeah, I think third week in March. Monday, Tuesday, Wednesday, theCUBE's going to be there. Next week we're in the Bahamas. This week, actually.
>> Not as a group taking vacation. Actually a working expedition.
>> No, it's that blockchain conference. Actually, it's this week, what am I saying, next week?
>> Although I'm happy to volunteer to grip on that shoot, by the way.
>> Flying out tomorrow, it's happening fast.
>> Well, enjoyed this, always good to spend time with you. And good to spend time with you as well. So, you've been watching theCUBE, machine learning everywhere. Build your ladder to AI. Brought to you by IBM. Have a good one. (techno music)
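To ground the data-virtualization idea Vellante describes above, here is a toy Python sketch of the concept: one query interface over data that stays where it lives. This illustrates the general pattern only, not IBM's Queryplex implementation; the table names and sources are hypothetical.

```python
import sqlite3

class VirtualCatalog:
    """Route table names to whichever backing store actually holds them."""
    def __init__(self):
        self.sql_sources = {}     # table name -> sqlite3 connection
        self.memory_sources = {}  # table name -> list of dict rows

    def register_sqlite(self, table, conn):
        self.sql_sources[table] = conn

    def register_rows(self, table, rows):
        self.memory_sources[table] = rows

    def scan(self, table):
        """Yield rows as dicts, regardless of where the table lives."""
        if table in self.sql_sources:
            conn = self.sql_sources[table]
            conn.row_factory = sqlite3.Row
            for row in conn.execute(f"SELECT * FROM {table}"):
                yield dict(row)
        else:
            yield from self.memory_sources[table]

# The caller never knows, or cares, which silo the data sits in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("west", 250.0)])

catalog = VirtualCatalog()
catalog.register_sqlite("sales", conn)
catalog.register_rows("targets", [{"region": "east", "target": 120.0}])

print(list(catalog.scan("sales")))
print(list(catalog.scan("targets")))
```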
Machine Learning Panel | Machine Learning Everywhere 2018
>> Announcer: Live from New York, it's theCUBE. Covering machine learning everywhere. Build your ladder to AI. Brought to you by IBM.
>> Welcome back to New York City. Along with Dave Vellante, I'm John Walls. We continue our coverage here on theCUBE of Machine Learning Everywhere. Build your ladder to AI, IBM our host here today. We put together, occasionally at these events, a panel of esteemed experts with deep perspectives on a particular subject. Today our influencer panel is comprised of three well-known and respected authorities in this space. Glad to have Colin Sumpter here with us. He's the man with the mic, by the way. He's going to talk first. Colin is an IT architect with CrowdMole. Thank you for being with us, Colin. Jennifer Shin, those of you on theCUBE, you're very familiar with Jennifer, a long-time Cuber. Founded 8 Path Solutions, on the faculty at NYU and Cal Berkeley. And also with us is Craig Brown, a big data consultant. A home game for all of you guys, right, more or less, here we are in the city. So, thanks for having us, we appreciate the time. First off, let's just talk about the title of the event, Build Your Path... Or Your Ladder, excuse me, to AI. What are those steps on that ladder, Colin? The fundamental steps that you've got to jump on, or step on, in order to get to that true AI environment?
>> Getting to that true AI environment, John, is a matter of mastering or organizing your information well enough to perform analytics. That'll give you two choices: either linear regression or supervised classification. And then you actually have enough organized data to talk to your team and organize your team around that data, to begin that ladder to successively benefit from your data science program.
>> Want to take a stab at it, Jennifer?
>> So, I would say compute, right? You need to have the right processing, or at least the ability to scale out, to be able to process the algorithm fast enough to be able to find value in your data. I think the other thing is, of course, the data source itself. Do you have the right data to answer the questions you want to answer? So, I think, without those two things, you'll either have a lot of great data that you can't process in time, or you'll have a great process or a great algorithm that has no real information, so your output is useless. I think those are the fundamental things you really do need to have any sort of AI solution built.
>> I'll take a stab at it from the business side. They have to adopt it first. They have to believe that this is going to benefit them, and that the effort that's necessary to build out the various aspects of algorithms and data subjects is there. So I think adopting the concept of machine learning, and the development aspects that it takes to do that, is a key component to building the ladder.
>> So this just isn't toe in the water, right? You've got to dive in the deep end, right?
>> Craig: Right.
>> It gets to culture. If you look at most organizations, not the big five market-cap companies, but most organizations, data is not at their core. Humans are at their core, human expertise, and data is sort of bolted on. But that has to change, or they're going to get disrupted. Data has to be at the core; maybe the human expertise leverages that data. What are you guys seeing with end customers in terms of their readiness for this transformation?
>> Where I'm seeing customers spending time right now is getting out of the silos.
So, when you speak of culture, that's primarily what the culture surrounds. They develop applications with functionality as a silo, and data specific to that functionality is the lens through which they look at data. They have to get out of that mindset and look at the data holistically and, ultimately, in these events, look at it as an asset.
>> The data is a shared resource.
>> Craig: Right, correct.
>> Okay, and again, with the exception of the... Whether it's Google, Facebook, obviously, but the Ubers, the Airbnbs, et cetera... With the exception of those guys, most customers aren't there. Still, the data is in silos, they've got myriad infrastructure. Your thoughts, Jennifer?
>> I'm also seeing sort of a disconnect between the operationalizing team, the team that runs this code or has a real business need for it, and the research side. Sometimes you'll see corporations with research teams, and there's sort of a disconnect between what the researchers do and what these operations, or marketing, whatever domain it is, are doing in terms of day-to-day operation. So, for instance, a researcher will look really deep into these algorithms, and may know a lot about deep learning in theory, in a theoretical world, and might publish a paper that's really interesting. But that application part, where they're actually being used every day, there's this difference there, where you really shouldn't have that difference. There should be more alignment. I think actually aligning those resources... I think companies are struggling with that.
>> So, Colin, we were talking off camera about RPA, Robotic Process Automation. Where's the play for machine intelligence and RPA? Maybe, first of all, you could explain RPA.
>> David, RPA stands for Robotic Process Automation. That's going to enable you to grow and scale a digital workforce. Typically, it's done in the cloud. The way RPA plays into machine learning and data science is that it allows you to outsource business processes to compensate for the lack of human expertise that's available in the marketplace, because you need competency to enable the technology to take advantage of these new benefits coming into the market. And, when you start automating some of these processes, you can keep pace with the innovation in the marketplace and allow the human expertise to gradually grow into these new data science technologies.
>> So, I was mentioning some of the big guys before. Top five market-cap companies: Google, Amazon, Apple, Facebook, Microsoft, all digital. Microsoft you can argue, but still, pretty digital, pretty data-oriented. My question is about closing that gap. In your view, can companies close that gap? How can they close that gap? Are you guys helping companies close that gap? It's a wide chasm, it seems. Thoughts?
>> The thought on closing the chasm is presenting the technology to the decision-makers. What we've learned is that you don't know what you don't know, so it's impossible to find the new technologies if you don't have the vocabulary to even begin a simple search for these new technologies. And, to close that gap, it really comes down to awareness: events like theCUBE, webinars, different educational opportunities that are available to line-of-business owners, directors, VPs of systems and services, to begin that awareness process, finding consultants... to begin that pipeline enablement, to begin allowing the business to take advantage of and harness data science, machine learning and what's coming.
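As an aside on the RPA idea Sumpter explains above, here is a toy Python sketch of the core pattern: a "digital worker" that handles the routine cases in a business process and escalates only the exceptions to a human. The invoice fields and rules are hypothetical; real RPA platforms such as UiPath or Automation Anywhere wrap this kind of loop in UI-level automation rather than plain Python.

```python
from dataclasses import dataclass

@dataclass
class Invoice:
    vendor: str
    amount: float
    po_number: str

def process(invoice: Invoice, approved_vendors: set) -> str:
    """Apply the routine rules; route anything unusual to a person."""
    if invoice.vendor not in approved_vendors:
        return "escalate: unknown vendor"
    if invoice.amount > 10_000:
        return "escalate: needs human approval"
    if not invoice.po_number:
        return "escalate: missing PO"
    return "auto-approved"

# Hypothetical work queue: the bot clears the routine items at machine
# speed; people spend their time only on the escalations.
queue = [
    Invoice("Acme", 1_200.0, "PO-553"),
    Invoice("Unknown LLC", 90.0, "PO-554"),
    Invoice("Acme", 25_000.0, "PO-555"),
]
for inv in queue:
    print(inv.vendor, "->", process(inv, approved_vendors={"Acme"}))
```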
>> One of the things I've noticed is that there's a lot of information out there, like everyone has a webinar, everyone has tutorials, but there's a lot of overlap. There aren't that many very sophisticated documents you can find about how to implement it in real-world conditions. They all tend to use the same core data set, a lot of these machine learning tutorials you'll find, which is hilarious, because the data set's actually very small. And I know where it comes from, just from having the expertise, but it's not something I'd ever use in the real world. There's a level of skill you need to be able to do any of these methodologies. But that's what's out there. So, there's a lot of information, but it's kind of at a rudimentary level. It's not really at that sophisticated level where you're going to learn enough to deploy in real-world conditions. One of the things I'm noticing is, with the technical teams, with the data science teams, machine learning teams, they're kind of using the same methodologies I used maybe 10 years ago. Because the management who manage these teams are not technical enough. They're business people, so they don't understand how to guide them, how to explain, hey, maybe you shouldn't do that with your code, because that's actually going to cause a problem. You should use parallel code, you should make sure everything is running in parallel so compute's faster. But, if these younger teams are actually learning for the first time, they make the same mistakes you made 10 years ago. So, I think, what I'm noticing is that lack of leadership is partly one of the reasons, and also the assumption that a non-technical person can lead a technical team.
>> So, it's just not skillset on the worker level, if you will. It's also knowledge base on the decision-maker level. That's a bad place to be, right? So, how do you get in the door of a business like that? Obviously, and we've talked about this a little bit today, some companies say, 'We're not data companies, we're not digital companies, we sell widgets.' Well, yeah, but you sell widgets and you need this to sell more widgets. And so, how do you get in the door and talk about this problem that Jennifer just cited? You're signing the checks, man. You're going to have to get up to speed on this, otherwise you're not going to have checks to sign in three to five years. You're done!
>> I think that speaks to use cases. What I'm actually seeing at customers is that there's a disconnect in understanding between the executive teams and the low-level technical teams on what the use case actually means to the business. Some of the use cases are operational in nature. Some of the use cases are data in nature. There's no real conformity on what the use case means across the organization, and that understanding isn't there. And so, the CIOs, the CEOs, the CTOs think, 'Okay, we're going to achieve a certain level of capability if we do a variety of technological things,' and the business is looking to effectively improve, or bring some efficiency to, business processes. At each level within the organization, the understanding is at the level at which the discussions are being made. And so, I'm in these meetings with senior executives and we have lots of ideas on how we can bring efficiencies and some operational productivity with technology. And then we get in a meeting with the data stewards and it's, 'What are these guys talking about?
>> So, it's not just skill set on the worker level, if you will. It's also the knowledge base on the decision-maker level. That's a bad place to be, right? So, how do you get in the door at a business like that? Obviously, and we've talked about this a little bit today, some companies say, "We're not data companies, we're not digital companies, we sell widgets." Well, yeah, but you sell widgets and you need this to sell more widgets. And so, how do you get in the door and talk about this problem that Jennifer just cited? You're signing the checks, man. You're going to have to get up to speed on this, otherwise you're not going to have checks to sign in three to five years, you're done! >> I think that speaks to use cases. What I'm actually seeing at customers is that there's a disconnect in understanding between the executive teams and the low-level technical teams on what the use case actually means to the business. Some of the use cases are operational in nature. Some of the use cases are data in nature. There's no real conformity on what the use case means across the organization, and that understanding isn't there. And so, the CIOs, the CEOs, the CTOs think, "Okay, we're going to achieve a certain level of capability if we do a variety of technological things," and the business is looking to effectively improve or bring some efficiency to business processes. At each level within the organization, the understanding is at the level at which the discussions are being made. And so, I'm in these meetings with senior executives and we have lots of ideas on how we can bring efficiencies and some operational productivity with technology. And then we get in a meeting with the data stewards and it's, "What are these guys talking about? They don't understand what's going on at the data level and what data we have." And then that's where the data quality challenges come into the conversation. So I think that, to close that chasm, we have to figure out who needs to be in the room to effectively help us build the right understanding around the use cases, then bring the technology to those use cases and actually see within the organization how we're affecting that. >> So, to change the questioning here... I want you guys to think about how capable we can make machines in the near term. Let's say the next decade. How capable can we make machines, and are there limits to what we should do? >> That's a tough one. Although you want to go next decade, we're still faced with some of the challenges today in terms of, again, that adoption, the use case scenarios, and then what my colleagues are saying here about the various data challenges and dev ops and things. So, there are a number of things that we have to overcome, but if we can get past those areas in the next decade, I don't think there's going to be much of a limit, in my opinion, as to what the technology can do and what we can ask the machines to produce for us. As Colin mentioned with RPA, I think the capability is there, right? But can we also ultimately, as humans, leverage that capability effectively? >> I get this question a lot. People are really worried about AI and robots taking over, and all of that. And I go... Well, let's think about an example. We've all been online, probably over the weekend, maybe at 3 or 4 AM, checking your bank account, and you get an error message that your password is wrong. And we swear... And I've been there where I'm like, "No, no, my password's right." And it keeps saying that the password is wrong. Of course, then I change it, and it's still wrong. Then, the next day when I log in, I can log in, same password, because they didn't put a great error message there. They just defaulted to "wrong password" when it's probably a server that's down. So, there are these basics, or processes, that we could be improving which no one's improving. So you think, in that example, how many customer service reps are going to be contacted to try to address that? How many IT teams? So, for every one of these bad technologies that are out there, or technologies that are not being run efficiently or in a way that makes sense, you actually have maybe three people that are going to be contacted to try to resolve an issue that maybe could have been avoided to begin with. I feel like it's optimistic to say that robots are going to take over, because you're probably going to need more people to put band-aids on bad technology and bad engineering, frankly. And I think that's the reality of it. If we had hoverboards, that would be great, you know? For a while, we thought we did, right? But we found out, oh, it's not quite hoverboards. I feel like that might be what happens with AI. We might think we have it, and then go, oh wait, it's not really what we thought it was. >> So there are real limits, certainly in the near to mid to maybe even long term, that are imposed. But you're an optimist. >> Yeah. Well, not so much with AI, but everything else, sure. (laughing) AI, I'm a little bit like, "Well, it would be great, but I'd like basic things to be taken care of every day." So, I think the usefulness of technology is not something anyone's talking about.
They're talking about this advancement, that advancement, things people don't understand and don't even know how to use in their lives. Great, great as an idea. But what about useful things we can actually use in our real lives? >> So block and tackle first, and then put some reverses in later, if you will, to switch over to football. We were talking about it earlier, just about basics. Fundamentals: get your fundamentals right and then you can complement them with supplementary technologies. Craig, Colin? >> Jen made some really good points and brought up some very good points, and so has... >> John: Craig. >> Craig, I'm sorry. (laughing) >> Craig: It's alright. >> 10 years out, Jen and Craig spoke to false positives. And false positives create a lot of inefficiency in businesses. So, when you start using machine learning and AI 10 years from now, maybe there are fewer false positives because they've been scored in real time, allowing teams not to have their time and their business resources consumed trying to resolve false positives. These false positives have a business cost that, today, some businesses might not be able to record. In financial services, banks count money not lent. But in everyday business, a lot of businesses aren't counting the monetary consequences of false positives and the drag they have on their operational ability and capacity. >> I want to ask you guys about disruption. If you look at where the digital disruptions have taken place: obviously retail, certainly advertising, certainly content businesses... There are some industries that haven't been highly disrupted: financial services, insurance, we were talking earlier about aerospace, defense rather. Is any business, any industry, safe from digital disruption? >> There are. Certain industries are just highly regulated: healthcare, financial services, real estate, transactional law... These are extremely regulated businesses that are... I don't want to say susceptible to technology, but they can be disrupted at a basic level, operational efficiency, to make these business processes happen more rapidly, more accurately. >> So you guys buy that? There's some... I'd like to get a little debate going here. >> So, I work with the government, and the government's trying to change things. I feel like that's kind of a sign, because they tend to be a little bit slower than, say, private industries or private companies. They have data, and they're trying to actually put it into a system. At some point, I got contacted about files that they found, like birth records and marriage records from 100-plus years ago, and putting those into the system. By the way, I did look into it: there was no way to use AI for that, because there was no standardization across these files, so they have half a million files, but someone's probably going to have to enter that in manually. The reality is, I think because there's a demand for having things be digital, we aren't likely to see a decrease in that. We're not going to have one industry that goes, "Oh, your files aren't digital." Probably because they also want to be digital. The companies themselves, the employees themselves, want to see that change. So, I think there's going to be this continuous move toward it, but there's the question of, "Are we doing it better?" Is it better than, say, having it on paper sometimes?
Because sometimes I just feel like it's easier on paper than having to look through my phone, look through the app. There are so many apps now! >> (laughing) I've got my index cards still, Jennifer! Dave's got his notebook! >> I'm not sure I want my ledger to be on paper... >> Right! So I think that's going to be an interesting thing, when people take a step back and go, "Is this really better? Is this actually an improvement?" Because I don't think all things are better digital. >> That's a great question. Will the world be a better, more prosperous place... Uncertain. Your thoughts? >> I think competition is probably the driver as to who has to do this now, who's not safe. The organizations that are heavily regulated or compliance-driven can actually use that as the reasoning for not jumping into the barrel right now, and letting it happen in other areas first, watching the technology mature-- >> Dave: Let's wait. >> Yeah, let's wait, because that's traditionally how they-- >> Dave: Good strategy in your opinion? >> It depends on the entity, but I think there's nothing wrong with being safe. There's nothing wrong with waiting for a variety of innovations to mature. What level of maturity is probably another discussion for another day, but I think that it's okay. I don't think that everyone should jump in. Get some lessons learned, watch how the other guys do it. I think that safety is in the eye of the beholder, right? But some organizations are just fiercely competitive, and they need a competitive edge, and this is where they get it. >> When you say safety, do you mean safety in making decisions, or do you mean safety in protecting data? How are you defining safety? >> Safety in terms of when they need to launch and look into these new technologies as a basis for change within the organization. >> What about the other side of that point? There's so much more data out there, so much more behavior, so many more attitudes, and so on and so forth. And there are privacy issues and security issues and all that... Those are real challenges for any company, and they're becoming exponentially more important as more is at stake. So, how do companies address that? That's got to be absolutely part of their equation as they decide what these future deployments are, because they're going to have great, vast reams of data, but that's a lot of vulnerability too, isn't it? >> It's as vulnerable as they... So, from an organizational standpoint, they're accustomed to these... These challenges aren't new, right? We still see data breaches. >> They're bigger now, right? >> They're bigger, but we still occasionally see data breaches in organizations where we don't expect to see them. I think that, from that perspective, it's the experiences of the organizations that determine the risks they want to take on, to a certain degree. And then, based on those risks, and how they handle adversity within those risks, from an experience standpoint they know ultimately how to handle it, and get themselves to a place where they can figure out what happened and then fix the issues. And then the others watch while these risk-takers take on these types of scenarios. >> I want to underscore this whole disruption thing and ask... We don't have much time, I know we're going a little over. I want to ask you to pull out your Hubble telescopes. Let's take a 20 to 30 year view, so we're safe, because we know we're going to be wrong.
I want a sort of scale of 1 to 10, high likelihood being 10, low being 1. Maybe sort of rapid fire. Do you think large retail stores are going to mostly disappear? What do you guys think? >> I think the way that they are structured, the way that they interact with their customers, might change, but you're still going to need them, because there are going to be times when you need to buy something. >> So, six, seven, something like that? Is that kind of the consensus, or do you feel differently, Colin? >> I feel retail's going to be around, especially fashion, because certain people, myself included, need to try their clothes on. So, you need a location to go to, a physical location to actually feel the material, experience the material. >> Alright, so we kind of have a consensus there. It's probably no. How about driving-- >> I was going to say, Amazon opened a book store. Just saying, it's kind of funny because they got... And they opened the book store, so you know, I think what happens is people forget over time, they go, "It's a new idea." It's not so much a new idea. >> I heard a rumor the other day that their next big acquisition was going to be, not Neiman Marcus. What's the other high-end retailer? >> Nordstrom? >> Nordstrom, yeah. And my wife said, "Bad idea, they'll ruin it." Will driving and owning your own car become an exception? >> Driving and owning your own car... >> Dave: 30 years from now, we're talking. >> 30 years... Sure, I think the concept is there. I think that we're looking at that. IoT is moving us in that direction. 5G is around the corner. So, I think the makings of it are there. So, since I can dare to be wrong, yeah, I think-- >> We'll be on 10G by then anyway, so-- >> Automobiles really haven't been disrupted, the car industry. But you're forecasting, and I would tend to agree. Do you guys agree or not, or do you think that culturally it's, I want to drive my own car? >> Yeah, I think a couple of things. How well engineered is it? Because if it's badly engineered, people are not going to want to use it. For instance, there are people who could take public transportation. It's the same idea, right? If everything's autonomous, you have to fall in line. There's going to be some system, some order to it. And you might go-- >> Dave: Good example, yeah. >> You might go, "Oh, I want it to be faster. I don't want to be in line with that autonomous vehicle. I want to get there faster, get there sooner." And there are people who want to have that control over their lives, so that they're not subject to things like schedules all the time; that's their constraint. So, I think if the engineering is bad, you're going to have more problems, and people are probably going to move away from wanting things to be autonomous. >> Alright, Colin, one for you. Will robots, and maybe 3D printing, RPA for example, reverse the trend toward offshore manufacturing? >> 30 years from now, yes. I think with robotic process automation, eventually you're going to be at your cubicle or your desk, or whatever it is, and you're going to be able to print office supplies. >> Do you guys think machines will make better diagnoses than doctors? Ohhhhh. >> I'll take that one. >> Alright, alright. >> I think yes, to a certain degree, because if you look at the...
problems with diagnosis: right now they miss things, and I don't know how people, even 30 years from now, will be different from that perspective, whereas machines can look at quite a bit of data about a patient in split seconds and say, "Hey, the likelihood of this disease recurring for you is nil to none, because here's what I'm basing it on." I don't think doctors will be able to do that. Now, again, daring to be wrong! (laughing) >> Jennifer: Yeah so-- >> Don't tell your own doctor either. (laughing) >> That's true. If anything happens, we know, we all know. I think it depends. So maybe 80%, some middle percentage, might be the case. I think extreme outliers, maybe not so much. Think about anything that's programmed into an algorithm: someone probably identified that disease, a human being identified that as a disease, made that connection, and then it gets put into the algorithm. I think what will happen is that, for the 20% that isn't being done well by machine, you'll have people who are more specialized being able to identify the outlier cases from, say, the standard. Normally, if you have certain symptoms, you have a cold; those are kind of the standard ones. If you have this weird sort of thing where there are new variables, environmental variables for instance, your environment can actually lead to you having cancer. So, there are other factors, other than just your body and your health, that are going to be important to think about when diagnosing someone. >> John: Colin, go ahead. >> I think machines aren't going to out-decision doctors. I think doctors are going to work well with machine learning. For instance, there's a published account of Watson doing the research of a team of four in 10 minutes, when it normally takes a month. So those doctors, to bring up Jen and Craig's point, are going to have more time to focus in on what the actual symptoms are, to resolve the outcome of patient care and patient services in a way that benefits humanity. >> I just wish, Dave, that you would have picked a shorter horizon than... 30 years. 20, I feel good about our chances of seeing that. 30, I'm just not so sure, I mean... For the two old guys on the panel here. >> The consensus is 20 years, not so much. But beyond 10 years, a lot's going to change. >> Well, thank you all for joining this. I always enjoy the discussions. Craig, Jennifer and Colin, thanks for being here with us on theCUBE, we appreciate the time. Back with more here from New York right after this. You're watching theCUBE. (upbeat digital music)
Data Science: Present and Future | IBM Data Science For All
>> Announcer: Live from New York City, it's theCUBE, covering IBM Data Science For All. Brought to you by IBM. (light digital music) >> Welcome back to Data Science For All. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well, we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well, let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me, they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, the director of data science at Comcast, and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick, an analyst and contributor to Forbes and ZDNet. Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst at Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here: Bob Hayes, with a killer Batman shirt on by the way, which we'll get him to explain in just a little bit. He runs Business Over Broadway. And Joe Caserta, who's the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously, as a practitioner you're very involved in the industry, and you're on the academic side as well. We mentioned Berkeley, NYU, steep experience. So I want you to kind of put a foot in both worlds and tell me about data science. I mean, where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess, the state of data science? >> Yeah, so I think that's a really interesting question. There's a lot of change happening, in part because data science has now become much more established, both on the academic side and in industry. So now you see some of the bigger problems coming out. People have managed to get data pipelines set up, but now there are these questions about models and accuracy and data integration. So, the really cool stuff from the data science standpoint: we get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, being involved. UC Berkeley just did a big initiative: they're going to offer data science to undergrads. That's huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry, one of the difficulties you're now having is that businesses are asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done, and just a lot of opportunity. It's great because people now understand data science better, but I think data scientists have to take that seriously and really think about, how am I actually getting a return, or adding value to the business? >> And there's a lot to be said, is there not, just in terms of increasing the workforce, the acumen, the training that's required now. It's still a relatively new discipline. So is there a shortage issue? Or is there just a great need? Is the opportunity there? I mean, how would you look at that?
Well, I always think there's opportunity to be smart. If you can be smarter, you know, it's always better. It gives you advantages in the workplace, it gives you an advantage in academia. The question is, can you actually do the work? The work's really hard, right? You have to learn all these different disciplines, you have to be able to technically understand data. Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There are a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Are people going to feel motivated enough, and dedicate enough time to it, to get very good at that skill set? And also, of course, in terms of industry, will there be enough interest in the long term that there will be a financial motivation for people to keep staying in the field? So I think there's definitely a lot of opportunity. But that's always been there. Like I tell people, I think of myself as a scientist, and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data, you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion; you just kind of can't help yourself but dig a little bit deeper, ask the questions. You can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, five years from now or 10 years from now, I think that's something that's up for debate. >> So, all of you have watched the evolution of data and how it affects organizations for a number of years now. If you go back to the days when the data warehouse was king, we had a lot of promises about 360 degree views of the customer and how we were going to be more anticipatory and more responsive. In many ways, the decision support systems and the data warehousing world didn't live up to those promises. They solved other problems, for sure. And so everybody was looking for big data to solve those problems. And they've begun to attack many of them. We talked earlier on theCUBE today about fraud detection; it's gotten much, much better. Certainly retargeting of advertising has gotten better. But I wonder if you could comment, maybe starting with Joe, on the effect that data and data science have had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed, but I think it was unfinished in order to achieve what we need done today. At the time, I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems and have it in a single place for analytics. The big difference, I think, between data warehousing and data science is that data warehouses were primarily made for a human consumer: to have people look through some tool and be able to analyze data manually. That really doesn't work anymore; there's just too much data to do that. So that's why we need to build a science around it, so that we can actually have machines doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right?
It used to be, you know, you go back to when data warehouses started, you had to be a deep technologist in order to collect the data and write the programs to clean the data. But now your average casual IT person can do that. Right now, I think we're at the same point in data science, where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer in order to do what we need to do, in order to make machines actually understand the data. But I think, as part of the evolution, we're just at the forefront. We're going to see over the next, not even years, within the next year, I think, a lot of new innovation, where the average person within business, and definitely the average person within IT, will be able to ask, "What are my sales going to be next year?" as easily as it is to ask, "What were my sales last year?" Where now it's a big deal. Right now, in order to do that, you have to build some algorithms, you have to be a specialist in predictive analytics. And I think, as the tools mature, as people's use of data matures, and as the technology ecosystem for data matures, it's going to be easier and more accessible.
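Joe's contrast is worth making concrete: "what were my sales last year?" is just a lookup, while "what will my sales be next year?" needs a model, even a crude one. A minimal sketch, with made-up revenue figures and a naive linear trend standing in for a real forecasting method:

```python
# "Last year's sales" is a lookup; "next year's sales" needs a model.
# Yearly revenue figures below are illustrative only.
import numpy as np

years = np.array([2013, 2014, 2015, 2016, 2017])
sales = np.array([1.2, 1.5, 1.7, 2.1, 2.4])  # in $M, made up

# "What were my sales last year?" -- just read it off.
print("2017 sales:", sales[-1])

# "What are my sales going to be next year?" -- fit a trend and extrapolate.
slope, intercept = np.polyfit(years, sales, deg=1)
print("2018 forecast:", round(slope * 2018 + intercept, 2))
```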
>> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is, yes. >> You've written about it and talked about it. >> Yeah, no question about it. We see this citizen data scientist. You know, we talked about the democratization of data science, but the analytics and warehousing and all the tools we had before generated a lot of insights and views on the information; they didn't really give us the science part. And that, I think, is what's missing: the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing, are we thinking about it strategically? Are we learning from it and then feeding that back into the process? I think that's the big difference between data science and the analytics side. But, you know, just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers, now we can have tools that make these capabilities available to anyone. Going back to what Joe said, I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data, which is the lifeblood of our organizations, but we're not using it; there's this whole concept of dark data. And so I think the concept, or the promise, of opening these tools up for everyone to be able to access those insights and activate them, I think that's where it's headed. >> This is kind of where the T-shirt comes in, right? So, Bob, if you would, you've got this Batman shirt on. We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a superhero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is, he has the utility belt, and he has these tools he can use to help him solve problems. For example, he has the bat ring when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So, as data professionals, we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience, IBM Watson, that these data pros can now use as part of their utility belt to solve problems that they're confronted with. So if you're ever confronted with, like, a churn problem, and you have somebody who has access to that data, they can put that into IBM Watson, ask a question, and it'll tell you what's the key driver of churn. So it's not that you have to be superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward.
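Bob's "key driver of churn" question doesn't require Watson specifically; the same kind of analysis can be sketched with open source tools. The following is a generic illustration, not Watson's interface: synthetic customer data, a logistic regression, and the standardized coefficients read as a rough measure of driver strength.

```python
# Generic sketch of "what's the key driver of churn?" -- synthetic data,
# not any vendor's API. Biggest |coefficient| ~ strongest driver.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.integers(1, 60, n),   # months_as_customer
    rng.integers(0, 10, n),   # support_tickets
    rng.uniform(20, 120, n),  # monthly_bill
])
# Synthetic ground truth: churn here is driven mostly by support tickets.
logits = 0.8 * X[:, 1] - 0.05 * X[:, 0] - 3
y = rng.random(n) < 1 / (1 + np.exp(-logits))

model = LogisticRegression().fit(StandardScaler().fit_transform(X), y)
for name, coef in zip(["months_as_customer", "support_tickets", "monthly_bill"],
                      model.coef_[0]):
    print(f"{name:>20}: {coef:+.2f}")
```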
>> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know, with all of the billionaires in the world today, none of them has decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead, Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful, but for larger companies. Data warehousing is a very expensive proposition, and it remains an expensive proposition, something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500s are out there, of course, ever shifting. But you have a lot of smaller companies, a lot of people with start-ups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to innovate and experiment without having to go through the finance department. So there are all these open source tools available, and cloud resources as well as open source tools, Hadoop of course being a prime example, where you can work with the data, experiment with the data, and practice data science at a very low cost. >> Dion mentioned the C word, citizen data scientist, last year at the panel. We had a conversation about that. And the data scientists on the panel generally were like, "Stop." Okay, we're not all of a sudden going to turn everybody into data scientists. However, what we want to do is get people thinking about data, more focused on data, becoming a data-driven organization. I mean, as a data scientist, I wonder if you could comment on that. >> Well, I think the other side of that is, you know, there are also many people who maybe didn't follow through with science, 'cause it's also expensive. A PhD takes a lot of time. And, you know, if you don't get funding, it's a lot of money. And for very little security, if you think about how hard it is to get a teaching job that's going to give you enough of a payoff to pay that back. Right, the time that you took off, the investment that you made. So I think the other side of that is, by making data more accessible, you allow people who could have been great in science an opportunity to be great data scientists. And so I think, for me, the idea of the citizen data scientist, that's where the opportunity is. In terms of democratizing data and making it available for everyone, I feel as though it's something similar to the way we didn't really know what KPIs were maybe 20 years ago. People didn't use them as readily, didn't teach them in schools. I think maybe 10, 20 years from now, with some of the things that we're building today in data science, hopefully more people will understand how to use these tools. They'll have a better understanding of working with data and what that means, and just data literacy, right? Just being able to use these tools and understand what data's saying, and also what it's not saying, which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So, yeah, I guess for me, in terms of the citizen data scientist, I think it's a great idea to have that, right? But at the same time, of course, everyone kind of emphasized, you don't want everyone out there going, "I can be a data scientist without education, without statistics, without math," without an understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago, just on Hadoop instead of SQL. Right, and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) Which is, I guess-- >> For less. >> It's an accomplishment, and for less, it's cheaper, yeah. >> It is cheaper. >> Otherwise we're like, I'm not a data scientist, but I did stay at a Holiday Inn Express last night, right? >> Yeah. (panelists laugh) And there's like a little bit of pride that they used 2,000, you know, they used 2,000 computers to do it. A little bit of pride about that, but, you know, of course maybe not a great way to go. I think 20 years ago we couldn't do that, right? One computer was already an accomplishment, to have that resource. So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which is also the other side of working with data. >> Sure, Bob. >> Yeah, I have a comment about that. I've never liked the term citizen data scientist, or citizen scientist. I get the point of it, and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. I mean, I would never have just somebody become a scientist based on a few classes he or she takes. It's like saying, "Oh, I'm going to be a citizen lawyer," and so you come to me with your legal problems. Or a citizen surgeon. Like, you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe, you wanted to say something too on that. >> Since we're in New York City, I'd like to use the analogy of a real scientist versus a data scientist. A real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York, we could walk within a 10 block radius and buy any of those tools. It doesn't make us scientists because we use those tools. I think with data, you know, making the tools evolve and become easier to use, like Bob was saying, doesn't make you a better data scientist, it just makes the data more accessible. You know, we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, and these kids are brilliant. These kids are not someone who is, you know, just trying to run a day to day job in corporate America.
I think the people who are running the day to day job in corporate America are going to be the recipients of data science. Just like people who take drugs, right? They're the recipients of a smart scientist coming up with a formula that can help people. I think we're going to make it easier to distribute the data that can help people with all the new tools. But the access to the data and the tools available doesn't really make you a better data scientist without, like Bob was saying, better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me then? So that's where machine learning comes in, or artificial intelligence and all this stuff. So how, at the end of the day, Dion? How do you make it relevant and usable, actionable, to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer scientist at one point to develop programs for a computer, and now we can just get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than could ever be produced by, you know, having an ivory tower filled with data scientists. Which we need those guys, too, don't get me wrong. But we need to productize it and make it available in packages such that it can be consumed. The outputs, and even some of the inputs, can be provided by mere mortals, whether that's machine learning or artificial intelligence, or bots that go off and run the hypotheses and select the algorithms, maybe with some human help. We have to productize it. This is the concept of data science as a service, which is becoming a thing now. It's, "I need this capability at scale. I need it fast and I need it cheap." The commoditization of data science is going to happen. >> That goes back to what I was saying about the recipient of data science also being machines, right? Because I think the other thing that's happening now in the evolution of data is that the data is so tightly coupled. Back when you were talking about data warehousing, you had all the business transactions, then you took the data out of those systems and put it in a warehouse for analysis, right? Maybe they'd make a decision to change that system at some point. Now the analytics platform and the business application are very tightly coupled. They become dependent upon one another. So, you know, people who are using the applications are now able to take advantage of the insights of data analytics and data science just through the app, which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved. Well, like we said earlier, data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course: understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French; you're not going to get it. You need to understand the language of data to get value from the technology we have available to us.
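The basics Bob lists (a mean, a measure of variability, a simple regression) fit in a few lines. A minimal sketch with made-up numbers, just to show what each one looks like in practice:

```python
# The basics: mean, variability (sample standard deviation),
# and a simple linear regression. The figures are invented.
import numpy as np

revenue = np.array([10.0, 12.0, 9.0, 14.0, 15.0])
ad_spend = np.array([1.0, 1.4, 0.8, 1.9, 2.1])

print("mean revenue:", revenue.mean())
print("std dev (variability):", revenue.std(ddof=1))

# Simple linear regression: revenue as a function of ad spend.
slope, intercept = np.polyfit(ad_spend, revenue, deg=1)
print(f"revenue ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```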
>> Incidentally, French is one of the languages you have the option of learning if you're a mathematician. Math PhDs are required to learn a second language, and France being the country of algebra, that's one of the languages you could actually learn. Anyway, tangent. But going back to the point: statistics courses, definitely encourage them. I teach statistics, and one of the things I'm finding as I go through the process of teaching it is that I'm actually bringing in my experience. And by bringing in my experience, I'm kind of making the students think about the data differently. So the other thing people don't think about is the fact that statisticians typically were expected to do, you know, just basic sorts of tasks, in the sense that their knowledge is specialized, right? But the day to day operation was that they ran a test on some data, looked at the results, and interpreted the results based on what they were taught in school. A lot of times they didn't develop the model; they just understood what the tests were saying, especially in the medical field. So think about things like the words we have: population, census, which is when you have every single data point, versus a sample, which is a subset. It's a very different story now that we're collecting data faster than we used to. It used to be the idea that you could collect information from everyone. Like, it happens once every 10 years; we built that in. But nowadays, you know, you hear about Facebook, for instance. I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is that statisticians are now expected to know data in a different way than they were before. So it's not just data science changing as a field; I think the sciences that are using data are changing their fields as well. >> Dave: So is sampling dead? >> Well, no, because-- >> Should it be? (laughs) >> Well, if you're sampling wrong, yes. That's really the question. >> Okay. You know, it's been said that the data doesn't lie, people do. Organizations are very political. Oftentimes, you know: lies, damned lies and statistics, Benjamin Disraeli. Are you seeing a change in the way in which organizations are using data in the context of the politics? So, some strong P&L manager, say, gets data and crafts it in a way that he or she can advance their agenda. Or they'll maybe attack a data set that probably should drive them in a different direction, but might be antithetical to their agenda. Are you seeing data, you know, we talked about democratizing data, are you seeing that reduce the politics inside of organizations? >> So, you know, we've always used data to tell stories at the top level of an organization; that's what it's all about. And I still see very much that, no matter how much data science, or how much access to the truth through looking at the numbers, storytelling is still the political filter through which all that data passes, right? But with the advent of things like blockchain, more and more corporate records and corporate information are going to end up in these open and shared repositories, where there is no alternate truth. It'll come back to whoever tells the best stories at the end of the day.
So I still see that organizations are very political. We are seeing more open data, though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Um, um, go ahead. >> I was just going to say as well, ultimately I think data driven decision making is a great thing. And it's especially useful at the lower tiers of the organization, where you have the routine day to day decisions that could be automated through machine learning and deep learning. The algorithms can be improved on a constant basis. At the upper levels, you know, that's why you pay executives the big bucks, to make the strategic decisions. And data can help them, but ultimately data, IT, technology alone will not create new markets, it will not drive new businesses; it's up to human beings to do that. The technology is the tool to help them make those decisions. But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie, people lie. You know, I'm already seeing trends in all different industries where conventional wisdom is starting to get trumped by analytics. I think it's still up to the human being today to ignore the facts and go with what they think in their gut, and sometimes they win, sometimes they lose. But generally, if they lose, the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety before replacement of humans, as we start using data and analytics and data science really as the bumpers instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There are a couple of elephant-in-the-room topics that I want to explore a little bit. >> Here it comes. >> There was an article today in Wired called "Why AI Is Still Waiting for Its Ethics Transplant." I will just read a little segment from it. It says: new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI. When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker, who founded AI Now. And they're calling for, sort of, almost clinical trials on AI, if I could use that analogy. Before you go to market, you've got to test the human impact, the social impact. Thoughts? >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov?
He's the Soviet officer who, back in 1983, was in the control room, I guess somewhere outside of Moscow, when the system detected a nuclear missile attack against the Soviet Union coming out of the United States. Ordinarily, I think, if this had been an entirely AI driven process, we wouldn't be sitting here right now talking about it. But this gentleman looked at what was going on on the screen and, I'm sure he was accountable to his authorities in the Soviet Union, he probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming from the Soviet satellites. And as it turned out, of course, he was right. The Soviet satellites were seeing glints of the sun and interpreting those glints as missile launches. And I think that's a great example of why, you know, every situation of course doesn't mean the end of the world, (laughs) though it did in this case, but it's a great example of why there needs to be a human component, a human ability for intervention, at some point in the process. >> So, other thoughts. I mean, organizations are driving AI hard for profit. The best minds of our generation are trying to figure out how to get people to click on ads; Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things. With data analytics you can cure cancer, or you can make customers click on more ads. It depends on what your goal is. But there are ethical considerations we need to think about. When we have data that has a racial bias against blacks, and gives them higher prison sentences or worse credit scores and so forth, that has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider, how are we going to fix that problem? Cathy O'Neil, in her book Weapons of Math Destruction, excellent book, I highly recommend that your listeners read it, talks about these issues: what happens if algorithms have a widespread impact, if they adversely impact a protected group. And I forget the last criterion, but we need to really think about these things as a people, as a country. >> So I always think the idea of ethics is interesting. This conversation comes up a lot of times when I talk to data scientists. I think as a concept, right, as an idea, yes, you want things to be ethical. The question I always pose to them is, "Well, in the business setting, how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist is to be able to make the day to day decision of, when someone says, "I don't like that number," how do you actually get around that? Whether that's the right data to be showing someone, or whether it's accurate. And say the business decides, "Well, we don't like that number." Many people feel pressured to then change the data, or change what the data shows. So I think it's about educating people to find ways to say what the data is saying without going past some line where it's a lie, where it's unethical. 'Cause you can also say what data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, but here's what we don't know." There's a "don't know" part that many people will omit when they talk about data. So I think, you know, especially when it comes to things like AI, it's tricky, right? Because I always tell people, I don't know why everyone thinks AI's going to be so amazing.
I started in this industry by fixing problems with computers that people didn't realize computers had. For instance, when you have a system, there are a lot of bugs; we all have bug reports that we've probably submitted. I mean, really, it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs, because, frankly, it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, you know, a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses want to make money. If there are no rules that say they can't do certain things to make money, then there's no restriction. I think the other thing to think about is that we as consumers, in our everyday lives, shouldn't separate the idea of data as a business from our day to day consumer lives. Meaning, yes, I work with data. Incidentally, I also always opt out of my credit card's data sharing. You know, when they send you that information, they make you actually mail them, like old school mail, snail mail, a document that says, okay, I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step. Now, if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least, you know, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance, if a start-up gets acquired, what happens with the data they have on you? You agreed to give it to the start-up, but, I mean, what are the rules on that? So I think we have to really think about the ethics, not just from the standpoint of someone who's going to implement something, but as consumers: what control do we have over our own data? 'Cause that's going to directly impact what businesses can do with our data. >> You know, you mentioned data collection, so slightly on that subject: all these great new capabilities we have coming. We talked about what's going to happen with media in the future, what 5G technology's going to do to mobile, and these great bandwidth opportunities. The internet of things and the internet of everywhere. And all these great inputs, right? Do we have an arms race? Like, are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration are probably among the biggest challenges, especially as the world is starting to become more dependent on data. You know, just because we're dependent on numbers, we've come up with GAAP, generally accepted accounting principles, which can be audited and proven true or false. I think in our lifetime we will see something similar for data: we will have formal checks and balances on the data that we use, which can be audited. Getting back to what Dave was saying earlier, I personally would trust a machine that was programmed to do the right thing over a politician or some leader that may have their own agenda. And I think the other thing about machines is that they are auditable.
You know, you can look at the code and see exactly what it's doing and how it's doing it. Human beings, not so much. So I think getting to the truth, even if the truth isn't the answer that we want, is a positive thing. It's something that we can't do today that, once we start relying on machines, we'll be able to do. >> Yeah, I was just going to add that we live in exponential times. And the challenge is that the way we're structured traditionally as organizations does not allow us to absorb advances exponentially; it's linear at best. Everyone talks about change management and how we're going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There are a few leading organizations that are eating the world, and they seem to be somehow rolling out new things constantly. I don't know how Amazon rolls out all this stuff: all this artificial intelligence, the IoT devices, Alexa, natural language processing, and that's just a fraction, just the tip of what they're releasing. So it shows that there are some organizations that have found the path. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we have to find some way to adopt these new capabilities and deploy them effectively, or the writing is on the wall. I've spent a lot of time exploring this topic, how we're going to get there, and "all of us have a lot of hard work" is the short answer. >> I read that it was predicted that more data will be created this year than in the past, I think it was, 5,000 years. >> Forever. (laughs) >> And add to that the statistic that we're currently analyzing less than 1% of the data. Taking those numbers and hearing what you're all saying, it's like, we're not keeping up. It's not even linear; that gap is just going to grow and grow and grow. How do we close it? >> There's a guy out there named Chris Dancy; he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new; having access to the data is new. You know, we've always had a blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, now we can be smarter about it. So I think that being able to use this information is really incredible. This is something that over our lifetime we've never had, and now we can do it. Hence the big explosion in data. But I think how we use it and how it's governed is the challenge right now. It's kind of cowboys and indians out there right now. And without proper governance and without rigorous regulation, I think we are going to have some bumps in the road along the way. >> The data's the oil; the question is how we're actually going to operationalize around it. >> Or find it. Go ahead. >> I will say the other side of it is, if you think about information, we always have the same amount of information, right? What we choose to record, however, is a different story. Now, if you wanted to know things about the Olympics, but you decided to collect information every day for years instead of just the Olympic year, yes, you have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years there are no Olympics, right? Unless, of course, you're comparing relative to those years.
But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results; it does not mean it'll improve your model. You can be collecting data about your shoe size trying to get information about your hair. It really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real world context into it, then yeah, you can collect data, an infinite amount of data, but you'll never process it. Because you have no question to ask; you're not looking to model anything. There is no universal truth about everything, that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions you're trying to ask of your data. You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions, and that data may have the variables that'll tell you what's the best predictor of churn, or what's the best predictor of cancer treatment outcome. If you can ask the right question of the data you have, then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree, but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years: weather forecasts, weather data, climate change data for example, going back to data charts and so forth from centuries ago if that data is available. And they reformat it, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is that it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it. You know, 'we missed it.' Why? Could they have, is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said, which is: are you asking the right questions? And if you're not looking for that problem in exactly the way that it occurred, you might not be able to find it. >> I thought the ads were paid for in rubles. Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know, red flag, so to speak. >> Yes. >> I mean with Bitcoin maybe they could have hidden it. >> Bob: Right, exactly. >> I would think too that what happened last year was actually the end of an age of optimism. I'll bring up the Soviet Union again, (chuckles). It collapsed back in 1990, 1991, and Russia was reborn. And I think there was a general feeling of optimism in the '90s through the 2000s that Russia was being well integrated into the world economy, as other nations all over the globe, all continents, were being integrated into the global economy thanks to technology. And technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing economies that are very different from 20 years ago, and that extended into Russia as well. Russia is part of the global economy.
We're able to communicate as a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think that it shouldn't be the question of how did we miss it. It's: do we have the ability now to catch it? And I think without data science, without machine learning, without being able to train machines to look for patterns that involve corruption or result in corruption, we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question because my premise was Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is how do I get people to click on ads? >> Right, they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock. >> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley poo-poo Watson. Ah, Google, Facebook, Amazon, they've got the best AI. Watson, and some of that's fair criticism. Watson's a heavy lift, very services oriented; you've got to apply it in a very focused way. At the same time Google's trying to get you to click on ads, as is Facebook, and Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that sort of juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant in the room questions. >> Well you're looking at two very different types of organizations. One which has really spent decades applying technology to business, and these other companies are ones that are primarily into the consumer, right? When we talk about things like IBM Watson you're looking at a very different type of solution. You used to be able to buy IT and once you installed it you pretty much could get it to work and store your records or do whatever it is you needed it to do. But these types of tools, like Watson, actually try to learn your business. And they need to spend time doing that, watching the data and having their models tuned. And so you don't get the results right away. And I think that's been kind of the challenge that organizations like IBM have had. This is a different type of technology solution, one that has to actually learn first before it can provide value. And so you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to apply these tools that work in very different ways. There's education needed too on the side of the buyer. >> I'd have to say that I think there are plenty of businesses out there also trying to solve very significant, meaningful problems. You know, with Microsoft AI and Google AI and IBM Watson, I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective.
>> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in R&D and AI going to pay off for these guys? >> Yeah, so I think there are different levels of AI, right? So there's AI where you can actually improve the model. I remember when Watson was kind of first out, being invited by IBM to a private, sort of preview presentation. And my question was, "Okay, so when do I get to access the corpus?" The corpus being sort of the foundation of NLP, which is natural language processing. It's what you use almost like a dictionary, how you're actually going to measure things or look things up. And they said, "Oh you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me as a data scientist you're expecting me to rely on the fact that you did it better than me, and I should just rely on that?" I think over the years after that IBM started opening it up and offering different ways of being able to access the corpus and work with that data. But I remember at the first Watson hackathon there were only two corpora available. It was either travel or medicine. There was no other foundational data available. So I think one of the difficulties was, with IBM being a little bit more on the forefront of it, they kind of had that burden of having to develop these systems and learn the hard way that if you don't have the right models and you don't have the right data and you don't have the right access, that's going to be a huge limiter. And medical information is an extremely difficult data set to start with. Partly because anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong is minimal. You might lose some money. If you do that with healthcare data, if you do that with medical data, people may die. This is a much more difficult data set to start with. So from a scientific standpoint it's great to have any information about a new technology, a new process. That's the nice thing: IBM has obviously invested in it and collected information. I think the difficulty there, though, is that just 'cause you have it, you can't solve everything. And I feel like, as someone who works in technology, in general when you appeal to developers you try not to market. And Watson is very heavily marketed, which tends to turn off people who are more from the technical side. Because I think they don't like it when it's gimmicky, in part because they do the opposite of that. They're always trying to build up the technical components of it. They don't like it when you're trying to convince them that you're selling them something when you could just give them the specs and let them look at it. So it could be something as simple as communication. But I do think it is valuable to have had a company lead on the forefront of that and try to do it, so we can actually learn from what IBM has learned from this process. >> But you're an optimist. (John laughs) All right, good. >> Just one more thought. >> Joe go ahead first. >> Joe: I want to see how Alexa or Siri would do on Jeopardy. (panelists laugh) >> All right. Going to go around for a final thought, give you a second. Let's just think about your 12 month crystal ball.
In terms of either challenges that need to be met in the near term or opportunities you think will be realized. A 12 to 18 month horizon. Bob, you've got the microphone, so I'll let you lead off and we'll just go around. >> I think a big challenge for business, for society, is getting people educated on data and analytics. There's a study that was just released, I think last month, by ServiceNow, I think, or some vendor, or Qlik. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe. It's probably a lot worse in the United States. So that's a big challenge I think. To educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology over training people. I think using data needs to be iPhone easy. Which means that a lot of innovation is in the years to come. I do think that the keyboard is going to be a thing of the past for the average user. We are going to start using voice a lot more. I think augmented reality is going to become a real reality. Where we can hold our phone in front of an object and it will have an overlay of prices and where it's available. If it's a person, I think we will see, within an organization, holding a camera up to someone and being able to see what their salary is, what sales they did last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this, and we start actually becoming more social as human beings through augmented reality. I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And the greater good is using technology in a very, very smart way. Which means that, sorry to contradict, but maybe it's good to counterpoint, I don't think you need to have a PhD in SQL to use data. I think that's 1990. As we evolve it's going to become easier for the average person. Which means people like the brain trust here need to get smarter and start innovating. I think the innovation around data is really at the tip of the iceberg; we're going to see a lot more of it in the years to come. >> Dion, why don't you go ahead, then we'll come down the line here. >> Yeah, so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could before. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will. And that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in, and it's going to re-silo a lot of our data.
It's going to make it recede and become less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data for example is already all locked down. We could do so much more with it, but health start ups are really constrained about what they can do. 'Cause they can't access the data. We can't even access our own health care records, right? So I think that's the challenge: we have to have that battle next to be able to go and take the next step. >> Well I see, with the growth of data, a lot of it's coming through IoT, the internet of things. I think that's a big source. And we're going to see a lot of innovation. New types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations, and they're going to be looking at the ways this data can be leveraged, all this big data, or data coming in from the IoT. You know, there are some examples out there. There's a company for example that is outfitting tools, putting sensors in the tools. Industrial sites can therefore track where the tools are at any given time. This is an expensive, time consuming process: constantly losing tools, trying to locate tools, assessing whether the tool's being applied to the production line, or whether the right tool is at the right torque and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there are going to be innovations like that. Maybe small start up type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There are going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IoT type data, in some new, innovative way.
So you know it's going to really come down to people. I think if people can figure out, for instance, what people want to buy, what people think, in general that's where the money comes from. You make money 'cause someone gave you money. So if you can find a way to look at data, or even look at technology, and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage it. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time, I think, with the corporate resources available, if you can do that. You know, who knows what we'll have in a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies, don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data, that means you don't understand how to refine it, if data is the new oil, and how to protect it and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So I devolve to what I know best. >> Great discussion. Really, really enjoyed it; the time has flown by. Joe, if you get that augmented reality thing to work on the salary, point it toward that guy, not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But thank you, Dion, Joe, Joe and Jennifer and Batman. We appreciate it, and Bob Hayes, thanks for being with us. >> Thank you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder: coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote, which is being set up just about 50 feet from us right now. Nate Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on The Cube. But there's a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from The Cube. We certainly appreciate the time. Glad to have you along here in New York. And until the next time, take care. (bright digital music)
Wikibon Presents: Software is Eating the Edge | The Entangling of Big Data and IIoT
>> So as folks make their way over from Javits I'm going to give you the least interesting part of the evening, and that's my segment, in which I welcome you here, introduce myself and lay out what we're going to do for the next couple of hours. So first off, thank you very much for coming. As all of you know, Wikibon is a part of SiliconANGLE, which also includes theCUBE, so if you look around, this is what we have been doing for the past couple of days here on theCUBE. We've been inviting some significant thought leaders over from the show and, in incredibly expensive limousines, driving them up the street to come on to theCUBE and spend time with us and talk about some of the things that are happening in the industry today that are especially important. We tore it down, and we're having this party tonight. So we want to thank you very much for coming and look forward to having more conversations with all of you. Now what are we going to talk about? Well, Wikibon is the research arm of SiliconANGLE. So we take data that comes out of theCUBE and other places and we incorporate it into our research. And we work very closely with large end users and large technology companies regarding how to make better decisions in this incredibly complex, incredibly important transformative world of digital business. What we're going to talk about tonight, and I've got a couple of my analysts assembled, and we're also going to have a panel, is this notion of software is eating the Edge. Now most of you have probably heard Marc Andreessen, the venture capitalist and developer, original developer of Netscape many years ago, talk about how software's eating the world. Well, if software is truly going to eat the world, it's going to take the big chunks, the big bites, at the Edge. That's where the actual action's going to be. And what we want to talk about specifically is the entangling of the internet, or the industrial internet of things and IoT, with analytics. So that's what we're going to talk about over the course of the next couple of hours. To do that, and I've already blown the schedule, that's on me, I'm going to spend a couple minutes talking about what we regard as the essential digital business capabilities, which include analytics and Big Data, and include IIoT, and we'll explain, at least from our position, why those two things come together the way that they do. But then I'm going to ask the august and revered Neil Raden, Wikibon analyst, to come on up and talk about harvesting value at the Edge. 'Cause there are some, not now Neil, when we're done, when I'm done. So I'm going to ask Neil to come on up and he's going to talk about harvesting value at the Edge. And then Jim Kobielus will follow up with him, another Wikibon analyst; he'll talk specifically about how we're going to take that combination of analytics and Edge and turn it into the new types of systems and software that are going to sustain this significant transformation that's going on. And then after that, I'm going to invite Neil and Jim back up along with some other folks and we're going to run a panel to talk about some of these issues and do a real question and answer. So the goal here, before we break for drinks, is to create a community feeling within the room. That includes smart people here, smart people in the audience, having a conversation ultimately about some of these significant changes, so please participate, and we look forward to talking about the rest of it.
All right, let's get going! What is digital business? One of the nice things about being an analyst is that you can reach back to people who were significantly smarter than you and build your points of view on the shoulders of those giants, including Peter Drucker. Many years ago Peter Drucker made the observation that the purpose of business is to create and keep a customer. Not better shareholder value, not anything else. It is about creating and keeping your customer. Now you can argue with that, but at the end of the day, if you don't have customers, you don't have a business. What we've added to that is the observation that the difference between business and digital business is essentially one thing. That's data. A digital business uses data to differentially create and keep customers. That's the only difference. If you think about the difference between taxi cab companies here in New York City, every cab that I've been in in the last three days has bothered me about Uber. The difference between Uber and a taxi cab company is data. That's the primary difference. Uber uses data as an asset. And we think this is the fundamental feature of digital business that everybody has to pay attention to. How is a business going to use data as an asset? Is the business using data as an asset? Is a business driving its engagement with customers, the role of its product et cetera, using data? And if they are, they are becoming a more digital business. Now when you think about that, what we're really talking about is how are they going to put data to work? How are they going to take their customer data and their operational data and their financial data and any other kind of data and ultimately turn that into superior engagement, or improved customer experience, or more agile operations, or increased automation? Those are the kinds of outcomes that we're talking about. But it is about putting data to work. That's fundamentally what we're trying to do within a digital business. Now that leads to an observation about the crucial strategic business capabilities that every business that aspires to be more digital, or to be digital, has to put in place. And I want to be clear. When I say strategic capabilities I mean something specific. When you talk about, for example, technology architecture or information architecture, there is this notion of: what capabilities does your business need? Your business needs capabilities to pursue and achieve its mission. And in a digital business these are the capabilities that are additive to this core question, ultimately, of whether or not the company is a digital business. What are the three capabilities? One, you have to capture data. Not just do a good job of it, but better than your competition. You have to capture data better than your competition, in a way that is ultimately less intrusive on your markets and on your customers. That's, in many respects, one of the first priorities of the internet of things and people. The idea of using sensors and related technologies to capture more data. Once you capture that data you have to turn it into value. You have to do something with it that creates business value, so you can do a better job of engaging your markets and serving your customers. And that essentially is what we regard as the basis of Big Data.
Including operations, including financial performance and everything else, but ultimately it's taking the data that's being captured and turning it into value within the business. The last point here is that once you have generated a model, or an insight, or some other resource that you can act upon, you then have to act upon it in the real world. We call that systems of agency, the ability to enact based on data. Now I want to spend just a second talking about systems of agency 'cause we think it's an interesting concept, and it's something Jim Kobielus is going to talk about a little bit later. When we say systems of agency, what we're saying is increasingly machines are acting on behalf of a brand. Or systems, combinations of machines and people, are acting on behalf of the brand. And this whole notion of agency is the idea that ultimately these systems are now acting as the business's agent. They are at the front line of engaging customers. It's an extremely rich proposition that has subtle but crucial implications. For example, I was talking to a senior decision maker at a business today and they told a quick story: on their way here to New York City they had followed a woman who was going through security, opened up her suitcase and took out a bird. And then went through security with the bird. And the reason why I bring this up now is, as TSA was trying to figure out how exactly to deal with this, the bird started talking and repeating things that the woman had said, and many of those things, in fact, might have put her in jail. Now in this case the bird is not an agent of that woman. You can't put the woman in jail because of what the bird said. But increasingly we have to ask ourselves, as we ask machines to do more on our behalf, digital instrumentation and elements to do more on our behalf, it's going to have blowback and an impact on our brand if we don't do it well. I want to draw that forward a little bit because I suggest there's going to be a new lifecycle for data. And the way that we think about it is we have the internet or the Edge, which is comprised of things and, crucially, people, using sensors, whether they be small processors in control towers or whether they be phones that are tracking where we go. And the crucial element here is something that we call information transducers. Now a transducer in a traditional sense is something that takes energy from one form to another so that it can perform new types of work. By information transducer I essentially mean it takes information from one form to another so it can perform another type of work. This is a crucial feature of data. One of the beauties of data is that it can be used in multiple places at multiple times and not engender significant net new costs. It's one of the few assets you can say that about. So the concept of an information transducer is really important because it's the basis for a lot of transformations of data as data flies through organizations. So we end up with the transducers storing data in the form of analytics, machine learning, business operations, other types of things, and then it gets transduced back into the real world as we program the real world, turning into these systems of agency. So that's the new lifecycle. And increasingly, that's how we have to think about data flows. Capturing it, turning it into value and having it act on our behalf in front of markets.
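To make that lifecycle concrete, here is a minimal sketch in Python of one pass through capture, transduction, inference and action. Every name in it, the vibration sensor, the anomaly threshold, the pump actuator, is hypothetical, invented purely for illustration; a real system of agency would swap in actual device drivers and a trained model.

```python
# A minimal sketch of the capture -> transduce -> infer -> act lifecycle.
# All names here (read_vibration_sensor, throttle_pump, the 2x threshold)
# are hypothetical, purely for illustration.

from statistics import mean

def read_vibration_sensor() -> list[float]:
    """Capture: raw readings from the Edge device (stubbed here)."""
    return [0.41, 0.44, 0.39, 0.97, 0.42]

def transduce(readings: list[float]) -> dict:
    """Information transducer: reshape raw data into a form that
    can perform another type of work downstream."""
    return {"mean": mean(readings), "peak": max(readings)}

def infer(features: dict) -> bool:
    """Turn into value: a stand-in for a trained model's anomaly check."""
    return features["peak"] > 2 * features["mean"]

def throttle_pump(anomalous: bool) -> None:
    """System of agency: act back on the real world on the brand's behalf."""
    print("throttling pump" if anomalous else "normal operation")

# One pass through the lifecycle: capture, turn into value, act.
throttle_pump(infer(transduce(read_vibration_sensor())))
```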
That could have enormous implications for how money is spent over the next few years. So Wikibon does a significant amount of market research in addition to advising our large user customers. And that includes doing studies on cloud, public cloud, but also studies on what's happening within the analytics world. And if you take a look at it, what we basically see happening over the course of the next few years is significant investment in software and also services to get the word out. But we also expect there's going to be a lot of hardware, a significant amount of hardware, that's ultimately sold within this space. And that's because of something that we call true private cloud. This concept of a business increasingly being designed and architected around the idea of data assets means that the physical realities of how data operates, how much it costs to store it or move it, the issues of latency, the issues of intellectual property protection, as well as things like the regulatory regimes that are being put in place to govern how data gets used between locations, all of those factors are going to drive increased utilization of what we call true private cloud. On premise technologies that provide the cloud experience but act where the data naturally needs to be processed. I'll come back to that in a second. So we think that it's going to be a relatively balanced market; a lot of stuff is going to end up in the cloud, but as Neil and Jim will talk about, there's going to be an enormous amount of analytics that pulls an enormous amount of data out to the Edge, 'cause that's where the action's going to be. Now one of the things I want to also reveal to you is we've done a fair amount of research around this question of where or how will data guide decisions about infrastructure? And in particular the Edge is driving these conversations. So here is a piece of research that one of our cohorts at Wikibon did, David Floyer. Taking a look at IoT Edge cost comparisons over a three year period. It showed on the left hand side an example where the sensor towers and other types of devices were streaming data back into a central location, a stylized wind farm example. Very, very expensive. Significant amounts of money, significant resources, end up being consumed by the cost of moving the data from one place to another. Now this is even assuming that latency does not become a problem. The second example that we looked at is if we kept more of that data at the Edge and processed it at the Edge. And literally it is an 85 plus percent cost reduction to keep more of the data at the Edge. Now that has enormous implications for how we think about big data, how we think about next generation architectures, et cetera. But it's these costs that are going to be so crucial to shaping the decisions that we make over the next two years about where we put hardware, where we put resources, what type of automation is possible, and what types of technology management have to be put in place. Ultimately we think it's going to lead to a structure, an architecture in the infrastructure as well as applications, that is informed more by moving cloud to the data than moving the data to the cloud. Our fundamental proposition is that the norm in the industry has been to think about moving all data up to the cloud because who wants to do IT? It's so much cheaper, look what Amazon can do.
Or what AWS can do. All true statements. Very, very important in many respects. But most businesses today are starting to rethink that simple proposition and ask themselves: do we have to move our business to the cloud, or can we move the cloud to the business? And increasingly what we see happening, as we talk to our large customers about this, is that the cloud is being extended out to the Edge; we're moving the cloud and cloud services out to the business. Because of economic reasons, intellectual property control reasons, regulatory reasons, security reasons, any number of other reasons. It's just a more natural way to deal with it. And of course, the most important reason is latency. So with that as a quick backdrop, if I may quickly summarize: we believe fundamentally that the difference today is that businesses are trying to understand how to use data as an asset. And that requires an investment in new sets of technology capabilities that are not cheap, not simple, and require significant thought, a lot of planning, a lot of change within IT and business organizations. How we capture data, how we turn it into value, and how we translate that into real world action through software. That's going to lead to a rethinking, ultimately, based on cost and other factors, about how we deploy infrastructure. How we use the cloud so that the data guides the activity, and the choice of cloud supplier doesn't determine or limit what we can do with our data. And that's going to lead to this notion of true private cloud and elevate the role the Edge plays in analytics and all other architectures. So I hope that was perfectly clear. And now what I want to do is bring up Neil Raden. Yes, now's the time Neil! So let me invite Neil up to spend some time talking about harvesting value at the Edge. Can you see his, all right. Got it. >> Oh boy. Hi everybody. Yeah, this is a really, really big and complicated topic, so I decided to just concentrate on something fairly simple. But I know that Peter mentioned customers. And he also had a picture of Peter Drucker. I had the pleasure in 1998 of interviewing Peter and photographing him. Peter Drucker, not this Peter. Because I'd started a magazine called Hired Brains. It was for consultants. And Peter said a number of really interesting things to me, but one of them was his definition of a customer: someone who wrote you a check that didn't bounce. He was kind of a wag. He was! So anyway, he had to leave to do a video conference with Jack Welch, and so I said to him, how do you charge Jack Welch to spend an hour on a video conference? And he said, you know, I have this theory that you should always charge your client enough that it hurts a little bit, or they don't take you seriously. Well, I had the chance to talk to Jack's wife, Suzy Welch, recently, and I told her that story and she said, "Oh he's full of it, Jack never paid a dime for those conferences!" (laughs) So anyway, all right, let's talk about this. To me, the engineered things, the hardware and the network and all these other standards and so forth, we haven't fully developed those yet, but they're coming. As far as I'm concerned, they're not the most interesting thing. The most interesting thing to me in Edge Analytics is what you're going to get out of it, what the result is going to be. Making sense of this data that's coming.
And while we're on data, something I've been thinking about a lot lately, because everybody I've talked to for the last three days just keeps talking to me about data: I have this feeling that data isn't actually quite real. Any data that we deal with is the result of some process that's captured it from something else that's actually real. In other words, it's a proxy. So it's not exactly perfect. And that's why we've always had these problems about customer A, customer A, customer A: what's their definition? What's the definition of this, that and the other thing? And with sensor data, I really have the feeling that when organizations get instrumented and start dealing with this kind of data, what they're going to find is that this is the first time. And I've been involved in analytics, I don't want to date myself, 'cause I know I look young, but I've been dealing with analytics since 1975. And everything we've ever done in analytics has involved pulling data from some other system that was not designed for analytics. But if you think about sensor data, this is data that we're actually going to catch the first time. It's going to be ours! We're not going to get it from some other source. It's going to be the real deal, to the extent that it's the real deal. Now you may say, ya know Neil, a sensor that's sending us information about oil pressure or temperature or something like that, how can you quarrel with that? Well, I can quarrel with it because I don't know if the sensor's doing it right. So we still don't know, even with that data, if it's right, but that's what we have to work with. Now, what does that really mean? It means we have to be really careful with this data. It's ours, we have to take care of it. We don't get to reload it from source some other day. If we munge it up, it's gone forever. So that has very serious implications. But let me roll you back a little bit. The way I look at analytics is that it's come in three different eras. And we're entering into the third now. The first era was business intelligence. It was basically built and governed by IT; it was system-of-record kind of reporting. And as far as I can recall, it probably started around 1988, or at least that's the year that Howard Dresner claims to have invented the term. I'm not sure it's true. And things happened before 1988 that were sort of like BI, but '88 was when they really started coming out; that's when we saw BusinessObjects and Cognos and MicroStrategy and those kinds of things. The second generation just popped out on everybody. We were all looking around at BI and saying, why isn't this working? Why are only five people in the organization using this? Why are we not getting value out of this massive license we bought? And along come companies like Tableau doing data discovery, visualization, data prep, and Line of Business people are using this now. But it's still the same kind of data sources. It's moved out a little bit, but it still hasn't really hit the Big Data thing. Now we're in the third generation, so we not only have Big Data, which has come and hit us like a tsunami, but we're looking at smart discovery, we're looking at machine learning. We're looking at AI induced analytics workflows. And then all the natural language cousins. You know, natural language processing, natural language, what's... oh, Q, natural language query. Natural language generation. Anybody here know what natural language generation is?
Yeah, so what you see now is you do some sort of analysis and the tool comes up and says, this chart is about the following and it used the following data, and blah blah blah blah blah. I think it's kind of wordy and it's going to get refined some, but it's an interesting thing to do. Now, the problem I see with Edge Analytics and IoT in general is that most of the canonical examples we talk about are pretty thin. I know we talk about autonomous cars; I hope to God we never have them, 'cause I'm a car guy. Fleet management, I think Qualcomm started fleet management in 1988; that is not a new application. Industrial controls. I seem to remember Honeywell doing industrial controls at least in the '70s, and before that, I don't want to talk about what I was doing, but I definitely wasn't in this industry. So my feeling is we all need to sit down and think about this and get creative. Because the real value in Edge Analytics or IoT, whatever you want to call it, is going to be figuring out something that's new or different. Creating a brand new business. Changing the way an operation happens in a company, right? And I think there are a lot of smart people out there and there are a million apps that we haven't even talked about. So if you as a vendor come to me and tell me how great your product is, please don't talk to me about autonomous cars or fleet management, 'cause I've heard about that, okay? Now, hardware and architecture are really not the most interesting thing. We fell into that trap with data warehousing. We've fallen into that trap with Big Data. We talk about speeds and feeds. Somebody said to me the other day, what's the narrative of this company? This is a technology provider. And I said, as far as I can tell, they don't have a narrative; they have some products and they compete in a space. And when they go to clients and the clients ask, what's the value of your product? They don't have an answer for that. So we don't want to fall into this trap, okay? Because IoT is going to inform you in ways you've never even dreamed about. Unfortunately some of them are going to be really stinky, you know, really bad. You're going to lose more of your privacy, it's going to get harder to get, I dunno, a mortgage for example, I dunno, maybe it'll be easier, but in any case, it's not going to all be good. So let's really think about what you want to do with this technology to do something that's really valuable. Cost takeout is not the place to justify an IoT project. Because number one, it's very expensive, and number two, it's a waste of the technology. You know the old numerator denominator thing? You should be looking at the numerators and forget about the denominators, because that's not what you do with IoT. And the other thing is you don't want to get overconfident. Actually this is good advice about anything, right? But in this case, I love this quote by Derek Sivers. He's a pretty funny guy. He said, "If more information was the answer, then we'd all be billionaires with perfect abs." I'm not sure what's on his wishlist, but those aren't necessarily the two things I would think of, okay. Now, what I said about the data, I want to explain some more. Big Data Analytics, if you look at this graphic, it depicts it perfectly. It's a bunch of different stuff falling into the funnel. All right? It comes from other places, it's not original material.
And when it comes in, it's always used as second hand data. Now what does that mean? It means that you have to figure out the semantics of this information and you have to find a way to put it together in a way that's useful to you, okay. That's Big Data. That's where we are. How is that different from IoT data? Like I said, IoT data is original. You can put it together any way you want because no one else has ever done that before. It's yours to construct, okay. You don't even have to transform it into a schema because you're creating the new application. But the most important thing is you have to take care of it, 'cause if you lose it, it's gone. It's the original data. In the same way that in operational systems we've always been concerned about backup and security and everything else, you better believe this is a problem. I know a lot of people think about streaming data as something we're going to look at for a minute and then throw most of it away. Personally I don't think that's going to happen. I think it's all going to be saved, at least for a while. Now, governance and security. Oh, by the way, I don't know where you're going to find a presentation where somebody uses a newspaper clipping about Vladimir Lenin, but here it is, enjoy yourselves. I believe that when people think about governance and security today they're still thinking along the same grids that we thought about it all along. But this is very, very different, and again, I'm sorry I keep thrashing this around, but this is treasured data that has to be carefully taken care of. Now when I say governance, my experience has been over the years that governance is something that IT does to make everybody's lives miserable. But that's not what I mean by governance today. I mean a comprehensive program to really secure the value of the data as an asset. And you need to think about this differently. Now the other thing is you may not get to think about it differently, because some of this stuff may end up being subject to regulation. And if the regulators start regulating some of this, then that'll take some of the degrees of freedom away from you in how you put this together, but you know, that's the way it works. Now, machine learning. I think I told somebody the other day that claims about machine learning in software products are as common as twisters in trailer parks. And a lot of it is not really what I'd call machine learning. But there's a lot of it around. And I think all of the open source machine learning and artificial intelligence that's popped up is great, because all those math PhDs who work at Home Depot now have something to do when they go home at night, and they construct this stuff. But if you're going to have machine learning at the Edge, here's the question: what kind of machine learning would you have at the Edge? As opposed to developing your models back in, say, the cloud, when you transmit the data there. The devices at the Edge are not very powerful. And they don't have a lot of memory. So you're only going to be able to do things that have been modeled or constructed somewhere else. But that's okay. Because machine learning algorithm development is actually slow and painful. So you really want the people who know how to do this working with gobs of data, creating models and testing them offline. And when you have something that works, you can put it there, along the lines of the sketch below. Now there's one thing I want to talk about before I finish, and I think I'm almost finished.
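A minimal sketch of that train-offline, score-at-the-Edge pattern, assuming scikit-learn for the offline part; the feature names, the data and the failure label are all synthetic, made up for illustration:

```python
# Sketch: do the slow, painful model development offline with gobs of data,
# then ship only plain coefficients that an underpowered Edge device can
# score with. All data and feature names here are synthetic.

import math
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 2))               # e.g. [temperature, vibration]
y = (X[:, 0] + 2 * X[:, 1] > 1).astype(int)    # synthetic "failure" label

model = LogisticRegression().fit(X, y)         # offline, where compute is cheap

# Ship only the numbers; the device needs no ML library, just arithmetic.
w0, w1 = model.coef_[0]
b = model.intercept_[0]

def edge_score(temperature: float, vibration: float) -> float:
    """Runs on the Edge device: a dot product and a sigmoid."""
    z = w0 * temperature + w1 * vibration + b
    return 1.0 / (1.0 + math.exp(-z))

print(round(edge_score(0.5, 0.9), 3))          # estimated probability of failure
```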
I wrote a book about 10 years ago about automated decision making, and the conclusion I came up with was that little decisions add up, and that's good. But it also means you don't have to get them all right. But you don't want computers or software making decisions unattended if it involves human life, or frankly any life. Or the environment. So when you think about the applications that you can build using this architecture and this technology, think about the fact that you're not going to be doing air traffic control, you're not going to be monitoring crossing guards at the elementary school. You're going to be doing things that may seem fairly mundane. Managing machinery on the factory floor; I mean that may sound great, but really isn't that interesting. Managing well heads, drilling for oil; well, I mean, it's great to the extent that it doesn't cause wells to explode, but they don't usually explode. What it's usually used for is to drive the cost out of preventative maintenance. Not very interesting. So use your heads. Come up with really cool stuff. And any of you who are involved in Edge Analytics, the next time I talk to you I don't want to hear about the same five applications that everybody talks about. Let's hear about some new ones. So, in conclusion, I don't really have anything in conclusion except that Peter mentioned something about limousines bringing people up here. On Monday I was slogging up and down Park Avenue and Madison Avenue with my client, and we were visiting all the hedge funds there because we were doing a project with them. And in the miserable weather I looked at him and I said, for God's sake Paul, where's the black car? And he said, that was the '90s. (laughs) Thank you. So, Jim, up to you. (audience applauding) This is terrible, go that way, this was terrible coming that way. >> Woo, don't want to trip! And let's move to, there we go. Hi everybody, how ya doing? Thanks Neil, thanks Peter, those were great discussions. So I'm the third leg in this relay race here, talking about of course how software is eating the world. And focusing on the value of Edge Analytics in a lot of real world scenarios. Programming the real world to make the world a better place. So I'll break it out analytically in terms of the research that Wikibon is doing in the area of the IoT, but specifically how AI is being embedded really into all material reality, potentially, at the Edge: mobile applications, industrial IoT, smart appliances, self driving vehicles. I will break it out in terms of a reference architecture for understanding what functions are being pushed to the Edge, to hardware, to our phones and so forth, to drive various scenarios in terms of real world results. So I'll move apace here. So basically AI software, or AI microservices, are being infused into Edge hardware as we speak. What we see is more vendors of smart phones and other real world appliances, and things like self driving vehicles. What they're doing is instrumenting their products with computer vision and natural language processing, environmental awareness based on sensing and actuation, and the capabilities and inferences that these devices need, both to provide support for the human users of these devices and to enable varying degrees of autonomous operation. So what I'll be talking about is how AI is a foundation for data driven systems of agency of the sort that Peter is talking about.
Infusing data driven intelligence into everything, or potentially so. As more of this capability, all these algorithms for doing real time predictions and classifications, anomaly detection and so forth, gets diffused widely and becomes more commoditized, you'll see it burned into an ever-wider variety of hardware architectures, neurosynaptic chips, GPUs and so forth. So what I've got here in front of you is a sort of high level reference architecture that we're building up in our research at Wikibon. So AI, artificial intelligence, is a big term, a big paradigm; I'm not going to unpack it completely, and of course we don't have oodles of time, so I'm going to take you fairly quickly through the high points. It's a driver for systems of agency. Programming the real world. Transducing digital inputs, the data, into analog real world results. Through the embedding of this capability in the IoT, but pushing more and more of it out to the Edge, with points of decision and action in real time. And there are four AI enabling capabilities that are absolutely critical to software being pushed to the Edge: sensing, actuation, inference and learning. Sensing and actuation, like Peter was describing: sensing is about capturing data from the environment within which a device or user is operating or moving. And actuation is the fancy term for doing stuff; ya know, in industrial IoT it's obviously machine control, but clearly, for self driving vehicles it's steering the vehicle and avoiding crashing and so forth. Inference is the meat and potatoes, as it were, of AI. Analytics does inferences. It infers from the data the logic of the application. Predictive logic, correlations, classification, abstractions, differentiation, anomaly detection, recognizing faces and voices. We see that now with Apple: the latest version of the iPhone is embedding face recognition as the core multifactor authentication technique. Clearly that's a harbinger of what's going to be universal fairly soon, and it depends on AI. It depends on convolutional neural networks; that is some heavy hitting processing power that's necessary, and it's processing the data that's coming from your face. So that's critically important. So what we're looking at, then, is AI software taking root in hardware to power continuous agency. Getting stuff done. Powering decision support for human beings who have to take varying degrees of action in various environments. We don't necessarily want to let the car steer itself in all scenarios; we want some degree of override, for lots of good reasons. People want to protect life and limb, including their own. And just more data driven automation across the internet of things in the broadest sense. So unpacking this reference framework: what's happening is that AI driven intelligence is powering real time decisioning at the Edge. Real time local sensing from the data the device is capturing there; it's ingesting the data. Some, not all, of that data may be persisted at the Edge. Some, perhaps most of it, will be pushed into the cloud for other processing. You have these highly complex algorithms that are doing AI deep learning, multilayer, to do a variety of anti-fraud and higher level work, like auto-narrative roll-ups from various scenes that are unfolding.
A lot of this processing is going to begin to happen in the cloud, but a fair amount of the more narrowly scoped inferences that drive real time decision support at the point of action will be done on the device itself. Contextual actuation: the sensor data that's captured by the device, along with other data that may be coming down in real time streams through the cloud, will provide the broader contextual envelope of data needed to drive actuation, to drive the various models and rules and so forth that are making stuff happen at the point of action, at the Edge. Continuous inference. What it all comes down to is that inference is what's going on inside the chips at the Edge device. And what we're seeing is a growing range of hardware architectures, GPUs, CPUs, FPGAs, ASICs, neurosynaptic chips of all sorts, playing in various combinations that are automating more and more very complex inference scenarios at the Edge. And not just individual devices; swarms of devices, like drones and so forth, are essentially an Edge unto themselves. You'll see these tiered hierarchies of Edge swarms that are playing and doing inferences of an ever more complex, dynamic nature. And much of this capability, the fundamental capabilities that are powering them all, will be burned into the hardware that powers them. And then adaptive learning. Now I use the term learning rather than training here, though training is at the core of it. Training means everything in terms of the predictive fitness, the fitness of your AI services for whatever task, predictions, classifications, face recognition, you've built them for. But I use the term learning in a broader sense. What makes your inferences get better and better, more accurate over time, is that you're training them with fresh data in a supervised learning environment. But you can have reinforcement learning if you're doing, say, robotics and you don't have ground truth against which to train the data set. You know, there's maximizing a reward function versus minimizing a loss function, the latter being the standard approach for supervised learning. There's also, of course, the approach of unsupervised learning, with cluster analysis critically important in a lot of real world scenarios. So, Edge AI algorithms. Clearly deep learning, which is multilayered machine learning models that can do abstractions at higher and higher levels. Face recognition is a high level abstraction. Faces in a social environment is an even higher level of abstraction in terms of groups. Faces over time and bodies and gestures, doing various things in various environments, is an even higher level abstraction in terms of narratives that can be rolled up, are being rolled up, by deep learning capabilities of great sophistication. Convolutional neural networks for processing images, recurrent neural networks for processing time series. Generative adversarial networks for doing essentially what are called generative applications of all sorts: composing music, and a lot of it's being used for auto-programming. These are all deep learning. There's a variety of other algorithmic approaches I'm not going to bore you with here. Deep learning is essentially the enabler of the five senses of the IoT. Your phone has a camera, it has a microphone, and of course it has geolocation and navigation capabilities. It's environmentally aware, it's got an accelerometer and so forth embedded therein.
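For a sense of what those two network families look like in code, here is a minimal sketch using TensorFlow's Keras API. The layer sizes and input shapes are illustrative assumptions only, not a recommended architecture:

```python
import tensorflow as tf

# Convolutional neural network: spatial abstractions over images.
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Recurrent neural network: temporal abstractions over time series.
rnn = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(100, 8)),  # 100 steps, 8 features
    tf.keras.layers.Dense(1),
])

# Supervised learning "minimizes a loss function," as described above.
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
rnn.compile(optimizer="adam", loss="mse")
```

Both compile calls make the supervised learning point above literal: training proceeds by minimizing the chosen loss function against labeled data.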
The reason that your phone and all of these devices are getting scary sentient is that they have the sensory modalities and the AI, the deep learning, that enables them to make environmentally correct decisions in a wider range of scenarios. So machine learning is the foundation of all of this, or rather deep learning is, and artificial neural networks are the foundation of that. But there are other approaches to machine learning I want to make you aware of, because support vector machines and these other established approaches for machine learning are not going away. But really what's driving the show now is deep learning, because it's scary effective. And so that's where most of the investment in AI is going these days, into deep learning. AI Edge platforms, tools and frameworks are just coming along like gangbusters. Much development of AI, of deep learning, happens in the context of your data lake. This is where you're storing your training data. This is the data that you use to build, test and validate your models. So we're seeing a deepening stack of Hadoop, and there's Kafka and Spark and so forth, that are driving the training (coughs) excuse me, of the AI models that power all these Edge Analytics applications, so that lake will continue to broaden and deepen in terms of the scope and range of data sets and the range of AI modeling it supports. Data science is critically important in this scenario because the data scientist, the data science teams, the tools and techniques and flows of data science are the fundamental development paradigm, or discipline, or capability that's being leveraged to build and to train and to deploy and iterate all this AI that's being pushed to the Edge. So clearly data science is at the center; data scientists of an increasingly specialized nature are necessary to the realization of this value at the Edge. AI frameworks are coming along, you know, a mile a minute. TensorFlow, which is open source, most of these are open source, has achieved sort of a de facto standard status, and I'm using the word de facto in air quotes. There's Theano and Keras and MXNet and CNTK and a variety of other ones. We're seeing a range of AI frameworks come to market, most of them open source. Most are supported by most of the major tool vendors as well. So at Wikibon we're definitely tracking that; we plan to go deeper in our coverage of that space. And then next best action, which powers recommendation engines. I mean next best action, decision automation, the sort of thing Neil's covered in a variety of contexts in his career, is fundamentally important to Edge Analytics and to systems of agency, 'cause it's driving the process automation, decision automation, sort of the targeted recommendations that are made at the Edge to individual users as well as to process automation. That's absolutely necessary for self-driving vehicles to do their jobs, and for industrial IoT. So what we're seeing is more and more recommendation engine, or recommender, capabilities powered by ML and DL going to the Edge, already at the Edge, for a variety of applications. Edge AI capabilities, like I said, there's sensing. And sensing at the Edge is becoming ever more rich: mixed reality Edge modalities of all sorts, for augmented reality and so forth. We're just seeing a growth in the range of sensory modalities that are enabled, or filtered and analyzed, through AI being pushed to the Edge, into the chipsets.
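Picking back up on the next best action point above, here is a minimal sketch of a recommender of that flavor, scoring candidate actions against a user vector by cosine similarity. The action names and the random embeddings are invented for illustration; a production engine would use vectors learned by a cloud-trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate actions and embeddings that a cloud-trained model might push
# down to the device; both are invented here for illustration.
actions = ["show_coupon", "suggest_reorder", "send_alert", "do_nothing"]
action_vectors = rng.random((len(actions), 8))
user_vector = rng.random(8)

def next_best_action(user_vec, action_vecs):
    # Cosine similarity between the user vector and each candidate action.
    scores = action_vecs @ user_vec
    scores = scores / (np.linalg.norm(action_vecs, axis=1)
                       * np.linalg.norm(user_vec))
    return actions[int(np.argmax(scores))], scores

choice, scores = next_best_action(user_vector, action_vectors)
print("recommended action:", choice)
```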
Actuation, that's where robotics comes in. Robotics is coming into all aspects of our lives. And you know, it's brainless without AI, without deep learning and these capabilities. Inference: autonomous Edge decisioning. Like I said, there's a growing range of inferences being done at the Edge, and that's where it has to happen, 'cause that's the point of decision. Learning: training. Much training, most training, will continue to be done in the cloud because it's very data intensive. It's a grind to train and optimize an AI algorithm to do its job. It's not something that you necessarily want to do, or can do, at the Edge, at Edge devices. So the models that are built and trained in the cloud are pushed through a DevOps process down to the Edge, and that's the way it will work pretty much in most AI environments, Edge Analytics environments. You centralize the modeling, you decentralize the execution of the inference models. The training engines will be in the cloud. Edge AI applications. I'll just run you through sort of a core list of the ones that are coming into, or have already come into, the mainstream at the Edge. Multifactor authentication: clearly the Apple announcement of face recognition is just a harbinger of the fact that that's coming to every device. Computer vision, speech recognition, NLP, digital assistants and chat bots powered by natural language processing and understanding, it's all AI powered. And it's becoming very mainstream. Emotion detection, face recognition, you know, I could go on and on, but these are like the core things that everybody has access to, or will by 2020, on their core devices, mass market devices. Developers, designers and hardware engineers are coming together to pool their expertise to build and train not just the AI, but also the entire package of hardware and UX and the orchestration of real world business scenarios, or life scenarios, that all this embedded intelligence enables. And much of what they build in terms of AI will be containerized as microservices through Docker and orchestrated through Kubernetes as full cloud services in an increasingly distributed fabric. That's coming along very rapidly. We can see a fair amount of that already on display at Strata in terms of what the vendors are doing or announcing or who they're working with. The hardware itself, at the Edge: some data will be persistent, needs to be persistent, to drive inference, and to drive a variety of different application scenarios that need some degree of historical data related to what the device in question happens to be sensing, or has sensed in the immediate past, or you know, whatever. The hardware itself is geared towards both sensing and, increasingly, persistence and Edge driven actuation of real world results. The whole notion of drones and robotics being embedded into everything that we do, that's where that comes in. That has to be powered by low cost, low power commodity chipsets of various sorts. What we see right now in terms of chipsets is GPUs; Nvidia has gone really far and GPUs have come along very fast in terms of powering inference engines, you know, like the Tesla cars and so forth. GPUs are in many ways the core hardware substrate for inference engines in DL so far. But to become a mass market phenomenon it's got to get cheaper and lower powered and more commoditized, and so we see a fair number of CPUs being used as the hardware for Edge Analytics applications.
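To ground the centralize-the-modeling, decentralize-the-inference pipeline described above, here is a minimal sketch: train a toy model in the cloud, convert it with TensorFlow Lite, and write out the compact artifact that a deployment pipeline would push down to devices. The model, the synthetic data and the file name are all illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

# 1. Train in the cloud (toy model on synthetic data, for illustration).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(np.random.rand(256, 4), np.random.randint(0, 2, 256),
          epochs=2, verbose=0)

# 2. Convert the trained model into a compact format suited to
#    low-power edge hardware.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# 3. "Push down through a DevOps process": persist the artifact that a
#    deployment pipeline would distribute to devices for local inference.
with open("edge_model.tflite", "wb") as f:
    f.write(tflite_model)
```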
Some vendors are fairly big on FPGAs; I believe Microsoft has gone fairly far with FPGAs in its DL strategy. ASICs, I mean, there are neurosynaptic chips; IBM's got one. There are at least a few dozen vendors of neurosynaptic chips on the market, so at Wikibon we're going to track that market as it develops. And what we're seeing is a fair number of scenarios where it's a mixed environment, where you use one chipset architecture at the inference side of the Edge, and other chipset architectures driving the DL as processed in the cloud, playing together within a common architecture. And we see a fair number of DL environments where the actual training is done in the cloud on Spark using CPUs and parallelized in memory, but pushing TensorFlow models that might be trained through Spark down to the Edge, where the inferences are done on FPGAs and GPUs. Those kinds of mixed hardware scenarios are very, very likely to be standard going forward in lots of areas. So analytics at the Edge powering continuous results is what it's all about. The whole point is really not moving the data; it's putting the inference at the Edge and working from the data that's already captured and persistent there for the duration of whatever action or decision or result needs to be powered from the Edge. Like Neil said, cost takeout alone is not worth doing. Cost takeout alone is not the rationale for putting AI at the Edge. It's getting new stuff done, new kinds of things done, in an automated, consistent, intelligent, contextualized way to make our lives better and more productive. Security and governance are becoming more important. Governance of the models, governance of the data, governance in a DevOps context in terms of version controls over all those DL models that are built, that are trained, that are containerized and deployed. Continuous iteration and improvement of those models to help them learn to make our lives better and easier. With that said, I'm going to hand it over now. It's five minutes after the hour. We're going to get going with the Influencer Panel, so what we'd like to do is I'll call Peter, and Peter's going to call our influencers. >> All right, am I live yet? Can you hear me? All right so, let me jump back in control here. Again, the objective here is to have the community take on some things. And so what we want to do is I want to invite five other people up, Neil why don't you come on up as well. Start with Neil. You can sit here. On the far right hand side, Judith, Judith Hurwitz. >> Neil: I'm glad I'm on the left side. >> From the Hurwitz Group. >> From the Hurwitz Group. Jennifer Shin who's affiliated with UC Berkeley. Jennifer are you here? >> She's here, Jennifer where are you? >> She was here a second ago. >> Neil: I saw her walk out she may have, >> Peter: All right, she'll be back in a second. >> Here's Jennifer! >> Here's Jennifer! >> Neil: With 8 Path Solutions, right? >> Yep. >> Yeah 8 Path Solutions. >> Just get my mic. >> Take your time Jen. >> Peter: All right, Stephanie McReynolds. Far left. And finally Joe Caserta, Joe come on up. >> Stephanie's with Alation. >> And to the left. So what I want to do is I want to start by having everybody just go around and introduce yourself quickly. Judith, why don't we start there. >> I'm Judith Hurwitz, I'm president of Hurwitz and Associates. We're an analyst research and thought leadership firm. I'm the co-author of eight books. Most recent is Cognitive Computing and Big Data Analytics.
It's been in the market for a couple years now. >> Jennifer. >> Hi, my name's Jennifer Shin. I'm the founder and Chief Data Scientist of 8 Path Solutions LLC. We do data science, analytics and technology. We're actually about to do a big launch next month, with Box actually. >> Are we, sorry Jennifer, are we having a problem with Jennifer's microphone? >> Man: Just turn it back on? >> Oh you have to turn it back on. >> It was on, oh sorry, can you hear me now? >> Yes! We can hear you now. >> Okay, I don't know how that turned back off, but okay. >> So you got to redo all that Jen. >> Okay, so my name's Jennifer Shin, I'm founder of 8 Path Solutions LLC, it's a data science, analytics and technology company. I founded it about six years ago. So we've been developing some really cool technology that we're going to be launching with Box next month. It's really exciting. And I've been developing a lot of patents and some technology, as well as teaching at UC Berkeley as a lecturer in data science. >> You know Jim, you know Neil, Joe, you ready to go? >> Joe: Just broke my microphone. >> Joe's microphone is broken. >> Joe: Now it should be all right. >> Jim: Speak into Neil's. >> Joe: Hello, hello? >> I just feel not worthy in the presence of Joe Caserta. (several laughing) >> That's right, master of mics. If you can hear me, Joe Caserta, so yeah, I've been doing data technology solutions since 1986, almost as old as Neil here, but been doing specifically like BI, data warehousing, business intelligence type of work since 1996. And been wholly dedicated to Big Data solutions and modern data engineering since 2009. Where should I be looking? >> Yeah I don't know, where is the camera? >> Yeah, and that's basically it. So my company was formed in 2001, it's called Caserta Concepts. We recently rebranded to only Caserta 'cause what we do is way more than just concepts. So we conceptualize the stuff, we envision what the future brings and we actually build it. And we help clients large and small who just want to be leaders in innovation using data, specifically to advance their business. >> Peter: And finally Stephanie McReynolds. >> I'm Stephanie McReynolds, I head product marketing as well as corporate marketing for a company called Alation. And we are a data catalog, so we help bring together not only a technical understanding of your data, but we curate that data with human knowledge and use automated intelligence internally within the system to make recommendations about what data to use for decision making. And some of our customers, like City of San Diego, a large automotive manufacturer working on self-driving cars, and General Electric, use Alation to help power their solutions for IoT at the Edge. >> All right so let's jump right into it. And again if you have a question, raise your hand, and we'll do our best to get it to the floor. But what I want to do is I want to get seven questions in front of this group and have you guys discuss, slog, disagree, agree. Let's start here. What is the relationship between Big Data, AI and IoT? Now Wikibon's put forward its observation that data's being generated at the Edge, that action is being taken at the Edge, and that increasingly the software and other infrastructure architectures need to accommodate the realities of how data is going to work in these very complex systems. That's our perspective. Anybody, Judith, you want to start?
>> Yeah, so I think that if you look at AI, machine learning, all these different areas, you have to be able to have the data to learn from. Now when it comes to IoT, I think one of the issues we have to be careful about is not all data will be at the Edge. Not all data needs to be analyzed at the Edge. For example if the light is green and that's good and it's supposed to be green, do you really have to constantly analyze the fact that the light is green? You actually only really want to be able to analyze and take action when there's an anomaly. Well if it goes purple, that's actually a sign that something might explode, so that's where you want to make sure that you have the analytics at the Edge. Not for everything, but for the things where there is an anomaly and a change. >> Joe, how about from your perspective? >> For me I think with the evolution of data, eventually, I mean, data's going to be the oxygen we breathe. It used to be very, very reactive and there used to be like a latency. You do something, there's a behavior, there's an event, there's a transaction, and then you go record it and then you collect it, and then you can analyze it. And it was very, very waterfallish, right? And then eventually we figured out how to put it back into the system. Or at least human beings interpreted it to try to make the system better, and that has really been completely turned on its head; we don't do that anymore. Right now it's very, very synchronous, where as we're actually making these transactions, the machines, we don't really need, I mean human beings are involved a bit, but less and less and less. And it's just a reality, it may not be politically correct to say, but it's a reality that my phone in my pocket is following my behavior, and it knows without telling a human being what I'm doing. And it can actually help me do things like get to where I want to go faster, depending on my preference, if I want to save money or save time or visit things along the way. And I think that's all integration of big data, streaming data, artificial intelligence, and I think the next thing that we're going to start seeing is the culmination of all of that. I actually, hopefully it'll be published soon, I just wrote an article for Forbes with the term ARBI, and ARBI is the integration of Augmented Reality and Business Intelligence. Where I think essentially we're going to see, you know, hold your phone up to Jim's face and it's going to recognize-- >> Peter: It's going to break. >> And it's going to say exactly, you know, what are the key metrics that we want to know about Jim. If he works on my sales force, what's his attainment of goal, what is-- >> Jim: Can it read my mind? >> Potentially, based on behavior patterns. >> Now I'm scared. >> I don't think Jim's buying it. >> It will, without a doubt, be able to predict that what you've done in the past you may, with some certain level of confidence, do again in the future, right? And is that mind reading? It's pretty close, right? >> Well, sometimes, I mean, mind reading is in the eye of the individual who wants to know. And if the machine appears to approximate what's going on in the person's head, sometimes you can't tell. So I guess, I guess we could call that the Turing machine test of the paranormal. >> Well, face recognition, micro gesture recognition, I mean facial gestures, people can do it.
Maybe not better than a coin toss, but if it can be seen visually and captured and analyzed, conceivably some degree of mind reading can be built in. I can see when somebody's angry looking at me so, that's a possibility. That's kind of a scary possibility in a surveillance society, potentially. >> Neil: Right, absolutely. >> Peter: Stephanie, what do you think? >> Well, I hear a world of the bots versus the humans being painted here, and I think that, you know, at Alation we have a very strong perspective on this, and that is that the greatest impact, or the greatest results, is going to be when humans figure out how to collaborate with the machines. And so yes, you want to get to the location more quickly, but it's not that the machine, as in the bot, is able to tell you exactly what to do and you're just going to blindly follow it. You need to train that machine, you need to have a partnership with that machine. So a lot of the power, and I think this goes back to Judith's story, is in what human decision making can be augmented with data from the machine, with the humans actually doing the training and driving the machines in the right direction. I think that's when we get true power out of some of these solutions, so it's not just all about the technology. It's not all about the data or the AI, or the IoT; it's about how that empowers human systems to become smarter and more effective and more efficient. And I think we're playing that out in our technology in a certain way, and I think organizations that are thinking along those lines with IoT are seeing more benefits immediately from those projects. >> So I think we have general agreement on some of the things you talked about: IoT crucial for capturing information and then having action taken, AI crucial to defining and refining the nature of the actions that are being taken, Big Data ultimately powering how a lot of that changes. Let's go to the next one. >> So actually I have something to add to that. So I think it makes sense, right, with IoT, why we have Big Data associated with it. If you think about what data is collected by IoT, we're talking about serial information, right? It's over time; it's going to grow exponentially just by definition, right? So every minute you collect a piece of information, that means over time it's going to keep growing, growing, growing as it accumulates. So that's one of the reasons why the IoT is so strongly associated with Big Data. And also why you need AI to be able to differentiate between one minute versus the next minute, right? Trying to find a better way rather than looking at all that information and manually picking out patterns. To have some automated process for being able to filter through that much data that's being collected. >> I want to point out though, based on what you just said Jennifer, and I want to bring Neil in at this point, that this question of IoT now generating unprecedented levels of data does introduce this idea of the primary source. Historically what we've done within technology, or within IT certainly, is we've taken stylized data. There is no such thing as a real world accounting thing. It is a human contrivance. And we stylize data and therefore it's relatively easy to be very precise on it. But when we start, as you noted, when we start measuring things with a tolerance down to thousandths of a millimeter, whatever that is, metric system, now we're still sometimes dealing with errors that we have to attend to.
So, the reality is we're not just dealing with stylized data, we're dealing with real data, and it's more frequent, but it also has special cases that we have to attend to in terms of how we use it. What do you think Neil? >> Well, I mean, I agree with that, I think I already said that, right. >> Yes you did, okay let's move on to the next one. >> Well it's a doppelganger, the digital twin doppelganger that's automatically created by the very fact that you're living and interacting and so forth and so on. It's going to accumulate regardless. Now that doppelganger may not be your agent, or might not be the foundation for your agent, unless there's some other piece of logic, like an interest graph that you build, a human being saying this is my broad set of interests, and so all of my agents out there in the IoT, you all need to be aware that when you make a decision on my behalf as my agent, this is what Jim would do. You know, I mean there needs to be that kind of logic somewhere in this fabric to enable true agency. >> All right, so I'm going to start with you. Oh go ahead. >> I have a real short answer to this though. I think that Big Data provides the data and compute platform to make AI possible. For those of us who dipped our toes in the water in the 80s, we got clobbered because we didn't have the facilities, we didn't have the resources to really do AI; we just kind of played around with it. And I think that the other thing about it is if you combine Big Data and AI and IoT, what you're going to see is, a lot of the applications we develop now are very inward looking; we look at our organization, we look at our customers. We try to figure out how to sell more shoes to fashionable ladies, right? But with this technology, I think people can really expand what they're thinking about and what they model, and come up with applications that are much more external. >> Actually what I would add to that is it also introduces the ability to use engineering, right? Having engineers interested in the data. Because it's actually technical data that's collected, not just, say, preferences or information about people, but actual measurements that are being collected with IoT. So it's really interesting in the engineering space, because it opens up a whole new world for the engineers to actually look at data and to actually combine both that hardware side as well as the data that's being collected from it. >> Well, Neil, you and I have talked about something, 'cause it's not just engineers. We have in the healthcare industry for example, which you know a fair amount about, this notion of empirically based management. And the idea that increasingly we have to be driven by data as a way of improving the way that managers do things, the way the managers collect or collaborate and ultimately, collectively, how they take action. So it's not just engineers; it's supposed to also inform business. What's actually happening in the healthcare world when we start thinking about some of this empirically based management? Is it working? What are some of the barriers? >> It's not a function of technology. What happens in medicine and healthcare research is, I guess you can say it borders on fraud. (people chuckling) No, I'm not kidding. I know the New England Journal of Medicine a couple of years ago released a study and said that at least half the articles that they published turned out to be ghost-written by pharmaceutical companies.
(man chuckling) Right, so I think the problem is that when you do a clinical study, the one that really killed me about 10 years ago was the Women's Health Initiative. They spent $700 million gathering this data over 20 years. And when they released it they looked at all the wrong things deliberately, right? So I think that's a systemic-- >> I think you're bringing up a really important point that we haven't brought up yet, and that is, can you use Big Data and machine learning to begin to take the biases out? So if you divorce your preconceived notions and your biases from the data and let the data lead you to the logic, you start to, I think, get better over time, but it's going to take a while to get there because we do tend to gravitate towards our biases. >> I will share an anecdote. So I had some arm pain, and I had numbness in my thumb and pointer finger and I went to, excruciating pain, went to the hospital. So the doctor examined me, and he said you probably have a pinched nerve, he said, but I'm not exactly sure which nerve it would be, I'll be right back. And I kid you not, he went to a computer and he Googled it. (Neil laughs) And he came back, because this little bit of information was something that could easily be looked up, right? Every nerve in your spine is connected to your different fingers, so the pointer and the thumb just happen to be your C6, so he came back and said, it's your C6. (Neil mumbles) >> You know an interesting, I mean that's a good example. One of the issues with healthcare data is that the data set is not always shared across the entire research community, so by making Big Data accessible to everyone, you actually start a more rational conversation or debate on well what are the true insights-- >> If that conversation includes what Judith talked about, the actual model that you use to set priorities and make decisions about what's actually important. So this is the test: it's not just about improving your understanding of the wrong thing, it's also testing whether it's the right or wrong thing as well. >> That's right, and to be able to test that you need to have humans in dialog with one another, bringing different biases to the table, to work through, okay, is there truth in this data? >> It's context and it's correlation, and you can have a great correlation that's garbage, you know, if you don't have the right context. >> Peter: So I want to, hold on Jim, I want to, >> It's exploratory. >> Hold on Jim, I want to take it to the next question 'cause I want to build off of what you talked about, Stephanie, and that is that this says something about what is the Edge. And our perspective is that the Edge is not just devices. That when we talk about the Edge, we're talking about human beings and the role that human beings are going to play, both as sensors, carrying things with them, but also as actuators, actually taking action, which is not a simple thing. So what do you guys think? What does the Edge mean to you? Joe, why don't you start? >> Well, I think it could be a combination of the two. And specifically when we talk about healthcare. So I believe in 2017, when we eat, we don't know why we're eating; like, I think we should absolutely by now be able to know exactly what is my protein level, what is my calcium level, what is my potassium level? And then find the foods to meet that.
What have I depleted versus what I should have, and eat very, very purposely and not by taste-- >> And it's amazing that red wine is always the answer. >> It is. (people laughing) And tequila, that helps too. >> Jim: You're a precision foodie is what you are. (several chuckle) >> There's no reason why we should not be able to know that right now, right? And when it comes to healthcare, the biggest problem or challenge is that no matter how great a technology you have, you can't manage what you can't measure. And you're really not allowed to use a lot of this data, so you can't measure it, right? You can't do things very, very scientifically, right, in the healthcare world, and I think regulation in the healthcare world is really burdening advancement in science. >> Peter: Any thoughts Jennifer? >> Yes, I teach statistics for data scientists, right, so you know we talk about a lot of these concepts. I think what makes these questions so difficult is you have to find a balance, right, a middle ground. For instance, in the case of are you being too biased through data, well you could say we want to look at data only objectively, but then there are certain relationships that your data models might show that aren't actually a causal relationship. For instance, if there's an alien that came from space and saw earth, saw the people, everyone's carrying umbrellas, right, and then it started to rain, that alien might think, well, it's because they're carrying umbrellas that it's raining. Now we know from the real world that that's actually not the way these things work. So if you look only at the data, that's the potential risk. That you'll start making associations or saying something's causal when it's actually not, right? So that's one of the, one of the I think big challenges. I think when it comes to looking also at things like healthcare data, right? Do you collect data about anything and everything? Does it mean that, A, we need to collect all that data for the question we're looking at? Or is it actually the best, more optimal way to be able to get to the answer? Meaning sometimes you can take some shortcuts in terms of what data you collect and still get the right answer, and not have maybe that level of specificity that's going to cost you millions extra to be able to get. >> So Jennifer, as a data scientist, I want to build upon what you just said. And that is, are we going to start to see methods and models emerge for how we actually solve some of these problems? So for example, we know how to build a system for a stylized process like accounting, or some elements of accounting. We have methods and models that lead to technology and actions and whatnot, all the way down to the point that that system can be generated. We don't have the same notion to the same degree when we start talking about AI and some of these Big Data problems. We have algorithms, we have technology. But are we going to start seeing, as a data scientist, repeatability and learning in how to think the problems through that's going to lead us to a more likely best, or at least good, result? >> So I think that's a bit of a tough question, right? Because part of it is, it's going to depend on how many of these researchers actually get exposed to real world scenarios, right? Research looks into all these papers, and you come up with all these models, but if it's never tested in a real world scenario, well, I mean we really can't validate that it works, right?
So I think it is dependent on how much of this integration there's going to be between the research community and industry, and how much investment there is. Funding is going to matter in this case. If there's no funding on the research side, then you'll see a lot of industry folks who feel very confident about their models, but again, on the other side of course, if researchers don't validate those models then you really can't say for sure that they're actually more accurate, or more efficient. >> It's the issue of real world testing and experimentation. A/B testing, that's standard practice in many operationalized ML and AI implementations in the business world, but with real world experimentation in Edge Analytics, what you're actually transducing is touching people's actual lives. The problem there is, like in healthcare and so forth, when you're experimenting with people's lives, somebody's going to die. I mean, in other words, that's critical; in terms of causal analysis, you've got to tread lightly on operationalizing that kind of testing in the IoT when people's lives and health are at stake. >> We still give 'em placebos. So we still test 'em. All right so let's go to the next question. What are the hottest innovations in AI? Stephanie I want to start with you, as someone at a company that's got kind of an interesting little thing happening. We start thinking about how do we better catalog data and represent it to a large number of people. What are some of the hottest innovations in AI as you see it? >> I think it's a little counterintuitive, what the hottest innovations are in AI, because we're at a spot in the industry where the most successful companies that are working with AI are actually incorporating them into solutions. So the best AI solutions are actually the products that you don't know there's AI operating underneath. But they're having a significant impact on business decision making, or bringing a different type of application to the market, and you know, I think there's a lot of investment that's going into AI tooling and tool sets for data scientists or researchers, but the more innovative companies are thinking through how do we really take AI and make it have an impact on business decision making, and that means kind of hiding the AI from the business user. Because if you think a bot is making a decision instead of you, you're not going to partner with that bot very easily or very readily. I worked, way at the start of my career, I worked in CRM when recommendation engines were all the rage online and also in call centers. And the hardest thing was to get a call center agent to actually read the script that the algorithm was presenting to them; that algorithm was 99% correct most of the time, but there was this human resistance to letting a computer tell you what to tell that customer on the other side, even if it was more successful in the end. And so I think that the innovation in AI that's really going to push us forward is when humans feel like they can partner with these bots, and they don't think of it as a bot, but they think about it as assisting their work and getting to a better result-- >> Hence the augmentation point you made earlier. >> Absolutely, absolutely. >> Joe how 'about you? What do you look at? What are you excited about? >> I think the coolest thing at the moment right now is chat bots. To be able to have voice, to be able to speak with you in natural language, to do that, I think that's pretty innovative, right?
And I do think that eventually, for the average user, not for techies like me, but for the average user, I think keyboards are going to be a thing of the past. I think we're going to communicate with computers through voice, and I think this is the very, very beginning of that, and it's an incredible innovation. >> Neil? >> Well, I think we all have myopia here. We're all thinking about commercial applications. Big, big things are happening with AI in the intelligence community, in the military, the defense industry, in all sorts of things. Meteorology. And that's where, well, hopefully not on an everyday basis with the military, you really see the effect of this. But I was involved in a project a couple of years ago where we were developing AI software to detect artillery pieces in terrain from satellite imagery. I don't have to tell you what country that was. I think you can probably figure that one out, right? But there are legions of people in many, many companies that are involved in that industry. So if you're talking about the dollars spent on AI, I think the stuff that we do in our industries is probably fairly small. >> Well it reminds me of an application I actually thought was interesting about AI related to that: AI being applied to removing mines from war zones. >> Why not? >> Which is not a bad thing for a whole lot of people. Judith what do you look at? >> So I'm looking at things like being able to have pre-trained data sets in specific solution areas. I think that that's something that's coming. Also the ability to really be able to have a machine assist you in selecting the right algorithms based on what your data looks like and the problems you're trying to solve. Some of the things that data scientists still spend a lot of their time on, but can be augmented with some, basically we have to move to levels of abstraction before this becomes truly ubiquitous across many different areas. >> Peter: Jennifer? >> So I'm going to say computer vision. >> Computer vision? >> Computer vision. So computer vision ranges from image recognition, to be able to say what content is in the image. Is it a dog, is it a cat, is it a blueberry muffin? Like a sort of popular post out there where it's like a blueberry muffin versus, I think, a chihuahua, and then it compares the two. And can the AI really actually detect the difference, right? So I think that's really where a lot of people who are in this space, of being in both the AI space as well as data science, are looking to for the new innovations. I think, for instance, Cloud Vision, I think that's what Google still calls it, the Vision API they've released in beta, allows you to actually use an API to send your image and then have it be recognized, right, by their API. There's another startup in New York called Clarifai that also does a similar thing, as well as, you know, Amazon has their Rekognition platform as well. So I think from images, being able to detect what's in the content, as well as from videos, being able to say things like how many people are entering a frame? How many people enter the store? Not having to actually go look at it and count it, but having a computer actually tally that information for you, right? >> There's actually an extra piece to that. So if I have a picture of a stop sign, and I'm an automated car, is it a picture on the back of a bus of a stop sign, or is it a real stop sign? So that's going to be one of the complications. >> Doesn't matter to a New York City cab driver. How 'about you Jim?
>> Probably not. (laughs) >> Hottest thing in AI is Generative Adversarial Networks, GANs. What's hot about that, well, I'll be very quick. Most AI, most deep learning, machine learning, is analytical; it's distilling or inferring insights from the data. Generative takes that same algorithmic basis but uses it to build stuff. In other words, to create realistic looking photographs, to compose music, to build CAD/CAM models, essentially, that can be constructed on 3D printers. So GANs, they're a huge research focus all around the world, and are often, increasingly, used for natural language generation. In other words it's institutionalizing, or having a foundation for, nailing the Turing test every single time: building something with machines that looks like it was constructed by a human, and doing it over and over again to fool humans. I mean you can imagine the fraud potential. But you can also imagine just the sheer, like, it's going to shape the world, GANs. >> All right so I'm going to say one thing, and then we're going to ask if anybody in the audience has an idea. So the thing that I find interesting is traditional programs, or when you tell a machine to do something, you don't need incentives. When you tell a human being something, you have to provide incentives. Like how do you get someone to actually read the text. And this whole question of elements within AI that incorporate incentives as a way of trying to guide human behavior is absolutely fascinating to me. Whether it's gamification, or even some things we're thinking about with blockchain and bitcoin and related types of stuff. To my mind that's going to have an enormous impact, some good, some bad. Anybody in the audience? I don't want to lose everybody here. What do you think sir? And I'll try to do my best to repeat it. Oh we have a mic. >> So my question's about, okay, so the question's pretty much about what Stephanie's talking about, which is human-in-the-loop training, right? I come from a computer vision background. That's the problem: we need millions of images trained, we need humans to do that. And that's like, you know, the workforce is essentially people that aren't necessarily part of the AI community, they're people that are just able to use that data and analyze the data and label that data. That's something that I think is a big problem everyone in the computer vision industry, at least, faces. I was wondering-- >> So again, the problem is the difficulty of methodologically bringing together people who have domain expertise and people who have algorithm expertise, and having them work together? >> I think the expertise issue comes in healthcare, right? In healthcare you need experts to be labeling your images. With contextual information, where essentially augmented reality applications are coming in, you have ARKit and everything coming out, but there is a lack of context-based intelligence. And all of that comes through training images, and all of that requires people to do it. And that's kind of the foundational basis of AI coming forward: it's not necessarily an algorithm, right? It's how well the data is labeled. Who's doing the labeling, and how do we ensure that it happens? >> Great question. So for the panel. So if you think about it, a consultant talks about being on the bench. How much time are they going to have to spend on trying to develop additional business? How much time should we set aside for executives to help train some of the assistants?
>> I think the key is to think of the problem a different way. You could have people manually label data, and that's one way to solve the problem. But you can also look at what is the natural workflow of that executive, or that individual. And is there a way to gather that context automatically using AI, right? And if you can do that, it's similar to what we do in our product: we observe how someone is analyzing the data, and from those observations we can actually create the metadata that then trains the system in a particular direction. But you have to think about solving the problem differently, of finding the workflow that you can feed into to make this labeling easy without the human really realizing that they're labeling the data. >> Peter: Anybody else? >> I'll just add to what Stephanie said, so in the IoT applications, all those sensory modalities, the computer vision, the speech recognition, all that, that's all potential training data. So it cross-checks against all the other models that are processing all the other data coming from that device. So that natural language understanding can be reality-checked against the images that the person happens to be commenting upon, or the scene in which they're embedded, so yeah, the data's embedded-- >> I don't think, we're not at the stage yet where this is easy. It's going to take time before we do start doing the pre-training of some of these details so that it goes faster, but right now, there aren't that many shortcuts. >> Go ahead Joe. >> Sorry, so a couple things. So one is, I was just caught up on your point about incentivizing programs to be more efficient, like humans. You know, Ethereum, which is a blockchain, has this notion, this concept of gas. Where as the process becomes more efficient it costs less to actually run, right? It costs less ether, right? So it actually is kind of, the machine is actually incentivized, and you don't really know what it's going to cost until the machine processes it, right? So there is some notion of that there. But as far as vision, training the machine for computer vision, I think it's through adoption and crowdsourcing, so as people start using it more they're going to be adding more pictures. Very, very organically. And then the machines will be trained, and right now it's a very small handful doing it, and it's very proactive by the Googles and the Facebooks and all of that. But as we start using it, as they start looking at my images and Jim's and Jen's images, it's going to keep getting smarter and smarter through adoption and through a very organic process. >> So Neil, let me ask you a question. Who owns the value that's generated as a consequence of all these people ultimately contributing their insight and intelligence into these systems? >> Well, to a certain extent the people who are contributing the insight own nothing, because the systems collect their actions and the things they do, and then that data doesn't belong to them; it belongs to whoever collected it or whoever's going to do something with it. But the other thing, getting back to the medical stuff: it's not enough to say that people will do the right thing, because a lot of them are not motivated to do the right thing. The whole grant thing, the whole oh my god I'm not going to go against the senior professor.
A lot of these, I knew a guy who was a doctor at University of Pittsburgh, and they were doing a clinical study on the tubes that they put in little kids' ears who have ear infections, right? And-- >> Google it! Who helps out? >> Anyway, I forget the exact thing, but he came out and said that the principal investigator lied when he made the presentation, that it should be this, I forget which way it went. He was fired from his position at Pittsburgh and he has never worked as a doctor again. 'Cause he went against the senior line of authority. He was-- >> Another question back here? >> Man: Yes, Mark Turner has a question. >> Not a question, just want to piggyback on what you're saying about the transformation, maybe in healthcare, of black and white images into color images in the case of sonograms and ultrasounds and mammograms. Do you see that happening using AI? You see that being, I mean it's already happening, do you see it moving forward in that kind of way? I mean, talk more about that, about, you know, AI and black and white images being used, and they can be transformed, they can be made into color images so you can see things better, doctors can perform better operations. >> So I'm sorry, but could you summarize down? What's the question? Summarize it just, >> I had a lot of students, they're interested in the cross-pollination between AI and, say, the medical community, as far as things like ultrasounds and sonograms and mammograms, and how you can literally take a black and white image and it can, using algorithms and stuff, be made into a color image that can help doctors better do the work that they've already been doing, just do it better. You touched on it for like 30 seconds. >> So how AI can be used to actually add information in a way that's not necessarily invasive but ultimately improves how someone might respond to it or use it, yes? Related? I've also got something to say about medical images in a second, any of you guys want to, go ahead Jennifer. >> Yeah, so for one thing, you know, and it kind of goes back to what we were talking about before. When we look at, for instance, scans, like at some point I was looking at CT scans, right, for lung cancer nodules. In order for me, who doesn't have a medical background, to identify where the nodule is, of course, a doctor actually had to go in and specify which slice of the scan had the nodule and where exactly it is, so it's on both the slice level as well as, within that 2D image, where it's located and the size of it. So the beauty of things like AI is that ultimately, right now, a radiologist has to look at every slice and actually identify this manually, right? The goal of course would be that one day we wouldn't have to have someone look at every slice, usually like 300 slices, and would be able to identify it in a much more automated way. And I think the reality is we're not going to get something that's 100%. As with anything we do in the real world, it's always like a 95% chance of it being accurate. So I think it's finding that in-between, of what's the threshold that we want to use to be able to definitively say this is a lung cancer nodule or not. I think the other thing to think about is in terms of how they're using other information. For instance, based on other characteristics of the person's health, they might use that as sort of a grading, right? So, you know, how dark or how light something is, to identify maybe in that region the prevalence of that specific variable.
So that's usually how they integrate that information into something that's already existing in the computer vision sense. I think the difficulty with this, of course, is being able to identify which variables were introduced into the data that does exist. >> So I'll make two quick observations on this, then I'll go to the next question. One is radiologists have historically been some of the highest paid physicians within the medical community, partly because they don't have to be particularly clinical. They don't have to spend a lot of time with patients. They tend to spend time with doctors, which means they can do a lot of work in a little bit of time, and charge a fair amount of money. As we start to introduce some of these technologies that allow us to, from a machine standpoint, actually make diagnoses based on those images, I find it fascinating that you now see television ads promoting the role that the radiologist plays in clinical medicine. It's kind of an interesting response. >> It's also disruptive, as I'm seeing more and more studies showing that deep learning models processing images, ultrasounds and so forth are getting as accurate as many of the best radiologists. >> That's the point! >> At detecting cancer. >> Now radiologists are saying oh look, we do this great thing in terms of interacting with the patients, which they never have, because they're being disintermediated. The second thing that I'll note is one of my favorite examples of that, if I got it right, is looking at the images, the deep space images, that come out of Hubble. Where they're taking data from thousands, maybe even millions of images and combining it together in interesting ways so you can actually see depth. You can actually move through, at a very, very small scale, a system that's 150, well maybe that, can't be that much, maybe six billion light years away. Fascinating stuff. All right so let me go to the last question here, and then I'm going to close it down, then we can have something to drink. What are the hottest, oh I'm sorry, question? >> Yes, hi, my name's George, I'm with BlueTalon. You asked earlier the question, what's the hottest thing in the Edge and AI; I would say that it's security. It seems to me that before you can empower agency you need to be able to authorize what they can act on, how they can act on it, who they can act on. So it seems if you're going to move to very distributed data at the Edge and analytics at the Edge, there has to be security similarly done at the Edge. And I saw (speaking faintly) slides that called out security as a key prerequisite, and maybe Judith can comment, but I'm curious how security's going to evolve to meet this analytics at the Edge. >> Well, let me do that and I'll ask Jim to comment. The notion of agency is crucially important, slightly different from security, just so we're clear. And the basic idea here is, historically folks have thought about moving data, or they thought about moving application function; now we are thinking about moving authority. So as you said, that's not really a security question, but this has been a problem that's been of concern in a number of different domains. How do we move authority with the resources? And that's really what informs the whole agency process. But with that said, Jim. >> Yeah, actually, thank you for bringing up security, so identity is the foundation of security. Strong identity, multifactor, face recognition, biometrics and so forth.
Clearly AI, machine learning, deep learning are powering a new era of biometrics, and, you know, behavioral metrics and so forth that are organic to people's use of devices. You know, getting to the point that Peter was raising is important: agency! Systems of agency. Your agent, you have to, you as a human being should be vouching in a secure, tamper-proof way; your identity should be vouching for the identity of some agent, physical or virtual, that does stuff on your behalf. How can that, how should that, be managed within this increasingly distributed IoT fabric? Well, a lot of that's been worked out. It all runs through webs of trust, public key infrastructure, formats, and you know, SAML for single sign-on and so forth. It's all about assertion, strong assertions and vouching. I mean there are whole workflows for these things. Back in the ancient days when I was actually a PKI analyst, three analyst firms ago, I got deep into all the guts of all those federation agreements; something like that has to be IoT-scalable to enable systems of agency to be truly fluid. So we can vouch for our agents wherever they happen to be. We're going to keep on having, as human beings, agents all over creation; we're not even going to be aware of everywhere that our agents are, but our identity-- >> It's not just-- >> Our identity has to follow. >> But it's not just identity, it's also authorization and context. >> Permissioning, of course. >> So I may be the right person to do something yesterday, but I'm not authorized to do it in another context, in another application. >> Role-based permissioning, yeah. Or persona-based. >> That's right. >> I agree. >> And obviously it's going to be interesting to see the role that blockchain, or its follow-on technology, is going to play here. Okay so let me throw one more question out. What are the hottest applications of AI at the Edge? We've talked about a number of them; does anybody want to add something that hasn't been talked about? Or do you want to get a beer? (people laughing) Stephanie, you raised your hand first. >> I was going to go, I bring something mundane to the table actually, because I think one of the most exciting innovations with IoT and AI are actually simple things, like City of San Diego is rolling out 3200 automated street lights that will actually help you find a parking space, reduce the amount of emissions into the atmosphere, and so have some positive environmental change impact. I mean, it's street lights, it's not like a, it's not the medical industry, it doesn't look like a life changing innovation, and yet if we automate streetlights and we manage our energy better, and maybe they can flicker on and off if there's a parking space there for you, that's a significant impact on everyone's life. >> And dramatically suppress the impact of backseat driving! >> (laughs) Exactly. >> Joe what were you saying? >> I was just going to say, you know, there's already the technology out there where you can put a camera on a drone, with machine learning and artificial intelligence within it, and it can look at buildings and determine whether there are rusty pipes and cracks in cement and leaky roofs and all of those things. And that's all based on artificial intelligence. And I think if you can do that, to be able to look at an x-ray and determine if there's a tumor there is not out of the realm of possibility, right? >> Neil? >> I agree with both of them, that's what I meant about external kinds of applications.
>> And obviously it's going to be interesting to see the role that blockchain, or its follow-on technology, is going to play here. Okay, so let me throw one more question out. What are the hottest applications of AI at the Edge? We've talked about a number of them; does anybody want to add something that hasn't been talked about? Or do you want to get a beer? (people laughing) Stephanie, you raised your hand first. >> I bring something mundane to the table, actually, because I think some of the most exciting innovations with IoT and AI are actually simple things, like the City of San Diego rolling out 3,200 automated street lights that will actually help you find a parking space and reduce the amount of emissions into the atmosphere, so it has some positive environmental change impact. I mean, it's street lights, it's not the medical industry, it doesn't look like a life-changing innovation, and yet if we automate streetlights and we manage our energy better, and maybe they can flicker on and off if there's a parking space there for you, that's a significant impact on everyone's life. >> And dramatically suppress the impact of backseat driving! >> (laughs) Exactly. >> Joe, what were you saying? >> I was just going to say, you know, there's already the technology out there where you can put a camera on a drone with machine learning and artificial intelligence within it, and it can look at buildings and determine whether there are rusty pipes and cracks in cement and leaky roofs and all of those things. And that's all based on artificial intelligence. And I think if you can do that, to be able to look at an x-ray and determine if there's a tumor there is not out of the realm of possibility, right? >> Neil? >> I agree with both of them; that's what I meant about external kinds of applications, instead of figuring out what to sell our customers, which is mostly what we hear. I think all of those things are eminently doable. And boy, street lights that help you find a parking place, that's brilliant, right? >> Simple! >> It improves your life more than, I dunno, something I used on the internet recently, but I think it's great! I'd like to see a thousand things like that. >> Peter: Jim? >> Yeah, building on what Stephanie and Neil were saying, it's ambient intelligence built into everything to enable fine-grained microclimate awareness for all of us as human beings moving through the world, and to enable reading of every microclimate in buildings. In other words, you know, you have sensors on your body that are always detecting the heat, the humidity, the level of pollution or whatever in every environment that you're in, or that you might be likely to move into fairly soon, and they can either (a) give you guidance in real time about where to avoid, or (b) give that environment guidance about how to adjust itself to your specific requirements, like the lighting or whatever it might be. And you know, when you have a room like this, full of other human beings, there has to be some negotiated settlement; some will find it too hot, some will find it too cold or whatever, but I think that is fundamental in terms of reshaping the sheer quality of experience of most of our lived habitats on the planet, potentially. That's really the Edge analytics application that depends on everybody being fully equipped with a personal area network of sensors that's communicating into the cloud. >> Jennifer? >> So I think what's really interesting about it is being able to utilize the technology we do have; it's a lot cheaper now to have a lot of these ways of measuring that we didn't have before. And whether or not engineers can then leverage what we have as ways to measure things, you of course then need people like data scientists to build the right model. You can collect all this data, but if you don't build the right model that identifies these patterns, then all that data is just collected and it just sits in a repository. So without models that support the patterns that are actually in the data, you're not going to find a better way of finding insights in the data itself. So I think what will be really interesting is to see how existing technology is leveraged to collect data and how that's actually modeled, as well as how technology develops from here to collect things more sensitively, or, in the case of, say, how people move, whether we can build things that we can then use to measure how we move, right? Like how we move every day, and then being able to model that in a way that is actually going to give us better insights into things like healthcare, and maybe even just our behaviors. >> Peter: Judith? >> So, I think we also have to look at it from a peer-to-peer perspective. I may be able to get some data from one thing at the Edge, but then all those Edge devices, sensors or whatever, they all have to interact with each other, because while we may act in silos in our business lives, in the real world, when you look at things like sensors and devices, it's how they react with each other on a peer-to-peer basis.
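Jennifer's point a moment ago, that collected data just sits in a repository until a model surfaces the patterns in it, can be pictured with a small clustering sketch. This is an illustration only: scikit-learn is an assumed tool choice, and the movement data is synthetic stand-in data, not anything the panel built.

```python
# A minimal sketch: sensor readings are just a repository until a
# model surfaces their patterns. Data and tool choice are invented.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Stand-in "how people move" features: mean speed, step variability.
walking = rng.normal([1.4, 0.1], 0.05, size=(100, 2))
running = rng.normal([3.5, 0.3], 0.10, size=(100, 2))
X = np.vstack([walking, running])

model = KMeans(n_clusters=2, n_init=10).fit(X)
print(model.cluster_centers_)        # the recovered movement patterns
print(model.predict([[1.5, 0.12]]))  # classify a new reading
```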
>> All right, before I invite John up, I'll say what my thing is, and it's not the hottest. It's the one I hate the most: I hate AI-generated music. (people laughing) Hate it. All right, I want to thank all the panelists, every single person, some great commentary, great observations. I want to thank you very much. I want to thank everybody that joined. John, in a second you'll announce who's the big winner. But the one thing I want to do, as I was listening, and I learned a lot from everybody, is call out the one comment that I think we all need to remember, and I'm going to give you the award, Stephanie. And that is: increasingly we have to remember that the best AI is probably AI that we don't even know is working on our behalf. The flip side of that is that all of us have to be very cognizant of the idea that AI is acting on our behalf and we may not know it. So, John, why don't you come on up. Who won the, whatever it's called, the raffle? >> You won. >> Thank you! >> How about a round of applause for the great panel. (audience applauding) Okay, we had people put their business cards in the basket; we're going to have that brought up. We're going to have two raffle gifts, some nice Bose headsets and a Bluetooth speaker. Got to wait for that. I just want to say thank you for coming, and for the folks watching, this is our fifth year doing our own event called Big Data NYC, which is really an extension of the landscape beyond the Big Data world, that's Cloud and AI and IoT and other great things happening, and great experts and influencers and analysts here. Thanks for sharing your opinion. Really appreciate you taking the time to come out and share your data and your knowledge, appreciate it. Thank you. Where's the? >> Sam's right in front of you. >> There's the thing, okay. Got to be present to win. We saw some people sneaking out the back door to go to a dinner. >> First prize first. >> Okay, first prize is the Bose headset. >> Bluetooth and noise canceling. >> I won't look; Sam, you've got to hold it down, I can see the cards. >> All right. >> Stephanie, you won! (Stephanie laughing) Okay, Sawny Cox, Sawny Allie Cox? (audience applauding) Yay, look at that! He's here! The bar's open so help yourself, but we've got one more. >> Congratulations. Picture right here. >> Hold that, I saw you. Wake up a little bit. Okay, all right. Next one is, my kids love this. This is great, great for the beach, great for everything, a portable speaker, great gift. >> What is it? >> Portable speaker. >> It is a portable speaker, it's pretty awesome. >> Oh, you grabbed mine. >> Oh, that's one of our guys. >> (laughing) But who was it? >> Can't be related! Ava, Ava, Ava. Okay, Gene Penesko. (audience applauding) Hey! He came in! All right, look at that, the timing's great. >> Another one? (people laughing) >> Hey, thanks everybody, enjoy the night, thank Peter Burris, head of research for SiliconANGLE and Wikibon, and the great guests and influencers and friends. And you guys for coming in the community. Thanks for watching and thanks for coming. Enjoy the party and some drinks, and that's it for the influencer panel and analyst discussion. Thank you. (logo music)
Next-Generation Analytics Social Influencer Roundtable - #BigDataNYC 2016 #theCUBE
>> Narrator: Live from New York, it's the Cube, covering Big Data New York City 2016. Brought to you by headline sponsors Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now here's your host, Dave Vellante. >> Welcome back to New York City, everybody, this is the Cube, the worldwide leader in live tech coverage, and this is a Cube first: we've got a nine-person, actually eight-person panel of experts, data scientists, all alike. I'm here with my co-host, James Kobielus, who has helped organize this panel of experts. James, welcome. >> Thank you very much, Dave, it's great to be here, and we have some really excellent brain power up there, so I'm going to let them talk. >> Okay, well thank you again-- >> And I'll interject my thoughts now and then, but I want to hear them. >> Okay, great, we know you well, Jim, we know you'll do that, so thank you for that, and appreciate you organizing this. Okay, so what I'm going to do with our panelists is ask you to introduce yourselves. I'll introduce you, but tell us a little bit about yourself, and talk a little bit about what data science means to you. A number of you started in the field a long time ago, perhaps as data warehouse experts before the term data science was coined. Some of you started probably after Hal Varian said it was the sexiest job in the world. (laughs) So think about how data science has changed and/or what it means to you. We're going to start with Gregory Piatetsky, who's from Boston. A Ph.D., KDnuggets. Greg, tell us about yourself and what data science means to you. >> Okay, well thank you Dave and thank you Jim for the invitation. Data science in a sense is the second oldest profession. I think people have this built-in need to find patterns, and whatever we find, we want to organize the data. We do it well on a small scale, but we don't do it well on a large scale, so really, data science takes our need and helps us organize what we find: the patterns that we find that are really valid and useful and not just random. I think this is a big challenge of data science. I actually started in this field before the term data science existed. I started as a researcher and organized the first few workshops on data mining and knowledge discovery, and the term data mining became less fashionable, became predictive analytics; now it's data science, and it will be something else in a few years. >> Okay, thank you. Yves Mulkearns. Yves, I of course know you from Twitter. A lot of people know you as well. Tell us about your experiences and what data science means to you. >> Well, data science to me is, if you take the two words, the data and the science: the science holds a lot of expertise and skills; it's statistics, it's mathematics, it's understanding the business and putting that together with the digitization of what we have. It's not only the structured or unstructured data you store in the database, try to get out, and try to understand what is in there, but even video, what is coming in, and then trying to find, like Greg already said, the patterns in there and bringing value to the business, looking from a technical perspective but still linking that to the business insights. You can do that on a technical level, but then you don't know yet what you need to find, or what you're looking for. >> Okay great, thank you. Craig Brown, Cube alum. How many people have been on the Cube actually before? >> I have. >> Okay, good. I always like to ask that question.
So Craig, tell us a little bit about your background and, you know, data science, how has it changed, what does it all mean to you? >> Sure, so I'm Craig Brown, I've been in IT for almost 28 years, and that was obviously before the term data science. I started out as a developer and evolved through the data ranks, as I call it, working with data structures, working with data systems, data technologies, and now we're working with data pure and simple. Data science to me is an individual or team of individuals that dissect the data, understand the data, help folks look at the data differently than just the information that, you know, we usually use in reports, and get more insights on how to utilize it and better leverage it as an asset within an organization. >> Great, thank you Craig. Okay, Jennifer Shin? Math is obviously part of being a data scientist. You're good at math, I understand. Tell us about yourself. >> Yeah, so I'm a senior principal data scientist at the Nielsen Company. I'm also the founder of 8 Path Solutions, which is a data science, analytics, and technology company, and I'm also on the faculty in the Master of Information and Data Science program at UC Berkeley, teaching statistics for data science this semester, actually. And I think for me, I consider myself a scientist primarily, and data science is a nice day job to have, right? Something where there's industry need for people with my skill set in the sciences, and data gives us a great way of being able to communicate sort of what we know in science in a way that can be used out there in the real world. I think the best benefit for me is that now that I'm a data scientist, people know what my job is, whereas before, maybe five, ten years ago, no one understood what I did. Now, people don't necessarily understand what I do, but at least they understand kind of what I do, so it's still an improvement. >> Excellent. Thank you Jennifer. Joe Caserta, you're somebody who started in the data warehouse business, and saw that snake swallow a basketball and grow into what we now know as big data, so tell us about yourself. >> So I've been doing data for 30 years now, and I wrote the Data Warehouse ETL Toolkit with Ralph Kimball, which is the best-selling book in the industry on preparing data for analytics, and with the big paradigm shift that's happened, you know, for me the past seven years has been, instead of preparing data for people to analyze to make decisions, now we're preparing data for machines to make the decisions, and I think that's the big shift from data analysis to data analytics and data science. >> Great, thank you. Miriam, Miriam Fridell, welcome. >> Thank you. I'm Miriam Fridell, I work for Elder Research; we are a data science consultancy, and I came to data science through a very circuitous route. I started off as a physicist, went to work as a consultant and software engineer, then became a research analyst, and finally came to data science. And I think one of the most interesting things to me about data science is that it's not simply about building an interesting model and doing some interesting mathematics, or maybe wrangling the data, all of which I love to do, but it's really the entire analytics lifecycle and the value that you can actually extract from data at the end. And that's one of the things that I enjoy most: seeing a client's eyes light up or a, wow, I didn't really know we could look at data that way, that's really interesting.
I can actually do something with that, so I think that, to me, is one of the most interesting things about it. >> Great, thank you. Justin Sadeen, welcome. >> Absolutely, thank you, thank you. So my name is Justin Sadeen, I work for Morph EDU, an artificial intelligence company in Atlanta, Georgia, and we develop learning platforms for non-profit and private educational institutions. I'm a Marine Corps veteran turned data enthusiast, and what I think of data science as is the intersection of information, intelligence, and analysis, and I'm really excited about the transition from big data into smart data, and that's what I see data science as. >> Great, and last but not least, Dez Blanchfield, welcome mate. >> Good day. Yeah, I'm the one with the funny accent. So data science for me is probably the funniest job I've ever had to describe to my mom. I've had quite a few different jobs, and she's never understood any of them, and this one she understands the least. I think a fun way to describe what we're trying to do in the world of data science and analytics now is that it's the equivalent of high-altitude mountain climbing. It's like the extreme sport version of the computer science world, because we have to be this magical unicorn of a human that can understand plain English problems from the C-suite down and then translate them into code, either solo or as teams of developers. And so there's this black art where we're expected to be able to transmogrify from something that we just say in plain English, I would like to know X, and we have to go and figure it out. So there's this neat extreme sport view I have of rushing down the side of a mountain on a mountain bike and just dodging rocks and trees and things occasionally, because invariably, we do have things that go wrong, and they don't quite give us the answers we want. But I think we're at an interesting point in time now, with the explosion in the types of technology that are at our fingertips and the scale at which we can do things. Once upon a time we would sit at a terminal and write code and just look at data and watch it in columns, and then we ended up with spreadsheet technologies at our fingertips. Nowadays it's quite normal to instantiate a small high-performance distributed cluster of computers, effectively a supercomputer, in a public cloud, and throw some data at it and see what comes back. And we can do that on a credit card. So I think we're at a really interesting tipping point now where this coinage of data science needs to be slightly better defined, so that we can help organizations who have weird and strange questions that they want to ask, tell them solutions to those questions, and deliver on them in, I guess, a commodity deliverable: I want to know xyz, and I want to know it in this time frame, and I want to spend this much money to do it, and I don't really care how you're going to do it. And there are so many tools we can choose from and so many platforms we can choose from; it's this little black art of computing, if you'd like. We're effectively making it up as we go in many ways, so I think it's one of the most exciting challenges that I've had, and I'm pretty sure I speak for most of us in that we're lucky that we get paid to do this amazing job. That we get to make up on a daily basis, in some cases. >> Excellent, well okay. So we'll just get right into it. I'm going to go off script-- >> Do they have unicorns down under? I think they have some strange species, right?
>> Well, we put the pointy bit on the back. You guys have it on the front. >> So I was at an IBM event on Friday. It was a chief data officer summit, and I attended what was called the Data Divas' breakfast. It was a women-in-tech thing, and one of the CDOs said that 25% of chief data officers are women, which is much higher than you would normally see in the profile of IT. We happen to have 25% of our panelists be women. Is that common? Miriam and Jennifer, is that common for the data science field? Or is this a higher percentage than you would normally see-- >> James: Or a lower percentage? >> I think certainly for us, we have hired a number of additional women in the last year, and they are phenomenal data scientists. I mean, I think it's certainly typical that this is still a male-dominated field, but like many male-dominated fields, physics, mathematics, computer science, that is slowly changing and evolving, and I think certainly that's something that we've noticed in our firm over the years at our consultancy, as we're hiring new people. So I don't know if I would say 25% is the right number, but hopefully we can get it closer to 50. Jennifer, I don't know if you have... >> Yeah, so I know at Nielsen we actually have more than 25% of our team being women, at least the team I work with, so there seem to be a lot of women who are going into the field. Which isn't too surprising, because with a lot of the issues that come up in STEM, one of the reasons why a lot of women drop out is because they want real-world jobs and they feel like they want to be in the workforce, and so I think this is a great opportunity, with data science being so popular, for these women to actually have a job where they can still maintain that engineering and science background that they learned in school. >> Great. Well, Hilary Mason, I think, was the first data scientist that I ever interviewed, and I asked her what the sort of skills required are. And the first question that we wanted to ask (I just threw other women in tech in there, 'cause we love women in tech) is about this notion of the unicorn data scientist, right? It's been put forth that the skill sets required to be a data scientist are so numerous that it's virtually impossible to have a data scientist with all those skills. >> And I love Dez's extreme sports analogy, because that plays into the whole notion of data science; we like to talk about the theme now of data science as a team sport. Must it be an extreme sport is what I'm wondering, you know. The unicorns of the world seem to be... Is that realistic now in this new era? >> I mean, when automobiles first came out, they were concerned that there wouldn't be enough chauffeurs to drive all the people around. Is there an analogy with data, to be a data-driven company? Do I need a data scientist, and does that data scientist, you know, need to have this unbelievable mixture of skills? Or are we doomed to always have a skill shortage? Open it up. >> I'd like to have a crack at that. So it's interesting, when automobiles were a thing, when they first brought cars out, and before they were modernized by the likes of Ford's Model T, when we got away from the horse and carriage, they actually had human beings walking down the street with a flag, warning the public that the horseless carriage was coming, and I think data scientists are very much like that.
We're kind of expected to go ahead of the organization and try and take the challenges we're faced with today and see what's going to come around the corner. And so we're like the little flag-bearers, if you'd like, in many ways: this is where we're at today, tell me where I'm going to be tomorrow, and try and predict the day after as well. It is very much becoming a team sport, though. But I think the concept of the data science unicorn has come about because the coinage hasn't been very well defined; you know, if you were to ask 10 people what a data scientist is, you'd get 11 answers, and I think this is a really challenging issue for hiring managers and C-suites when they say: I want data science, I want big data, I want an analyst. They don't actually really know what they're asking for. Generally, if you ask for a database administrator, it's a well-described job spec, and you can just advertise it and some 20 people will turn up and you interview to decide whether you like the look and feel and smell of 'em. When you ask for a data scientist, there are 20 different definitions of what that one data science role could be. So we don't initially know what the job is, we don't know what the deliverable is, and we're still trying to figure that out, so yeah. >> Craig, what about you? >> So from my experience, when we talk about data science, we're really talking about a collection of experiences with multiple people. I've yet to find, at least in my experience, a data science effort with a lone wolf. So you're talking about a combination of skills, and no one individual needs to have everything that makes a data scientist a data scientist, but you definitely have to have the right combination of skills amongst a team in order to accomplish the goals of the data science team. So from my experiences and from the clients that I've worked with, we refer to the data science effort as a data science team. And I believe that's very appropriate to the team sport analogy. >> For us, we look at a data scientist as a full-stack web developer, a jack of all trades; I mean, they need to have a multitude of backgrounds, from programmer to analyst. You can't find one subject matter expert; it's very difficult. And if you're able to find a subject matter expert, you know, through the lifecycle of product development, you're going to require that individual to interact with a number of other members from your team who are analysts, and then you just end up training this person to be, again, a jack of all trades, so it comes full circle. >> I own a business that does nothing but data solutions, and we've been in business 15 years, and the transition over time has been going from being a conventional-wisdom-run company with a bunch of experts at the top to becoming more of a data-driven company using data warehousing and BI, but now the trend is absolutely analytics-driven. So if you're not becoming an analytics-driven company, you are going to be behind the curve very, very soon, and it's interesting that IBM is now coining the phrase of a cognitive business. I think that is absolutely the future. If you're not a cognitive business from a technology perspective, and an analytics-driven perspective, you're going to be left behind, that's for sure.
So in order to stay competitive, you know, you need to really think about data science, think about how you're using your data. And I also see that what's considered the data expert has evolved over time too, where it used to be just someone really good at writing SQL, or someone really good at writing queries in any language, but now it's becoming more of an interdisciplinary action, where you need the soft skills and you also need the hard skills, and that's why I think there are more females in the industry now than ever: because you really need to have a really broad range of experiences that really wasn't required in the past. >> Gregory Piatetsky, you have a comment? >> So there are not too many unicorns in nature or as data scientists, so I think organizations that want to hire data scientists have to look for teams. There are a few unicorns, like Hilary Mason or maybe Usama Fayyad, but they generally tend to start companies, and it's very hard to retain them as data scientists. What I see is another evolution: automation. And, you know, steps like IBM Watson; the platform is eventually a great advance for data scientists in the short term, but probably what's likely to happen in the longer term is more and more of those skills becoming subsumed by the machine learning layer within the software. How long will it take, I don't know, but I have a feeling that the paradise for data scientists may not be very long-lived. >> Greg, I have a follow-up question to what I just heard you say. When a data scientist, let's say a unicorn data scientist, starts a company, as you've phrased it, and the company's product is built on data science, do they give up being a data scientist in the process? It would seem that they become a data scientist of a higher order if they've built a product based on that knowledge. What are your thoughts on that? >> Well, I know a few people like that, so I think maybe they remain data scientists at heart, but they don't really have the time to do the analysis, and they really have to focus more on strategic things. For example, today actually is the birthday of Google, founded 18 years ago, and Larry Page and Sergey Brin wrote a very influential paper back in the '90s about PageRank. Have they remained data scientists? Perhaps a very, very small part, but that's not really what they do. So I think those unicorn data scientists quickly evolve, and you really have to look for teams to capture those skills. >> Clearly they come to a point in their career where they build a company based on teams of data scientists and data engineers and so forth, which relates to the topic of team data science. What is the right division of roles and responsibilities for team data science? >> Before we go, Jennifer, did you have a comment on that? >> Yeah, so I guess I would say, for me, when data science came out and there was, you know, the Venn diagram that came out about all the skills you were supposed to have? I took a very different approach than all of the people who I knew who were going into data science. Most people started interviewing immediately; they were like, this is great, I'm going to get a job. I went and learned how to develop applications, and learned computer science, 'cause I had never taken a computer science course in college, and made sure I trued up that one part where I didn't know these things or have the skills from school. So I went headfirst and just learned it, and now I actually have a lot of technology patents as a result of that.
So to answer Jim's question: I started my company about five years ago. It originally started out as a consulting firm slash data science company, then it evolved, and one of the reasons I went back into industry, and why I'm now at Nielsen, is because you really can't do the same sort of data science work when you're actually doing product development. It's a very, very different sort of world. You know, when you're developing a product, you're developing a core feature or functionality that you're going to offer clients and customers, so you really don't get to have that wide range of sort of looking at 8 million models and testing things out. That flexibility really isn't there as your product starts getting developed. >> Before we go into the team sport, the hard skills that you have: are you all good at math? Are you all computer science types? How about math? Are you all math? >> What were your GPAs? (laughs) >> David: Anybody not math oriented? Anybody not love math? You don't love math? >> I love math, I think it's required. >> David: So math, yes, check. >> You dream in equations, right? You dream. >> Computer science? Do I have to have computer science skills? At least the basic knowledge? >> I don't know that you need to have formal classes in any of these things, but I think certainly, as Jennifer was saying, if you have no skills in programming whatsoever and you have no interest in learning how to write SQL queries or R or Python, you're probably going to struggle a little bit. >> James: It would be a challenge. >> So I think, yes, I have a Ph.D. in physics, I did a lot of math, it's my love language, but I think you don't necessarily need formal training in all of these things. You need to have a curiosity and a love of learning, and however you gain that knowledge... but yeah, if you have no technical interests whatsoever, and don't want to write a line of code, maybe data science is not the field for you. Even if you don't do it every day. >> And statistics as well? You would put that in that same general category? How about data hacking? You've got to love data hacking, is that fair? Yves, you have a comment? >> Yeah, I think so. While we've been discussing that, for me the most important part is that you have a logical mind, the capability to absorb new things, and the curiosity you need to dive into it. While I don't have an education in IT or whatever, I have a background in chemistry, and the things I learned there I apply to information technology as well. And apart from the part where you say, okay, I'm a tech-savvy guy, I'm interested in the tech part of it, you need to speak that business language, and if you can do that crossover and understand what the other skill sets or parts of the roles are telling you, I think the communication aspect is very important. >> I'd like to throw in something really quickly, and I think there's an interesting thing that happens in IT, particularly around technology. We tend to forget that we've actually solved a lot of these problems in the past. If we look at history, around the Second World War and Bletchley Park in the UK, you had a very similar experience as humans to what we're having currently around the whole issue of data science: there was an interesting challenge with the Enigma and the Shark code, right?
And there was a bunch of men put in a room and told, you're mathematicians and you come from universities, and you can crack codes, but they couldn't. And so what they ended up doing was running these ads, putting out challenges; they actually put, I think it was crossword puzzles, in the newspaper, and this deluge of women came out of all kinds of different roles, without math degrees, without science degrees, but they could solve problems, and they were thrown at the challenge of cracking codes, and invariably, they did the heavy lifting on a daily basis, converting messages from one format to another, so that this very small team at the end could actually get in and play with the sexy piece of it. And I think we're going through a similar shift now with what we refer to as data science in the technology and business world, where the people who are doing the heavy lifting aren't necessarily what we'd think of as the traditional data scientists. And so, there have been some unicorns and we've championed them, and they're great. But I think the shift's going to be to accountants, actuaries, and statisticians who understand the business and come from an MBA-style background, who can learn the relevant pieces of math and models that we need to apply to get the data science outcome. I think we've already been here, we've solved this problem; we've just got to learn not to try and reinvent the wheel, 'cause the media hypes this whole thing of data science as exciting and new, but we've been here a couple of times before, and there's a lot to be learned from that, in my view. >> I think we had Joe next. >> Yeah, so I was going to say that data science is a funny thing. To use the word science is kind of a misnomer, because there is definitely a level of art to it, and I like to use the analogy: when Michelangelo would look at a block of marble, everyone else looked at the block of marble and saw a block of marble. He looks at a block of marble and he sees a finished sculpture, and then he figures out, what tools do I need to actually make my vision? And I think data science is a lot like that. We hear a problem, we see the solution, and then we just need the right tools to do it. And I think with consulting, and data science in particular, it's not so much what we know out of the gate, but how quickly we learn. And I think everyone here, what makes them brilliant is how quickly they can learn any tool they need to see their vision get accomplished. >> David: Justin? >> Yeah, I think you make a really great point. For me, I'm a Marine Corps veteran, and the reason I mention that is 'cause I work with two veterans who are problem solvers. And I think that's what data scientists really are; in the long run, they're problem solvers, and you mentioned a great point: yeah, I think just problem solving is the key. You don't have to be a subject matter expert, just be able to take the tools and intelligently use them. >> Now when you look at the whole notion of team data science, what is the right mix of roles, like role definitions, within a high-quality or high-performing data science team? Now IBM, with, of course, our announcement of Project DataWorks and so forth, we're splitting the role division in terms of data scientist versus data engineer versus application developer versus business analyst. Is that the right breakdown of roles?
Or what would the panelists recommend in terms of understanding what kinds of roles make sense within, like I said, a high-performing team that's trying to develop applications that depend on data, machine learning, and so forth? Anybody want to? >> I'll tackle that. So the data science teams that I have created over the years and brought into customer sites have a combination of developer capabilities; some of them are IT developers, but some of them were developers of things other than applications. They designed buildings, they did other things with their technical expertise besides building technology. The other piece besides the developer is the analytics, and analytics can be taught as long as they understand how algorithms work and the code behind the analytics, in other words, how we are analyzing things. And from a data science perspective, we are leveraging technology to do the analyzing through the tool sets, so ultimately, as long as they understand how the tool sets work, we can train them on the tools. Having that analytic background is an important piece. >> Craig, is it easier to, I'll go to you in a moment Joe, is it easier to cross-train a data scientist to be an app developer than to cross-train an app developer to be a data scientist, or does it not matter? >> Yes. (laughs) And not the other way around. It depends on the-- >> It's easier to cross-train a data scientist to be an app developer than-- >> Yes. >> The other way around. Why is that? >> Developing code can be as difficult as the tool set one uses to develop code. Today's tool sets are very user-friendly, whereas with developing code, it is very difficult to teach a person to think along the lines of developing code when they don't have any idea of the aspects of code, of building something. >> I think it was Joe, or you next, or Jennifer, who was it? >> I would say that one of the reasons for that is a data scientist will probably know if the answer's right after you process the data, whereas a data engineer might be able to manipulate the data but may not know if the answer's correct. So I think that is one of the reasons why having a data scientist learn the application development skills might be an easier path than the other way around. >> I think Miriam had a comment? Sorry. >> I think that what we're advising our clients to do is to not think... before data science and before analytics became so required by companies to stay competitive, it was more of a waterfall: you have a data engineer build a solution, you know, then you throw it over the fence and the business analyst would have at it. Now, it must be agile, and you must have a scrum team where you have the data scientist and the data engineer and the project manager and the product owner and someone from the chief data office all at the table at the same time, all accomplishing the same goal. Because all of these skills are required, collectively, in order to solve this problem, and it can't be done daisy-chained anymore; it has to be a collaboration. And that's why I think Spark is so awesome, because, you know, Spark is a single interface that a data engineer can use, a data analyst can use, and a data scientist can use. And now with what we've learned today, having a data catalog on top so that the chief data office can actually manage it, I think is really going to take Spark to the next level.
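That "single interface" point about Spark can be pictured in a few lines of PySpark. A sketch only: the file and column names are invented for illustration, and the point is simply that the same DataFrame is reachable from engineering code, modeling code, and plain SQL.

```python
# A sketch of Spark as a single shared interface: the same DataFrame
# serves the data engineer (ingest), the data scientist (features),
# and the analyst (SQL). File and column names are invented.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("shared-interface").getOrCreate()

# Data engineer: ingest and clean.
df = spark.read.csv("sales.csv", header=True, inferSchema=True)
clean = df.dropna(subset=["region", "amount"])

# Data scientist: build aggregate features with the same API.
features = clean.groupBy("region").agg(F.avg("amount").alias("avg_amount"))
features.show()

# Business analyst: query the very same data in SQL.
clean.createOrReplaceTempView("sales")
spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region").show()
```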
>> James: Miriam? >> I wanted to comment on your question to Craig about whether it's harder to teach a data scientist to build an application or vice versa. One of the things that we have worked on a lot in our data science team is incorporating best practices from software development: agile, scrum, that sort of thing, particularly with a focus on deploying models. We don't just want to build an interesting data science model, we want to deploy it and get some value. You need to really incorporate these processes from someone who might know how to build applications, and that, I think, can be a challenge for some data scientists, because one of the fun things about data science is you get to get into the data, and you get your hands dirty, and you build a model, and you get to try all these cool things. But then, when the time comes for you to actually deploy something, you need deployment-grade code in order to make sure it can go into production at your client's side and be useful, for instance. So I think there's an interesting challenge on both ends, but one of the things I've definitely noticed with some of our data scientists is it's very hard to get them to think in that mindset, which is why you have a team of people, because everyone has different skills and you can mitigate that. >> Dev-ops for data science? >> Yeah, exactly. We call it insight ops, but yeah, I hear what you're saying. Data science is becoming increasingly an operational function, as opposed to strictly exploratory or developmental. Did someone else have a... Dez? >> One of the things I like to do when someone gives me a new problem is take all the laptops and phones away. And we just end up in a room with a whiteboard. And developers find that challenging sometimes, so I had this one line where I said to them: don't write the first line of code until you actually understand the problem you're trying to solve, right? And I think where the data science focus has changed the game for organizations, who are trying to get some systematic, repeatable process that they can throw data at and just keep getting answers, no matter what the industry might be, is that developers will come with a particular mindset on how they're going to codify something, without necessarily getting the full spectrum and understanding the problem in the first place. What I'm finding is that the people who come at data science tend to have more of a hacker ethic. They want to hack the problem, they want to understand the challenge, and they want to be able to get it down to plain English, simple phrases, and then apply some algorithms, build models, and then codify it. And so most of the time we sit in a room with whiteboard markers, just trying to build a model in a graphical sense and make sure it's going to work and that it's going to flow. And once we can do that, we can codify it. I think when you come at it from the other angle, from the developer ethic, and you're like, I'm just going to codify this from day one, I'm going to write code, I'm going to hack this thing out and it's just going to run and compile, often you don't truly understand what they're trying to get to at the end point, and you can just spend days writing code. And I think someone made the comment that sometimes you don't actually know whether the output is actually accurate in the first place. So I think there's a lot of value being provided from the data science practice
around understanding the problem in plain English at a team level: what am I trying to do from the business consulting point of view? What are the requirements? How do I build this model? How do I test the model? How do I run a sample set through it? Train the thing and then make sure what I'm going to codify actually makes sense in the first place, because otherwise, what are you trying to solve in the first place? >> Wasn't it Einstein who said, if I had an hour to solve a problem, I'd spend 55 minutes understanding the problem and five minutes on the solution, right? It's exactly what you're talking about. >> Well, I will say, getting back to the question, the thing with building these teams that I think a lot of times people don't talk about is that engineers are actually very, very important for data science projects and data science problems. For instance, if you were just trying to prototype something or just come up with a model, then data science teams are great; however, if you need to actually put that into production, the code that the data scientist has written may not be optimal, so as we scale out, it may actually be very inefficient. At that point, you kind of want an engineer to step in and actually optimize that code. So I think it depends on what you're building, and that kind of dictates what kind of division you want among your teammates, but I do think that a lot of times the engineering component is really undervalued out there. >> Jennifer, it seems that the data engineering function, data discovery and preparation and so forth, is becoming automated to a greater degree, but if I'm listening to you, I don't hear that data engineering as a discipline is becoming extinct in terms of a role that people can be hired into. You're saying that there's a strong ongoing need for data engineers to optimize the entire pipeline to deliver the fruits of data science in production applications, is that correct? So they play that very much operational role as the backbone for... >> So I think a lot of times businesses will go to data scientists to build a better model, to build a predictive model, but that model may not be something that you really want to implement out there when there are, like, a million users coming to your website, 'cause it may not be efficient, it may take a very long time. So I think in that sense it is important to have good engineers; your whole product may fail, you may build the best model, it may have the best output, but if you can't actually implement it, then really, what good is it? >> What about calibrating these models? How do you go about doing that and sort of testing that in the real world? Has that changed over time? Or is it... >> So one of the things that I think can happen, and that we found with one of our clients, is when you build a model, you do it with the data that you have, and you try to use a very robust cross-validation process to make sure that it's robust and sturdy. But one thing that can sometimes happen is that after you put your model into production, there can be external factors, societal or whatever, things that have nothing to do with the data that you have, or the quality of the data, or the quality of the model, which can actually erode the model's performance over time. So as an example, think about cell phone contracts, right?
Those have changed a lot over the years, so maybe five years ago the type of data plan you had might not be the same as it is today, because a totally different type of plan is offered. So if you're building a model on that to, say, predict who's going to leave and go to a different cell phone carrier, the validity of your model over time is going to completely degrade, based on nothing that you put into the model or the data that was available. So I think you need to have this sort of model management and monitoring process to take these factors into account and then know when it's time to do a refresh. >> Cross-validation, even at one point in time: for example, there was an article in the New York Times recently where they gave the same data set to five different data scientists, this is survey data for the presidential election that's upcoming, and the five different data scientists came to five different predictions. They were all high-quality data scientists; the cross-validation showed a wide variation about who was on top, whether it was Hillary or whether it was Trump, so that shows you that even at any point in time, cross-validation is essential to understand how robust the predictions might be. Does somebody else have a comment? Joe? >> I just want to say that this drives home the point of having the scrum team for each project, with the data engineer and the data scientist working side by side, because it is important that we assume whatever we're building will eventually go into production. In the data warehousing world, you'd get the data out of the systems, out of your applications, you'd do analysis on your data, and the nirvana was maybe that data would go back to the system, but typically it didn't. Nowadays, the applications are dependent on the insight coming from the data science team, with the behavior of the application and the personalization and individual experience for a customer highly dependent on it. So it has to be; you asked, is data science part of the dev-ops team? Absolutely, now it has to be. >> Whose job is it to figure out the way in which the data is presented to the business? Where's the sort of presentation, the visualization plan? Is that the data scientist's role? Does that depend on whether or not you have that gene? Do you need a UI person on your team? Where does that fit? >> Wow, good question. >> Well, usually that's the output. I mean, once you get to the point where you're visualizing the data, you've created an algorithm or some sort of code that produces that to be visualized, so at the end of the day the customers can see what all the fuss is about from a data science perspective. But it's usually after the data science component. >> So do you run into situations where you can see it and it's blatantly obvious, but it doesn't necessarily translate to the business? >> Well, there's an interesting challenge with data, and we throw the word data around a lot, and I've got this fun line I like throwing out there: if you torture data long enough, it will talk. So the challenge then is to figure out when to stop torturing it, right? And it's the same with models. And so I think in many other parts of organizations, we'll take something, if someone's doing a financial report on performance of the organization and they're doing it in a spreadsheet, they'll get two or three peers to review it, and validate that they've come up with a working model and the answer actually makes sense.
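The two validation ideas raised just above, cross-validating a model at a point in time and watching its validity as newer data arrives, can be sketched in a few lines. This assumes scikit-learn as the tool (the panel names none) and uses synthetic stand-in data.

```python
# A minimal sketch of the two checks discussed above: k-fold
# cross-validation for robustness at one point in time, and
# time-ordered splits to watch performance as newer data arrives.
# scikit-learn is an assumed tool choice; the data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                          # stand-in features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)   # stand-in labels

model = LogisticRegression()

# How robust is the model right now?
scores = cross_val_score(model, X, y, cv=5)
print("5-fold accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

# Always train on the past, test on the future: a crude stand-in for
# the model monitoring (and refresh trigger) described above.
for train_idx, test_idx in TimeSeriesSplit(n_splits=4).split(X):
    model.fit(X[train_idx], y[train_idx])
    print("window accuracy: %.3f" % model.score(X[test_idx], y[test_idx]))
```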
And I think we're rushing so quickly at doing analysis on data that comes to us in various formats and at high velocity that it's very important for us to actually stop and do peer reviews of the models and the data and the output as well, because otherwise we start making decisions very quickly about things that may or may not be true. It's very easy to get the data to paint any picture you want, and you gave the example of the five different attempts at that thing. And I have this shoot-out thing as well, where I'll take a team and get two different people to do exactly the same thing in completely different rooms, and come back and challenge each other, and it's quite amazing to see the looks on their faces when they're like, oh, I didn't see that, and then go back and do it again, and just keep iterating until we get to the point where they both get the same outcome. In fact, there's a really interesting anecdote about when the UNIX operating system was being written: a couple of the authors went away and wrote the same program without realizing the other was doing it, and when they came back, they actually had, line for line, the same piece of C code, 'cause they'd actually gotten to a truth, a perfect version of that program. And I think we need to often look at, when we're building models and playing with data, if we can't come at it from different angles and get the same answer, then maybe the answer isn't quite true yet, so there's a lot of risk in that. And it's the same with presentation; you know, you can paint any picture you want with the dashboard, but who's actually validating whether the dashboard's painting the correct picture? >> James: Go ahead, please. >> There is a science, actually, behind data visualization: you know, if you're doing trending, it's a line graph; if you're doing comparative analysis, it's a bar graph; if you're doing percentages, it's a pie chart. Like, there is a certain science to it; it's not as much of a mystery as the novice thinks. But what makes it challenging is that, just like any presentation, you have to consider your audience. Whenever we're delivering a solution, either insight or just data in a grid, we really have to consider who is the consumer of this data, and actually cater the visual to that person or to that particular audience. And that is part of the art, and that is what makes a great data scientist.
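Those chart-type rules (trending is a line graph, comparative analysis is a bar graph, percentages are a pie chart) look roughly like this in code. A minimal matplotlib sketch; the numbers and labels are invented for illustration.

```python
# A minimal sketch of the chart-type rules described above:
# trend -> line, comparison -> bar, share of a whole -> pie.
# Data and labels are invented for illustration.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 128, 150]          # a trend over time
regions = ["North", "South", "East", "West"]
by_region = [40, 25, 20, 15]          # a comparison, and shares of 100

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))
ax1.plot(months, sales, marker="o")
ax1.set_title("Trend: monthly sales")
ax2.bar(regions, by_region)
ax2.set_title("Comparison: sales by region")
ax3.pie(by_region, labels=regions, autopct="%1.0f%%")
ax3.set_title("Share: percent of total")
plt.tight_layout()
plt.show()
```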
>> James: Go ahead, please. >> There is a science, actually, behind data visualization. You know, if you're doing trending, it's a line graph; if you're doing comparative analysis, it's a bar graph; if you're doing percentages, it's a pie chart. There is a certain science to it, it's not as much of a mystery as the novice thinks. But what makes it challenging is that, just like any presentation, you have to consider your audience. And your audience, whenever we're delivering a solution, either insight or just data in a grid, we really have to consider who the consumer of this data is, and actually cater the visual to that person or to that particular audience. And that is part of the art, and that is what makes a great data scientist. >> The consumer may in fact be the source of the data itself, like in a mobile app, so you're tuning their visualization, and then their behavior is changing as a result, and then the data on their changed behavior comes back, so it can be a circular process. >> So Jim, at a recent conference, you were tweeting about the citizen data scientist, and you got emasculated by-- >> I spoke there too. >> Okay. >> TWI on that same topic, I got-- >> Kirk Borne I hear came after you. >> Kirk meant-- >> Called foul, flag on the play. >> Kirk meant well. I love Claudia Imhoff too, but yeah, it's a controversial topic. >> So I wonder what our panel thinks of that notion, citizen data scientist. >> Can I respond about citizen data scientists? >> David: Yeah, please. >> I think this term was introduced by a Gartner analyst in 2015, and I think it's a very dangerous and misleading term. I think definitely we want to democratize the data and give access to more people, not just data scientists, but managers, BI analysts. But when there is already a term for such people, we can call them business analysts, because it implies some training, some understanding of the data. If you use the term citizen data scientist, it implies that without any training you take some data and you find something there, and as Dez mentioned, we've seen many examples; it's very easy to find completely spurious, random correlations in data. So we don't want citizen dentists to treat our teeth or citizen pilots to fly planes, and if data's important, having citizen data scientists is equally dangerous. So I'm hoping that, and I think actually Gartner did not use the term citizen data scientist in their 2016 hype cycle, so hopefully they will put this term to rest.
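That warning about spurious correlations is easy to reproduce. A minimal sketch using nothing but NumPy on pure noise (the sample and feature counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 candidate "signals" and one target: all pure noise, 50 observations.
n_obs, n_features = 50, 200
features = rng.normal(size=(n_features, n_obs))
target = rng.normal(size=n_obs)

# Correlate every feature with the target and keep the most impressive one.
corrs = np.array([np.corrcoef(f, target)[0, 1] for f in features])
best = np.abs(corrs).argmax()
print(f"best 'predictor': feature {best}, r = {corrs[best]:+.2f}")
```

With 50 observations and 200 random features, the winning correlation typically lands around 0.4 to 0.5, strong enough to look like a finding unless you ask how many candidates were screened.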
>> So Gregory, you apparently are defining citizen to mean incompetent, as opposed to simply self-starting. >> Well, self-starting is very different, but that's not what I think the intention was. I think what we see in terms of data democratization is a big trend toward automation. There are many tools, for example there are many companies like DataRobot, and probably IBM, that have interesting machine learning capabilities around automation, so I recently started a page on KDnuggets for automated data science solutions, and there are already 20 different offerings that provide different levels of automation. So maybe full automation can deliver some of that expertise, but it's very dangerous to have a partly automated tool and at some point ask citizen data scientists to take the wheel. >> I want to chime in on that. >> David: Yeah, pile on. >> I totally agree with all of that. I think the comment I just want to quickly put out there is that the space we're in is a very young and rapidly changing world, and so what we haven't had yet is the time to stop, take a deep breath, and actually define ourselves. If you look at computer science in general, a lot of the traditional roles have had 10 or 20 years of history, and so through the hiring process and the development of those spaces, we've actually had time to breathe and define what those jobs are, so we know what a systems programmer is, and we know what a database administrator is. But we haven't yet had a chance as a community to stop, breathe, and say, well, what do we think these roles are? And so to fill that void, the media creates coinages, and I think this is the risk we've got now: the concept of a citizen data scientist was just a term coined to fill a void, because no one quite knew what to call somebody who didn't come from a data science background if they were tinkering around data science. And I think that's something we need to sit up and pay attention to, because if we don't own that and drive it ourselves, then somebody else will fill the void, and they'll create these very frustrating concepts like citizen data scientist, which drive us all crazy. >> James: Miriam's next. >> So I wanted to comment, and I agree with both of the previous comments, but in terms of a citizen data scientist, whether you're a citizen data scientist or an actual data scientist, whatever that means, I think one of the most important things you can have is a sense of skepticism, right? Because you can get spurious correlations and it's like, wow, my predictive model is so excellent, you know? And being aware of things like leaks from the future, right? This actually isn't predictive at all, it's a result of the thing I'm trying to predict. And so one thing I know we try to do is, if something really looks too good, we go back in and make sure: did we not look at the data correctly? Is something missing? Did we have a problem with the ETL? So I think a healthy sense of skepticism is important, to make sure you're not taking a spurious correlation and trying to derive some significant meaning from it.
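One way to make that skepticism routine is a cheap leakage screen before modeling. The sketch below is illustrative rather than any panelist's actual tooling: the column names are invented, and the 0.95 cutoff is an arbitrary starting point; anything correlating near-perfectly with the label deserves a question about when that field is actually populated.

```python
import pandas as pd

def leakage_screen(df: pd.DataFrame, label: str, threshold: float = 0.95):
    """Flag features suspiciously correlated with the label.

    A near-perfect correlation often means the feature is recorded after
    the outcome (a leak from the future), not a genuine signal.
    """
    corr = df.corr(numeric_only=True)[label].drop(label)
    suspects = corr[corr.abs() >= threshold]
    for name, r in suspects.items():
        print(f"possible leak: {name} (r = {r:+.3f})")
    return list(suspects.index)

# Toy example: refund_issued is only ever set after a customer churns,
# so it "predicts" churn perfectly and would be useless in production.
df = pd.DataFrame({
    "monthly_spend": [20, 35, 50, 10, 42, 18],
    "churned":       [1, 0, 0, 1, 0, 1],
    "refund_issued": [1, 0, 0, 1, 0, 1],  # mirrors the label: a leak
})
leakage_screen(df, "churned")
```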
>> I think there's a Dilbert cartoon I saw that described that very well. Joe, did you have a comment? >> I think that in order for citizen data scientists to really exist, we need to have more maturity in the tools they would use. My vision is that the BI tools of today are all going to be replaced with natural language processing and search. You know, just being able to open up a search bar and say, give me sales by region, and to take that one step further into the future, it should actually be able to answer, what are my sales going to be next year, and trigger a simple linear regression, or be able to say which features of the televisions are actually affecting sales, and run a clustering algorithm. I think, hopefully, that will be the future, but I don't see anything like that today, and in order to have a true citizen data scientist, you would need to have that, and that is pretty sophisticated stuff. >> I think for me, the idea of the citizen data scientist, I can relate to that. For instance, when I was in graduate school, I started doing some research on FDA data. It was an open data set of about 4.2 million data points. Technically, when I graduated the paper was still not published, and so in some sense you could think of me as a citizen data scientist, right? I wasn't getting funding, I wasn't doing it for school, but I was still continuing my research. So I'd like to hope that with all the new data sources out there, there might be scientists, or people who were maybe kept out of a field, people who wanted to be in STEM and for whatever life circumstance couldn't be in it, who might be encouraged to actually go and look into the data, and maybe build better models or validate information that's out there. >> So Justin, I'm sorry, you had one comment? >> It seems data science was termed before academia adopted formalized training for it. But yeah, like Dez said, you can make data work for whatever problem you're trying to solve; whatever answer you want to see, you can make the data work around it, you can make it happen. And I kind of consider that like scope creep in project management, call it data creep: you're so hyper-focused on a solution, so intent on finding the answer, that you create an answer that works for that solution, but it may not be the correct answer, and I think the crossover discussion works well for that case. >> So but the term comes up 'cause there's a frustration, I guess, right? That data science skills are not plentiful, and it's potentially a bottleneck in an organization. Supposedly 80% of your time is spent on cleaning data, is that right? Is that fair? So there's a problem. How much of that can be automated, and when? >> I'll have a shot at that. So I think there's a shift that's going to come about where we move from centralized data sets to data at the edge of the network, and this is something that's happening very quickly now, where we can't just haul everything back to a central spot, when the internet of things actually wakes up. Things like the Boeing 787 Dreamliner: that thing's got 6,000 sensors in it and produces half a terabyte of data per flight, and there are 87,400 flights per day in domestic airspace in the U.S. That's 43.5 petabytes of raw data, which is about three years' worth of disk manufacturing in total, right? We're never going to copy all that across to one place, we can't process it, so I think the challenge we've got ahead of us is looking at how we move the intelligence and the analytics to the edge of the network and pre-cook the data in different tiers: have a look at the raw material we get, boil it down to a slightly smaller data set, bring a metadata version of that back, and eventually get to the point where we've only got the very minimum data set and data points we need to make key decisions. Without that, we're already at the point where we have too much data and we can't munch it fast enough, and we can't spin up enough tin even if we switch the cloud on, and it's just this never ending deluge of noise, right? And you've got that signal versus noise problem. So we're now seeing a shift where people are looking at how we move the intelligence back to the edge of the network, which we actually solved some time ago in the security space. You know, spam filtering: if an email hits Google on the west coast of the U.S., they create a checksum for that spam email, and it immediately goes into a database, and the same email never gets through on the opposite coast, because they already know it's spam. They recognize that email coming in, that's evil, stop it. So we've already fixed this in security with intrusion detection, we've fixed it in spam, so we now need to take that learning and bring it into business analytics, if you like, and see where we're finding patterns and behavior, and push that out to the edge of the network. So if I'm seeing demand over here for tickets when a new show goes on sale, I need to be able to see where else I'm going to see that demand, and start responding to it before the demand comes about. I think that's a shift we're going to see quickly, because we'll never keep up with the data munching challenge, and the volume's just going to explode.
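The spam story describes a general edge pattern: pay for the expensive analysis once, then share only a fingerprint so every other node can reject the same payload without reprocessing it. Here is a minimal sketch of that idea in Python, with a toy classifier and an in-memory set standing in for a replicated fingerprint store; for scale, the figures above work out to roughly 87,400 flights times half a terabyte, in the ballpark of the 43.5 petabytes a day quoted.

```python
import hashlib

# Fingerprints of payloads some edge node has already classified as bad.
# In practice this would be a store replicated across regions, not a set.
known_bad: set[str] = set()

def fingerprint(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()

def classify_expensively(payload: bytes) -> bool:
    """Stand-in for the costly analysis (model scoring, content scan)."""
    return b"evil" in payload  # toy rule, purely for the sketch

def handle(payload: bytes, node: str) -> None:
    fp = fingerprint(payload)
    if fp in known_bad:
        print(f"{node}: rejected by fingerprint, no re-analysis needed")
        return
    if classify_expensively(payload):
        known_bad.add(fp)  # one node pays the cost, every node benefits
        print(f"{node}: classified bad, fingerprint shared")
    else:
        print(f"{node}: passed through")

# The west coast sees the spam first; the east coast never re-analyzes it.
handle(b"buy now, evil offer", node="west-coast")
handle(b"buy now, evil offer", node="east-coast")
```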
>> David: We just have a couple minutes. >> That does sound like a great topic for a future Cube panel, which is data science on the edge of the fog. >> I've got a hundred questions around that. So we're wrapping up here, just got a couple minutes. Final thoughts on this conversation, or any other pieces that you want to punctuate? >> I think one thing that's been really interesting for me, being on this panel, is hearing all of my co-panelists talking about common themes and things that we are also experiencing, which isn't a surprise, but it's interesting to hear how ubiquitous some of the challenges are. And also, at the announcement earlier today, some of the things they're talking about and thinking about, we're also talking about and thinking about. So I think it's great to hear that we're all in different countries and different places, but we're experiencing a lot of the same challenges, and that's been really interesting for me to hear about. >> David: Great, anybody else, final thoughts? >> To echo Dez's thoughts, we're never going to catch up with the amount of data that's produced, so it's about transforming big data into smart data. >> I could just say that with the shift from normal data, small data, to big data, the answer is automate, automate, automate. We've been talking about advanced algorithms and machine learning for the science, for changing the business, but there also needs to be machine learning and advanced algorithms for the back room, where we're actually getting smarter about how we ingest and how we fix data as it comes in. Because we can actually train the machines to understand data anomalies and what we want to do with them over time, and I think the further upstream we get with data correction, the less work there will be downstream. I also think the concept of being able to fix data at the source is gone; that's behind us. Right now, the data that we're using to analyze and to change the business is typically data we have no control over. Like Dez said, it's coming from sensors and machines and the internet of things, and if it's wrong, it's always going to be wrong, so we have to figure out how to fix it in our own laboratory. >> Eaves, final thoughts? >> I think it's a mind shift, being a data scientist. If you look back at the time, why did you start developing or writing code? Because you liked to code, whatever, just for the sake of building a nice algorithm or a piece of software. And now, I think, with the spirit of a data scientist, you're looking at a problem and saying, this is where I want to go, so you have more of a top-down approach than a bottom-up approach. And you have to have the big picture, and that is what you really need as a data scientist: look across technologies, look across departments, look across everything, and then on top of that, apply as many of the skills as you have available. That's the kind of unicorn they're trying to look for, because it's pretty hard to find people with that wide a vision on everything that's happening within the company. You need to be aware of technology, you need to be aware of how a business is run and how it fits within a cultural environment, and you have to work with people, and all those things together, to my belief, make it very difficult to find those good data scientists. >> Jim? Your final thought? >> My final thought is that this is an awesome panel, and I'm so glad that you've come to New York, and I'm hoping that you all stay, of course, for the IBM DataFirst launch event that will take place this evening about a block over at Hudson Mercantile, so that's pretty much it. Thank you, I really learned a lot. >> I want to second Jim's thanks. Really, great panel, awesome expertise, really appreciate you taking the time, and thanks to the folks at IBM for putting this together. >> And I'm a big fan of most of you, all of you, on this session here, so it's great just to meet you in person. Thank you. >> Okay, and I want to thank Jeff Frick for being a human curtain there with the sun setting here in New York City. Well, thanks very much for watching. We are going to be across the street at the IBM announcement, we're going to be on the ground, and we open up again tomorrow at 9:30 at Big Data NYC, Big Data Week, Strata plus Hadoop World. Thanks for watching, everybody, that's a wrap from here. This is the Cube, we're out. (techno music)