Daniel G Hernandez & Scott Buckles, IBM | IBM Data and AI Forum

>> Narrator: Live from Miami, Florida, it's The Cube. Covering IBM's Data in AI Forum, brought to you by IBM. >> Welcome back to Miami, everybody. You're watching The Cube, the leader in live tech coverage. We're here covering the IBM Data and AI Forum. Scott Buckles is here to my right. He's the business unit executive at IBM and long time Cube alum, Daniel Hernandez is the Vice President of Data and AI group. Good to see you guys, thanks for coming on. >> Thanks for having us. >> Good to see you. >> You're very welcome. We're going to talk about data ops, kind of accelerating the journey to AI around data ops, but what is data ops and how does it fit into AI? Daniel, we'll start with you. >> There's no AI without data. You've got data science to help you build AI. You've got dev ops to help you build apps. You've got nothing to basically help you prepare data for AI. Data ops is the equivalent of dev ops, but for delivering AI ready data. >> So, how are you, Scott, dealing with this topic with customers, is it resonating? Are they leaning into it, or are they saying, "what?" >> No, it's absolutely resonating. We have a lot of customers that are doing a lot of good things on the data science side. But, trying to get the right data at the right people, and do it fast, is a huge problem. They're finding they're spending too much time prepping data, getting the data into the models, and they're not spending enough time failing fast with some of those models, or getting the models that they need to put in production into production fast enough. So, this absolutely resonates with them because I think it's been confusing for a long time. >> So, AI's scary to a lot of people, right? It's a complicated situation, right? And how do you make it less scary? >> Talk about problems that can be solved with it, basically. You want a better customer experience in your contact center, you want a similarly amazing experience when they're interacting with you on the web. How do you do that? AI is simply a way to get it done, and a way to get it done exceptionally well. So, that's how I like to talk about it. I don't start with here's AI, tell me what problems you can solve. Here are the problems you've got, and where appropriate, here's where AI can help. >> So what are some of your favorite problems that you guys are solving with customers. >> Customer and employee care, which, basically, is any business that does business has customers. Customer and employee care are huge a problem space. Catching bad people, financial crimes investigation is a huge one. Fraud, KYC AML as an example. >> National security, things like that, right? >> Yeah. >> You spend all your time with customers, what else? >> Well, customer experience is probably the one that we're seeing the most. The other is being more efficient. Helping businesses solve those problems quicker, faster. Try to find new avenues for revenue. How to cut costs out of their organization, out of their run time. Those are the ones that we see the most. >> So when you say customer experience, immediately chat bots jumps into my head. But I know we're talking more than, sort of a, transcends chat bots, but double click on customer experience, how are people applying machine intelligence to improve customer experience? >> Well, when I think of it, I think about if you call in to Delta, and you have one bad experience, or your airline, whatever that airline may be, that that customer experience could lead to losing that customer forever, and there used to be an old adage that you have one bad experience and you tell 10 people about it, you have a good one, and you tell one person, or two peoples. So, getting the right data to have that experience is where it becomes a challenge and we've seen instances where customers, or excuse me, organizations are literally trying to find the data on the screen while the customer is on hold. So, they're saying, "can I put you on hold?" and they're trying to go out and find it. So, being able to automate finding that data, getting it in the right hands, to the right people, at the right time, in moment's notice, is a great opportunity for AI and machine learning, and that's an example of how we do it. >> So, from a technical standpoint, Daniel, you guys have this IBM Cloud Pak for Data that's going to magic data virtualization thing. Let's take an example that Scott just gave us, think of an airline. I love my mobile app, I can do everything on my mobile app, except there are certain things I can't do, I have to go to the website. There are certain things I have to do with e-commerce that I have to go to the website that I can't do. Sometimes watching a movie, I can't order a movie from the app, I have to go to website, the URL, and order it there and put it on my watch list. So, I presume that there's some technical debt in each of those platforms, and there's no way to get the data from here, and the data from here talking to each other. Is that the kind of problem that you're solving? >> Yes, and in this particular case, you're actually touching on what we mean by customer and employee care everywhere. The interaction you have on your phone should be the same as the interaction and the kind of response on the web, which should be the same, if not better, when you're talking to a human being. How do you have the exceptional customer and employee care, all channels. Today, say the art is, I've got a specific experience for my phone, a specific experience for my website, a specific, different experience in my contact center. The whole work we're doing around Watson Assistant, and it as a virtual assistant, is to be that nervous system that underpins all channels, and with Cloud Pak for Data, we can deliver it anywhere. You want to run your contact center on an IBM Cloud? Great. You want to run it on Amazon, Azure, Google, your own private center, or everything in between, great. Cloud Pak for Data is how you get Watson Assistant, the rest of Watson and our data stack anywhere you want, so you can deliver that same consistent, amazing experience, all channels, anywhere. >> And I know the tone of my question was somewhat negative, but I'm actually optimistic, and there's a couple examples I'll give. I remember Bill Belichick one time said, "Agh, the weather, it can't ever get the weather right," this is probably five, six years ago. Actually, they do pretty well with the weather compared to 10 or 15 years ago. The other is fraud detection. In the last 10 years, fraud detection has become so much better in terms of just the time it takes to identify a fraud, and the number of false positives. Even in the last, I'd say, 12 to 18 months, false positives are way down. I think that's machine intelligence, right? >> I mean, if you're using business rules, they're not way down. They're still way up. If you're using more sophisticated techniques, that are depending upon the operational data to be trained, then they should be way down. But, there is still a lot of these systems that are based on old school business rules that can't keep up. They're producing alerts that, in many cases, are ignored, and because they're ignored, you're susceptible to bad issues. With, especially AI based techniques for fraud detection, you better have good data to train this stuff, which gets back to the whole data ops thing, and training those with good data, which data ops can help you get done. >> And a key part to data ops is the people and the process. It's not just about automating things and automating the data to get it in the right place. You have to modernize those business processes and have the right skills to be able to do that as well. Otherwise, you're not going to make the progress. You're not going to reap the benefits. >> Well, that was actually my next question. What about the people and the process? We were talking before, off camera, about our PA, and he's saying "pave the cow path." But sometimes you actually have to re-engineer the process and you might not have the skill set. So it's people and process, and then technology you lay in. And we've always talked about this, technology is always going to change. Smart technologists will figure it out. But, the people and the process, that's the hardest part. What are you seeing in the field? >> We see a lot of customers struggling with the people and process side, for a variety of reasons. The technology seems to be the focus, but when we talk to customers, we spend a lot of time saying, "well, what needs to change in your business process "when this happens? "How do those business rules need to change "so you don't get those false positives?" Because it doesn't matter at the end of the day. >> So, can we go back to the business rules thing? So, it sounds like the business rules are sort of an outdated, policy based, rigid sort of structure that's enforced no matter what. Versus machine intelligence, which can interpret situations on the fly, but can you add some color to that and explain the difference between what you call sort of business rules based versus AI based. >> So the AI based ones, in this particular case, probably classic statistical machine learning techniques, to do something like know who I am, right? My name is Danny Hernandez, if you were to Google Danny Hernandez, the number one search result is going to be a rapper. There is a rapper that actually just recently came out, he's not even that good, but he's a new one. A statistical machine learning technique would be able to say, "all right, given Daniel "and the context information I know about him, "when I look for Daniel Hernandez, "and I supplement the identity with that "contextual information, it means it's one of "the six that work at IBM." Right? >> Not the rapper. >> Not the rapper. >> Not the rapper. >> Exactly. I don't mind being matched with a rapper, but match me with a good rapper. >> All you've got to do is search Daniel Hernandez and The Cube and you'll find him. >> Ha, right. Bingo. Actually that's true. So, in any case, the AI based techniques basically allow you to isolate who I am, based on more features that you know about me, so that you get me right. Because if you can't even start there, with whom are you transacting, you're not going to have any hope of detecting fraud. Either that, or you're going to get false positives because you're going to associate me with someone that I'm not, and then it's just going to make me upset, because when you should be transacting with me, you're not because you're saying I'm someone I'm not. >> So, that ties back to what we were saying before, know you're customer and anti money laundering. Which, of course, was big, and still is, during the crypto craze. Maybe crypto is not as crazy, but that was a big deal when you had bitcoin at whatever it was. What are some practical applications for KYC AML that you're seeing in the field today? >> I think that what we see a lot of, what we're applying in my business is automating the discovery of data and learning about the lineage of that data. Where did it come from? This was a problem that was really hard to solve 18 months ago, because it took a lot of man power to do it. And as soon as you did it once, it was outdated. So, we've recently released some capabilities within Watson Knowledge Catalog that really help automate that, so that as the data continues to grow, and continues to change, as it always does, that rather than having two, three hundred business analysts or data stewards trying to go figure that out, machine learning can go do that for you. >> So, all the big banks are glomming on to this? >> Absolutely. >> So think about any customer onboarding, right? You better know who your customer is, and you better have provisions around anti money laundering. Otherwise, there's going to be some very serious downside risk. It's just one example of many, for sure. >> Let's talk about some of the data challenges because we talked a lot about digital, digital business, I've always said the difference between a business and a digital business is how they use data. So, what are some of the challenging issues that customers are facing, and particularly, incumbents, Ginni Rometty used the term a couple of events ago, and it might have even been World of Watson, incumbent disruptors, maybe that was the first think, which I thought was a very poignant term. So, what are some of the data challenges that these incumbents are facing, and how is IMB helping solve them? >> For us, one of them that we see is just understanding where their data is. There is a lot of dark data out there that they haven't discovered yet. And what impact is that having on their analytics, what opportunities aren't they taking advantage of, and what risks are they being exposed to by that being out there. Unstructured data is another big part of it as well. Structured data is sort of the easy answer to solving the data problem, >> [Daniel Hernandez] But still hard. >> But still hard. Unstructured data is something that almost feels like an afterthought a lot of times. But, the opportunities and risks there are equally, if not greater, to your business. >> So yeah, what you're saying it's an afterthought, because a lot of times people are saying, "that's too hard." >> Scott Buckles: Right. >> Forget it. >> Scott Buckles: Right. Right. Absolutely. >> Because there's gold in them there hills, right? >> Scott Buckles: Yeah, absolutely. >> So, how does IBM help solve that problem? Is it tooling, is it discovery tooling? >> Well, yeah, so we recently released a product called InstaScan, that helps you to go discover unstructured data within any cloud environment. So, that was released a couple months ago, that's a huge opportunity that we see where customers can actually go and discover that dark data, discover those risks. And then combine that with some of the capabilities that we do with structured data too, so you have a holistic view of where your data is, and start tying that together. >> If I could add, any company that has any operating history is going to have a pretty complex data environment. Any company that wants to employ AI has a fundamental choice. Either I bring my AI to the data, or I bring my data to the AI. Our competition demand that you bring your data to the AI, which is expensive, hard, often impossible. So, if you have any desire to employ this stuff, you had better take the I'm going to bring my AI to the data approach, or be prepared to deal with a multi-year deployment for this stuff. So, that principle difference in how we think about the problem, means that we can help our customers apply AI to problem sets that they otherwise couldn't because they would have to move. And in many cases, they're just abandoning projects all together because of that. >> So, now we're starting to get into sort of data strategy. So, let's talk about data strategy. So, it starts with, I guess, understanding the value of your data. >> [Daniel Hernandez] Start with understanding what you got. >> Yeah, what data do I have. What's the value of that data? How do I get to that data? You just mentioned you can't have a strategy that says, "okay, move all the data into some God box." >> Good luck. >> Yeah. That won't work. So, do customers have coherent data strategies? Are they formulating? Where are we on that maturity curve? >> Absolutely, I think the advent of the CDO role, as the Chief Data Officer role, has really helped bring the awareness that you have to have that enterprise data strategy. >> So, that's a sign. If there's a CDO in the house. >> There's someone working on enterprise, yeah, absolutely. >> So, it's really their role, the CDO's role, to construct the data strategy. >> Absolutely. And one of the challenges that we see, though, in that, is that because it is a new role, is like going back to Daniel's historical operational stuff, right? There's a lot of things you have to sort out within your data strategy of who owns the data, right? Regardless of where it sits within an enterprise, and how are you applying that strategy to those data assets across the business. And that's not an easy challenge. That goes back to the people process side of it. >> Well, right. I bet you if I asked Jim Cavanaugh what's IBM's data strategy, I bet you he'd have a really coherent answer. But I bet you if I asked Scott Hebner, the CMO of the data and AI group, I bet you I'd get a somewhat different answer. And so, there's multiple data strategies, but I guess it's (mumbles) job to make sure that they are coherent and tie in, right? >> Absolutely. >> Am I getting this? >> Absolutely. >> Quick study. >> So, what's IBM's data strategy? (laughs) >> Data is good. >> Data is good. Bring AI to the data. >> Look, I mean, data and AI, that's the name of the business, that's the name of the portfolio that represents our philosophy. No AI without data, increasingly, not a lot of value of data without AI. We have to help our customers understand this, that's a skill, education, point of view problem, and we have to deliver technology that actually works in the wild, in their environment, not as we want them to be, but as they are. Which is often messy. But I think that's our fun. It's the reason we've been here for a while. >> All right, I'll give you guys a last word, we got to run, but both Scott and Daniel, take aways from the event today, things that you're excited about, things that you learned. Just give us the bumper sticker. >> For me, you talk about whether people recognize the need for a data strategy in their role. For me, it's people being pumped about that, being excited about it, recognizing it, and wanting to solve those problems and leverage the capabilities that are out there. >> We've seen a lot of that today. >> Absolutely. And we're at a great time and place where the capabilities and the technologies with machine learning and AI are applicable and real, that they're solving those problems. So, I think that gets everybody excited, which is cool. >> Bring it home, Daniel. >> Excitement, a ton of experimentation with AI, some real issues that are getting in the way of full-scale deployments, a methodology data ops, to deal with those real hardcore data problems in the enterprise, resonating, a technology stack that allows you to implement that as a company is, through Cloud Pak for Data, no matter where they want to run is what they need, and I'm happy we're able to deliver it to them. >> Great. Great segment, guys. Thanks for coming. >> Awesome. Thank you. >> Data, applying AI to that data, scaling with the cloud, that's the innovation cocktail that we talk about all the time on The Cube. Scaling data your way, this is Dave Vellante and we're in Miami at the AI and Data Forum, brought to you by IBM. We'll be right back right after this short break. (upbeat music)

Published Date : Oct 22 2019

SUMMARY :

Covering IBM's Data in AI Forum, brought to you by IBM. Good to see you guys, thanks for coming on. kind of accelerating the journey to AI around data ops, You've got dev ops to help you build apps. or getting the models that they need to put in production So, that's how I like to talk about it. that you guys are solving with customers. is any business that does business has customers. Those are the ones that we see the most. So when you say customer experience, So, getting the right data to have that experience and the data from here talking to each other. and the kind of response on the web, in terms of just the time it takes to identify a fraud, you better have good data to train this stuff, and automating the data to get it in the right place. the process and you might not have the skill set. Because it doesn't matter at the end of the day. and explain the difference between what you call the number one search result is going to be a rapper. I don't mind being matched with a rapper, and The Cube and you'll find him. so that you get me right. So, that ties back to what we were saying before, automate that, so that as the data continues to grow, and you better have provisions around anti money laundering. Let's talk about some of the data challenges Structured data is sort of the are equally, if not greater, to your business. because a lot of times people are saying, "that's too hard." Absolutely. that helps you to go discover unstructured data Our competition demand that you bring your data to the AI, So, it starts with, I guess, You just mentioned you can't have a strategy that says, So, do customers have coherent data strategies? that you have to have that enterprise data strategy. So, that's a sign. to construct the data strategy. There's a lot of things you have to sort out But I bet you if I asked Scott Hebner, Bring AI to the data. data and AI, that's the name of the business, but both Scott and Daniel, take aways from the event today, and leverage the capabilities that are out there. that they're solving those problems. a technology stack that allows you to implement that Thanks for coming. Thank you. brought to you by IBM.

ENTITIES

Entity	Category	Confidence
Daniel	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Jim Cavanaugh	PERSON	0.99+
Scott Buckles	PERSON	0.99+
Daniel Hernandez	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Scott	PERSON	0.99+
Danny Hernandez	PERSON	0.99+
Miami	LOCATION	0.99+
Ginni Rometty	PERSON	0.99+
Bill Belichick	PERSON	0.99+
two	QUANTITY	0.99+
Scott Hebner	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Daniel G Hernandez	PERSON	0.99+
Delta	ORGANIZATION	0.99+
one person	QUANTITY	0.99+
10 people	QUANTITY	0.99+
12	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
two peoples	QUANTITY	0.99+
Miami, Florida	LOCATION	0.99+
Today	DATE	0.99+
18 months	QUANTITY	0.99+
five	DATE	0.99+
today	DATE	0.99+
six	QUANTITY	0.99+
Watson Assistant	TITLE	0.99+
18 months ago	DATE	0.98+
each	QUANTITY	0.98+
both	QUANTITY	0.98+
one example	QUANTITY	0.98+
one	QUANTITY	0.98+
10	DATE	0.96+
The Cube	TITLE	0.95+
Azure	ORGANIZATION	0.94+
one bad experience	QUANTITY	0.94+
IBM Data and AI Forum	ORGANIZATION	0.93+
15 years ago	DATE	0.91+
World of Watson	ORGANIZATION	0.9+
first think	QUANTITY	0.9+
Watson	TITLE	0.9+
six years ago	DATE	0.9+
couple months ago	DATE	0.9+
one time	QUANTITY	0.89+
three hundred business	QUANTITY	0.89+
The Cube	ORGANIZATION	0.88+
Cloud Pak for	TITLE	0.84+
AI and	ORGANIZATION	0.82+
last 10 years	DATE	0.82+
IBM Data	ORGANIZATION	0.81+
Cloud Pak	COMMERCIAL_ITEM	0.81+
couple	QUANTITY	0.8+
Watson Knowledge Catalog	TITLE	0.77+
Cloud Pak for Data	TITLE	0.72+
couple of events	DATE	0.69+
double	QUANTITY	0.66+
Data Forum	ORGANIZATION	0.65+
KYC AML	TITLE	0.62+
Cloud Pak	ORGANIZATION	0.61+
Vice	PERSON	0.58+
and AI Forum	EVENT	0.56+
Data	ORGANIZATION	0.55+
InstaScan	TITLE	0.55+

Daniel Hernandez, IBM | Change the Game: Winning With AI 2018

>> Live from Times Square in New York City, it's theCUBE, covering IBM's Change the Game, Winning with AI, brought to you by IBM. >> Hi everybody, welcome back to theCUBE's special presentation. We're here at the Western Hotel and the theater district covering IBM's announcements. They've got an analyst meeting today, partner event. They've got a big event tonight. IBM.com/winwithAI, go to that website, if you're in town register. You can watch the webcast online. You'll see this very cool play of Vince Lombardy, one of his famous plays. It's kind of a power sweep right which is a great way to talk about sort of winning and with X's and O's. So anyway, Daniel Hernandez is here the vice president of IBM analytics, long time Cube along. It's great to see you again, thanks for coming on. >> My pleasure Dave. >> So we've talked a number of times. We talked earlier this year. Give us the update on momentum in your business. You guys are doing really well, we see this in the quadrants and the waves, but your perspective. >> Data science and AI, so when we last talked we were just introducing something called IBM Club Private for data. The basic idea is anybody that wants to do data science, data engineering or building apps with data anywhere, we're going to give them a single integrated platform to get that done. It's going to be the most efficient, best way to do those jobs to be done. We introduced it, it's been a resounding success. Been rolling that out with clients, that's been a whole lot of fun. >> So we talked a little bit with Rob Thomas about some of the news that you guys have, but this is really your wheelhouse so I'm going to drill down into each of these. Let's say we had Rob Beerden on yesterday on our program and he talked a lot about the IBM Red Hat and Hortonworks relationship. Certainly they talked about it on their earnings call and there seems to be clear momentum in the marketplace. But give us your perspective on that announcement. What exactly is it all about? I mean it started kind of back in the ODPI days and it's really evolved into something that now customers are taking advantage of. >> You go back to June last year, we entered into a relationship with Hortonworks where the basic primacy, was customers care about data and any data driven initiative was going to require data science. We had to do a better job bringing these eco systems, one focused on kind of Hadoop, the other one on classic enterprise analytical and operational data together. We did that last year. The other element of that was we're going to bring our data science and machine learning tools and run times to where the data is including Hadoop. That's been a resounding success. The next step up is how do we proliferate that single integrated stack everywhere including private Cloud or preferred Clouds like Open Shift. So there was two elements of the announcement. We did the hybrid Cloud architecture initiative which is taking the Hadoop data stack and bringing it to containers and Kubernetes. That's a big deal for people that want to run the infrastructure with Cloud characteristics. And the other was we're going to bring that whole stack onto Open Shift. So on IBM's side, with IBM Cloud Private for data we are driving certification of that entire stack on OpenShift so any customer that's betting on OpenShift as their Cloud infrastructure can benefit from that and the single integrated data stack. It's a pretty big deal. >> So OpenShift is really interesting because OpenShift was kind of quiet for awhile. It was quiest if you will. And then containers come on the scene and OpenShift has just exploded. What are your perspectives on that and what's IBM's angle on OpenShift? >> Containers of Kubernetes basically allow you to get Cloud characteristics everywhere. It used to be locked in to kind of the public Cloud or SCP providers that were offering as a service whether PAS OR IAS and Docker and Kubernetes are making the same underline technology that enabled elasticity, pay as you go models available anywhere including your own data center. So I think it explains why OpenShift, why IBM Cloud Private, why IBM Club Private for data just got on there. >> I mean the Core OS move by Red Hat was genius. They picked that up for the song in our view anyway and it's really helped explode that. And in this world, everybody's talking about Kubernetes. I mean we're here at a big data conference all week. It used to be Hadoop world. Everybody's talking about containers, Kubernetes and Multi cloud. Those are kind of the hot trends. I presume you've seen the same thing. >> 100 percent. There's not a single client that I know, and I spend the majority of my time with clients that are running their workloads in a single stack. And so what do you do? If data is an imperative for you, you better run your data analytic stack wherever you need to and that means Multi cloud by definition. So you've got a choice. You can say, I can port that workload to every distinct programming model and data stack or you can have a data stack everywhere including Multi clouds and Open Shift in this case. >> So thinking about the three companies, so Hortonworks obviously had duped distro specialists, open source, brings that end to end sort of data management from you know Edge, or Clouds on Prim. Red Hat doing a lot of the sort of hardcore infrastructure layer. IBM bringing in the analytics and really empowering people to get insights out of data. Is that the right way to think about that triangle? >> 100 percent and you know with the Hortonworks and IBM data stacks, we've got our common services, particularly you're on open meta data which means wherever your data is, you're going to know about it and you're going to be able to control it. Privacy, security, data discovery reasons, that's a pretty big deal. >> Yeah and as the Cloud, well obviously the Cloud whether it's on Prim or in the public Cloud expands now to the Edge, you've also got this concept of data virtualization. We've talked about this in the past. You guys have made some announcements there. But let's put a double click on that a little bit. What's it all about? >> Data virtualization been going on for a long time. It's basic intent is to help you access data through whatever tools, no matter where the data is. Traditional approaches of data virtualization are pretty limiting. So they work relatively well when you've got small data sets but when you've got highly fragmented data, which is the case in virtually every enterprise that exists a lot of the undermined technology for data virtualization breaks down. Data coming through a single headnote. Ultimately that becomes the critical issue. So you can't take advantage of data virtualization technologies largely because of that when you've got wide scale deployments. We've been incubating technology under this project codename query plex, it was a code name that we used internally and that we were working with Beta clients on and testing it out, validating it technically and it was pretty clear that this is a game changing method for data virtualization that allows you to drive the benefits of accessing your data wherever it is, pushing down queries where the data is and getting benefits of that through highly fragmented data landscape. And so what we've done is take that extremely innovated next generation data virtualization technology include it in our data platform called IBM Club Private for Data, and made it a critical feature inside of that. >> I like that term, query plex, it reminds me of the global sisplex. I go back to the days when actually viewing sort of distributed global systems was very, very challenging and IBM sort of solved that problem. Okay, so what's the secret sauce though of query plex and data virtualization? How does it all work? What's the tech behind it? >> So technically, instead of data coming and getting funneled through one node. If you ever think of your data as kind of a graph of computational data nodes. What query plex does is take advantage of that computational mesh to do queries and analytics. So instead of bringing all the data and funneling it through one of the nodes, and depending on the computational horsepower of that node and all the data being able to get to it, this just federates it out. It distributes out that workload so it's some magic behind the scenes but relatively simple technique. Low computing aggregate, it's probably going to be higher than whatever you can put into that single node. >> And how do customers access these services? How long does it take? >> It would look like a standard query interface to them. So this is all magic behind the scenes. >> Okay and they get this capability as part of what? IBM's >> IBM's Club Private for Data. It's going to be a feature, so this project query plex, is introduced as next generation data virtualization technology which just becomes a part of IBM Club Private for Data. >> Okay and then the other announcement that we talked to Rob, I'd like to understand a little bit more behind it. Actually before we get there, can we talk about the business impact of query plex and data virtualization? Thinking about it, it dramatically simplifies the processes that I have to go through to get data. But more importantly, it helps me get a handle on my data so I can apply machine intelligence. It seems like the innovation sandwich if you will. Data plus AI and then Cloud models for scale and simplicity and that's what's going to drive innovation. So talk about the business impact that people are excited about with regard to query plex. >> Better economics, so in order for you to access your data, you don't have to do ETO in this particular case. So data at rest getting consumed because of this online technology. Two performance, so because of the way this works you're actually going to get faster response times. Three, you're going to be able to query more data simply because this technology allows you to access all your data in a fragmented way without having to consolidate it. >> Okay, so it eliminates steps, right, and gets you time to value and gives you a bigger corporate of data that you can the analyze and drive inside. >> 100 percent. >> Okay, let's talk about stack overflow. You know, Rob took us through a little bit about what that's, what's going on there but why stack overflow, you're targeting developers? Talk to me more about that. >> So stack overflow, 50 million active developers each month on that community. You're a developer and you want to know something, you have to go to stack overflow. You think about data science and AI as disciplines. The idea that that is only dermained to AI and data scientists is very limiting idea. In order for you to actually apply artificial intelligence for whatever your use case is instead of a business it's going to require multiple individuals working together to get that particular outcome done including developers. So instead of having a distinct community for AI that's focused on AI machine developers, why not bring the artificial intelligence community to where the developers already are, which is stack overflow. So, if you go to AI.stackexchange.com, it's going to be the place for you to go to get all your answers to any question around artificial intelligence and of course IBM is going to be there in the community helping out. >> So it's AI.stackexchange.com. You know, it's interesting Daniel that, I mean to talk about digital transformation talking about data. John Furrier said something awhile back about the dots. This is like five or six years ago. He said data is the new development kit and now you guys are essentially targeting developers around AI, obviously a data centric. People trying to put data at the core of the organization. You see that that's a winning strategy. What do you think about that? >> 100 percent, I mean we're the data company instead of IBM, so you're probably asking the wrong guy if you think >> You're biased. (laughing) >> Yeah possibly, but I'm acknowledged. The data over opinions. >> Alright, tell us about tonight what we can expect? I was referencing the Vince Lombardy play here. You know, what's behind that? What are we going to see tonight? >> We were joking a little bit about the old school power eye formation, but that obviously works for your, you're a New England fan aren't you? >> I am actually, if you saw the games this weekend Pat's were in the power eye for quite a bit of the game which I know upset a lot of people. But it works. >> Yeah, maybe we should of used it as a Dallas Cowboy team. But anyways, it's going to be an amazing night. So we're going to have a bunch of clients talking about what they're doing with AI. And so if you're interested in learning what's happening in the industry, kind of perfect event to get it. We're going to do some expert analysis. It will be a little bit of fun breaking down what those customers did to be successful and maybe some tips and tricks that will help you along your way. >> Great, it's right up the street on the west side highway, probably about a mile from the Javis Center people that are at Strata. We've been running programs all week. One of the themes that we talked about, we had an event Tuesday night. We had a bunch of people coming in. There was people from financial services, we had folks from New York State, the city of New York. It was a great meet up and we had a whole conversation got going and one of the things that we talked about and I'd love to get your thoughts and kind of know where you're headed here, but big data to do all that talk and people ask, is that, now at AI, the conversation has moved to AI, is it same wine, new bottle, or is there something substantive here? The consensus was, there's substantive innovation going on. Your thoughts about where that innovation is coming from and what the potential is for clients? >> So if you're going to implement AI for let's say customer care for instance, you're going to be three wrongs griefs. You need data, you need algorithms, you need compute. With a lot of different structure to relate down to capture data wasn't captured until the traditional data systems anchored by Hadoop and big data movement. We landed, we created a data and computational grid for that data today. With all the advancements going on in algorithms particularly in Open Source, you now have, you can build a neuro networks, you can do Cisco machine learning in any language that you want. And bringing those together are exactly the combination that you need to implement any AI system. You already have data and computational grids here. You've got algorithms bringing them together solving some problem that matters to a customer is like the natural next step. >> And despite the skills gap, the skill gaps that we talked about, you're seeing a lot of knowledge transfer from a lot of expertise getting out there into the wild when you follow people like Kirk Born on Twitter you'll see that he'll post like the 20 different models for deep learning and people are starting to share that information. And then that skills gap is closing. Maybe not as fast as some people like but it seems like the industry is paying attention to this and really driving hard to work toward it 'cause it's real. >> Yeah I agree. You're going to have Seth Dulpren, I think it's Niagara, one of our clients. What I like about them is the, in general there's two skill issues. There's one, where does data science and AI help us solve problems that matter in business? That's really a, trying to build a treasure map of potential problems you can solve with a stack. And Seth and Niagara are going to give you a really good basis for the kinds of problems that we can solve. I don't think there's enough of that going on. There's a lot of commentary communication actually work underway in the technical skill problem. You know, how do I actually build these models to do. But there's not enough in how do I, now that I solved that problem, how do we marry it to problems that matter? So the skills gap, you know, we're doing our part with our data science lead team which Seth opens which is telling a customer, pick a hard problem, give us some data, give us some domain experts. We're going to be in the AI and ML experts and we're going to see what happens. So the skill problem is very serious but I don't think it's most people are not having the right conversations about it necessarily. They understand intuitively there's a tech problem but that tech not linked to a business problem matters nothing. >> Yeah it's not insurmountable, I'm glad you mentioned that. We're going to be talking to Niagara Bottling and how they use the data science elite team as an accelerant, to kind of close that gap. And I'm really interested in the knowledge transfer that occurred and of course the one thing about IBM and companies like IBM is you get not only technical skills but you get deep industry expertise as well. Daniel, always great to see you. Love talking about the offerings and going deep. So good luck tonight. We'll see you there and thanks so much for coming on theCUBE. >> My pleasure. >> Alright, keep it right there everybody. This is Dave Vellanti. We'll be back right after this short break. You're watching theCUBE. (upbeat music)

Published Date : Sep 13 2018

SUMMARY :

IBM's Change the Game, Hotel and the theater district and the waves, but your perspective. It's going to be the most about some of the news that you guys have, and run times to where the It was quiest if you will. kind of the public Cloud Those are kind of the hot trends. and I spend the majority Is that the right way to and you're going to be able to control it. Yeah and as the Cloud, and getting benefits of that I go back to the days and all the data being able to get to it, query interface to them. It's going to be a feature, So talk about the business impact of the way this works that you can the analyze Talk to me more about that. it's going to be the place for you to go and now you guys are You're biased. The data over opinions. What are we going to see tonight? saw the games this weekend kind of perfect event to get it. One of the themes that we talked about, that you need to implement any AI system. that he'll post like the And Seth and Niagara are going to give you kind of close that gap. This is Dave Vellanti.

ENTITIES

Entity	Category	Confidence
Dave Vellanti	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Daniel Hernandez	PERSON	0.99+
Rob	PERSON	0.99+
Daniel	PERSON	0.99+
John Furrier	PERSON	0.99+
Tuesday night	DATE	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Beerden	PERSON	0.99+
AI.stackexchange.com	OTHER	0.99+
Cisco	ORGANIZATION	0.99+
Three	QUANTITY	0.99+
Dave	PERSON	0.99+
New York City	LOCATION	0.99+
New York State	LOCATION	0.99+
Seth Dulpren	PERSON	0.99+
last year	DATE	0.99+
Rob Thomas	PERSON	0.99+
yesterday	DATE	0.99+
tonight	DATE	0.99+
Dallas Cowboy	ORGANIZATION	0.99+
one	QUANTITY	0.99+
three companies	QUANTITY	0.99+
Open Shift	TITLE	0.99+
New York	LOCATION	0.99+
two elements	QUANTITY	0.99+
IBM Red Hat	ORGANIZATION	0.99+
100 percent	QUANTITY	0.99+
June last year	DATE	0.99+
20 different models	QUANTITY	0.98+
Vince Lombardy	PERSON	0.98+
five	DATE	0.98+
Times Square	LOCATION	0.98+
Red Hat	ORGANIZATION	0.97+
each	QUANTITY	0.97+
Pat	PERSON	0.97+
OpenShift	TITLE	0.97+
each month	QUANTITY	0.97+
single client	QUANTITY	0.96+
New England	LOCATION	0.96+
single	QUANTITY	0.96+
single stack	QUANTITY	0.96+
Hadoop	TITLE	0.96+
six years ago	DATE	0.94+
three wrongs	QUANTITY	0.94+
IBM.com/winwithAI	OTHER	0.94+
today	DATE	0.94+
earlier this year	DATE	0.93+
Niagara	ORGANIZATION	0.93+
One	QUANTITY	0.92+
about a mile	QUANTITY	0.92+
Kirk Born	PERSON	0.91+
Seth	ORGANIZATION	0.91+
IBM Club	ORGANIZATION	0.89+
Change the Game: Winning With AI	TITLE	0.88+
50 million active developers	QUANTITY	0.88+

Daniel Hernandez, IBM | IBM Think 2018

>> Narrator: Live from Las Vegas It's theCUBE covering IBM Think 2018. Brought to you by IBM. >> We're back at Mandalay Bay in Las Vegas. This is IBM Think 2018. This is day three of theCUBE's wall-to-wall coverage. My name is Dave Vellante, I'm here with Peter Burris. You're watching theCUBE, the leader in live tech coverage. Daniel Hernandez is here. He's the Vice President of IBM Analytics, a CUBE alum. It's great to see you again, Daniel >> Thanks >> Dave: Thanks for coming back on >> Happy to be here. >> Big tech show, consolidating a bunch of shows, you guys, you kind of used to have your own sort of analytics show but now you've got all the clients here. How do you like it? Compare and contrast. >> IBM Analytics loves to share so having all our clients in one place, I actually like it. We're going to work out some of the kinks a little bit but I think one show where you can have a conversation around Artificial Intelligence, data, analytics, power systems, is beneficial to all of us, actually. >> Well in many respects, the whole industry is munging together. Folks focus more on workloads as opposed to technology or even roles. So having an event like this where folks can talk about what they're trying to do, the workloads they're trying to create, the role that analytics, AI, et cetera is going to play in informing those workloads. Not a bad place to get that crosspollination. What do you think? >> Daniel: Totally. You talk to a client, there are so many problems. Problems are a combination of stuff that we have to offer and analytics stuff that our friends in Hybrid Integration have to offer. So for me, logistically, I could say oh, Mike Gilfix, business process automation. Go talk to him. And he's here. That's happened probably at least a dozen times so far in not even two days. >> Alright so I got to ask, your tagline. Making data ready for AI. What does that mean? >> We get excited about amazing tech. Artificial intelligence is amazing technology. I remember when Watson beat Jeopardy. Just being inspired by all the things that I thought it could do to solve problems that matter to me. And if you look over the last many years, virtual assistants, image recognition systems that solve pretty big problems like catching bad guys are inspirational pieces of work that were inspired a lot by what we did then. And in business, it's triggered a wave of artificial intelligence can help me solve business critical issues. And I will tell you that many clients simply aren't ready to get started. And because they're not ready, they're going to fail. And so our attitude about things are, through IBM Analytics, we're going to deliver the critical capabilities you need to be ready for AI. And if you don't have that, 100% of your projects will fail. >> But how do you get the business ready to think about data differently? You can do a lot to say, the technology you need to do this looks differently but you also need to get the organization to acculturate, appreciate that their business is going to run differently as a consequence of data and what you do with it. How do you get the business to start making adjustments? >> I think you just said the magic word, the business. Which is to say, at least all the conversations I have with my customers, they can't even tell that I'm from the analytics because I'm asking them about the problems. What do you try to do? How would you measure success? What are the critical issues that you're trying to solve? Are you trying to make money, save money, those kinds of things. And by focusing on it, we can advise them then based on that how we can help. So the data culture that you're describing I think it's a fact, like you become data aware and understand the power of it by doing. You do by starting with the problems, developing successes and then iterating. >> An approach to solving problems. >> Yeah >> So that's kind of a step zero to getting data ready for AI >> Right. But in no conversation that leads to success does it ever start with we're going to do AI or machine learning, what problem are we going to solve? It's always the other way around. And when we do that, our technology then is easily explainable. It's like okay, you want to build a system for better customer interactions in your call center. Well, what does that mean? You need data about how they have interacted with you, products they have interacted with, you might want predictions that anticipate what their needs are before they tell you. And so we can systematically address them through the capabilities we've got. >> Dave, if I could amplify one thing. It makes the technology easier when you put it in these constants I think that's a really crucial important point. >> It's super simple. All of us have had to have it, if we're in technology. Going the other way around, my stuff is cool. Here's why it's cool. What problems can you solve? Not helpful for most of our clients. >> I wonder if you could comment on this Daniel. I feel like we're, the last ten years about cloud mobile, social, big data. We seem to be entering an era now of sense, speak, act, optimize, see, learn. This sort of pervasive AI, if you will. How- is that a reasonable notion, that we're entering that era, and what do you see clients doing to take advantage of that? What's their mindset like when you talk to them? >> I think the evidence is there. You just got to look around the show and see what's possible, technically. The Watson team has been doing quite a bit of stuff around speech, around image. It's fascinating tech, stuff that feels magical to me. And I know how this stuff works and it still feels kind of fascinating. Now the question is how do you apply that to solve problems. I think it's only a matter of time where most companies are implementing artificial intelligence systems in business critical and core parts of their processes and they're going to get there by starting, by doing what they're already doing now with us, and that is what problem am I solving? What data do I need to get that done? How do I control and organize that information so I can exploit it? How can I exploit machine learning and deep learning and all these other technologies to then solve that problem. How do I measure success? How do I track that? And just systematically running these experiments. I think that crescendos to a critical mass. >> Let me ask you a question. Because you're a technologist and you said it's amazing, it's like magic even to you. Imagine non technologists, what `it's like to me. There's a black box component of AI, and maybe that's okay. I'm just wondering if that's, is that a headwind, are clients comfortable with that? If you have to describe how you really know it's a cat. I mean, I know a cat when I see it. And the machine can tell me it's a cat, or not a hot dog Silicon Valley reference. (Peter laughs) But to tell me actually how it works, to figure that out there's a black box component. Does that scare people? Or are they okay with that? >> You've probably given me too much credit. So I really can't explain how all that just works but what I can tell you is how certainly, I mean, lets take regulated industries like banks and insurance companies that are building machine learning models throughout their enterprise. They've got to explain to a regulator that they are offering considerations around anti discriminatory, basically they're not buying systems that cause them to do things that are against the law, effectively. So what are they doing? Well, they're using tools like ones from IBM to build these models to track the process of creating these models which includes what data they used, how that training was done, prove that the inputs and outputs are not anti-discriminatory and actually go through their own internal general counsel and regulators to get it done. So whether you can explain the model in this particular case doesn't matter. What they're trying to prove is that the effect is not violating the law, which the tool sets and the process around those tool sets allow you to get that done today. >> Well, let me build on that because one of the ways that it does work is that, as Ginni said yesterday, Ginni Rometty said yesterday that it's always going to be a machine human component to it. And so the way it typically works is a machine says I think this is a cat and a human validates it or not. The machine still doesn't really know if it's a cat but coming back to this point, one of the key things that we see anyway, and one of the advantages that IBM likely has, is today the folks running Operational Systems, the core of the business, trust their data sources. >> Do they? >> They trust their DB2 database, they trust their Oracle database, they trust the data that's in the applications. >> Dave: So it's the data that's in their Data Lake? >> I'm not saying they do but that's the key question. At what point in time, and I think the real important part of your question is, at what point in time do the hardcore people allow AI to provide a critical input that's going to significantly or potentially dramatically change the behavior of the core operational systems. That seems a really crucial point. What kind of feedback do you get from customers as you talk about turning AI from something that has an insight every now and then to becoming effectively, an element or essential to the operation of the business? >> One of the critical issues in getting especially machine learning models, integrated in business critical processes and workflows is getting those models running where that work is done. So if you look, I mean, when I was here last time I was talking about the, we were focused on portfolio simplification and bringing machine learning where the data was. We brought machine learning to private cloud, we brought it onto Gadook, we brought it on mainframe. I think it is a critical necessary ingredient that you need to deliver that outcome. Like, bring that technology where the data is. Otherwise it just won't work. Why? As soon as you move, you've got latency. As soon as you move, you've got data quality issues you're going to have contending. That's going to exacerbate whatever mistrust you might have. >> Or the stuff's not cheap to move. It's not cheap to ingest. >> Yeah. By the way, the Machine Learning on Z offering that we launched last year in March, April was one of our highest, most successful offerings last year. >> Let's talk about some of the offerings. I mean, at the end of the day you're in the business of selling stuff. You've talked about Machine Learning on Z X, whatever platform. Cloud Private, I know you've got perspectives on that. Db2 Event Store is something that you're obviously familiar with. SPSS is part of the portfolio. >> 50 year, the anniversary. >> Give us the update on some of these products. >> Making data ready for AI requires a design principled on simplicity. We launched in January three core offerings that help clients benefit from the capability that we deliver to capture data, to organize and control that data and analyze that data. So we delivered a Hybrid Data Management offering which gives you everything you need to collect data, it's anchored by Db2. We have the Unified Governance and Integration portfolio that gives you everything you need to organize and control that data as anchored by our information server product set. And we've got our Data Science and Businesses Analytics portfolio, which is anchored by our data science experience, SPSS and Cognos Analytics portfolio. So clients that want to mix and match those capabilities in support of artificial intelligence systems, or otherwise, can benefit from that easily. We just announced here a radical- an even radical step forward in simplification, which we thought that there already was. So if you want to move to the public cloud but can't, don't want to move to the public cloud for whatever reason and we think, by the way, public cloud for workload to like, you should try to run as much as you can there because the benefits of it. But if for whatever reason you can't, we need to deliver those benefits behind the firewall where those workloads are. So last year the Hybrid Integration team led by Denis Kennelly, introduced an IBM cloud private offering. It's basically application paths behind the firewall. It's like run on a Kubernetes environment. Your applications do buildouts, do migrations of existing workloads to it. What we did with IBM Cloud Private for data is have the data companion for that. IBM Cloud Private was a runaway success for us. You could imagine the data companion to that just being like, what application doesn't need data? It's peanut butter and jelly for us. >> Last question, oh you had another point? >> It's alright. I wanted to talk about Db2 and SPCC. >> Oh yes, let's go there, yeah. >> Db2 Event Store, I forget if anybody- It has 100x performance improvement on Ingest relative to the current state of the order. You say, why does that matter? If you do an analysis or analytics, machine learning, artificial intelligence, you're only as good as whatever data you have captured of your, whatever your reality is. Currently our databases don't allow you to capture everything you would want. So Db2 Event Store with that Ingest lets you capture more than you could ever imagine you would want. 250 billion events per year is basically what it's rated at. So we think that's a massive improvement in database technology and it happens to be based in open source, so the programming model is something that developers feel is familiar. SPSS is celebrating it's 50th year anniversary. It's the number one digital offering inside of IBM. It had 510,000 users trying it out last year. We just renovated the user experience and made it even more simple on stats. We're doing the same thing on Modeler and we're bringing SPSS and our data science experience together so that there's one tool chain for data science end to end in the Private Cloud. It's pretty phenomenal stuff. >> Okay great, appreciate you running down the portfolio for us. Last question. It's kind of a, get out of your telescope. When you talk to clients, when you think about technology from a technologist's perspective, how far can we take machine intelligence? Think 20 plus years, how far can we take it and how far should we take it? >> Can they ever really know what a cat is? (chuckles) >> I don't know what the answer to that question is, to be honest. >> Are people asking you that question, in the client base? >> No. >> Are they still figuring out, how do I apply it today? >> Surely they're not asking me, probably because I'm not the smartest guy in the room. They're probably asking some of the smarter guys-- >> Dave: Well, Elon Musk is talking about it. Stephen Hawking was talking about it. >> I think it's so hard to anticipate. I think where we are today is magical and I couldn't have anticipated it seven years ago, to be honest, so I can't imagine. >> It's really hard to predict, isn't it? >> Yeah. I've been wrong on three to four year horizons. I can't do 20 realistically. So I'm sorry to disappoint you. >> No, that's okay. Because it leads to my real last question which is what kinds of things can machines do that humans can't and you don't even have to answer this, but I just want to put it out there to the audience to think about how are they going to complement each other. How are they going to compete with each other? These are some of the big questions that I think society is asking. And IBM has some answers, but we're going to apply it here, here and here, you guys are clear about augmented intelligence, not replacing. But there are big questions that I think we want to get out there and have people ponder. I don't know if you have a comment. >> I do. I think there are non obvious things to human beings, relationships between data that's expressing some part of your reality that a machine through machine learning can see that we can't. Now, what does it mean? Do you take action on it? Is it simply an observation? Is it something that a human being can do? So I think that combination is something that companies can take advantage of today. Those non obvious relationships inside of your data, non obvious insights into your data is what machines can get done now. It's how machine learning is being used today. Is it going to be able to reason on what to do about it? Not yet, so you still need human beings in the middle too, especially when you deal with consequential decisions. >> Yeah but nonetheless, I think the impact on industry is going to be significant. Other questions we ask are retail stores going to be the exception versus the normal. Banks lose control of the payment systems. Will cyber be the future of warfare? Et cetera et cetera. These are really interesting questions that we try and cover on theCUBE and we appreciate you helping us explore those. Daniel, it's always great to see you. >> Thank you, Dave. Thank you, Peter. >> Alright keep it right there buddy, we'll be back with our next guest right after this short break. (electronic music)

Published Date : Mar 21 2018

SUMMARY :

Brought to you by IBM. It's great to see you again, Daniel How do you like it? bit but I think one show where you can have a is going to play in informing those workloads. You talk to a client, Alright so I got to ask, your tagline. And I will tell you that many clients simply appreciate that their business is going to run differently I think you just said the magic word, the business. But in no conversation that leads to success when you put it in these constants What problems can you solve? entering that era, and what do you see Now the question is how do you apply that to solve problems. If you have to describe how you really know it's a cat. So whether you can explain the model in this Well, let me build on that because one of the the applications. What kind of feedback do you get from customers That's going to exacerbate whatever mistrust you might have. Or the stuff's not cheap to move. that we launched last year in March, April I mean, at the end of the day you're in to like, you should try to run as much as you I wanted to talk about Db2 and SPCC. So Db2 Event Store with that Ingest lets you capture When you talk to clients, when you think about is, to be honest. I'm not the smartest guy in the room. Dave: Well, Elon Musk is talking about it. I think it's so hard to anticipate. So I'm sorry to disappoint you. How are they going to compete with each other? I think there are non obvious things to industry is going to be significant. with our next guest right after this short break.

ENTITIES

Entity	Category	Confidence
Peter Burris	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Daniel	PERSON	0.99+
Daniel Hernandez	PERSON	0.99+
Mike Gilfix	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Ginni	PERSON	0.99+
Ginni Rometty	PERSON	0.99+
Peter	PERSON	0.99+
Denis Kennelly	PERSON	0.99+
Dave	PERSON	0.99+
January	DATE	0.99+
Stephen Hawking	PERSON	0.99+
yesterday	DATE	0.99+
Elon Musk	PERSON	0.99+
last year	DATE	0.99+
100x	QUANTITY	0.99+
20 plus years	QUANTITY	0.99+
100%	QUANTITY	0.99+
Mandalay Bay	LOCATION	0.99+
510,000 users	QUANTITY	0.99+
March	DATE	0.99+
today	DATE	0.99+
50 year	QUANTITY	0.99+
Db2	ORGANIZATION	0.99+
Las Vegas	LOCATION	0.99+
IBM Analytics	ORGANIZATION	0.99+
seven years ago	DATE	0.98+
one	QUANTITY	0.98+
One	QUANTITY	0.98+
Z X	TITLE	0.98+
20	QUANTITY	0.98+
three	QUANTITY	0.98+
SPSS	TITLE	0.98+
April	DATE	0.96+
IBM Analytics	ORGANIZATION	0.96+
Gadook	ORGANIZATION	0.96+
Silicon Valley	LOCATION	0.94+
two days	QUANTITY	0.94+
Oracle	ORGANIZATION	0.92+
SPCC	ORGANIZATION	0.92+
DB2	TITLE	0.9+
four year	QUANTITY	0.9+
one place	QUANTITY	0.89+
Vegas	LOCATION	0.89+
Kubernetes	TITLE	0.87+
SPSS	ORGANIZATION	0.86+
Jeopardy	ORGANIZATION	0.86+
50th year anniversary	QUANTITY	0.86+
Watson	PERSON	0.82+
at least a dozen times	QUANTITY	0.82+
Db2 Event Store	TITLE	0.8+
theCUBE	ORGANIZATION	0.8+
intelligence	EVENT	0.79+
step zero	QUANTITY	0.78+
one tool	QUANTITY	0.77+
250 billion events per year	QUANTITY	0.76+
three core offerings	QUANTITY	0.75+
one thing	QUANTITY	0.7+
Db2 Event	ORGANIZATION	0.68+
Vice President	PERSON	0.68+
Ingest	ORGANIZATION	0.68+

Daniel Hernandez, Analytics Offering Management | IBM Data Science For All

>> Announcer: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome to the big apple, John Walls and Dave Vellante here on theCUBE we are live at IBM's Data Science For All. Going to be here throughout the day with a big panel discussion wrapping up our day. So be sure to stick around all day long on theCUBe for that. Dave always good to be here in New York is it not? >> Well you know it's been kind of the data science weeks, months, last week we're in Boston at an event with the chief data officer conference. All the Boston Datarati were there, bring it all down to New York City getting hardcore really with data science so it's from chief data officer to the hardcore data scientists. >> The CDO, hot term right now. Daniel Hernandez now joins as our first guest here at Data Science For All. Who's a VP of IBM Analytics, good to see you. David thanks for being with us. >> Pleasure. >> Alright well give us first off your take, let's just step back high level here. Data science it's certainly been evolving for decades if you will. First off how do you define it today? And then just from the IBM side of the fence, how do you see it in terms of how businesses should be integrating this into their mindset. >> So the way I describe data science simply to my clients is it's using the scientific method to answer questions or deliver insights. It's kind of that simple. Or answering questions quantitatively. So it's a methodology, it's a discipline, it's not necessarily tools. So that's kind of the way I approach describing what it is. >> Okay and then from the IBM side of the fence, in terms of how wide of a net are you casting these days I assume it's as big as you can get your arms out. >> So when you think about any particular problem that's a data science problem, you need certain capabilities. We happen to deliver those capabilities. You need the ability to collect, store, manage, any and all data. You need the ability to organize that data so you can discover it and protect it. You got to be able to analyze it. Automate the mundane, explain the past, predict the future. Those are the capabilities you need to do data science. We deliver a portfolio of it. Including on the analyze part of our portfolio, our data science tools that we would declare as such. >> So data science for all is very aspirational, and when you guys made the announcement of the Watson data platform last fall, one of the things that you focused on was collaboration between data scientists, data engineers, quality engineers, application development, the whole sort of chain. And you made the point that most of the time that data scientists spend is on wrangling data. You're trying to attack that problem, and you're trying to break down the stovepipes between those roles that I just mentioned. All that has to happen before you can actually have data science for all. I mean that's just data science for all hardcore data people. Where are we in terms of sort of the progress that your clients have made in that regard? >> So you know, I would say there's two majors vectors of progress we've made. So if you want data science for all you need to be able to address people that know how to code and people that don't know how to code. So if you consider kind the history of IBM in the data science space especially in SPSS, which has been around for decades. We're mastering and solving data science problems for non-coders. The data science experience really started with embracing coders. Developers that grew up in open source, that lived and learned Jupiter or Python and were more comfortable there. And integration of these is kind of our focus. So that's one aspect. Serving the needs of people that know how to code and don't in the kind of data science role. And then for all means supporting an entire analytics life cycle from collecting the data you need in order to answer the question that you're trying to answer to organizing that information once you've collected so you can discover it inside of tools like our own data science experience and SPSS, and then of course the set of tools that around exploratory analytics. All integrated so that you can do that end to end life cycle. So where clients are, I think they're getting certainly much more sophisticated in understanding that. You know most people have approached data science as a tool problem, as a data prep problem. It's a life cycle problem. And that's kind of how we're thinking about it. We're thinking about it in terms of, alright if our job is answer questions, delivering insights through scientific methods, how do we decompose that problem to a set of things that people need to get the job done, serving the individuals that have to work together. >> And when you think about, go back to the days where it's sort of the data warehouse was king. Something we talked about in Boston last week, it used to be the data warehouse was king, now it's the process is much more important. But it was very few people had access to that data, you had the elapsed time of getting answers, and the inflexibility of the systems. Has that changed and to what degree has it changed? >> I think if you were to go ask anybody in business whether or not they have all the data they need to do their job, they would say no. Why? So we've invested in EDW's, we've invested in Hadoop. In part sometimes, the problem might be, I just don't have the data. Most of the time it is I have the data I just don't know where it is. So there's a pretty significant issue on data discoverability, and it's important that I might have data in my operational systems, I might have data inside my EDW, I don't have everything inside my EDW, I've standed up one or more data lakes, and to solve my problem like customer segmentation I have data everywhere, how do I find and bring it in? >> That seems like that should be a fundamental consideration, right? If you're going to gather this much more information, make it accessible to people. And if you don't, it's a big flaw, it's a big gap is it not? >> So yes, and I think part of the reason why is because governance professionals which I am, you know I spent quite a bit of time trying to solve governance related problems. We've been focusing pretty maniacally on kind of the compliance, and the regulatory and security related issues. Like how do we keep people from going to jail, how do we ensure regulatory compliance with things like e-discovery, and records for instance. And it just so happens the same discipline that you use, even though in some cases lighter weight implementations, are what you need in order to solve this data discovery problem. So the discourse around governance has been historically about compliance, about regulations, about cost takeout, not analytics. And so a lot of our time certainly in R&D is trying to solve that data discovery problem which is how do I discover data using semantics that I have, which as a regular user is not physical understandings of my data, and once I find it how am I assured that what I get is what I should get so that it's, I'm not subject to compliance related issues, but also making the company more vulnerable to data breach. >> Well so presumably part of that anyway involves automating classification at the point of creation or use, which is actually was a technical challenge for a number of years. Has that challenge been solved in your view? >> I think machine learning is, and in fact later on today I will be doing some demonstrations of technology which will show how we're making the application of machine learning easy, inside of everything we do we're applying machine learning techniques including to classification problems that help us solve the problem. So it could be we're automatically harvesting technical metadata. Are there business terms that could be automatically extracted that don't require some data steward to have to know and assert, right? Or can we automatically suggest and still have the steward for a case where I need a canonical data model, and so I just don't want the machine to tell me everything, but I want the machine to assist the data curation process. We are not just exploring the application of machine learning to solve that data classification problem, which historically was a manual one. We're embedding that into most of the stuff that we're doing. Often you won't even know that we're doing it behind the scenes. >> So that means that often times well the machine ideally are making the decisions as to who gets access to what, and is helping at least automate that governance, but there's a natural friction that occurs. And I wonder if you can talk about the balance sheet if you will between information as an asset, information as a liability. You know the more restrictions you put on that information the more it constricts you know a business user's ability. So how do you see that shaping up? >> I think it's often a people process problem, not necessarily a technology problem. I don't think as an industry we've figured it out. Certainly a lot of our clients haven't figured out that balance. I mean there are plenty of conversation I'll go into where I'll talk to a data science team in a same line of business as a governance team and what the data science team will tell us is I'm building my own data catalog because the stuff that the governance guys are doing doesn't help me. And the reason why it doesn't help me is because it's they're going through this top down data curation methodology and I've got a question, I need to go find the data that's relevant. I might not know what that is straight away. So the CDO function in a lot of organizations is helping bridge that. So you'll see governance responsibilities line up with the CDO with analytics. And I think that's gone a long way to bridge that gaps. But that conversation that I was just mentioning is not unique to one or two customers. Still a lot of customers are doing it. Often customers that either haven't started a CDO practice or are early days on it still. >> So about that, because this is being introduced to the workplace, a new concept right, fairly new CDOs. As opposed to CIO or CTO, you know you have these other. I mean how do you talk to your clients about trying to broaden their perspective on that and I guess emphasizing the need for them to consider putting somebody of a sole responsibility, or primary responsibility for their data. Instead of just putting it lumping it in somewhere else. >> So we happen to have one of the best CDO's inside of our group which is like a handy tool for me. So if I go into a client and it's purporting to be a data science problem and it turns out they have a data management issue around data discovery, and they haven't yet figured out how to install the process and people design to solve that particular issue one of the key things I'll do is I'll bring in our CDO and his delegates to have a conversation around them on what we're doing inside of IBM, what we're seeing in other customers to help institute that practice inside of, inside of their own organization. We have forums like the CDO event in Boston last week, which are designed to, you know it's not designed to be here's what IBM can do in technology, it's designed to say here's how the discipline impacts your business and here's some best practices you should apply. So if ultimately I enter into those conversations where I find that there's a need, I typically am like alright, I'm not going to, tools are part of the problem but not the only issue, let me bring someone in that can describe the people process related issues which you got to get right. In order for, in some cases to the tools that I deliver to matter. >> We had Seth Dobrin on last weekend in Boston, and Inderpal Bhandari as well, and he put forth this enterprise, sort of data blueprint if you will. CDO's are sort of-- >> Daniel: We're using that in IBM by the way. >> Well this is the thing, it's a really well thought out sort of structure that seems to be trickling down to the divisions. And so it's interesting to hear how you're applying Seth's expertise. I want to ask you about the Hortonworks relationship. You guys have made a big deal about that this summer. To me it was a no brainer. Really what was the point of IBM having a Hadoop distro, and Hortonworks gets this awesome distribution channel. IBM has always had an affinity for open source so that made sense there. What's behind that relationship and how's it going? >> It's going awesome. Perhaps what we didn't say and we probably should have focused on is the why customers care aspect. There are three main by an occasion use cases that customers are implementing where they are ready even before the relationship. They're asking IBM and Hortonworks to work together. And so we were coming to the table working together as partners before the deeper collaboration we started in June. The first one was bringing data science to Hadoop. So running data science models, doing data exploration where the data is. And if you were to actually rewind the clock on the IBM side and consider what we did with Hortonworks in full consideration of what we did prior, we brought the data science experience and machine learning to Z in February. The highest value transactional data was there. The next step was bring data science to where the, often for a lot of clients the second most valuable set of data which is Hadoop. So that was kind of part one. And then we've kind of continued that by bringing data science experience to the private cloud. So that's one use case. I got a lot data, I need to do data science, I want to do it in resident, I want to take advantage of the compute grid I've already laid down, and I want to take advantage of the performance benefits and the integrated security and governance benefits by having these things co-located. That's kind of play one. So we're bringing in data science experience and HDP and HDF, which are the Hortonworks distributions way closer together and optimized for each other. Another component of that is not all data is going to be in Hadoop as we were describing. Some of it's in an EDW and that data science job is going to require data outside of Hadoop, and so we brought big SQL. It was already supporting Hortonworks, we just optimized the stack, and so the combination of data science experience and big SQL allows you to data science against a broader surface area of data. That's kind of play one. Play two is I've got a EDW either for cost or agility reasons I want to augment it or some cases I might want to offload some data from it to Hadoop. And so the combination of Hortonworks plus big SQL and our data integration technologies are a perfect combination there and we have plenty of clients using that for kind of analytics offloading from EDW. And then the third piece that we're doing quite a bit of engineering, go-to-market work around is govern data lakes. So I want to enable self service analytics throughout my enterprise. I want self service analytics tools to everyone that has access to it. I want to make data available to them, but I want that data to be governed so that they can discover what's in it in the lake, and whatever I give them is what they should have access to. So those are the kind of the three tracks that we're working with Hortonworks on, and all of them are making stunning results inside of clients. >> And so that involves actually some serious engineering as well-- >> Big time. It's not just sort of a Barney deal or just a pure go to market-- >> It's certainly more the market texture and just works. >> Big picture down the road then. Whatever challenges that you see on your side of the business for the next 12 months. What are you going to tackle, what's that monster out there that you think okay this is our next hurdle to get by. >> I forgot if Rob said this before, but you'll hear him say often and it's statistically proven, the majority of the data that's available is not available to be Googled, so it's behind a firewall. And so we started last year with the Watson data platform creating an integrating data analytics system. What if customers have data that's on-prem that they want to take advantage of, what if they're not ready for the public cloud. How do we deliver public benefits to them when they want to run that workload behind a firewall. So we're doing a significant amount of engineering, really starting with the work that we did on a data science experience. Bringing it behind the firewall, but still delivering similar benefits you would expect if you're delivering it in the public cloud. A major advancement that IBM made is run IBM cloud private. I don't know if you guys are familiar with that announcement. We made, I think it's already two weeks ago. So it's a (mumbles) foundation on top of which we have micro services on top of which our stack is going to be made available. So when I think of kind of where the future is, you know our customers ultimately we believe want to run data and analytic workloads in the public cloud. How do we get them there considering they're not there now in a stepwise fashion that is sensible economically project management-wise culturally. Without having them having to wait. That's kind of big picture, kind of a big problem space we're spending considerable time thinking through. >> We've been talking a lot about this on theCUBE in the last several months or even years is people realize they can't just reform their business and stuff into the cloud. They have to bring the cloud model to their data. Wherever that data exists. If it's in the cloud, great. And the key there is you got to have a capability and a solution that substantially mimics that public cloud experience. That's kind of what you guys are focused on. >> What I tell clients is, if you're ready for certain workloads, especially green field workloads, and the capability exists in a public cloud, you should go there now. Because you're going to want to go there eventually anyway. And if not, then a vendor like IBM helps you take advantage of that behind a firewall, often in form facts that are ready to go. The integrated analytics system, I don't know if you're familiar with that. That includes our super advanced data warehouse, the data science experience, our query federation technology powered by big SQL, all in a form factor that's ready to go. You get started there for data and data science workloads and that's a major step in the direction to the public cloud. >> Alright well Daniel thank you for the time, we appreciate that. We didn't get to touch at all on baseball, but next time right? >> Daniel: Go Cubbies. (laughing) >> Sore spot with me but it's alright, go Cubbies. Alright Daniel Hernandez from IBM, back with more here from Data Science For All. IBM's event here in Manhattan. Back with more in theCUBE in just a bit. (electronic music)

Published Date : Nov 1 2017

SUMMARY :

Brought to you by IBM. So be sure to stick around all day long on theCUBe for that. to the hardcore data scientists. Who's a VP of IBM Analytics, good to see you. how do you see it in terms of how businesses should be So that's kind of the way I approach describing what it is. in terms of how wide of a net are you casting You need the ability to organize that data All that has to happen before you can actually and people that don't know how to code. Has that changed and to what degree has it changed? and to solve my problem like customer segmentation And if you don't, it's a big flaw, it's a big gap is it not? And it just so happens the same discipline that you use, Well so presumably part of that anyway We're embedding that into most of the stuff You know the more restrictions you put on that information So the CDO function in a lot of organizations As opposed to CIO or CTO, you know you have these other. the process and people design to solve that particular issue data blueprint if you will. that seems to be trickling down to the divisions. is going to be in Hadoop as we were describing. just a pure go to market-- that you think okay this is our next hurdle to get by. I don't know if you guys are familiar And the key there is you got to have a capability often in form facts that are ready to go. We didn't get to touch at all on baseball, Daniel: Go Cubbies. IBM's event here in Manhattan.

ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Daniel Hernandez	PERSON	0.99+
Daniel	PERSON	0.99+
February	DATE	0.99+
Boston	LOCATION	0.99+
Dave Vellante	PERSON	0.99+
one	QUANTITY	0.99+
David	PERSON	0.99+
Manhattan	LOCATION	0.99+
Inderpal Bhandari	PERSON	0.99+
June	DATE	0.99+
Rob	PERSON	0.99+
Dave	PERSON	0.99+
New York	LOCATION	0.99+
New York City	LOCATION	0.99+
last year	DATE	0.99+
Seth	PERSON	0.99+
Python	TITLE	0.99+
third piece	QUANTITY	0.99+
EDW	ORGANIZATION	0.99+
second	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
last week	DATE	0.99+
today	DATE	0.99+
First	QUANTITY	0.99+
SQL	TITLE	0.99+
two customers	QUANTITY	0.99+
Hadoop	TITLE	0.99+
first	QUANTITY	0.99+
SPSS	TITLE	0.98+
Seth Dobrin	PERSON	0.98+
three tracks	QUANTITY	0.98+
John Walls	PERSON	0.98+
IBM Analytics	ORGANIZATION	0.98+
first guest	QUANTITY	0.97+
two weeks ago	DATE	0.97+
one aspect	QUANTITY	0.96+
first one	QUANTITY	0.96+
Barney	ORGANIZATION	0.96+
two majors	QUANTITY	0.96+
last weekend	DATE	0.94+
this summer	DATE	0.94+
Hadoop	ORGANIZATION	0.93+
decades	QUANTITY	0.92+
last fall	DATE	0.9+
two	QUANTITY	0.85+
IBM Data Science For All	ORGANIZATION	0.79+
three main	QUANTITY	0.78+
next 12 months	DATE	0.78+
CDO	TITLE	0.77+
D	ORGANIZATION	0.72+

Bobby Patrick, UiPath | The Release Show: Post Event Analysis

>>from around the globe. It's the Cube with digital coverage of you. I path live the release show brought to you by you. >>I path Hi. Welcome back to this special R p A drill down with support from you. I path You're watching The Cube. My name is Dave Volante and Bobby CMO. You know I passed Bobby. Good to see you again. Hope you're doing well. Thanks for coming on. >>Hi, Dave. It's great to see you as well. It's always a pleasure to be on the Cube and even in the virtual format, this is really exciting. >>So, you know, last year at forward, we talked about the possibility of a downturn. Now nobody expected this kind of downturn. But we talked about that. Automation was likely something that was going to stay strong even in the downturn. We were thinking about potential recession or an economic downturn. Stock market dropped, but nothing like this. How are you guys holding up in this posted 19 pandemic? What are you seeing in the marketplace? >>Yeah, we certainly we're not thinking of a black swan or rhino or whatever we call this, but, you know, it's been a pretty crazy couple of months for everybody. You know, when When this first started, we were like everybody else. Not sure how it impact our business. The interesting thing has been that you're in code. It actually brought a reality check through. A lot of companies and organizations realize that it's very few tools to respond quickly, right? Bond with, you know, cost pressures that we're urgent or preserving revenue, perhaps, or responding to Ah, strange resource is, you know, in all centers, or or built to support. You know, the surge in in, um, in the healthcare community. And so r p a became one of those tools that quickly waas knowledge and adopted. And so we went out two months ago to go find those 1st 1st use cases. Talk about him, then. You know, 1st 30 days we had 50 in production, right? Companies, you know, great organizations like Cleveland Clinic, right? You know where they use their parking lot? Give the first tests the swab tests, right of, uh, well, who have proven right? You know, they had a line of 88 hours by, you know, putting a robot in place in two days. They got that line down by 80 or 90% right? It is a huge hit as we see that kind of a kind of benefit all across right now in the world. Right now we have. We were featured in The Wall Street Journal recently with nurses and a large hospital system in Ireland called Matter. The nurses said in the interview that, you know they have. They were able to free up time to be a patient's right, which is what they're there for, anyway, thanks to robots during this during this emergency. So I think you know, it's it's definitely raise The awareness that that this technology is provides an amazing time to value, and that's it's pretty unprecedented in the world of B two B software. >>I want to share some data with you in our community is the first time we've we've shown this. Guys would bring up the data slide, and so this is ah, chart that e. T are produced. There's enterprise technology research. They go out of reporter. They survey CIOs and I T practitioners and a survey in different segments and the use of methodology Net score. And this is sort of how method how Net scores derived. And so what this chart shows is the percent of customers that responded there were about 125 You I path customers that responded. Are you adopting new U I path? Are you increasing spending in 2020? Are you planning on flat spending or decreasing spending? Are you replacing the platform of beacons? And so basically, we take the green, uh, subtract the read from the green, and that gives us net score. But the point is that Bobby abouts about 80% of your customers are planning to spend Maurin 2020 than they spent in 2019 and only about 6% of planning on spending less, which is fairly astounding. I mean, we've been reporting on this for a while in the heat nous in the in the automation market generally and specifically. But are you seeing this in the marketplace? And maybe you could talk about why? >>Well, we just finished our first fiscal quarter into the end of April, and we're still privately held, so we can be, uh, find some insights of our company, but yeah, the the pace of our business picked up actually in in the mark. April timeframe. Um, customer adoption, large customer adoption. Um, the number of new new companies and new logos were at a record high. And, you know, we're entering into this quarter now, and we have some 20 plus $1,000,000 deals that are like that. It closed, right? I mean, that's probably a 30% increase Versus what? How many we have today alone. Right? So our business, you know, is is now well over 400 million and air are we ended last year, 3 60 and the growth rate continues fast. I think you know what's interesting is that the pace of the recode world was already fast, right? The the luxury of time has kind of disappeared. And so people are thinking about, you know, they don't have they can't wait now, months and years for digital transformation. They have to do things in days and days and days and weeks. And and that's where our technology really comes into play. Right? And and and it actually is also coming to play well in the world of the remote workforce. Reality two of the ability for remote workers to get trained while they're home on automation to build automation pipelines to to build automation. Now, with our latest release, you can download our podcast, capture and report what you're doing, and it basically generates the process definition document and the sample files, which allow for faster implementation by our center of excellence. So what's really happening here? We see it is a sense of urgency coming out of this. Prices are coming down the curve. Hopefully, now this is of urgency that our customers are facing in terms of how they respond, you know, and respond digitally to helping their business out. And it varies a lot by industry, our state and local business was really thinking was not going to be the biggest laggard of any industry picked up in a significant way in the last couple of months, New York State, with Governor Cuomo, became a big customer of ours. There's a quote from L. A County, see Iot that I've got here. They just employed us. It's public, this quote, he said. Deputy CIO said Price is always the mother of invention. We can always carry forward the good things they're coming out of this crisis situation. He's referring to our P A is being a lesson. They learned hearing this, that they're going to carry forward. And so we see this state of Oklahoma became a customer and others. So I think that's that's what we're seeing kind of a broad based. It's worldwide. >>You're really organizations can't put it off anymore. I think you're right. It sort of brought forward the future into the present. Now you mentioned 360 million last year. We had forecast 350 million was pretty good for you guys released, so it's happy about that. But so obviously still a strong trajectory. You know, it might have been higher without without covert. We'll never know, but sort of underscores the strength of the space. Um, and February you guys, there was an article that so you're essentially Theo Dan, Daniel Hernandez was quoted. Is that on hold now? Are you guys still sort of thinking about pressing forward or too early to say right? >>Yeah. I mean, I think I think the reality is we have a very, very strong business. We've raised, you know, significant money from great investors, some of which are the leading VCs in the world. and also that the public company investors and, you know, we have, ah, aggressive plan. We have an aggressive plan to build out our platform for hyper automation to continue. The growth path is now becoming the center of companies of I, T and Digital Strategies, not on the side. Right. And so to do that, you know, we're gonna want capital to help fuel our our our ambitions and fuel Our ability to serve our customers and public markets is probably a very, very logical one. As Daniel mentioned in a in a A recent, uh, he's on Bloomberg that he definitely sees. That is ah, maybe accelerating that, You know, we're late Last year, we started focusing on sustainable growth as a company and operational regular. These are important things in addition to having strong growth that, you know, a long term company has to have in place. And I can tell you, um, I'm really excited about the fact that we, you know, we operate very much like a public company. Now, internally, we you know, we do draft earnings releases that aren't public yet, and we do mock earnings, earnings calls, and we have hired Thomas Hansen is runs our chief revenue officer with storage backgrounds. And so you're gonna interview as well. These are these are these are the best of the best, right? That joint, they're joined this company, they're joining alongside the arm Kalonzo the world that are part of this company. And so I think, Yeah, I think it's an AR It's likely. And and it's gonna We're here to be a long term leader in this decade of automation. >>Well, and one of the other things that we forecast on our breaking analysis we took a look at the total available market kind of like into it. Early days of service Now is you know, people were really not fully understanding the market and chillin C it is is quite large, so video. So when we look at the competition, you know, you guys, if I showed you the same wheel with automation anywhere, it would also look strong. You know, some of the others, maybe not a strong but still stronger than many of the segments. I mean, for instance, you know, on Prem hardware. You know, compared with that and you know the automation space in general across the board is very, very strong. So I wonder if maybe you could talk a little bit about how you guys differentiate from the competition. How you see that? >>Yeah, I think you know, we've We've come a long way in the last three years, right? In terms of becoming the market leader, having the highest market share, we're very open and transparent about our numbers with We've long had the vision of a robot. Every person, uh, and and we've been delivering on that on on that vision and ah, building out a platform that helps companies, you know, transform digitally enterprise wide. Right. So, you know, I don't see any of our competitors with a platform for hyper automation like this. We have an incredible focus on the ability to help people actually find the ideas, build the pipeline, score the pipelines and integrate those with the automation center of excellence. Right? We have the ability now with our latest release to help test automation testers now not only in the world of art A but actually take robotic robots and and architecture into doing test automation. The traditional test automation market in a much better and faster way So you know, we're innovating at a pace that that it is, I think, much faster than I don't. I don't know automation anywhere. I won't share any their numbers. You know, who knows what the numbers are. We have guesses, but I'm fairly certain that we continue to gain share on them. But you know, what's most important is customer adoption, and we've also seen a number of customers switch from some of our competitors to us. Our competitors are undercapitalized and middle. Invest in R and D. This is an investment area, really build a platform out from our competitors have architectures that are hard to upgrade, right? This has been a big source of pain for companies that have been on our competitors. Where upgrades are difficult requires them to retest every time where our upgrades are very rolling, you know, are very smooth. We have an insider program which you know, I don't think any of our competitors have. If you go inside that you had pat that your customer every single bit every single review betting, private preview, public preview and general availability, you can provide feedback on and the customers can score up new ideas. They drive our our roadmap. Right. And this is I think we operate differently. I think our growth is a is a good indication of that. And, you know, and there are new competitors like Microsoft. But I think you know, you know, medium or long term, you know, they're gonna make effort around our, um and you know, they're behind the, um, automation is really hard. The buried entry here is not it's not. Not easy. And we're going to keep me on that platform, play out, and I think that's ah, that's what makes us so different. Um and ah, you know, we have the renewal numbers, retention numbers, expansion numbers and and the revenue numbers to improve that, uh, you know, we're number one. >>Well, so I mean, there's a lot of ways to skin the cat, and you're right. You guys are really focused, you know, you automation anywhere really focused on this space, and you shared with us how you differentiate there. But as you point out Microsoft, they sort of added on I had talked to Allan, preferably the day from paga. You know, those guys don't position themselves as our PC, but they have r p A. I talked to, you know, our mutual friend Robert Young John the other day, right? They're piling onto this this trend, right? So why not? Right, It's it's ah, it's hot. But so, you know, clearly you guys are innovating there. I want to talk about your vision before we get into the latest product release two things that I would call out the term hyper automation with, I think is the Gartner term. And then it will probably stick. And then this this idea of a robot for every person How would you describe your vision? >>Yeah, I mean, we think that robots can and improve, you know, the the lives of of or pers everywhere, right? We think in every every function, every role. And we see that already, the job satisfaction and the people don't want to do the mundane, repetitive work, right? The new hires coming out of college, you know, they're gonna be excel and sequel server. We're no longer the tools of productivity. For them, it's it's your path. We have business. Schools that have committed top tier business schools have committed to deploying your path or to putting you're passing every force in the school these students are graduating with the right path is their most important skill going into companies. And they're gonna expect to be able to use robots within their companies in their daily lives. A swell. So, you know, we have customers today that are rolling out a robot for every person you know. We had Ah, Conoco Phillips on just earlier in our launch, talking about citizen developers, enabling says, developer armies of developers and growing enterprise wide. See, Intel was on as well from Singapore, the large telco. They're doing the exact same thing. So I think you know, I think this is this is this is this is about broad based digital transformation. Everybody participating And what happens is the leading companies to do this, you know, they're going to get the benefit of benefits out of it. It can reinvest that productivity, benefits and data science and analytics and serving customers and in, you know, and and, ah, new product ideas. And so, you know, this is this. You know, automation is going to fuel now the ability for companies to really differentiate and serve their customers better. And it's only needed enterprise wide view on it that you really maximizing. Take Amazon, for example, a great customer during during this prices. You know, they're trying to hire hundreds of thousands of people, right? Help in the fact that in their in their distribution centers elsewhere, this all served demand to help people who like you and I home or ordering things that we need, right? Well, they're use your path robots all throughout their HR hr on boarding HR recruiting HR administration And so helping them has been a big during this prices surge of robots is helping them actually hire workers. You know another example of Schneider Electric and amazing customer of ours. They're bringing their plants, their manufacturing facilities, implants back online faster by using robots to help manage the PPE personal protective equipment in the plant allow people workers to get back to work faster. Right? So what's happening is is, you know in that in those cases is your different examples of robots and different functions, right? In all cases, it's about helping grow a company faster. It's about helping protect workers. It's about helping getting revenue machines back up and running after Kobe is going to be critical to get back to work faster. So I'm I'm really excited about the fact that as people think about automation across the organization, the number of ideas and Aaron opportunities for improvement are are we're just starting to tap that potential. >>Well, this is why I think the vision is so important because you're talking about things that are transformative. Now, as you well know, one of the criticisms of RPS. So you have people, the suppliers and just yeah, we, you know, looking at mundane tasks, just automating mundane tasks like sometimes paving the cow path and say, you're very much aware of that criticism. But if I look at the recent announcements, you're really starting to build out that vision that you just talked about. They're really four takeaways. You sort of extending the core PAP platform, injecting AI end some or and more automation end to end automation really taken that full lifestyles lifecycle systems view and the last one is sort of putting it talks to the robot. For every person that sort of citizen automation, if you will, that sort of encompasses your product announcements. So it wasn't just sort of a point Announcement really is a underscores the platform. I wonder if you could just What do we need to know about you guys? Just that out. >>So we think about how we think about the rolls back to a division of robots person how automation can help different roles. And so this product launch $20 for this large scale launch that you just articulated, um, impacts in a fax and helps many different kinds of new roles Certainly process analysts now who examined processes, passes performance improvements. You know, they're a user of our process mining solution in our past. Find a solution that helps speed on our way. Arpaio engine, no testers and quality engineers. Now they can actually use studio pro and actually used test robots are brand new, and our new test manager is sort of the orchestration and management of test executions. Now they can participate in in leveraged power of robots and what they do as well. And we kind of think about that, you know, kind of across the board in our organization across the platform. They can use tools like you have path insights in Europe. If you're an analyst or your, uh ah. B I, this intelligence person really know what's going on with robots in terms of our wife for my organization and provide that up to the, you know, sea levels in the board of directors in real time. So I think that's that's the big part. Here is we're bringing, and we're helping bring in many, many different kinds of roles different kinds of people. Data scientist. You mentioned AI. Now data scientists can build a model. The models applied to ai fabric an orchestrator. It's drag and drop by our developer in studio, and now you can turn, you know, a a mundane, rules based task right into an experience based ones where a robot can help make a decision right. Based on experience and data, they can tweak and tune that model and data scientists can interact, you know, with the automation is flowing through your path. So I think that's how we think about it, right? You know, one of the great new capabilities, as well as the ability to engage line workers, dispatch out workers If you're a telco or or retail story retail store workers you know the robots can work with humans out in the field. We've got one real large manufacturer with 18,000 drivers in a DST direct store delivery scenario. And you know the ability for them to interact with robots and help them do their job in the field. Our customers better after the list data entry and data manipulation, multiple systems. So I this is this makes us very unique in our vision and in our execution. And again, I don't I have not heard of a single ah example by competitors that has any kind of a vision or articulation to be able to help a company enterprise wide and, you know, with the speed and the and the full, full vision that we have. >>Okay, so you're not worried about downturns. You can't control black swans Anyway, you're not worried about the competition. It feels like you know, you're worried about what you're worried about. You want about growing too fast. Additionally, deploying the the capital that you've raised. What worries you? >>Yeah. You know, we're paranoid or paranoid company, right? And when it comes to the market and and trying to drive, I think we've done a lot to help actually push the rock up the hill in terms of really, really driving our market, building the market, and we want to continue that right and not let up. So there's this kind of desire to never let up, right? Well, we always remind ourselves we must work harder, must work harder. We must work harder. And that's that's That's sort of this this mentality around ourselves, by the smartest people. Hire the smartest people you work with our customers, our customers are priority. Do that with really high excellence and really high sincerity that it comes through and everything that we do, you know, to build a world class operation to be, you know, Daniel DNS. When I first met him, he said, You know, I really want to be the enemy of the great news ecology company that serve customers really well. And it was amazing things for society, and and, you know, we're on that track, but we've got, you know, we're in the in the in the early innings. So, you know, making sure that we also run our business in a way that, um, you know, uh, is ready to be Ah, you know, publicly successful company on being able to raise new sources of capital to fund our ambitions and our ideas. I mean, you saw the number of announcements from our 24 release. It reminded me of an AWS re invent conference, where it's just innovation, innovation, innovation, innovation. And these are very real. They're not made up mythical announcements that some of our competitors do about launching some kind of discovery box doesn't exist, right? These are very real with real customers behind them, and and so you know, just doing that with the same level of tenacity. But being, you know, old, fast, immersed and humble, which are four core culture values along the way and not losing that Azeri grow. That's that's something we talk about maintaining that culture that's super critical to us. >>Everybody's talking about Okay, What What's gonna be permanent? Postpone it. I was just listening to Julie Sweet, CEO of Accenture, and she was saying that, you know, prior to Covic, they had data that showed that the top 25% of companies that have leaned into digital transformation were outperforming. You know, the balance of their peers, and I know question now that the the rest of that base really is going to be focused on automation. Automation is is really going to be one of those things that is high, high priority now and really for the next decade and beyond. So, Bobby, thanks so much for coming on the Cube and supporting us in this in this r p. A drill down. Really appreciate it, >>Dave. It's always a pleasure as always. Great to see you. Thank you. >>Alright. And thank you for watching everybody. Dave Volante. We'll be right back right after this short break. You're watching the cube. >>Yeah, yeah, yeah, yeah.

Published Date : May 21 2020

SUMMARY :

I path live the release show brought to you by you. Good to see you again. It's always a pleasure to be on the Cube and even in the virtual format, So, you know, last year at forward, we talked about the possibility So I think you know, it's it's definitely raise The awareness I want to share some data with you in our community is the first time we've we've shown this. So our business, you know, is is now well over 400 Um, and February you guys, there was an article that so you're essentially I'm really excited about the fact that we, you know, we operate very much like a public company. Early days of service Now is you know, people were really not fully understanding numbers to improve that, uh, you know, we're number one. our PC, but they have r p A. I talked to, you know, our mutual friend Robert Young Yeah, I mean, we think that robots can and improve, you know, yeah, we, you know, looking at mundane tasks, just automating mundane tasks like sometimes And we kind of think about that, you know, kind of across the board in our organization across the It feels like you know, you're worried about what you're worried about. and and so you know, just doing that with the same level of tenacity. CEO of Accenture, and she was saying that, you know, prior to Covic, Great to see you. And thank you for watching everybody.

ENTITIES

Entity	Category	Confidence
Susan Wojcicki	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Jim	PERSON	0.99+
Jason	PERSON	0.99+
Tara Hernandez	PERSON	0.99+
David Floyer	PERSON	0.99+
Dave	PERSON	0.99+
Lena Smart	PERSON	0.99+
John Troyer	PERSON	0.99+
Mark Porter	PERSON	0.99+
Mellanox	ORGANIZATION	0.99+
Kevin Deierling	PERSON	0.99+
Marty Lans	PERSON	0.99+
Tara	PERSON	0.99+
John	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Jim Jackson	PERSON	0.99+
Jason Newton	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Daniel Hernandez	PERSON	0.99+
Dave Winokur	PERSON	0.99+
Daniel	PERSON	0.99+
Lena	PERSON	0.99+
Meg Whitman	PERSON	0.99+
Telco	ORGANIZATION	0.99+
Julie Sweet	PERSON	0.99+
Marty	PERSON	0.99+
Yaron Haviv	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Western Digital	ORGANIZATION	0.99+
Kayla Nelson	PERSON	0.99+
Mike Piech	PERSON	0.99+
Jeff	PERSON	0.99+
Dave Volante	PERSON	0.99+
John Walls	PERSON	0.99+
Keith Townsend	PERSON	0.99+
five	QUANTITY	0.99+
Ireland	LOCATION	0.99+
Antonio	PERSON	0.99+
Daniel Laury	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
six	QUANTITY	0.99+
Todd Kerry	PERSON	0.99+
John Furrier	PERSON	0.99+
$20	QUANTITY	0.99+
Mike	PERSON	0.99+
January 30th	DATE	0.99+
Meg	PERSON	0.99+
Mark Little	PERSON	0.99+
Luke Cerney	PERSON	0.99+
Peter	PERSON	0.99+
Jeff Basil	PERSON	0.99+
Stu Miniman	PERSON	0.99+
Dan	PERSON	0.99+
10	QUANTITY	0.99+
Allan	PERSON	0.99+
40 gig	QUANTITY	0.99+

Data Science for All: It's a Whole New Game

>> There's a movement that's sweeping across businesses everywhere here in this country and around the world. And it's all about data. Today businesses are being inundated with data. To the tune of over two and a half million gigabytes that'll be generated in the next 60 seconds alone. What do you do with all that data? To extract insights you typically turn to a data scientist. But not necessarily anymore. At least not exclusively. Today the ability to extract value from data is becoming a shared mission. A team effort that spans the organization extending far more widely than ever before. Today, data science is being democratized. >> Data Sciences for All: It's a Whole New Game. >> Welcome everyone, I'm Katie Linendoll. I'm a technology expert writer and I love reporting on all things tech. My fascination with tech started very young. I began coding when I was 12. Received my networking certs by 18 and a degree in IT and new media from Rochester Institute of Technology. So as you can tell, technology has always been a sure passion of mine. Having grown up in the digital age, I love having a career that keeps me at the forefront of science and technology innovations. I spend equal time in the field being hands on as I do on my laptop conducting in depth research. Whether I'm diving underwater with NASA astronauts, witnessing the new ways which mobile technology can help rebuild the Philippine's economy in the wake of super typhoons, or sharing a first look at the newest iPhones on The Today Show, yesterday, I'm always on the hunt for the latest and greatest tech stories. And that's what brought me here. I'll be your host for the next hour and as we explore the new phenomenon that is taking businesses around the world by storm. And data science continues to become democratized and extends beyond the domain of the data scientist. And why there's also a mandate for all of us to become data literate. Now that data science for all drives our AI culture. And we're going to be able to take to the streets and go behind the scenes as we uncover the factors that are fueling this phenomenon and giving rise to a movement that is reshaping how businesses leverage data. And putting organizations on the road to AI. So coming up, I'll be doing interviews with data scientists. We'll see real world demos and take a look at how IBM is changing the game with an open data science platform. We'll also be joined by legendary statistician Nate Silver, founder and editor-in-chief of FiveThirtyEight. Who will shed light on how a data driven mindset is changing everything from business to our culture. We also have a few people who are joining us in our studio, so thank you guys for joining us. Come on, I can do better than that, right? Live studio audience, the fun stuff. And for all of you during the program, I want to remind you to join that conversation on social media using the hashtag DSforAll, it's data science for all. Share your thoughts on what data science and AI means to you and your business. And, let's dive into a whole new game of data science. Now I'd like to welcome my co-host General Manager IBM Analytics, Rob Thomas. >> Hello, Katie. >> Come on guys. >> Yeah, seriously. >> No one's allowed to be quiet during this show, okay? >> Right. >> Or, I'll start calling people out. So Rob, thank you so much. I think you know this conversation, we're calling it a data explosion happening right now. And it's nothing new. And when you and I chatted about it. You've been talking about this for years. You have to ask, is this old news at this point? >> Yeah, I mean, well first of all, the data explosion is not coming, it's here. And everybody's in the middle of it right now. What is different is the economics have changed. And the scale and complexity of the data that organizations are having to deal with has changed. And to this day, 80% of the data in the world still sits behind corporate firewalls. So, that's becoming a problem. It's becoming unmanageable. IT struggles to manage it. The business can't get everything they need. Consumers can't consume it when they want. So we have a challenge here. >> It's challenging in the world of unmanageable. Crazy complexity. If I'm sitting here as an IT manager of my business, I'm probably thinking to myself, this is incredibly frustrating. How in the world am I going to get control of all this data? And probably not just me thinking it. Many individuals here as well. >> Yeah, indeed. Everybody's thinking about how am I going to put data to work in my organization in a way I haven't done before. Look, you've got to have the right expertise, the right tools. The other thing that's happening in the market right now is clients are dealing with multi cloud environments. So data behind the firewall in private cloud, multiple public clouds. And they have to find a way. How am I going to pull meaning out of this data? And that brings us to data science and AI. That's how you get there. >> I understand the data science part but I think we're all starting to hear more about AI. And it's incredible that this buzz word is happening. How do businesses adopt to this AI growth and boom and trend that's happening in this world right now? >> Well, let me define it this way. Data science is a discipline. And machine learning is one technique. And then AI puts both machine learning into practice and applies it to the business. So this is really about how getting your business where it needs to go. And to get to an AI future, you have to lay a data foundation today. I love the phrase, "there's no AI without IA." That means you're not going to get to AI unless you have the right information architecture to start with. >> Can you elaborate though in terms of how businesses can really adopt AI and get started. >> Look, I think there's four things you have to do if you're serious about AI. One is you need a strategy for data acquisition. Two is you need a modern data architecture. Three is you need pervasive automation. And four is you got to expand job roles in the organization. >> Data acquisition. First pillar in this you just discussed. Can we start there and explain why it's so critical in this process? >> Yeah, so let's think about how data acquisition has evolved through the years. 15 years ago, data acquisition was about how do I get data in and out of my ERP system? And that was pretty much solved. Then the mobile revolution happens. And suddenly you've got structured and non-structured data. More than you've ever dealt with. And now you get to where we are today. You're talking terabytes, petabytes of data. >> [Katie] Yottabytes, I heard that word the other day. >> I heard that too. >> Didn't even know what it meant. >> You know how many zeros that is? >> I thought we were in Star Wars. >> Yeah, I think it's a lot of zeroes. >> Yodabytes, it's new. >> So, it's becoming more and more complex in terms of how you acquire data. So that's the new data landscape that every client is dealing with. And if you don't have a strategy for how you acquire that and manage it, you're not going to get to that AI future. >> So a natural segue, if you are one of these businesses, how do you build for the data landscape? >> Yeah, so the question I always hear from customers is we need to evolve our data architecture to be ready for AI. And the way I think about that is it's really about moving from static data repositories to more of a fluid data layer. >> And we continue with the architecture. New data architecture is an interesting buzz word to hear. But it's also one of the four pillars. So if you could dive in there. >> Yeah, I mean it's a new twist on what I would call some core data science concepts. For example, you have to leverage tools with a modern, centralized data warehouse. But your data warehouse can't be stagnant to just what's right there. So you need a way to federate data across different environments. You need to be able to bring your analytics to the data because it's most efficient that way. And ultimately, it's about building an optimized data platform that is designed for data science and AI. Which means it has to be a lot more flexible than what clients have had in the past. >> All right. So we've laid out what you need for driving automation. But where does the machine learning kick in? >> Machine learning is what gives you the ability to automate tasks. And I think about machine learning. It's about predicting and automating. And this will really change the roles of data professionals and IT professionals. For example, a data scientist cannot possibly know every algorithm or every model that they could use. So we can automate the process of algorithm selection. Another example is things like automated data matching. Or metadata creation. Some of these things may not be exciting but they're hugely practical. And so when you think about the real use cases that are driving return on investment today, it's things like that. It's automating the mundane tasks. >> Let's go ahead and come back to something that you mentioned earlier because it's fascinating to be talking about this AI journey, but also significant is the new job roles. And what are those other participants in the analytics pipeline? >> Yeah I think we're just at the start of this idea of new job roles. We have data scientists. We have data engineers. Now you see machine learning engineers. Application developers. What's really happening is that data scientists are no longer allowed to work in their own silo. And so the new job roles is about how does everybody have data first in their mind? And then they're using tools to automate data science, to automate building machine learning into applications. So roles are going to change dramatically in organizations. >> I think that's confusing though because we have several organizations who saying is that highly specialized roles, just for data science? Or is it applicable to everybody across the board? >> Yeah, and that's the big question, right? Cause everybody's thinking how will this apply? Do I want this to be just a small set of people in the organization that will do this? But, our view is data science has to for everybody. It's about bring data science to everybody as a shared mission across the organization. Everybody in the company has to be data literate. And participate in this journey. >> So overall, group effort, has to be a common goal, and we all need to be data literate across the board. >> Absolutely. >> Done deal. But at the end of the day, it's kind of not an easy task. >> It's not. It's not easy but it's maybe not as big of a shift as you would think. Because you have to put data in the hands of people that can do something with it. So, it's very basic. Give access to data. Data's often locked up in a lot of organizations today. Give people the right tools. Embrace the idea of choice or diversity in terms of those tools. That gets you started on this path. >> It's interesting to hear you say essentially you need to train everyone though across the board when it comes to data literacy. And I think people that are coming into the work force don't necessarily have a background or a degree in data science. So how do you manage? >> Yeah, so in many cases that's true. I will tell you some universities are doing amazing work here. One example, University of California Berkeley. They offer a course for all majors. So no matter what you're majoring in, you have a course on foundations of data science. How do you bring data science to every role? So it's starting to happen. We at IBM provide data science courses through CognitiveClass.ai. It's for everybody. It's free. And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. The key point is this though. It's more about attitude than it is aptitude. I think anybody can figure this out. But it's about the attitude to say we're putting data first and we're going to figure out how to make this real in our organization. >> I also have to give a shout out to my alma mater because I have heard that there is an offering in MS in data analytics. And they are always on the forefront of new technologies and new majors and on trend. And I've heard that the placement behind those jobs, people graduating with the MS is high. >> I'm sure it's very high. >> So go Tigers. All right, tangential. Let me get back to something else you touched on earlier because you mentioned that a number of customers ask you how in the world do I get started with AI? It's an overwhelming question. Where do you even begin? What do you tell them? >> Yeah, well things are moving really fast. But the good thing is most organizations I see, they're already on the path, even if they don't know it. They might have a BI practice in place. They've got data warehouses. They've got data lakes. Let me give you an example. AMC Networks. They produce a lot of the shows that I'm sure you watch Katie. >> [Katie] Yes, Breaking Bad, Walking Dead, any fans? >> [Rob] Yeah, we've got a few. >> [Katie] Well you taught me something I didn't even know. Because it's amazing how we have all these different industries, but yet media in itself is impacted too. And this is a good example. >> Absolutely. So, AMC Networks, think about it. They've got ads to place. They want to track viewer behavior. What do people like? What do they dislike? So they have to optimize every aspect of their business from marketing campaigns to promotions to scheduling to ads. And their goal was transform data into business insights and really take the burden off of their IT team that was heavily burdened by obviously a huge increase in data. So their VP of BI took the approach of using machine learning to process large volumes of data. They used a platform that was designed for AI and data processing. It's the IBM analytics system where it's a data warehouse, data science tools are built in. It has in memory data processing. And just like that, they were ready for AI. And they're already seeing that impact in their business. >> Do you think a movement of that nature kind of presses other media conglomerates and organizations to say we need to be doing this too? >> I think it's inevitable that everybody, you're either going to be playing, you're either going to be leading, or you'll be playing catch up. And so, as we talk to clients we think about how do you start down this path now, even if you have to iterate over time? Because otherwise you're going to wake up and you're going to be behind. >> One thing worth noting is we've talked about analytics to the data. It's analytics first to the data, not the other way around. >> Right. So, look. We as a practice, we say you want to bring data to where the data sits. Because it's a lot more efficient that way. It gets you better outcomes in terms of how you train models and it's more efficient. And we think that leads to better outcomes. Other organization will say, "Hey move the data around." And everything becomes a big data movement exercise. But once an organization has started down this path, they're starting to get predictions, they want to do it where it's really easy. And that means analytics applied right where the data sits. >> And worth talking about the role of the data scientist in all of this. It's been called the hot job of the decade. And a Harvard Business Review even dubbed it the sexiest job of the 21st century. >> Yes. >> I want to see this on the cover of Vogue. Like I want to see the first data scientist. Female preferred, on the cover of Vogue. That would be amazing. >> Perhaps you can. >> People agree. So what changes for them? Is this challenging in terms of we talk data science for all. Where do all the data science, is it data science for everyone? And how does it change everything? >> Well, I think of it this way. AI gives software super powers. It really does. It changes the nature of software. And at the center of that is data scientists. So, a data scientist has a set of powers that they've never had before in any organization. And that's why it's a hot profession. Now, on one hand, this has been around for a while. We've had actuaries. We've had statisticians that have really transformed industries. But there are a few things that are new now. We have new tools. New languages. Broader recognition of this need. And while it's important to recognize this critical skill set, you can't just limit it to a few people. This is about scaling it across the organization. And truly making it accessible to all. >> So then do we need more data scientists? Or is this something you train like you said, across the board? >> Well, I think you want to do a little bit of both. We want more. But, we can also train more and make the ones we have more productive. The way I think about it is there's kind of two markets here. And we call it clickers and coders. >> [Katie] I like that. That's good. >> So, let's talk about what that means. So clickers are basically somebody that wants to use tools. Create models visually. It's drag and drop. Something that's very intuitive. Those are the clickers. Nothing wrong with that. It's been valuable for years. There's a new crop of data scientists. They want to code. They want to build with the latest open source tools. They want to write in Python or R. These are the coders. And both approaches are viable. Both approaches are critical. Organizations have to have a way to meet the needs of both of those types. And there's not a lot of things available today that do that. >> Well let's keep going on that. Because I hear you talking about the data scientists role and how it's critical to success, but with the new tools, data science and analytics skills can extend beyond the domain of just the data scientist. >> That's right. So look, we're unifying coders and clickers into a single platform, which we call IBM Data Science Experience. And as the demand for data science expertise grows, so does the need for these kind of tools. To bring them into the same environment. And my view is if you have the right platform, it enables the organization to collaborate. And suddenly you've changed the nature of data science from an individual sport to a team sport. >> So as somebody that, my background is in IT, the question is really is this an additional piece of what IT needs to do in 2017 and beyond? Or is it just another line item to the budget? >> So I'm afraid that some people might view it that way. As just another line item. But, I would challenge that and say data science is going to reinvent IT. It's going to change the nature of IT. And every organization needs to think about what are the skills that are critical? How do we engage a broader team to do this? Because once they get there, this is the chance to reinvent how they're performing IT. >> [Katie] Challenging or not? >> Look it's all a big challenge. Think about everything IT organizations have been through. Some of them were late to things like mobile, but then they caught up. Some were late to cloud, but then they caught up. I would just urge people, don't be late to data science. Use this as your chance to reinvent IT. Start with this notion of clickers and coders. This is a seminal moment. Much like mobile and cloud was. So don't be late. >> And I think it's critical because it could be so costly to wait. And Rob and I were even chatting earlier how data analytics is just moving into all different kinds of industries. And I can tell you even personally being effected by how important the analysis is in working in pediatric cancer for the last seven years. I personally implement virtual reality headsets to pediatric cancer hospitals across the country. And it's great. And it's working phenomenally. And the kids are amazed. And the staff is amazed. But the phase two of this project is putting in little metrics in the hardware that gather the breathing, the heart rate to show that we have data. Proof that we can hand over to the hospitals to continue making this program a success. So just in-- >> That's a great example. >> An interesting example. >> Saving lives? >> Yes. >> That's also applying a lot of what we talked about. >> Exciting stuff in the world of data science. >> Yes. Look, I just add this is an existential moment for every organization. Because what you do in this area is probably going to define how competitive you are going forward. And think about if you don't do something. What if one of your competitors goes and creates an application that's more engaging with clients? So my recommendation is start small. Experiment. Learn. Iterate on projects. Define the business outcomes. Then scale up. It's very doable. But you've got to take the first step. >> First step always critical. And now we're going to get to the fun hands on part of our story. Because in just a moment we're going to take a closer look at what data science can deliver. And where organizations are trying to get to. All right. Thank you Rob and now we've been joined by Siva Anne who is going to help us navigate this demo. First, welcome Siva. Give him a big round of applause. Yeah. All right, Rob break down what we're going to be looking at. You take over this demo. >> All right. So this is going to be pretty interesting. So Siva is going to take us through. So he's going to play the role of a financial adviser. Who wants to help better serve clients through recommendations. And I'm going to really illustrate three things. One is how do you federate data from multiple data sources? Inside the firewall, outside the firewall. How do you apply machine learning to predict and to automate? And then how do you move analytics closer to your data? So, what you're seeing here is a custom application for an investment firm. So, Siva, our financial adviser, welcome. So you can see at the top, we've got market data. We pulled that from an external source. And then we've got Siva's calendar in the middle. He's got clients on the right side. So page down, what else do you see down there Siva? >> [Siva] I can see the recent market news. And in here I can see that JP Morgan is calling for a US dollar rebound in the second half of the year. And, I have upcoming meeting with Leo Rakes. I can get-- >> [Rob] So let's go in there. Why don't you click on Leo Rakes. So, you're sitting at your desk, you're deciding how you're going to spend the day. You know you have a meeting with Leo. So you click on it. You immediately see, all right, so what do we know about him? We've got data governance implemented. So we know his age, we know his degree. We can see he's not that aggressive of a trader. Only six trades in the last few years. But then where it gets interesting is you go to the bottom. You start to see predicted industry affinity. Where did that come from? How do we have that? >> [Siva] So these green lines and red arrows here indicate the trending affinity of Leo Rakes for particular industry stocks. What we've done here is we've built machine learning models using customer's demographic data, his stock portfolios, and browsing behavior to build a model which can predict his affinity for a particular industry. >> [Rob] Interesting. So, I like to think of this, we call it celebrity experiences. So how do you treat every customer like they're a celebrity? So to some extent, we're reading his mind. Because without asking him, we know that he's going to have an affinity for auto stocks. So we go down. Now we look at his portfolio. You can see okay, he's got some different holdings. He's got Amazon, Google, Apple, and then he's got RACE, which is the ticker for Ferrari. You can see that's done incredibly well. And so, as a financial adviser, you look at this and you say, all right, we know he loves auto stocks. Ferrari's done very well. Let's create a hedge. Like what kind of security would interest him as a hedge against his position for Ferrari? Could we go figure that out? >> [Siva] Yes. Given I know that he's gotten an affinity for auto stocks, and I also see that Ferrari has got some terminus gains, I want to lock in these gains by hedging. And I want to do that by picking a auto stock which has got negative correlation with Ferrari. >> [Rob] So this is where we get to the idea of in database analytics. Cause you start clicking that and immediately we're getting instant answers of what's happening. So what did we find here? We're going to compare Ferrari and Honda. >> [Siva] I'm going to compare Ferrari with Honda. And what I see here instantly is that Honda has got a negative correlation with Ferrari, which makes it a perfect mix for his stock portfolio. Given he has an affinity for auto stocks and it correlates negatively with Ferrari. >> [Rob] These are very powerful tools at the hand of a financial adviser. You think about it. As a financial adviser, you wouldn't think about federating data, machine learning, pretty powerful. >> [Siva] Yes. So what we have seen here is that using the common SQL engine, we've been able to federate queries across multiple data sources. Db2 Warehouse in the cloud, IBM's Integrated Analytic System, and Hortonworks powered Hadoop platform for the new speeds. We've been able to use machine learning to derive innovative insights about his stock affinities. And drive the machine learning into the appliance. Closer to where the data resides to deliver high performance analytics. >> [Rob] At scale? >> [Siva] We're able to run millions of these correlations across stocks, currency, other factors. And even score hundreds of customers for their affinities on a daily basis. >> That's great. Siva, thank you for playing the role of financial adviser. So I just want to recap briefly. Cause this really powerful technology that's really simple. So we federated, we aggregated multiple data sources from all over the web and internal systems. And public cloud systems. Machine learning models were built that predicted Leo's affinity for a certain industry. In this case, automotive. And then you see when you deploy analytics next to your data, even a financial adviser, just with the click of a button is getting instant answers so they can go be more productive in their next meeting. This whole idea of celebrity experiences for your customer, that's available for everybody, if you take advantage of these types of capabilities. Katie, I'll hand it back to you. >> Good stuff. Thank you Rob. Thank you Siva. Powerful demonstration on what we've been talking about all afternoon. And thank you again to Siva for helping us navigate. Should be give him one more round of applause? We're going to be back in just a moment to look at how we operationalize all of this data. But in first, here's a message from me. If you're a part of a line of business, your main fear is disruption. You know data is the new goal that can create huge amounts of value. So does your competition. And they may be beating you to it. You're convinced there are new business models and revenue sources hidden in all the data. You just need to figure out how to leverage it. But with the scarcity of data scientists, you really can't rely solely on them. You may need more people throughout the organization that have the ability to extract value from data. And as a data science leader or data scientist, you have a lot of the same concerns. You spend way too much time looking for, prepping, and interpreting data and waiting for models to train. You know you need to operationalize the work you do to provide business value faster. What you want is an easier way to do data prep. And rapidly build models that can be easily deployed, monitored and automatically updated. So whether you're a data scientist, data science leader, or in a line of business, what's the solution? What'll it take to transform the way you work? That's what we're going to explore next. All right, now it's time to delve deeper into the nuts and bolts. The nitty gritty of operationalizing data science and creating a data driven culture. How do you actually do that? Well that's what these experts are here to share with us. I'm joined by Nir Kaldero, who's head of data science at Galvanize, which is an education and training organization. Tricia Wang, who is co-founder of Sudden Compass, a consultancy that helps companies understand people with data. And last, but certainly not least, Michael Li, founder and CEO of Data Incubator, which is a data science train company. All right guys. Shall we get right to it? >> All right. >> So data explosion happening right now. And we are seeing it across the board. I just shared an example of how it's impacting my philanthropic work in pediatric cancer. But you guys each have so many unique roles in your business life. How are you seeing it just blow up in your fields? Nir, your thing? >> Yeah, for example like in Galvanize we train many Fortune 500 companies. And just by looking at the demand of companies that wants us to help them go through this digital transformation is mind-blowing. Data point by itself. >> Okay. Well what we're seeing what's going on is that data science like as a theme, is that it's actually for everyone now. But what's happening is that it's actually meeting non technical people. But what we're seeing is that when non technical people are implementing these tools or coming at these tools without a base line of data literacy, they're often times using it in ways that distance themselves from the customer. Because they're implementing data science tools without a clear purpose, without a clear problem. And so what we do at Sudden Compass is that we work with companies to help them embrace and understand the complexity of their customers. Because often times they are misusing data science to try and flatten their understanding of the customer. As if you can just do more traditional marketing. Where you're putting people into boxes. And I think the whole ROI of data is that you can now understand people's relationships at a much more complex level at a greater scale before. But we have to do this with basic data literacy. And this has to involve technical and non technical people. >> Well you can have all the data in the world, and I think it speaks to, if you're not doing the proper movement with it, forget it. It means nothing at the same time. >> No absolutely. I mean, I think that when you look at the huge explosion in data, that comes with it a huge explosion in data experts. Right, we call them data scientists, data analysts. And sometimes they're people who are very, very talented, like the people here. But sometimes you have people who are maybe re-branding themselves, right? Trying to move up their title one notch to try to attract that higher salary. And I think that that's one of the things that customers are coming to us for, right? They're saying, hey look, there are a lot of people that call themselves data scientists, but we can't really distinguish. So, we have sort of run a fellowship where you help companies hire from a really talented group of folks, who are also truly data scientists and who know all those kind of really important data science tools. And we also help companies internally. Fortune 500 companies who are looking to grow that data science practice that they have. And we help clients like McKinsey, BCG, Bain, train up their customers, also their clients, also their workers to be more data talented. And to build up that data science capabilities. >> And Nir, this is something you work with a lot. A lot of Fortune 500 companies. And when we were speaking earlier, you were saying many of these companies can be in a panic. >> Yeah. >> Explain that. >> Yeah, so you know, not all Fortune 500 companies are fully data driven. And we know that the winners in this fourth industrial revolution, which I like to call the machine intelligence revolution, will be companies who navigate and transform their organization to unlock the power of data science and machine learning. And the companies that are not like that. Or not utilize data science and predictive power well, will pretty much get shredded. So they are in a panic. >> Tricia, companies have to deal with data behind the firewall and in the new multi cloud world. How do organizations start to become driven right to the core? >> I think the most urgent question to become data driven that companies should be asking is how do I bring the complex reality that our customers are experiencing on the ground in to a corporate office? Into the data models. So that question is critical because that's how you actually prevent any big data disasters. And that's how you leverage big data. Because when your data models are really far from your human models, that's when you're going to do things that are really far off from how, it's going to not feel right. That's when Tesco had their terrible big data disaster that they're still recovering from. And so that's why I think it's really important to understand that when you implement big data, you have to further embrace thick data. The qualitative, the emotional stuff, that is difficult to quantify. But then comes the difficult art and science that I think is the next level of data science. Which is that getting non technical and technical people together to ask how do we find those unknown nuggets of insights that are difficult to quantify? Then, how do we do the next step of figuring out how do you mathematically scale those insights into a data model? So that actually is reflective of human understanding? And then we can start making decisions at scale. But you have to have that first. >> That's absolutely right. And I think that when we think about what it means to be a data scientist, right? I always think about it in these sort of three pillars. You have the math side. You have to have that kind of stats, hardcore machine learning background. You have the programming side. You don't work with small amounts of data. You work with large amounts of data. You've got to be able to type the code to make those computers run. But then the last part is that human element. You have to understand the domain expertise. You have to understand what it is that I'm actually analyzing. What's the business proposition? And how are the clients, how are the users actually interacting with the system? That human element that you were talking about. And I think having somebody who understands all of those and not just in isolation, but is able to marry that understanding across those different topics, that's what makes a data scientist. >> But I find that we don't have people with those skill sets. And right now the way I see teams being set up inside companies is that they're creating these isolated data unicorns. These data scientists that have graduated from your programs, which are great. But, they don't involve the people who are the domain experts. They don't involve the designers, the consumer insight people, the people, the salespeople. The people who spend time with the customers day in and day out. Somehow they're left out of the room. They're consulted, but they're not a stakeholder. >> Can I actually >> Yeah, yeah please. >> Can I actually give a quick example? So for example, we at Galvanize train the executives and the managers. And then the technical people, the data scientists and the analysts. But in order to actually see all of the RY behind the data, you also have to have a creative fluid conversation between non technical and technical people. And this is a major trend now. And there's a major gap. And we need to increase awareness and kind of like create a new, kind of like environment where technical people also talks seamlessly with non technical ones. >> [Tricia] We call-- >> That's one of the things that we see a lot. Is one of the trends in-- >> A major trend. >> data science training is it's not just for the data science technical experts. It's not just for one type of person. So a lot of the training we do is sort of data engineers. People who are more on the software engineering side learning more about the stats of math. And then people who are sort of traditionally on the stat side learning more about the engineering. And then managers and people who are data analysts learning about both. >> Michael, I think you said something that was of interest too because I think we can look at IBM Watson as an example. And working in healthcare. The human component. Because often times we talk about machine learning and AI, and data and you get worried that you still need that human component. Especially in the world of healthcare. And I think that's a very strong point when it comes to the data analysis side. Is there any particular example you can speak to of that? >> So I think that there was this really excellent paper a while ago talking about all the neuro net stuff and trained on textual data. So looking at sort of different corpuses. And they found that these models were highly, highly sexist. They would read these corpuses and it's not because neuro nets themselves are sexist. It's because they're reading the things that we write. And it turns out that we write kind of sexist things. And they would sort of find all these patterns in there that were sort of latent, that had a lot of sort of things that maybe we would cringe at if we sort of saw. And I think that's one of the really important aspects of the human element, right? It's being able to come in and sort of say like, okay, I know what the biases of the system are, I know what the biases of the tools are. I need to figure out how to use that to make the tools, make the world a better place. And like another area where this comes up all the time is lending, right? So the federal government has said, and we have a lot of clients in the financial services space, so they're constantly under these kind of rules that they can't make discriminatory lending practices based on a whole set of protected categories. Race, sex, gender, things like that. But, it's very easy when you train a model on credit scores to pick that up. And then to have a model that's inadvertently sexist or racist. And that's where you need the human element to come back in and say okay, look, you're using the classic example would be zip code, you're using zip code as a variable. But when you look at it, zip codes actually highly correlated with race. And you can't do that. So you may inadvertently by sort of following the math and being a little naive about the problem, inadvertently introduce something really horrible into a model and that's where you need a human element to sort of step in and say, okay hold on. Slow things down. This isn't the right way to go. >> And the people who have -- >> I feel like, I can feel her ready to respond. >> Yes, I'm ready. >> She's like let me have at it. >> And the people here it is. And the people who are really great at providing that human intelligence are social scientists. We are trained to look for bias and to understand bias in data. Whether it's quantitative or qualitative. And I really think that we're going to have less of these kind of problems if we had more integrated teams. If it was a mandate from leadership to say no data science team should be without a social scientist, ethnographer, or qualitative researcher of some kind, to be able to help see these biases. >> The talent piece is actually the most crucial-- >> Yeah. >> one here. If you look about how to enable machine intelligence in organization there are the pillars that I have in my head which is the culture, the talent and the technology infrastructure. And I believe and I saw in working very closely with the Fortune 100 and 200 companies that the talent piece is actually the most important crucial hard to get. >> [Tricia] I totally agree. >> It's absolutely true. Yeah, no I mean I think that's sort of like how we came up with our business model. Companies were basically saying hey, I can't hire data scientists. And so we have a fellowship where we get 2,000 applicants each quarter. We take the top 2% and then we sort of train them up. And we work with hiring companies who then want to hire from that population. And so we're sort of helping them solve that problem. And the other half of it is really around training. Cause with a lot of industries, especially if you're sort of in a more regulated industry, there's a lot of nuances to what you're doing. And the fastest way to develop that data science or AI talent may not necessarily be to hire folks who are coming out of a PhD program. It may be to take folks internally who have a lot of that domain knowledge that you have and get them trained up on those data science techniques. So we've had large insurance companies come to us and say hey look, we hire three or four folks from you a quarter. That doesn't move the needle for us. What we really need is take the thousand actuaries and statisticians that we have and get all of them trained up to become a data scientist and become data literate in this new open source world. >> [Katie] Go ahead. >> All right, ladies first. >> Go ahead. >> Are you sure? >> No please, fight first. >> Go ahead. >> Go ahead Nir. >> So this is actually a trend that we have been seeing in the past year or so that companies kind of like start to look how to upscale and look for talent within the organization. So they can actually move them to become more literate and navigate 'em from analyst to data scientist. And from data scientist to machine learner. So this is actually a trend that is happening already for a year or so. >> Yeah, but I also find that after they've gone through that training in getting people skilled up in data science, the next problem that I get is executives coming to say we've invested in all of this. We're still not moving the needle. We've already invested in the right tools. We've gotten the right skills. We have enough scale of people who have these skills. Why are we not moving the needle? And what I explain to them is look, you're still making decisions in the same way. And you're still not involving enough of the non technical people. Especially from marketing, which is now, the CMO's are much more responsible for driving growth in their companies now. But often times it's so hard to change the old way of marketing, which is still like very segmentation. You know, demographic variable based, and we're trying to move people to say no, you have to understand the complexity of customers and not put them in boxes. >> And I think underlying a lot of this discussion is this question of culture, right? >> Yes. >> Absolutely. >> How do you build a data driven culture? And I think that that culture question, one of the ways that comes up quite often in especially in large, Fortune 500 enterprises, is that they are very, they're not very comfortable with sort of example, open source architecture. Open source tools. And there is some sort of residual bias that that's somehow dangerous. So security vulnerability. And I think that that's part of the cultural challenge that they often have in terms of how do I build a more data driven organization? Well a lot of the talent really wants to use these kind of tools. And I mean, just to give you an example, we are partnering with one of the major cloud providers to sort of help make open source tools more user friendly on their platform. So trying to help them attract the best technologists to use their platform because they want and they understand the value of having that kind of open source technology work seamlessly on their platforms. So I think that just sort of goes to show you how important open source is in this movement. And how much large companies and Fortune 500 companies and a lot of the ones we work with have to embrace that. >> Yeah, and I'm seeing it in our work. Even when we're working with Fortune 500 companies, is that they've already gone through the first phase of data science work. Where I explain it was all about the tools and getting the right tools and architecture in place. And then companies started moving into getting the right skill set in place. Getting the right talent. And what you're talking about with culture is really where I think we're talking about the third phase of data science, which is looking at communication of these technical frameworks so that we can get non technical people really comfortable in the same room with data scientists. That is going to be the phase, that's really where I see the pain point. And that's why at Sudden Compass, we're really dedicated to working with each other to figure out how do we solve this problem now? >> And I think that communication between the technical stakeholders and management and leadership. That's a very critical piece of this. You can't have a successful data science organization without that. >> Absolutely. >> And I think that actually some of the most popular trainings we've had recently are from managers and executives who are looking to say, how do I become more data savvy? How do I figure out what is this data science thing and how do I communicate with my data scientists? >> You guys made this way too easy. I was just going to get some popcorn and watch it play out. >> Nir, last 30 seconds. I want to leave you with an opportunity to, anything you want to add to this conversation? >> I think one thing to conclude is to say that companies that are not data driven is about time to hit refresh and figure how they transition the organization to become data driven. To become agile and nimble so they can actually see what opportunities from this important industrial revolution. Otherwise, unfortunately they will have hard time to survive. >> [Katie] All agreed? >> [Tricia] Absolutely, you're right. >> Michael, Trish, Nir, thank you so much. Fascinating discussion. And thank you guys again for joining us. We will be right back with another great demo. Right after this. >> Thank you Katie. >> Once again, thank you for an excellent discussion. Weren't they great guys? And thank you for everyone who's tuning in on the live webcast. As you can hear, we have an amazing studio audience here. And we're going to keep things moving. I'm now joined by Daniel Hernandez and Siva Anne. And we're going to turn our attention to how you can deliver on what they're talking about using data science experience to do data science faster. >> Thank you Katie. Siva and I are going to spend the next 10 minutes showing you how you can deliver on what they were saying using the IBM Data Science Experience to do data science faster. We'll demonstrate through new features we introduced this week how teams can work together more effectively across the entire analytics life cycle. How you can take advantage of any and all data no matter where it is and what it is. How you could use your favorite tools from open source. And finally how you could build models anywhere and employ them close to where your data is. Remember the financial adviser app Rob showed you? To build an app like that, we needed a team of data scientists, developers, data engineers, and IT staff to collaborate. We do this in the Data Science Experience through a concept we call projects. When I create a new project, I can now use the new Github integration feature. We're doing for data science what we've been doing for developers for years. Distributed teams can work together on analytics projects. And take advantage of Github's version management and change management features. This is a huge deal. Let's explore the project we created for the financial adviser app. As you can see, our data engineer Joane, our developer Rob, and others are collaborating this project. Joane got things started by bringing together the trusted data sources we need to build the app. Taking a closer look at the data, we see that our customer and profile data is stored on our recently announced IBM Integrated Analytics System, which runs safely behind our firewall. We also needed macro economic data, which she was able to find in the Federal Reserve. And she stored it in our Db2 Warehouse on Cloud. And finally, she selected stock news data from NASDAQ.com and landed that in a Hadoop cluster, which happens to be powered by Hortonworks. We added a new feature to the Data Science Experience so that when it's installed with Hortonworks, it automatically uses a need of security and governance controls within the cluster so your data is always secure and safe. Now we want to show you the news data we stored in the Hortonworks cluster. This is the mean administrative console. It's powered by an open source project called Ambari. And here's the news data. It's in parquet files stored in HDFS, which happens to be a distributive file system. To get the data from NASDAQ into our cluster, we used IBM's BigIntegrate and BigQuality to create automatic data pipelines that acquire, cleanse, and ingest that news data. Once the data's available, we use IBM's Big SQL to query that data using SQL statements that are much like the ones we would use for any relation of data, including the data that we have in the Integrated Analytics System and Db2 Warehouse on Cloud. This and the federation capabilities that Big SQL offers dramatically simplifies data acquisition. Now we want to show you how we support a brand new tool that we're excited about. Since we launched last summer, the Data Science Experience has supported Jupyter and R for data analysis and visualization. In this week's update, we deeply integrated another great open source project called Apache Zeppelin. It's known for having great visualization support, advanced collaboration features, and is growing in popularity amongst the data science community. This is an example of Apache Zeppelin and the notebook we created through it to explore some of our data. Notice how wonderful and easy the data visualizations are. Now we want to walk you through the Jupyter notebook we created to explore our customer preference for stocks. We use notebooks to understand and explore data. To identify the features that have some predictive power. Ultimately, we're trying to assess what ultimately is driving customer stock preference. Here we did the analysis to identify the attributes of customers that are likely to purchase auto stocks. We used this understanding to build our machine learning model. For building machine learning models, we've always had tools integrated into the Data Science Experience. But sometimes you need to use tools you already invested in. Like our very own SPSS as well as SAS. Through new import feature, you can easily import those models created with those tools. This helps you avoid vendor lock-in, and simplify the development, training, deployment, and management of all your models. To build the models we used in app, we could have coded, but we prefer a visual experience. We used our customer profile data in the Integrated Analytic System. Used the Auto Data Preparation to cleanse our data. Choose the binary classification algorithms. Let the Data Science Experience evaluate between logistic regression and gradient boosted tree. It's doing the heavy work for us. As you can see here, the Data Science Experience generated performance metrics that show us that the gradient boosted tree is the best performing algorithm for the data we gave it. Once we save this model, it's automatically deployed and available for developers to use. Any application developer can take this endpoint and consume it like they would any other API inside of the apps they built. We've made training and creating machine learning models super simple. But what about the operations? A lot of companies are struggling to ensure their model performance remains high over time. In our financial adviser app, we know that customer data changes constantly, so we need to always monitor model performance and ensure that our models are retrained as is necessary. This is a dashboard that shows the performance of our models and lets our teams monitor and retrain those models so that they're always performing to our standards. So far we've been showing you the Data Science Experience available behind the firewall that we're using to build and train models. Through a new publish feature, you can build models and deploy them anywhere. In another environment, private, public, or anywhere else with just a few clicks. So here we're publishing our model to the Watson machine learning service. It happens to be in the IBM cloud. And also deeply integrated with our Data Science Experience. After publishing and switching to the Watson machine learning service, you can see that our stock affinity and model that we just published is there and ready for use. So this is incredibly important. I just want to say it again. The Data Science Experience allows you to train models behind your own firewall, take advantage of your proprietary and sensitive data, and then deploy those models wherever you want with ease. So summarize what we just showed you. First, IBM's Data Science Experience supports all teams. You saw how our data engineer populated our project with trusted data sets. Our data scientists developed, trained, and tested a machine learning model. Our developers used APIs to integrate machine learning into their apps. And how IT can use our Integrated Model Management dashboard to monitor and manage model performance. Second, we support all data. On premises, in the cloud, structured, unstructured, inside of your firewall, and outside of it. We help you bring analytics and governance to where your data is. Third, we support all tools. The data science tools that you depend on are readily available and deeply integrated. This includes capabilities from great partners like Hortonworks. And powerful tools like our very own IBM SPSS. And fourth, and finally, we support all deployments. You can build your models anywhere, and deploy them right next to where your data is. Whether that's in the public cloud, private cloud, or even on the world's most reliable transaction platform, IBM z. So see for yourself. Go to the Data Science Experience website, take us for a spin. And if you happen to be ready right now, our recently created Data Science Elite Team can help you get started and run experiments alongside you with no charge. Thank you very much. >> Thank you very much Daniel. It seems like a great time to get started. And thanks to Siva for taking us through it. Rob and I will be back in just a moment to add some perspective right after this. All right, once again joined by Rob Thomas. And Rob obviously we got a lot of information here. >> Yes, we've covered a lot of ground. >> This is intense. You got to break it down for me cause I think we zoom out and see the big picture. What better data science can deliver to a business? Why is this so important? I mean we've heard it through and through. >> Yeah, well, I heard it a couple times. But it starts with businesses have to embrace a data driven culture. And it is a change. And we need to make data accessible with the right tools in a collaborative culture because we've got diverse skill sets in every organization. But data driven companies succeed when data science tools are in the hands of everyone. And I think that's a new thought. I think most companies think just get your data scientist some tools, you'll be fine. This is about tools in the hands of everyone. I think the panel did a great job of describing about how we get to data science for all. Building a data culture, making it a part of your everyday operations, and the highlights of what Daniel just showed us, that's some pretty cool features for how organizations can get to this, which is you can see IBM's Data Science Experience, how that supports all teams. You saw data analysts, data scientists, application developer, IT staff, all working together. Second, you saw how we support all tools. And your choice of tools. So the most popular data science libraries integrated into one platform. And we saw some new capabilities that help companies avoid lock-in, where you can import existing models created from specialist tools like SPSS or others. And then deploy them and manage them inside of Data Science Experience. That's pretty interesting. And lastly, you see we continue to build on this best of open tools. Partnering with companies like H2O, Hortonworks, and others. Third, you can see how you use all data no matter where it lives. That's a key challenge every organization's going to face. Private, public, federating all data sources. We announced new integration with the Hortonworks data platform where we deploy machine learning models where your data resides. That's been a key theme. Analytics where the data is. And lastly, supporting all types of deployments. Deploy them in your Hadoop cluster. Deploy them in your Integrated Analytic System. Or deploy them in z, just to name a few. A lot of different options here. But look, don't believe anything I say. Go try it for yourself. Data Science Experience, anybody can use it. Go to datascience.ibm.com and look, if you want to start right now, we just created a team that we call Data Science Elite. These are the best data scientists in the world that will come sit down with you and co-create solutions, models, and prove out a proof of concept. >> Good stuff. Thank you Rob. So you might be asking what does an organization look like that embraces data science for all? And how could it transform your role? I'm going to head back to the office and check it out. Let's start with the perspective of the line of business. What's changed? Well, now you're starting to explore new business models. You've uncovered opportunities for new revenue sources and all that hidden data. And being disrupted is no longer keeping you up at night. As a data science leader, you're beginning to collaborate with a line of business to better understand and translate the objectives into the models that are being built. Your data scientists are also starting to collaborate with the less technical team members and analysts who are working closest to the business problem. And as a data scientist, you stop feeling like you're falling behind. Open source tools are keeping you current. You're also starting to operationalize the work that you do. And you get to do more of what you love. Explore data, build models, put your models into production, and create business impact. All in all, it's not a bad scenario. Thanks. All right. We are back and coming up next, oh this is a special time right now. Cause we got a great guest speaker. New York Magazine called him the spreadsheet psychic and number crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential elections. He even invented a proprietary algorithm called PECOTA for predicting future performance by baseball players and teams. And his New York Times bestselling book, The Signal and the Noise was named by Amazon.com as the number one best non-fiction book of 2012. He's currently the Editor in Chief of the award winning website, FiveThirtyEight and appears on ESPN as an on air commentator. Big round of applause. My pleasure to welcome Nate Silver. >> Thank you. We met backstage. >> Yes. >> It feels weird to re-shake your hand, but you know, for the audience. >> I had to give the intense firm grip. >> Definitely. >> The ninja grip. So you and I have crossed paths kind of digitally in the past, which it really interesting, is I started my career at ESPN. And I started as a production assistant, then later back on air for sports technology. And I go to you to talk about sports because-- >> Yeah. >> Wow, has ESPN upped their game in terms of understanding the importance of data and analytics. And what it brings. Not just to MLB, but across the board. >> No, it's really infused into the way they present the broadcast. You'll have win probability on the bottom line. And they'll incorporate FiveThirtyEight metrics into how they cover college football for example. So, ESPN ... Sports is maybe the perfect, if you're a data scientist, like the perfect kind of test case. And the reason being that sports consists of problems that have rules. And have structure. And when problems have rules and structure, then it's a lot easier to work with. So it's a great way to kind of improve your skills as a data scientist. Of course, there are also important real world problems that are more open ended, and those present different types of challenges. But it's such a natural fit. The teams. Think about the teams playing the World Series tonight. The Dodgers and the Astros are both like very data driven, especially Houston. Golden State Warriors, the NBA Champions, extremely data driven. New England Patriots, relative to an NFL team, it's shifted a little bit, the NFL bar is lower. But the Patriots are certainly very analytical in how they make decisions. So, you can't talk about sports without talking about analytics. >> And I was going to save the baseball question for later. Cause we are moments away from game seven. >> Yeah. >> Is everyone else watching game seven? It's been an incredible series. Probably one of the best of all time. >> Yeah, I mean-- >> You have a prediction here? >> You can mention that too. So I don't have a prediction. FiveThirtyEight has the Dodgers with a 60% chance of winning. >> [Katie] LA Fans. >> So you have two teams that are about equal. But the Dodgers pitching staff is in better shape at the moment. The end of a seven game series. And they're at home. >> But the statistics behind the two teams is pretty incredible. >> Yeah. It's like the first World Series in I think 56 years or something where you have two 100 win teams facing one another. There have been a lot of parity in baseball for a lot of years. Not that many offensive overall juggernauts. But this year, and last year with the Cubs and the Indians too really. But this year, you have really spectacular teams in the World Series. It kind of is a showcase of modern baseball. Lots of home runs. Lots of strikeouts. >> [Katie] Lots of extra innings. >> Lots of extra innings. Good defense. Lots of pitching changes. So if you love the modern baseball game, it's been about the best example that you've had. If you like a little bit more contact, and fewer strikeouts, maybe not so much. But it's been a spectacular and very exciting World Series. It's amazing to talk. MLB is huge with analysis. I mean, hands down. But across the board, if you can provide a few examples. Because there's so many teams in front offices putting such an, just a heavy intensity on the analysis side. And where the teams are going. And if you could provide any specific examples of teams that have really blown your mind. Especially over the last year or two. Because every year it gets more exciting if you will. I mean, so a big thing in baseball is defensive shifts. So if you watch tonight, you'll probably see a couple of plays where if you're used to watching baseball, a guy makes really solid contact. And there's a fielder there that you don't think should be there. But that's really very data driven where you analyze where's this guy hit the ball. That part's not so hard. But also there's game theory involved. Because you have to adjust for the fact that he knows where you're positioning the defenders. He's trying therefore to make adjustments to his own swing and so that's been a major innovation in how baseball is played. You know, how bullpens are used too. Where teams have realized that actually having a guy, across all sports pretty much, realizing the importance of rest. And of fatigue. And that you can be the best pitcher in the world, but guess what? After four or five innings, you're probably not as good as a guy who has a fresh arm necessarily. So I mean, it really is like, these are not subtle things anymore. It's not just oh, on base percentage is valuable. It really effects kind of every strategic decision in baseball. The NBA, if you watch an NBA game tonight, see how many three point shots are taken. That's in part because of data. And teams realizing hey, three points is worth more than two, once you're more than about five feet from the basket, the shooting percentage gets really flat. And so it's revolutionary, right? Like teams that will shoot almost half their shots from the three point range nowadays. Larry Bird, who wound up being one of the greatest three point shooters of all time, took only eight three pointers his first year in the NBA. It's quite noticeable if you watch baseball or basketball in particular. >> Not to focus too much on sports. One final question. In terms of Major League Soccer, and now in NFL, we're having the analysis and having wearables where it can now showcase if they wanted to on screen, heart rate and breathing and how much exertion. How much data is too much data? And when does it ruin the sport? >> So, I don't think, I mean, again, it goes sport by sport a little bit. I think in basketball you actually have a more exciting game. I think the game is more open now. You have more three pointers. You have guys getting higher assist totals. But you know, I don't know. I'm not one of those people who thinks look, if you love baseball or basketball, and you go in to work for the Astros, the Yankees or the Knicks, they probably need some help, right? You really have to be passionate about that sport. Because it's all based on what questions am I asking? As I'm a fan or I guess an employee of the team. Or a player watching the game. And there isn't really any substitute I don't think for the insight and intuition that a curious human has to kind of ask the right questions. So we can talk at great length about what tools do you then apply when you have those questions, but that still comes from people. I don't think machine learning could help with what questions do I want to ask of the data. It might help you get the answers. >> If you have a mid-fielder in a soccer game though, not exerting, only 80%, and you're seeing that on a screen as a fan, and you're saying could that person get fired at the end of the day? One day, with the data? >> So we found that actually some in soccer in particular, some of the better players are actually more still. So Leo Messi, maybe the best player in the world, doesn't move as much as other soccer players do. And the reason being that A) he kind of knows how to position himself in the first place. B) he realizes that you make a run, and you're out of position. That's quite fatiguing. And particularly soccer, like basketball, is a sport where it's incredibly fatiguing. And so, sometimes the guys who conserve their energy, that kind of old school mentality, you have to hustle at every moment. That is not helpful to the team if you're hustling on an irrelevant play. And therefore, on a critical play, can't get back on defense, for example. >> Sports, but also data is moving exponentially as we're just speaking about today. Tech, healthcare, every different industry. Is there any particular that's a favorite of yours to cover? And I imagine they're all different as well. >> I mean, I do like sports. We cover a lot of politics too. Which is different. I mean in politics I think people aren't intuitively as data driven as they might be in sports for example. It's impressive to follow the breakthroughs in artificial intelligence. It started out just as kind of playing games and playing chess and poker and Go and things like that. But you really have seen a lot of breakthroughs in the last couple of years. But yeah, it's kind of infused into everything really. >> You're known for your work in politics though. Especially presidential campaigns. >> Yeah. >> This year, in particular. Was it insanely challenging? What was the most notable thing that came out of any of your predictions? >> I mean, in some ways, looking at the polling was the easiest lens to look at it. So I think there's kind of a myth that last year's result was a big shock and it wasn't really. If you did the modeling in the right way, then you realized that number one, polls have a margin of error. And so when a candidate has a three point lead, that's not particularly safe. Number two, the outcome between different states is correlated. Meaning that it's not that much of a surprise that Clinton lost Wisconsin and Michigan and Pennsylvania and Ohio. You know I'm from Michigan. Have friends from all those states. Kind of the same types of people in those states. Those outcomes are all correlated. So what people thought was a big upset for the polls I think was an example of how data science done carefully and correctly where you understand probabilities, understand correlations. Our model gave Trump a 30% chance of winning. Others models gave him a 1% chance. And so that was interesting in that it showed that number one, that modeling strategies and skill do matter quite a lot. When you have someone saying 30% versus 1%. I mean, that's a very very big spread. And number two, that these aren't like solved problems necessarily. Although again, the problem with elections is that you only have one election every four years. So I can be very confident that I have a better model. Even one year of data doesn't really prove very much. Even five or 10 years doesn't really prove very much. And so, being aware of the limitations to some extent intrinsically in elections when you only get one kind of new training example every four years, there's not really any way around that. There are ways to be more robust to sparce data environments. But if you're identifying different types of business problems to solve, figuring out what's a solvable problem where I can add value with data science is a really key part of what you're doing. >> You're such a leader in this space. In data and analysis. It would be interesting to kind of peek back the curtain, understand how you operate but also how large is your team? How you're putting together information. How quickly you're putting it out. Cause I think in this right now world where everybody wants things instantly-- >> Yeah. >> There's also, you want to be first too in the world of journalism. But you don't want to be inaccurate because that's your credibility. >> We talked about this before, right? I think on average, speed is a little bit overrated in journalism. >> [Katie] I think it's a big problem in journalism. >> Yeah. >> Especially in the tech world. You have to be first. You have to be first. And it's just pumping out, pumping out. And there's got to be more time spent on stories if I can speak subjectively. >> Yeah, for sure. But at the same time, we are reacting to the news. And so we have people that come in, we hire most of our people actually from journalism. >> [Katie] How many people do you have on your team? >> About 35. But, if you get someone who comes in from an academic track for example, they might be surprised at how fast journalism is. That even though we might be slower than the average website, the fact that there's a tragic event in New York, are there things we have to say about that? A candidate drops out of the presidential race, are things we have to say about that. In periods ranging from minutes to days as opposed to kind of weeks to months to years in the academic world. The corporate world moves faster. What is a little different about journalism is that you are expected to have more precision where people notice when you make a mistake. In corporations, you have maybe less transparency. If you make 10 investments and seven of them turn out well, then you'll get a lot of profit from that, right? In journalism, it's a little different. If you make kind of seven predictions or say seven things, and seven of them are very accurate and three of them aren't, you'll still get criticized a lot for the three. Just because that's kind of the way that journalism is. And so the kind of combination of needing, not having that much tolerance for mistakes, but also needing to be fast. That is tricky. And I criticize other journalists sometimes including for not being data driven enough, but the best excuse any journalist has, this is happening really fast and it's my job to kind of figure out in real time what's going on and provide useful information to the readers. And that's really difficult. Especially in a world where literally, I'll probably get off the stage and check my phone and who knows what President Trump will have tweeted or what things will have happened. But it really is a kind of 24/7. >> Well because it's 24/7 with FiveThirtyEight, one of the most well known sites for data, are you feeling micromanagey on your people? Because you do have to hit this balance. You can't have something come out four or five days later. >> Yeah, I'm not -- >> Are you overseeing everything? >> I'm not by nature a micromanager. And so you try to hire well. You try and let people make mistakes. And the flip side of this is that if a news organization that never had any mistakes, never had any corrections, that's raw, right? You have to have some tolerance for error because you are trying to decide things in real time. And figure things out. I think transparency's a big part of that. Say here's what we think, and here's why we think it. If we have a model to say it's not just the final number, here's a lot of detail about how that's calculated. In some case we release the code and the raw data. Sometimes we don't because there's a proprietary advantage. But quite often we're saying we want you to trust us and it's so important that you trust us, here's the model. Go play around with it yourself. Here's the data. And that's also I think an important value. >> That speaks to open source. And your perspective on that in general. >> Yeah, I mean, look, I'm a big fan of open source. I worry that I think sometimes the trends are a little bit away from open source. But by the way, one thing that happens when you share your data or you share your thinking at least in lieu of the data, and you can definitely do both is that readers will catch embarrassing mistakes that you made. By the way, even having open sourceness within your team, I mean we have editors and copy editors who often save you from really embarrassing mistakes. And by the way, it's not necessarily people who have a training in data science. I would guess that of our 35 people, maybe only five to 10 have a kind of formal background in what you would call data science. >> [Katie] I think that speaks to the theme here. >> Yeah. >> [Katie] That everybody's kind of got to be data literate. >> But yeah, it is like you have a good intuition. You have a good BS detector basically. And you have a good intuition for hey, this looks a little bit out of line to me. And sometimes that can be based on domain knowledge, right? We have one of our copy editors, she's a big college football fan. And we had an algorithm we released that tries to predict what the human being selection committee will do, and she was like, why is LSU rated so high? Cause I know that LSU sucks this year. And we looked at it, and she was right. There was a bug where it had forgotten to account for their last game where they lost to Troy or something and so -- >> That also speaks to the human element as well. >> It does. In general as a rule, if you're designing a kind of regression based model, it's different in machine learning where you have more, when you kind of build in the tolerance for error. But if you're trying to do something more precise, then so much of it is just debugging. It's saying that looks wrong to me. And I'm going to investigate that. And sometimes it's not wrong. Sometimes your model actually has an insight that you didn't have yourself. But fairly often, it is. And I think kind of what you learn is like, hey if there's something that bothers me, I want to go investigate that now and debug that now. Because the last thing you want is where all of a sudden, the answer you're putting out there in the world hinges on a mistake that you made. Cause you never know if you have so to speak, 1,000 lines of code and they all perform something differently. You never know when you get in a weird edge case where this one decision you made winds up being the difference between your having a good forecast and a bad one. In a defensible position and a indefensible one. So we definitely are quite diligent and careful. But it's also kind of knowing like, hey, where is an approximation good enough and where do I need more precision? Cause you could also drive yourself crazy in the other direction where you know, it doesn't matter if the answer is 91.2 versus 90. And so you can kind of go 91.2, three, four and it's like kind of A) false precision and B) not a good use of your time. So that's where I do still spend a lot of time is thinking about which problems are "solvable" or approachable with data and which ones aren't. And when they're not by the way, you're still allowed to report on them. We are a news organization so we do traditional reporting as well. And then kind of figuring out when do you need precision versus when is being pointed in the right direction good enough? >> I would love to get inside your brain and see how you operate on just like an everyday walking to Walgreens movement. It's like oh, if I cross the street in .2-- >> It's not, I mean-- >> Is it like maddening in there? >> No, not really. I mean, I'm like-- >> This is an honest question. >> If I'm looking for airfares, I'm a little more careful. But no, part of it's like you don't want to waste time on unimportant decisions, right? I will sometimes, if I can't decide what to eat at a restaurant, I'll flip a coin. If the chicken and the pasta both sound really good-- >> That's not high tech Nate. We want better. >> But that's the point, right? It's like both the chicken and the pasta are going to be really darn good, right? So I'm not going to waste my time trying to figure it out. I'm just going to have an arbitrary way to decide. >> Serious and business, how organizations in the last three to five years have just evolved with this data boom. How are you seeing it as from a consultant point of view? Do you think it's an exciting time? Do you think it's a you must act now time? >> I mean, we do know that you definitely see a lot of talent among the younger generation now. That so FiveThirtyEight has been at ESPN for four years now. And man, the quality of the interns we get has improved so much in four years. The quality of the kind of young hires that we make straight out of college has improved so much in four years. So you definitely do see a younger generation for which this is just part of their bloodstream and part of their DNA. And also, particular fields that we're interested in. So we're interested in people who have both a data and a journalism background. We're interested in people who have a visualization and a coding background. A lot of what we do is very much interactive graphics and so forth. And so we do see those skill sets coming into play a lot more. And so the kind of shortage of talent that had I think frankly been a problem for a long time, I'm optimistic based on the young people in our office, it's a little anecdotal but you can tell that there are so many more programs that are kind of teaching students the right set of skills that maybe weren't taught as much a few years ago. >> But when you're seeing these big organizations, ESPN as perfect example, moving more towards data and analytics than ever before. >> Yeah. >> You would say that's obviously true. >> Oh for sure. >> If you're not moving that direction, you're going to fall behind quickly. >> Yeah and the thing is, if you read my book or I guess people have a copy of the book. In some ways it's saying hey, there are lot of ways to screw up when you're using data. And we've built bad models. We've had models that were bad and got good results. Good models that got bad results and everything else. But the point is that the reason to be out in front of the problem is so you give yourself more runway to make errors and mistakes. And to learn kind of what works and what doesn't and which people to put on the problem. I sometimes do worry that a company says oh we need data. And everyone kind of agrees on that now. We need data science. Then they have some big test case. And they have a failure. And they maybe have a failure because they didn't know really how to use it well enough. But learning from that and iterating on that. And so by the time that you're on the third generation of kind of a problem that you're trying to solve, and you're watching everyone else make the mistake that you made five years ago, I mean, that's really powerful. But that doesn't mean that getting invested in it now, getting invested both in technology and the human capital side is important. >> Final question for you as we run out of time. 2018 beyond, what is your biggest project in terms of data gathering that you're working on? >> There's a midterm election coming up. That's a big thing for us. We're also doing a lot of work with NBA data. So for four years now, the NBA has been collecting player tracking data. So they have 3D cameras in every arena. So they can actually kind of quantify for example how fast a fast break is, for example. Or literally where a player is and where the ball is. For every NBA game now for the past four or five years. And there hasn't really been an overall metric of player value that's taken advantage of that. The teams do it. But in the NBA, the teams are a little bit ahead of journalists and analysts. So we're trying to have a really truly next generation stat. It's a lot of data. Sometimes I now more oversee things than I once did myself. And so you're parsing through many, many, many lines of code. But yeah, so we hope to have that out at some point in the next few months. >> Anything you've personally been passionate about that you've wanted to work on and kind of solve? >> I mean, the NBA thing, I am a pretty big basketball fan. >> You can do better than that. Come on, I want something real personal that you're like I got to crunch the numbers. >> You know, we tried to figure out where the best burrito in America was a few years ago. >> I'm going to end it there. >> Okay. >> Nate, thank you so much for joining us. It's been an absolute pleasure. Thank you. >> Cool, thank you. >> I thought we were going to chat World Series, you know. Burritos, important. I want to thank everybody here in our audience. Let's give him a big round of applause. >> [Nate] Thank you everyone. >> Perfect way to end the day. And for a replay of today's program, just head on over to ibm.com/dsforall. I'm Katie Linendoll. And this has been Data Science for All: It's a Whole New Game. Test one, two. One, two, three. Hi guys, I just want to quickly let you know as you're exiting. A few heads up. Downstairs right now there's going to be a meet and greet with Nate. And we're going to be doing that with clients and customers who are interested. So I would recommend before the game starts, and you lose Nate, head on downstairs. And also the gallery is open until eight p.m. with demos and activations. And tomorrow, make sure to come back too. Because we have exciting stuff. I'll be joining you as your host. And we're kicking off at nine a.m. So bye everybody, thank you so much. >> [Announcer] Ladies and gentlemen, thank you for attending this evening's webcast. If you are not attending all cloud and cognitive summit tomorrow, we ask that you recycle your name badge at the registration desk. Thank you. Also, please note there are two exits on the back of the room on either side of the room. Have a good evening. Ladies and gentlemen, the meet and greet will be on stage. Thank you.

Published Date : Nov 1 2017

SUMMARY :

Today the ability to extract value from data is becoming a shared mission. And for all of you during the program, I want to remind you to join that conversation on And when you and I chatted about it. And the scale and complexity of the data that organizations are having to deal with has It's challenging in the world of unmanageable. And they have to find a way. AI. And it's incredible that this buzz word is happening. And to get to an AI future, you have to lay a data foundation today. And four is you got to expand job roles in the organization. First pillar in this you just discussed. And now you get to where we are today. And if you don't have a strategy for how you acquire that and manage it, you're not going And the way I think about that is it's really about moving from static data repositories And we continue with the architecture. So you need a way to federate data across different environments. So we've laid out what you need for driving automation. And so when you think about the real use cases that are driving return on investment today, Let's go ahead and come back to something that you mentioned earlier because it's fascinating And so the new job roles is about how does everybody have data first in their mind? Everybody in the company has to be data literate. So overall, group effort, has to be a common goal, and we all need to be data literate But at the end of the day, it's kind of not an easy task. It's not easy but it's maybe not as big of a shift as you would think. It's interesting to hear you say essentially you need to train everyone though across the And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. And I've heard that the placement behind those jobs, people graduating with the MS is high. Let me get back to something else you touched on earlier because you mentioned that a number They produce a lot of the shows that I'm sure you watch Katie. And this is a good example. So they have to optimize every aspect of their business from marketing campaigns to promotions And so, as we talk to clients we think about how do you start down this path now, even It's analytics first to the data, not the other way around. We as a practice, we say you want to bring data to where the data sits. And a Harvard Business Review even dubbed it the sexiest job of the 21st century. Female preferred, on the cover of Vogue. And how does it change everything? And while it's important to recognize this critical skill set, you can't just limit it And we call it clickers and coders. [Katie] I like that. And there's not a lot of things available today that do that. Because I hear you talking about the data scientists role and how it's critical to success, And my view is if you have the right platform, it enables the organization to collaborate. And every organization needs to think about what are the skills that are critical? Use this as your chance to reinvent IT. And I can tell you even personally being effected by how important the analysis is in working And think about if you don't do something. And now we're going to get to the fun hands on part of our story. And then how do you move analytics closer to your data? And in here I can see that JP Morgan is calling for a US dollar rebound in the second half But then where it gets interesting is you go to the bottom. data, his stock portfolios, and browsing behavior to build a model which can predict his affinity And so, as a financial adviser, you look at this and you say, all right, we know he loves And I want to do that by picking a auto stock which has got negative correlation with Ferrari. Cause you start clicking that and immediately we're getting instant answers of what's happening. And what I see here instantly is that Honda has got a negative correlation with Ferrari, As a financial adviser, you wouldn't think about federating data, machine learning, pretty And drive the machine learning into the appliance. And even score hundreds of customers for their affinities on a daily basis. And then you see when you deploy analytics next to your data, even a financial adviser, And as a data science leader or data scientist, you have a lot of the same concerns. But you guys each have so many unique roles in your business life. And just by looking at the demand of companies that wants us to help them go through this And I think the whole ROI of data is that you can now understand people's relationships Well you can have all the data in the world, and I think it speaks to, if you're not doing And I think that that's one of the things that customers are coming to us for, right? And Nir, this is something you work with a lot. And the companies that are not like that. Tricia, companies have to deal with data behind the firewall and in the new multi cloud And so that's why I think it's really important to understand that when you implement big And how are the clients, how are the users actually interacting with the system? And right now the way I see teams being set up inside companies is that they're creating But in order to actually see all of the RY behind the data, you also have to have a creative That's one of the things that we see a lot. So a lot of the training we do is sort of data engineers. And I think that's a very strong point when it comes to the data analysis side. And that's where you need the human element to come back in and say okay, look, you're And the people who are really great at providing that human intelligence are social scientists. the talent piece is actually the most important crucial hard to get. It may be to take folks internally who have a lot of that domain knowledge that you have And from data scientist to machine learner. And what I explain to them is look, you're still making decisions in the same way. And I mean, just to give you an example, we are partnering with one of the major cloud And what you're talking about with culture is really where I think we're talking about And I think that communication between the technical stakeholders and management You guys made this way too easy. I want to leave you with an opportunity to, anything you want to add to this conversation? I think one thing to conclude is to say that companies that are not data driven is And thank you guys again for joining us. And we're going to turn our attention to how you can deliver on what they're talking about And finally how you could build models anywhere and employ them close to where your data is. And thanks to Siva for taking us through it. You got to break it down for me cause I think we zoom out and see the big picture. And we saw some new capabilities that help companies avoid lock-in, where you can import And as a data scientist, you stop feeling like you're falling behind. We met backstage. And I go to you to talk about sports because-- And what it brings. And the reason being that sports consists of problems that have rules. And I was going to save the baseball question for later. Probably one of the best of all time. FiveThirtyEight has the Dodgers with a 60% chance of winning. So you have two teams that are about equal. It's like the first World Series in I think 56 years or something where you have two 100 And that you can be the best pitcher in the world, but guess what? And when does it ruin the sport? So we can talk at great length about what tools do you then apply when you have those And the reason being that A) he kind of knows how to position himself in the first place. And I imagine they're all different as well. But you really have seen a lot of breakthroughs in the last couple of years. You're known for your work in politics though. What was the most notable thing that came out of any of your predictions? And so, being aware of the limitations to some extent intrinsically in elections when It would be interesting to kind of peek back the curtain, understand how you operate but But you don't want to be inaccurate because that's your credibility. I think on average, speed is a little bit overrated in journalism. And there's got to be more time spent on stories if I can speak subjectively. And so we have people that come in, we hire most of our people actually from journalism. And so the kind of combination of needing, not having that much tolerance for mistakes, Because you do have to hit this balance. And so you try to hire well. And your perspective on that in general. But by the way, one thing that happens when you share your data or you share your thinking And you have a good intuition for hey, this looks a little bit out of line to me. And I think kind of what you learn is like, hey if there's something that bothers me, It's like oh, if I cross the street in .2-- I mean, I'm like-- But no, part of it's like you don't want to waste time on unimportant decisions, right? We want better. It's like both the chicken and the pasta are going to be really darn good, right? Serious and business, how organizations in the last three to five years have just And man, the quality of the interns we get has improved so much in four years. But when you're seeing these big organizations, ESPN as perfect example, moving more towards But the point is that the reason to be out in front of the problem is so you give yourself Final question for you as we run out of time. And so you're parsing through many, many, many lines of code. You can do better than that. You know, we tried to figure out where the best burrito in America was a few years Nate, thank you so much for joining us. I thought we were going to chat World Series, you know. And also the gallery is open until eight p.m. with demos and activations. If you are not attending all cloud and cognitive summit tomorrow, we ask that you recycle your

ENTITIES

Entity	Category	Confidence
Tricia Wang	PERSON	0.99+
Katie	PERSON	0.99+
Katie Linendoll	PERSON	0.99+
Rob	PERSON	0.99+
Google	ORGANIZATION	0.99+
Joane	PERSON	0.99+
Daniel	PERSON	0.99+
Michael Li	PERSON	0.99+
Nate Silver	PERSON	0.99+
Apple	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Trump	PERSON	0.99+
Nate	PERSON	0.99+
Honda	ORGANIZATION	0.99+
Siva	PERSON	0.99+
McKinsey	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Larry Bird	PERSON	0.99+
2017	DATE	0.99+
Rob Thomas	PERSON	0.99+
Michigan	LOCATION	0.99+
Yankees	ORGANIZATION	0.99+
New York	LOCATION	0.99+
Clinton	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Tesco	ORGANIZATION	0.99+
Michael	PERSON	0.99+
America	LOCATION	0.99+
Leo	PERSON	0.99+
four years	QUANTITY	0.99+
five	QUANTITY	0.99+
30%	QUANTITY	0.99+
Astros	ORGANIZATION	0.99+
Trish	PERSON	0.99+
Sudden Compass	ORGANIZATION	0.99+
Leo Messi	PERSON	0.99+
two teams	QUANTITY	0.99+
1,000 lines	QUANTITY	0.99+
one year	QUANTITY	0.99+
10 investments	QUANTITY	0.99+
NASDAQ	ORGANIZATION	0.99+
The Signal and the Noise	TITLE	0.99+
Tricia	PERSON	0.99+
Nir Kaldero	PERSON	0.99+
80%	QUANTITY	0.99+
BCG	ORGANIZATION	0.99+
Daniel Hernandez	PERSON	0.99+
ESPN	ORGANIZATION	0.99+
H2O	ORGANIZATION	0.99+
Ferrari	ORGANIZATION	0.99+
last year	DATE	0.99+
18	QUANTITY	0.99+
three	QUANTITY	0.99+
Data Incubator	ORGANIZATION	0.99+
Patriots	ORGANIZATION	0.99+

Vikram Murali, IBM | IBM Data Science For All

>> Narrator: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to New York here on theCUBE. Along with Dave Vellante, I'm John Walls. We're Data Science For All, IBM's two day event, and we'll be here all day long wrapping up again with that panel discussion from four to five here Eastern Time, so be sure to stick around all day here on theCUBE. Joining us now is Vikram Murali, who is a program director at IBM, and Vikram thank for joining us here on theCUBE. Good to see you. >> Good to see you too. Thanks for having me. >> You bet. So, among your primary responsibilities, The Data Science Experience. So first off, if you would, share with our viewers a little bit about that. You know, the primary mission. You've had two fairly significant announcements. Updates, if you will, here over the past month or so, so share some information about that too if you would. >> Sure, so my team, we build The Data Science Experience, and our goal is for us to enable data scientist, in their path, to gain insights into data using data science techniques, mission learning, the latest and greatest open source especially, and be able to do collaboration with fellow data scientist, with data engineers, business analyst, and it's all about freedom. Giving freedom to data scientist to pick the tool of their choice, and program and code in the language of their choice. So that's the mission of Data Science Experience, when we started this. The two releases, that you mentioned, that we had in the last 45 days. There was one in September and then there was one on October 30th. Both of these releases are very significant in the mission learning space especially. We now support Scikit-Learn, XGBoost, TensorFlow libraries in Data Science Experience. We have deep integration with Horton Data Platform, which is keymark of our partnership with Hortonworks. Something that we announced back in the summer, and this last release of Data Science Experience, two days back, specifically can do authentication with Technotes with Hadoop. So now our Hadoop customers, our Horton Data Platform customers, can leverage all the goodies that we have in Data Science Experience. It's more deeply integrated with our Hadoop based environments. >> A lot of people ask me, "Okay, when IBM announces a product like Data Science Experience... You know, IBM has a lot of products in its portfolio. Are they just sort of cobbling together? You know? So exulting older products, and putting a skin on them? Or are they developing them from scratch?" How can you help us understand that? >> That's a great question, and I hear that a lot from our customers as well. Data Science Experience started off as a design first methodology. And what I mean by that is we are using IBM design to lead the charge here along with the product and development. And we are actually talking to customers, to data scientist, to data engineers, to enterprises, and we are trying to find out what problems they have in data science today and how we can best address them. So it's not about taking older products and just re-skinning them, but Data Science Experience, for example, it started of as a brand new product: completely new slate with completely new code. Now, IBM has done data science and mission learning for a very long time. We have a lot of assets like SPSS Modeler and Stats, and digital optimization. And we are re-investing in those products, and we are investing in such a way, and doing product research in such a way, not to make the old fit with the new, but in a way where it fits into the realm of collaboration. How can data scientist leverage our existing products with open source, and how we can do collaboration. So it's not just re-skinning, but it's building ground up. >> So this is really important because you say architecturally it's built from the ground up. Because, you know, given enough time and enough money, you know, smart people, you can make anything work. So the reason why this is important is you mentioned, for instance, TensorFlow. You know that down the road there's going to be some other tooling, some other open source project that's going to take hold, and your customers are going to say, "I want that." You've got to then integrate that, or you have to choose whether or not to. If it's a super heavy lift, you might not be able to do it, or do it in time to hit the market. If you architected your system to be able to accommodate that. Future proof is the term everybody uses, so have you done? How have you done that? I'm sure API's are involved, but maybe you could add some color. >> Sure. So we are and our Data Science Experience and mission learning... It is a microservices based architecture, so we are completely dockerized, and we use Kubernetes under the covers for container dockerstration. And all these are tools that are used in The Valley, across different companies, and also in products across IBM as well. So some of these legacy products that you mentioned, we are actually using some of these newer methodologies to re-architect them, and we are dockerizing them, and the microservice architecture actually helps us address issues that we have today as well as be open to development and taking newer methodologies and frameworks into consideration that may not exist today. So the microservices architecture, for example, TensorFlow is something that you brought in. So we can just pin up a docker container just for TensorFlow and attach it to our existing Data Science Experience, and it just works. Same thing with other frameworks like XGBoost, and Kross, and Scikit-Learn, all these are frameworks and libraries that are coming up in open source within the last, I would say, a year, two years, three years timeframe. Previously, integrating them into our product would have been a nightmare. We would have had to re-architect our product every time something came, but now with the microservice architecture it is very easy for us to continue with those. >> We were just talking to Daniel Hernandez a little bit about the Hortonworks relationship at high level. One of the things that I've... I mean, I've been following Hortonworks since day one when Yahoo kind of spun them out. And know those guys pretty well. And they always make a big deal out of when they do partnerships, it's deep engineering integration. And so they're very proud of that, so I want to come on to test that a little bit. Can you share with our audience the kind of integrations you've done? What you've brought to the table? What Hortonworks brought to the table? >> Yes, so Data Science Experience today can work side by side with Horton Data Platform, HDP. And we could have actually made that work about two, three months back, but, as part of our partnership that was announced back in June, we set up drawing engineering teams. We have multiple touch points every day. We call it co-development, and they have put resources in. We have put resources in, and today, especially with the release that came out on October 30th, Data Science Experience can authenticate using secure notes. That I previously mentioned, and that was a direct example of our partnership with Hortonworks. So that is phase one. Phase two and phase three is going to be deeper integration, so we are planning on making Data Science Experience and a body management pact. And so a Hortonworks customer, if you have HDP already installed, you don't have to install DSX separately. It's going to be a management pack. You just spin it up. And the third phase is going to be... We're going to be using YARN for resource management. YARN is very good a resource management. And for infrastructure as a service for data scientist, we can actually delegate that work to YARN. So, Hortonworks, they are putting resources into YARN, doubling down actually. And they are making changes to YARN where it will act as the resource manager not only for the Hadoop and Spark workloads, but also for Data Science Experience workloads. So that is the level of deep engineering that we are engaged with Hortonworks. >> YARN stands for yet another resource negotiator. There you go for... >> John: Thank you. >> The trivia of the day. (laughing) Okay, so... But of course, Hortonworks are big on committers. And obviously a big committer to YARN. Probably wouldn't have YARN without Hortonworks. So you mentioned that's kind of what they're bringing to the table, and you guys primarily are focused on the integration as well as some other IBM IP? >> That is true as well as the notes piece that I mentioned. We have a notes commenter. We have multiple notes commenters on our side, and that helps us as well. So all the notes is part of the HDP package. We need knowledge on our side to work with Hortonworks developers to make sure that we are contributing and making end roads into Data Science Experience. That way the integration becomes a lot more easier. And from an IBM IP perspective... So Data Science Experience already comes with a lot of packages and libraries that are open source, but IBM research has worked on a lot of these libraries. I'll give you a few examples: Brunel and PixieDust is something that our developers love. These are visualization libraries that were actually cooked up by IBM research and the open sourced. And these are prepackaged into Data Science Experience, so there is IBM IP involved and there are a lot of algorithms, mission learning algorithms, that we put in there. So that comes right out of the package. >> And you guys, the development teams, are really both in The Valley? Is that right? Or are you really distributed around the world? >> Yeah, so we are. The Data Science Experience development team is in North America between The Valley and Toronto. The Hortonworks team, they are situated about eight miles from where we are in The Valley, so there's a lot of synergy. We work very closely with them, and that's what we see in the product. >> I mean, what impact does that have? Is it... You know, you hear today, "Oh, yeah. We're a virtual organization. We have people all over the world: Eastern Europe, Brazil." How much of an impact is that? To have people so physically proximate? >> I think it has major impact. I mean IBM is a global organization, so we do have teams around the world, and we work very well. With the invent of IP telephoning, and screen-shares, and so on, yes we work. But it really helps being in the same timezone, especially working with a partner just eight miles or ten miles a way. We have a lot of interaction with them and that really helps. >> Dave: Yeah. Body language? >> Yeah. >> Yeah. You talked about problems. You talked about issues. You know, customers. What are they now? Before it was like, "First off, I want to get more data." Now they've got more data. Is it figuring out what to do with it? Finding it? Having it available? Having it accessible? Making sense of it? I mean what's the barrier right now? >> The barrier, I think for data scientist... The number one barrier continues to be data. There's a lot of data out there. Lot of data being generated, and the data is dirty. It's not clean. So number one problem that data scientist have is how do I get to clean data, and how do I access data. There are so many data repositories, data lakes, and data swamps out there. Data scientist, they don't want to be in the business of finding out how do I access data. They want to have instant access to data, and-- >> Well if you would let me interrupt you. >> Yeah? >> You say it's dirty. Give me an example. >> So it's not structured data, so data scientist-- >> John: So unstructured versus structured? >> Unstructured versus structured. And if you look at all the social media feeds that are being generated, the amount of data that is being generated, it's all unstructured data. So we need to clean up the data, and the algorithms need structured data or data in a particular format. And data scientist don't want to spend too much time in cleaning up that data. And access to data, as I mentioned. And that's where Data Science Experience comes in. Out of the box we have so many connectors available. It's very easy for customers to bring in their own connectors as well, and you have instant access to data. And as part of our partnership with Hortonworks, you don't have to bring data into Data Science Experience. The data is becoming so big. You want to leave it where it is. Instead, push analytics down to where it is. And you can do that. We can connect to remote Spark. We can push analytics down through remote Spark. All of that is possible today with Data Science Experience. The second thing that I hear from data scientist is all the open source libraries. Every day there's a new one. It's a boon and a bane as well, and the problem with that is the open source community is very vibrant, and there a lot of data science competitions, mission learning competitions that are helping move this community forward. And it's a good thing. The bad thing is data scientist like to work in silos on their laptop. How do you, from an enterprise perspective... How do you take that, and how do you move it? Scale it to an enterprise level? And that's where Data Science Experience comes in because now we provide all the tools. The tools of your choice: open source or proprietary. You have it in here, and you can easily collaborate. You can do all the work that you need with open source packages, and libraries, bring your own, and as well as collaborate with other data scientist in the enterprise. >> So, you're talking about dirty data. I mean, with Hadoop and no schema on, right? We kind of knew this problem was coming. So technology sort of got us into this problem. Can technology help us get out of it? I mean, from an architectural standpoint. When you think about dirty data, can you architect things in to help? >> Yes. So, if you look at the mission learning pipeline, the pipeline starts with ingesting data and then cleansing or cleaning that data. And then you go into creating a model, training, picking a classifier, and so on. So we have tools built into Data Science Experience, and we're working on tools, that will be coming up and down our roadmap, which will help data scientist do that themselves. I mean, they don't have to be really in depth coders or developers to do that. Python is very powerful. You can do a lot of data wrangling in Python itself, so we are enabling data scientist to do that within the platform, within Data Science Experience. >> If I look at sort of the demographics of the development teams. We were talking about Hortonworks and you guys collaborating. What are they like? I mean people picture IBM, you know like this 100 plus year old company. What's the persona of the developers in your team? >> The persona? I would say we have a very young, agile development team, and by that I mean... So we've had six releases this year in Data Science Experience. Just for the on premises side of the product, and the cloud side of the product it's got huge delivery. We have releases coming out faster than we can code. And it's not just re-architecting it every time, but it's about adding features, giving features that our customers are asking for, and not making them wait for three months, six months, one year. So our releases are becoming a lot more frequent, and customers are loving it. And that is, in part, because of the team. The team is able to evolve. We are very agile, and we have an awesome team. That's all. It's an amazing team. >> But six releases in... >> Yes. We had immediate release in April, and since then we've had about five revisions of the release where we add lot more features to our existing releases. A lot more packages, libraries, functionality, and so on. >> So you know what monster you're creating now don't you? I mean, you know? (laughing) >> I know, we are setting expectation. >> You still have two months left in 2017. >> We do. >> We do not make frame release cycles. >> They are not, and that's the advantage of the microservices architecture. I mean, when you upgrade, a customer upgrades, right? They don't have to bring that entire system down to upgrade. You can target one particular part, one particular microservice. You componentize it, and just upgrade that particular microservice. It's become very simple, so... >> Well some of those microservices aren't so micro. >> Vikram: Yeah. Not. Yeah, so it's a balance. >> You're growing, but yeah. >> It's a balance you have to keep. Making sure that you componentize it in such a way that when you're doing an upgrade, it effects just one small piece of it, and you don't have to take everything down. >> Dave: Right. >> But, yeah, I agree with you. >> Well, it's been a busy year for you. To say the least, and I'm sure 2017-2018 is not going to slow down. So continue success. >> Vikram: Thank you. >> Wish you well with that. Vikram, thanks for being with us here on theCUBE. >> Thank you. Thanks for having me. >> You bet. >> Back with Data Science For All. Here in New York City, IBM. Coming up here on theCUBE right after this. >> Cameraman: You guys are clear. >> John: All right. That was great.

Published Date : Nov 1 2017

SUMMARY :

Brought to you by IBM. Good to see you. Good to see you too. about that too if you would. and be able to do collaboration How can you help us understand that? and we are investing in such a way, You know that down the and attach it to our existing One of the things that I've... And the third phase is going to be... There you go for... and you guys primarily are So that comes right out of the package. The Valley and Toronto. We have people all over the We have a lot of interaction with them Is it figuring out what to do with it? and the data is dirty. You say it's dirty. You can do all the work that you need with can you architect things in to help? I mean, they don't have to and you guys collaborating. And that is, in part, because of the team. and since then we've had about and that's the advantage of microservices aren't so micro. Yeah, so it's a balance. and you don't have to is not going to slow down. Wish you well with that. Thanks for having me. Back with Data Science For All. That was great.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Vikram	PERSON	0.99+
John	PERSON	0.99+
three months	QUANTITY	0.99+
six months	QUANTITY	0.99+
John Walls	PERSON	0.99+
October 30th	DATE	0.99+
2017	DATE	0.99+
April	DATE	0.99+
June	DATE	0.99+
one year	QUANTITY	0.99+
Daniel Hernandez	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
September	DATE	0.99+
one	QUANTITY	0.99+
ten miles	QUANTITY	0.99+
YARN	ORGANIZATION	0.99+
eight miles	QUANTITY	0.99+
Vikram Murali	PERSON	0.99+
New York City	LOCATION	0.99+
North America	LOCATION	0.99+
two day	QUANTITY	0.99+
Python	TITLE	0.99+
two releases	QUANTITY	0.99+
New York	LOCATION	0.99+
two years	QUANTITY	0.99+
three years	QUANTITY	0.99+
six releases	QUANTITY	0.99+
Toronto	LOCATION	0.99+
today	DATE	0.99+
Both	QUANTITY	0.99+
two months	QUANTITY	0.99+
a year	QUANTITY	0.99+
Yahoo	ORGANIZATION	0.99+
third phase	QUANTITY	0.98+
both	QUANTITY	0.98+
this year	DATE	0.98+
first methodology	QUANTITY	0.98+
First	QUANTITY	0.97+
second thing	QUANTITY	0.97+
one small piece	QUANTITY	0.96+
One	QUANTITY	0.96+
XGBoost	TITLE	0.96+
Cameraman	PERSON	0.96+
about eight miles	QUANTITY	0.95+
Horton Data Platform	ORGANIZATION	0.95+
2017-2018	DATE	0.94+
first	QUANTITY	0.94+
The Valley	LOCATION	0.94+
TensorFlow	TITLE	0.94+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for Daniel Hernandez: