
Jon Roskill, Acumatica & Melissa Di Donato, SUSE | IFS World 2019


 

>> Announcer: Live from Boston, Massachusetts, it's theCube. Covering IFS World Conference 2019. Brought to you by IFS. >> Welcome back to Boston everybody, you're watching theCube, the leader in live tech coverage. This is day one of the IFS World Conference. I'm Dave Vellante with my co-host Paul Gillin. Melissa Di Donato is here, she's the CEO of SUSE, and Jon Roskill is the CEO of Acumatica. Folks, welcome to theCube. >> Thank you so much. >> So you guys had the power panel today? Talking about digital transformation. I got a question for all of you. What's the difference between a business and a digital business? Melissa, I'll give you first crack. >> Between a regular old business and a digital business? Everyone's digital these days, aren't they? I was interviewing one of the leaders at Expedia and I said, "Are you a travel company or are you a digital company? Like, where do you lead with?" And she said to me, "No no, we're a travel company, but we use digital." So it seems like the more and more we think about what the future means, how we service our customers, customers being at the core, everyone's a digital business. The way you service, the way you communicate, the way you support. So whether you're a business or not, you've always got to be a digital business. >> You'd better be a digital business, and so-- >> I'm going to take a slightly different tack on that, which is, we talk about digital and analog businesses, and analog businesses are ones that are data silos. They have a lot of systems, so they think they're digital, but they're disconnected. And, you know, part of a transformation is connecting all the systems together and getting them to work like one. >> But I think the other common thread is data, right? A digital business maybe puts data at the core, and that's how they get competitive advantage. But I want to ask you guys about your respective businesses. 
So SUSE, obviously you compete with the big whale RedHat, you know, the big news last year, IBM, $34 billion. How did that, or will that, in your view, affect your business? >> It's already affecting our business. We've seen a big, big uptick in interest in SUSE and what we're doing. You know, they say that a big part of the install base customers that RedHat and IBM currently have are unhappy about the decision to be acquired by IBM. Then they're in conflict, because it's a very big, heavily channel business, right? So a lot of the channel partners are not quite happy about having one of their closest competitors now be, you know, part of the inner circle, if you will. And other customers are just not happy. I mean, RedHat had fast innovation, fast pace and thought leadership, and now all of a sudden they're going to be buried inside of a large conglomerate, and they're not happy about that. So when we look at what's been happening for us, particularly since March, we became an independent company, now the world's largest independent open source company, since IBM has been taking over RedHat. And, you know, a big, big uptick. Since March, when we became independent, we've been getting a lot of questions. "Where are we, where are we going, what are we doing?" And, "Hey, you know, I haven't heard about SUSE in a while, what are you doing now?" So it's been really good news for us, really, really good news. >> I mean, we're huge fans of RedHat. We do a lot of their events and-- >> Melissa: I'm a huge fan myself. >> But I tell you, I mean, we know firsthand IBM has this nasty habit of buying companies and tripling the price. Now they say they're going to leave RedHat alone, we'll see. >> Yeah, like they said they'd leave Lotus alone, and all the others. >> SPSS, you saw that, Ustream, you know, one of our platforms. >> What's your view, how do you think it's going to go? 
>> I don't think it's about cloud, I think it's about services, and I think that's the piece that we don't really have great visibility on. Can IBM kind of jam OpenShift into its customers', you know, businesses without them even really knowing it? That's the near-term cash flow play that they're trying to, you know, effect. >> Yeah, but it's not working for them, isn't it though? Because when you look at the install base, 90% of their business has been the Linux open source environment, and OpenShift is a tag-along. I don't know if that's a real enabler for the future, rather than, you know, an afterthought from the past. >> Well, for $34 billion it better be. >> I want to ask you about the cost of switching, because historically, you know, if you were IBM, you were stuck with IBM forever. What is involved in customers moving from RedHat to SUSE? Presumably you're doing some of those migrations now. >> We are, we are doing them more and more. In fact, we're even offering migration services ourselves in some applications. It depends on the application layer. >> How simple is that? >> It depends on the application. So, we've got some telco companies where it's very, very complex: 24/7, you know, high-paced, big, fat enterprise applications around billing, for example. They're harder to move. >> A lot of custom code. >> A lot of custom code, really deep, really rich. They need, you know, constant operation because it's billing, right? Big, fat transactions. Those are a little bit more complex than, say, the other applications are. Nonetheless, there is a migration path, and in fact, we're one of the only open source companies in the world that provides support for not just SUSE, but actually for RedHat. So, if you're a RedHat customer that wants to get off an unsupported version of RedHat, you can come over to SUSE. We'll not just support your RedHat system but actually come up with a migration plan to get you onto a supported version of SUSE. 
>> If it's a packaged set of apps and you have to freeze the code, it's actually not that bad-- >> It's not that bad, no. >> To migrate. All right, Jon, I got to ask you, so help us understand Acumatica and IFS and the relationship. You're like sister companies, you're both ERP providers. How do you work together? >> Yeah, so we're both owned by a private equity firm called EQT. IFS is generally focused on $500 million and above companies, so more enterprise, and we're focused on the core mid-market, so say, $20 million to $500 million. And so we're very complementary in that way. IFS is largely direct selling, we're 100% through channels. IFS is stronger in Europe, we're stronger in North America, and so they see these as very complementary assets, and rather than, perhaps like what's going on with the IBM-RedHat discussion here, slamming these big things together and screwing them up, they're trying to actually keep us independent. So they put us in a holding company, but we're trying to leverage as much of each other's goodness as we can. >> Is there a migration path? I mean, for customers who reach the top end of your market, can they smoothly get to IFS? >> Yeah, it's not going to be like a smooth, you know, turn-a-switch-and-go. But it absolutely is a migration option for customers, and we do have a set of customers that are outgrowing us. You know, we have a number of customers now over a billion dollars running on Acumatica, and, you know, for example, we've got one that we're actually talking to about this right now, operating in 41 countries, global, they need 24/7 support, we're not the right company to be running their ERP system. >> On your panel today, guys, you were talking a lot about digital transformations, kind of lessons learned. What are the big mistakes you see companies making, and kind of what's your roadmap for success? >> I think doing too much too fast. Everyone talks about the digital innovation, digital transformation. 
It's really a business transformation, with digital being the underpinning, the push that carries the business forward, right? And I think that we make too many mistakes with regards to doing too much, too fast, too soon. That's one. Then there's adopting technology for technology's sake. "Oh, it's ML, it's AI." And everyone loves these big buzzwords, right? All the code words for what technology is. So they tend to bring it on, but they don't really know the outcome. Really, really important: at SUSE we're absolutely obsessed with our customers, and during a digital transformation you have to stay absolutely obsessed, keeping your customer at the core of every decision you make and everything you do. Particularly with regards to digital transformation, you want to make sure that business outcome is focused on them. Having a clear roadmap with milestones along the journey is really important, and ensuring it's really collaborative. We talked this morning about digital natives. You know, we're all young, aren't we? Me in particular. But, you know, I think the younger generation of digital natives think a little bit differently, perhaps, than we were originally thinking when we were their age. You know, I depend on that thinking. I depend on that integration of that thought leadership infused into companies to help really reach customers in different ways. Our customers are buying differently, our customers have different expectations, they have different deliverables they require, and they expect to be supported in different ways. And those digital natives, that young talent, can really aid in that delivery of good thought leadership for our businesses. >> So Jon, we're seeing IT spending at the macro slow down a little bit. 
You know, a lot of different factors going on. It's not a disaster, it's not falling off the cliff, but definitely below pre-2018 levels, and one of the theories is that you had this kind of spray-and-pray, kind of like Melissa was saying, doing too much too fast, trying everything, and now we're seeing more of a narrow focus on things that are going to give a return. Do you see that happening out there? >> Yeah, definitely some. I mean, people are looking for returns even in what's been a really vibrant economy, but, you know, I agree with Melissa's point, there's a lot of ready-shoot-aim projects out there, and, you know, the biggest thing I see is that the ones that fail are the ones that aren't led by the leadership. They're sort of given off to some side team, often the IT team, and told, "Go lead digital transformation of the company." And digital transformation, you know, Melissa said this morning, it's business transformation. You've got to bring the business part of it to the table, and you've got to think about, it's got to be led by the CEO, or the entire senior leadership team has to be on board, and if not, it's not going to be successful. >> So, pragmatism would say, okay, you get some quick hits, get some wins, and then you've got kind of the, you know, Bezos, Michael Dell mindset, go big or go home. So what's your philosophy? Moonshots or, you know, quick hits? >> I always think about starting, you know, you've got to understand your team's capabilities. So start with something that you can get a gauge of that with, you know, particularly if you're new and you're walking into an organization, you know. Melissa, I don't know how long you've been in your role now? >> Melissa: 65 days. >> Right, so there you go. So she's probably a good person to ask what, you know, what you're finding out there. But I think, you know, getting a gauge of what your resources are. 
I mean, one of the things you see around here is there are, you know, dozens of partner firms that are, or can be, brought in to, you know, supplement the resources you have in your own team. So being thoughtful in that is part of the approach. And then having a roadmap for what you're trying to do. Like, we talked this morning about a customer that Linda had been talking about. They've been working on it for six or seven years, right? And you're saying, for an enterprise, a very large enterprise company, taking six or seven years to turn the battleship maybe isn't that long. >> Okay, so you've got the sister company thing going on. Do you have a commercial relationship with IFS, or are you just here as kind of an outside speaker and a thought leader? >> I'm here as an outside speaker, thought leader. There is talk that perhaps we can, you know, work together in the future. We're trying to work that out right now. >> I want to ask you about open source business models. We still see companies sort of struggling to come up with, not profitable, but, you know, insanely profitable business models based on open source software. What do you see coming out of all this? Is there a model that you think is going to work in the long term? >> I think the future is open source for sure, and this is coming from a person who spent 25 years in proprietary software, having worked for the large ERP vendors. 100% of my life has been dedicated to proprietary software. So whilst that's true, I came at SUSE and the open source environment in a very different way, as a customer running my proprietary applications on open source Linux-based systems. So I come with a little bit different of a, you know, of an approach, I would say. The future's open source for sure. The way that we collaborate, the innovation, the borderless means by which we deliver, you know, leadership within our business is much, much different than proprietary software. 
You would think as well that, you know, the wall that we hide behind in open source, being able to access software anywhere in a community and be able to provide thought leadership, masks and hides who the developers and engineers are, and instead accentuates the thought leadership that comes out of them. So it provides for a naturally inclusive and diverse environment, which leads to really good business results. We all know the importance of diversity and inclusion. I think there is definitely a place for open source in the world. It's a matter of providing it in such a way that creates business value, that does enable and foster that growth of the community, because nothing is better than having two or three or four or five million developers hacking away at my software to deliver better business value to my customers. The commercial side is going to be around the support, right? The enterprise customers want to know that when something goes bump in the night, they've got someone they can pay to support their systems. And that's really what SUSE is about: protecting our install base, ensuring that we keep them live, all the time, every day, and keep them running frictionlessly across their IT department. >> Now there's another model, the so-called open core model, that holds that the future is actually proprietary on top of an open base. So are you saying that you don't think that's a good model? >> I don't know, the jury's out. Next time you come to our event, which is going to be in March, in Dublin, we're doing our SUSECON conference. Leave that question for me and I'll have an answer for you. I'm pontificating. >> Well I did and-- >> It's a date. The 12th of March. >> It's certainly working for Amazon. I mean, you know, Amazon's criticized for bogarting open source, but Redshift is built on open source, I think Aurora is built on open source. They're obviously making a lot of money. The open core model failed for Cloudera. 
Hortonworks was pure. Hortonworks had a model like, you know, you guys and RedHat, and that didn't work, and that was kind of the profitless prosperity of Hadoop, and maybe that was sort of an overhead-- >> I think for our model, the future's open source, no question. It's just a question of what level of the stack we keep open source or proprietary, maybe, right? Do we allow open source in the bottom or the top, or do we put some proprietary components on top to preserve and protect, like an umbrella, the core of which is open source? I don't know, we're thinking about that right now. We're trying to think about what our future looks like, what the model should look like in the future for the industry, how we can service our customers best. At the end of the day, it's satisfying customer needs and solving business problems. Whether that's going to be pure open source, or open source with a little bit of proprietary to service the customer best, that's what we're all going to be after, aren't we? >> So, there's no question that the innovation model is open source. I mean, I don't think that's a debate. The hard part is, okay, how do you make money? A bit of open source for you guys, then. I mean, are you using open source technologies? Presumably you are, everybody is, but-- >> So we're very big on open APIs. We joined openapi.org three years ago, and so we've been one of the leading ERP companies in the industry on publishing open APIs, and then we do a lot of customization work with our community, and all of that's going on in GitHub. And so it's all open source, it's all out there for people who want it. Not everybody wants to be messing around in the core of a transaction engine, and that's where you get into, you know, sort of the core argument of, you know, which pieces should people be modifying? Do you want people in the kernel? Maybe, maybe not. And, you know, this is not my area of expertise, so I'll defer to Melissa. 
Having people be able to extend things in an open source model, having people be able to find a library of customizations and components that can extend Acumatica, that's obviously a good thing. >> I mean, I think you hit on it with developers. I mean, that to me is the key lever. I mean, if I were VMware, I'd hire, you know, 1,000, 2,000 open source software developers and say, "Go build next-generation apps and tools and give them away." And then I'd say, "Okay, Michael Dell, make your hardware run better on our software." That's a business model, you can make a lot of money-- >> 100%, and we're, you know, we're going to be very acquisitive right now. We're looking to our future, right? We're looking to make a mark right now, and where do we go next? How can we help predict the next step in the marketplace when it pertains to, you know, the core of applications and the delivery mechanism we want to offer, the ease of being able to get thousands of mainframe customers with complex enterprise applications, let's say, for example, to the cloud. And a part of that is going to be the developer network. I mean, that's a really, really big, important segment for us, and we're looking at companies: who can we acquire, what's the business outcome, and what do the developer networks look like? >> So cloud and edge have got to be two huge opportunities for you, right? Again, it's all about developers. I think that's the right strategy at the edge. You see a lot of edge activity where somebody's trying to throw a box at the edge, top down, in a traditional IT model. It's really the devs up, where I think-- >> It is, it is the devs up, you're exactly right. Exactly right. >> Yeah, I mean, edge is fascinating. 
That's going to be amazing, what happens in the next 10 years, and we don't even know. But we ship a construction edition, and we've got a customer that we're working with that's instrumenting all of their construction machinery on something like a thousand construction sites and feeding the sensor data into Acumatica, and so it's a way to keep track of all the machines and what's going on with them. You know, obviously shipping, logistics, the opportunity to start putting things like, you know, RFID tags on everything and instrument all of that, out at the edge. And then the issue is you get this huge amount of data, and how do you process that and get the intelligence out of it and make the right decisions? >> Well, how do you? When data is plentiful, insights, you know, aren't-- >> Yeah, well, I think that's where the machine learning breakthroughs are going to happen. I mean, we've built out a team in the last three years on machine learning. All the guys we've been talking about, Amazon, Microsoft, Google, are all putting out machine learning engines that companies can pick up and start building models around. So we're doing ones around, you know, inventory, logistics, shipping. We just released one on expense reports. You know, that really is where the innovation is happening right now. >> Okay, so you're not an inventor of AI. You're going to take those technologies and apply 'em to your business. >> Yeah, we don't want to be the engine builder. We want to be the guys that are building the models and putting the insight for the industry on top. That's our job. >> All right, Melissa, we'll give you the final word, and IFS World 2019, I think, is this your first one? >> It's my first one, yeah-- >> What's the bumper sticker say when your trucks are pulling away? (laughs) >> The bumper sticker would say, "When you think about the future of open source, think about SUSE." (laughing) >> Dave: I love it. 
>> I'd say on the event, I mean, I'm super impressed. I think the group that's here is great, the customers are really enthused, and, you know, I have zero bias, so I'm just giving you my perspective. >> Yeah, I mean, the ecosystem is robust here, I have to say. I think they said 400 partners, and I was pleasantly surprised when I was walking around last-- >> This is your second one, isn't it? >> It's theCube's second one, my first. >> Oh, your first, all right, well done. And so what do you think? Coming back? >> I would love to come back. Especially overseas, I know you guys do a bunch of stuff overseas. >> There you go, he wants to travel. >> Dublin in March? >> March the 12th. >> Dublin is a good place for sure. So you're doing it at the big conference center? >> Yep, the big conference center, and it's-- >> That is a great venue. >> And not just because of the green thing, but it's actually because (laughs). >> No, that's a really nice venue, it's modern. It's got, I think, three or four floors. >> It does, yeah, yeah, we're looking forward to it. >> And then evening events at, you know, the Guinness Storehouse. >> There you go. >> Exactly right. So we'll look forward to hosting you there. >> All right, great, see you there. >> We'll come with our tough questions for you. (laughing) >> Thank you guys, I really appreciate your time. >> Thanks very much. >> Thank you for watching. We'll be right back, right after this short break. You're watching theCube from IFS World in Boston, be right back. (upbeat music)

Published Date : Oct 8 2019



Dr. Vikram Saksena, NETSCOUT | CUBEConversation, July 2019


 

from the silicon angle media office in Boston Massachusetts it's the queue now here's your host still minimun hi I'm Stu minimun and this is a cube conversation from our Boston area studio happy to welcome to the program a first-time guest on the program but from knit scout who we've been digging into the concept of visibility without borders dr. Vikram Saxena who's with the office of the CTO from the for mention net scout thank you so much for joining us thanks to it thanks for having me all right dr. Zana before we get into kind of your role why don't you go back give us a little bit about you know your background you and I have some shared background comm we both work for some of the arms of you know Ma Bell that's right back in the day yeah you work a little bit more senior and yeah you know probably a lot more patents than I have my current count is still sure happy to do that you're right I started in 82 which was two years before the breakup of Marbella so you know and then everything started happening right around that time so yeah I started in Bell Labs you know stayed there close to 20 years did lot of the early pioneering work on packet switching before the days of internet frame relay all of that happened it was a pretty exciting time I was there building up we built up the AT&T business from scratch to a billion dollars in the IP space you know in a voice company that was always challenging so and then I moved on to do startups in the broadband space the two of them moved to the Boston area and then moved on to play the CTO role and public companies sonnez networks Tellabs and then you know came to an EPS card about five years ago yeah you know I I love talking about you know some of those incubators of innovation though I you know historically speaking just you know threw off so much technology that's right been seeing so much the media lately about you know the 50th anniversary of Apollo 11 that's so many things that came out of NASA Bell Labs was 
one of those places that helped inspire me to study engineering that's you know definitely got me on my career but here we are 2019 that's you're still you know working into with some of these telcos and how they're all you know dealing with this wave of cloud and yeah I know the constant change there so bring us inside you know what's your role inside net Scout that office of the CTO yes so net Scout is in the business of you know mining Network data and and what we excel at is extracting what we call actionable intelligence from network traffic which we use the term smart data but essentially my role is really to be the bridge between our technology group and the customers you know bring out understand the problems the challenges that our customers are facing and then work with the teams to build the right product to you know to fit in to the current environment okay one of our favorite things on the cube is you know talking to customers they're going through their transformation that's what you talk about the enterprise you know digital transformation that's what we think there's more than just the buzzword there yeah I've talked to financial institutions manufacturing you know you name it out there if it's a company that's not necessarily born in the cloud they are undergoing that digital transformation bring us inside you know your customer base that this telcos the service providers you know most of them have a heavy tech component to what they're doing but you know are they embracing digital transformation what what does it mean for them so you know as you said it's it's a big term that catches a lot of things but in one word if I described for the telcos it's all about agility if you look at the telco model historically it has been on a path where services get rolled out every six months year multiple years you know not exactly what we call an agile environment compared to today you know but when the cloud happened it changed the landscape because cloud not 
only created a new way of delivering services but also changed expectations on how fast things can happen and that created high expectations on the customer side which in turn started putting pressure on the on the telcos and and the service providers to become as agile as cloud providers and and and as you know the the network which is really the main asset of a service provider was built around platforms that were not really designed to be programmable you know so they came in with hardwired services and they would change at a very low timescale and building around that is the whole software layer of OS SPSS which over time became very monolithic very slow to change so coupling the network and the software layer created a very slow moving environment so this is what's really causing the change to go to a model where the networks can be programmable which essentially means moving from a hardware centric model to a software centric model where services can be programmed on-demand and created on the fly and maybe sometimes even under the control of the customers and layering on top of that changing the OS s infrastructure to make it more predictive make it more actionable and driven by advances in machine learning and artificial intelligence to make this entire environment extremely dynamic in agile so that's kind of what we are seeing in the marketplace yeah I totally agree that that agility is usually the first thing put forward I I need to be faster yeah it used to be you know faster better cheaper now like a faster faster faster I can actually help compensate for some of those other pieces there of course service riders usually you know very conscious on the cost of things there because if they can lower their cost they can usually of course make them more competitive and pass that along to their ultimate consumers you know bring us inside that you know you mentions this change to software that's going on you know there are so many waves of change going on there 
everything from IoT and edge computing to 5G, which even gets talked about in the general press these days and by governments. So where are your customers today, what are some of the critical challenges they have, and where does that kind of monitoring and observability piece fit in? >> Good. So let me give two backdrop points. First of all, you mentioned cost. They are always very cost-conscious, always trying to drive it down, and the reason is that the traditional services have been heavily commoditized. Voice, texting, video, data: they've been commoditized, so the customers want the same stuff cheaper and cheaper all the time. That puts pressure on margins and on reducing cost. But now the industry is at a point where I think the telcos need to grow the top line. That's a challenge, because you can always reduce cost, but at some point you reach diminishing returns. So now the challenge is how they grow the top line so they can become healthier again, and that leads to the whole notion of what services they need to innovate on. Once you have a programmable network and software that is intelligent and smart, that becomes a platform for delivering new services. This is where, on the enterprise side, you see SDN, enterprise IoT, all these services coming now, using the technologies of software-defined networking and network function virtualization. And 5G, as you mentioned, is the next generation of wireless technology coming on board right now, and that opens up the possibility, for the first time, for new dimensions to come into play. First, not only a consumer-centric focus, which was always there, but opening it up to enterprises and businesses and IoT. And secondly, fixed broadband: the era where telcos used to either lay copper
or fiber, slow and cumbersome, takes a lot of time, and the cable guys have already done that with coaxial cable. So they need to go faster, and faster means wireless. Finally, with 5G you have a technology that can deliver fixed broadband, which means all the high-definition video, voice, data, and other services like AR and VR, into the home. So it's opening up a new possibility: rather than having a separate fixed network and a separate wireless network, for the first time they can collapse that into one common platform and go after both fixed and mobile, and both the consumer and enterprise markets. >> Yeah, one of the big topics of conversation at Cisco Live in San Diego just a short time ago was 5G, and then Wi-Fi 6, the next generation of that, because I'm still going to need Wi-Fi inside my building for the companies. But 5G holds the promise to give me so much faster bandwidth and a so much denser environment. I guess some of the concerns I hear out there, and maybe you can tell me where we are and where the telcos fit in: from a technology standpoint we understand where 5G is, but that rollout is going to take time. It's great to say you're going to have this dense and highly available thing, but it's going to start in the same place all the previous generations did, which is the places where we actually don't have bad connectivity today: the urban areas, where we have dense populations. Sometimes it's thrown out there that 5G is going to be great for edge and IoT, and it's like, well, we don't have the balloons and planes and towers everywhere. So where are we with that rollout of 5G, and what sort of timeframes is your customer base looking at? >> From what I'm seeing in the marketplace, I think there is less of a focus on building out ubiquitous coverage, because when the focus is on
consumers, you need coverage, because they're everywhere. But because they want to create new revenue and new top-line growth, they're focusing more on industry verticals and IoT. That allows you to build out networks in pockets where your customers are, because enterprises are concentrated in the top cities and top metro areas. So before you make it available for consumers, you get an opportunity to build out infrastructure, at least in the major metropolitan areas, where you're getting paid as you build it out, because you're signing up enterprise customers who are willing to pay for these IoT services. You get paid, you get to build out the infrastructure, and then slowly, as new applications emerge, you can make it widely available for consumers. I think the challenge on the consumer side is that smartphones have been tapped out, and people are not going to get that excited about 5G just to use the next-generation phone. So there it has to be about new applications and services. The things people talk about, always on the horizon, are AR and VR and things like that, but they're not there today, because a device has to come on board that is mass-consumable and exciting to customers. While the industry is waiting for that to happen, I think there's a great opportunity right now to turn up services for enterprise verticals in the IoT space, because the devices are ready, and because enterprises are going through their own digital transformations, they want to be in a connected world. So they're putting pressure on telcos to connect all their devices into the network, and there is a monetization opportunity there. So I think what the carriers are going to do is sign up verticals, whether it's transportation or healthcare. If they sign up a bunch of hospitals, they're going to deploy infrastructure in that area to serve those hospitals; if they're going to sign up
manufacturing, they're going to build out their infrastructure in those areas. By that model you can build out a 5G network that is concentrated on their customer base, and then get to ubiquitous coverage later, when the consumer applications come. >> Yeah, I like that a lot, because if we've learned from the sins of the past: it used to be "if we build it, they will come." Let's dig trenches along all the highways with as much fiber as we can, and then the dot-com bust happens and we have all this capacity that we can't give away. What it sounds like you're describing is really a service-centric view: I've got customers and I've got applications, I'm going to build to that, and then I can build off of that. Can you talk a little bit about that focus and where your customers are going? >> Sure, but maybe just before that, I want to talk about the distributed nature of the 5G network. You mentioned edge. One of the things that happens when you want to deliver low-latency or high-bandwidth services is that you need to push things closer to the edge. When cloud started, it was more in what we call the core, the large hyperscale data centers where applications are deployed. But when you demand low latency, say sub-15-millisecond or 10-millisecond latency, that has to be pushed much closer to the customer. This is what's forcing the edge cloud deployment in 5G, and it also forces you to distribute functionality. Everything is not centralized in the core; it's distributed at the edge. The control plane may stay in the core, but the user plane moves to the edge. So that changes the entire flow of traffic and services in a 5G network. They are no longer centralized, which means it becomes more challenging to manage and assure these services in a
highly distributed telco cloud environment that has this notion of edge and core. On top of that, if you say this is all about top-line growth and customer satisfaction, then your focus in operationalizing these services has to change from a network-centric view to a service-centric view. In the past, as you know, when we were both at Bell Labs and AT&T, we were pretty much focused on the network: on the data from the network elements, the switches and the routers, and on making sure the network was healthy. That is good, but it's not sufficient to guarantee that the services, and the service-level agreements for customers, are being met. So you need to focus at the service layer much more than you did in the past. That changes the paradigm on what data you need to use, how you want to use it, and how you stitch together this view in a highly distributed environment, and do it in real time, and do it all very quickly, so customers don't see the pain if anything breaks, and in fact be more proactive, in a lot of cases more predictive, and take corrective actions before services are impacted. So this is the challenge, and clearly, from a NETSCOUT point of view, I think we are right in the center of this hurricane, and given our history we have sort of figured out how to do this. >> Networking has a long history of "we've got a lot of data, we've got all of these flows," and things change, but exactly as you said, it's understanding what happened at the application. IT can't just be sitting on the side; it's IT driving that business. That's my application, those data flows. So maybe expound a little bit more on NETSCOUT's fit there, and why it's so critical for what customers need today. >> Happy to do that. So if you look at the sources of data that you can actually use,
and what you should use, they basically fall into three buckets. The first is what I call infrastructure data, which is the data you get from hypervisors and virtual switches. It tells you more about how the infrastructure is behaving and where you need to add more horsepower: CPUs, memory, storage, and so on. So that is very infrastructure-centric. The second is data from the network elements: what the DNS servers and DHCP servers give you, what your routers and switches are giving you, what the firewalls are giving you. They are, in a way, telling you more about what the network elements are seeing, so there's a bit of a hybrid between the infrastructure and the service-layer components. But the problem is that this data is very vendor-dependent and highly fragmented, because there are no real standards for how to create it. There is telemetry data, there are syslogs, and every vendor does what it thinks is best for them. The challenge then becomes, on the service provider side, how you stitch it together, because a service, or an application, is an end-to-end construct. It starts at a user and goes to a server, and you need to be able to get that holistic, end-to-end view. So the most appropriate data, NETSCOUT feels, is what we call wire data, or traffic data. It's actually looking at the packets themselves, because they give you the most direct knowledge of how the service is behaving and how it's performing. And not only that: you can actually predict problems, as opposed to reacting to them, because you can trend this data, apply machine learning to it, say what might go wrong, and take corrective action. So we feel that extracting the right contextual, relevant, and timely information, in a vendor-independent way that is universally available from edge to core, those are the attributes of wire data, and we excel at processing it at the source, in real time, and
converting all of that into actionable intelligence that is very analytics- and automation-friendly. So this is our strength. What that allows our customers to do, as they go through this transition between 4G and 5G, between physical and virtual, across fixed and mobile networks, is go through it with a stitched-together, end-to-end view that crosses these boundaries, or borders as we call them: visibility without borders. In that context, your operations people never lose insight into what's going on with their customers' applications and behavior, so they can go through this migration with confidence that they will not negatively impact the user experience by using our technology. >> You know, we've thrown out these terms "intelligence" and "automation" for decades in our industry, but if you look at these hybrid environments and all of these changes coming, if an operator doesn't have tools like this, they can't keep up. So I need that machine learning; I have to have those tools that can help me intelligently attack these pieces, because otherwise there's no way I can do it. >> Yeah, and one point there: it's garbage in, garbage out. If you don't get the right data, you can have the most sophisticated machine learning, but it's not going to predict the right answer. So the quality of the data is just as important as the quality of your analytics and your algorithms. We feel that the combination of the right data and the right analytics is how you're going to get accurate predictions and automation around the whole suite. >> Okay, love that: right data, right information, right analytics. I want to give you the final word: final takeaways for your customers today. >> I think we are at a very exciting time in the industry. 5G is probably the first generation of technology coming on board where there is so much focus on things like security
and new applications and so on, and I think it's an exciting time for service providers to take advantage of this platform, use it to deliver new services, and ultimately see their top lines grow, which we all want in the industry, because if they are successful, then we as suppliers do well. So I think it's a pretty exciting time, and we at NETSCOUT are happy to be in this spot right now, to see and help our customers go through this transition. >> Alright, Dr. Vikram Singh Saxena, thank you so much for joining us and sharing everything that's happening in your space. Glad to see the excitement still with the journey that you've been on. >> Thank you, Stu, happy to be here. >> Alright, and as always, check out theCUBE.net for all of our content. I'm Stu Miniman, and thanks as always for watching theCUBE. (upbeat music)
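The "trend the wire data and predict problems" idea Saxena describes can be sketched with a rolling-baseline check on a latency series. This is a minimal illustration, not NETSCOUT's implementation: the metric, window size, and three-sigma rule are all assumptions made for the example.

```python
# Hedged sketch: flag latency samples that drift beyond a rolling baseline,
# so a problem can be caught proactively rather than after an SLA breach.
# The window and sigma threshold are illustrative choices.
from statistics import mean, stdev

def flag_drift(latencies_ms, window=5, sigmas=3):
    """Return indices of samples exceeding mean + sigmas*stdev of the
    preceding `window` samples."""
    alerts = []
    for i in range(window, len(latencies_ms)):
        base = latencies_ms[i - window:i]
        mu, sd = mean(base), stdev(base)
        if latencies_ms[i] > mu + sigmas * sd:
            alerts.append(i)
    return alerts

samples = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 10.1, 14.5, 9.9]
print(flag_drift(samples))  # → [7]
```

A production system would trend many such metrics per service, per site, across the edge and core, but the principle is the same: a baseline learned from the data itself, checked in real time.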

Published Date : Jul 17 2019


Survey Data Shows Momentum for IBM Red Hat But Questions Remain


 

>> From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE! (upbeat electronic music) Now, here's your host, Dave Vellante. >> Hi, everybody, this is Dave Vellante, and I want to share with you some recent survey data that talks to the IBM acquisition of Red Hat, which closed today. It's always really valuable to go out, talk to practitioners, see what they're doing, and it's a hard thing to do. It's very expensive to get this type of survey data. A lot of times, it's very much out of date. You might remember. Some of you might remember a company called the InfoPro. Its founder and CEO was Ken Male, and he raised some money from Gideon Gartner, and he had this awesome survey panel. Well, somehow it failed. Well, friends of mine at ETR, Enterprise Technology Research, have basically created a modern version of the InfoPro. It's the InfoPro on steroids with a modern interface and data science behind it. They've now been at this for 10 years. They built a panel of 4,500 users, practitioners that they can go to, a lot of C level folks, a lot of VP level and then some doers down at the engineering level, and they go out and periodically survey these folks, and one of the surveys they did back in October was what do you think of the IBM-Red Hat acquisition? And then they've periodically gone out and talked to customers of both Red Hat and IBM or both to get a sense of the sentiment. So given that the acquisition closed today, we wanted to share some of that data with you, and our friends at ETR shared with us some of their drill down data with us, and we're going to share it with you. So first of all, I want to summarize something that they said. Back in October, they said, "We view this acquisition as less of an attempt "by IBM to climb into the cloud game, cloud relevance, "but rather a strategic opportunity "to reboot IBM's early 1990s IT services business strategy." I couldn't agree with that more. 
I've said all along this is a services play, connecting OpenShift from Red Hat into what Ginni Rometty talks about as the 80% of the install base that is still on prem, with the workloads at the backend of mission critical systems that need to be modernized. That's IBM's opportunity. That's why this is a front end loaded cashflow deal, 'cause IBM can immediately start doing business through its services organization and generate cash. They went on to say, ETR said, "Here, IBM could position itself "as the de facto IT services partner "for Fortune 100 to Global 2000 organizations "and their digital transformations. "Therefore, in theory, this could reinvigorate "the global services business for IBM "and their overlapping customer bases "could allow IBM to recapture and accelerate a great deal "of service revenues that they have lost "over the past few years." Again, I couldn't agree more. It's less about a cloud play. It is definitely about a multi-cloud play, which is how IBM's positioning this, but services de-risks this entire acquisition in my opinion even though it's very large, 34 billion. Okay, I'll show you some data. So pull up this slide. So what ETR does is they'll go out. So this is a survey, right after the acquisition, of about 132 Global 2000 practitioners across a bunch of different industries: energy, utilities, financial services, government, healthcare, IT, telco, retail, consumer. So a nice cross section of industries, largely in North America but with a healthy cross section of EMEA and APAC. And again, these are large enterprises. So what this slide shows is conditioned responses, which I love conditioned responses. It sort of forces people to answer which of the following best describes. But this says, "Given IBM's intent to acquire Red Hat, "do you believe your organization will be more likely "to use this new combination "or less likely in your digital transformation?"
You can see here on the left hand side, the green, 23% positive, on the right hand side, 13% negative. So, the data doesn't necessarily support ETR's original conclusions and my belief that this all about services momentum because most IT people are going to wait and see. So you can see the fat middle there is 64%. Basically you're saying, "Yeah, we're going to wait and see. "This really doesn't change anything." But nonetheless, you see a meaningfully more positive sentiment than negative sentiment. The bottom half of this slide shows, the question is, "Do you believe that this acquisition "makes or will make IBM a legitimate competitor "in the cloud wars between AWS and Microsoft Azure?" You can see on the left hand side, it says 45% positive. Very few say, all the way on the left hand side, a very legitimate player in the cloud on par with AWS and Azure. I don't believe that's the case. But a majority said, "IBM is surely better off "with Red Hat than without Red Hat in the context of cloud." Again, I would agree with that. While I think this is largely a services play, it's also, as Stu Miniman pointed out in an earlier video with me, a cloud play. And you can see it's still 38% is negative on the right hand side. 15% absolutely not, IBM is far behind AWS and Azure in cloud. I would tend to agree with that, but IBM is different. They're trying to bring together its entire software portfolio so it has a competitive approach. It's not trying to take Azure and AWS head on. So you see 38% negative, 45% positive. Now, what the survey didn't do is really didn't talk to multi-cloud. This, to me, puts IBM at the forefront of multi-cloud, right in there with VMware. You got IBM-Red Hat, Google with Anthos, Cisco coming at it from a network perspective and, of course, Microsoft leveraging its large estate of software. So, maybe next time we can poke at the multi-cloud. Now, that survey was done of about over 150, about 157 in the Global 2000. Sorry, I apologize. 
That was 137. The next chart that I'm going to show you is a sentiment chart that took a pulse periodically, which was 157 IT practitioners, C level executives, VPs and IT practitioners. And what this chart shows essentially is the spending intentions for Red Hat over time. Now, the green bars are really about the adoption rates, and you can see they fluctuate; it's the percentage on the left hand side, and time is on the horizontal axis. The red is the replacement: we're going to replace, we're not going to buy, we're going to replace. In the middle is that fat middle: we're going to stay flat. So the yellow line is essentially what ETR calls market share. It's really an indication of mind share in my opinion. And then the blue line is spending intentions net score. So what does that mean? What that means is they basically take the gray, which is staying the same, they subtract out the red, which is we're doing less, and they add in the we're going to do more. So what does this data show? Let's focus on the blue line. So you can see, you know, slightly declining, and then pretty significantly declining last summer, maybe that's 'cause people spend less in the summer, and then really dropping coming into the announcement of the acquisition. In October of 2018, IBM announced the $34 billion acquisition of Red Hat. Look at the spike post announcement. The sentiment went way up. You have a meaningful jump. Now, you see a little dip in the April survey, and again, that might've been just an attenuation of the enthusiasm. Now, July is going on right now, so that's why it's phased out, but we'll come back and check that data later. And then you can see this sort of similar trend with what they call market share, which, to me, is, again, really mind share and kind of sentiment. You can see the significant uptick in momentum coming out of the announcement. So people are generally pretty enthusiastic.
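The net-score arithmetic described above is easy to write out. This sketch follows the description in the segment (spend-more minus spend-less, with flat responses counted in the base); the respondent counts are invented for illustration and are not ETR's actual data.

```python
# Sketch of the "net score" arithmetic: the share of respondents planning
# to spend more, minus the share planning to spend less. The counts below
# are made up for illustration.
def net_score(more, flat, less):
    """Return net spending intention as a percentage of all respondents."""
    total = more + flat + less
    return round(100 * (more - less) / total, 1)

print(net_score(more=45, flat=40, less=15))  # → 30.0
```

A large "fat middle" of flat responses drags the score toward zero, which is why the wait-and-see majority in the survey matters as much as the positive tail.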
Again, remember, these are customers of IBM, customers of Red Hat and customers of both. Now, let's see what the practitioners said. Let's go to some of the open endeds. What I love about ETR is they don't just do the hardcore data, they actually ask people open ended questions. So let's put this slide up and share with you some of the drill down statements that I thought were quite relevant. The first one is right on. "Assuming IBM does not try to increase subscription costs "for RHEL," Red Hat Enterprise Linux, "then its organizational issues over sales "and support should go away. "This should fix an issue where enterprises "were moving away from RHEL to lower cost alternatives "with significant movement to other vendors. "This plus IBM's purchase of SoftLayer and deployment "of CloudFoundry will make it harder "for Fortune 1000 companies to move away from IBM." So a lot of implied things in there. The first thing I want to mention is IBM has a nasty habit, when it buys companies, particularly software companies, of raising prices. You certainly saw this with SPSS. You saw this with other smaller acquisitions like Ustream. Cognos customers complained about that. IBM buys software companies with large install bases. There's a lock-in aspect. It'll raise prices. It works, because financially it's clearly worked for IBM, but it sometimes ticks off customers. So IBM has said it's going to keep Red Hat separate. Let's see what it does from a pricing standpoint. The next comment here is kind of interesting. "IBM has been trying hard to "transition to cloud-service model. "However, its transition has not been successful "even in the private-cloud domain." So basically these guys are saying something that I've just said: IBM's cloud strategy essentially failed to meet its expectations. That's why it has to go out and spend $34 billion with Red Hat.
While it's certainly transformed IBM in some respects, IBM's still largely a services company, not as competitive in cloud as it would've liked. So this guy says, "let alone in this fiercely competitive "public cloud domain." They're not number one. "One of the reasons, probably the most important one, "is IBM itself does not have a cloudOS product. "So, acquiring Red Hat will give IBM "some competitive advantage going forward." Interesting comments. Let's take a look at some of the other ones here. I think this is right on, too. "I don't think IBM's goal is to challenge AWS "or Azure directly." 100% agree. That's why they got rid of the low end Intel business, because it's not trying to be in the commodity businesses. They cannot compete with AWS and Azure in terms of the cost structure of cloud infrastructure. No way. "It's more to go after hybrid multi-cloud." Ginni Rometty said today at the announcement, "We're the only hybrid multi-cloud, open source vendor out there." Now, the third piece of that, open source, I think is less important than competing in hybrid and multi-cloud. Clearly Red Hat gives IBM a better position to do this with CoreOS, CentOS. And so is it worth 34 billion? This individual thinks it is. It's a vice president of a financial insurance organization, again, IBM's stronghold. So you can see some of the other comments here. "For customers doing significant business "with IBM Global Services teams." Again, outsourcing: it's a 10-plus billion dollar opportunity for IBM to monetize over the next five years, in my opinion. "This acquisition could help IBM "drive some of those customers "toward a multi-cloud strategy "that also includes IBM's cloud."
Yes, it's very much a play that will integrate services, Red Hat, Linux, OpenShift, and of course, IBM's cloud. Sprinkle in a little Watson, throw in some hardware; IBM has a captive channel, so the storage guys and the server guys can sell their hardware in there if the customer doesn't care. So it's a big integrated services play. "Positioning Red Hat, and empowering them "across legacy IBM silos, will determine if this works." Again, couldn't agree more. These are very insightful comments. This is largely a services and an integration play. Hybrid cloud, multi-cloud is complex. IBM loves complexity. IBM's services organization is number one in the industry. Red Hat gives it an ingredient that it didn't have before other than as a partner. IBM now owns that intellectual property and can really go hard and lean in to that services opportunity. Okay, so thanks to our friends at Enterprise Technology Research for sharing that data, and thank you for watching theCUBE. This is Dave Vellante signing off for now. Talk to you soon. (upbeat electronic music)

Published Date : Jul 9 2019


Rob Thomas, IBM | IBM Innovation Day 2018


 

(digital music) >> From Yorktown Heights, New York It's theCUBE! Covering IBM Cloud Innovation Day. Brought to you by IBM. >> Hi, it's Wikibon's Peter Burris again. We're broadcasting on theCUBE from IBM Innovation Day at the Thomas J Watson Research Laboratory in Yorktown Heights, New York. We have a number of great conversations, and we've got a great one right now. Rob Thomas, who's the General Manager of IBM Analytics, welcome back to theCUBE. >> Thanks Peter, great to see you. Thanks for coming out here to the woods. >> Oh, well it's not that bad. I actually live not too far from here. Interesting, Rob, I was driving up the Taconic Parkway and I realized I hadn't been on it in 40 years, so. >> Is that right? (laugh) >> Very exciting. So Rob, let's talk IBM analytics and some of the changes that are taking place. Specifically, how are customers thinking about achieving their AI outcomes? What's that ladder look like? >> Yeah. We call it the AI ladder, which is basically all the steps that a client has to take to get to an AI future, is the best way I would describe it. From how you collect data, to how you organize your data, how you analyze your data, start to put machine learning into motion, how you infuse your data, meaning you can take any insights, infuse it into other applications. Those are the basic building blocks of this ladder to AI. 81 percent of clients that start to do something with AI realize their first issue is a data issue. They can't find the data, they don't have the data. The AI ladder's about taking care of the data problem so you can focus on where the value is, the AI pieces. >> So, AI is a pretty broad, hairy topic today. What are customers learning about AI? What kind of experience are they gaining? How is it sharpening their thoughts and their pencils, as they think about what kind of outcomes they want to achieve? >> You know, its... For some reason, it's a bit of a mystical topic, but to me AI is actually quite simple.
I'd like to say AI is not magic. Some people think it's a magical black box. You just, you know, put a few inputs in, you sit around and magic happens. It's not that, it's real work, it's real computer science. It's about how do I build models? Put models into production? Most models, when they go into production, are not that good, so how do I continually train and retrain those models? Then the AI aspect is about how do I bring human features to that? How do I integrate that with natural language, or with speech recognition, or with image recognition. So, when you get under the covers, it's actually not that mystical. It's about basic building blocks that help you start to achieve business outcomes. >> It's got to be very practical, otherwise the business has a hard time ultimately adopting it, but you mentioned a number of different... I especially like the 'add the human features' to it of the natural language. It also suggests that the skill set of AI starts to evolve as companies mature up this ladder. How is that starting to change? >> That's still one of the biggest gaps, I would say. Skill sets around the modern languages of data science that lead to AI: Python, R, Scala, as an example of a few. That's still a bit of a gap. Our focus has been how do we make tools that anybody can use. So if you've grown up doing SPSS or SAS, something like that, how do you adopt those skills for the open world of data science? That can make a big difference. On the human features point, we've actually built applications to try to make that piece easy. A great example is with Royal Bank of Scotland, where we've created a solution called Watson Assistant, which is basically how do we arm their call center representatives to be much more intelligent and engaging with clients, predicting what clients may do. Those types of applications package up the human features and the components I talked about, and make it really easy to get AI into production.
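Thomas's point that most models "are not that good" when they first go into production describes a monitor-and-retrain loop. A minimal sketch of that loop (a toy drift check, not IBM's implementation; the "model" here is just a running mean, and the window and tolerance are invented for illustration):

```python
# Toy monitor-and-retrain loop: the "model" is the mean of a recent
# window, and we retrain whenever a new observation drifts further
# from the model than a tolerance allows.
from statistics import mean

def train(history):
    # Stand-in for real model training.
    return mean(history)

def monitor_and_retrain(stream, window=5, tolerance=2.0):
    model = train(stream[:window])
    retrains = 0
    for i in range(window, len(stream)):
        error = abs(stream[i] - model)
        if error > tolerance:                      # drift detected
            model = train(stream[max(0, i - window):i + 1])
            retrains += 1
    return model, retrains

# A stable stream that suddenly shifts forces several retrains.
print(monitor_and_retrain([1, 1, 1, 1, 1, 1, 10, 10, 10, 10]))
```

In practice the model would be a real learner and the drift test far more careful (holdout metrics, statistical tests); the shape of the loop, train, monitor, retrain, is the point.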
>> Now many years ago, the genius Turing noted the notion of the Turing test, where you couldn't tell the difference between the human and a machine from an engagement standpoint. We're actually starting to see that happen in some important ways. You mentioned the call center. >> Yep. >> How are technologies and agency coming together? By that I mean, the rate at which businesses are actually applying AI to act as an agent for them in front of customers? >> I think it's slow. What I encourage clients to do is, you have to do a massive number of experiments. So don't talk to me about the one or two AI projects you're doing, I'm thinking like hundreds. I was with a bank last week in Japan, and their comment was in the last year they've done a hundred different AI projects. These are not one year long projects with hundreds of people. It's like, let's do a bunch of small experiments. You have to be comfortable that probably half of your experiments are going to fail, that's okay. The goal is how do you increase your win rate. Do you learn from the ones that work, and from the ones that don't work, so that you can apply those. To me, at this stage, this is all about experimentation. Any enterprise right now has to be thinking in terms of hundreds of experiments, not one, not two or 'Hey, should we do that project?' Think in terms of hundreds of experiments. You're going to learn a lot when you do that. >> But as you said earlier, AI is not magic and it's grounded in something, and it's increasingly obvious that it's grounded in analytics. So what is the relationship between AI and analytics, and what types of analytics are capable of creating value independent of AI? >> So if you think about how I kind of decomposed AI, talked about human features, I talked about, it kind of starts with a model, you train the model. The model is only as good as the data that you feed it. So, that assumes that one, that your data's not locked into a bunch of different silos.
It assumes that your data is actually governed. You have a data catalog or that type of capability. If you have those basics in place, once you have a single instantiation of your data, it becomes very easy to train models, and you can find that the more that you feed it, the better the model's going to get, the better your business outcomes are going to get. That's our whole strategy around IBM Cloud Private for Data. Basically, one environment, a console for all your data, build a model here, train it on all your data, no matter where it is, it's pretty powerful. >> Let me pick up on that 'where it is,' 'cause it's becoming increasingly obvious, at least to us and our clients, that the world is not going to move all the data over to a central location. The data is going to be increasingly distributed closer to the sources, closer to where the action is. How are AI and that notion of increasingly distributed data going to work together for clients? >> So we've just released what's called IBM Data Virtualization this month, and it is a leapfrog in terms of data virtualization technology. So the idea is leave your data wherever it is, it could be in a data center, it could be in a different data center, it could be on an automobile if you're an automobile manufacturer. We can federate data from anywhere, take advantage of processing power on the edge. So we're breaking down that problem. Which is, the initial analytics problem was before I do this I've got to bring all my data to one place. It's not a good use of money. It's a lot of time and it's a lot of money. So we're saying leave your data where it is, we will virtualize your data from wherever it may be. >> That's really cool. What was it called again? >> IBM Data Virtualization, and it's part of IBM Cloud Private for Data. It's a feature in that. >> Excellent, so one last question Rob.
February's coming up, IBM Think in San Francisco, thirty plus thousand people, what kind of conversations do you anticipate having with your customers, your partners, as they try to learn, experiment, take away actions that they can take to achieve their outcomes? >> I want to have this AI experimentation discussion. I will be encouraging every client, let's talk about hundreds of experiments, not five. Let's talk about what we can get started on now. Technology's incredibly cheap to get started and do something, and it's all about rate and pace, and trying a bunch of things. That's what I'm going to be encouraging. The clients that you're going to see on stage there are the ones that have adopted this mentality in the last year and they've got some great successes to show. >> Rob Thomas, general manager IBM Analytics, thanks again for being on theCUBE. >> Thanks Peter. >> Once again this is Peter Burris of Wikibon, from IBM Innovation Day, Thomas J Watson Research Center. We'll be back in a moment. (techno beat)
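The federation idea Thomas describes, querying the data where it lives rather than copying it to one place first, can be shown in miniature. The sketch below uses SQLite's ATTACH as a stand-in for a federation layer (the tables, names, and numbers are invented; IBM Data Virtualization operates at a very different scale): one SQL statement joins tables living in two independent database files.

```python
# Miniature "federated query": two independent SQLite files stand in
# for remote data sources, and ATTACH lets one query join across them
# without copying either into a central store.
import os
import sqlite3
import tempfile

def federated_query():
    d = tempfile.mkdtemp()
    plant_db = os.path.join(d, "plant.db")
    hq_db = os.path.join(d, "hq.db")

    # Source 1: sensor readings collected at a plant.
    c = sqlite3.connect(plant_db)
    c.execute("CREATE TABLE readings (part_id INTEGER, temp REAL)")
    c.executemany("INSERT INTO readings VALUES (?, ?)",
                  [(1, 70.0), (2, 95.5), (1, 72.0)])
    c.commit()
    c.close()

    # Source 2: the parts catalog kept at headquarters.
    c = sqlite3.connect(hq_db)
    c.execute("CREATE TABLE parts (part_id INTEGER, name TEXT)")
    c.executemany("INSERT INTO parts VALUES (?, ?)",
                  [(1, "pump"), (2, "valve")])
    c.commit()
    c.close()

    # "Federate": attach both sources and join them in one statement.
    con = sqlite3.connect(":memory:")
    con.execute("ATTACH DATABASE ? AS plant", (plant_db,))
    con.execute("ATTACH DATABASE ? AS hq", (hq_db,))
    rows = con.execute(
        "SELECT p.name, COUNT(*), AVG(r.temp) "
        "FROM plant.readings r JOIN hq.parts p USING (part_id) "
        "GROUP BY p.name ORDER BY p.name").fetchall()
    con.close()
    return rows

print(federated_query())  # [('pump', 2, 71.0), ('valve', 1, 95.5)]
```

A real federation layer adds what this sketch omits: pushing the aggregation down to each source, handling network failures, and hiding the fact that the sources aren't all SQLite.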

Published Date : Dec 7 2018



Arista Thurman III, Argonne | Veritas Vision Solution Day 2018


 

>> Narrator: From Chicago, it's The Cube. Covering Veritas Vision Solution Day 2018. Brought to you by Veritas. >> Welcome back to the Windy City everybody. You're watching The Cube, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. We're here at the Veritas Vision Solution Days in Chicago. It was just a few weeks ago that we were at the iconic Tavern on the Green in New York City. We're here at the Palmer House Hotel, beautiful hotel right in downtown Chicago near the lake. It's just an awesome venue, it's great to be here. Arista Thurman III is here, he's the principal computer engineer at the Argonne National Labs. Great to see you, thanks for coming on The Cube. >> Yah, good to be here, thanks. >> So tell the audience about Argonne National Labs. What are you guys all about? >> About science, so we're all about the advancement of science. We do a lot of different experiments, from technology for batteries to chemistry. The project we're working on is the Advanced Photon Source, which is a light source that's used to collect data in experiments. >> OK, so you're an IT practitioner, >> Arista Thurman: That is correct. >> Serving scientists. >> Arista Thurman: Yes. >> What's that like? Is that like an IT guy serving doctors? Are they kind of particular? >> Arista Thurman: A little bit. >> There's some challenges there, but yah it's great. So basically you have a unique customer base, and they have additional requirements. So, it's not like a normal customer base. They're very smart people. They have a lot of demands and needs, and we do our best to provide all the services they require. >> Yah, so given that they're technical people, they may not be IT people but they have an affinity to technology. First of all, it must be hard to BS them, right? (laughter) >> Arista Thurman: No doubt, no doubt. >> They'd cut through that, so you got to be straight with them.
And they're probably pretty demanding, right? I mean, they have limited resources and limited time and limited budgets, and they're probably pounding you pretty hard. Is that the case, or are they more forgiving? >> They're great people to work with, but there can be some challenges. I mean, it's unique in the idea that they work on multiple platforms. So it's from Unix to Linux to Mac. Multiple computers in their offices, multiple data requirements. And a lot of things happen without a lot of process and planning. Some things are ad hoc. So, it puts a little bit of strain sometimes on you to try to make everything happen in the amount of time they have. There's some challenges with regard to how to get things done in a timely fashion when you don't know what's going to happen with some of these experiments. >> I mean I imagine, right? They can probably deal with a lot of uncertain processes because that's kind of their lives, right? You must have to cobble things together for them to get them a solution sometimes, is that the case? >> We do sometimes. I think it's all about getting enough funding and enough resources to take care of all the different experiments. >> Dave Vellante: A balancing act. >> Yah. >> Dave Vellante: Ya so you look after compute and storage. >> Arista Thurman: Yes. >> Right, so talk about what's happening generally there and then specifically data protection. >> So in general, my primary focus is Linux. Linux administration, Red Hat Linux. And we've seen a lot of data growth over the last five years and we've got projections for more growth as we are planning for an upgrade. So we're going to change our beamline and make it more efficient. Have a better light source, and that's all planned in the next two to three years. And so, there's a lot of extra projects on top of our normal workload. We have a lot of equipment that probably needs to be refreshed.
There's resources, and with IT and any kind of data management, things change. So whatever we're doing today, in the next three years we'll be doing something different, because things change with regard to CPU speeds, performance of I/O, networking, storage requirements. All those things are continually growing exponentially. And when scientists want to do more experiments and they get new resources in, it's going to require more resources for us to maintain and keep them operational at the speeds and performance they want. >> Yah, we do hundreds of events with The Cube. We do about 130 events this year, and a lot of them have a so-called "big data" orientation. And when you go to those data oriented events, you hear a lot of, sort of, the roots of that. Or at least similarities to the scientific technical computing areas, and it's sort of evolved into big data. A lot of the disciplines are similar. So, you're talking about a lot of data here. Sometimes it's really fast data, and there's a lot of variety, presumably, in that data. So how much data are we talking about? Is it huge volumes? Maybe you could describe your data environment. >> Primarily we have things broken up into different areas. So we have some block storage, and that provides the back-end for our virtualization environments, which is either Microsoft or Red Hat RHV. I would estimate that's somewhere in the petabyte range. And then we also have our NAS file systems, which are spread across multiple environments, providing NFS version three and four, and also CIFS to Windows clients, and some of the Mac clients also utilize that. And that's at about a little less than a petabyte. We also have high performance computing, and that's a couple petabytes, at least. And all those numbers are just estimates because we're constantly growing.
>> Well I think it has to, it's all about a balancing act 'cause it's hard to back up everything in that same time window. So we have multiple backup environments providing resources for individual platforms. Like for Windows we'd do something a little different than we'd do for Linux. And we have different retention policies. Some environments need to be retained, retention is three years, and some is six months, some three months, and so you have to have a system of migrating your storage to faster discs and then tiering off to tape for long term retention. It's a challenge that we're constantly fighting with. >> How do you use Veritas? You're a customer obviously? >> Yah, we've been a Veritas customer for many years and we utilize Veritas in our virtualization environments. They kind of help us out with a central platform. We've actually explored other things, but the most cost effective thing for us at this point has been Veritas. We utilize them to back up primarily our NAS and our block files, our block file systems that provide most of the virtualization. >> Why Veritas? What is it about them that you have an affinity for? There's a zillion other backup software vendors out there, why Veritas? >> I think we have invested a lot in Veritas over the years. Predating my time at Argonne we've been using Veritas. In my previous career, at Sun Microsystems, we also had some kind of relationship with Veritas. So it's easy, and I think, like I mentioned earlier, we explored other things but it wasn't cost effective to make that kind of change. And it's been a reliable product. It does require work, but it has been a reliable product. >> So, you'd mentioned your Linux, Red Hat Linux. >> Arista Thurman: Yes. >> So you saw this, IBM announced it's going to buy Red Hat for 34 billion dollars. What were your thoughts when you heard that news? >> I was like, "Wow, what is going to happen now?" I was like, "How is that going to impact us?" Is it going to change our licensing model?
Or is it going to be a good thing, or a bad thing? Right now we just don't really know. We're just kind of waiting and seeing. >> But it's like, OK, I mean that's a big deal. It is the biggest deal certainly from IBM. Their biggest previous deal was I think Cognos at five billion, so this dwarfs that. The deal of course doesn't close probably till the second half of 2019. So it's going to take a while. But look, IBM is known, when it buys software companies, saw this with SPSS, you've seen it with other companies that it buys, it oftentimes will change the pricing model. How do you license Red Hat? Do you have an enterprise license agreement? Do you know offhand? >> We do have an agreement with them. >> Dave Vellante: Lock that in. Lock that in long term now before the deal goes down. >> One of my counterparts is in charge of that part of it. So I'm sure we'll be having that conversation shortly. >> Yah, interesting. Well listen, Arista, thanks very much for coming on The Cube, really appreciate your insight. >> Thank you. >> It's great to meet you, all right, you're welcome. Thanks for watching everybody, it's a wrap from Chicago. This has been The Cube, Veritas Vision Days. Check out SiliconAngle.com for all the news. TheCube.net is where you'll find these videos and a lot of others. You'll see where The Cube is next. Wikibon.com for all the research. Thanks to the team here, appreciate your help on the ground. We're out from Chicago, this is Dave Vellante. We'll see ya next time.
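The tiered retention Thurman describes, three years for some environments, six months or three months for others, comes down to a policy table consulted at expiry time. A simplified sketch (the platform names and periods below are made up; a real backup policy also covers schedules, disk-to-tape migration, and much more):

```python
# Simplified retention check: each platform class keeps backups for a
# different period, and anything older is flagged for expiry.
from datetime import date, timedelta

RETENTION = {                              # hypothetical policy table
    "finance": timedelta(days=3 * 365),    # ~ three years
    "linux":   timedelta(days=182),        # ~ six months
    "scratch": timedelta(days=91),         # ~ three months
}

def expired(backups, today):
    """backups: (platform, backup_date) pairs -> those past retention."""
    return [(p, d) for p, d in backups if today - d > RETENTION[p]]

today = date(2018, 11, 10)
backups = [("linux", date(2018, 1, 1)),    # ~ ten months old: expired
           ("linux", date(2018, 9, 1)),    # ~ two months old: kept
           ("finance", date(2016, 1, 1)),  # under three years: kept
           ("scratch", date(2018, 7, 1))]  # over three months: expired
print(expired(backups, today))
```

The interesting operational work is what happens to the flagged set: migrating to slower media before deletion rather than deleting outright, which is the disk-then-tape layering mentioned in the interview.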

Published Date : Nov 10 2018



Scott Hebner, IBM | Change the Game: Winning With AI


 

>> Live from Times Square in New York City, it's theCUBE. Covering IBM's Change the Game: Winning With AI. Brought to you by IBM. >> Hi, everybody, we're back. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. We're here with Scott Hebner, who's the VP of marketing for IBM analytics and AI. Scott, it's good to see you again, thanks for coming back on theCUBE. >> It's always great to be here, I love doing these. >> So one of the things we've been talking about for quite some time on theCUBE now, we've been following the whole big data movement since the early Hadoop days. And now AI is the big trend and we always ask, is this old wine, new bottle? Or is it something substantive? And the consensus is, it's real, it's real innovation because of the data. What's your perspective? >> I do think it's another one of these major waves, and if you kind of go back through time, there's been a series of them, right? We went from, sort of, centralized computing into client server, and then we went from client server into the whole world of e-business and the internet, back around the 2000 time frame or so. Then we went from internet computing to cloud. Right? And I think the next major wave here, that next step, is AI. And machine learning, and applying all this intelligent automation to the entire system. So I think, and it's not just an evolution, it's a pretty big change that's occurring here. Particularly the value that it can provide businesses is pretty profound. >> Well it seems like that's the innovation engine for at least the next decade. It's not Moore's Law anymore, it's applying machine intelligence and AI to the data and then being able to actually operationalize that at scale. With the cloud-like model, whether it's on-prem or off-prem, your thoughts on that? >> Yeah, I mean I think that's right on 'cause, if you kind of think about what AI's going to do, in the end it's going to be about just making much better decisions.
Evidence based decisions, your ability to get to data that is previously unattainable, right? 'Cause it can discover things in real time. So it's about decision making and it's about fueling better, and more intelligent, business processing. Right? But I think, what's really driving, sort of under the covers of that, is this idea that, are clients really getting what they need from their data? 'Cause we all know that the data's exploding in terms of growth. And what we know from our clients and from studies is only about 15% of business leaders believe that they're getting what they need from their data. Yet most businesses are sitting on about 80% of their data that's either inaccessible, un-analyzed, or un-trusted, right? So, what they're asking themselves is how do we first unlock the value of all this data. And they know they have to do it in new ways, and I think the new ways start to talk about cloud native architectures, containerization, things of that nature. Plus, artificial intelligence. So, I think what the market is starting to tell us is, AI is the way to unlock the value of all this data. And it's time to really do something significant with it, otherwise it's just going to be marginal progress over time. They need to make big progress. >> But data is plentiful, insights aren't. And part of your strategy has always been to bring insights out of that data and obviously focus on client outcomes. But, a big part of your role is not only communicating IBM's analytics and AI strategy, but also helping shape that strategy. How do you, sort of, summarize that strategy? >> Well we talk about the ladder to AI, 'cause one thing when you look at the actual clients that are ahead of the game here, and the challenges that they've faced to get to the value of AI, what we've learned, very, very clearly, is that the hardest part of AI is actually making your data ready for AI. It's about the data.
It's sort of this notion that there's no AI without an information architecture, right? You have to build that architecture to make your data ready, 'cause bad data will be paralyzing to AI. And actually there was a great MIT Sloan study that they did earlier in the year that really dives into all these challenges and, if I remember correctly, about 81% of them said that the number one challenge they had is their data. Is their data ready? Do they know what data to get to? And that's really where it all starts. So we have this notion of the ladder to AI. It's several very prescriptive steps that we believe, through best practices, you need to actually take to get to AI. And once you get to AI, then it becomes about how you operationalize it in a way that it scales, that you have explainability, you have transparency, you have trust in what the model is. But it really is a systematic approach here that we believe clients are going to get there in a much faster way. >> So the picture of the ladder here, it starts with collect, and that's kind of what we did with Hadoop, we collected a lot of data 'cause it was inexpensive, and then organizing it, it says, create a trusted analytics foundation. Still building that sort of framework, and then analyze, and actually start getting insights on demand. And then automation, that seems to be the big theme now. Is, how do I get automation? Whether it's through machine learning, infusing AI everywhere. Maybe blockchain is part of that automation, obviously. And then ultimately getting to the outcome, you call it trust, achieving trust and transparency, that's the outcome that we want here, right? >> I mean I think it all really starts with making your data simple and accessible. Which is about collecting the data. And doing it in a way you can tap into all types of data, regardless of where it lives.
So the days of trying to move data around all over the place, or heavy duty replication and integration, are over: let it sit where it is, but be able to virtualize it and collect it and containerize it, so it can be more accessible and usable. And that kind of goes to the point that 80% of the enterprise data is inaccessible, right? So it all starts first with, are you getting all the data collected appropriately, and getting it into a way that you can use it. And then we start feeding things in like IOT data, and sensors, and it becomes real time data that you have to do this against, right? So, notions of replicating and integrating and moving data around become not very practical. So that's step one. Step two is, once you collect all the data, that doesn't necessarily mean you trust it, right? So when we say trust, we're talking about business ready data. Do people know what the data is? Are there business entities associated with it? Has it been cleansed, right? Have all the duplicates been taken out? What do you do in a situation where you have sources of data that are telling you different things? Like, I think we've all been on a treadmill where the phone, the watch, and the treadmill will actually tell you different distances, I mean what's the truth? The whole notion of organizing is getting it ready to be used by the business, and applying the policies, the compliance, and all the protections that you need for that data. Step three is the ability to build out all this, the ability to analyze it. To do it at scale, right, and to do it in a way that everyone can leverage the data. So not just the business analysts, but you need to enable everyone through self-service. And that's the advancements that we're getting in new analytics capabilities that make mere mortals able to get to that data and do their analysis.
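Hebner's treadmill example is a record-reconciliation problem: several sources report different values for the same entity, and the organize step has to settle on one business-ready value. A toy sketch (the sources and numbers are invented, and taking the per-entity median is just one possible rule; real data governance tools apply far richer matching and survivorship logic):

```python
# Toy reconciliation: several sources report a distance for the same
# run; keep the per-entity median as the single "business ready" value.
from collections import defaultdict
from statistics import median

def reconcile(readings):
    """readings: (entity_id, source, value) triples -> {entity_id: value}."""
    by_entity = defaultdict(list)
    for entity, _source, value in readings:
        by_entity[entity].append(value)
    return {entity: median(values) for entity, values in by_entity.items()}

readings = [("run1", "watch", 5.2), ("run1", "phone", 5.0),
            ("run1", "treadmill", 4.8), ("run2", "watch", 3.1)]
print(reconcile(readings))  # {'run1': 5.0, 'run2': 3.1}
```

The median is robust to one wildly wrong device; other survivorship rules (most trusted source wins, most recent wins) fit the same shape.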
>> And if I could interject, the challenge with the sort of traditional decision support world is you had maybe two or three people that were, like, the data gods. You had to go through them, and they would get the analysis. And it's just, the agility wasn't there. >> Right. >> So you're trying to democratize that, putting it in the hands. >> Absolutely. >> Maybe the business user's not as much of an expert as the person who can build the cube, but they could find new use cases, and drive more value, right? >> Actually, from a developer that needs to get access, and analytics infused into their applications, to the other end of the spectrum, which could be a marketing leader, a finance planner, someone who's planning budgets, a supply chain planner. Right, so it's that whole spectrum, not only allowing them to tap into and analyze the data and gain insights from it, but allowing them to customize how they do it, and do it in a more self-service way. So that's the notion of scale, on demand insights. It's really a cultural thing enabled through the technology. With that foundation, then you have the ability to start to infuse, where I think the real power starts to kick in here. So I mean, all that's kind of making your data ready for AI, right? Then you start to infuse machine learning everywhere. And that's when you start to build these models that are self-learning, that start to automate the ability to get to these insights, and to the data. And uncover what has previously been unattainable, right? And that's where the whole thing starts to become automated and more real time and more intelligent. And that's where those models then allow you to do things you couldn't do before. With the data they're saying they're not getting access to. And then of course, once you get the models, just because you have good models doesn't mean that they've been operationalized, that they've been embedded in applications, embedded in business process.
That you have trust and transparency and explainability of what it's telling you. And that's the top tier of the ladder, it's really about embedding it into your business process in a way that you trust it. So, we have a systematic set of approaches to that, best practices. And of course we have the portfolio that would help you step up that ladder. >> So the fat middle of this bell curve, sort of this maturity curve, is kind of the organize and analyze phase, that's probably where most people are today. And what's the big challenge of getting up that ladder, is it the algorithms, what is it? >> Well I think it clearly, with most movements like this, starts with culture and skills, right? And the ability to just change the game within an organization. But putting that aside, I think what's really needed here is an information architecture that's based in the agility of a cloud native platform, that gives you the productivity, and truly allows you to leverage your data, wherever it resides. So whether it's in the private cloud, the public cloud, on premise, dedicated, no matter where it sits, you want to be able to tap into all that data. 'Cause remember, the challenge with data is it's always changing. I don't mean the sources, but the actual data. So you need an architecture that can handle all that. Once you stabilize that, then you can start to apply better analytics to it. And so yeah, I think you're right. That is sort of the bell curve here. And with that foundation, that's when the power of infusing machine learning and deep learning and neural networks, I mean those kinds of AI technologies and models into it all, just takes it to a whole new level. But you can't do those models until you have those bottom tiers under control. >> Right, setting that foundation. Building that framework. >> Exactly. >> And then applying.
>> What developers of AI applications, particularly those that have been successful, have told us pretty clearly, is that building the actual algorithms is not necessarily the hard part. The hard part is making all the data ready for that. And in fact I was reading a survey the other day of actual data scientists and AI developers and 60% of them said the thing they hate the most is all the data collection, data prep. 'Cause it's so hard. And so, a big part of our strategy is just to simplify that. Make it simple and accessible so that you can really focus on what you want to do and where the value is, which is building the algorithms and the models, and getting those deployed. >> Big challenge and hugely important, I mean IBM is a 100 year old company that's going through its own digital transformation. You know, we've had Inderpal Bhandari on talking about how to essentially put data at the core of the company, it's a real hard problem for a lot of companies who were not born, you know, five or seven years ago. And so, putting data at that core and putting human expertise around it as opposed to maybe, having whatever as the core. Humans or the plant or the manufacturing facility, that's a big change for a lot of organizations. Now at the end of the day IBM, and IBM sells strategy but the analytics group, you're in the software business so, what offerings do you have, to help people get there? >> Well in the collect step, it's essentially our hybrid data management portfolio. So think DB2, DB2 Warehouse, DB2 Event Store, which is about IoT data. So there's a set of, and that's where big data and Hadoop and all that with Hortonworks, that's where that all fits in. So building the ability to access all this data, virtualize it, do things like Queryplex, things of that nature, is where that all sits. >> Queryplex being the data virtualization capability. >> Yeah. >> Get to the data no matter where it is.
You define a query and don't worry about where it resides, we'll figure that out for you, kind of thought, right? In the organize, that is InfoSphere, so that's basically our unified governance and integration part of our portfolio. So again, that is collecting all this, taking the collected data and organizing it, and making sure you're compliant with whatever policies. And making it, you know, business ready, right? And so InfoSphere's where you should look to understand that portfolio better. When you get into scale and analytics on demand, that's Cognos Analytics, it is our planning analytics portfolio. And that's essentially our business analytics part of all this. And some data science tools like SPSS, if we're doing statistical analysis, and SPSS Modeler, if we're doing statistical modeling, things of that nature, right? When you get into the automate and the ML everywhere, that's Watson Studio which is the integrated development environment, right? Not just for IBM Watson, but all, has a huge array of open technologies in it like TensorFlow and Python, and all those kind of things. So that's the development environment, and Watson Machine Learning is the runtime that will allow you to run those models anywhere. So those are the two big pieces of that. And then from there you'll see IBM building out more and more of what we already have. But we have Watson applications. Like Watson Assistant, Watson Discovery. We have a huge portfolio of Watson APIs for everything from tone to speech, things of that nature. And then the ability to infuse that all into the business processes. Sort of where you're going to see IBM heading in the future here. >> I love how you brought that home, and we talked about the ladder and it's more than just a PowerPoint slide. It actually is fundamental to your strategy, it maps with your offerings. So you can get the heads nodding, with the customers.
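The data-virtualization idea described above — you define a query and the platform works out where the data actually resides — can be sketched in a few lines. This is a toy federation layer for illustration only; the source names, row shapes, and the `federated_query` helper are invented here, not Queryplex's or any IBM product's API.

```python
def federated_query(sources, predicate):
    """Run the same predicate against every backing store and merge results."""
    results = []
    for name, rows in sources.items():
        for row in rows:
            if predicate(row):
                # Tag each row with its origin so lineage survives the merge.
                results.append({**row, "_source": name})
    return results

# Two pretend data stores that would normally live in different systems.
sources = {
    "warehouse": [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}],
    "iot_store": [{"id": 3, "region": "EU"}],
}

# One query, no knowledge of where the rows physically sit.
eu_rows = federated_query(sources, lambda r: r["region"] == "EU")
```

The `_source` tag is one simple way to keep lineage visible once rows from several systems are unioned, so an insight can still be traced back to the system it came from.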
Where are you on this maturity curve, here's how we can help with products and services. And then the other thing I'll mention, you know, we kind of learned when we spoke to some others this week, and we saw some of your announcements previously, the Red Hat component which allows you to bring that cloud experience no matter where you are, and you've got technologies to do that, obviously, you know, Red Hat, you guys have been sort of birds of a feather in open source. Because, your data is going to live wherever it lives, whether it's on-prem, whether it's in the cloud, whether it's at the edge, and you want to bring sort of a common model. Whether it's containers, Kubernetes, being able to bring that cloud experience to the data, your thoughts on that? >> And this is where the big deal comes in, is for each one of those tiers, so, the DB2 family, InfoSphere, business analytics, Cognos and all that, and Watson Studio, you can get started, purchase those technologies and start to use them, right, as individual products or software as a service. What we're also doing is, and this is the more important step into the future, is we're building all those capabilities into one integrated unified cloud platform. That's called IBM Cloud Private for Data. Think of that as a unified, collaborative team environment for AI and data science. Completely built on a cloud native architecture of containers and microservices. That will support a multi cloud environment. So, IBM Cloud, other clouds, you mention Red Hat with OpenShift, so, over time by adopting IBM Cloud Private for Data, you'll get those steps of the ladder all integrated to one unified environment. So you have the ability to buy the unified environment, get involved in that, and it's all integrated, no assembly required kind of thought. Or, you could assemble it by buying the individual components, or some combination of both.
So a big part of the strategy is, a great deal of flexibility on how you acquire these capabilities and deploy them in your enterprise. There's no one size fits all. We give you a lot of flexibility to do that. >> And that's a true hybrid vision, I don't have to have just IBM and IBM Cloud, you're recognizing other clouds out there, you're not exclusive like some companies, but that's really important. >> It's a multi cloud strategy, it really is, it's a multi cloud strategy. And that's exactly what we need, we recognize that most businesses, there are very few that have standardized on only one cloud provider, right? Most of them have multiple clouds, and then it breaks up into dedicated, private, public. And so our strategy is to enable this capability, think of it as a cloud data platform for AI, across all these clouds, regardless of what you have. >> All right, Scott, thanks for taking us through the strategies. I've always loved talking to you 'cause you're a clear thinker, and you explain things really well in simple terms, a lot of complexity here but, it is really important as the next wave sets up. So thanks very much for your time. >> Great, always great to be here, thank you. >> All right, good to see you. All right, thanks for watching everybody. We are now going to bring it back to CubeNYC so, thanks for watching and we will see you in the afternoon. We've got the panel, the influencer panel, that I'll be running with Peter Burris and John Furrier. So, keep it right there, we'll be right back. (upbeat music)

Published Date : Sep 13 2018



Caryn Woodruff, IBM & Ritesh Arora, HCL Technologies | IBM CDO Summit Spring 2018


 

>> Announcer: Live from downtown San Francisco, it's the Cube, covering IBM Chief Data Officer Strategy Summit 2018. Brought to you by IBM. >> Welcome back to San Francisco everybody. We're at the Parc 55 in Union Square and this is the Cube, the leader in live tech coverage, and we're covering exclusive coverage of the IBM CDO strategy summit. IBM has these things, they book in on both coasts, one in San Francisco one in Boston, spring and fall. Great event, intimate event. 130, 150 chief data officers, learning, transferring knowledge, sharing ideas. Caryn Woodruff is here as the principal data scientist at IBM and she's joined by Ritesh Arora, who is the director of digital analytics at HCL Technologies. Folks welcome to the Cube, thanks for coming on. >> Thank you >> Thanks for having us. >> You're welcome. So we're going to talk about data management, data engineering, we're going to talk about digital, as I said Ritesh because digital is in your title. It's a hot topic today. But Caryn let's start off with you. Principal Data Scientist, so you're the one that is in short supply. So a lot of demand, you're getting pulled in a lot of different directions. But talk about your role and how you manage all those demands on your time. >> Well, you know a lot of, a lot of our work is driven by business needs, so it's really understanding what is critical to the business, what's going to support our business's strategy and you know, picking the projects that we work on based on those items.
So it's you really do have to cultivate the things that you spend your time on and make sure you're spending your time on the things that matter and as Ritesh and I were talking about earlier, you know, a lot of that means building good relationships with the people who manage the systems and the people who manage the data so that you can get access to what you need to get the critical insights that the business needs, >> So Ritesh, data management I mean this means a lot of things to a lot of people. It's evolved over the years. Help us frame what data management is in this day and age. >> Sure, so there are two aspects of data in my opinion. One is the data management, another the data engineering, right? And over the period as the data has grown significantly. Whether it's unstructured data, whether it's structured data, or the transactional data. We need to have some kind of governance in the policies to secure data to make data as an asset for a company so the business can rely on your data. What you are delivering to them. Now, the another part comes is the data engineering. Data engineering is more about an IT function, which is data acquisition, data preparation and delivering the data to the end-user, right? It can be business, it can be third-party but it all comes under the governance, under the policies, which are designed to secure the data, how the data should be accessed to different parts of the company or the external parties. >> And how those two worlds come together? The business piece and the IT piece, is that where you come in? >> That is where data science definitely comes into the picture. So if you go online, you can find Venn diagrams that describe data science as a combination of computer science math and statistics and business acumen. And so where it comes in the middle is data science. So it's really being able to put those things together. 
But, you know, what's so critical is, you know, Inderpal actually shared at the beginning here, and I think a few years ago here, talked about the five pillars to building a data strategy. And, you know, one of those things is use cases, like getting out, picking a need, solving it and then going from there and along the way you realize what systems are critical, what data you need, who the business users are. You know, what would it take to scale that? So these, like, proof-point projects that, you know, eventually turn into these bigger things, and for them to turn into bigger things you've got to have that partnership. You've got to know where your trusted data is, you've got to know that, how it got there, who can touch it, how frequently it is updated. Just being able to really understand that and work with partners that manage the infrastructure so that you can leverage it and make it available to other people and transparent. >> I remember when I first interviewed Hilary Mason way back when and I was asking her about that Venn diagram and she threw in another one, which was data hacking. >> Caryn: Uh-huh, yeah. >> Well, talk about that. You've got to be curious about data. You need to, you know, take a bath in data. >> (laughs) Yes, yes. I mean yeah, you really.. Sometimes you have to be a detective and you have to really want to know more. And, I mean, understanding the data is like the majority of the battle. >> So Ritesh, we were talking off-camera about it's not how titles change, things evolve, data, digital. They're kind of interchangeable these days. I mean we always say the difference between a business and a digital business is how they have used data. And so digital being part of your role, everybody's trying to get digital transformation, right? As an SI, you guys are at the heart of it. Certainly, IBM as well. What kinds of questions are our clients asking you about digital?
>> So I ultimately see data, whatever we derive from data, it is used by the business side. So we are trying to always solve a business problem, which is to optimize the issues the company is facing, or try to generate more revenues, right? Now, the digital as well as the data has been married together, right? Earlier, you can say, we were trying to analyze the data to get more insights into what is happening in the company. And then we came up with predictive modeling: based on the data that we statistically collect, how can we predict different scenarios, right? Now digital, over the period of the last 10-20 years, as the data has grown, different sources of data have come into the picture, we are talking about social media and so on, right? And nobody is looking for just reports out of Excel, right? It is more about how you are presenting the data to the senior management, to the entire world, and how easily they can understand it. That's where the data digitization, as well as the application digitization, comes into the picture. So the tools have developed over the period to have a better visualization, better understanding. How can we integrate annotation within the data? So these are all different aspects of digitization on the data and we try to integrate the digital concepts within our data and analytics, right? So I used to be more, I mean, I grew up as a data engineer, analytics engineer, but now I'm looking more beyond just the data or the data preparation. It's more about presenting the data to the end-user and the business. How easy it is for them to understand it. >> Okay I got to ask you, so you guys are data wonks. I am too, kind of, but I'm not as skilled as you are, but, and I say that with all due respect. I mean you love data. >> Caryn: Yes. >> As data science becomes a more critical skill within organizations, we always talk about the amount of data, data growth, the stats are mind-boggling.
But as a data scientist, do you feel like you have access to the right data and how much of a challenge is that with clients? >> So we do have access to the data but the challenge is, the company has so many systems, right? It's not just one or two applications. There are companies we have 50 or 60 or even hundreds of application built over last 20 years. And there are some applications, which are basically duplicate, which replicates the data. Now, the challenge is to integrate the data from different systems because they maintain different metadata. They have the quality of data is a concern. And sometimes with the international companies, the rules, for example, might be in US or India or China, the data acquisitions are different, right? And you are, as you become more global, you try to integrate the data beyond boundaries, which becomes a more compliance issue sometimes, also, beyond the technical issues of data integration. >> Any thoughts on that? >> Yeah, I think, you know one of the other issues too, you have, as you've heard of shadow IT, where people have, like, servers squirreled away under their desks. There's your shadow data, where people have spreadsheets and databases that, you know, they're storing on, like a small server or that they share within their department. And so you know, you were discussing, we were talking earlier about the different systems. And you might have a name in one system that's one way and a name in another system that's slightly different, and then a third system, where it's it's different and there's extra granularity to it or some extra twist. And so you really have to work with all of the people that own these processes and figure out what's the trusted source? What can we all agree on? So there's a lot of... It's funny, a lot of the data problems are people problems. 
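The name-mismatch problem described above — the same entity spelled slightly differently in each system — is usually attacked with a normalization pass before records can be matched into a single trusted source. A minimal sketch; the rules below (case-folding, stripping punctuation, collapsing whitespace) are illustrative only, and real master-data-management matching is far richer.

```python
def normalize_name(raw):
    """Collapse case, punctuation and whitespace so spelling variants match."""
    cleaned = "".join(ch for ch in raw.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

# The same organization as it might appear in three different systems.
records = ["ACME Corp.", "Acme  corp", "acme corp"]
canonical = {normalize_name(r) for r in records}
# All three variants collapse to a single trusted key, "acme corp".
```

Once every system's variant maps to the same canonical key, the "single trusted source" everyone agrees on becomes something you can actually compute against.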
So it's getting people to talk and getting people to agree on, well this is why I need it this way, and this is why I need it this way, and figuring out how you come to a common solution so you can even create those single trusted sources that then everybody can go to and everybody knows that they're working with the right thing and the same thing that they all agree on. >> The politics of it and, I mean, politics is kind of a pejorative word but let's say dissonance, where you have maybe a back-end system, a financial system, and the CFO, he or she is looking at the data saying oh, this is what the data says and then... I remember I was talking to a, recently, a chef in a restaurant who said that the CFO saw this but I know that's not the case, I don't have the data to prove it. So I'm going to go get the data. And so then as they collect that data they bring it together. So I guess in some ways you guys are mediators. >> [Caryn And Ritesh] Yes, yes. Absolutely. >> 'Cause the data doesn't lie you just got to understand it. >> You have to ask the right question. Yes. And yeah. >> And sometimes when you see the data, you start, that you don't even know what questions you want to ask until you see the data. Is that a challenge for your clients? >> Caryn: Yes, all the time. Yeah >> So okay, what else do we want to talk about? The state of collaboration, let's say, between the data scientists, the data engineer, the quality engineer, maybe even the application developers. Somebody, John Furrier often says, my co-host and business partner, data is the new development kit. Give me the data and I'll, you know, write some code and create an application. So how about collaboration amongst those roles, is that something... I know IBM's gone on about some products there but your point Caryn, it's a lot of times it's the people. >> It is. >> And the culture. What are you seeing in terms of evolution and maturity of that challenge?
>> You know I have a very good friend who likes to say that data science is a team sport and so, you know, these should not be, like, solo projects where just one person is wading up to their elbows in data. This should be something where you've got engineers and scientists and business, people coming together to really work through it as a team because everybody brings really different strengths to the table and it takes a lot of smart brains to figure out some of these really complicated things. >> I completely agree. Because we see the challenges, we always are trying to solve a business problem. It's important to marry IT as well as the business side. We have the technical expert but we don't have domain experts, subject matter experts who knows the business in IT, right? So it's very very important to collaborate closely with the business, right? And data scientist a intermediate layer between the IT as well as business I will say, right? Because a data scientist as they, over the years, as they try to analyze the information, they understand business better, right? And they need to collaborate with IT to either improve the quality, right? That kind of challenges they are facing and I need you to, the data engineer has to work very hard to make sure the data delivered to the data scientist or the business is accurate as much as possible because wrong data will lead to wrong predictions, right? And ultimately we need to make sure that we integrate the data in the right way. >> What's a different cultural dynamic that was, say ten years ago, where you'd go to a statistician, she'd fire up the SPSS.. >> Caryn: We still use that. >> I'm sure you still do but run some kind of squares give me some, you know, probabilities and you know maybe run some Monte Carlo simulation. But one person kind of doing all that it's your point, Caryn. >> Well you know, it's it's interesting. 
There are there are some students I mentor at a local university and you know we've been talking about the projects that they get and that you know, more often than not they get a nice clean dataset to go practice learning their modeling on, you know? And they don't have to get in there and clean it all up and normalize the fields and look for some crazy skew or no values or, you know, where you've just got so much noise that needs to be reduced into something more manageable. And so it's, you know, you made the point earlier about understanding the data. It's just, it really is important to be very curious and ask those tough questions and understand what you're dealing with. Before you really start jumping in and building a bunch of models. >> Let me add another point. That the way we have changed over the last ten years, especially from the technical point of view. Ten years back nobody talks about the real-time data analysis. There was no streaming application as such. Now nobody talks about the batch analysis, right? Everybody wants data on real-time basis. But not if not real-time might be near real-time basis. That has become a challenge. And it's not just that prediction, which are happening in their ERP environment or on the cloud, they want the real-time integration with the social media for the marketing and the sales and how they can immediately do the campaign, right? So, for example, if I go to Google and I search for for any product, right, for example, a pressure cooker, right? And I go to Facebook, immediately I see the ad within two minutes. >> Yeah, they're retargeting. >> So that's a real-time analytics is happening under different application, including the third-party data, which is coming from social media. So that has become a good source of data but it has become a challenge for the data analyst and the data scientist. How quickly we can turn around is called data analysis. 
Because it used to be you would get ads for a pressure cooker for months, even after you bought the pressure cooker and now it's only a few days, right? >> Ritesh: It's a minute. You close this application, you log into Facebook... >> Oh, no doubt. >> Ritesh: An ad is there. >> Caryn: There it is. >> Ritesh: Because everything is linked either your phone number or email ID you're done. >> It's interesting. We talked about disruption a lot. I wonder if that whole model is going to get disrupted in a new way because everybody started using the same ad. >> So that's a big change over the last 10 years. >> Do you think..oh go ahead. >> oh no, I was just going to say, you know, another thing is just there's so much that is available to everybody now, you know. There's not this small little set of tools that's restricted to people that are in these very specific jobs. But with open source and with so many software-as-a-service products that are out there, anybody can go out and get an account and just start, you know, practicing or playing or joining a Kaggle competition or, you know, start getting their hands on.. There's data sets that are out there that you can just download to practice and learn on and use. So, you know, it's much more open, I think, than it used to be. >> Yeah, community editions of software, open data. The number of open data sources just keeps growing. Do you think that machine intelligence can, or how can machine intelligence help with this data quality challenge? >> I think that it's always going to require people, you know? There's always going to be a need for people to train the machines on how to interpret the data. How to classify it, how to tag it. There's actually a really good article in Popular Science this month about a woman who was training a machine on fake news and, you know, it did a really nice job of finding some of the same claims that she did. But she found a few more.
So, you know, I think it's, on one hand we have machines that we can augment with data and they can help us make better decisions or sift through large volumes of data but then when we're teaching the machines to classify the data or to help us with metadata classification, for example, or, you know, to help us clean it. I think that it's going to be a while before we get to the point where that's the inverse. >> Right, so in that example you gave, the human actually did a better job than the machine. Now, it's amazing to me how... What, what machines couldn't do that humans could, you know last year and all of a sudden, you know, they can. It wasn't long ago that robots couldn't climb stairs. >> And now they can. >> And now they can. >> It's really creepy. >> I think the difference now is, earlier you know, you knew that there is an issue in the data. But you don't know that how much data is corrupt or wrong, right? Now, there are tools available and they're very sophisticated tools. They can pinpoint and provide you the percentage of accuracy, right? On different categories of data that you come across, right? Even forget about the structured data. Even when you talk about unstructured data, the data which comes from social media or the comments and the remarks that you log or are logged by the customer service representative, there are very sophisticated text analytics tools available, which can talk very accurately about the data as well as the personality of the person who's giving that information. >> Tough problems but it seems like we're making progress. All you got to do is look at fraud detection as an example. Folks, thanks very much.. >> Thank you. >> Thank you very much. >> ...for sharing your insight. You're very welcome. Alright, keep it right there everybody. We're live from the IBM CDO conference in San Francisco. Be right back, you're watching the Cube. (electronic music)

Published Date : May 2 2018



Piotr Mierzejewski, IBM | Dataworks Summit EU 2018


 

>> Announcer: From Berlin, Germany, it's theCUBE covering Dataworks Summit Europe 2018 brought to you by Hortonworks. (upbeat music) >> Well hello, I'm James Kobielus and welcome to theCUBE. We are here at Dataworks Summit 2018, in Berlin, Germany. It's a great event, Hortonworks is the host, they made some great announcements. They've had partners doing the keynotes and the sessions, breakouts, and IBM is one of their big partners. Speaking of IBM, from IBM we have a program manager, Piotr, I'll get this right, Piotr Mierzejewski, your focus is on data science, machine learning, and Data Science Experience, which is one of the IBM products for working data scientists to build and to train models in team data science enterprise operational environments, so Piotr, welcome to theCUBE. I don't think we've had you before. >> Thank you. >> You're a program manager. I'd like you to discuss what you do for IBM, I'd like you to discuss Data Science Experience. I know that Hortonworks is a reseller of Data Science Experience, so I'd like you to discuss the partnership going forward and how you and Hortonworks are serving your customers, data scientists and others in those teams who are building and training and deploying machine learning and deep learning, AI, into operational applications. So Piotr, I give it to you now. >> Thank you. Thank you for inviting me here, very excited. This is a very loaded question, and I would like to begin, before I get actually to why the partnership makes sense, I would like to begin with two things. First, there is no machine learning without data. And second, machine learning is not easy. Especially, especially-- >> James: I never said it was! (Piotr laughs) >> Well there is this kind of perception, like you can have a data scientist working on their Mac, working on some machine learning algorithms, and they can create a recommendation engine in, let's say, two, three days' time. This is because of the explosion of open-source in that space.
You have thousands of libraries, from Python, from R, from Scala, you have access to Spark. All these various open-source offerings are enabling data scientists to actually do this wonderful work. However, when you start talking about bringing machine learning to the enterprise, this is not an easy thing to do. You have to think about governance, resiliency, the data access, actual model deployments, which are not trivial, when you have to expose this in a uniform fashion to various business units. Now all this has to actually work in a private cloud, public cloud environment, on a variety of hardware, a variety of different operating systems. Now that is not trivial. (laughs) Now when you deploy a model, as the data scientist is going to deploy the model, he needs to be able to actually explain how the model was created. He has to be able to explain what data was used. He needs to ensure-- >> Explicable AI, or explicable machine learning, yeah, that's a hot focus of concern for enterprises everywhere, especially in a world where governance and tracking and lineage, GDPR and so forth, are so hot. >> Yes, you've mentioned all the right things. Now, so given those two things, there's no ML without data, and ML is not easy, why does the partnership between Hortonworks and IBM make sense? Well, you're looking at the number one industry-leading big data platform from Hortonworks.
Then, you look at DSX Local, which, I'm proud to say, I've been there since the first line of code, and I'm feeling very passionate about the product, is the merger between the two. The ability to integrate them tightly together gives your data scientists secure access to data, ability to leverage the Spark that runs inside a Hortonworks cluster, ability to actually work in a platform like DSX that doesn't limit you to just one kind of technology but allows you to work with multiple technologies, ability to actually work on not only-- >> When you say technologies here, you're referring to frameworks like TensorFlow, and-- >> Precisely. Very good, now that part I'm going to get into very shortly, (laughs) so please don't steal my thunder. >> James: Okay. >> Now, what I was saying is that not only are DSX and Hortonworks integrated to the point that you can actually manage your Hadoop clusters, Hadoop environments within DSX, you can actually work on your Python models and your analytics within DSX and then push them remotely to be executed where your data is. Now, why is this important? If you work with data that's megabytes, gigabytes, maybe you can pull it in, but truly, when you move to the terabytes and the petabytes of data, what happens is that you actually have to push the analytics to where your data resides, and leverage, for example, YARN, a resource manager, to distribute your workloads and actually train your models on your actual HDP cluster. That's one of the huge value propositions. Now, mind you, this is all done in a secure fashion, with the ability to actually install DSX on the edge nodes of the HDP clusters. >> James: Hmm... >> As of HDP 2.6.4, DSX has been certified to actually work with HDP. Now, this partnership embarked, we embarked on this partnership about 10 months ago. Now, it often happens that there are announcements, but not much materializes after such announcements.
This is not true in the case of DSX and HDP. Just recently we have had a release of DSX 1.2, which I'm super excited about. Now, let's talk about those open-source toolings in the various platforms. Now, you don't want to force your data scientists to actually work with just one environment. Some of them might prefer to work on Spark, some of them like their RStudio, they're statisticians, they like R, others like Python, with Zeppelin, say Jupyter Notebooks. Now, how about TensorFlow? What are you going to do when actually, you know, you have to do the deep learning workloads, when you want to use neural nets? Well, DSX does support the ability to actually bring in GPU nodes and do the TensorFlow training. As a sidecar approach, you can append the node, you can scale the platform horizontally and vertically, train your deep learning workloads, and actually remove the sidecar afterwards. So you can add it to the cluster and remove it at will. Now, DSX not only satisfies the needs of your programmer data scientists, that actually code in Python and Scala or R, but actually allows your business analysts to work and create models in a visual fashion. As of DSX 1.2, we have embedded, integrated, the SPSS Modeler, redesigned, rebranded. This is an amazing technology from IBM that's been around for a while, very well established, but now with the new interface, embedded inside the DSX platform, it allows your business analysts to actually train and create the model in a visual fashion and, what is beautiful-- >> Business analysts, not traditional data scientists. >> Not traditional data scientists. >> That sounds equivalent to how IBM, a few years back, was able to bring more of a visual experience to SPSS proper to enable the business analysts of the world to build and do data-mining and so forth with structured data. Go ahead, I don't want to steal your thunder here. >> No, no, precisely.
(laughs) >> But I see it's the same phenomenon, you bring the same capability to greatly expand the range of data professionals who can do, in this case, do machine learning hopefully as well as professional, dedicated data scientists. >> Certainly, now what we have to also understand is that data science is actually a team sport. It involves various stakeholders from the organization. From executive, that actually gives you the business use case to your data engineers that actually understand where your data is and can grant the access-- >> James: They manage the Hadoop clusters, many of them, yeah. >> Precisely. So they manage the Hadoop clusters, they actually manage your relational databases, because we have to realize that not all the data is in the datalinks yet, you have legacy systems, which DSX allows you to actually connect to and integrate to get data from. It also allows you to actually consume data from streaming sources, so if you actually have a Kafka message cob and actually were streaming data from your applications or IoT devices, you can actually integrate all those various data sources and federate them within the DSX to use for machine training models. Now, this is all around predictive analytics. But what if I tell you that right now with the DSX you can actually do prescriptive analytics as well? With the 1.2, again I'm going to be coming back to this 1.2 DSX with the most recent release we have actually added decision optimization, an industry-leading solution from IBM-- >> Prescriptive analytics, gotcha-- >> Yes, for prescriptive analysis. 
So now if you have warehouses, or you have a fleet of trucks, or you want to optimize the flow in let's say, a utility company, whether it be for power or could it be for, let's say for water, you can actually create and train prescriptive models within DSX and deploy them the same fashion as you will deploy and manage your SPSS streams as well as the machine learning models from Spark, from Python, so with XGBoost, Tensorflow, Keras, all those various aspects. >> James: Mmmhmm. >> Now what's going to get really exciting in the next two months, DSX will actually bring in natural learning language processing and text analysis and sentiment analysis by Vio X. So Watson Explorer, it's another offering from IBM... >> James: It's called, what is the name of it? >> Watson Explorer. >> Oh Watson Explorer, yes. >> Watson Explorer, yes. >> So now you're going to have this collaborative message platform, extendable! Extendable collaborative platform that can actually install and run in your data centers without the need to access internet. That's actually critical. Yes, we can deploy an IWS. Yes we can deploy an Azure. On Google Cloud, definitely we can deploy in Softlayer and we're very good at that, however in the majority of cases we find that the customers have challenges for bringing the data out to the cloud environments. Hence, with DSX, we designed it to actually deploy and run and scale everywhere. Now, how we have done it, we've embraced open source. This was a huge shift within IBM to realize that yes we do have 350,000 employees, yes we could develop container technologies, but why? Why not embrace what is actually industry standards with the Docker and equivalent as they became industry standards? 
Bring in RStudio, the Jupyter, the Zeppelin Notebooks, bring in the ability for a data scientist to choose the environments they want to work with and actually extend them and make the deployments of web services, applications, the models, and those are actually full releases, I'm not only talking about the model, I'm talking about the scripts that can go with that ability to actually pull the data in and allow the models to be re-trained, evaluated and actually re-deployed without taking them down. Now that's what actually becomes, that's what is the true differentiator when it comes to DSX, and all done in either your public or private cloud environments. >> So that's coming in the next version of DSX? >> Outside of DSX-- >> James: We're almost out of time, so-- >> Oh, I'm so sorry! >> No, no, no. It's my job as the host to let you know that. >> Of course. (laughs) >> So if you could summarize where DSX is going in 30 seconds or less as a product, the next version is, what is it? >> It's going to be the 1.2.1. >> James: Okay. >> 1.2.1 and we're expecting to release at the end of June. What's going to be unique in the 1.2.1 is infusing the text and sentiment analysis, so natural language processing with predictive and prescriptive analysis for both developers and your business analysts. >> James: Yes. >> So essentially a platform not only for your data scientist but pretty much every single persona inside the organization >> Including your marketing professionals who are baking sentiment analysis into what they do. Thank you very much. This has been Piotr Mierzejewski of IBM. He's a Program Manager for DSX and for ML, AI, and data science solutions and of course a strong partnership is with Hortonworks. We're here at Dataworks Summit in Berlin. We've had two excellent days of conversations with industry experts including Piotr. We want to thank everyone, we want to thank the host of this event, Hortonworks for having us here. 
We want to thank all of our guests, all these experts, for sharing their time out of their busy schedules. We want to thank everybody at this event for all the fascinating conversations, the breakouts have been great, the whole buzz here is exciting. GDPR's coming down and everybody's gearing up and getting ready for that, but everybody's also focused on innovative and disruptive uses of AI and machine learning and business, and using tools like DSX. I'm James Kobielus for the entire CUBE team, SiliconANGLE Media, wishing you all, wherever you are, whenever you watch this, have a good day and thank you for watching theCUBE. (upbeat music)
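The deployment story Piotr describes above, where a model behind a web service is retrained, evaluated, and redeployed without taking the service down, rests on a pattern that is easy to show in miniature: requests score against a reference that can be swapped atomically, and a candidate is promoted only if it evaluates at least as well as the live model. The registry and model classes below are an illustrative sketch, not DSX APIs:

```python
import threading

class Model:
    """Toy stand-in for a trained model: thresholds a single score."""
    def __init__(self, version, threshold):
        self.version = version
        self.threshold = threshold

    def predict(self, x):
        return 1 if x >= self.threshold else 0

def accuracy(model, labeled_points):
    """Fraction of (score, label) pairs the model gets right."""
    hits = sum(model.predict(x) == y for x, y in labeled_points)
    return hits / len(labeled_points)

class ModelRegistry:
    """Serves the current model; swap() redeploys without downtime."""
    def __init__(self, model):
        self._lock = threading.Lock()
        self._model = model

    def predict(self, x):
        with self._lock:            # readers always see a complete model
            model = self._model
        return model.predict(x)

    def swap(self, candidate, holdout):
        """Promote the candidate only if it scores the holdout set at
        least as well as the live model; in-flight requests keep being
        served against the old version until the swap completes."""
        if accuracy(candidate, holdout) >= accuracy(self._model, holdout):
            with self._lock:
                self._model = candidate

holdout = [(0.2, 0), (0.6, 0), (0.9, 1)]          # (score, true label)
registry = ModelRegistry(Model(version=1, threshold=0.5))

# "Retrain" and redeploy while the service keeps answering requests.
registry.swap(Model(version=2, threshold=0.8), holdout)
```

On a real platform the holdout evaluation, promotion rule, and rollback history would be tracked alongside the model, but the atomic-swap core of zero-downtime redeployment is the same.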

Published Date : Apr 19 2018


Daniel Hernandez, IBM | IBM Think 2018


 

>> Narrator: Live from Las Vegas, it's theCUBE covering IBM Think 2018. Brought to you by IBM. >> We're back at Mandalay Bay in Las Vegas. This is IBM Think 2018. This is day three of theCUBE's wall-to-wall coverage. My name is Dave Vellante, I'm here with Peter Burris. You're watching theCUBE, the leader in live tech coverage. Daniel Hernandez is here. He's the Vice President of IBM Analytics, a CUBE alum. It's great to see you again, Daniel. >> Thanks. >> Dave: Thanks for coming back on. >> Happy to be here. >> Big tech show, consolidating a bunch of shows. You guys kind of used to have your own sort of analytics show, but now you've got all the clients here. How do you like it? Compare and contrast. >> IBM Analytics loves to share, so having all our clients in one place, I actually like it. We're going to work out some of the kinks a little bit, but I think one show where you can have a conversation around artificial intelligence, data, analytics, power systems, is beneficial to all of us, actually. >> Well in many respects, the whole industry is munging together. Folks focus more on workloads as opposed to technology or even roles. So having an event like this where folks can talk about what they're trying to do, the workloads they're trying to create, the role that analytics, AI, et cetera is going to play in informing those workloads. Not a bad place to get that crosspollination. What do you think? >> Totally. You talk to a client, there are so many problems. Problems are a combination of stuff that we have to offer in analytics and stuff that our friends in Hybrid Integration have to offer. So for me, logistically, I could say oh, Mike Gilfix, business process automation. Go talk to him. And he's here. That's happened probably at least a dozen times so far in not even two days. >> Alright so I got to ask, your tagline. Making data ready for AI. What does that mean? >> We get excited about amazing tech. Artificial intelligence is amazing technology.
I remember when Watson beat Jeopardy. Just being inspired by all the things that I thought it could do to solve problems that matter to me. And if you look over the last many years, virtual assistants, image recognition systems that solve pretty big problems like catching bad guys are inspirational pieces of work that were inspired a lot by what we did then. And in business, it's triggered a wave of "artificial intelligence can help me solve business-critical issues." And I will tell you that many clients simply aren't ready to get started. And because they're not ready, they're going to fail. And so our attitude about things is, through IBM Analytics, we're going to deliver the critical capabilities you need to be ready for AI. And if you don't have that, 100% of your projects will fail. >> But how do you get the business ready to think about data differently? You can do a lot to say the technology you need to do this looks different, but you also need to get the organization to acculturate, appreciate that their business is going to run differently as a consequence of data and what you do with it. How do you get the business to start making adjustments? >> I think you just said the magic word, the business. Which is to say, at least in all the conversations I have with my customers, they can't even tell that I'm from analytics, because I'm asking them about the problems. What are you trying to do? How would you measure success? What are the critical issues that you're trying to solve? Are you trying to make money, save money, those kinds of things. And by focusing on that, we can then advise them, based on that, how we can help. So the data culture that you're describing, I think it's a fact: you become data-aware and understand the power of it by doing. You do that by starting with the problems, developing successes, and then iterating. >> An approach to solving problems. >> Yeah. >> So that's kind of a step zero to getting data ready for AI. >> Right.
But in no conversation that leads to success does it ever start with "we're going to do AI or machine learning"; what problem are we going to solve? It's always the other way around. And when we do that, our technology then is easily explainable. It's like okay, you want to build a system for better customer interactions in your call center. Well, what does that mean? You need data about how they have interacted with you, products they have interacted with, you might want predictions that anticipate what their needs are before they tell you. And so we can systematically address them through the capabilities we've got. >> Dave, if I could amplify one thing. It makes the technology easier when you put it in these contexts. I think that's a really crucial, important point. >> It's super simple. All of us have had to have it, if we're in technology. Going the other way around, my stuff is cool, here's why it's cool, what problems can you solve? Not helpful for most of our clients. >> I wonder if you could comment on this Daniel. I feel like the last ten years were about cloud, mobile, social, big data. We seem to be entering an era now of sense, speak, act, optimize, see, learn. This sort of pervasive AI, if you will. Is that a reasonable notion, that we're entering that era, and what do you see clients doing to take advantage of that? What's their mindset like when you talk to them? >> I think the evidence is there. You just got to look around the show and see what's possible, technically. The Watson team has been doing quite a bit of stuff around speech, around image. It's fascinating tech, stuff that feels magical to me. And I know how this stuff works and it still feels kind of fascinating. Now the question is how do you apply that to solve problems.
I think it's only a matter of time before most companies are implementing artificial intelligence systems in business-critical and core parts of their processes, and they're going to get there by starting, by doing what they're already doing now with us, and that is: what problem am I solving? What data do I need to get that done? How do I control and organize that information so I can exploit it? How can I exploit machine learning and deep learning and all these other technologies to then solve that problem? How do I measure success? How do I track that? And just systematically running these experiments. I think that crescendos to a critical mass. >> Let me ask you a question. Because you're a technologist and you said it's amazing, it's like magic even to you. Imagine non-technologists, what it's like to me. There's a black box component of AI, and maybe that's okay. I'm just wondering if that's, is that a headwind, are clients comfortable with that? If you have to describe how you really know it's a cat. I mean, I know a cat when I see it. And the machine can tell me it's a cat, or not a hot dog, Silicon Valley reference. (Peter laughs) But to tell me actually how it works, to figure that out there's a black box component. Does that scare people? Or are they okay with that? >> You've probably given me too much credit. So I really can't explain how all that just works, but what I can tell you is how, certainly, I mean, let's take regulated industries like banks and insurance companies that are building machine learning models throughout their enterprise. They've got to explain to a regulator that they are offering considerations around anti-discrimination; basically that they're not buying systems that cause them to do things that are against the law, effectively. So what are they doing?
Well, they're using tools like ones from IBM to build these models, to track the process of creating these models, which includes what data they used, how that training was done, prove that the inputs and outputs are not discriminatory, and actually go through their own internal general counsel and regulators to get it done. So whether you can explain the model in this particular case doesn't matter. What they're trying to prove is that the effect is not violating the law, which the tool sets and the process around those tool sets allow you to get done today. >> Well, let me build on that because one of the ways that it does work is that, as Ginni said yesterday, Ginni Rometty said yesterday that there's always going to be a machine-human component to it. And so the way it typically works is a machine says I think this is a cat and a human validates it or not. The machine still doesn't really know if it's a cat, but coming back to this point, one of the key things that we see anyway, and one of the advantages that IBM likely has, is today the folks running operational systems, the core of the business, trust their data sources. >> Do they? >> They trust their Db2 database, they trust their Oracle database, they trust the data that's in the applications. >> Dave: So it's the data that's in their data lake? >> I'm not saying they do but that's the key question. At what point in time, and I think the real important part of your question is, at what point in time do the hardcore people allow AI to provide a critical input that's going to significantly or potentially dramatically change the behavior of the core operational systems? That seems a really crucial point. What kind of feedback do you get from customers as you talk about turning AI from something that has an insight every now and then to becoming, effectively, an element of or essential to the operation of the business?
>> One of the critical issues in getting machine learning models especially integrated in business-critical processes and workflows is getting those models running where that work is done. So if you look, I mean, when I was here last time, we were focused on portfolio simplification and bringing machine learning where the data was. We brought machine learning to private cloud, we brought it onto Hadoop, we brought it on mainframe. I think it is a critical, necessary ingredient that you need to deliver that outcome. Like, bring that technology where the data is. Otherwise it just won't work. Why? As soon as you move, you've got latency. As soon as you move, you've got data quality issues you're going to have to contend with. That's going to exacerbate whatever mistrust you might have. >> Or the stuff's not cheap to move. It's not cheap to ingest. >> Yeah. By the way, the Machine Learning on Z offering that we launched last year in March, April was one of our highest, most successful offerings last year. >> Let's talk about some of the offerings. I mean, at the end of the day you're in the business of selling stuff. You've talked about Machine Learning on Z, X, whatever platform. Cloud Private, I know you've got perspectives on that. Db2 Event Store is something that you're obviously familiar with. SPSS is part of the portfolio. >> The 50-year anniversary. >> Give us the update on some of these products. >> Making data ready for AI requires a design principled on simplicity. We launched in January three core offerings that help clients benefit from the capability that we deliver to capture data, to organize and control that data, and analyze that data. So we delivered a Hybrid Data Management offering, which gives you everything you need to collect data; it's anchored by Db2. We have the Unified Governance and Integration portfolio that gives you everything you need to organize and control that data, anchored by our Information Server product set.
And we've got our Data Science and Business Analytics portfolio, which is anchored by our Data Science Experience, SPSS, and Cognos Analytics. So clients that want to mix and match those capabilities in support of artificial intelligence systems, or otherwise, can benefit from that easily. We just announced here a radical, an even more radical, step forward in simplification, which we thought there already was. So if you want to move to the public cloud but can't, or don't want to move to the public cloud for whatever reason, and we think, by the way, that for workloads you should try to run as much as you can on public cloud because of the benefits of it. But if for whatever reason you can't, we need to deliver those benefits behind the firewall where those workloads are. So last year the Hybrid Integration team, led by Denis Kennelly, introduced an IBM Cloud Private offering. It's basically application PaaS behind the firewall, running on a Kubernetes environment. Your applications do buildouts, do migrations of existing workloads to it. What we did with IBM Cloud Private for Data is have the data companion for that. IBM Cloud Private was a runaway success for us. You could imagine the data companion to that just being like, what application doesn't need data? It's peanut butter and jelly for us. >> Last question, oh you had another point? >> It's alright. I wanted to talk about Db2 and SPSS. >> Oh yes, let's go there, yeah. >> Db2 Event Store, I forget if anybody- It has 100x performance improvement on ingest relative to the current state of the art. You say, why does that matter? If you do analysis or analytics, machine learning, artificial intelligence, you're only as good as whatever data you have captured of your, whatever your reality is. Currently our databases don't allow you to capture everything you would want. So Db2 Event Store with that ingest lets you capture more than you could ever imagine you would want.
250 billion events per year is basically what it's rated at. So we think that's a massive improvement in database technology, and it happens to be based on open source, so the programming model is something that developers feel is familiar. SPSS is celebrating its 50th anniversary. It's the number one digital offering inside of IBM. It had 510,000 users trying it out last year. We just renovated the user experience and made it even more simple on Stats. We're doing the same thing on Modeler, and we're bringing SPSS and our Data Science Experience together so that there's one tool chain for data science end to end in the private cloud. It's pretty phenomenal stuff. >> Okay great, appreciate you running down the portfolio for us. Last question. It's kind of a, get out your telescope. When you talk to clients, when you think about technology from a technologist's perspective, how far can we take machine intelligence? Think 20 plus years, how far can we take it and how far should we take it? >> Can they ever really know what a cat is? (chuckles) >> I don't know what the answer to that question is, to be honest. >> Are people asking you that question, in the client base? >> No. >> Are they still figuring out, how do I apply it today? >> Surely they're not asking me, probably because I'm not the smartest guy in the room. They're probably asking some of the smarter guys-- >> Dave: Well, Elon Musk is talking about it. Stephen Hawking was talking about it. >> I think it's so hard to anticipate. I think where we are today is magical and I couldn't have anticipated it seven years ago, to be honest, so I can't imagine. >> It's really hard to predict, isn't it? >> Yeah. I've been wrong on three to four year horizons. I can't do 20 realistically. So I'm sorry to disappoint you. >> No, that's okay.
Because it leads to my real last question, which is what kinds of things can machines do that humans can't, and you don't even have to answer this, but I just want to put it out there to the audience to think about: how are they going to complement each other? How are they going to compete with each other? These are some of the big questions that I think society is asking. And IBM has some answers, but we're going to apply it here, here and here, you guys are clear about augmented intelligence, not replacing. But there are big questions that I think we want to get out there and have people ponder. I don't know if you have a comment. >> I do. I think there are non-obvious things to human beings, relationships between data that's expressing some part of your reality, that a machine through machine learning can see that we can't. Now, what does it mean? Do you take action on it? Is it simply an observation? Is it something that a human being can do? So I think that combination is something that companies can take advantage of today. Those non-obvious relationships inside of your data, non-obvious insights into your data, is what machines can get done now. It's how machine learning is being used today. Is it going to be able to reason on what to do about it? Not yet, so you still need human beings in the middle too, especially when you deal with consequential decisions. >> Yeah but nonetheless, I think the impact on industry is going to be significant. Other questions we ask: are retail stores going to be the exception versus the norm? Will banks lose control of the payment systems? Will cyber be the future of warfare? Et cetera, et cetera. These are really interesting questions that we try and cover on theCUBE and we appreciate you helping us explore those. Daniel, it's always great to see you. >> Thank you, Dave. Thank you, Peter. >> Alright keep it right there buddy, we'll be back with our next guest right after this short break. (electronic music)
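The audit trail Daniel describes for regulated industries, recording what data a model was trained on and how, so the process can be reviewed even when the model itself is a black box, can be approximated with a simple lineage record. The field names and helper functions below are illustrative, not an IBM schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def fingerprint(rows):
    """Stable hash of the training data, so the exact inputs
    can be attested to a reviewer later."""
    blob = json.dumps(rows, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def lineage_record(model_name, training_rows, excluded_features, metrics):
    """One reviewable entry per training run."""
    return {
        "model": model_name,
        "trained_at": datetime.now(timezone.utc).isoformat(),
        "data_sha256": fingerprint(training_rows),
        "row_count": len(training_rows),
        # Protected attributes deliberately withheld from training,
        # documented for the anti-discrimination review.
        "excluded_features": excluded_features,
        "metrics": metrics,
    }

record = lineage_record(
    "credit_risk_v3",
    training_rows=[{"income": 52000, "defaulted": 0},
                   {"income": 31000, "defaulted": 1}],
    excluded_features=["gender", "ethnicity"],
    metrics={"auc": 0.81},
)
```

Whether or not the model is interpretable, a record like this lets a reviewer verify which data went in, what was deliberately withheld, and how the result was measured, which is the effect-based compliance argument Daniel outlines.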

Published Date : Mar 21 2018



Nancy Hensley, IBM | IBM Think 2018


 

>> Announcer: Live from Las Vegas, it's theCUBE. Covering IBM Think 2018. Brought to you by IBM. >> Hello, and welcome to theCUBE. Here we are at IBM Think 2018. I'm John Furrier, your host. We are here for a feature one-on-one CUBE interview with Nancy Hensley, the Chief Digital Officer of the Analytics group. IBM has a new position rolling out across the company called the Chief Digital Offices. So there's a company-wide Chief Digital Officer, and that's Bob Lord. But each business unit's taking digital seriously as a way to engage and provide services and value to customers and anyone who's interested. Nancy, great to see you. CUBE alumni. >> Thank you, thank you. Glad to be here. Always happy to be back. >> Thanks for stopping by. So, I'm really interested in this Chief Digital Officer role that you're in. >> Yeah. >> You know we love digital, you know we're progressive, we love to try new things. >> Nancy: Absolutely. >> IBM, big infrastructure on digital. What's your new role? Take a minute to explain what you're working on in this analytics group. So you're in this analytics division. >> Nancy: Yes. >> So you're in the business unit? Take a minute. >> I haven't left all of my love for data and analytics. I'm still here, but now what I'm doing is making these products much more consumable and accessible. The challenge we had, and I think a big change that's happening in the industry, is that best of breed isn't good enough anymore. You have to make these products much more accessible because the power is shifting to that one digital consumer, who's going to search for some sort of capability. >> John: Yeah. >> And wherever they find it is where they're going to start to engage, right? And that's where we have to be. >> Yeah. I mean, to me, remember the old days? CRM. Customer Relationship Management software. >> Yeah. >> I mean, right now, software is in a relationship, still. >> Nancy: Absolutely. >> So, talk about the relationship, because digitally it's different.
>> Nancy: It is. >> It's not a catalog for learning. >> Right. >> It's not waterfall, it's more agile, it's more personal. >> Right. >> But it can't be intrusive, because people don't want to be sold to, they're worried about their data. >> Nancy: Right. >> Re-targeting. >> Nancy: Yep. >> How are you guys changing the game? >> So, we used to develop products. Now we develop experiences. The product is the experience, and the experience is the product, and that starts from: how easy is it to find? When I search for a capability like text analytics or content analytics, do I find what I'm looking for? How easy is it to get my hands on it and try it? How easy is it to have that aha moment of 'oh, I get how this product can help me,' right? How easy is it to engage with my peers in a community? How easy is it to get support, right? All of that is part of the experience. And what we're doing now is wrapping that all together around the product. >> Talk about, specifically, some of the things you're working on, I'd like to get.. >> Sure. >> I know you were talking before we came on camera about some of the programs, but at the end of the day, people want to get the job done, right? >> Nancy: Right. >> They need, they have a job to do, a mission, and they want to feel like they got instant value. >> Nancy: M-hmm. >> Maybe kick the tires, do a little deep dive. >> Nancy: Yup. >> Jump around, not feel like they're, you know, getting in a headlock on the IBM dot com site. >> (laughing) Well, let's talk about one of the products that we started with, which was SPSS Statistics. So, do you know Statistics actually turns 50 this year? 50! That's amazing, right? So, Statistics is primarily students in academia. So the average profile of a Statistics buyer is normally under 25. How do you think those buyers want to buy? It's probably not through a face-to-face IBM sales relationship, right?
So we started off with that product because it was the most B to C product that we had, and we knew that the buyer gave us some very clear signals: I want to buy digitally, I want to be able to easily try it, download it, and get subscription-based pricing that includes support, and have a good community to go to. So, when we started off the digital transformation a year ago on Statistics, it was very difficult to find, it was very difficult to try. We didn't have a very good NPS score for support. And so we transformed the whole experience, and you literally can get on, it's easy to find, it comes up top of the search. You download it, you swipe your credit card. It's a very sleek experience and you are up and running in like, 15 minutes. >> You know, one of the things that's interesting, people just want a relationship with that experience. And as you guys rethink this, if you think about it, analytics, the younger buyers.. >> Nancy: Yeah! >> They don't actually even use email. They have mobile email accounts. >> Nancy: Right. >> They're on Snapchat, they're on Instagram, and they have multiple channels open, and so you have to be smarter about how to engage in the preferred method that the users want. How is that translated within IBM? Share some inside baseball about some of the conversations inside IBM as you guys try to make that happen. Because I know, certainly, that you're talking about it, you guys are doing stuff. What are the conversations like inside IBM? >> (laughing) I think we want to be able to do more to engage with the client in-product. Everything from making it easier for them to find support to even booking time with an expert. And the more we can push that into the product so they never have to leave that original experience, I think it's better for them, right? I mean, in the past IBM would have one site for developerWorks, right?
One site that had support information, one site that had product information, one site that had, like, learn and discover assets, and another site where you would try and buy. And that was just too much work for the consumer to try and get to that point where they were very comfortable and confident, where they could find their peers, right? So consolidating all of that is the big challenge now. Because, you know, we're not a young company, so we have a lot of information that's digitized out there. >> And you have some older buyers, I mean, but that's the trade-off. I have this conversation all the time with folks, that new solutions aren't mutually exclusive to the old way. >> Nancy: Right. >> There are a lot of people that still use email >> Nancy: Yeah. >> As a preferred method. It's been the killer app for 30 years. >> Nancy: Absolutely. >> Okay, but now the new users, you've got to bolt on new programs, so how's that ... How are you guys thinking about that? Are there any technology decisions that you guys made? Jeannie mentioned you guys are using your own tools and technology, love her story. A.I. ... >> So, one of the cool things- >> Blockchain, data. >> Absolutely, absolutely. So, one of the cool things we're doing is using chat bots to optimize the time of our digital sales reps. So if you go on SPSS Statistics right now, you can have a conversation with a chat bot, and what it's done is, it's actually helped us optimize. So, when you actually talk to a really good rep that you want to get deeper in conversation with, you've already gotten a lot of your questions answered. We've improved their time, they optimized their time by 76%. Overall, what digital's done for us in a product like Statistics is it's reduced the amount of time it's taken us to acquire new clients. So, for every 100 new clients that we acquire, brand new to IBM, that time has been reduced by 70%.
So we can truly accelerate how many more clients we can onboard in digital than we ever could before. >> So here's a trick question for you. It's kind of a hard question, but it's kind of a trick question because it's hard to answer. At least I think it's hard, maybe you'll think it's easier. Inefficiencies always come in with new technologies, but whenever you have new technologies, you can create new efficiencies. >> Nancy: M-hmm. >> What if, because you mentioned some great stats, you guys are shortening the cycle down to acquire new customers. >> Provide value, faster time to value. Have you seen any new blockers come in front of you, or have you seen any new things that you guys have disrupted away in terms of making it more efficient? Because there's always an opportunity to reduce the steps it takes to do something. >> Nancy: Right. >> Or make it easier to use and simpler. >> Well, it is a huge mindset shift for us, because this is not how we've engaged with the client. So first, it's important for clients to understand that there are two routes to market with us now. One is through a face-to-face, traditional sales method, and some clients will continue to engage on that through many of our products. There's our partners. Actually, it's more than two. And now there's digital, and that's brand new, right? Truly digital self-service commerce, and with that we're doing more focus around: how do we grow adoption of those products faster than we ever could before? So we're using new growth hacking techniques, and that is, again, very disruptive to the mindset that we came from, but, you know, I always say, IBM, we continue to reinvent ourselves, so we're reinventing a new experience. >> Well, I've got to just say growth hacking techniques have been a big debate in Silicon Valley. Gamification, growth hacks, is kind of passé in terms of wording. There's nuance, but I want to share that with you.
There were companies that did growth hacking at the expense of the users. >> Nancy: Right. >> But there's actually growth hacking that creates a good user experience >> Nancy: Absolutely. >> That's kind of being replaced with gamification, and this is becoming a very critical part of digital. >> Nancy: Absolutely. >> 'Gamifying' on behalf of user experience, >> Nancy: Yep. >> which Jeannie was saying that's the focus, is really the shortcut. >> Nancy: Absolutely. >> So to me, the shortcut is, how do I get to what I want to find ... ? That's gamification. It's an algorithm, it's software. >> Right, right. And how do you amplify what's working and what's not working? So we're literally running weekly experiments. We get the teams together, we have squads that get together; it's everybody from design to development, and we just do a big brain dump of: here are the things we should try. And then we just try, and we start to double down where it's working, and we learn a lot from the things that aren't working. And not everything works in digital, is what we're finding. >> The best thing about it is that you can always re-start and re-try because it's easy to work with. >> Right. >> So I want to talk about the role of community. IBM has always had a strong community mindset. >> Nancy: Absolutely. >> The ethos going back to open source days; it's been a leader in Linux, and continues to have an open source presence. We've been following the Hyperledger project in the Linux Foundation; I've been covering some of the IBM work there with Blockchain. But more and more, open source and community. How do you guys take digital to communities? >> So, in the past, the digital experience wasn't really all-inclusive around the product, so you would have to go to a different place to connect with community. And now what we're doing is bringing that all into, we call it, a hub-like experience in the marketplace, so it's all there.
Because part of your decision process is, I want to go connect with people like me, right? I want to connect with my peers. So we're making it easier to do that. So now that it's all interconnected in the marketplace, we're making it easier for people to find, because ... You know, what do you do when you buy something, right? You read reviews, you see how other people have used it. >> Check the ratings. >> Community's critical to that, right? Exactly. So we've connected all of that too, including a support experience as well. All of that revolves around the product digitally. >> All right. I've got to ask the final question. I asked Janine Sneed, who's the CDO for Hybrid Cloud, the same question: what's on your to-do list? New job, congratulations. >> (laughing) Thank you! >> An important one, we think it's super critical. What are your priorities, what are you going to work on? What does the to-do list look like? What are some of the things you want to accomplish over the next year, be it putting stakes in the ground, new programs ... What are the priorities? Share some insight into what you're thinking. >> I would like to get as much self-service capability across the products that we are determining to be digital; that's probably my number one priority. But number two is to create a great onboarding experience, right? And that's different than selling. When you're selling, you're convincing somebody. When you're onboarding a client, you're kind of showing them the way, and so I want to create that great onboarding experience in every single product, so that our products are easy to adopt, they're easy to use, and, you know, that's how we grow. >> You've got to earn their trust in that onboarding process. >> Absolutely, absolutely. I mean, in the digital space it's everything. >> Yeah. >> It's everything. >> Digital trust. Nancy Hensley, Chief Data ... Data ... >> (laughing) >> Chief Digital Officer. See, CDO means multiple things.
Chief Data Officer, but you're Chief Digital Officer for the Analytics team, and a CUBE alumni, sharing her thoughts on her new opportunity within IBM. It's an important one, as digital is the fabric; digital transformation is changing experiences and outcomes, and of course creating value. I'm John Furrier here in the CUBE studios at IBM Think. We'll be back with more after this short break. (electronic music)
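Nancy's weekly experiment loop above ("we just try, and we start to double down where it's working") reduces to comparing conversion rates between a control and a variant each week. A minimal sketch of that readout, with invented visitor and conversion counts rather than IBM's real numbers:

```python
# Weekly experiment readout: compare the conversion rate of a control
# page against a variant and flag which one to double down on.
# All counts below are hypothetical, for illustration only.

def conversion_rate(conversions, visitors):
    return conversions / visitors

def readout(name, control, variant):
    # control and variant are (conversions, visitors) tuples
    c_rate = conversion_rate(*control)
    v_rate = conversion_rate(*variant)
    lift = (v_rate - c_rate) / c_rate  # relative improvement over control
    verdict = "double down" if lift > 0 else "drop or rework"
    return (f"{name}: control {c_rate:.1%}, variant {v_rate:.1%}, "
            f"lift {lift:+.0%} -> {verdict}")

print(readout("guided-demo CTA", control=(30, 1000), variant=(45, 1000)))
# -> guided-demo CTA: control 3.0%, variant 4.5%, lift +50% -> double down
```

A real readout would also run a significance test (for example a two-proportion z-test) before declaring a winner, so a week of noise isn't mistaken for a trend.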

Published Date : Mar 21 2018



Janine Sneed, IBM | IBM Think 2018


 

>> Narrator: Live from Las Vegas, it's theCUBE. Covering IBM Think 2018. Brought to you by IBM. >> Hello everyone, welcome to theCUBE here at IBM Think 2018. I'm John Furrier. We're on the ground with theCUBE. In theCUBE studio today we have a live audience on break, but I had a chance to meet with the Chief Digital Officer of Hybrid Cloud, Janine Sneed, who was just appointed. She's here on set on theCUBE. Great to see you at IBM Think. >> Hi, great to see you. Thanks for having me. >> Thanks for coming on. I'm super excited. When I interviewed Bob Lord last year, Chief Digital Officer, you know we love digital on theCUBE, so we get really excited. We're like, great, that's awesome. Now IBM's got more Chief Digital Officers being appointed. >> Janine: That's right. >> You're the first Chief Digital Officer in a business unit. That's awesome, congratulations. >> Thank you. Yeah, we're excited about it. We know and we believe that the future is really in the hands of the web. And we know that customers are engaging with us differently. They want much more of a self-service. They want to experience the products without always, I'll say, a person interacting with them. And we know that from a product perspective there are things that we need to do to make our offerings much more digitally consumable. So we're taking this very seriously. And we put an organization in place, Digital within Hybrid Cloud, that truly focuses on the time from when a customer goes out and actually does a search, all the way through the buyer journey, to the time they get to the product. >> John: You know, I've been a student of IBM. I actually worked at IBM as a co-op back in my early days. IBM has always been on the leading edge of marketing. You looked at social in an early way, digital in an early way, but now with the cloud you can actually engage customers digitally. So I've got to ask you, you know, how are you going to do that?
>> Janine: Yeah >> John: Because you've got to remember, websites are now the fabric of all this, and that's a 30-year-old tech stack. You've got cloud now, you've got APIs with asynchronous software packages. You've got blockchain. All these new things. So what's the vision as you guys go out and start putting stakes in the ground for a digital strategy? How are you guys doing it, can you share the vision? >> Yeah, I think it starts with using our own technology. So within the Hybrid Cloud organization, we have a lot of software and we're putting that software out on the cloud. We want customers to engage with us digitally through a technical experience. So we're taking our products, putting up product demos, we're putting up POTs, we're even putting secure proofs of concept in the cloud, guided demos where they can come and experience these offerings without ever engaging with us. Now of course, once they're ready they can engage with us, but this is truly about a low-touch, self-service way for customers to engage with our products. >> Now a lot of people, and we talk about this all the time, but the general sentiment online now is you have the kind of crazies out there, you've seen that on Reddit, fake news, weaponizing content. Then you have the other side of the spectrum where people are like, I don't want to be sold to. I'm discovering, I want to learn. >> Janine: Yes. >> John: I'm in communities. I know you guys address that. I want you to just clarify, because there's a model now where people just want to be ingratiated in. You know, kick the tires. Which, by the way, kicking tires right now is much different than it was years ago, because you have APIs. You have SaaS, source code. You have credits for cloud. >> Janine: That's right. >> What is the digital motion there? I mean obviously it's a light touch. >> Yeah >> But is it still on IBM.com? >> It is. So we're still on IBM.com properties.
And we're nurturing with the ecosystem and the communities to also go where they are, but bring them back to the IBM.com properties and engage with them when they're ready. You know, we've done the research. We know that 70% of B2B buyers learn about your products and your services without ever talking to you. So we want to be where those users are, and eventually that will be back on our property, but we also want to find them where they are. >> You know, one of the things we were talking about before you came on camera here: we've been doing theCUBE for seven years or so, plus six shows now to one show. But the thought leadership on theCUBE has always been powerful. And that seemed to be a great way to get into communities. And IBM's got a lot of thought leaders. So I'm sure you have a plan for thought leaders. You have IBM Fellows. You've got R&D. You've got a lot of content opportunities. >> We do. We've got a lot of partners. So here at this conference we've been talking to a lot of our partners who want to be a part of this experience. We've got great solutions, and all of our solutions, a lot of them, are delivered with partners. And so it's working the community. It's working the ecosystem. And it's doing this together with partners, to allow them to contribute and allow customers to come and consume solutions. In much of a use-case way; of course you can have product by product by product, but how do you essentially deliver solutions based on use cases? >> So I'll ask you a personal question. How did you get here? Was it like, hey, I want to do the digital job? Was it an itch that you were scratching? Did Bob Lord lure you into the job? (Janine laughs) Did he recruit you? I mean -- >> No, it's -- >> How did you get it? >> It's a great question >> Because this is a great opportunity. >> It is. I'm a product person by training. And I spent the last 18 months in sales.
And I enjoyed every minute of that, listening and understanding how our sellers want to consume. Short, snackable types of learning and training. And watching what was going on with the digital ecosystem, I thought it was a great way to really mix the skills that I have within product with what I just learned from my sales role. And I did nine months in marketing. So I felt like it was kind of a mixture. And we have a huge opportunity here. So the opportunity presented itself. >> Sales has its expressions; my favorite is that people love to buy from people that they like. How are you going to make IBM likable digitally? Is there a strategy there? >> Oh, it's simple. (John laughs) It is so dead simple. It's about the user experience. When users come, you have to give them the best experience possible, because you never get a second chance to make a good first impression. So I want to basically set the bar. And we're an MVP right now with a lot of the stuff that we're doing out. >> You mean software and tools and stuff? >> Yeah, no, well, our experience right now. So when you come and you experience our tools, I'm sorry, our demos and our proofs of technology and our tutorials out on our site, it's MVP. We're 45 days old. But it's about the user experience. And so we've been serving users here that are coming to try our stuff. >> So the Digital Technical Engagement, that's the DTE? >> Janine: DTE, yep. >> That's the one that's 45 days? >> That's the one that's 45 days old. >> The IBM site's not 45 days old. >> Yeah, yeah. >> But this new program. So take a minute to explain what the DTE, the Digital Technical Engagement program, is. What were the guiding principles behind it? >> Yeah >> What were some of the design objectives? Is there any new cool tech under the covers? Share a little bit of color on that. >> Sure, sure. Happy to. So back in the fourth quarter of last year we took a look and we said, how are customers consuming? How are we engaging? How are we showing up?
And what do we need to do to shift, to become more agile and lighten the way that we showed up? And so we really gathered a few smart creatives from the CIO's office, from IBM Design, from product and from marketing, and we said, guys, we're going to run an experiment. We want to set up a page off of IBM.com, and it's very simple. Keep it so clean. Keep the user experience clean. Take something like IBM Cloud Private. Give me three product demos. Give me one guided demo where in 10 minutes a client can get through IBM Cloud Private without getting stuck, and then give them a way to try it for two weeks. Just experiment. Well, in 90 days we've had 10,500 users try that guided demo and our NPS is 56. >> What does NPS mean? >> Net Promoter Score. >> That's what I figured, okay. >> So it's about experimentation. And so in this world that we're going into, we want to experiment. And so from there, what happened: that proved to be successful. We now have an organization of about 60 people within Digital Technical Engagement, deep product experts, but we also have a platform team to drive that experience. >> So there's some real value there. I mean, a lot of people look at websites and digital technologies as ad tech, you know, and there's a lot of bad press out there now with Facebook, where a lot of people are looking at Facebook as content that got weaponized for fake news, and ad tech has a bad track record of: fill out a form, they're going to sell me something. How are you going to change that perception? >> That's a great question. So a lot of the folks that we're working with right now say you have to capture user information, capture user information. And for me, I don't want to be bothered. So I'm kind of looking at this maybe a little bit too selfishly, saying I want to demo without giving you my information. With our product demos and our guided demos, we don't collect any information from the user.
When you are going to reserve our software for two weeks, up to a month, we do collect some information about you. >> John: You got to. >> We have to. >> At some point. >> So we're keeping it very low touch, because we know that's how users want to engage. >> You don't want to gate the hell out of it. >> No, we don't want to gate the hell out of it. We want to keep it just, let them explore without being all over them. Right? >> Talk about the new IBM. You know, one of the things that's transforming right now that I'm impressed with is IBM constantly reinventing itself. I was impressed with Ginni's keynote. The way she talks about data in the middle, blockchain on one side and AI on the other. I call it the innovation sandwich. >> Janine: Yeah >> How are you applying that vision to digital? I mean, not yet obviously, you're only at the beginning. >> Right. But that vision is pretty solid. And she brought up Moore's Law and Metcalfe's Law. >> That's right. >> Moore's Law is making things faster, smaller, cheaper. >> Right >> Component-wise, and speed. >> Yes >> Metcalfe's Law is about the network effect, and the future of digital is either going to be token economics or blockchain with programmatic tooling that gives users great experiences. So how do you tie that together? Maybe it's too early to ask, but-- >> No, no. It's simple. I'm a consumer of this stuff. I'm using the cloud. I'm using IBM Design Thinking, because I brought in three designers from Phil Gilbert's group. Right? I'm embedded in the digital organization basically, regardless of where I sit. So we are adopting best practices that come from IBM's big chief digital office. >> So you get to use your own tools; that's one of the things she said. >> Yeah, and we'll embed, we'll get there. Right? >> Yeah >> Well actually, we already are; we embedded chat. So we've got Watson Chat running on our SPSS Statistics page. So it's about the cloud, it's about user experience.
It's about applying digital practices from Bob Lord's organization and then it's about Watson. >> I was having a great Twitter thread with a bunch of people that were on Twitter just ranting on the weekend a couple weekends ago about digital transformation. Tom Peters actually jumped in, the famous Tom Peters who wrote the books there, a management consultant, about digital transformation. I love digital transformation, it's overused, but it's legit. People are transforming. So the question was, how do you do it successfully? And all the canned answers came out. Well, you need commitment from the top. You've got to have this and that. And I said look, bottom line, if people don't have the expertise, and if they don't know what they're doing, they can't transform. So it begs the question for skills gap. A lot of people are learning, so there's a learning environment. It's not just sales. Proficiency, getting the product buying. There's a community thirst for learning. How is that incorporated in, if any? >> I think I have a little bit of a different hurdle. The people that we're working with are learning. They're out in the communities they're engaging. I think one of the things that we have to continue to do is continue to show the value of digital transformation. Remember, IBM is a big company. I'm not a ten person startup. Right? We're a bigger organization so what we have to do is show why digital is important back in with our product teams. I think for the most part our marketing teams get it. Because you have to make trade offs. Am I going to invest in this feature in the product or am I going to put in something like eCommerce so you can subscribe and buy. >> Priorities. But you're a product person, so it's all about the trade offs. >> Yeah, it's all about the trade offs, right? So the skills are part of it but some of it is just education on why this is so critical. And then the last thing is passion. 
You have to bring the skills, the education and then that passionate team that really believes that they can get this done. >> Okay so given that, let's go back to some of the comments I made about the people who we were talking about on Twitter >> Janine: Sure >> Commitment from the top. IBM commitment at the top is there? What are they saying, what's the marching orders? >> The marching orders is we got to go and we're not moving fast enough. Speed, speed, speed, right? So we got to move fast. >> So in an interview with Bob Lord, one of the things we talked about was interesting. He's like I like to just get stuff done. I think he might have used another word. Maybe it was off camera he said that. IBM's got a lot of process. How do you take the old IBM process and make it work for you rather than having digital work for the process? >> Yeah >> It's a lot of internal things but no need to give away too much but it's a management challenge. How do you cut through it? >> I think from a process perspective, these are conversations and you have to explain why. If you could go in and explain why you need to do something differently, then people will listen. I'd like to give an example, okay? I had 26 days to get five products out the door. I formed a team January 2nd. By January 26th, I had to be live. Now I worked with my marketing team and I said I will get into your buyer journey, but I have to launch my Digital Technical Engagement site and my products. They understood. So I went live. Now, will I back back into the process? Sure I will. >> John: But you had good alignment. >> But yeah, we have to move fast, right? So it's explaining why and having mature conversations and then people that really believe in digital they'll support you. >> Great conversation. I'm looking forward to chatting more with you. We're at theCUBE. But I want to ask you one final question before we break. What's your objective? What's the roadmap for you, what's your top priorities? 
Are you hiring? Who're you looking for? What kind of product priorities, what's the sales priorities? What's your to-do list? >> I think let's start with the customer. So the customer priority is to deliver the best experience possible as they engage with IBM digitally. And that's all about the user experience. From a talent perspective, it's all about diversity, inclusion, and people that come with different skills from technology, to growth hacking, to marketing, and to engineering. And some people that think differently. We want people that, no idea is a bad idea, just come and bring great ideas. >> Well, diversity and inclusion, first of all, half of the users are women. And you also have to have an understanding of the use cases. >> Yeah >> It's not just men using software. >> Yeah, that's right. >> It's a huge deal. >> That's right, that's right. >> Alright well, Janine, great to have you on theCUBE. Thanks for spending the time. >> Thank you. >> Congratulations on the new role. Janine Sneed, Chief Digital Officer from IBM Hybrid Cloud. First IBM Chief Digital Officer in a business unit. I also today have Bob Lord and a lot of other folks doing digital but great to see the digital momentum. >> Thank you. >> It's not just a selling apparatus. It's all about value for users. It's theCUBE bringing you the value here at IBM Think 2018. I'm John Furrier, back with more after this short break. (upbeat music)

Published Date: Mar 20, 2018



Rob Thomas, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI, brought to you by IBM. >> Welcome back to New York City. theCUBE continues our coverage here at IBM's event, Machine Learning Everywhere: Build Your Ladder to AI. And with us now is Rob Thomas, who is the vice president of, or general manager, rather, of IBM analytics. Sorry about that, Rob. Good to have you with us this morning. Good to see you, sir. >> Great to see you, John. Dave, great to see you as well.
>> It seems like a no-brainer, right? I mean, or a must-have. >> I think there's a, there's always that, sometimes there's a fear factor. There is a culture piece that holds people back. We're trying to make it really simple in terms of how we talk about the day, and the examples that we show, to get people comfortable, to kind of take a step onto that ladder back to the company. >> It's conceptually a no-brainer, but it's a challenge. You wrote a blog and it was really interesting. It was, one of the clients said to you, "I'm so glad I'm not in the technology industry." And you went, "Uh, hello?" (laughs) "I've got news for you, you are in the technology industry." So a lot of customers that I talk to feel like, meh, you know, in our industry, it's really not getting disrupted. That's kind of taxis and retail. We're in banking and, you know, but, digital is disrupting every industry and every industry is going to have to adopt ML, AI, whatever you want to call it. Can traditional companies close that gap? What's your take? >> I think they can, but, I'll go back to the word I used before, it starts with culture. Am I accepting that I'm a technology company, even if traditionally I've made tractors, as an example? Or if traditionally I've just been you know, selling shirts and shoes, have I embraced the role, my role as a technology company? Because if you set that culture from the top, everything else flows from there. It can't be, IT is something that we do on the side. It has to be a culture of, it's fundamental to what we do as a company. There was an MIT study that said, data-driven cultures drive productivity gains of six to 10 percent better than their competition. You can't, that stuff compounds, too. So if your competitors are doing that and you're not, not only do you fall behind in the short term but you fall woefully behind in the medium term. And so, I think companies are starting to get there but it takes a constant push to get them focused on that. 
>> So if you're a tractor company, you've got human expertise around making tractors and messaging and marketing tractors, and then, and data is kind of there, sort of a bolt-on, because everybody's got to be data-driven, but if you look at the top companies by market cap, you know, we were talking about it earlier. Data is foundational. It's at their core, so, that seems to me to be the hard part, Rob, I'd like you to comment in terms of that cultural shift. How do you go from sort of data in silos and, you know, not having cloud economics and, that are fundamental, to having that dynamic, and how does IBM help? >> You know, I think, to give companies credit, I think most organizations have developed some type of data practice or discipline over the last, call it five years. But most of that's historical, meaning, yeah, we'll take snapshots of history. We'll use that to guide decision making. You fast-forward to what we're talking about today, just so we're on the same page, machine learning is about, you build a model, you train a model with data, and then as new data flows in, your model is constantly updating. So your ability to make decisions improves over time. That's very different from, we're doing historical reporting on data. And so I think it's encouraging that companies have kind of embraced that data discipline in the last five years, but what we're talking about today is a big next step and what we're trying to break it down to what I call the building blocks, so, back to the point on an AI ladder, what I mean by an AI ladder is, you can't do AI without machine learning. You can't do machine learning without analytics. You can't do analytics without the right data architecture. So those become the building blocks of how you get towards a future of AI. And so what I encourage companies is, if you're not ready for that AI leading edge use case, that's okay, but you can be preparing for that future now. That's what the building blocks are about. 
You know, I think we're, I know we're ahead of, you know, Jeremiah Owyang on a little bit later, but I was reading something that he had written about gut and instinct, from the C-Suite, and how, that's how companies were run, right? You had your CEO, your president, they made decisions based on their guts or their instincts. And now, you've got this whole new objective tool out there that's gold, and it's kind of taking some of the gut and instinct out of it, in a way, and maybe there are people who still can't quite grasp that, that maybe their guts and their instincts, you know, what their gut tells them, you know, is one thing, but there's pretty objective data that might indicate something else. >> Moneyball for business. >> A little bit of a clash, I mean, is there a little bit of a clash in that respect? >> I think you'd be surprised by how much decision making is still pure opinion. I mean, I see that everywhere. But we're heading more towards what you described for sure. One of the clients talking here today, AMC Networks, I think it's a great example of a company that you wouldn't think of as a technology company, primarily a content producer, they make great shows, but they've kind of gone that extra step to say, we can integrate data sources from third parties, our own data about viewer habits, we can do that to change our relationship with advertisers. Like, that's a company that's really embraced this idea of being a technology company, and you can see it in their results, and so, results are not coincidence in this world anymore. It's about a practice applied to data, leveraging machine learning, on a path towards AI. If companies are doing that, they're going to be successful. >> And we're going to have the tally from AMC on, but so there's a situation where they have embraced it, that they've dealt with that culture, and data has become foundational. Now, I'm interested as to what their journey looked like. What are you seeing with clients?
How they break this down, the silos of data that have been built up over decades. >> I think, so they get almost like a maturity curve. You've got, and the rule I talk about is 40-40-20, where 40% of organizations are really using data just to optimize costs right now. That's okay, but that's on the lower end of the maturity curve. 40% are saying, all right, I'm starting to get into data science. I'm starting to think about how I extend to new products, new services, using data. And then 20% are on the leading edge. And that's where I'd put AMC Networks, by the way, because they've done unique things with integrating data sets and building models so that they've automated a lot of what used to be painstakingly long processes, internal processes to do it. So you've got this 40-40-20 of organizations in terms of their maturity on this. If you're not on that curve right now, you have a problem. But I'd say most are somewhere on that curve. If you're in the first 40% and you're, right now data for you is just about optimizing cost, you're going to be behind. If you're not right now, you're going to be behind in the next year, that's a problem. So I'd kind of encourage people to think about what it takes to be in the next 40%. Ultimately you want to be in the 20% that's actually leading this transformation. >> So change it to 40-20-40. That's where you want it to go, right? You want to flip that paradigm. >> I want to ask you a question. You've done a lot of M and A in the past. You spent a lot of time in Silicon Valley and Silicon Valley obviously very, very disruptive, you know, cultures and organizations and it's always been a sort of technology disruption. It seems like there's a ... another disruption going on, not just horizontal technologies, you know, cloud or mobile or social, whatever it is, but within industries. Some industries, as we've been talking, radically disrupted. Retail, taxis, certainly advertising, et cetera et cetera. 
Some have not yet, the client that you talked to. Do you see, technology companies generally, Silicon Valley companies specifically, as being able to pull off a sort of disruption of not only technologies but also industries and where does IBM play there? You've made a sort of, Ginni in particular has made a deal about, hey, we're not going to compete with our customers. So talking about this sort of dual disruption agenda, one on the technology side, one within industries that Apple's getting into financial services and, you know, Amazon getting into grocery, what's your take on that and where does IBM fit in that world? >> So, I mean, IBM has been in Silicon Valley for a long time, I would say probably longer than 99.9% of the companies in Silicon Valley, so, we've got a big lab there. We do a lot of innovation out of there. So love it, I mean, the culture of the valley is great for the world because it's all about being the challenger, it's about innovation, and that's tremendous. >> No fear. >> Yeah, absolutely. So, look, we work with a lot of different partners, some who are, you know, purely based in the valley. I think they challenge us. We can learn from them, and that's great. I think the one, the one misnomer that I see right now, is there's a undertone that innovation is happening in Silicon Valley and only in Silicon Valley. And I think that's a myth. Give you an example, we just, in December, we released something called Event Store which is basically our stab at reinventing the database business that's been pretty much the same for the last 30 to 40 years. And we're now ingesting millions of rows of data a second. We're doing it in a Parquet format using a Spark engine. Like, this is an amazing innovation that will change how any type of IOT use case can manage data. Now ... people don't think of IBM when they think about innovations like that because it's not the only thing we talk about. 
We don't have, the IBM website isn't dedicated to that single product because IBM is a much bigger company than that. But we're innovating like crazy. A lot of that is out of what we're doing in Silicon Valley and our labs around the world and so, I'm very optimistic on what we're doing in terms of innovation. >> Yeah, in fact, I think, rephrase my question. I was, you know, you're right. I mean people think of IBM as getting disrupted. I wasn't posing it, I think of you as a disruptor. I know that may sound weird to some people but in the sense that you guys made some huge bets with things like Watson on solving some of the biggest, world's problems. And so I see you as disrupting sort of, maybe yourselves. Okay, frame that. But I don't see IBM as saying, okay, we are going to now disrupt healthcare, disrupt financial services, rather we are going to help our, like some of your comp... I don't know if you'd call them competitors. Amazon, as they say, getting into content and buying grocery, you know, food stores. You guys seems to have a different philosophy. That's what I'm trying to get to is, we're going to disrupt ourselves, okay, fine. But we're not going to go hard into healthcare, hard into financial services, other than selling technology and services to those organizations, does that make sense? >> Yeah, I mean, look, our mission is to make our clients ... better at what they do. That's our mission, we want to be essential in terms of their journey to be successful in their industry. So frankly, I love it every time I see an announcement about Amazon entering another vertical space, because all of those companies just became my clients. Because they're not going to work with Amazon when they're competing with them head to head, day in, day out, so I love that. 
So us working with these companies to make them better through things like Watson Health, what we're doing in healthcare, it's about making companies who have built their business in healthcare, more effective at how they perform, how they drive results, revenue, ROI for their investors. That's what we do, that's what IBM has always done. >> Yeah, so it's an interesting discussion. I mean, I tend to agree. I think Silicon Valley maybe should focus on those technology disruptions. I think that they'll have a hard time pulling off that dual disruption and maybe if you broadly define Silicon Valley as Seattle and so forth, but, but it seems like that formula has worked for decades, and will continue to work. Other thoughts on sort of the progression of ML, how it gets into organizations. You know, where you see this going, again, I was saying earlier, the parlance is changing. Big data is kind of, you know, mm. Okay, Hadoop, well, that's fine. We seem to be entering this new world that's pervasive, it's embedded, it's intelligent, it's autonomous, it's self-healing, it's all these things that, you know, we aspire to. We're now back in the early innings. We're late innings of big data, that's kind of ... But early innings of this new era, what are your thoughts on that? >> You know, I'd say the biggest restriction right now I see, we talked before about somehow, sometimes companies don't have the desire, so we have to help create the desire, create the culture to go do this. Even for the companies that have a burning desire, the issue quickly becomes a skill gap. And so we're doing a lot to try to help bridge that skill gap. Let's take data science as an example. There's two worlds of data science that I would describe. There's clickers, and there's coders. Clickers want to do drag and drop. They will use traditional tools like SPSS, which we're modernizing, that's great. We want to support them if that's how they want to work and build models and deploy models. 
There's also this world of coders. This is people that want to do all their data science in ML, and Python, and Scala, and R, like, that's what they want to do. And so we're supporting them through things like Data Science Experience, which is built on Jupyter. It's all open source tooling, it's designed for coders. The reason I think that's important, it goes back to the point on skill sets. There is a skill gap in most companies. So if you walk in and you say, this is the only way to do this thing, you kind of excluded half the companies because they say, I can't play in that world. So we are intentionally going after a strategy that says, there's a segmentation in skill types. In places there's a gap, we can help you fill that gap. That's how we're thinking about them. >> And who does that bode well for? If you say that you were trying to close a gap, does that bode well for, we talked about the Millennial crowd coming in and so they, you know, do they have a different approach or different mental outlook on this, or is it to the mid-range employee, you know, who is open minded, I mean, but, who is the net sweet spot, you think, that say, oh, this is a great opportunity right now? >> So just take data science as an example. The clicker coder comment I made, I would put the clicker audience as mostly people that are 20 years into their career. They've been around a while. The coder audience is all the Millennials. It's all the new audience. I think the greatest beneficiary is the people that find themselves kind of stuck in the middle, which is they're kind of interested in this ...
>> So your advice, then, as you're talking to your clients, I mean you're also talking to their workforce. In a sense, then, your advice to them is, you know, join, jump in the wave, right? You've got your, you can't straddle, you've got to go. >> And you've got to experiment, you've got to try things. Ultimately, organizations are going to gravitate to things that they like using in terms of an approach or a methodology or a tool. But that comes with experimentation, so people need to get out there and try something. >> Maybe we could talk about developers a little bit. We were talking to Dinesh earlier and you guys of course have focused on data scientists, data engineers, obviously developers. And Dinesh was saying, look, many, if not most, of the 10 million Java developers out there, they're not, like, focused around the data. That's really the data scientist's job. But then, my colleague John Furrier says, hey, data is the new development kit. You know, somebody said recently, you know, Andreessen's comment, "software is eating the world." Well, data is eating software. So if Furrier is right and that comment is right, it seems like developers increasingly have to become more data aware, fundamentally. Blockchain developers clearly are more data focused. What's your take on the developer community, where they fit into this whole AI, machine learning space? >> I was just in Las Vegas yesterday and I did a session with a bunch of our business partners. ISVs, so software companies, mostly a developer audience, and the discussion I had with them was around, you're doing, you're building great products, you're building great applications. But your product is only as good as the data and the intelligence that you embed in your product. Because you're still putting too much of a burden on the user, as opposed to having everything happen magically, if you will. 
So that discussion was around, how do you embed data, embed AI, into your products and do that at the forefront versus, you deliver a product and the client has to say, all right, now I need to get my data out of this application and move it somewhere else so I can do the data science that I want to do. That's what I see happening with developers. It's kind of ... getting them to think about data as opposed to just thinking about the application development framework, because that's where most of them tend to focus. >> Mm, right. >> Well, we've talked about, well, earlier on about the governance, so just curious, with Madhu, which I'll, we'll have that interview in just a little bit here. I'm kind of curious about your take on that, is that it's a little kinder, gentler, friendlier than maybe some might look at it nowadays because of some organization that it causes, within your group and some value that's being derived from that, that more efficiency, more contextual information that's, you know, more relevant, whatever. When you talk to your clients about meeting rules, regs, GDPR, all these things, how do you get them to see that it's not a black veil of doom and gloom but it really is, really more of an opportunity for them to cash in? >> You know, my favorite question to ask when I go visit clients is I say, I say, just show of hands, how many people have all the data they need to do their job? To date, nobody has ever raised their hand. >> Not too many hands up. >> The reason I phrased it that way is, that's fundamentally a governance challenge. And so, when you think about governance, I think everybody immediately thinks about compliance, GDPR, types of things you mentioned, and that's great. But there's two use cases for governance. One is compliance, the other one is self service analytics. 
Because if you've done data governance, then you can make your data available to everybody in the organization because you know you've got the right rules, the right permissions set up. That will change how people do their jobs and I think sometimes governance gets painted into a compliance corner, when organizations need to think about it as, this is about making data accessible to my entire workforce. That's a big change. I don't think anybody has that today. Except for the clients that we're working with, where I think we've made good strides in that. >> What's your sort of number one, two, and three, or pick one, advice for those companies that as you blogged about, don't realize yet that they're in the software business and the technology business? For them to close the ... machine intelligence, machine learning, AI gap, where should they start? >> I do think it can be basic steps. And the reason I say that is, if you go to a company that hasn't really viewed themselves as a technology company, and you start talking about machine intelligence, AI, like, everybody like, runs away scared, like it's not interesting. So I bring it back to building blocks. For a client to be great in data, and to become a technology company, you really need three platforms for how you think about data. You need a platform for how you manage your data, so think of it as data management. You need a platform for unified governance and integration, and you need a platform for data science and business analytics. And to some extent, I don't care where you start, but you've got to start with one of those. And if you do that, you know, you'll start to create a flywheel of momentum where you'll get some small successes. Then you can go in the other area, and so I just encourage everybody, start down that path. Pick one of the three. Or you may already have something going in one of them, so then pick one where you don't have something going. 
Just start down the path, because once you have those building blocks in place, you'll be able to scale AI and ML in your organization in the future. But without that, you're always going to be limited to kind of a use case at a time. >> Yeah, and I would add, you talked about it a couple times today, that cultural aspect: the realization that in order to be data driven, you know, buzzword, you have to embrace that and drive it through the culture. Right? >> That starts at the top, right? Which is, it's not, you know, it's not normal to have a culture of, we're going to experiment, we're going to try things, half of them may not work. And so, it starts at the top in terms of how you set the tone and set that culture. >> IBM Think, we're less than a month away. theCUBE is going to be there, very excited about that. First time that you guys have done Think. You've consolidated all your big, big events. What can we expect from you guys? >> I think it's going to be an amazing show. To your point, we thought about this for a while, consolidating to a single IBM event. There's no question, just based on the response and the enrollment we have so far, that was the right answer. We'll have people from all over the world. A bunch of clients, and we've got some great announcements that will come out that week. And for clients that are thinking about coming, honestly the best thing about it is all the education and training. We basically build a curriculum, think of it as a curriculum around how do we make our clients more effective at competing with the Amazons of the world, back to the other point. And so I think we've built a great curriculum and it will be a great week. >> Well, if I've heard anything today, it's about, don't be afraid to dive in at the deep end, just dive, right? Get after it. Looking forward to the rest of the day. Rob, thank you for joining us here and we'll see you in about a month! >> Sounds great. >> Right around the corner.
>> All right, Rob Thomas joining us here from IBM Analytics, the GM at IBM Analytics. Back with more here on theCUBE. (upbeat music)

Published Date : Feb 27 2018



Data Science for All: It's a Whole New Game


 

>> There's a movement that's sweeping across businesses everywhere here in this country and around the world. And it's all about data. Today businesses are being inundated with data, to the tune of over two and a half million gigabytes that'll be generated in the next 60 seconds alone. What do you do with all that data? To extract insights you typically turn to a data scientist. But not necessarily anymore. At least not exclusively. Today the ability to extract value from data is becoming a shared mission. A team effort that spans the organization, extending far more widely than ever before. Today, data science is being democratized. >> Data Science for All: It's a Whole New Game. >> Welcome everyone, I'm Katie Linendoll. I'm a technology expert and writer and I love reporting on all things tech. My fascination with tech started very young. I began coding when I was 12. Received my networking certs by 18 and a degree in IT and new media from Rochester Institute of Technology. So as you can tell, technology has always been a true passion of mine. Having grown up in the digital age, I love having a career that keeps me at the forefront of science and technology innovations. I spend equal time in the field being hands-on as I do on my laptop conducting in-depth research. Whether I'm diving underwater with NASA astronauts, witnessing the new ways in which mobile technology can help rebuild the Philippines' economy in the wake of super typhoons, or sharing a first look at the newest iPhones on The Today Show yesterday, I'm always on the hunt for the latest and greatest tech stories. And that's what brought me here. I'll be your host for the next hour as we explore the new phenomenon that is taking businesses around the world by storm, as data science continues to become democratized and extend beyond the domain of the data scientist, and why there's also a mandate for all of us to become data literate now that data science for all drives our AI culture.
And we're going to be able to take to the streets and go behind the scenes as we uncover the factors that are fueling this phenomenon and giving rise to a movement that is reshaping how businesses leverage data. And putting organizations on the road to AI. So coming up, I'll be doing interviews with data scientists. We'll see real world demos and take a look at how IBM is changing the game with an open data science platform. We'll also be joined by legendary statistician Nate Silver, founder and editor-in-chief of FiveThirtyEight. Who will shed light on how a data driven mindset is changing everything from business to our culture. We also have a few people who are joining us in our studio, so thank you guys for joining us. Come on, I can do better than that, right? Live studio audience, the fun stuff. And for all of you during the program, I want to remind you to join that conversation on social media using the hashtag DSforAll, it's data science for all. Share your thoughts on what data science and AI means to you and your business. And, let's dive into a whole new game of data science. Now I'd like to welcome my co-host General Manager IBM Analytics, Rob Thomas. >> Hello, Katie. >> Come on guys. >> Yeah, seriously. >> No one's allowed to be quiet during this show, okay? >> Right. >> Or, I'll start calling people out. So Rob, thank you so much. I think you know this conversation, we're calling it a data explosion happening right now. And it's nothing new. And when you and I chatted about it. You've been talking about this for years. You have to ask, is this old news at this point? >> Yeah, I mean, well first of all, the data explosion is not coming, it's here. And everybody's in the middle of it right now. What is different is the economics have changed. And the scale and complexity of the data that organizations are having to deal with has changed. And to this day, 80% of the data in the world still sits behind corporate firewalls. So, that's becoming a problem. 
It's becoming unmanageable. IT struggles to manage it. The business can't get everything they need. Consumers can't consume it when they want. So we have a challenge here. >> It's a challenging world of unmanageable, crazy complexity. If I'm sitting here as an IT manager of my business, I'm probably thinking to myself, this is incredibly frustrating. How in the world am I going to get control of all this data? And it's probably not just me thinking it. Many individuals here as well. >> Yeah, indeed. Everybody's thinking about how am I going to put data to work in my organization in a way I haven't done before. Look, you've got to have the right expertise, the right tools. The other thing that's happening in the market right now is clients are dealing with multicloud environments. So data behind the firewall in private cloud, multiple public clouds. And they have to find a way. How am I going to pull meaning out of this data? And that brings us to data science and AI. That's how you get there. >> I understand the data science part, but I think we're all starting to hear more about AI. And it's incredible how this buzzword has taken off. How do businesses adapt to this AI growth and boom and trend that's happening in this world right now? >> Well, let me define it this way. Data science is a discipline. And machine learning is one technique. And then AI puts machine learning into practice and applies it to the business. So this is really about getting your business where it needs to go. And to get to an AI future, you have to lay a data foundation today. I love the phrase, "there's no AI without IA." That means you're not going to get to AI unless you have the right information architecture to start with. >> Can you elaborate though in terms of how businesses can really adopt AI and get started? >> Look, I think there are four things you have to do if you're serious about AI. One is you need a strategy for data acquisition.
Two is you need a modern data architecture. Three is you need pervasive automation. And four is you've got to expand job roles in the organization. >> Data acquisition, the first pillar you just discussed. Can we start there and explain why it's so critical in this process? >> Yeah, so let's think about how data acquisition has evolved through the years. 15 years ago, data acquisition was about how do I get data in and out of my ERP system? And that was pretty much solved. Then the mobile revolution happened. And suddenly you've got structured and unstructured data. More than you've ever dealt with. And now you get to where we are today. You're talking terabytes, petabytes of data. >> [Katie] Yottabytes, I heard that word the other day. >> I heard that too. >> Didn't even know what it meant. >> You know how many zeros that is? >> I thought we were in Star Wars. >> Yeah, I think it's a lot of zeroes. >> Yodabytes, it's new. >> So, it's becoming more and more complex in terms of how you acquire data. So that's the new data landscape that every client is dealing with. And if you don't have a strategy for how you acquire that and manage it, you're not going to get to that AI future. >> So a natural segue, if you are one of these businesses, how do you build for the data landscape? >> Yeah, so the question I always hear from customers is we need to evolve our data architecture to be ready for AI. And the way I think about that is it's really about moving from static data repositories to more of a fluid data layer. >> And we continue with the architecture. New data architecture is an interesting buzzword to hear. But it's also one of the four pillars. So if you could dive in there. >> Yeah, I mean it's a new twist on what I would call some core data science concepts. For example, you have to leverage tools with a modern, centralized data warehouse. But your data warehouse can't be stagnant, limited to just what's right there.
So you need a way to federate data across different environments. You need to be able to bring your analytics to the data because it's most efficient that way. And ultimately, it's about building an optimized data platform that is designed for data science and AI. Which means it has to be a lot more flexible than what clients have had in the past. >> All right. So we've laid out what you need for driving automation. But where does the machine learning kick in? >> Machine learning is what gives you the ability to automate tasks. And I think about machine learning. It's about predicting and automating. And this will really change the roles of data professionals and IT professionals. For example, a data scientist cannot possibly know every algorithm or every model that they could use. So we can automate the process of algorithm selection. Another example is things like automated data matching. Or metadata creation. Some of these things may not be exciting but they're hugely practical. And so when you think about the real use cases that are driving return on investment today, it's things like that. It's automating the mundane tasks. >> Let's go ahead and come back to something that you mentioned earlier because it's fascinating to be talking about this AI journey, but also significant is the new job roles. And what are those other participants in the analytics pipeline? >> Yeah I think we're just at the start of this idea of new job roles. We have data scientists. We have data engineers. Now you see machine learning engineers. Application developers. What's really happening is that data scientists are no longer allowed to work in their own silo. And so the new job roles is about how does everybody have data first in their mind? And then they're using tools to automate data science, to automate building machine learning into applications. So roles are going to change dramatically in organizations. 
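The automated algorithm selection Rob describes can be made concrete with a short sketch. This is an illustrative example using scikit-learn on synthetic data, not the specific automation IBM ships; the candidate models and dataset are assumptions chosen for demonstration:

```python
# Minimal sketch of automated algorithm selection: score several candidate
# models with cross-validation and keep the best one automatically.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real training set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for every candidate.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}

# The "automation": the machine, not the data scientist, picks the winner.
best_name = max(scores, key=scores.get)
print(best_name, round(scores[best_name], 3))
```

The same pattern extends to the other mundane tasks Rob mentions, like hyperparameter search: enumerate candidates, score them with a consistent metric, and let the machine choose.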
>> I think that's confusing though, because we have several organizations asking: is that a highly specialized role, just for data scientists? Or is it applicable to everybody across the board? >> Yeah, and that's the big question, right? 'Cause everybody's thinking how will this apply? Do I want this to be just a small set of people in the organization that will do this? But, our view is data science has to be for everybody. It's about bringing data science to everybody as a shared mission across the organization. Everybody in the company has to be data literate. And participate in this journey. >> So overall, group effort, has to be a common goal, and we all need to be data literate across the board. >> Absolutely. >> Done deal. But at the end of the day, it's kind of not an easy task. >> It's not. It's not easy but it's maybe not as big of a shift as you would think. Because you have to put data in the hands of people that can do something with it. So, it's very basic. Give access to data. Data's often locked up in a lot of organizations today. Give people the right tools. Embrace the idea of choice or diversity in terms of those tools. That gets you started on this path. >> It's interesting to hear you say essentially you need to train everyone though across the board when it comes to data literacy. And I think people that are coming into the workforce don't necessarily have a background or a degree in data science. So how do you manage? >> Yeah, so in many cases that's true. I will tell you some universities are doing amazing work here. One example, University of California Berkeley. They offer a course for all majors. So no matter what you're majoring in, you have a course on foundations of data science. How do you bring data science to every role? So it's starting to happen. We at IBM provide data science courses through CognitiveClass.ai. It's for everybody. It's free.
And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. The key point is this though. It's more about attitude than it is aptitude. I think anybody can figure this out. But it's about the attitude to say we're putting data first and we're going to figure out how to make this real in our organization. >> I also have to give a shout-out to my alma mater, because I have heard that there is an MS offering in data analytics. And they are always at the forefront of new technologies and new majors and on trend. And I've heard that job placement for people graduating with the MS is high. >> I'm sure it's very high. >> So go Tigers. All right, tangential. Let me get back to something else you touched on earlier, because you mentioned that a number of customers ask you how in the world do I get started with AI? It's an overwhelming question. Where do you even begin? What do you tell them? >> Yeah, well things are moving really fast. But the good thing is most organizations I see, they're already on the path, even if they don't know it. They might have a BI practice in place. They've got data warehouses. They've got data lakes. Let me give you an example. AMC Networks. They produce a lot of the shows that I'm sure you watch, Katie. >> [Katie] Yes, Breaking Bad, Walking Dead, any fans? >> [Rob] Yeah, we've got a few. >> [Katie] Well you taught me something I didn't even know. Because it's amazing how we have all these different industries, but yet media in itself is impacted too. And this is a good example. >> Absolutely. So, AMC Networks, think about it. They've got ads to place. They want to track viewer behavior. What do people like? What do they dislike? So they have to optimize every aspect of their business, from marketing campaigns to promotions to scheduling to ads.
And their goal was to transform data into business insights and really take the burden off their IT team, which was heavily burdened by obviously a huge increase in data. So their VP of BI took the approach of using machine learning to process large volumes of data. They used a platform that was designed for AI and data processing. It's the IBM analytics system: a data warehouse with data science tools built in. It has in-memory data processing. And just like that, they were ready for AI. And they're already seeing that impact in their business. >> Do you think a movement of that nature kind of presses other media conglomerates and organizations to say we need to be doing this too? >> I think it's inevitable for everybody: you're either going to be leading, or you'll be playing catch-up. And so, as we talk to clients we think about how do you start down this path now, even if you have to iterate over time? Because otherwise you're going to wake up and you're going to be behind. >> One thing worth noting is we've talked about bringing analytics to the data. It's analytics first to the data, not the other way around. >> Right. So, look. We as a practice, we say you want to bring analytics to where the data sits. Because it's a lot more efficient that way. It gets you better outcomes in terms of how you train models and it's more efficient. And we think that leads to better outcomes. Other organizations will say, "Hey, move the data around." And everything becomes a big data movement exercise. But once an organization has started down this path, they're starting to get predictions, they want to do it where it's really easy. And that means analytics applied right where the data sits.
Like I want to see the first data scientist, female preferred, on the cover of Vogue. That would be amazing. >> Perhaps you can. >> People agree. So what changes for them? Is this challenging in terms of, we talk about data science for all, but is it really data science for everyone? And how does it change everything? >> Well, I think of it this way. AI gives software superpowers. It really does. It changes the nature of software. And at the center of that is data scientists. So, a data scientist has a set of powers that they've never had before in any organization. And that's why it's a hot profession. Now, on one hand, this has been around for a while. We've had actuaries. We've had statisticians that have really transformed industries. But there are a few things that are new now. We have new tools. New languages. Broader recognition of this need. And while it's important to recognize this critical skill set, you can't just limit it to a few people. This is about scaling it across the organization. And truly making it accessible to all. >> So then do we need more data scientists? Or is this something you train, like you said, across the board? >> Well, I think you want to do a little bit of both. We want more. But, we can also train more and make the ones we have more productive. The way I think about it is there are kind of two markets here. And we call it clickers and coders. >> [Katie] I like that. That's good. >> So, let's talk about what that means. So clickers are basically people who want to use tools. Create models visually. It's drag and drop. Something that's very intuitive. Those are the clickers. Nothing wrong with that. It's been valuable for years. There's a new crop of data scientists. They want to code. They want to build with the latest open source tools. They want to write in Python or R. These are the coders. And both approaches are viable. Both approaches are critical.
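To make the "coder" side of Rob's clickers-and-coders split concrete, here is a minimal open-source Python workflow of the kind he describes. The dataset and model are illustrative choices, not tied to any particular platform or IBM tooling:

```python
# A typical "coder" workflow: load a dataset, split it, train a model,
# and evaluate on held-out data, entirely in open-source Python.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Bundled example dataset, standing in for real business data.
df = load_breast_cancer(as_frame=True).frame

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"), df["target"],
    test_size=0.2, random_state=0)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {acc:.3f}")
```

A "clicker" would produce an equivalent model through a drag-and-drop interface; the point Rob is making is that both routes should land in the same shared environment.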
Organizations have to have a way to meet the needs of both of those types. And there aren't a lot of tools available today that do that. >> Well let's keep going on that. Because I hear you talking about the data scientist's role and how it's critical to success, but with the new tools, data science and analytics skills can extend beyond the domain of just the data scientist. >> That's right. So look, we're unifying coders and clickers into a single platform, which we call IBM Data Science Experience. And as the demand for data science expertise grows, so does the need for these kinds of tools. To bring them into the same environment. And my view is if you have the right platform, it enables the organization to collaborate. And suddenly you've changed the nature of data science from an individual sport to a team sport. >> So as somebody whose background is in IT, the question is really, is this an additional piece of what IT needs to do in 2017 and beyond? Or is it just another line item in the budget? >> So I'm afraid that some people might view it that way. As just another line item. But, I would challenge that and say data science is going to reinvent IT. It's going to change the nature of IT. And every organization needs to think about what are the skills that are critical? How do we engage a broader team to do this? Because once they get there, this is the chance to reinvent how they're performing IT. >> [Katie] Challenging or not? >> Look, it's all a big challenge. Think about everything IT organizations have been through. Some of them were late to things like mobile, but then they caught up. Some were late to cloud, but then they caught up. I would just urge people, don't be late to data science. Use this as your chance to reinvent IT. Start with this notion of clickers and coders. This is a seminal moment. Much like mobile and cloud were. So don't be late.
And Rob and I were even chatting earlier about how data analytics is just moving into all different kinds of industries. And I can tell you I've even personally been affected by how important the analysis is, working in pediatric cancer for the last seven years. I personally bring virtual reality headsets to pediatric cancer hospitals across the country. And it's great. And it's working phenomenally. And the kids are amazed. And the staff is amazed. But phase two of this project is putting little sensors in the hardware that gather metrics like breathing and heart rate, to show that we have data. Proof that we can hand over to the hospitals to continue making this program a success. So just in-- >> That's a great example. >> An interesting example. >> Saving lives? >> Yes. >> That's also applying a lot of what we talked about. >> Exciting stuff in the world of data science. >> Yes. Look, I'd just add, this is an existential moment for every organization. Because what you do in this area is probably going to define how competitive you are going forward. And think about if you don't do something. What if one of your competitors goes and creates an application that's more engaging with clients? So my recommendation is start small. Experiment. Learn. Iterate on projects. Define the business outcomes. Then scale up. It's very doable. But you've got to take the first step. >> First step, always critical. And now we're going to get to the fun hands-on part of our story. Because in just a moment we're going to take a closer look at what data science can deliver. And where organizations are trying to get to. All right. Thank you, Rob, and now we've been joined by Siva Anne, who is going to help us navigate this demo. First, welcome Siva. Give him a big round of applause. Yeah. All right, Rob, break down what we're going to be looking at. You take over this demo. >> All right. So this is going to be pretty interesting. So Siva is going to take us through.
So he's going to play the role of a financial adviser who wants to help better serve clients through recommendations. And I'm going to really illustrate three things. One is how do you federate data from multiple data sources? Inside the firewall, outside the firewall. How do you apply machine learning to predict and to automate? And then how do you move analytics closer to your data? So, what you're seeing here is a custom application for an investment firm. So, Siva, our financial adviser, welcome. So you can see at the top, we've got market data. We pulled that from an external source. And then we've got Siva's calendar in the middle. He's got clients on the right side. So page down, what else do you see down there, Siva? >> [Siva] I can see the recent market news. And in here I can see that JP Morgan is calling for a US dollar rebound in the second half of the year. And I have an upcoming meeting with Leo Rakes. I can get-- >> [Rob] So let's go in there. Why don't you click on Leo Rakes? So, you're sitting at your desk, you're deciding how you're going to spend the day. You know you have a meeting with Leo. So you click on it. You immediately see, all right, so what do we know about him? We've got data governance implemented. So we know his age, we know his degree. We can see he's not that aggressive of a trader. Only six trades in the last few years. But then where it gets interesting is you go to the bottom. You start to see predicted industry affinity. Where did that come from? How do we have that? >> [Siva] So these green lines and red arrows here indicate the trending affinity of Leo Rakes for particular industry stocks. What we've done here is we've built machine learning models using the customer's demographic data, his stock portfolios, and browsing behavior to build a model which can predict his affinity for a particular industry. >> [Rob] Interesting. So, I like to think of this, we call it celebrity experiences.
So how do you treat every customer like they're a celebrity? So to some extent, we're reading his mind. Because without asking him, we know that he's going to have an affinity for auto stocks. So we go down. Now we look at his portfolio. You can see, okay, he's got some different holdings. He's got Amazon, Google, Apple, and then he's got RACE, which is the ticker for Ferrari. You can see that's done incredibly well. And so, as a financial adviser, you look at this and you say, all right, we know he loves auto stocks. Ferrari's done very well. Let's create a hedge. Like what kind of security would interest him as a hedge against his position in Ferrari? Could we go figure that out? >> [Siva] Yes. Given I know that he's got an affinity for auto stocks, and I also see that Ferrari has got some tremendous gains, I want to lock in these gains by hedging. And I want to do that by picking an auto stock which has a negative correlation with Ferrari. >> [Rob] So this is where we get to the idea of in-database analytics. 'Cause you start clicking that and immediately we're getting instant answers of what's happening. So what did we find here? We're going to compare Ferrari and Honda. >> [Siva] I'm going to compare Ferrari with Honda. And what I see here instantly is that Honda has got a negative correlation with Ferrari, which makes it a perfect mix for his stock portfolio. Given he has an affinity for auto stocks and it correlates negatively with Ferrari. >> [Rob] These are very powerful tools in the hands of a financial adviser. You think about it. As a financial adviser, you wouldn't think about federating data, machine learning, pretty powerful. >> [Siva] Yes. So what we have seen here is that using the common SQL engine, we've been able to federate queries across multiple data sources: Db2 Warehouse in the cloud, IBM's Integrated Analytic System, and a Hortonworks-powered Hadoop platform.
We've been able to use machine learning to derive innovative insights about his stock affinities. And drive the machine learning into the appliance, closer to where the data resides, to deliver high-performance analytics. >> [Rob] At scale? >> [Siva] We're able to run millions of these correlations across stocks, currency, other factors. And even score hundreds of customers for their affinities on a daily basis. >> That's great. Siva, thank you for playing the role of financial adviser. So I just want to recap briefly. 'Cause this is really powerful technology that's really simple. So we federated, we aggregated multiple data sources from all over the web and internal systems. And public cloud systems. Machine learning models were built that predicted Leo's affinity for a certain industry. In this case, automotive. And then you see when you deploy analytics next to your data, even a financial adviser, just with the click of a button, is getting instant answers so they can go be more productive in their next meeting. This whole idea of celebrity experiences for your customer, that's available for everybody, if you take advantage of these types of capabilities. Katie, I'll hand it back to you. >> Good stuff. Thank you, Rob. Thank you, Siva. Powerful demonstration on what we've been talking about all afternoon. And thank you again to Siva for helping us navigate. Should we give him one more round of applause? We're going to be back in just a moment to look at how we operationalize all of this data. But first, here's a message from me. If you're a part of a line of business, your main fear is disruption. You know data is the new gold that can create huge amounts of value. So does your competition. And they may be beating you to it. You're convinced there are new business models and revenue sources hidden in all the data. You just need to figure out how to leverage it. But with the scarcity of data scientists, you really can't rely solely on them.
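Before moving on, the hedging step in the demo is worth making concrete: it boils down to a correlation check over return series. Below is a toy sketch with synthetic returns, where the RACE and HMC columns stand in for the Ferrari and Honda positions; the real system runs this in-database, across millions of correlations, on historical data rather than simulated numbers:

```python
# Toy version of the demo's hedge check: compute the correlation between
# two stocks' daily returns and flag a negative correlation as a hedge
# candidate. All prices here are synthetic, purely for illustration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 250  # roughly one trading year of daily returns
market = rng.normal(0, 0.01, n)  # shared market factor

# Synthetic returns: "RACE" follows the market factor, "HMC" leans against it.
returns = pd.DataFrame({
    "RACE": 0.9 * market + rng.normal(0, 0.005, n),
    "HMC": -0.8 * market + rng.normal(0, 0.005, n),
})

corr = returns["RACE"].corr(returns["HMC"])
print(f"RACE/HMC return correlation: {corr:.2f}")
if corr < 0:
    print("negatively correlated -> candidate hedge")
```

The design point Siva makes still holds at any scale: because the computation is just a pairwise correlation, pushing it down to where the data sits is what makes scoring millions of pairs feasible.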
You may need more people throughout the organization that have the ability to extract value from data. And as a data science leader or data scientist, you have a lot of the same concerns. You spend way too much time looking for, prepping, and interpreting data and waiting for models to train. You know you need to operationalize the work you do to provide business value faster. What you want is an easier way to do data prep. And rapidly build models that can be easily deployed, monitored and automatically updated. So whether you're a data scientist, data science leader, or in a line of business, what's the solution? What'll it take to transform the way you work? That's what we're going to explore next. All right, now it's time to delve deeper into the nuts and bolts. The nitty gritty of operationalizing data science and creating a data driven culture. How do you actually do that? Well that's what these experts are here to share with us. I'm joined by Nir Kaldero, who's head of data science at Galvanize, which is an education and training organization. Tricia Wang, who is co-founder of Sudden Compass, a consultancy that helps companies understand people with data. And last, but certainly not least, Michael Li, founder and CEO of Data Incubator, which is a data science training company. All right guys. Shall we get right to it? >> All right. >> So data explosion happening right now. And we are seeing it across the board. I just shared an example of how it's impacting my philanthropic work in pediatric cancer. But you guys each have so many unique roles in your business life. How are you seeing it just blow up in your fields? Nir, your thoughts? >> Yeah, for example like in Galvanize we train many Fortune 500 companies. And just the demand of companies that want us to help them go through this digital transformation is mind-blowing. That's a data point by itself. >> Okay.
Well what we're seeing, what's going on, is that data science, as a theme, is actually for everyone now. But what's happening is that it's actually reaching non-technical people. But what we're seeing is that when non-technical people are implementing these tools, or coming at these tools without a baseline of data literacy, they're oftentimes using them in ways that distance themselves from the customer. Because they're implementing data science tools without a clear purpose, without a clear problem. And so what we do at Sudden Compass is that we work with companies to help them embrace and understand the complexity of their customers. Because oftentimes they are misusing data science to try and flatten their understanding of the customer. As if you can just do more traditional marketing, where you're putting people into boxes. And I think the whole ROI of data is that you can now understand people's relationships at a much more complex level, at a greater scale than before. But we have to do this with basic data literacy. And this has to involve technical and non-technical people. >> Well you can have all the data in the world, and I think it speaks to, if you're not doing the proper work with it, forget it. It means nothing at the same time. >> No absolutely. I mean, I think that when you look at the huge explosion in data, that comes with it a huge explosion in data experts. Right, we call them data scientists, data analysts. And sometimes they're people who are very, very talented, like the people here. But sometimes you have people who are maybe re-branding themselves, right? Trying to move up their title one notch to try to attract that higher salary. And I think that that's one of the things that customers are coming to us for, right? They're saying, hey look, there are a lot of people that call themselves data scientists, but we can't really distinguish.
So, we have sort of run a fellowship where we help companies hire from a really talented group of folks, who are also truly data scientists and who know all those kind of really important data science tools. And we also help companies internally. Fortune 500 companies who are looking to grow that data science practice that they have. And we help clients like McKinsey, BCG, Bain, train up their customers, also their clients, also their workers to be more data talented. And to build up that data science capability. >> And Nir, this is something you work with a lot. A lot of Fortune 500 companies. And when we were speaking earlier, you were saying many of these companies can be in a panic. >> Yeah. >> Explain that. >> Yeah, so you know, not all Fortune 500 companies are fully data driven. And we know that the winners in this fourth industrial revolution, which I like to call the machine intelligence revolution, will be companies who navigate and transform their organization to unlock the power of data science and machine learning. And the companies that are not like that, or don't utilize data science and predictive power well, will pretty much get shredded. So they are in a panic. >> Tricia, companies have to deal with data behind the firewall and in the new multi cloud world. How do organizations start to become data driven right to the core? >> I think the most urgent question that companies should be asking to become data driven is how do I bring the complex reality that our customers are experiencing on the ground into a corporate office? Into the data models. So that question is critical because that's how you actually prevent any big data disasters. And that's how you leverage big data. Because when your data models are really far from your human models, that's when you're going to do things that are really far off from how people actually behave. It's not going to feel right. That's when Tesco had their terrible big data disaster that they're still recovering from.
And so that's why I think it's really important to understand that when you implement big data, you have to further embrace thick data. The qualitative, the emotional stuff that is difficult to quantify. But then comes the difficult art and science that I think is the next level of data science. Which is getting non-technical and technical people together to ask, how do we find those unknown nuggets of insights that are difficult to quantify? Then, how do we do the next step of figuring out how to mathematically scale those insights into a data model, so that it actually is reflective of human understanding? And then we can start making decisions at scale. But you have to have that first. >> That's absolutely right. And I think that when we think about what it means to be a data scientist, right? I always think about it in these sort of three pillars. You have the math side. You have to have that kind of stats, hardcore machine learning background. You have the programming side. You don't work with small amounts of data. You work with large amounts of data. You've got to be able to type the code to make those computers run. But then the last part is that human element. You have to understand the domain expertise. You have to understand what it is that I'm actually analyzing. What's the business proposition? And how are the clients, how are the users actually interacting with the system? That human element that you were talking about. And I think having somebody who understands all of those, and not just in isolation, but is able to marry that understanding across those different topics, that's what makes a data scientist. >> But I find that we don't have people with those skill sets. And right now the way I see teams being set up inside companies is that they're creating these isolated data unicorns. These data scientists that have graduated from your programs, which are great. But they don't involve the people who are the domain experts.
They don't involve the designers, the consumer insight people, the salespeople. The people who spend time with the customers day in and day out. Somehow they're left out of the room. They're consulted, but they're not a stakeholder. >> Can I actually >> Yeah, yeah please. >> Can I actually give a quick example? So for example, we at Galvanize train the executives and the managers. And then the technical people, the data scientists and the analysts. But in order to actually see all of the ROI behind the data, you also have to have a creative, fluid conversation between non-technical and technical people. And this is a major trend now. And there's a major gap. And we need to increase awareness and kind of like create a new kind of environment where technical people also talk seamlessly with non-technical ones. >> [Tricia] We call-- >> That's one of the things that we see a lot. Is one of the trends in-- >> A major trend. >> data science training is it's not just for the data science technical experts. It's not just for one type of person. So a lot of the training we do is for data engineers. People who are more on the software engineering side learning more about the stats and math. And then people who are sort of traditionally on the stats side learning more about the engineering. And then managers and people who are data analysts learning about both. >> Michael, I think you said something that was of interest too, because I think we can look at IBM Watson as an example. And working in healthcare. The human component. Because oftentimes we talk about machine learning and AI and data, and you get worried that you still need that human component. Especially in the world of healthcare. And I think that's a very strong point when it comes to the data analysis side. Is there any particular example you can speak to of that?
>> So I think that there was this really excellent paper a while ago talking about all the neural net stuff and models trained on textual data. So looking at sort of different corpuses. And they found that these models were highly, highly sexist. They would read these corpuses and it's not because neural nets themselves are sexist. It's because they're reading the things that we write. And it turns out that we write kind of sexist things. And they would sort of find all these patterns in there that were sort of latent, that had a lot of sort of things that maybe we would cringe at if we saw them. And I think that's one of the really important aspects of the human element, right? It's being able to come in and sort of say like, okay, I know what the biases of the system are, I know what the biases of the tools are. I need to figure out how to use that to make the tools, make the world a better place. And another area where this comes up all the time is lending, right? So the federal government has said, and we have a lot of clients in the financial services space, so they're constantly under these kind of rules that they can't make discriminatory lending practices based on a whole set of protected categories. Race, sex, gender, things like that. But it's very easy when you train a model on credit scores to pick that up. And then to have a model that's inadvertently sexist or racist. And that's where you need the human element to come back in and say okay, look. The classic example would be zip code. You're using zip code as a variable. But when you look at it, zip code is actually highly correlated with race. And you can't do that. So you may, by sort of following the math and being a little naive about the problem, inadvertently introduce something really horrible into a model. And that's where you need a human element to step in and say, okay, hold on. Slow things down. This isn't the right way to go.
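The zip-code point can be made concrete with a toy example. Everything below is fabricated for illustration: even if a lending model never sees the protected attribute, a feature it does see, like zip code, can act as a proxy when the two are strongly associated.

```python
from collections import defaultdict

# Fabricated loan records: (zip_code, protected_group).
# The model would only be given zip_code, never the group.
records = [
    ("10001", "A"), ("10001", "A"), ("10001", "A"), ("10001", "B"),
    ("60629", "B"), ("60629", "B"), ("60629", "B"), ("60629", "A"),
]

# How well does zip code alone predict group membership?
by_zip = defaultdict(list)
for zip_code, group in records:
    by_zip[zip_code].append(group)

for zip_code, groups in sorted(by_zip.items()):
    majority = max(set(groups), key=groups.count)
    accuracy = groups.count(majority) / len(groups)
    # If guessing the majority group per zip code is this accurate,
    # zip code is leaking the protected attribute into the model.
    print(zip_code, majority, accuracy)
```

A basic audit like this, checking how predictable each protected attribute is from each candidate feature, is one simple way to catch the kind of inadvertent bias Michael describes before a model ships.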
>> And the people who have -- >> I feel like, I can feel her ready to respond. >> Yes, I'm ready. >> She's like let me have at it. >> And here it is. The people who are really great at providing that human intelligence are social scientists. We are trained to look for bias and to understand bias in data. Whether it's quantitative or qualitative. And I really think that we're going to have fewer of these kinds of problems if we had more integrated teams. If it was a mandate from leadership to say no data science team should be without a social scientist, ethnographer, or qualitative researcher of some kind, to be able to help see these biases. >> The talent piece is actually the most crucial-- >> Yeah. >> one here. If you look at how to enable machine intelligence in an organization, there are three pillars that I have in my head, which are the culture, the talent and the technology infrastructure. And I believe, and I saw in working very closely with Fortune 100 and 200 companies, that the talent piece is actually the most important, the most crucial, and the hardest to get. >> [Tricia] I totally agree. >> It's absolutely true. Yeah, no I mean I think that's sort of like how we came up with our business model. Companies were basically saying hey, I can't hire data scientists. And so we have a fellowship where we get 2,000 applicants each quarter. We take the top 2% and then we sort of train them up. And we work with hiring companies who then want to hire from that population. And so we're sort of helping them solve that problem. And the other half of it is really around training. Cause with a lot of industries, especially if you're sort of in a more regulated industry, there's a lot of nuances to what you're doing. And the fastest way to develop that data science or AI talent may not necessarily be to hire folks who are coming out of a PhD program.
It may be to take folks internally who have a lot of that domain knowledge that you have and get them trained up on those data science techniques. So we've had large insurance companies come to us and say hey look, we hire three or four folks from you a quarter. That doesn't move the needle for us. What we really need is to take the thousand actuaries and statisticians that we have and get all of them trained up to become data scientists and become data literate in this new open source world. >> [Katie] Go ahead. >> All right, ladies first. >> Go ahead. >> Are you sure? >> No please, go first. >> Go ahead. >> Go ahead Nir. >> So this is actually a trend that we have been seeing in the past year or so, that companies kind of like start to look at how to upskill and look for talent within the organization. So they can actually move them to become more literate and navigate 'em from analyst to data scientist. And from data scientist to machine learner. So this is actually a trend that has been happening already for a year or so. >> Yeah, but I also find that after they've gone through that training in getting people skilled up in data science, the next problem that I get is executives coming to say we've invested in all of this. We're still not moving the needle. We've already invested in the right tools. We've gotten the right skills. We have enough scale of people who have these skills. Why are we not moving the needle? And what I explain to them is look, you're still making decisions in the same way. And you're still not involving enough of the non-technical people. Especially from marketing. The CMOs are much more responsible for driving growth in their companies now. But oftentimes it's so hard to change the old way of marketing, which is still very segmentation based. You know, demographic variable based. And we're trying to move people to say no, you have to understand the complexity of customers and not put them in boxes.
>> And I think underlying a lot of this discussion is this question of culture, right? >> Yes. >> Absolutely. >> How do you build a data driven culture? And I think that that culture question, one of the ways it comes up quite often, especially in large Fortune 500 enterprises, is that they're not very comfortable with, for example, open source architecture. Open source tools. And there is some sort of residual bias that that's somehow dangerous. A security vulnerability. And I think that that's part of the cultural challenge that they often have in terms of how do I build a more data driven organization? Well, a lot of the talent really wants to use these kinds of tools. And I mean, just to give you an example, we are partnering with one of the major cloud providers to sort of help make open source tools more user friendly on their platform. So trying to help them attract the best technologists to use their platform, because they want, and they understand, the value of having that kind of open source technology work seamlessly on their platforms. So I think that just sort of goes to show you how important open source is in this movement. And how much large companies and Fortune 500 companies, and a lot of the ones we work with, have to embrace that. >> Yeah, and I'm seeing it in our work. Even when we're working with Fortune 500 companies, is that they've already gone through the first phase of data science work, which I explained was all about the tools and getting the right tools and architecture in place. And then companies started moving into getting the right skill set in place. Getting the right talent. And what you're talking about with culture is really where I think we're talking about the third phase of data science, which is looking at communication of these technical frameworks so that we can get non-technical people really comfortable in the same room with data scientists.
That is going to be the phase, that's really where I see the pain point. And that's why at Sudden Compass, we're really dedicated to working with each other to figure out how do we solve this problem now? >> And I think that communication between the technical stakeholders and management and leadership, that's a very critical piece of this. You can't have a successful data science organization without that. >> Absolutely. >> And I think that actually some of the most popular trainings we've had recently are for managers and executives who are looking to say, how do I become more data savvy? How do I figure out what is this data science thing, and how do I communicate with my data scientists? >> You guys made this way too easy. I was just going to get some popcorn and watch it play out. >> Nir, last 30 seconds. I want to leave you with an opportunity to add anything you want to this conversation. >> I think one thing to conclude is to say that for companies that are not data driven, it's about time to hit refresh and figure out how they transition the organization to become data driven. To become agile and nimble so they can actually seize the opportunities of this important industrial revolution. Otherwise, unfortunately, they will have a hard time surviving. >> [Katie] All agreed? >> [Tricia] Absolutely, you're right. >> Michael, Trish, Nir, thank you so much. Fascinating discussion. And thank you guys again for joining us. We will be right back with another great demo. Right after this. >> Thank you Katie. >> Once again, thank you for an excellent discussion. Weren't they great, guys? And thank you to everyone who's tuning in on the live webcast. As you can hear, we have an amazing studio audience here. And we're going to keep things moving. I'm now joined by Daniel Hernandez and Siva Anne. And we're going to turn our attention to how you can deliver on what they're talking about, using the Data Science Experience to do data science faster. >> Thank you Katie.
Siva and I are going to spend the next 10 minutes showing you how you can deliver on what they were saying, using the IBM Data Science Experience to do data science faster. We'll demonstrate, through new features we introduced this week, how teams can work together more effectively across the entire analytics life cycle. How you can take advantage of any and all data, no matter where it is and what it is. How you can use your favorite tools from open source. And finally, how you can build models anywhere and deploy them close to where your data is. Remember the financial adviser app Rob showed you? To build an app like that, we needed a team of data scientists, developers, data engineers, and IT staff to collaborate. We do this in the Data Science Experience through a concept we call projects. When I create a new project, I can now use the new Github integration feature. We're doing for data science what we've been doing for developers for years. Distributed teams can work together on analytics projects and take advantage of Github's version management and change management features. This is a huge deal. Let's explore the project we created for the financial adviser app. As you can see, our data engineer Joane, our developer Rob, and others are collaborating on this project. Joane got things started by bringing together the trusted data sources we need to build the app. Taking a closer look at the data, we see that our customer and profile data is stored on our recently announced IBM Integrated Analytics System, which runs safely behind our firewall. We also needed macroeconomic data, which she was able to find at the Federal Reserve. And she stored it in our Db2 Warehouse on Cloud. And finally, she selected stock news data from NASDAQ.com and landed that in a Hadoop cluster, which happens to be powered by Hortonworks.
We added a new feature to the Data Science Experience so that when it's installed with Hortonworks, it automatically uses the native security and governance controls within the cluster, so your data is always secure and safe. Now we want to show you the news data we stored in the Hortonworks cluster. This is the main administrative console. It's powered by an open source project called Ambari. And here's the news data. It's in parquet files stored in HDFS, which happens to be a distributed file system. To get the data from NASDAQ into our cluster, we used IBM's BigIntegrate and BigQuality to create automatic data pipelines that acquire, cleanse, and ingest that news data. Once the data's available, we use IBM's Big SQL to query that data using SQL statements that are much like the ones we would use for any relational data, including the data that we have in the Integrated Analytics System and Db2 Warehouse on Cloud. This, and the federation capabilities that Big SQL offers, dramatically simplify data acquisition. Now we want to show you how we support a brand new tool that we're excited about. Since we launched last summer, the Data Science Experience has supported Jupyter and R for data analysis and visualization. In this week's update, we deeply integrated another great open source project called Apache Zeppelin. It's known for having great visualization support, advanced collaboration features, and is growing in popularity amongst the data science community. This is an example of Apache Zeppelin and the notebook we created through it to explore some of our data. Notice how wonderful and easy the data visualizations are. Now we want to walk you through the Jupyter notebook we created to explore our customer preference for stocks. We use notebooks to understand and explore data. To identify the features that have some predictive power. Ultimately, we're trying to assess what is driving customer stock preference.
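The federation idea described here, one SQL statement spanning tables that live in different systems, can be shown in miniature with SQLite's ATTACH. This is only a toy stand-in for what Big SQL does across Db2, the Integrated Analytics System, and Hadoop; the table contents are invented:

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
wh_path = os.path.join(workdir, "warehouse.db")
lake_path = os.path.join(workdir, "lake.db")

# "Warehouse" source: customer profiles
wh = sqlite3.connect(wh_path)
wh.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
wh.execute("INSERT INTO customers VALUES (1, 'Leo')")
wh.commit()
wh.close()

# "Data lake" source: stock news
lake = sqlite3.connect(lake_path)
lake.execute("CREATE TABLE news (customer_id INTEGER, headline TEXT)")
lake.execute("INSERT INTO news VALUES (1, 'Ferrari posts record quarter')")

# Attach the warehouse and join across both sources in a single query
lake.execute("ATTACH DATABASE ? AS wh", (wh_path,))
rows = lake.execute(
    "SELECT c.name, n.headline "
    "FROM wh.customers AS c JOIN news AS n ON n.customer_id = c.id"
).fetchall()
print(rows)  # [('Leo', 'Ferrari posts record quarter')]
```

A federated engine applies the same principle at enterprise scale: the query is written once, and the engine handles the fact that the joined tables live in physically separate systems.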
Here we did the analysis to identify the attributes of customers that are likely to purchase auto stocks. We used this understanding to build our machine learning model. For building machine learning models, we've always had tools integrated into the Data Science Experience. But sometimes you need to use tools you already invested in. Like our very own SPSS, as well as SAS. Through a new import feature, you can easily import models created with those tools. This helps you avoid vendor lock-in, and simplifies the development, training, deployment, and management of all your models. To build the models we used in the app, we could have coded, but we prefer a visual experience. We used our customer profile data in the Integrated Analytics System, used the Auto Data Preparation to cleanse our data, chose the binary classification algorithms, and let the Data Science Experience evaluate between logistic regression and a gradient boosted tree. It's doing the heavy work for us. As you can see here, the Data Science Experience generated performance metrics that show us that the gradient boosted tree is the best performing algorithm for the data we gave it. Once we save this model, it's automatically deployed and available for developers to use. Any application developer can take this endpoint and consume it like they would any other API inside of the apps they build. We've made training and creating machine learning models super simple. But what about the operations? A lot of companies are struggling to ensure their model performance remains high over time. In our financial adviser app, we know that customer data changes constantly, so we need to always monitor model performance and ensure that our models are retrained as necessary. This is a dashboard that shows the performance of our models and lets our teams monitor and retrain those models so that they're always performing to our standards.
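The model bake-off described here, train logistic regression and a gradient boosted tree on the same data and let the metrics pick the winner, can be sketched with scikit-learn. This uses synthetic data and is not the Data Science Experience itself, only an illustration of the comparison step:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the customer profile data
X, y = make_classification(n_samples=600, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosted_tree": GradientBoostingClassifier(random_state=0),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    # Area under the ROC curve on held-out data
    scores[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

best = max(scores, key=scores.get)
print(best, {k: round(v, 3) for k, v in scores.items()})
```

The platform automates exactly this loop, generating the held-out performance metrics and surfacing the winning algorithm, so which model wins depends on the data rather than on a default choice.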
So far we've been showing you the Data Science Experience available behind the firewall that we're using to build and train models. Through a new publish feature, you can build models and deploy them anywhere. In another environment, private, public, or anywhere else, with just a few clicks. So here we're publishing our model to the Watson Machine Learning service. It happens to be in the IBM cloud, and is also deeply integrated with our Data Science Experience. After publishing and switching to the Watson Machine Learning service, you can see that our stock affinity model that we just published is there and ready for use. So this is incredibly important. I just want to say it again. The Data Science Experience allows you to train models behind your own firewall, take advantage of your proprietary and sensitive data, and then deploy those models wherever you want with ease. So to summarize what we just showed you. First, IBM's Data Science Experience supports all teams. You saw how our data engineer populated our project with trusted data sets. Our data scientists developed, trained, and tested a machine learning model. Our developers used APIs to integrate machine learning into their apps. And how IT can use our Integrated Model Management dashboard to monitor and manage model performance. Second, we support all data. On premises, in the cloud, structured, unstructured, inside of your firewall, and outside of it. We help you bring analytics and governance to where your data is. Third, we support all tools. The data science tools that you depend on are readily available and deeply integrated. This includes capabilities from great partners like Hortonworks. And powerful tools like our very own IBM SPSS. And fourth, and finally, we support all deployments. You can build your models anywhere, and deploy them right next to where your data is. Whether that's in the public cloud, private cloud, or even on the world's most reliable transaction platform, IBM z.
So see for yourself. Go to the Data Science Experience website, take us for a spin. And if you happen to be ready right now, our recently created Data Science Elite Team can help you get started and run experiments alongside you, at no charge. Thank you very much. >> Thank you very much Daniel. It seems like a great time to get started. And thanks to Siva for taking us through it. Rob and I will be back in just a moment to add some perspective right after this. All right, once again joined by Rob Thomas. And Rob, obviously we got a lot of information here. >> Yes, we've covered a lot of ground. >> This is intense. You've got to break it down for me, cause I think if we zoom out and see the big picture, what can better data science deliver to a business? Why is this so important? I mean, we've heard it through and through. >> Yeah, well, I heard it a couple times. But it starts with businesses have to embrace a data driven culture. And it is a change. And we need to make data accessible with the right tools in a collaborative culture, because we've got diverse skill sets in every organization. But data driven companies succeed when data science tools are in the hands of everyone. And I think that's a new thought. I think most companies think just get your data scientists some tools, you'll be fine. This is about tools in the hands of everyone. I think the panel did a great job of describing how we get to data science for all. Building a data culture, making it a part of your everyday operations, and the highlights of what Daniel just showed us. That's some pretty cool features for how organizations can get to this, which is you can see IBM's Data Science Experience, how that supports all teams. You saw data analysts, data scientists, application developers, IT staff, all working together. Second, you saw how we support all tools. And your choice of tools. So the most popular data science libraries integrated into one platform.
And we saw some new capabilities that help companies avoid lock-in, where you can import existing models created from specialist tools like SPSS or others, and then deploy them and manage them inside of Data Science Experience. That's pretty interesting. And lastly, you see we continue to build on the best of open tools. Partnering with companies like H2O, Hortonworks, and others. Third, you can see how you can use all data no matter where it lives. That's a key challenge every organization's going to face. Private, public, federating all data sources. We announced new integration with the Hortonworks data platform where we deploy machine learning models where your data resides. That's been a key theme. Analytics where the data is. And lastly, supporting all types of deployments. Deploy them in your Hadoop cluster. Deploy them in your Integrated Analytics System. Or deploy them in z, just to name a few. A lot of different options here. But look, don't believe anything I say. Go try it for yourself. Data Science Experience, anybody can use it. Go to datascience.ibm.com and look, if you want to start right now, we just created a team that we call Data Science Elite. These are the best data scientists in the world that will come sit down with you and co-create solutions, models, and prove out a proof of concept. >> Good stuff. Thank you Rob. So you might be asking, what does an organization look like that embraces data science for all? And how could it transform your role? I'm going to head back to the office and check it out. Let's start with the perspective of the line of business. What's changed? Well, now you're starting to explore new business models. You've uncovered opportunities for new revenue sources in all that hidden data. And being disrupted is no longer keeping you up at night. As a data science leader, you're beginning to collaborate with the line of business to better understand and translate the objectives into the models that are being built.
Your data scientists are also starting to collaborate with the less technical team members and analysts who are working closest to the business problem. And as a data scientist, you stop feeling like you're falling behind. Open source tools are keeping you current. You're also starting to operationalize the work that you do. And you get to do more of what you love. Explore data, build models, put your models into production, and create business impact. All in all, it's not a bad scenario. Thanks. All right. We are back, and coming up next, oh, this is a special time right now. Cause we've got a great guest speaker. New York Magazine called him the spreadsheet psychic and number crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential elections. He even invented a proprietary algorithm called PECOTA for predicting future performance by baseball players and teams. And his New York Times bestselling book, The Signal and the Noise, was named by Amazon.com as the number one best non-fiction book of 2012. He's currently the Editor in Chief of the award winning website FiveThirtyEight and appears on ESPN as an on air commentator. Big round of applause. My pleasure to welcome Nate Silver. >> Thank you. We met backstage. >> Yes. >> It feels weird to re-shake your hand, but you know, for the audience. >> I had to give the intense firm grip. >> Definitely. >> The ninja grip. So you and I have crossed paths kind of digitally in the past, which is really interesting. I started my career at ESPN as a production assistant, and then later was back on air covering sports technology. And I go to you to talk about sports because-- >> Yeah. >> Wow, has ESPN upped their game in terms of understanding the importance of data and analytics. And what it brings. Not just to MLB, but across the board. >> No, it's really infused into the way they present the broadcast. You'll have win probability on the bottom line.
And they'll incorporate FiveThirtyEight metrics into how they cover college football for example. So, ESPN ... Sports is maybe the perfect, if you're a data scientist, like the perfect kind of test case. And the reason being that sports consists of problems that have rules. And have structure. And when problems have rules and structure, then it's a lot easier to work with. So it's a great way to kind of improve your skills as a data scientist. Of course, there are also important real world problems that are more open ended, and those present different types of challenges. But it's such a natural fit. The teams. Think about the teams playing the World Series tonight. The Dodgers and the Astros are both like very data driven, especially Houston. Golden State Warriors, the NBA Champions, extremely data driven. New England Patriots, relative to an NFL team, it's shifted a little bit, the NFL bar is lower. But the Patriots are certainly very analytical in how they make decisions. So, you can't talk about sports without talking about analytics. >> And I was going to save the baseball question for later. Cause we are moments away from game seven. >> Yeah. >> Is everyone else watching game seven? It's been an incredible series. Probably one of the best of all time. >> Yeah, I mean-- >> You have a prediction here? >> You can mention that too. So I don't have a prediction. FiveThirtyEight has the Dodgers with a 60% chance of winning. >> [Katie] LA Fans. >> So you have two teams that are about equal. But the Dodgers pitching staff is in better shape at the moment. The end of a seven game series. And they're at home. >> But the statistics behind the two teams is pretty incredible. >> Yeah. It's like the first World Series in I think 56 years or something where you have two 100 win teams facing one another. There have been a lot of parity in baseball for a lot of years. Not that many offensive overall juggernauts. 
But this year, and last year with the Cubs and the Indians too really. But this year, you have really spectacular teams in the World Series. It kind of is a showcase of modern baseball. Lots of home runs. Lots of strikeouts. >> [Katie] Lots of extra innings. >> Lots of extra innings. Good defense. Lots of pitching changes. So if you love the modern baseball game, it's been about the best example that you've had. If you like a little bit more contact, and fewer strikeouts, maybe not so much. But it's been a spectacular and very exciting World Series. >> It's amazing to talk. MLB is huge with analysis. I mean, hands down. But across the board, if you can provide a few examples. Because there's so many teams in front offices putting such an, just a heavy intensity on the analysis side. And where the teams are going. And if you could provide any specific examples of teams that have really blown your mind. Especially over the last year or two. Because every year it gets more exciting if you will. >> I mean, so a big thing in baseball is defensive shifts. So if you watch tonight, you'll probably see a couple of plays where if you're used to watching baseball, a guy makes really solid contact. And there's a fielder there that you don't think should be there. But that's really very data driven where you analyze where this guy hits the ball. That part's not so hard. But also there's game theory involved. Because you have to adjust for the fact that he knows where you're positioning the defenders. He's trying therefore to make adjustments to his own swing and so that's been a major innovation in how baseball is played. You know, how bullpens are used too. Where teams have realized that actually having a guy, across all sports pretty much, realizing the importance of rest. And of fatigue. And that you can be the best pitcher in the world, but guess what? After four or five innings, you're probably not as good as a guy who has a fresh arm necessarily. 
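The defensive-shift logic Silver describes — look at where a hitter's batted balls actually go and station a fielder there — can be sketched in a few lines of Python. The spray-chart data below is invented for illustration; real front offices work from far richer tracking data:

```python
from collections import Counter

def shift_zone(spray_chart):
    """Return the field zone where this hitter's batted balls most
    often land, i.e. where the shifted fielder should stand."""
    zone, _count = Counter(spray_chart).most_common(1)[0]
    return zone

# Hypothetical pull-heavy hitter: most balls in play go to the right
# side, so the shift moves an infielder there.
balls_in_play = ["right", "right", "center", "right", "left", "right"]
print(shift_zone(balls_in_play))  # -> right
```

The game-theory wrinkle he mentions — the hitter adjusting his swing because he knows where the defenders are — is exactly what this naive counting version leaves out.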
So I mean, it really is like, these are not subtle things anymore. It's not just oh, on base percentage is valuable. It really effects kind of every strategic decision in baseball. The NBA, if you watch an NBA game tonight, see how many three point shots are taken. That's in part because of data. And teams realizing hey, three points is worth more than two, once you're more than about five feet from the basket, the shooting percentage gets really flat. And so it's revolutionary, right? Like teams that will shoot almost half their shots from the three point range nowadays. Larry Bird, who wound up being one of the greatest three point shooters of all time, took only eight three pointers his first year in the NBA. It's quite noticeable if you watch baseball or basketball in particular. >> Not to focus too much on sports. One final question. In terms of Major League Soccer, and now in NFL, we're having the analysis and having wearables where it can now showcase if they wanted to on screen, heart rate and breathing and how much exertion. How much data is too much data? And when does it ruin the sport? >> So, I don't think, I mean, again, it goes sport by sport a little bit. I think in basketball you actually have a more exciting game. I think the game is more open now. You have more three pointers. You have guys getting higher assist totals. But you know, I don't know. I'm not one of those people who thinks look, if you love baseball or basketball, and you go in to work for the Astros, the Yankees or the Knicks, they probably need some help, right? You really have to be passionate about that sport. Because it's all based on what questions am I asking? As I'm a fan or I guess an employee of the team. Or a player watching the game. And there isn't really any substitute I don't think for the insight and intuition that a curious human has to kind of ask the right questions. 
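The expected-value argument behind the three-point boom — three points is worth more than two once shooting percentages flatten out with distance — comes down to simple arithmetic. A minimal sketch, using illustrative percentages rather than actual NBA shooting figures:

```python
def expected_points(shot_value, make_prob):
    """Expected points per attempt for a given shot type."""
    return shot_value * make_prob

# Illustrative numbers: a 35% three-pointer yields more per attempt
# than a 40% long two, even though it misses more often.
long_two = expected_points(2, 0.40)  # 0.8 points per attempt
three = expected_points(3, 0.35)     # ~1.05 points per attempt
print(three > long_two)  # -> True
```

That per-attempt gap, compounded over a season of shots, is why teams now take nearly half their attempts from beyond the arc.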
So we can talk at great length about what tools do you then apply when you have those questions, but that still comes from people. I don't think machine learning could help with what questions do I want to ask of the data. It might help you get the answers. >> If you have a mid-fielder in a soccer game though, not exerting, only 80%, and you're seeing that on a screen as a fan, and you're saying could that person get fired at the end of the day? One day, with the data? >> So we found that actually some in soccer in particular, some of the better players are actually more still. So Leo Messi, maybe the best player in the world, doesn't move as much as other soccer players do. And the reason being that A) he kind of knows how to position himself in the first place. B) he realizes that you make a run, and you're out of position. That's quite fatiguing. And particularly soccer, like basketball, is a sport where it's incredibly fatiguing. And so, sometimes the guys who conserve their energy, that kind of old school mentality, you have to hustle at every moment. That is not helpful to the team if you're hustling on an irrelevant play. And therefore, on a critical play, can't get back on defense, for example. >> Sports, but also data is moving exponentially as we're just speaking about today. Tech, healthcare, every different industry. Is there any particular that's a favorite of yours to cover? And I imagine they're all different as well. >> I mean, I do like sports. We cover a lot of politics too. Which is different. I mean in politics I think people aren't intuitively as data driven as they might be in sports for example. It's impressive to follow the breakthroughs in artificial intelligence. It started out just as kind of playing games and playing chess and poker and Go and things like that. But you really have seen a lot of breakthroughs in the last couple of years. But yeah, it's kind of infused into everything really. 
>> You're known for your work in politics though. Especially presidential campaigns. >> Yeah. >> This year, in particular. Was it insanely challenging? What was the most notable thing that came out of any of your predictions? >> I mean, in some ways, looking at the polling was the easiest lens to look at it. So I think there's kind of a myth that last year's result was a big shock and it wasn't really. If you did the modeling in the right way, then you realized that number one, polls have a margin of error. And so when a candidate has a three point lead, that's not particularly safe. Number two, the outcome between different states is correlated. Meaning that it's not that much of a surprise that Clinton lost Wisconsin and Michigan and Pennsylvania and Ohio. You know I'm from Michigan. Have friends from all those states. Kind of the same types of people in those states. Those outcomes are all correlated. So what people thought was a big upset for the polls I think was an example of how data science done carefully and correctly where you understand probabilities, understand correlations. Our model gave Trump a 30% chance of winning. Other models gave him a 1% chance. And so that was interesting in that it showed that number one, that modeling strategies and skill do matter quite a lot. When you have someone saying 30% versus 1%. I mean, that's a very very big spread. And number two, that these aren't like solved problems necessarily. Although again, the problem with elections is that you only have one election every four years. So I can be very confident that I have a better model. Even one year of data doesn't really prove very much. Even five or 10 years doesn't really prove very much. And so, being aware of the limitations to some extent intrinsically in elections when you only get one kind of new training example every four years, there's not really any way around that. There are ways to be more robust to sparse data environments. 
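The point about correlated state outcomes is the crux of the 30%-versus-1% gap, and a toy Monte Carlo makes it concrete. Everything below — the lead, the number of states, the error sizes — is invented for illustration and is not the FiveThirtyEight model: the idea is that a shared polling error moves every state at once, so losing several "safe" states together is far more likely than independent errors would suggest.

```python
import random

def sweep_probability(lead, n_states, shared_sd, local_sd, trials=20000):
    """Chance the trailing candidate overcomes the same polling lead in
    every state. 'shared_sd' sizes one error hitting all states at once
    (a national polling miss); 'local_sd' sizes independent
    state-by-state errors."""
    random.seed(0)
    sweeps = 0
    for _ in range(trials):
        shared = random.gauss(0, shared_sd)
        if all(lead + shared + random.gauss(0, local_sd) < 0
               for _ in range(n_states)):
            sweeps += 1
    return sweeps / trials

# Treating three 3-point leads as independent makes a sweep look remote...
independent = sweep_probability(lead=3, n_states=3, shared_sd=0, local_sd=4)
# ...but with a shared error term, one polling miss flips them together.
correlated = sweep_probability(lead=3, n_states=3, shared_sd=3, local_sd=2)
print(correlated > independent)  # -> True
```

With no shared term, the joint flip probability is roughly the single-state probability cubed; the shared term pulls it back toward the single-state probability itself, which is the shape of the gap Silver describes between careful and careless models.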
But if you're identifying different types of business problems to solve, figuring out what's a solvable problem where I can add value with data science is a really key part of what you're doing. >> You're such a leader in this space. In data and analysis. It would be interesting to kind of peek back the curtain, understand how you operate but also how large is your team? How you're putting together information. How quickly you're putting it out. Cause I think in this right now world where everybody wants things instantly-- >> Yeah. >> There's also, you want to be first too in the world of journalism. But you don't want to be inaccurate because that's your credibility. >> We talked about this before, right? I think on average, speed is a little bit overrated in journalism. >> [Katie] I think it's a big problem in journalism. >> Yeah. >> Especially in the tech world. You have to be first. You have to be first. And it's just pumping out, pumping out. And there's got to be more time spent on stories if I can speak subjectively. >> Yeah, for sure. But at the same time, we are reacting to the news. And so we have people that come in, we hire most of our people actually from journalism. >> [Katie] How many people do you have on your team? >> About 35. But, if you get someone who comes in from an academic track for example, they might be surprised at how fast journalism is. That even though we might be slower than the average website, the fact that there's a tragic event in New York, are there things we have to say about that? A candidate drops out of the presidential race, are things we have to say about that. In periods ranging from minutes to days as opposed to kind of weeks to months to years in the academic world. The corporate world moves faster. What is a little different about journalism is that you are expected to have more precision where people notice when you make a mistake. In corporations, you have maybe less transparency. 
If you make 10 investments and seven of them turn out well, then you'll get a lot of profit from that, right? In journalism, it's a little different. If you make kind of seven predictions or say seven things, and seven of them are very accurate and three of them aren't, you'll still get criticized a lot for the three. Just because that's kind of the way that journalism is. And so the kind of combination of needing, not having that much tolerance for mistakes, but also needing to be fast. That is tricky. And I criticize other journalists sometimes including for not being data driven enough, but the best excuse any journalist has, this is happening really fast and it's my job to kind of figure out in real time what's going on and provide useful information to the readers. And that's really difficult. Especially in a world where literally, I'll probably get off the stage and check my phone and who knows what President Trump will have tweeted or what things will have happened. But it really is a kind of 24/7. >> Well because it's 24/7 with FiveThirtyEight, one of the most well known sites for data, are you feeling micromanagey on your people? Because you do have to hit this balance. You can't have something come out four or five days later. >> Yeah, I'm not -- >> Are you overseeing everything? >> I'm not by nature a micromanager. And so you try to hire well. You try and let people make mistakes. And the flip side of this is that if a news organization never had any mistakes, never had any corrections, that's wrong, right? You have to have some tolerance for error because you are trying to decide things in real time. And figure things out. I think transparency's a big part of that. Say here's what we think, and here's why we think it. If we have a model to say it's not just the final number, here's a lot of detail about how that's calculated. In some cases we release the code and the raw data. Sometimes we don't because there's a proprietary advantage. 
But quite often we're saying we want you to trust us and it's so important that you trust us, here's the model. Go play around with it yourself. Here's the data. And that's also I think an important value. >> That speaks to open source. And your perspective on that in general. >> Yeah, I mean, look, I'm a big fan of open source. I worry that I think sometimes the trends are a little bit away from open source. But by the way, one thing that happens when you share your data or you share your thinking at least in lieu of the data, and you can definitely do both is that readers will catch embarrassing mistakes that you made. By the way, even having open sourceness within your team, I mean we have editors and copy editors who often save you from really embarrassing mistakes. And by the way, it's not necessarily people who have a training in data science. I would guess that of our 35 people, maybe only five to 10 have a kind of formal background in what you would call data science. >> [Katie] I think that speaks to the theme here. >> Yeah. >> [Katie] That everybody's kind of got to be data literate. >> But yeah, it is like you have a good intuition. You have a good BS detector basically. And you have a good intuition for hey, this looks a little bit out of line to me. And sometimes that can be based on domain knowledge, right? We have one of our copy editors, she's a big college football fan. And we had an algorithm we released that tries to predict what the human being selection committee will do, and she was like, why is LSU rated so high? Cause I know that LSU sucks this year. And we looked at it, and she was right. There was a bug where it had forgotten to account for their last game where they lost to Troy or something and so -- >> That also speaks to the human element as well. >> It does. 
In general as a rule, if you're designing a kind of regression based model, it's different in machine learning where you have more, when you kind of build in the tolerance for error. But if you're trying to do something more precise, then so much of it is just debugging. It's saying that looks wrong to me. And I'm going to investigate that. And sometimes it's not wrong. Sometimes your model actually has an insight that you didn't have yourself. But fairly often, it is. And I think kind of what you learn is like, hey if there's something that bothers me, I want to go investigate that now and debug that now. Because the last thing you want is where all of a sudden, the answer you're putting out there in the world hinges on a mistake that you made. Cause you never know if you have so to speak, 1,000 lines of code and they all perform something differently. You never know when you get in a weird edge case where this one decision you made winds up being the difference between your having a good forecast and a bad one. In a defensible position and an indefensible one. So we definitely are quite diligent and careful. But it's also kind of knowing like, hey, where is an approximation good enough and where do I need more precision? Cause you could also drive yourself crazy in the other direction where you know, it doesn't matter if the answer is 91.2 versus 90. And so you can kind of go 91.2, three, four and it's like kind of A) false precision and B) not a good use of your time. So that's where I do still spend a lot of time is thinking about which problems are "solvable" or approachable with data and which ones aren't. And when they're not by the way, you're still allowed to report on them. We are a news organization so we do traditional reporting as well. And then kind of figuring out when do you need precision versus when is being pointed in the right direction good enough? 
>> I would love to get inside your brain and see how you operate on just like an everyday walking to Walgreens movement. It's like oh, if I cross the street in .2-- >> It's not, I mean-- >> Is it like maddening in there? >> No, not really. I mean, I'm like-- >> This is an honest question. >> If I'm looking for airfares, I'm a little more careful. But no, part of it's like you don't want to waste time on unimportant decisions, right? I will sometimes, if I can't decide what to eat at a restaurant, I'll flip a coin. If the chicken and the pasta both sound really good-- >> That's not high tech Nate. We want better. >> But that's the point, right? It's like both the chicken and the pasta are going to be really darn good, right? So I'm not going to waste my time trying to figure it out. I'm just going to have an arbitrary way to decide. >> Seriously, in business, how organizations in the last three to five years have just evolved with this data boom. How are you seeing it from a consultant point of view? Do you think it's an exciting time? Do you think it's a you must act now time? >> I mean, we do know that you definitely see a lot of talent among the younger generation now. That so FiveThirtyEight has been at ESPN for four years now. And man, the quality of the interns we get has improved so much in four years. The quality of the kind of young hires that we make straight out of college has improved so much in four years. So you definitely do see a younger generation for which this is just part of their bloodstream and part of their DNA. And also, particular fields that we're interested in. So we're interested in people who have both a data and a journalism background. We're interested in people who have a visualization and a coding background. A lot of what we do is very much interactive graphics and so forth. And so we do see those skill sets coming into play a lot more. 
And so the kind of shortage of talent that had I think frankly been a problem for a long time, I'm optimistic based on the young people in our office, it's a little anecdotal but you can tell that there are so many more programs that are kind of teaching students the right set of skills that maybe weren't taught as much a few years ago. >> But when you're seeing these big organizations, ESPN as perfect example, moving more towards data and analytics than ever before. >> Yeah. >> You would say that's obviously true. >> Oh for sure. >> If you're not moving that direction, you're going to fall behind quickly. >> Yeah and the thing is, if you read my book or I guess people have a copy of the book. In some ways it's saying hey, there are lot of ways to screw up when you're using data. And we've built bad models. We've had models that were bad and got good results. Good models that got bad results and everything else. But the point is that the reason to be out in front of the problem is so you give yourself more runway to make errors and mistakes. And to learn kind of what works and what doesn't and which people to put on the problem. I sometimes do worry that a company says oh we need data. And everyone kind of agrees on that now. We need data science. Then they have some big test case. And they have a failure. And they maybe have a failure because they didn't know really how to use it well enough. But learning from that and iterating on that. And so by the time that you're on the third generation of kind of a problem that you're trying to solve, and you're watching everyone else make the mistake that you made five years ago, I mean, that's really powerful. But that doesn't mean that getting invested in it now, getting invested both in technology and the human capital side is important. >> Final question for you as we run out of time. 2018 beyond, what is your biggest project in terms of data gathering that you're working on? >> There's a midterm election coming up. 
That's a big thing for us. We're also doing a lot of work with NBA data. So for four years now, the NBA has been collecting player tracking data. So they have 3D cameras in every arena. So they can actually kind of quantify for example how fast a fast break is, for example. Or literally where a player is and where the ball is. For every NBA game now for the past four or five years. And there hasn't really been an overall metric of player value that's taken advantage of that. The teams do it. But in the NBA, the teams are a little bit ahead of journalists and analysts. So we're trying to have a really truly next generation stat. It's a lot of data. Sometimes I now more oversee things than I once did myself. And so you're parsing through many, many, many lines of code. But yeah, so we hope to have that out at some point in the next few months. >> Anything you've personally been passionate about that you've wanted to work on and kind of solve? >> I mean, the NBA thing, I am a pretty big basketball fan. >> You can do better than that. Come on, I want something real personal that you're like I got to crunch the numbers. >> You know, we tried to figure out where the best burrito in America was a few years ago. >> I'm going to end it there. >> Okay. >> Nate, thank you so much for joining us. It's been an absolute pleasure. Thank you. >> Cool, thank you. >> I thought we were going to chat World Series, you know. Burritos, important. I want to thank everybody here in our audience. Let's give him a big round of applause. >> [Nate] Thank you everyone. >> Perfect way to end the day. And for a replay of today's program, just head on over to ibm.com/dsforall. I'm Katie Linendoll. And this has been Data Science for All: It's a Whole New Game. Test one, two. One, two, three. Hi guys, I just want to quickly let you know as you're exiting. A few heads up. Downstairs right now there's going to be a meet and greet with Nate. 
And we're going to be doing that with clients and customers who are interested. So I would recommend before the game starts, and you lose Nate, head on downstairs. And also the gallery is open until eight p.m. with demos and activations. And tomorrow, make sure to come back too. Because we have exciting stuff. I'll be joining you as your host. And we're kicking off at nine a.m. So bye everybody, thank you so much. >> [Announcer] Ladies and gentlemen, thank you for attending this evening's webcast. If you are not attending all cloud and cognitive summit tomorrow, we ask that you recycle your name badge at the registration desk. Thank you. Also, please note there are two exits on the back of the room on either side of the room. Have a good evening. Ladies and gentlemen, the meet and greet will be on stage. Thank you.

Published Date : Nov 1 2017



Data Science: Present and Future | IBM Data Science For All


 

>> Announcer: Live from New York City it's The Cube, covering IBM data science for all. Brought to you by IBM. (light digital music) >> Welcome back to data science for all. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, the director of data science at Comcast and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick, an analyst and contributor to Forbes and ZDNet. Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst at Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here. Bob Hayes with a killer Batman shirt on by the way, which we'll get him to explain in just a little bit. He runs Business over Broadway. And Joe Caserta, who is the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously as a practitioner you're very involved in the industry, and you're on the academic side as well. We mentioned Berkeley, NYU, deep experience. So I want you to kind of take your foot in both worlds and tell me about data science. I mean where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess, the state of data science? >> Yeah so I think that's a really interesting question. There are a lot of changes happening.
In part because data science has now become much more established, both on the academic side as well as in industry. So now you see some of the bigger problems coming out. People have managed to get data pipelines set up. But now there are these questions about models and accuracy and data integration. So the really cool stuff from the data science standpoint. We get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, being involved. UC Berkeley just did a big initiative announcing that they're going to offer data science to undergrads. So that's huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry one of the difficulties you're now having is businesses are now asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done and just a lot of opportunity. It's great because people now understand data science better, but I think data scientists have to really think about that seriously and take it seriously and really think about how am I actually getting a return, or adding value to the business? >> And there's a lot to be said, is there not, just in terms of increasing the workforce, the acumen, the training that's required now. It's still a relatively new discipline. So is there a shortage issue? Or is there just a great need? Is the opportunity there? I mean how would you look at that? >> Well I always think there's opportunity to be smart. If you can be smarter, you know it's always better. It gives you advantages in the workplace, it gives you an advantage in academia. The question is, can you actually do the work? The work's really hard, right? You have to learn all these different disciplines, you have to be able to technically understand data.
Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There are a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Like are people going to feel motivated enough and dedicate enough time to it to get very good at that skill set. And also of course, you know in terms of industry, will there be enough interest in the long term that there will be a financial motivation for people to stay in the field, right? So I think it's definitely a lot of opportunity. But that's always been there. Like I tell people I think of myself as a scientist and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion, you just kind of can't help yourself, but dig a little bit deeper, ask the questions, you can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, five years from now, or 10 years from now, I think that's something that's up for debate. >> So all of you have watched the evolution of data and how it affects organizations for a number of years now. If you go back to the days when the data warehouse was king, we had a lot of promises about 360 degree views of the customer and how we were going to be more anticipatory and more responsive. In many ways the decision support systems and the data warehousing world didn't live up to those promises. They solved other problems for sure. And so everybody was looking for big data to solve those problems. And they've begun to attack many of them. We talked earlier in The Cube today about fraud detection, it's gotten much, much better. Certainly retargeting of advertising has gotten better. But I wonder if you could comment, you know maybe start with Joe.
As to the effect that data and data science has had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed. But I think it was unfinished in order to achieve what we need done today. At the time I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems, have it in a single place for analytics. The big difference, I think, between data warehousing and data science is data warehouses were primarily made for consumption by human beings. To be able to have people look through some tool and be able to analyze data manually. That really doesn't work anymore, there's just too much data to do that. So that's why we need to build a science around it so that we can actually have machines doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right? It used to be very, you know you go back to when data warehouses started, you had to be a deep technologist in order to be able to collect the data, write the programs to clean the data. But now your average casual IT person can do that. Right now I think we're back in data science where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer, in order to do what we need to do, in order to make machines actually understand the data. But I think part of the evolution, we're just in the forefront. We're going to see over the next, not even years, within the next year I think, a lot of new innovation where the average person within business and definitely the average person within IT will be able to ask, "What are my sales going to be next year?" as easily as, "What were my sales last year?" Where now it's a big deal.
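[Editor's note: Joe's "what are my sales going to be next year?" question can be sketched in a few lines with an ordinary least-squares trend line. This is a hypothetical illustration with made-up numbers, not any particular vendor's predictive tooling.]

```python
# Hypothetical sketch: answering "what are my sales going to be next year?"
# with a simple least-squares trend line over historical yearly sales.

def fit_trend(years, sales):
    """Ordinary least-squares fit of sales = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(sales) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales))
    var = sum((x - mean_x) ** 2 for x in years)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(years, sales, next_year):
    """Extrapolate the fitted trend to a future year."""
    slope, intercept = fit_trend(years, sales)
    return slope * next_year + intercept

years = [2013, 2014, 2015, 2016, 2017]
sales = [1.0, 1.2, 1.5, 1.7, 2.0]  # in $M, made-up numbers

print(round(forecast(years, sales, 2018), 2))  # → 2.23
```

A real forecast would of course account for seasonality, uncertainty, and external drivers, which is exactly why the specialists Joe mentions are still needed today.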
Right now in order to do that you have to build some algorithms, you have to be a specialist in predictive analytics. And I think, you know as the tools mature, as people using data mature, and as the technology ecosystem for data matures, it's going to be easier and more accessible. >> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is, yes. >> You've written about and talked about. >> Yeah no question about it. We see this citizen data scientist. You know we talked about the democratization of data science, but the way we talk about analytics and warehousing and all the tools we had before, they generated a lot of insights and views on the information, but they didn't really give us the science part. And that's, I think, what's missing: the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing? Are we thinking about it strategically? Are we learning from it and then feeding that back into the process? I think that's the big difference between data science and the analytics side. But, you know just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers, now we can have tools that make these capabilities available to anyone. You know going back to what Joe said, I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data. Which is the life blood of our organizations, but we're not using it, you know this whole concept of dark data. And so I think the concept, or the promise of opening these tools up for everyone to be able to access those insights and activate them, I think that, you know, that's where it's headed. >> This is kind of where the T shirt comes in, right? So Bob if you would, so you've got this Batman shirt on.
We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a superhero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is he has the utility belt and he has these tools he can use to help him solve problems. For example he has the bat ring when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So as data professionals we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience. IBM Watson. These data pros can now use them as part of their utility belt and solve problems that they're confronted with. So if you're ever confronted with, like, a churn problem and you have somebody who has access to that data, they can put that into IBM Watson, ask a question and it'll tell you what's the key driver of churn. So it's not that you have to be superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward. >> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know with all of the billionaires in the world today, none of them has decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful, but for larger companies. Data warehousing is a very expensive proposition and it remains an expensive proposition. Something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500s are out there, of course, ever shifting.
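[Editor's note: the "key driver of churn" question Bob describes a moment earlier can be sketched without any particular product by ranking each customer attribute on the strength of its correlation with the churn flag. A toy illustration with made-up records; IBM Watson's actual analysis is far more sophisticated than this.]

```python
# Hypothetical sketch of a "key driver of churn" analysis: rank each
# feature by the absolute Pearson correlation with the churn flag.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Made-up customer records: (support_calls, monthly_spend, churned 0/1)
records = [
    (1, 90, 0), (0, 120, 0), (5, 40, 1), (4, 50, 1),
    (2, 80, 0), (6, 30, 1), (1, 100, 0), (5, 45, 1),
]
features = {
    "support_calls": [r[0] for r in records],
    "monthly_spend": [r[1] for r in records],
}
churn = [r[2] for r in records]

drivers = sorted(features, key=lambda f: abs(pearson(features[f], churn)),
                 reverse=True)
print(drivers[0])  # → support_calls
```

In this toy data, support calls correlate positively with churn and spend negatively; a single correlation pass like this is the "bat ring" version of the analysis, not the whole utility belt.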
But you have a lot of smaller companies, a lot of people with startups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to be able to go through, they want to innovate and experiment without having to go through the finance department. So there are all these open source tools available. There are cloud resources as well as open source tools. Hadoop of course being a prime example where you can work with the data and experiment with the data and practice data science at a very low cost. >> Dion mentioned the C word, citizen data scientist, last year at the panel. We had a conversation about that. And the data scientists on the panel generally were like, "Stop." Okay, we're not all of a sudden going to turn everybody into data scientists. However, what we want to do is get people thinking about data, more focused on data, becoming a data driven organization. I mean as a data scientist I wonder if you could comment on that. >> Well I think the other side of that is, you know there are also many people who maybe didn't, you know, follow through with science, 'cause it's also expensive. A PhD takes a lot of time. And you know if you don't get funding it's a lot of money. And for very little security, if you think about how hard it is to get a teaching job that's going to give you enough of a payoff to pay that back. Right, the time that you took off, the investment that you made. So I think the other side of that is by making data more accessible, you allow people who could have been great in science an opportunity to be great data scientists. And so I think for me the idea of the citizen data scientist, that's where the opportunity is. I think in terms of democratizing data and making it available for everyone, I feel as though it's something similar to the way we didn't really know what KPIs were, maybe 20 years ago.
People didn't use it as readily, didn't teach it in schools. I think maybe 10, 20 years from now, with some of the things that we're building today in data science, hopefully more people will understand how to use these tools. They'll have a better understanding of working with data and what that means, and just data literacy, right? Just being able to use these tools and be able to understand what data's saying, and also what it's not saying. Which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So yeah, I guess for me, in terms of the citizen data scientist, I think it's a great idea to have that, right? But at the same time of course, as everyone kind of emphasized, you don't want everyone out there going, "I can be a data scientist without education, without statistics, without math," without understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago just on Hadoop instead of SQL. Right, and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) Which is, I guess-- >> For less. >> An accomplishment, and for less, it's cheaper, yeah.
So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which is also the other side of working with data. >> Sure, Bob. >> Yeah I have a comment about that. I've never liked the term citizen data scientist, or citizen scientist. I get the point of it, and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. I mean I would never have just somebody become a scientist based on a few classes he or she takes. It's like saying, "Oh I'm going to be a citizen lawyer." And so you come to me with your legal problems, or a citizen surgeon. Like you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe you wanted to say something too on that. >> Since we're in New York City I'd like to use the analogy of a real scientist versus a data scientist. So a real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York we could walk within a 10 block radius and buy any of those tools. It doesn't make us a scientist because we use those tools. I think with data, you know, making the tools evolve and become easier to use, like Bob was saying, doesn't make you a better data scientist, it just makes the data more accessible. You know we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, like these kids are brilliant. You know these kids are not someone who is, you know, just trying to run a day to day job in corporate America. I think the people who are running the day to day job in corporate America are going to be the recipients of data science.
Just like people who take drugs, right? As a result of a smart data scientist coming up with a formula that can help people, I think we're going to make it easier to distribute the data that can help people with all the new tools. But it doesn't really make it, you know, the access to the data and tools available doesn't really make you a better data scientist. Without, like Bob was saying, better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me then? So that's where machine learning comes in, or artificial intelligence and all this stuff. So how at the end of the day, Dion? How do you make it relevant and usable, actionable, to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer scientist at one point to develop programs for a computer, now we can get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than could ever be produced by, you know, having an ivory tower filled with data scientists. Which we need those guys, too, don't get me wrong. But we need to productize it and make it available in packages such that it can be consumed. The outputs and even some of the inputs can be provided by mere mortals, whether that's machine learning or artificial intelligence or bots that go off and run the hypotheses and select the algorithms, maybe with some human help. We have to productize it. This is the concept of data science as a service, which is becoming a thing now. It's, "I need this, I need this capability at scale. I need it fast and I need it cheap." The commoditization of data science is going to happen.
>> That goes back to what I was saying about how the recipient of data science is also machines, right? Because I think the other thing that's happening now in the evolution of data is that, you know, the data is so tightly coupled. Back when you were talking about data warehousing you had all the business transactions, then you take the data out of those systems, you put them in a warehouse for analysis, right? Maybe they'll make a decision to change that system at some point. Now the analytics platform and the business application is very tightly coupled. They become dependent upon one another. So you know people who are using the applications are now able to take advantage of the insights of data analytics and data science, just through the app. Which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved, well like we said earlier data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course, understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can get more, or glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French, but you're not going to get it. You need to understand the language of data to get value from the technology we have available to us. >> Incidentally French is one of the languages that you have the option of learning if you're a mathematician. So math PhDs are required to learn a second language, France being the country of algebra, that's one of the languages you could actually learn. Anyway, tangent. But going back to the point. So statistics courses, definitely encourage it. I teach statistics. And one of the things that I'm finding as I go through the process of teaching it is I'm actually bringing in my experience.
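[Editor's note: the "basic stats" Bob recommends pays off quickly. For instance, two report columns can share a mean while telling very different stories once you check variability. A small sketch with made-up numbers, using Python's standard library:]

```python
# Hypothetical sketch: same mean, very different variability.
# An end user who only reads the mean would call these two series equal.
import statistics

steady = [98, 100, 102, 99, 101]
volatile = [60, 140, 100, 150, 50]

for name, xs in (("steady", steady), ("volatile", volatile)):
    print(name, statistics.mean(xs), round(statistics.stdev(xs), 1))
```

Both series average 100, but the sample standard deviations differ by a factor of nearly thirty, which is exactly the kind of distinction a basic stats course teaches you to look for in a report.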
And by bringing in my experience I'm actually kind of making the students think about the data differently. So the other thing people don't think about is the fact that statisticians typically were expected to do, you know, just basic sorts of tasks. In the sense that their knowledge is specialized, right? But the day to day operation was they ran a test on some data, looked at the results, and interpreted the results based on what they were taught in school. They didn't develop the model a lot of times, they just understood what the tests were saying, especially in the medical field. So when you think about things like, we have words like population and census. A census is when you have every single data point, versus a sample, which is a subset. It's a very different story now that we're collecting data faster than we used to. It used to be the idea that you could collect information from everyone. Like it happens once every 10 years, we built that in. But nowadays, you know, you hear about Facebook, for instance. I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is now statisticians are expected to know data in a different way than they were before. So it's not just changing as a field in data science, but I think the sciences that are using data are also changing their fields as well. >> Dave: So is sampling dead? >> Well no, because-- >> Should it be? (laughs) >> Well if you're sampling wrong, yes. That's really the question. >> Okay. You know it's been said that the data doesn't lie, people do. Organizations are very political. Oftentimes, you know, lies, damned lies and statistics, Benjamin Disraeli. Are you seeing a change in the way in which organizations are using data in the context of the politics?
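[Editor's note: Jennifer's census-versus-sample distinction, and Dave's "is sampling dead?" question, can be illustrated with a quick simulation: a 1% random sample of a synthetic population lands very close to the full "census" value. A hypothetical sketch:]

```python
# Hypothetical sketch: a random sample approximates the full census.
import random

random.seed(42)  # deterministic for the example
population = [random.gauss(50, 10) for _ in range(100_000)]  # the "census"
census_mean = sum(population) / len(population)

sample = random.sample(population, 1_000)  # a 1% "survey"
sample_mean = sum(sample) / len(sample)

# With 1,000 draws the standard error is roughly 10 / sqrt(1000) ≈ 0.32,
# so the sample mean should sit within a fraction of a point of the census.
print(f"census={census_mean:.2f} sample={sample_mean:.2f}")
```

Sampling "wrong," as the panel puts it, would mean a non-random draw; taking, say, only the first 1,000 records of an ordered data set would bias the estimate no matter how cheap full collection has become.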
So, some strong P&L manager, say, gets data and crafts it in a way that he or she can advance their agenda. Or they'll maybe attack a data set that probably should drive them in a different direction, but might be antithetical to their agenda. Are you seeing data, you know we talked about democratizing data, are you seeing that reduce the politics inside of organizations? >> So you know we've always used data to tell stories, and at the top level of an organization that's what it's all about. And I still see very much that no matter how much data science, or access to the truth through looking at the numbers, that storytelling is still the political filter through which all that data passes, right? But with the advent of things like blockchain, more and more corporate records and corporate information is going to end up in these open and shared repositories where there is no alternate truth. It'll come back to whoever tells the best stories at the end of the day. So I still see that organizations are very political. We are seeing more open data though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Um, um, go ahead. >> I was just going to say as well. Ultimately I think data driven decision making is a great thing. And it's especially useful at the lower tiers of the organization, where you have the routine day to day decisions that could be automated through machine learning and deep learning. The algorithms can be improved on a constant basis. On the upper levels, you know, that's why you pay executives the big bucks, to make the strategic decisions. And data can help them, but ultimately data, IT, technology alone will not create new markets, it will not drive new businesses, it's up to human beings to do that. The technology is the tool to help them make those decisions.
But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie, people lie. You know I'm already seeing trends in all different industries where, you know, conventional wisdom is starting to get trumped by analytics. You know I think it's still up to the human being today to ignore the facts and go with what they think in their gut, and sometimes they win, sometimes they lose. But generally if they lose, the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety before replacement of humans, as we start, you know, using data and analytics and data science really as the bumpers instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There are a couple of elephant-in-the-room topics that I want to explore a little bit. >> Here it comes. >> There was an article today in Wired. And it was called, Why AI Is Still Waiting for Its Ethics Transplant. And I will just read a little segment from there. It says, new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI.
When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker, who founded AI Now. And they're calling for, sort of, almost clinical trials on AI, if I could use that analogy. Before you go to market you've got to test the human impact, the social impact. Thoughts? >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov? He's the Soviet officer who back in 1983 was in the control room, I guess somewhere outside of Moscow, which detected a nuclear missile attack against the Soviet Union coming out of the United States. Ordinarily, I think if this was an entirely AI driven process we wouldn't be sitting here right now talking about it. But this gentleman looked at what was going on on the screen and, I'm sure he was accountable to his authorities in the Soviet Union, he probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming from the Soviet satellites. And as it turned out, of course, he was right. The Soviet satellites were seeing glints of the sun and they were interpreting those glints as missile launches. And I think that's a great example why, you know, every situation of course doesn't mean the end of the world, (laughs) as it could have in this case. But it's a great example of why there needs to be a human component, a human ability for intervention at some point in the process. >> So other thoughts. I mean organizations are driving AI hard for profit. The best minds of our generation are trying to figure out how to get people to click on ads; Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things with data analytics. You can cure cancer.
You can make customers click on more ads. It depends on what your goal is. But there are ethical considerations we need to think about. When we have data that has a racial bias against blacks, giving them higher prison sentences or worse credit scores, that has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider how we are going to fix that problem. Cathy O'Neil in her book, Weapons of Math Destruction, excellent book, I highly recommend that your listeners read that book. And she talks about these issues, about if AI, if algorithms have a widespread impact, if they adversely impact a protected group. And I forget the last criterion, but we need to really think about these things as a people, as a country. >> So I always think the idea of ethics is interesting. So I had this conversation come up a lot of times when I talk to data scientists. I think as a concept, right, as an idea, yes you want things to be ethical. The question I always pose to them is, "Well in the business setting how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist is to be able to make the day to day decision of, when someone says, "I don't like that number," how do you actually get around that. If that's the right data to be showing someone, or if that's accurate. And say the business decides, "Well we don't like that number." Many people feel pressured to then change the data, or change what the data shows. So I think it's about being able to educate people to find ways to say what the data is saying, but not going past some line where it's a lie, where it's unethical. 'Cause you can also say what data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, but here's what we don't know." There's a don't-know part that many people will omit when they talk about data.
So I think, you know, especially when it comes to things like AI, it's tricky, right? Because I always tell people, I don't know, everyone thinks AI's going to be so amazing. I started in the industry by fixing problems with computers that people didn't realize computers had. For instance when you have a system, a lot of bugs, we all have bug reports that we've probably submitted. I mean really it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs. Because frankly it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, you know, a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses, they want to make money. If there are no rules that say they can't do certain things to make money, then there's no restriction. I think the other thing to think about is that we as consumers, like every day in our lives, we shouldn't separate the idea of data as a business, thinking of it as a business person, from our day to day consumer lives. Meaning, yes I work with data. Incidentally I also always opt out of my credit card data sharing, you know, when they send you that information, they make you actually mail them, like old school mail, snail mail, a document that says, okay, I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step of doing that. Now if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least, you know, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance if a start up gets acquired, what happens with that data they have on you? You agreed to give it to the start up. But I mean what are the rules on that?
So I think we have to really think about the ethics from not just, you know, someone who's going to implement something, but as consumers, what control we have over our own data. 'Cause that's going to directly impact what businesses can do with our data. >> You know you mentioned data collection. So slightly on that subject. All these great new capabilities we have coming. We talked about what's going to happen with media in the future and what 5G technology's going to do to mobile and these great bandwidth opportunities. The internet of things and the internet of everything. And all these great inputs, right? Do we have an arms race, like are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up in this? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration is probably one of the biggest challenges, especially as the world is starting to become more dependent on data. I think, you know, just because we're dependent on numbers we've come up with GAAP, which is generally accepted accounting principles, which can be audited and proven whether it's true or false. I think in our lifetime we will see something similar to that: we will have formal checks and balances of the data that we use, which can be audited. Getting back to, you know, what Dave was saying earlier, I personally would trust a machine that was programmed to do the right thing more than I'd trust a politician or some leader that may have their own agenda. And I think the other thing about machines is that they are auditable. You know, you can look at the code and see exactly what it's doing and how it's doing it. Human beings, not so much. So I think getting to the truth, even if the truth isn't the answer that we want, I think is a positive thing.
It's something that we can't do today, but once we start relying on machines to do it, we'll be able to get there. >> Yeah I was just going to add that we live in exponential times. And the challenge is that the way that we're structured traditionally as organizations is not allowing us to absorb advances exponentially, it's linear at best. Everyone talks about change management and how are we going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There's a few leading organizations that are eating the world and they seem to be somehow rolling out new things. I don't know how Amazon rolls out all this stuff. There's all this artificial intelligence and the IoT devices, Alexa, natural language processing, and that's just a fraction, it's just a tip of what they're releasing. So it just shows that there are some organizations that have found the path. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we have to find some way to adopt these new capabilities and deploy them effectively, or the writing is on the wall. I've spent a lot of time exploring this topic; how are we going to get there, and all of us have a lot of hard work, is the short answer. >> I read that there's going to be more data, or it was predicted, more data created in this year than in the past 5,000 years. >> Forever. (laughs) >> And to mix in the statistic that we're currently analyzing less than 1% of the data. Taking those numbers and hearing what you're all saying, it's like we're not keeping up; it seems like it's not even linear. I mean that gap is just going to grow and grow and grow. How do we close that? >> There's a guy out there named Chris Dancy, he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new, having access to the data is new.
You know we've always had a blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, now we can be smarter about it. So I think that being able to use this information is really incredible, like, this is something that over our lifetime we've never had and now we can do it. Hence the big explosion in data. But I think how we use it and have it governed I think is the challenge right now. It's kind of cowboys and Indians out there right now. And without proper governance and without rigorous regulation I think we are going to have some bumps in the road along the way. >> The data is the new oil; the question is how are we actually going to operationalize around it? >> Or find it. Go ahead. >> I will say the other side of it is, so if you think about information, we always have the same amount of information, right? What we choose to record, however, is a different story. Now if you wanted to know things about the Olympics, but you decide to collect information every day for years instead of just the Olympic year, yes you have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years there are no Olympics, right? Unless of course you're comparing it relatively. But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results, it does not mean it'll improve your model. You can be collecting data about your shoe size trying to get information about your hair. I mean it really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real world context into it, then yeah you can collect data, you know, an infinite amount of data, but you'll never process it. Because you have no question to ask, you're not looking to model anything.
There is no universal truth about everything, that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions are you trying to ask of your data? You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions, and that data may have those variables that'll tell you what's the best predictor of churn, what's the best predictor of cancer treatment outcome. And if you can ask the right question of the data you have, then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree, but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years, weather forecasts, weather data, climate change data for example, that go back to data charts and so forth going back centuries if that data is available. And they reformat it, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it. You know, "we missed it," why? Could they have, is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said, which is, are you asking the right questions? If you're not looking for that problem in exactly the way that it occurred you might not be able to find it. >> I thought the ads were paid in rubles.
Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know, red flag, so to speak. >> Yes. >> I mean with Bitcoin maybe it could have hidden it. >> Bob: Right, exactly. >> I would think too that what happened last year was actually the end of an age of optimism. I'll bring up the Soviet Union again, (chuckles). It collapsed back in 1990, 1991, and Russia was reborn. And I think there was a general feeling of optimism in the '90s through the 2000s that Russia was now being well integrated into the world economy, as other nations all over the globe, all continents, were being integrated into the global economy thanks to technology. And technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing those economies, very different countries than 20 years ago, and that extended into Russia as well. Russia is part of the global economy. We're able to communicate as a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think that... It shouldn't be the question of how did we miss it? It's do we have the ability now to catch it? And I think without data science, without machine learning, without being able to train machines to look for patterns that involve corruption or result in corruption, I think we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question because my premise was Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is how do I get people to click on ads? >> Right, they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock.
>> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley poo-poo Watson. Ah, Google, Facebook, Amazon, they've got the best AI. Watson, and some of that's fair criticism. Watson's a heavy lift, very services oriented, you just got to apply it in a very focused way. At the same time Google's trying to get you to click on ads, as is Facebook, Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that sort of juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant in the room questions. >> Well I mean you're looking at two very different types of organizations. One which has really spent decades applying technology to business, and these other companies are ones that are primarily into the consumer, right? When we talk about things like IBM Watson you're looking at a very different type of solution. You used to be able to buy IT and once you installed it you pretty much could get it to work and store your records or, you know, do whatever it is you needed it to do. But these types of tools, like Watson, actually try to learn your business. And it needs to spend time doing that, watching the data and having its models tuned. And so you don't get the results right away. And I think that's been kind of the challenge that organizations like IBM have had. Like this is a different type of technology solution, one that has to actually learn first before it can provide value. And so I think, you know, you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to try to apply these tools that work in very different ways. There's education too on the side of the buyer.
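The point about tools that have "to actually learn first before providing value" can be illustrated with a toy sketch. To be clear, nothing here is Watson or its API; the data and the running-mean "model" are invented purely to show value arriving only after enough observation:

```python
# A toy illustration of a tool that must "learn the business" before it is
# useful: an online model whose predictions start out worthless and improve
# as it watches more of the organization's data. All numbers are invented.

def online_mean_error(stream):
    """Predict each value as the running mean of everything seen so far,
    recording the absolute error of each prediction."""
    errors, total, count = [], 0.0, 0
    for value in stream:
        prediction = total / count if count else 0.0  # nothing learned yet
        errors.append(abs(value - prediction))
        total += value
        count += 1
    return errors

# Daily order volumes for a hypothetical business.
orders = [100, 104, 98, 101, 99, 102, 100, 103]
errs = online_mean_error(orders)

# The first prediction is terrible; the last one is close.
print(errs[0], round(errs[-1], 2))
```

The early errors are large and the later ones small, which is the panel's point in miniature: the value shows up only after the model has had time watching the data.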
>> I'd have to say that, you know, I think there's plenty of businesses out there also trying to solve very significant, meaningful problems. You know, with Microsoft AI and Google AI and IBM Watson, I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think, you know, having a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective. >> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in R&D and AI going to pay off for these guys? >> Yeah, so I think there's different levels of AI, right? So there's AI where you can actually improve the model. I remember when Watson was kind of first out, I was invited by IBM to a private, sort of, presentation. And my question was, "Okay, so when do I get to access the corpus?" The corpus being sort of the foundation of NLP, which is natural language processing. So it's what you use as almost like a dictionary. Like how you're actually going to measure things, or match things up. And they said, "Oh you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me as a data scientist you're expecting me to rely on the fact that you did it better than me, and I should rely on that." I think over the years after that IBM started opening it up and offering different ways of being able to access the corpus and work with that data. But I remember at the first Watson hackathon there were only two corpora available. It was either travel or medicine. There was no other foundational data available.
So I think one of the difficulties was, you know, IBM being a little bit more on the forefront of it, they kind of had that burden of having to develop these systems and learning kind of the hard way that if you don't have the right models and you don't have the right data and you don't have the right access, that's going to be a huge limiter. I think with things like medical, medical information, that's extremely difficult data to start with. Partly because, you know, anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong is minimal. You might lose some money. If you do that with healthcare data, if you do that with medical data, people may die; like, this is a much more difficult data set to start with. So I think from a scientific standpoint it's great to have any information about a new technology, new process. That's the nice thing, that IBM's obviously invested in it and collected information. I think the difficulty there though is just 'cause you have it you can't solve everything. And I feel like, as someone who works in technology, I think in general when you appeal to developers you try not to market. And with Watson it's very heavily marketed, which tends to turn off people who are more from the technical side. Because I think they don't like it when it's gimmicky, in part because they do the opposite of that. They're always trying to build up the technical components of it. They don't like it when you're trying to convince them that you're selling them something when you could just give them the specs and let them look at it. So it could be something as simple as communication. But I do think it is valuable to have had a company who leads on the forefront of that and tries, so we can actually learn from what IBM has learned from this process.
>> Joe: I want to see how Alexa or Siri do on Jeopardy. (panelists laugh) >> All right. Going to go around for a final thought, give you a second. Let's just think about, like, your 12 month crystal ball. In terms of either challenges that need to be met in the near term or opportunities you think will be realized. 12, 18 month horizon. Bob, you've got the microphone handed up, so I'll let you lead off and let's just go around. >> I think a big challenge for business, for society, is getting people educated on data and analytics. There's a study that was just released, I think last month, by ServiceNow, I think, or some vendor, or Click. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe. It's probably a lot worse in the United States. So that's a big challenge I think. To educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology over training people. I think using data needs to be iPhone easy. Which means that, you know, a lot of innovation is in the years to come. I do think that a keyboard is going to be a thing of the past for the average user. We are going to start using voice a lot more. I think augmented reality is going to become a real reality. Where we can hold our phone in front of an object and it will have an overlay of prices and where it's available. If it's a person, I think that we will see, within an organization, holding a camera up to someone and being able to see what their salary is, what sales they did last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this, and we start actually becoming more social as human beings through augmented reality.
I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And I think the greater good is using technology in a very, very smart way. Which means that you shouldn't have to be, sorry to contradict, but maybe it's good to counterpoint. I don't think you need to have a PhD in SQL to use data. Like, I think that's 1990. I think as we evolve it's going to become easier for the average person. Which means people like the brain trust here need to get smarter and start innovating. I think the innovation around data is really at the tip of the iceberg; we're going to see a lot more of it in the years to come. >> Dion, why don't you go ahead, then we'll come down the line here. >> Yeah so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will. And that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in, and it's going to re-silo a lot of our data. It's going to make it recede and make it less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data for example is already all locked down. We could do so much more with it, but health start ups are really constrained in what they can do. 'Cause they can't access the data.
We can't even access our own health care records, right? So I think that's the challenge: we have to have that battle next, to be able to go and take the next step. >> Well I see, with the growth of data, a lot of it's coming through IoT, internet of things. I think that's a big source. And we're going to see a lot of innovation. New types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations; they're going to be looking at the ways all this big data, or data coming in from the IoT, can be leveraged. You know, there's some examples out there. There's a company for example that is outfitting tools, putting sensors in the tools. Industrial sites can therefore track where the tools are at any given time. This is an expensive, time consuming process, constantly losing tools, trying to locate tools. Assessing whether the tool's being applied to the production line, or the right tool is at the right torque, and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there's going to be innovations like that. Maybe small start up type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There's going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IoT type data, in some new, innovative way. >> Jennifer, final word. >> Yeah so I think with data, you know, it's interesting, right, for one thing I think one of the things that's made data more available, and just people open to the idea, has been start ups. But what's interesting about this is a lot of start ups have been acquired. And a lot of people at start ups that got acquired, now these people work at bigger corporations.
Which was not the way it was maybe 10 years ago; data wasn't available and open, companies kept it very proprietary, you had to sign NDAs. It was within the last 10 years that open source, all of those initiatives, became much more popular, much more open, an acceptable sort of way to look at data. I think that what I'm kind of interested in seeing is what people do within the corporate environment. Right, 'cause they have resources. They have funding that start ups don't have. And they have backing, right? Presumably if you're acquired you went in at a higher title in the corporate structure, whereas if you had started there you probably wouldn't be at that title at that point. So I think you have an opportunity where people who have done innovative things and have proven that they can build really cool stuff can now be in that corporate environment. I think part of it's going to be whether or not they can really adjust to sort of the corporate, you know, the corporate landscape, the politics of it or the bureaucracy. I think every organization has that. Being able to navigate that is a difficult thing, in part 'cause it's a human skill set, it's a people skill, it's a soft skill. It's not the same thing as just being able to code something and sell it. So, you know, it's going to really come down to people. I think if people can figure out, for instance, what people want to buy, what people think, in general that's where the money comes from. You know, you make money 'cause someone gave you money. So if you can find a way to look at data, or even look at technology, and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage that. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time, I think, with the corporate resources available, if you can do that.
You know, who knows what we'll have in like a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies, don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data, that means you don't understand how to refine it, if data is the new oil, and how to protect it and so forth and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So devolve to what I know best. >> Great discussion. Really, really enjoyed it; the time has flown by. Joe, if you get that augmented reality thing to work on the salary, point it toward that guy, not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But Joe thank you, Dion, Joe and Jennifer and Batman. We appreciate it, and Bob Hayes, thanks for being with us. >> Thank you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder, coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote which is being set up just about 50 feet from us right now. Nick Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on The Cube. But a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from The Cube. We certainly appreciate the time.
Glad to have you along here in New York. And until the next time, take care. (bright digital music)
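Bob's earlier point, that one database with many variables can answer several different questions, each with a different "best predictor," can be sketched in miniature. Every number and column name below is invented, and plain correlation stands in for whatever feature-selection method a real project would use:

```python
# One tiny "database", two different questions asked of it. All data invented.

def pearson(xs, ys):
    # Pearson correlation coefficient, computed from scratch.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# Each row is a customer; three candidate predictor variables.
features = {
    "tenure_months": [5, 40, 8, 36, 7, 42, 30, 35],
    "support_calls": [9, 1, 8, 0, 7, 1, 2, 6],
    "monthly_spend": [20, 85, 60, 90, 25, 80, 95, 30],
}
# Two different questions asked of the same rows.
targets = {
    "churned":  [1, 0, 1, 0, 1, 0, 0, 1],
    "upgraded": [0, 1, 1, 1, 0, 1, 1, 0],
}

def best_predictor(target_name):
    # Rank features by absolute correlation with the chosen target.
    ys = targets[target_name]
    return max(features, key=lambda f: abs(pearson(features[f], ys)))

for question in targets:
    print(question, "->", best_predictor(question))
```

Running this surfaces a different winning variable for each question, which is the panel's point: the same data only yields insight relative to the question you ask of it.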

Published Date: Nov 1, 2017


Vikram Murali, IBM | IBM Data Science For All


 

>> Narrator: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to New York here on theCUBE. Along with Dave Vellante, I'm John Walls. We're at Data Science For All, IBM's two-day event, and we'll be here all day long, wrapping up with that panel discussion from four to five Eastern Time, so be sure to stick around all day here on theCUBE. Joining us now is Vikram Murali, who is a program director at IBM. Vikram, thanks for joining us here on theCUBE. Good to see you. >> Good to see you too. Thanks for having me. >> You bet. So, among your primary responsibilities: the Data Science Experience. So first off, if you would, share with our viewers a little bit about that. You know, the primary mission. You've had two fairly significant announcements, updates if you will, here over the past month or so, so share some information about that too if you would. >> Sure, so my team, we build the Data Science Experience, and our goal is to enable data scientists, in their path, to gain insights into data using data science techniques, machine learning, and the latest and greatest open source especially, and to be able to collaborate with fellow data scientists, with data engineers, with business analysts, and it's all about freedom. Giving freedom to data scientists to pick the tool of their choice, and to program and code in the language of their choice. So that's the mission of Data Science Experience, when we started this. The two releases that you mentioned came in the last 45 days: one in September and one on October 30th. Both of these releases are very significant, in the machine learning space especially. We now support the Scikit-Learn, XGBoost, and TensorFlow libraries in Data Science Experience. We have deep integration with the Hortonworks Data Platform, which is a hallmark of our partnership with Hortonworks.
Something that we announced back in the summer, and this last release of Data Science Experience, two days back, specifically can do authentication with Knox with Hadoop. So now our Hadoop customers, our Hortonworks Data Platform customers, can leverage all the goodies that we have in Data Science Experience. It's more deeply integrated with our Hadoop-based environments. >> A lot of people ask me, "Okay, when IBM announces a product like Data Science Experience... You know, IBM has a lot of products in its portfolio. Are they just sort of cobbling together? Taking existing older products and putting a skin on them? Or are they developing them from scratch?" How can you help us understand that? >> That's a great question, and I hear that a lot from our customers as well. Data Science Experience started off with a design-first methodology. What I mean by that is we are using IBM Design to lead the charge here, along with product and development. We are actually talking to customers, to data scientists, to data engineers, to enterprises, and we are trying to find out what problems they have in data science today and how we can best address them. So it's not about taking older products and just re-skinning them; Data Science Experience, for example, started off as a brand new product: a completely new slate with completely new code. Now, IBM has done data science and machine learning for a very long time. We have a lot of assets like SPSS Modeler and Statistics, and Decision Optimization. We are re-investing in those products, and we are investing, and doing product research, in such a way not to make the old fit with the new, but in a way where it fits into the realm of collaboration: how can data scientists leverage our existing products alongside open source, and how can they collaborate. So it's not just re-skinning; it's building from the ground up.
>> So this is really important because you say architecturally it's built from the ground up. Because, you know, given enough time, enough money, and smart people, you can make anything work. The reason why this is important is you mentioned, for instance, TensorFlow. You know that down the road there's going to be some other tooling, some other open source project that's going to take hold, and your customers are going to say, "I want that." You've got to then integrate that, or you have to choose whether or not to. If it's a super heavy lift, you might not be able to do it, or do it in time to hit the market, unless you architected your system to accommodate that. Future-proof is the term everybody uses, so what have you done? How have you done that? I'm sure APIs are involved, but maybe you could add some color. >> Sure. So our Data Science Experience and machine learning... It is a microservices-based architecture, so we are completely Dockerized, and we use Kubernetes under the covers for container orchestration. All of these are tools that are used in The Valley, across different companies, and also in products across IBM as well. So some of these legacy products that you mentioned, we are actually using some of these newer methodologies to re-architect them, and we are Dockerizing them, and the microservices architecture actually helps us address issues that we have today, as well as stay open to newer methodologies and frameworks that may not exist today. Take the microservices architecture and, for example, TensorFlow, which you brought up: we can just spin up a Docker container just for TensorFlow and attach it to our existing Data Science Experience, and it just works.
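This is not IBM code — just a minimal, stdlib-only Python sketch of the plug-in pattern Vikram is describing: a core platform that knows nothing about specific frameworks, so attaching a new one (a TensorFlow container, say) is additive and requires no re-architecting. All class and function names here are invented for illustration.

```python
# Hypothetical sketch (not IBM code): a plug-in registry in the spirit of the
# microservices approach described above. New frameworks are attached without
# touching the core, mirroring "spin up a container and attach it."

class FrameworkRegistry:
    """Core platform: knows nothing about specific frameworks."""

    def __init__(self):
        self._runners = {}

    def attach(self, name, runner):
        # Adding a framework is additive -- existing entries are untouched.
        self._runners[name] = runner

    def run(self, name, payload):
        if name not in self._runners:
            raise KeyError(f"framework {name!r} not attached")
        return self._runners[name](payload)


platform = FrameworkRegistry()

# Each "container" here is just a callable; in a real system it would be a
# Docker image orchestrated by Kubernetes.
platform.attach("scikit-learn", lambda data: f"sklearn scored {len(data)} rows")
platform.attach("xgboost", lambda data: f"xgboost scored {len(data)} rows")

# A framework released tomorrow plugs in the same way -- no re-architecting.
platform.attach("tensorflow", lambda data: f"tensorflow scored {len(data)} rows")

print(platform.run("tensorflow", [1, 2, 3]))  # tensorflow scored 3 rows
```

The design choice being illustrated: the core only holds a name-to-runner mapping, so upgrading or adding one "microservice" never forces a change to the others.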
Same thing with other frameworks like XGBoost, Keras, and Scikit-Learn; all of these are frameworks and libraries that have come up in open source within the last, I would say, one to three years. Previously, integrating them into our product would have been a nightmare. We would have had to re-architect our product every time something came along, but now with the microservices architecture it is very easy for us to keep up with them. >> We were just talking to Daniel Hernandez a little bit about the Hortonworks relationship at a high level. I've been following Hortonworks since day one, when Yahoo kind of spun them out, and I know those guys pretty well. They always make a big deal out of their partnerships being deep engineering integrations, and they're very proud of that, so I want to test that a little bit. Can you share with our audience the kind of integrations you've done? What you've brought to the table? What Hortonworks brought to the table? >> Yes, so Data Science Experience today can work side by side with the Hortonworks Data Platform, HDP. We could have actually made that work about two, three months back, but, as part of our partnership that was announced back in June, we set up joint engineering teams. We have multiple touch points every day. We call it co-development: they have put resources in, we have put resources in, and today, especially with the release that came out on October 30th, Data Science Experience can authenticate using secure Knox, as I previously mentioned, and that was a direct example of our partnership with Hortonworks. So that is phase one. Phase two and phase three are going to be deeper integration, so we are planning on making Data Science Experience an Ambari management pack. So for a Hortonworks customer, if you have HDP already installed, you don't have to install DSX separately. It's going to be a management pack. You just spin it up.
And the third phase is going to be... We're going to be using YARN for resource management. YARN is very good at resource management, and for infrastructure as a service for data scientists, we can actually delegate that work to YARN. So Hortonworks, they are putting resources into YARN, doubling down actually. And they are making changes to YARN where it will act as the resource manager not only for the Hadoop and Spark workloads, but also for Data Science Experience workloads. So that is the level of deep engineering that we are engaged in with Hortonworks. >> YARN stands for yet another resource negotiator. There you go for... >> John: Thank you. >> The trivia of the day. (laughing) Okay, so... But of course, Hortonworks are big on committers. And obviously a big committer to YARN. Probably wouldn't have YARN without Hortonworks. So you mentioned that's kind of what they're bringing to the table, and you guys primarily are focused on the integration, as well as some other IBM IP? >> That is true, as well as the Knox piece that I mentioned. We have a Knox committer; we have multiple Knox committers on our side, and that helps us as well. Knox is part of the HDP package, and we need that knowledge on our side to work with Hortonworks developers, to make sure that we are contributing and making inroads into Data Science Experience. That way the integration becomes a lot easier. And from an IBM IP perspective... So Data Science Experience already comes with a lot of packages and libraries that are open source, but IBM Research has worked on a lot of these libraries. I'll give you a few examples: Brunel and PixieDust are something that our developers love. These are visualization libraries that were actually cooked up by IBM Research and then open sourced. They are prepackaged into Data Science Experience, so there is IBM IP involved, and there are a lot of machine learning algorithms that we put in there. So that comes right out of the package.
>> And you guys, the development teams, are really both in The Valley? Is that right? Or are you really distributed around the world? >> Yeah, so we are. The Data Science Experience development team is in North America, between The Valley and Toronto. The Hortonworks team, they are situated about eight miles from where we are in The Valley, so there's a lot of synergy. We work very closely with them, and that's what we see in the product. >> I mean, what impact does that have? You know, you hear today, "Oh, yeah. We're a virtual organization. We have people all over the world: Eastern Europe, Brazil." How much of an impact is that? To have people so physically proximate? >> I think it has major impact. I mean, IBM is a global organization, so we do have teams around the world, and we work very well. With the advent of IP telephony, and screen shares, and so on, yes, we make it work. But it really helps being in the same timezone, especially working with a partner just eight or ten miles away. We have a lot of interaction with them, and that really helps. >> Dave: Yeah. Body language? >> Yeah. >> Yeah. You talked about problems. You talked about issues. You know, customers. What are they now? Before it was like, "First off, I want to get more data." Now they've got more data. Is it figuring out what to do with it? Finding it? Having it available? Having it accessible? Making sense of it? I mean, what's the barrier right now? >> The barrier, I think, for data scientists... The number one barrier continues to be data. There's a lot of data out there, a lot of data being generated, and the data is dirty. It's not clean. So the number one problem that data scientists have is how do I get to clean data, and how do I access data. There are so many data repositories, data lakes, and data swamps out there. Data scientists don't want to be in the business of figuring out how to access data.
They want to have instant access to data, and-- >> Well, if you would, let me interrupt you. >> Yeah? >> You say it's dirty. Give me an example. >> So it's not structured data, so data scientists-- >> John: So unstructured versus structured? >> Unstructured versus structured. If you look at all the social media feeds that are being generated, the amount of data that is being generated, it's all unstructured data. So we need to clean up the data, because the algorithms need structured data, or data in a particular format, and data scientists don't want to spend too much time cleaning up that data. And access to data, as I mentioned. That's where Data Science Experience comes in. Out of the box we have so many connectors available. It's very easy for customers to bring in their own connectors as well, and you have instant access to data. And as part of our partnership with Hortonworks, you don't have to bring data into Data Science Experience. The data is becoming so big, you want to leave it where it is and instead push analytics down to where it lives. You can do that: we can connect to remote Spark, and we can push analytics down through remote Spark. All of that is possible today with Data Science Experience. The second thing that I hear from data scientists is all the open source libraries. Every day there's a new one. It's a boon and a bane as well. The open source community is very vibrant, and there are a lot of data science and machine learning competitions that are helping move this community forward, and that's a good thing. The bad thing is data scientists like to work in silos on their laptops. From an enterprise perspective, how do you take that and scale it to an enterprise level? That's where Data Science Experience comes in, because now we provide all the tools. The tools of your choice, open source or proprietary: you have them in here, and you can easily collaborate.
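The cleanup step Vikram describes — normalizing loosely structured records before any model sees them — can be sketched in a few lines of stdlib Python. The field names, date formats, and records below are invented for illustration; a real pipeline would of course be heavier.

```python
import re
from datetime import datetime

# Hypothetical sketch of the "dirty data" step discussed above: raw, loosely
# structured records are normalized before modeling. Fields are invented.

RAW_RECORDS = [
    {"name": "  Alice ", "signup": "2017/10/30", "comment": "loved it!!!"},
    {"name": "BOB", "signup": "30-10-2017", "comment": ""},
    {"name": "", "signup": "not a date", "comment": "meh"},
]

def parse_date(text):
    """Try a few common layouts; return None when nothing matches."""
    for fmt in ("%Y/%m/%d", "%d-%m-%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

def clean(record):
    name = record["name"].strip().title()
    date = parse_date(record["signup"])
    # Collapse repeated punctuation in free text -- a tiny stand-in for the
    # heavier unstructured-text cleanup a real pipeline would do.
    comment = re.sub(r"([!?.])\1+", r"\1", record["comment"]).strip()
    if not name or date is None:
        return None  # drop records a model could not use
    return {"name": name, "signup": date.isoformat(), "comment": comment}

cleaned = [c for c in (clean(r) for r in RAW_RECORDS) if c is not None]
print(cleaned)
# [{'name': 'Alice', 'signup': '2017-10-30', 'comment': 'loved it!'},
#  {'name': 'Bob', 'signup': '2017-10-30', 'comment': ''}]
```

The point of the sketch: normalize what you can, and drop (or route for review) what you can't, so everything downstream sees one consistent shape.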
You can do all the work that you need with open source packages and libraries, bring your own, as well as collaborate with other data scientists in the enterprise. >> So, you're talking about dirty data. I mean, with Hadoop and no schema on write, right? We kind of knew this problem was coming. So technology sort of got us into this problem. Can technology help us get out of it? I mean, from an architectural standpoint. When you think about dirty data, can you architect things in to help? >> Yes. So, if you look at the machine learning pipeline, the pipeline starts with ingesting data and then cleansing or cleaning that data. Then you go into creating a model, training, picking a classifier, and so on. So we have tools built into Data Science Experience, and we're working on tools coming up and down our roadmap, that will help data scientists do that themselves. I mean, they don't have to be really in-depth coders or developers to do that. Python is very powerful. You can do a lot of data wrangling in Python itself, so we are enabling data scientists to do that within the platform, within Data Science Experience. >> If I look at sort of the demographics of the development teams... We were talking about Hortonworks and you guys collaborating. What are they like? I mean, people picture IBM, you know, like this 100-plus-year-old company. What's the persona of the developers on your team? >> The persona? I would say we have a very young, agile development team, and by that I mean... So we've had six releases this year in Data Science Experience, just for the on-premises side of the product, and the cloud side of the product has a huge delivery cadence of its own. We have releases coming out faster than we can code. And it's not just re-architecting it every time; it's about adding features, giving features that our customers are asking for, and not making them wait for three months, six months, one year.
So our releases are becoming a lot more frequent, and customers are loving it. And that is, in part, because of the team. The team is able to evolve. We are very agile, and we have an awesome team. That's all. It's an amazing team. >> But six releases in... >> Yes. We had our initial release in April, and since then we've had about five revisions of the release where we add a lot more features to our existing releases. A lot more packages, libraries, functionality, and so on. >> So you know what monster you're creating now, don't you? I mean, you know? (laughing) >> I know, we are setting expectations. >> You still have two months left in 2017. >> We do. >> These are not mainframe release cycles. >> They are not, and that's the advantage of the microservices architecture. I mean, when a customer upgrades, right? They don't have to bring that entire system down to upgrade. You can target one particular part, one particular microservice. You componentize it, and just upgrade that particular microservice. It's become very simple, so... >> Well, some of those microservices aren't so micro. >> Vikram: Yeah, not all of them. So it's a balance. >> You're growing, but yeah. >> It's a balance you have to keep. Making sure that you componentize it in such a way that when you're doing an upgrade, it affects just one small piece of it, and you don't have to take everything down. >> Dave: Right. >> But, yeah, I agree with you. >> Well, it's been a busy year for you, to say the least, and I'm sure 2017-2018 is not going to slow down. So, continued success. >> Vikram: Thank you. >> We wish you well with that. Vikram, thanks for being with us here on theCUBE. >> Thank you. Thanks for having me. >> You bet. >> Back with Data Science For All, here in New York City with IBM. Coming up here on theCUBE right after this. >> Cameraman: You guys are clear. >> John: All right. That was great.

Published Date : Nov 1 2017



Daniel Hernandez, Analytics Offering Management | IBM Data Science For All


 

>> Announcer: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome to the Big Apple. John Walls and Dave Vellante here on theCUBE; we are live at IBM's Data Science For All. We're going to be here throughout the day, with a big panel discussion wrapping up our day, so be sure to stick around all day long on theCUBE for that. Dave, always good to be here in New York, is it not? >> Well, you know, it's been kind of the data science weeks. Months. Last week we were in Boston at the chief data officer conference; all the Boston Datarati were there. Now we bring it all down to New York City and get really hardcore with data science. So it's from the chief data officers to the hardcore data scientists. >> The CDO, hot term right now. Daniel Hernandez now joins us as our first guest here at Data Science For All, a VP with IBM Analytics. Good to see you, Daniel. Thanks for being with us. >> Pleasure. >> Alright, well, first off give us your take. Let's just step back, high level here. Data science has certainly been evolving for decades, if you will. First off, how do you define it today? And then, from the IBM side of the fence, how do you see it in terms of how businesses should be integrating this into their mindset? >> So the way I describe data science simply to my clients is: it's using the scientific method to answer questions or deliver insights. It's kind of that simple. Or answering questions quantitatively. So it's a methodology, it's a discipline; it's not necessarily tools. That's kind of the way I approach describing what it is. >> Okay, and then from the IBM side of the fence, in terms of how wide a net are you casting these days? I assume it's as big as you can get your arms around. >> So when you think about any particular problem that's a data science problem, you need certain capabilities. We happen to deliver those capabilities. You need the ability to collect, store, and manage any and all data.
You need the ability to organize that data so you can discover it and protect it. You've got to be able to analyze it: automate the mundane, explain the past, predict the future. Those are the capabilities you need to do data science, and we deliver a portfolio of them, including, on the analyze part of our portfolio, our data science tools that we would declare as such. >> So data science for all is very aspirational, and when you guys made the announcement of the Watson Data Platform last fall, one of the things you focused on was collaboration between data scientists, data engineers, quality engineers, application development, the whole chain. You made the point that most of the time data scientists spend is on wrangling data. You're trying to attack that problem, and you're trying to break down the stovepipes between those roles that I just mentioned. All of that has to happen before you can actually have data science for all; otherwise it's just data science for the hardcore data people. Where are we in terms of the progress that your clients have made in that regard? >> So, you know, I would say there are two major vectors of progress we've made. If you want data science for all, you need to be able to address people that know how to code and people that don't. If you consider the history of IBM in the data science space, especially SPSS, which has been around for decades, we're mastering and solving data science problems for non-coders. The Data Science Experience really started with embracing coders: developers that grew up in open source, that lived and learned in Jupyter or Python and were more comfortable there. And integration of these is kind of our focus. So that's one aspect: serving the needs of people in the data science role, whether they know how to code or not.
And then "for all" means supporting an entire analytics life cycle: from collecting the data you need in order to answer the question you're trying to answer, to organizing that information once you've collected it so you can discover it inside of tools like our own Data Science Experience and SPSS, and then of course the set of tools around exploratory analytics. All integrated, so that you can do that end-to-end life cycle. So where are clients? I think they're certainly getting much more sophisticated in understanding that. You know, most people have approached data science as a tool problem, as a data prep problem. It's a life cycle problem, and that's how we're thinking about it. We're thinking about it in terms of: alright, if our job is answering questions and delivering insights through scientific methods, how do we decompose that problem into the set of things people need to get the job done, serving the individuals that have to work together. >> And when you think about it, go back to the days when the data warehouse was king, something we talked about in Boston last week. It used to be the data warehouse was king; now the process is much more important. But very few people had access to that data, you had the elapsed time of getting answers, and the inflexibility of the systems. Has that changed, and to what degree has it changed? >> I think if you were to go ask anybody in business whether or not they have all the data they need to do their job, they would say no. Why? We've invested in EDWs, we've invested in Hadoop. Sometimes the problem might be: I just don't have the data. Most of the time it is: I have the data, I just don't know where it is.
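"I have the data, I just don't know where it is" is, at its core, a search-by-business-meaning problem. As a toy illustration only — the catalog entries, asset names, and function below are all invented, not an IBM API — here is a stdlib Python sketch of discovering assets across an EDW, a data lake, and Hadoop by the business terms attached to them.

```python
# Hypothetical sketch (invented names, not an IBM API): data lives in many
# systems, and discovery means matching business terms, not physical names.

CATALOG = [
    {"asset": "edw.cust_dim_v3", "system": "EDW",
     "terms": {"customer", "segmentation", "demographics"}},
    {"asset": "lake.web_clicks_2017", "system": "data lake",
     "terms": {"clickstream", "customer", "behavior"}},
    {"asset": "hdfs:/ops/sensor_feed", "system": "Hadoop",
     "terms": {"iot", "telemetry"}},
]

def discover(question_terms):
    """Rank assets by overlap with the business terms in the question."""
    scored = []
    for entry in CATALOG:
        overlap = entry["terms"] & question_terms
        if overlap:
            scored.append((len(overlap), entry["asset"], entry["system"]))
    return sorted(scored, reverse=True)

# A "customer segmentation" question pulls in data from everywhere it lives.
for score, asset, system in discover({"customer", "segmentation"}):
    print(f"{asset} ({system}) matched {score} term(s)")
```

The design point: the user searches in business vocabulary, and the catalog — not the user — knows which physical systems hold the matching assets.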
So there's a pretty significant issue around data discoverability. I might have data in my operational systems, I might have data inside my EDW, but I don't have everything inside my EDW; I've stood up one or more data lakes; and to solve a problem like customer segmentation, I have data everywhere. How do I find it and bring it in? >> That seems like it should be a fundamental consideration, right? If you're going to gather this much more information, make it accessible to people. And if you don't, it's a big flaw, a big gap, is it not? >> So yes, and I think part of the reason why is because governance professionals, which I am... You know, I've spent quite a bit of time trying to solve governance-related problems. We've been focusing pretty maniacally on the compliance, regulatory, and security-related issues: how do we keep people from going to jail, how do we ensure regulatory compliance with things like e-discovery and records, for instance. And it just so happens that the same discipline you use there, even though in some cases in lighter-weight implementations, is what you need in order to solve this data discovery problem. So the discourse around governance has historically been about compliance, about regulations, about cost takeout, not analytics. And so a lot of our time, certainly in R&D, is spent trying to solve that data discovery problem: how do I discover data using the semantics that I have, which as a regular user is not a physical understanding of my data, and once I find it, how am I assured that what I get is what I should get, so that I'm not subject to compliance-related issues, and not making the company more vulnerable to data breach. >> Well, so presumably part of that involves automating classification at the point of creation or use, which actually was a technical challenge for a number of years. Has that challenge been solved in your view?
>> I think machine learning has, and in fact later today I will be doing some demonstrations of technology which will show how we're making the application of machine learning easy. Inside of everything we do, we're applying machine learning techniques, including to the classification problems that help us solve this. So it could be that we're automatically harvesting technical metadata. Are there business terms that could be automatically extracted, that don't require some data steward to have to know and assert them, right? Or can we automatically suggest, and still have the steward for a case where I need a canonical data model, so I just don't want the machine to tell me everything, but I want the machine to assist the data curation process. We are not just exploring the application of machine learning to solve that data classification problem, which historically was a manual one; we're embedding it into most of the stuff that we're doing. Often you won't even know that we're doing it behind the scenes. >> So that means that, often times, the machine ideally is making the decisions as to who gets access to what, and is helping at least automate that governance. But there's a natural friction that occurs, and I wonder if you can talk about the balance sheet, if you will, between information as an asset and information as a liability. You know, the more restrictions you put on that information, the more it constricts a business user's ability. So how do you see that shaping up? >> I think it's often a people and process problem, not necessarily a technology problem. I don't think as an industry we've figured it out; certainly a lot of our clients haven't figured out that balance. I mean, there are plenty of conversations I'll go into where I'll talk to a data science team and, in the same line of business, a governance team, and what the data science team will tell us is: I'm building my own data catalog, because the stuff that the governance guys are doing doesn't help me.
And the reason why it doesn't help me is because they're going through this top-down data curation methodology, and I've got a question: I need to go find the data that's relevant, and I might not know what that is straight away. So the CDO function in a lot of organizations is helping bridge that. You'll see governance responsibilities line up with the CDO, with analytics, and I think that's gone a long way to bridge that gap. But the conversation I was just mentioning is not unique to one or two customers; still a lot of customers are having it. Often customers that either haven't started a CDO practice or are still early days on it. >> So about that, because this is being introduced to the workplace as a fairly new concept, right? Fairly new CDOs, as opposed to CIOs or CTOs. I mean, how do you talk to your clients about trying to broaden their perspective on that, and, I guess, emphasizing the need for them to consider giving somebody sole, or primary, responsibility for their data, instead of just lumping it in somewhere else? >> So we happen to have one of the best CDOs inside of our group, which is like a handy tool for me. If I go into a client and it's purporting to be a data science problem, and it turns out they have a data management issue around data discovery, and they haven't yet figured out how to install the process and people design to solve that particular issue, one of the key things I'll do is bring in our CDO and his delegates to have a conversation with them about what we're doing inside of IBM and what we're seeing in other customers, to help institute that practice inside of their own organization. We have forums like the CDO event in Boston last week, which are designed not to be "here's what IBM can do in technology," but to say: here's how the discipline impacts your business, and here are some best practices you should apply.
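The machine-assisted curation Daniel described a moment ago — the machine suggests business terms, a human steward confirms — can be sketched very simply. This is not an IBM product API: the term dictionary, column names, and function below are invented, and a real system would use a trained classifier rather than keyword evidence.

```python
# Hypothetical sketch (invented names, not an IBM API): the machine *suggests*
# business terms for a column; a human steward confirms or rejects. A real
# system would train a classifier; this toy scores simple keyword evidence.

TERM_EVIDENCE = {
    "customer": {"cust", "customer", "client"},
    "email": {"email", "e_mail", "mail_addr"},
    "revenue": {"rev", "revenue", "sales"},
}

def suggest_terms(column_name, sample_values=()):
    """Return candidate business terms for a column, best evidence first."""
    tokens = set(column_name.lower().split("_"))
    tokens |= {v.lower() for v in sample_values}
    scored = []
    for term, evidence in TERM_EVIDENCE.items():
        hits = len(evidence & tokens)
        if hits:
            scored.append((hits, term))
    # The steward sees ranked suggestions, not machine-asserted facts.
    return [term for hits, term in sorted(scored, reverse=True)]

print(suggest_terms("cust_email_addr"))
print(suggest_terms("sales_amt_2017"))
```

The point of the design is the division of labor Daniel describes: the machine does the mundane harvesting and suggesting, while the steward keeps final say over the canonical model.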
So if, ultimately, I enter into those conversations and find that there's a need, I'll typically say: alright, tools are part of the problem but not the only issue; let me bring someone in who can describe the people and process related issues, which you've got to get right in order for, in some cases, the tools that I deliver to matter. >> We had Seth Dobrin on last weekend in Boston, and Inderpal Bhandari as well, and he put forth this enterprise, sort of, data blueprint if you will. CDOs are sort of-- >> Daniel: We're using that in IBM, by the way. >> Well, this is the thing. It's a really well-thought-out structure that seems to be trickling down to the divisions. And so it's interesting to hear how you're applying Seth's expertise. I want to ask you about the Hortonworks relationship. You guys made a big deal about that this summer. To me it was a no-brainer. Really, what was the point of IBM having a Hadoop distro? And Hortonworks gets this awesome distribution channel. IBM has always had an affinity for open source, so that made sense there. What's behind that relationship and how's it going? >> It's going awesome. Perhaps what we didn't say, and probably should have focused on, is the why-customers-care aspect. There are three main buying-occasion use cases that customers are implementing where, even before the relationship, they were asking IBM and Hortonworks to work together. And so we were coming to the table working together as partners before the deeper collaboration we started in June. The first one was bringing data science to Hadoop: running data science models, doing data exploration where the data is. And if you were to actually rewind the clock on the IBM side and consider what we did with Hortonworks in light of what we did prior, we brought the Data Science Experience and machine learning to Z in February. The highest-value transactional data was there.
The next step was bringing data science to what is often, for a lot of clients, the second most valuable set of data, which is Hadoop. So that was kind of part one. And then we've kind of continued that by bringing data science experience to the private cloud. So that's one use case. I've got a lot of data, I need to do data science, I want to do it in residence, I want to take advantage of the compute grid I've already laid down, and I want to take advantage of the performance benefits and the integrated security and governance benefits by having these things co-located. That's kind of play one. So we're bringing data science experience and HDP and HDF, which are the Hortonworks distributions, way closer together and optimized for each other. Another component of that is not all data is going to be in Hadoop as we were describing. Some of it's in an EDW and that data science job is going to require data outside of Hadoop, and so we brought big SQL. It was already supporting Hortonworks, we just optimized the stack, and so the combination of data science experience and big SQL allows you to do data science against a broader surface area of data. That's kind of play one. Play two is I've got an EDW, and either for cost or agility reasons I want to augment it, or in some cases I might want to offload some data from it to Hadoop. And so the combination of Hortonworks plus big SQL and our data integration technologies are a perfect combination there, and we have plenty of clients using that for kind of analytics offloading from the EDW. And then the third piece that we're doing quite a bit of engineering and go-to-market work around is governed data lakes. So I want to enable self service analytics throughout my enterprise. I want self service analytics tools in the hands of everyone that should have access. I want to make data available to them, but I want that data to be governed so that they can discover what's in the lake, and whatever I give them is what they should have access to.
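Daniel's governed-lake requirement (let people discover what's in the lake, but hand each user only what they're entitled to) boils down to a catalog plus a policy check. A minimal Python sketch, with every dataset name, role, and column invented for illustration:

```python
# Toy governed data lake: a catalog for self-service discovery, and a
# policy table that trims every read down to the columns a role may see.
# All names here are hypothetical, not from any IBM product.
CATALOG = {
    "claims": {
        "columns": ["claim_id", "diagnosis", "amount", "patient_name"],
        "description": "insurance claims, 2017",
    }
}

# Governance policy: which roles may see which columns of which dataset.
POLICY = {
    "analyst": {"claims": {"claim_id", "diagnosis", "amount"}},
    "auditor": {"claims": {"claim_id", "amount", "patient_name"}},
}

def discover(term):
    """Self-service discovery: find datasets whose description matches."""
    return [name for name, meta in CATALOG.items() if term in meta["description"]]

def read(role, dataset, rows):
    """Return rows filtered to the columns this role is entitled to."""
    allowed = POLICY.get(role, {}).get(dataset, set())
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

rows = [{"claim_id": 1, "diagnosis": "A10", "amount": 250.0, "patient_name": "Doe"}]
print(discover("claims"))            # ['claims']
print(read("analyst", "claims", rows))
# [{'claim_id': 1, 'diagnosis': 'A10', 'amount': 250.0}]
```

An unknown role falls through to an empty column set, so a user who isn't in the policy sees nothing, which is the "whatever I give them is what they should have access to" behavior in miniature.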
So those are kind of the three tracks that we're working with Hortonworks on, and all of them are delivering stunning results inside of clients. >> And so that involves actually some serious engineering as well-- >> Big time. It's not just sort of a Barney deal or just a pure go to market-- >> It's certainly more than marketecture, and it just works. >> Big picture down the road then. Whatever challenges that you see on your side of the business for the next 12 months. What are you going to tackle, what's that monster out there that you think okay, this is our next hurdle to get by? >> I forgot if Rob said this before, but you'll hear him say often, and it's statistically proven, that the majority of the data that's available is not available to be Googled, so it's behind a firewall. And so we started last year with the Watson data platform creating an integrated data analytics system. What if customers have data that's on-prem that they want to take advantage of, what if they're not ready for the public cloud? How do we deliver public cloud benefits to them when they want to run that workload behind a firewall? So we're doing a significant amount of engineering, really starting with the work that we did on data science experience. Bringing it behind the firewall, but still delivering similar benefits you would expect if you're delivering it in the public cloud. A major advancement that IBM made is IBM Cloud Private. I don't know if you guys are familiar with that announcement. We made it, I think, two weeks ago. So it's a (mumbles) foundation on top of which we have microservices, on top of which our stack is going to be made available. So when I think of kind of where the future is, you know our customers ultimately, we believe, want to run data and analytic workloads in the public cloud. How do we get them there, considering they're not there now, in a stepwise fashion that is sensible economically, project management-wise, and culturally?
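The Big SQL piece of those tracks is, at heart, query federation: one engine, one query, with the data left where it lives. As a toy stand-in, sqlite3's ATTACH can play the role of the federation layer; the table names and figures below are invented:

```python
import sqlite3

# One connection acts as the query engine; each ATTACHed in-memory
# database stands in for a separate store (an EDW and a Hadoop-side
# table, say). This is a sketch of the idea, not of Big SQL itself.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS edw")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

# The "EDW" holds curated customer records.
conn.execute("CREATE TABLE edw.customers (id INTEGER, segment TEXT)")
conn.executemany("INSERT INTO edw.customers VALUES (?, ?)",
                 [(1, "retail"), (2, "enterprise")])

# The "lake" holds raw clickstream events.
conn.execute("CREATE TABLE lake.events (customer_id INTEGER, clicks INTEGER)")
conn.executemany("INSERT INTO lake.events VALUES (?, ?)",
                 [(1, 12), (1, 3), (2, 7)])

# A single federated query joins across both stores in place.
rows = conn.execute("""
    SELECT c.segment, SUM(e.clicks)
    FROM edw.customers AS c
    JOIN lake.events AS e ON e.customer_id = c.id
    GROUP BY c.segment
    ORDER BY c.segment
""").fetchall()
print(rows)  # [('enterprise', 7), ('retail', 15)]
```

The point of the pattern is that neither store's data had to be copied out before the join; the engine reaches into both.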
Without having them having to wait. That's kind of the big picture, kind of a big problem space we're spending considerable time thinking through. >> We've been talking a lot about this on theCUBE in the last several months or even years: people realize they can't just take their business and stuff it into the cloud. They have to bring the cloud model to their data. Wherever that data exists. If it's in the cloud, great. And the key there is you've got to have a capability and a solution that substantially mimics that public cloud experience. That's kind of what you guys are focused on. >> What I tell clients is, if you're ready for certain workloads, especially green field workloads, and the capability exists in a public cloud, you should go there now. Because you're going to want to go there eventually anyway. And if not, then a vendor like IBM helps you take advantage of that behind a firewall, often in form factors that are ready to go. The integrated analytics system, I don't know if you're familiar with that. That includes our super advanced data warehouse, the data science experience, our query federation technology powered by big SQL, all in a form factor that's ready to go. You get started there for data and data science workloads and that's a major step in the direction of the public cloud. >> Alright well Daniel, thank you for the time, we appreciate that. We didn't get to touch at all on baseball, but next time right? >> Daniel: Go Cubbies. (laughing) >> Sore spot with me but it's alright, go Cubbies. Alright, Daniel Hernandez from IBM, back with more here from Data Science For All, IBM's event here in Manhattan. Back with more in theCUBE in just a bit. (electronic music)

Published Date : Nov 1 2017



Day One Kickoff | PentahoWorld 2017


 

>> Narrator: Live from Orlando, Florida, it's theCUBE. Covering Pentaho World 2017. Brought to you by Hitachi Vantara. >> We are kicking off day one of Pentaho World. Brought to you, of course, by Hitachi Vantara. I'm your host, Rebecca Knight, along with my co-hosts. We have Dave Vellante and James Kobielus. Guys, I'm thrilled to be here in Orlando, Florida. Kicking off Pentaho World with theCUBE. >> Hey Rebecca, twice in one week. >> I know, this is very exciting, very exciting. So we were just listening to the keynotes. We heard a lot about the big three, the power of the big three. Which is internet of things, predictive analytics, big data. So the question for you both is where is Hitachi Vantara in this marketplace? And are they doing what they need to do to win? >> Well so the first big question everyone is asking is what the heck is Hitachi Vantara? (laughing) What is that? >> Maybe we should have started there. >> We joke, some people say it sounds like a SUV, Japanese company, blah blah blah. When we talked to Brian-- >> Jim: A well engineered SUV. >> So Brian Householder told us, well you know it really is about vantage and vantage points. And when you listen to their angles on insights and data, anywhere and however you want it. So they're trying to give their customers an advantage and a vantage point on data and insights. So that's kind of interesting and cool branding. The second big point, I think, is Hitachi has undergone a massive transformation itself. Certainly Hitachi America, which is really not a brand they use anymore, but Hitachi Data Systems. Brian Householder talked in his keynote about how, when he came in 14 years ago, Hitachi was 80 percent hardware, and infrastructure, and storage. And they've transformed that. They were about 50/50 last year, in terms of infrastructure versus software and services. But what they've done, in my view, is taken now the next step.
I think Hitachi has said, alright listen, storage is going to the cloud, Dell and EMC are knocking each other's heads off, China is coming into play. Do we really want to try and dominate that business? Rather, why don't we play from our strengths? Which is devices, internet of things, the industrial internet. So they bought Pentaho two years ago, and we're going to talk more about that, to bring in an analytics platform. And this sort of marries IT and OT, information technology and operational technology, together to go attack what is a trillion dollar marketplace. >> That's it, so Pentaho was a very strategic acquisition. For Hitachi, of course, Hitachi Data Systems plus Hitachi Insight Group plus Pentaho equals Hitachi Vantara. Pentaho was one of the pioneering vendors more than a decade ago in the whole open source analytics arena. If you cast your mind back to the middle of the millennium's first decade, open source was starting to come into its own. Of course, we already had Linux and so forth, but in terms of the data world, we're talking about the pre-Hadoop era, the pre-Spark era. We're talking about the pre-TensorFlow era. Pentaho, I should say at that time. Which is, by the way, now a product group within Hitachi Vantara. It's not a stand alone company. Pentaho established itself as the spearhead for open-source predictive analytics and data mining. They championed something called Weka, which is an open-source data mining toolkit that was actually developed initially in New Zealand. It became the core of their offering to market, and in many ways they became very much a core player in terms of analytics as a service and so forth, but very much established themselves, Pentaho, as an up and coming solution provider taking a more or less by-the-book open source approach for delivering solutions to market. But they were entering a market that was already fairly mature in terms of data mining. Because you are talking about the mid-2000s.
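For a flavor of that era's data mining, here is a sketch of OneR, one of the classic rule learners shipped in the Weka toolkit: pick the single attribute whose one-level rule makes the fewest errors on the training data. The toy weather rows are invented, loosely after Weka's classic weather.nominal example:

```python
from collections import Counter, defaultdict

def one_r(rows, target):
    """OneR in miniature: for each attribute, build a one-level rule that
    maps each value to its majority class, then keep the attribute whose
    rule makes the fewest errors on the training rows."""
    best = None
    attrs = [a for a in rows[0] if a != target]
    for attr in attrs:
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[target]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Toy nominal weather data (invented, in the style of weather.nominal).
data = [
    {"outlook": "sunny",    "windy": "false", "play": "no"},
    {"outlook": "sunny",    "windy": "true",  "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy",    "windy": "false", "play": "yes"},
    {"outlook": "rainy",    "windy": "true",  "play": "no"},
    {"outlook": "rainy",    "windy": "false", "play": "yes"},
]
attr, rule, errors = one_r(data, "play")
print(attr, rule, errors)
# outlook {'sunny': 'no', 'overcast': 'yes', 'rainy': 'yes'} 1
```

It is a deliberately humble algorithm, which is the point: rules this simple were a respectable baseline in the mid-2000s data mining market being described.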
You already had SAS, and SPSS, and some of the others that had been in that space, and done quite well, for a long time. And so cut ahead to the present day. Pentaho had evolved to incorporate some fairly robust data integration, data transformation, all ETL capabilities into their portfolio. They had become a big data player in their own right, with a strong focus on embedded analytics, as the keynoters indicated this morning. There's a certain point where in this decade it became clear that they couldn't go any further, in terms of differentiating themselves, in this space. In a space that's dominated by Hadoop and Spark, and AI things like TensorFlow. Unless they were part of a more diversified solution provider that offered, I think the critical thing was, the edge orientation of the industrial internet of things. Which is really where many of the opportunities are now for a variety of new markets that are opening up, including autonomous vehicles, which was the focus of here all-- >> Let's clarify some things a little bit. So Pentaho actually started before the whole Hadoop movement. >> Yeah, yeah. >> That's kind of interesting. You know they were a young company when Hadoop just started to take off. And they said alright, we can adopt these techniques and processes as well. So they weren't true legacy, right? >> Jim: No. >> So they were able to ride that sort of modern wave. But essentially they're in the business of data, I call it data management. And maybe that's not the right term. They do ingest, they're doing ETL, transformation anyway. They're embedding, they've got analytics, they're embedding analytics. Like you said, they're building on top of Weka. >> James: In the first flush of BI as a hot topic in the market in the mid-2000s, they became a fairly substantial BI player. That actually helped them to grow in terms of revenue and customers. >> So they're one of those companies that touches on a lot of different areas. >> Yes.
>> So who do we sort of compare them to? Obviously, what you think of guys like Informatica. >> Yeah, yeah. >> Who do heavy ETL. >> Yes. You mentioned BI, you mentioned before, guys like SAS. What about Tableau? >> Well, BI would be like, there's Tableau, and QlikView and so forth. But there's also very much-- >> Talend. >> Cognos under IBM. And, of course, there's the BusinessObjects portfolio under SAP. >> David: Right. And Talend would be? >> In fact I think Talend is in many ways the closest analog >> Right. >> to Pentaho in terms of a predominantly open-source go to market approach, that involves both the robust data integration and cleansing and so forth on the back end, and also a deep dive of open source analytics on the front end. >> So their differentiation, they sort of claim, is they're sort of end to end integration. >> Jim: Yeah. >> Which is something we've been talking about at Wikibon for a while. And George is doing some work there, you probably are too. It's an age old thing in software. Do you do best-of-breed or do you do sort of an integrated suite? Now the interesting thing about Pentaho is, they don't own their own cloud. Hitachi Vantara doesn't own their own cloud. So they do a lot of, it's an integrated pipeline, but it doesn't include its own database and other tooling. >> Jim: Yeah. >> Right, and so there is an interesting dynamic occurring that we want to talk to Donna Perlik about obviously, which is how they position relative to roll-your-own. And then how they position, sort of, in the cloud world. >> And we should ask also how are they positioning now in the world of deep learning frameworks? I mean they don't provide, near as I know, their own deep learning frameworks to compete with the likes of TensorFlow, or MXNet, or CNTK and so forth. So where are they going in that regard? I'd like to know.
I mean there are some others that are big players in this space, like IBM, who don't offer their own deep learning framework, but support more than one of the existing frameworks in a portfolio that includes much of the other componentry. So in other words, what I'm saying is you don't need to have your own deep learning framework, or even your own open-source deep learning code base, to compete in this new marketplace. And perhaps Pentaho, or Hitachi Vantara, roadmapping, maybe they'll take an IBM-like approach, where they'll bundle support, or incorporate support, for two or more of these third party tools, or open source code bases, into their solution. Weka is not theirs either. It's open source. I mean Weka is an open source tool that they've supported from the get go. And they've done very well by it. >> It's just kind of like early day machine learning. >> David: Yeah. >> Okay, so we've heard about Hitachi's transformation internally. And then their messaging today was, of course-- >> Exactly, that's where I really wanted to go next. We're talking about it from the product and the technology standpoint, but one of the things we kept hearing about today was this idea of the double bottom line. And this is how Hitachi Vantara is really approaching the marketplace, by really focusing on better business, better outcomes for their customers. And obviously for Hitachi Vantara, too, but also for bettering society. And that's what we're going to see on theCUBE today. We're going to have a lot of guests who will come on and talk about how they're using Pentaho to solve problems in healthcare data, in keeping kids from dropping out of college, in getting computing and other kinds of internet power to underserved areas. I think that's another really important approach that Hitachi Vantara is taking in its model.
>> The fact that Hitachi Vantara, I know, has had the Pentaho solution on the market for so long, and they have such a wide range of reference customers all over the world, in many verticals. >> Rebecca: That's a great point. >> The most vertical. Willing to go on camera and speak at some length about how they're using it inside their business and so forth. Speaks volumes about a solution provider. Meaning, they do good work. They provide good offerings. These are companies that have invested a lot of money in them, and are willing to vouch for them. That says a lot. >> Rebecca: Right. >> And so the acquisition was in 2015. I don't believe it was a public number. It's Hitachi Limited. I don't think they had to report it, but the number I heard was about a half a billion. >> Jim: Uh-hm >> Which for a company with the potential of Pentaho, is actually pretty cheap, believe it or not. You see a lot of unicorns, billion dollar plus companies. But the more important thing is it allows Hitachi to further its transformation and really go after this trillion dollar business. Which is really going to be interesting to see how that unfolds. Because while Hitachi has a long-term view, it always takes a long-term view, you still got to make money. It's fuzzy, how you make money in IoT these days. Obviously, you can make money selling devices. >> How do you make money, open source anything? You know, so yeah. >> But they're sort of open source, with a hybrid model, right? >> Yeah. >> And we talked to Brian about this. There's a proprietary component in there so they can make their margin. At Wikibon, we see this three tier model emerging. A data model where you've got the edge and some analytics, real time analytics at the edge, and maybe you persist some of that data, but they're low cost devices. And then there's a sort of aggregation point, or a hub. I think Pentaho today called it a gateway. Maybe it was Brian from Forrester.
A gateway where you're sort of aggregating data, and then ultimately the third tier is the cloud. And that cloud, I think, vectors into two areas. One is on-prem and one is public cloud. What's interesting is Brian from Forrester basically said that puts the nail in the coffin of on-prem analytics and on-prem big data. >> Uh-hm >> I don't buy that. >> I don't buy that either. >> No, I think the cloud is going to go to your data. Wherever the data lives. The cloud model of self-service and agile and elastic is going to go to your data. >> Couple of weeks ago, of course, we at Wikibon did a webinar for our customers all around the notion of a true private cloud. And Dave, of course, and Peter Burris were on it. Explaining that hybrid clouds, of course, public and private play together. But where the cloud experience migrates to where the data is. In other words, that data will be both in public and in private clouds. But you will have the same reliability, high availability, scalability, ease of programming, and so forth, wherever you happen to put your data assets. In other words, many companies we talk to do this. They combine zonal architecture. They'll put some of their resources, like some of their analytics, in the private cloud for good reason. The data needs to stay there for security and so forth. But much in the public cloud where it's way cheaper quite often. Also, they can improve service levels for important things. What I'm getting at is that the whole notion of a true private cloud is critically important to understand, and that it's all datacentric. It's all gravitating to where the data is. And really analytics are gravitating to where the data is. And increasingly the data is on the edge itself. It's on those devices where much of it is being persisted. Because there's no need to bring much of the raw data to the gateway or to the cloud. If you can do the predominant bulk of the inferencing on that data at edge devices.
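That split, the heavy lifting done centrally and the bulk of the inferencing done at the edge, can be sketched in a few lines. A tiny perceptron stands in for whatever model actually gets trained centrally; the data and the AND-style decision rule are invented for illustration:

```python
# Sketch of the central-training / edge-inference split. The "central"
# side fits a model; only the learned weights are shipped to the "edge",
# which runs inference locally with no training data or training code.
def train_central(samples, epochs=20, lr=0.1):
    """Central side: fit a perceptron (labels in {0, 1})."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return {"w": w, "b": b}          # the artifact "pushed" to the edge

def edge_infer(model, x1, x2):
    """Edge side: just the weights, applied to local readings."""
    return 1 if model["w"][0] * x1 + model["w"][1] * x2 + model["b"] > 0 else 0

# Central: learn a simple AND-like decision rule, then ship the weights.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
model = train_central(samples)
print([edge_infer(model, x1, x2) for (x1, x2), _ in samples])  # [0, 0, 0, 1]
```

The raw readings never leave `edge_infer`'s side of the line, which is exactly why the three-tier model keeps most of the data at the edge.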
And more and more the inferencing, to drive things like face recognition from your Apple phone, is happening on the edge. Most of the data will live there, and most of the analytics will be developed centrally, then trained centrally and pushed to those edge devices. That's the way it's working. >> Well, it is going to be an exciting conference. I can't wait to hear more from all of our guests, and both of you, Dave Vellante and Jim Kobielus. I'm Rebecca Knight, we'll have more from theCUBE's live coverage of Pentaho World, brought to you by Hitachi Vantara, just after this.

Published Date : Oct 26 2017



Day Two Kickoff | Big Data NYC


 

(quiet music) >> I'll open that while he does that. >> Co-Host: Good, perfect. >> Man: All right, rock and roll. >> This is Robin Matlock, the CMO of VMware, and you're watching theCUBE. >> This is John Siegel, VP of Product Marketing at Dell EMC. You're watching theCUBE. >> This is Matthew Morgan, I'm the chief marketing officer at Druva and you are watching theCUBE. >> Announcer: Live from midtown Manhattan, it's theCUBE. Covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (rippling music) >> Hello, everyone, welcome to a special CUBE live presentation here in New York City for theCUBE's coverage of BigData NYC. This is where all the action's happening in the big data world, machine learning, AI, the cloud, all kind of coming together. This is our fifth year doing BigData NYC. We've been covering the Hadoop ecosystem, Hadoop World, since 2010; it's our eighth year really at ground zero for the Hadoop, now the BigData, now the Data Market. We're doing this also in conjunction with Strata Data, which was Strata Hadoop. That's a separate event with O'Reilly Media, we are not part of that, we do our own event, our fifth year doing our own event, we bring in all the thought leaders. We bring all the influencers, meaning the entrepreneurs, the CEOs, to get the real story about what's happening in the ecosystem. And of course, we do it with our analysts at Wikibon.com. I'm John Furrier with my cohost, Jim Kobielus, who's the chief analyst for our data piece. Lead analyst Jim, you know the data world's changed. We had commentary yesterday, all up on YouTube.com/SiliconAngle. Day one really set the table. And we kind of get the whiff of what's happening, we can kind of feel the trend, we've got a finger on the pulse.
Two things going on; two big notable stories are the world continuing to expand around community and hybrid data and all these cool new data architectures, and the second kind of substory is that the O'Reilly show has become basically a marketing machine. They're making millions of dollars over there. A lot of people were, last night, kind of not happy about that, and about what's being given back to the community. So, again, the community theme is still resonating strong. You're starting to see that move into the corporate enterprise, which you're covering. What are you finding out, what did you hear last night, what are you hearing in the hallways? What are kind of the tea leaves that you're reading? What are some of the things you're seeing here? >> Well, all things hybrid. I mean, first of all it's building hybrid applications for hybrid cloud environments and there's various layers to that. So yesterday on theCUBE we had, for example, one layer is hybrid semantic virtualization layers, which are critically important for bridging workloads and microservices and data across public and private clouds. We had, from AtScale, Bruno Aziza and one of his customers discussing what they're doing. I'm hearing a fair amount of this venerable topic of semantic data virtualization becoming even more important now in the era of hybrid clouds. That's a fair amount of the scuttlebutt in the hallway and atrium talks that I participated in. Also yesterday from BMC we had Basil Faruqi talking about automating data pipelines. There are data pipelines in hybrid environments. Very, very important for DevOps, productionizing these hybrid applications for these new multi-cloud environments. That's quite important. Hybrid data platforms of all sorts. Yesterday we had, from Actian, Jeff Veis discussing their portfolio for on-prem, public cloud, putting the data in various places, and speeding up the queries and so forth. So hybrid data platforms are increasingly going streaming and real time.
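Those automated data pipelines are, at their simplest, an ordered chain of named stages, each feeding the next. A toy runner makes the shape concrete; the stage names and logic here are invented:

```python
# Minimal data pipeline runner: run declared stages in order, passing
# each stage's output to the next, and log completion of each stage.
def run_pipeline(steps, payload):
    for name, step in steps:
        payload = step(payload)
        print(f"completed: {name}")
    return payload

steps = [
    ("ingest",    lambda _: [" 42", "7 ", "x", "19"]),   # pull raw records
    ("cleanse",   lambda rows: [r.strip() for r in rows if r.strip().isdigit()]),
    ("transform", lambda rows: [int(r) for r in rows]),
    ("load",      lambda rows: {"loaded": len(rows), "total": sum(rows)}),
]
result = run_pipeline(steps, None)
print(result)  # {'loaded': 3, 'total': 68}
```

Production schedulers add retries, dependencies, and scheduling on top, but the ingest-cleanse-transform-load chain is the core being automated.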
What I'm getting, what I'm hearing, is that more and more a layering of these hybrid environments is a critical concern for enterprises trying to put all this stuff together, and future-proof it so they can add on all the new stuff that's coming along, like cirrus clouds, without breaking interoperability, and without having to change code. Just plug and play in a massively multi-cloud environment. >> You know, and also I'm critical of a lot of things that are going on. 'Cause to your point, the reason why I'm kind of critical on the O'Reilly show, and particularly the hype factor going on in some areas, is two kinds of trends I'm seeing with respect to the owners of some of the companies. You have one camp that are kind of groping for solutions, and you'll see that with them whitewashing new announcements; this is going on here. It's really kind of-- >> Jim: I think it's AI now, by the way. >> And they're AI-washing it, but you can, the tell sign is they're always kind of doing a magic trick of some type of new announcement, something's happening, you got to look underneath that, and say where is the deal for the customers? And you brought this up yesterday with Peter Burris, which is that the business side of it is really the conversation now. It's not about the speeds and feeds and the cluster management. It's certainly important, and those solutions are maturing. That came up yesterday. The other thing that you brought up yesterday I thought was notable was the real emphasis on the data science side of it. And it's that it's still not easy for data scientists to do their job. And this is where you're seeing productivity conversations come up with data science. So, really the emphasis at the end of the day boils down to this. If you don't have any meat on the bone, you don't have a solution where the rubber hits the road, where you can come in and provide a tangible benefit to a company, an enterprise, then it's probably not going to work out.
And we kind of had that tool conversation, you know, as people start to grow. And so as buyers out there, they've got to look, and kind of squint through it, saying where's the real deal? So that kind of brings up what's next? Who's winning? How do you, as an analyst, look at the playing field and say, that's good, that's got traction, that's winning, mm, not too sure? What's your analysis? How do you tell the winners from the losers, and what's your take on this from the data science lens? >> Well, first of all you can tell the winners when they have an ample number of reference customers who are doing interesting things. Interesting enough to get a jaded analyst to pay attention. Doing something that changes the fabric of work or life, whatever, clearly. Solution providers who can provide that have all the hallmarks of a winner, meaning they're making money, and they're likely to grow and so forth. But also the hallmarks of a winner are those, in many ways, who have a vision and catalyze an ecosystem around that vision of something that maybe could be done before, but not nearly as efficiently. So you know, for example, what we're seeing now in the whole AI space, deep learning, is, you know, AI means many things. The core right now, in terms of the buzzy stuff, is deep learning for being able to process real time streams of video, images and so forth. And so, what we're seeing now is that the vendors who appear to be on the verge of being winners are those who use deep learning inside some new innovation that has enough appeal to a potential mass market. It's something you put on your, like an app or something you put on your smart phone, or it's something you buy at Walmart and install in your house. You know, the whole notion of, clearly, Alexa and all that stuff.
Anything that takes chatbot technology, really deep learning-powered chatbots, and is able to drive a conversational UI into things that you wouldn't normally expect to talk to you, and does it well, in a way that people have to have it. Those are the vendors that I'm looking for, in terms of those are the ones that are going to make a ton of money selling to a mass market, and possibly, once they go there, they're building out a revenue stream and a business model that they can conceivably take into other markets, especially business markets. You know, like Amazon, 20-something years ago when they got started in the consumer space as the exemplar of web retailing, who expected them 20 years later to be a powerhouse provider of business cloud services? You know, so we're looking for the Amazons of the world that can take something as silly as a conversational UI, driven by DL, inside of a consumer appliance, and 20 years from now, maybe even sooner, become a business powerhouse. So that's what's new. >> Yeah, the thing that comes up that I want to get your thoughts on is that we've seen data integration become a continuing theme. The other thing about the community play here is you start to see customers align with syndicates or partnerships, and I think it's always been great to have customer traction, but, as you pointed out, as a benchmark. But now you're starting to see the partner equation, because this is an open, decentralized, distributed internet these days, and it's looking like it's going to form differently than the way it did in the web days, with mobile and connected devices, IoT and AI. A whole new infrastructure is developing, so you're starting to see people align with partnerships. So I think that's something that's signaling to me that the partnership is amping up. I think people are partnering more.
We've had Hortonworks on with IBM, people are partnering; some people take a Switzerland approach where they partner with everyone. You had, WANdisco partners with all the cloud guys, I mean, they have unique IP. So you have this model where you got to go out, do something, but you can't do it alone. Open source is a key part of this, so obviously that's part of the collaboration. This is a key thing. And then they're going to check off the boxes. Data integration, deep learning is a new way to kind of dig deeper. So the question I have for you is, the impact on developers, 'cause if you can connect the dots between open source, 90% of the software written will be already open source, 10% differentiated, and then the role of how people are going to market with the enterprise through partnership, you can almost connect the dots and say it's kind of a community approach. So that leaves the question, what is the impact to developers? >> Well the impact to developers, first of all, is when you go to a community approach, and like some big players are going more community and partnership-oriented in hot new areas, like if you look at some of the recent announcements in chatbots and those technologies, we have sort of a rapprochement between Microsoft and Facebook and so forth, or Microsoft and AWS. The impact for developers is that there's convergence among the companies that might have competed to the death in particular hot new areas, like you know, like I said, chatbot-enabled apps for mobile scenarios. And so it cuts short the platform wars fairly quickly, harmonizes around a common set of APIs for accessing a variety of competing offerings that really overlap functionally in many ways.
For developers, it's simplification around a broader ecosystem, where it's not so much competition on the underlying open source technologies, it's now competition to see who penetrates the mass market with actually valuable solutions that leverage one or more of those erstwhile competitors into some broader synthesis. You know, for example, the whole ramp up to the future of self-driving vehicles, and it's not clear who's going to dominate there. Will it be the vehicle manufacturers that are equipping their cars with all manner of computerized everything to do whatnot? Or will it be the up-and-comers? Will it be the computer companies like Apple and Microsoft and others who get real deep and invest fairly heavily in self-driving vehicle technology, and become themselves the new generation of automakers in the future? So, what we're getting is that going forward, developers want to see these big industry segments converge fairly rapidly around broader ecosystems, where it's not clear who will be the dominant player in 10 years. The developers don't really care, as long as there is consolidation around a common framework to which they can develop fairly soon. >> And open source obviously plays a key role in this, and how is deep learning impacting some of the contributions that are being made, because we're starting to see the competitive advantage in collaboration on the community side is with the contributions from companies. For example, you mentioned TensorFlow multiple times yesterday, from Google. I mean, that's a great contribution. If you're a young kid coming into the developer community, I mean, this is not normal. It wasn't like this before. People just weren't donating massive libraries of great stuff already pre-packaged. So all new dynamics are emerging. Is that putting pressure on Amazon, is that putting pressure on AWS and others? >> It is.
First of all, there is a fair amount of, I wouldn't call it first-mover advantage for TensorFlow, there've been a number of DL toolkits on the market, open source, for the last several years. But they achieved the deepest and broadest adoption most rapidly, and now TensorFlow is essentially a de facto standard, in the way that, if we go back, betraying my age, 30, 40 years ago, you had two companies called SAS and SPSS that quickly established themselves as the go-to statistical modeling tools. And then they got a generation, our generation, of developers, or at least of data scientists, what became known as data scientists, to standardize: you're either going to go with SAS or SPSS if you're going to do data mining. Cut ahead to the 2010s now. The new generation of statistical modelers, it's all things DL and machine learning. And so SAS versus SPSS is ages ago; those companies, those products still exist. But now, what are you going to get hooked on in school? What are you going to get hooked on in high school, for that matter, when you're just hobby-shopping DL? You'll probably get hooked on TensorFlow, 'cause they have the deepest and the broadest open source community where you learn this stuff. You learn the tools of the trade, you adopt that tool, and everybody else in your environment is using that tool, and you got to get up to speed. So the fact is, that broad adoption early on in a hot new area like DL means tons. It means that essentially TensorFlow is the new Spark, where Spark, you know, once again, Spark just in the past five years came out real fast. And it's been eclipsed, as it were, on the stack of cool by TensorFlow. But it's a deepening stack of open source offerings. So the new generation of developers with data science workbenches, they just assume that there's Spark, and they're going to increasingly assume that there's TensorFlow in there.
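To make the point about what toolkits like TensorFlow standardize a little more concrete, here is a minimal sketch of the training-loop pattern at the heart of any of these frameworks: forward pass, loss gradient, parameter update. It is written in plain Python with made-up data, since no particular toolkit is assumed here; the frameworks automate exactly this loop at scale.

```python
# A tiny gradient-descent fit of y = w*x + b, in plain Python.
# The data is invented; the loop structure (forward pass, gradient
# of the loss, parameter update) is what DL toolkits automate.

def fit_linear(xs, ys, lr=0.01, epochs=2000):
    """Fit w, b to minimize mean squared error via gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1, no noise
w, b = fit_linear(xs, ys)        # converges near w = 2, b = 1
```

The same shape of loop, with tensors instead of floats and automatic differentiation instead of hand-written gradients, is what a TensorFlow training step looks like.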
They're going to increasingly assume that there are the libraries and algorithms and models and so forth floating around in the open source space that they can use to bootstrap themselves fairly quickly. >> This is a real issue in the open source community, which we talked about when we were in LA for the Open Source Summit, exactly that. There are some projects that become fashionable, so for example the Cloud Native Computing Foundation, very relevant but also hot, really hot right now. A lot of people are jumping on board the cloud-native bandwagon, and rightfully so. A lot of work to be done there, and a lot of things to harvest from that growth. However, the boring blocking and tackling projects don't get all the fanfare but are still super relevant, so there's a real challenge of how do you nurture these awesome projects so they don't become like a nightclub where nobody goes anymore because it's not fashionable. Some of these open source projects are super important and have massive traction, but they're not as sexy, or flashy, as some of that. >> DL is not as sexy, or machine learning, for that matter, not as sexy as you would think if you're actually doing it, because the grunt work, John, as we know for any statistical modeling exercise, is data ingestion and preparation and so forth. That's 75% of the challenge for deep learning as well. But also for deep learning and machine learning, training the models that you build is where the rubber meets the road. You can't have a really strongly predictive DL model in terms of face recognition unless you train it against a fair amount of actual face data, whatever it is. And it takes a long time to train these models. That's what you hear constantly. I heard this constantly in the atrium talking-- >> Well that's a data challenge, is you need models that are adapting, and you need real time, and I think-- >> Oh, here-- >> This points to the real new way of doing things, it's not yesterday's model.
It's constantly evolving. >> Yeah, and that relates to something I read this morning, or maybe it was last night, that Microsoft has made a huge investment in AI and deep learning machinery. They're doing amazing things. And one of the strategic advantages they have as a large, established solution provider with a search engine, Bing, is that, from what I've read, I haven't talked to Microsoft in the last few hours to confirm this, Bing is a source of training data that they're using for machine learning and I guess deep learning modeling for their own solutions, or within their ecosystem. That actually makes a lot of sense. I mean, Google uses YouTube videos heavily in its deep learning for training data. So there's the whole issue of, if you're a pipsqueak developer, some, you know, I'm sorry, this sounds patronizing, some pimply-faced kid in high school who wants to get real deep on TensorFlow and start building and tuning these awesome kickass models to do face recognition, or whatever it might be: where are you going to get your training data from? Well, there are plenty of open source training datasets out there you can use, but it's what everybody's using. So, there's sourcing the training data, there's labeling the training data, and that's human-intensive, you need human beings to label it. There was a funny recent episode, or maybe it was a last-season episode, of Silicon Valley that was all about machine learning and building and training models. It was the hot dog, not hot dog episode, it was so funny. They bamboozle a class of college students, fictionally, to provide training data and to label the training data for this AI algorithm, it was hilarious. But where are you going to get the data? Where are you going to label it? >> Lot more work to do, that's basically what you're getting at. >> Jim: It's DevOps, you know, but it's grunt work.
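The labeling grunt work described here is easy to illustrate: even the simplest classifier is only as good as its hand-labeled training set, and you only find out how good by scoring it on labeled holdout data you didn't train on. A toy nearest-centroid "hot dog / not hot dog" sketch in plain Python, with all points and labels invented for illustration:

```python
# Why labeled data matters: a nearest-centroid classifier trained on a
# tiny hand-labeled set, then scored on a labeled holdout set.

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def train(labeled):
    """labeled: list of ((x, y), label). Returns one centroid per label."""
    by_label = {}
    for point, label in labeled:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, point):
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(model, key=lambda label: dist2(model[label], point))

train_set = [((1.0, 1.0), "hot dog"), ((1.2, 0.8), "hot dog"),
             ((4.0, 4.2), "not hot dog"), ((3.8, 4.0), "not hot dog")]
holdout = [((0.9, 1.1), "hot dog"), ((4.1, 3.9), "not hot dog")]

model = train(train_set)
accuracy = sum(predict(model, p) == y for p, y in holdout) / len(holdout)
```

Every tuple above had to be labeled by a human before any training could happen, which is the 75%-of-the-work point being made in the conversation; deep learning just needs vastly more of it.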
>> Well, we're going to kick off day two here. This is SiliconANGLE Media's theCUBE, our fifth year doing our own event, separate from O'Reilly Media but in conjunction with their event in New York City. It's gotten much bigger here in New York City. We call it BigData NYC, that's the hashtag. Follow us on Twitter. I'm John Furrier, with Jim Kobielus. We're here all day, we've got Peter Burris joining us later, head of research for Wikibon, and we've got great guests coming up, stay with us, be back with more after this short break. (rippling music)

Published Date : Sep 27 2017


Wrap Up | IBM Fast Track Your Data 2017


 

>> Narrator: Live from Munich, Germany, it's theCUBE, covering IBM, Fast Track Your Data. Brought to you by IBM. >> We're back. This is Dave Vellante with Jim Kobielus, and this is theCUBE, the leader in live tech coverage. We go out to the events. We extract the signal from the noise. We are here covering a special presentation, IBM's Fast Track Your Data, and we're in Munich, Germany. It's been a day-long session. We started this morning with a panel discussion with five senior level data scientists that Jim and I hosted. Then we did CUBE interviews in the morning. We cut away to the main tent. Kate Silverton did a very choreographed, scripted, but very well done, main keynote set of presentations. IBM made a couple of announcements today, and then we finished up theCUBE interviews. Jim and I are here to wrap. We're actually running on IBMgo.com. We're running live. Hilary Mason talking about what she's doing in data science, and also we got a session on GDPR. You got to log in to see those sessions. So go to IBMgo.com, and you'll find those. Hit the schedule and go to the Hilary Mason and GDPR channels, and check that out, but we're going to wrap now. Jim, two main announcements today. I hesitate to call them big announcements. I mean they were, you know, just kind of ... I think the word you used last night was perfunctory. You know, I mean, they're okay, but they're not game changing. So what did you mean? >> Well first of all, when you look at ... Though IBM is not calling this a signature event, it's essentially a signature event. They do these every June or so. You know, in the past several years, the signature events have had like a one-track theme, whether it be IBM announcing they're investing deeply in Spark, or IBM announcing that they're focusing on investing in R as the core language for data science development.
This year at this event in Munich, it's really a three-track event, in terms of the broad themes, and I mean they're all important tracks, but none of them is like game-changing. Perhaps IBM doesn't intend them to be, it seems like. One of which is obviously Europe. We're holding this in Munich. And a couple of things of importance to European customers, first and foremost GDPR. The deadline next year, in terms of compliance, is approaching. So sound the alarm, as it were. And IBM has rolled out compliance and governance tools you can download and go with, the information governance catalog and so forth. Now announcing the consortium with Hortonworks to build governance on top of Apache Atlas, but also IBM announcing that they've opened up a DSX center in England and a machine-learning hub here in Germany, to help their European clients, in those countries especially, to get deeper down into data science and machine learning, in terms of developing those applications. That's important for the audience, the regional audience here. The second track, which is also important, and I alluded to it, is governance. In all of its manifestations you need a master catalog of all the assets for building and maintaining and controlling your data applications and your data science applications. The catalog, the consortium, the various offerings IBM has announced and discussed in great detail. They've brought in customers and partners like Northern Trust to talk about the importance of governance, not just as a compliance mandate, but also as a potential strategy for monetizing your data. That's important. Number three is what I call cloud-native data applications, and how the state of the art in developing data applications is moving towards containerized and orchestrated environments that involve things like Docker and Kubernetes. The IBM DB2 developer community edition: been in the market for a few years. The latest version they announced today includes Kubernetes support.
Includes support for JSON. So it's geared towards a new generation of cloud and data apps. What I'm getting at: those three core themes are Europe, governance, and cloud-native data application development. Each of them is individually important, but none of them is a game changer. And one last thing. Data science and machine learning is one of the overarching envelope themes of this event. They've had Hilary Mason. A lot of discussion there. My sense is I was a little bit disappointed, because there wasn't any significant new announcement related to IBM evolving their machine learning portfolio into deep learning or artificial intelligence, in an environment where their direct competitors like Microsoft and Google and Amazon are making a huge push in AI, in terms of their investments. There was a bit of a discussion, and Rob Thomas got to it this morning, about DSX working with PowerAI, the IBM platform. I would like to hear more going forward about IBM investments in these areas. So I thought it was an interesting bunch of announcements. I'll backtrack on perfunctory. I'll just say it was good that they had this for a lot of reasons, but like I said, none of these individual announcements is really changing the game. In fact, like I said, I think I'm waiting for the fall to see where IBM goes in terms of doing something that's actually differentiating and innovative. >> Well I think that the event itself is great. You've got a bunch of partners here, a bunch of customers. I mean it's active. IBM knows how to throw a party. They always have. >> And the sessions are really individually awesome, I mean in terms of what you learn. >> The content is very good. I would agree. The two announcements that were sort of, you know, DB2, sort of what I call community edition. Simpler, easier to download. Even Dave can download DB2. I really don't want to download DB2, but I could, and play with it I guess.
You know I'm not a database guy, but those of you out there that are, go check it out. And the other one was the sort of unified data governance. They tried to tie it in. I think they actually did a really good job of tying it into GDPR. We're going to hear over the next, you know, 11 months, just a ton of GDPR readiness fear, uncertainty and doubt from the vendor community, kind of like we heard with Y2K. We'll see what kind of impact GDPR has. I mean it looks like it's the real deal, Jim. I mean it looks like, you know, this 4% of turnover penalty. The penalties are much more onerous than any other sort of, you know, regulation that we've seen in the past, where you could just sort of fluff it off. Say yeah, just pay the fine. I think you're going to see a lot of, well, pay the lawyers to delay this thing and battle it. >> And one of our people in theCUBE that we interviewed said it exactly right. It's like the GDPR is the inverse of Y2K. In Y2K everybody was freaking out. It was actually nothing when it came down to it. Whereas nobody on the street is really buzzing. I mean the average person is not buzzing about GDPR, but it's hugely important. And like you said, I mean some serious penalties may be in the works for companies that are not complying, companies not just in Europe, but all around the world who do business with European customers. >> Right, okay, so now bring it back to sort of machine learning, deep learning. You basically said to Rob Thomas, I see machine learning here. I don't see a lot of the deep learning stuff quite yet. He said stay tuned. You know, you were talking about TensorFlow and things like that. >> Yeah they supported that ... >> Explain. >> So Rob indicated that IBM very much, like with PowerAI and DSX, provides an open framework or toolkit for you, the developer, to plug in your preferred machine learning or deep learning toolkit of an open source nature.
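For anyone who does go download the DB2 developer community edition and wants a feel for what the JSON support is about, the basic pattern is storing whole JSON documents in a table column and pulling them back out as structured data. A rough sketch of that pattern, using Python's built-in SQLite as a stand-in, since no DB2 instance is assumed here and DB2's actual SQL/JSON functions differ:

```python
# Document-in-a-table sketch: store a JSON document in a relational
# table, retrieve it, and work with it as structured data. SQLite is
# used only as a stand-in for a JSON-capable relational engine.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute("INSERT INTO orders (doc) VALUES (?)",
             (json.dumps({"customer": "acme", "total": 42.5}),))

# Pull the document back out and treat it as structured data.
(doc_text,) = conn.execute("SELECT doc FROM orders WHERE id = 1").fetchone()
order = json.loads(doc_text)
```

In a real JSON-enabled engine you would also query inside the document with SQL/JSON functions rather than parsing it client-side; the table name and fields here are invented for illustration.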
And there's a growing range of open source deep learning toolkits beyond, you know, TensorFlow, including Theano and MXNet and so forth, that IBM is supporting within the overall DSX framework, but also within the PowerAI framework. In other words, they've got those capabilities. They're sort of burying that message under a bushel basket, at least in terms of this event. Also, one of the things that ... I said this to Mena Scoyal. Watson Data Platform, which they launched last fall, very important product. Very important platform for collaboration among data science professionals, in terms of the machine learning development pipeline. I wish there was more about the Watson Data Platform here, about where they're taking it, what the customers are doing with it. Like I said a couple of times, I see Watson Data Platform as very much a DevOps tool for the new generation of developers that are building machine learning models directly into their applications. I'd like to see IBM, going forward, turn Watson Data Platform into a true DevOps platform, in terms of continuous integration of machine learning and deep learning and other statistical models. Continuous training, continuous deployment, iteration. I believe that's where they're going, or where they're probably going. I'd like to see more. I'm expecting more along those lines going forward. What I just described, about DevOps for data science, is a big theme that we're focusing on at Wikibon, in terms of where the industry is going. >> Yeah, yeah. And I want to come back to that again, and get an update on what you're doing within your team, and talk about the research. Before we do that, I mean one of the things we talked about on theCUBE, in the early days of Hadoop, is that the guys who are going to make the money in this big data business are the practitioners. They're not going to see, you know, these multi-hundred billion dollar valuations come out of the Hadoop world. And so far that prediction has held up well.
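The continuous-training idea described here for a DevOps-style data science platform, retrain as new data arrives and promote a candidate only when it beats the deployed model on a fixed holdout set, can be sketched in a few lines of plain Python. The "model" below is just a running mean predictor and every number is invented for illustration:

```python
# Toy continuous-training loop: retrain on each new batch, promote
# the candidate model only if it scores better on a fixed holdout.

def train_mean_model(samples):
    return sum(samples) / len(samples)

def holdout_error(model, holdout):
    """Mean absolute error of a constant predictor on the holdout."""
    return sum(abs(model - v) for v in holdout) / len(holdout)

holdout = [10.0, 10.5, 9.5]           # fixed evaluation set
batches = [[3.0, 4.0], [8.0, 12.0], [10.0, 10.0, 11.0]]

deployed = None
seen = []
for batch in batches:
    seen.extend(batch)                 # continuous integration of new data
    candidate = train_mean_model(seen)
    if deployed is None or \
            holdout_error(candidate, holdout) < holdout_error(deployed, holdout):
        deployed = candidate           # promote: candidate beats deployed
```

Real platforms add versioning, rollback, and automated retraining triggers around this same promote-on-improvement loop, but the gating logic is the core of it.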
It's the Airbnbs and the Ubers and the Spotifys and the Facebooks and the Googles, the practitioners who are applying big data, that are crushing it and making all the money. You see Amazon now buying Whole Foods. That in our view is a data play. But who's winning here, in either the vendor or the practitioner community? >> Who's winning are the startups with a hot new idea that's changing, that's disrupting some industry, or set of industries, with machine learning, deep learning, big data, etc. For example, everybody's waiting, with bated breath, for, you know, self-driving vehicles. And as the ecosystem develops, somebody's going to clean up. One or more companies, companies we've probably never heard of, leveraging everything we're describing here today, data science and containerized distributed applications that involve, you know, deep learning for image analysis and sensor analysis and so forth. Putting it all together in some new fabric that changes the way we live on this planet. But as you said, the platforms themselves, whether they be Hadoop or Spark or TensorFlow, whatever, they're open source. You know, and the fact is, by its very nature, open source based solutions, in terms of profit margins on selling those, inexorably migrate to zero. So you're not going to make any money as a tool vendor, or a platform vendor. You got to make money ... If you're going to make money, you make money, for example, from providing an ecosystem within which innovation can happen. >> Okay, we have a few minutes left. Let's talk about the research that you're working on. What's exciting you these days? >> Right, right. So I think a lot of people know I've been around the analyst space for a long, long time. I've joined the SiliconANGLE Wikibon team just recently. I used to work for a very large solution provider, and what I do here for Wikibon is I focus on data science as the core of next-generation application development.
When I say next-generation application development, it's the development of AI, deep learning, machine learning, and the deployment of those data-driven statistical assets into all manner of applications. And you look at the hot stuff, like chatbots for example, transforming the experience in e-commerce on mobile devices. Siri and Alexa and so forth. Hugely important. So what we're doing is we're focusing on AI and everything. We're focusing on containerization and the building of AI microservices and the ecosystem of the pipelines and the tools that allow you to do that. DevOps for data science, distributed training, federated training of statistical models, and so forth. We are also very much focusing on the whole distributed containerized ecosystem, Docker, Kubernetes and so forth, and where that's going, in terms of changing the state of the art in application development. Focusing on the API economy. All of those things that you need to wrap around the payload of AI to deliver it into every ... >> So you're focused on that intersection between AI and the related topics and the developer. Who is winning in that developer community? Obviously Amazon's winning. You got Microsoft doing a good job there. Google, Apple, who else? I mean how's IBM doing, for example? Maybe name some names. Who impresses you in the developer community? But specifically let's start with IBM. How is IBM doing in that space? >> IBM's doing really well. IBM has been, for quite a while, very good about engaging with the new generation of developers, using Spark and R and Hadoop and so forth to build applications rapidly and deploy them rapidly into all manner of applications. So IBM has very much reached out, in the last several years, to the Millennials, for whom all of these new tools have been their core repertoire from the very start. And I think in many ways, like today's DB2 developer community edition, it's very much geared to that market.
Saying, you know, to the cloud-native application developer, take a second look at DB2. There's a lot in DB2 that you might bring into your next application development initiative, alongside your Spark toolkit and so forth. So IBM has startup envy. They're a big old company. Been around more than a hundred years. And they're trying to, very much, bootstrap and restart their brand in this new context, in the 21st century. I think they're making a good effort at doing it. In terms of community engagement, they have a really good community engagement program all around the world, in terms of hackathons and developer days, you know, meetups here and there. And they get lots of turnout and very loyal customers, and IBM's got the broadest portfolio. >> So you still bleed a little bit of blue. So I got to squeeze it out of you now here. So let me push a little bit on what you're saying. So DB2 is the emphasis here, trying to position DB2 as appealing for developers, but why not some of the other, you know, acquisitions that they've made? I mean you don't hear that much about Cloudant, dashDB, and things of that nature. You would think that those would be more appealing to some of the developer communities than DB2. Or am I mistaken? Is it IBM sort of going after the core, trying to evolve that core, you know, constituency? >> No, they've done a lot of strategic acquisitions, like Cloudant, and they've acquired graph databases and brought them into their platform. IBM has every type of database or file system that you might need for web or social or Internet of Things. And so for all of the development challenges, IBM has got a really high-quality, fit-to-purpose, best-of-breed underlying data platform for it. They've got huge amounts of developers energized all around the world working on this platform. DB2, in the last several years they've taken all of their platforms, their legacy ... That's the wrong word.
All their existing mature platforms, like DB2, and brought them into the IBM cloud. >> I think legacy is the right word. >> Yeah, yeah. >> These things have been around for 30 years. >> And they're not going away because they're field-proven and ... >> They are evolving. >> And customers have implemented them everywhere. And they're evolving. If you look at how IBM has evolved DB2 in the last several years into ... For example, they responded to the challenge from SAP HANA. We brought BLU Acceleration, in-memory technology, into DB2 to make it screamingly fast and so forth. IBM has done a really good job of turning around these product groups and the product architecture, making them cloud-first, and then reaching out to a new generation of cloud application developers. Like I said today, things like DB2 developer community edition, it's just the next chapter in this ongoing saga of IBM turning itself around. Like I said, each of the individual announcements today is like, okay, that's interesting. I'm glad to see IBM showing progress. None of them is individually disruptive. I think last week, though, Hortonworks was disruptive, in the sense that IBM recognized that BigInsights didn't really have a lot of traction in the Hadoop space, not as much as they would have wished. Hortonworks very much does, and IBM has cast its lot to work with HDP, but Hortonworks recognizes they haven't achieved any traction with data scientists, therefore DSX makes sense as part of the Hortonworks portfolio. Likewise Big SQL makes perfect sense as the SQL front end to HDP. I think the teaming of IBM and Hortonworks is propitious of further things that they'll be doing in the future, not just governance, but really putting together a broader cloud portfolio for the next generation of data scientists doing work in the cloud. >> Do you think Hortonworks is a legitimate acquisition target for IBM? >> Of course they are. >> Why would IBM ...
You know, educate us. Why would IBM want to acquire Hortonworks? What does that give IBM? Open source mojo, obviously. >> Yeah, mojo. >> What else? >> Strong loyalty with the Hadoop market, with developers. >> The developer angle would supercharge the developer angle, and maybe make it more relevant outside of some of those legacy systems. Is that it? >> Yeah, but also remember that Hortonworks came from Yahoo, the team that developed much of what became Hadoop. They've got an excellent team. Strategic team. So in many ways, you can look at Hortonworks as one part acqui-hire, if they ever do that, and one part really substantial and growing solution portfolio that in many ways is complementary to IBM. Hortonworks is really deep on the governance of Hadoop. IBM has gone there, but I think Hortonworks is even deeper, in terms of their laser focus. >> Ecosystem expansion, and it actually really wouldn't be that expensive of an acquisition. I mean it's you know north of ... Maybe a billion dollars might get it done. >> Yeah. >> You know, so would you pay a billion dollars for Hortonworks? >> Not out of my own pocket. >> No, I mean if you're IBM. You think that would deliver that kind of value? I mean you know how IBM thinks about acquisitions. They're good at acquisitions. They look at the IRR. They have their formula. They blue-wash the companies and they generally do very well with acquisitions. Do you think Hortonworks would fit that monetization profile? >> I wouldn't say that Hortonworks, in terms of monetization potential, would match, say, what IBM has achieved by acquiring Netezza. >> Cognos. >> Or SPSS. I mean SPSS has been an extraordinarily successful ... >> Well, the day IBM acquired SPSS they tripled the license fees. As a customer I know, ouch, it worked. It was incredibly successful. >> Well, yeah. Cognos was. Netezza was. And SPSS.
Those three acquisitions in the last ten years have been extraordinarily pivotal and successful for IBM to build what they now have, which is really the most comprehensive portfolio of fit-to-purpose data platforms. So in other words, all those acquisitions prepared IBM to duke it out now with their primary competitors in this new field, which are Microsoft, who's newly resurgent, and Amazon Web Services. In other words, the two Seattle vendors. Seattle has come on strong, to the point that in big data in the cloud, Seattle is almost eclipsing Silicon Valley as the locus of innovation, and really of customer adoption, in the cloud space. >> Quite amazing. Well, Google's still hanging in there. >> Oh yeah. >> Alright, Jim. Really a pleasure working with you today. Thanks so much. Really appreciate it. >> Thanks for bringing me on your team. >> And Munich crew, you guys did a great job. Really well done. Chuck, Alex, Patrick wherever he is, and our great makeup lady. Thanks a lot. Everybody back home. We're out. This is Fast Track Your Data. Go to IBMgo.com for all the replays. Youtube.com/SiliconANGLE for all the shows. TheCUBE.net is where we tell you where theCUBE's going to be. Go to wikibon.com for all the research. Thanks for watching everybody. This is Dave Vellante with Jim Kobielus. We're out.

Published Date : Jun 25 2017


Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017


 

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and a longtime CUBE guest. Good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out of doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distribution several years ago, so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, and you're going to continue to service those customers. But there's more than that. What's that announcement all about?
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, there are a lot of dimensions to the relationship, but I can tell you, since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified governance, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500 in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening.
We want to encourage clients across Europe to think about how they start to do something now. >> Yeah, of course, GDPR is also a hook. It's a European Union regulation, and you guys have talked about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance. What do those mean for clients?
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new addition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionally-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of no-SQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform. 
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
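The DB2 JSON support Rob described a moment ago — reaching documents through plain SQL, one interface over a "data fabric" — can be illustrated in miniature. Below is a hedged sketch using Python's stdlib `sqlite3` as a stand-in engine; this is not DB2's actual JSON syntax, and the table, data, and `json_get` helper are invented for illustration only.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, doc TEXT)")

# Relational columns side by side with schemaless JSON documents.
docs = [
    (1, "acme", json.dumps({"items": [{"sku": "A1", "qty": 2}], "total": 40.0})),
    (2, "globex", json.dumps({"items": [{"sku": "B7", "qty": 1}], "total": 15.5})),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", docs)

# Expose a JSON path lookup to the SQL engine so relational predicates
# can reach inside the document column through ordinary SQL.
def json_get(doc, key):
    return json.loads(doc).get(key)

conn.create_function("json_get", 2, json_get)

rows = conn.execute(
    "SELECT customer, json_get(doc, 'total') FROM orders "
    "WHERE json_get(doc, 'total') > 20"
).fetchall()
print(rows)  # [('acme', 40.0)]
```

The point of the sketch is the shape of the idea: one SQL surface querying both relational fields and document content, which is the "single interface" data-fabric pitch in the conversation above.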
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're using deep learning for: things like video analytics and speech analytics, and more of the challenges involving convolutional neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning.
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning is, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement; we're trying to make it easier for people to on-board. As you get to things like deep learning, as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases, and to some extent, what I see clients struggle with, they say, well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general-purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software, and scale it at large volumes and make it adoptable? I thought the Watson Data Platform announcement last fall, I think at the time you called it DataWorks, and then the name evolved, was really a strong attempt to do that: to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like?
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, given where you are, how do you get to this future on the cloud? And that's the part that clients struggle with: how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made the Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market: just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst. So this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from a maturity model standpoint? >> Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that?
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose.io umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you. And we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer on which you can then apply data science, apply machine learning, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. >> All right, Rob, we're out of time, but I want to kind of set up the day. We're doing CUBE interviews all morning here, and then we cut over to the main tent. You can get all of this on IBMgo.com, you'll see the schedule. Rob, you've got, you're kicking off a session. We've got Hilary Mason, we've got a breakout session on GDPR, maybe set up the main tent for us. >> Yeah, main tent's going to be exciting. We're going to debunk a lot of misconceptions about data and about what's happening. Marc Altshuller has got a great segment on what he calls the death of correlations, so we've got some pretty engaging stuff. Hilary's got a great piece that she was talking to me about this morning. It's going to be interesting. We think it's going to provoke some thought and ultimately provoke action, and that's the intent of this week. >> Excellent, well Rob, thanks again for coming to theCUBE. It's always a pleasure to see you.
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017



Steve Roberts, IBM– DataWorks Summit Europe 2017 #DW17 #theCUBE


 

>> Narrator: Covering DataWorks Summit, Europe 2017, brought to you by Hortonworks. >> Welcome back to Munich everybody. This is theCUBE. We're here live at DataWorks Summit, and we are the live leader in tech coverage. Steve Roberts is here as the offering manager for big data on power systems for IBM. Steve, good to see you again. >> Yeah, good to see you, Dave. >> So we're here in Munich, a lot of action, good European flavor. It's my second European event, formerly Hadoop Summit, now DataWorks. What's your take on the show? >> I like it. I like the size of the venue. It's the ability to interact and talk to a lot of the different sponsors and clients and partners, the ability to network with a lot of people from a lot of different parts of the world in a short period of time. So it's been great so far, and I'm looking forward to building upon this towards the next DataWorks Summit in San Jose. >> Terri Virnig, VP in your organization, was up this morning, had a keynote presentation, so IBM got a lot of love in front of a fairly decent-sized audience, talking a lot about the ecosystem that's evolving, the openness. Talk a little bit about open generally at IBM, but specifically what it means to your organization in the context of big data. >> Well, I am from the power systems team. So we have an initiative that we launched a couple years ago called OpenPOWER. And OpenPOWER is a foundation of participants innovating from the power processor through all aspects: accelerators, IO, GPUs, advanced analytics packages, system integration, but all to the point of being able to drive OpenPOWER capability into the market and have power servers delivered not just through IBM, but through a whole ecosystem of partners. This complements quite well with the Apache Hadoop and Spark philosophy of openness as it relates to the software stack.
So our story's really about being able to marry the benefits of an open ecosystem for OpenPOWER as it relates to the system infrastructure technology, which drives the same time to innovation, community value, and choice for customers as it relates to a multi-vendor ecosystem, coupled with the same premise as it relates to Hadoop and Spark. And of course, IBM is making significant contributions to Spark as part of the Apache Spark community, and we're a key active member, as is Hortonworks with the ODPi organization forwarding the standards around Hadoop. So this is a one-two combo of open Hadoop, open Spark, either from Hortonworks or from IBM, sitting on the OpenPOWER platform built for big data. No other story really exists like that in the market today, open on open. >> So Terri mentioned cognitive systems. Bob Picciano has recently taken over and obviously has some cognitive chops, and some systems chops. Is this a rebranding of power? Is it sort of a layer on top? How should we interpret this? >> No, think of it more as a layer on top. So power will now be one of the assets, one of the members of the family in the cognitive systems portion of IBM. System z can also be used as another great engine for cognitive with certain clients, certain use cases where they want to run cognitive close to the data and they have a lot of data sitting on System z. So power systems is really a server built for big data and machine learning, in particular our S822LC for high performance computing. This is a server which is landing very well in the deep learning, machine learning space. It offers the Tesla P100 GPU, and with the NVIDIA NVLink technology can offer up to 2.8x bandwidth benefits CPU to GPU over what would be available through a PCIe Intel combination today. So this drives immediate value when you need to ensure that you're not just exploiting GPUs, but you of course need to move your data quickly from the processor to the GPU.
>> So I was going to ask you actually, sort of what makes power so well suited for big data and cognitive applications, particularly relative to Intel alternatives. You touched on that. IBM talks a lot about Moore's Law starting to hit its peak, that innovation is going to come from other places. I love that narrative 'cause it's really combinatorial innovation that's going to lead us in the next 50 years, but can we stay on that thread for a bit? What makes power so substantially unique, uniquely suited and qualified to run cognitive systems and big data? >> Yeah, it actually starts with even more of the fundamentals of the power processor. The power processor has eight threads per core, in contrast to Intel's two threads per core. So this just means, for being able to parallelize your workloads, whether you're running complex queries and need to drive SQL over a lot of parallel pipes, or you're running iterative computation over the same data set, as when you're doing model training, these can all benefit from highly parallelized workloads, which can benefit from this 4x thread advantage. But of course to do this, you also need large, fast memory, and we have six times more cache per core versus Broadwell, so this just means you have a lot of memory close to the processor, driving that throughput that you require. And then on top of that, now we get to the ability to add accelerators, and unique accelerators such as I mentioned the NVIDIA NVLink scenario for GPU, or using OpenCAPI as an approach to attach FPGA or flash to get processor memory access speeds, but with an attached acceleration device. And so this is economies of scale in terms of being able to offload specialized compute processing to the right accelerator at the right time, so you can drive way more throughput.
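A rough way to picture the thread-level fan-out Steve is describing — many independent shards of a query running at once, with more hardware threads per core meaning more lanes that can genuinely run simultaneously — is sketched below. This is an illustration only, not IBM code; note that in CPython, true CPU parallelism for pure-Python work would require processes or a GIL-releasing library, so the sketch only demonstrates the fan-out pattern and checks that the sharded result matches the serial baseline.

```python
from concurrent.futures import ThreadPoolExecutor

# One "shard" of an embarrassingly parallel job, e.g. a partial
# aggregation that a query engine would run per partition.
def run_shard(n):
    return sum(i * i for i in range(n))

shards = [50_000, 60_000, 70_000, 80_000]

# Serial baseline: one shard after another.
serial = [run_shard(n) for n in shards]

# Fanned out across worker threads; map() preserves shard order,
# so the combined result is identical to the serial run.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(run_shard, shards))

print(parallel == serial)  # True: same answer, computed via parallel lanes
```

The design point mirrors the transcript: the win from more threads per core only materializes when the workload decomposes into independent lanes like these.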
The upper bounds are driving workload through individual nodes and being able to balance your IO and compute on an individual node is far superior with the power system server. >> Okay, so multi-threaded, giant memories, and this open CAPI gives you primitive level access I guess to a memory extension, instead of having to-- >> Yeah, pluggable accelerators through this high speed memory extension. >> Instead of going through, what I often call the horrible storage stack, aka SCSI, And so that's cool, some good technology discussion there. What's the business impact of all that? What are you seeing with clients? >> Well, the business impact is not everyone is going to start with supped up accelerated workloads, but they're going to get there. So part of the vision that clients need to understand is to begin to get more insights from their data is, it's hard to predict where your workloads are going to go. So you want to start with a server that provides you some of that upper room for growth. You don't want to keep scaling out horizontally by requiring to add nodes every time you need to add storage or add more compute capacity. So firstly, it's the flexibility, being able to bring versatile workloads onto a node or a small number of nodes and be able to exploit some of these memory advantages, acceleration advantages without necessarily having to build large scale out clusters. Ultimately, it's about improving time to insights. So with accelerators and with large memory, running workloads on a similar configured clusters, you're simply going to get your results faster. For example, recent benchmark we did with a representative set of TPC-DS queries on Hortonworks running on Linux and power servers, we're able to drive 70% more queries per hour over a comparable Intel configuration. So this is just getting more work done on what is now similarly priced infrastructure. 
'Cause power family is a broad family that now includes 1U, 2U, scale out servers, along with our 192 core horsepowers for enterprise grade. So we can directly price compete on a scale out box, but we offer a lot more flexible choice as clients want to move up in the workload stack or to bring accelerators to the table as they start to experiment with machine learning. >> So if I understand that right, I can turn two knobs. I can do the same amount of work for less money, TCO play. Or, for the same amount of money, I can do more work. >> Absolutely >> Is that fair? >> Absolutely, now in some cases, especially in the Hadoop space, the size of your cluster is somewhat gated by how much storage you require. And if you're using the classic scale up storage model, you're going to have so many nodes no matter what 'cause you can only put so much storage on the node. So in that case, >> You're scaling storage. >> Your clusters can look the same, but you can put a lot more workload on that cluster or you can bring in IBM, a solution like IBM Spectrum Scale our elastic storage server, which allows you to essentially pull that storage off the nodes, put it in a storage appliance, and at that point, you now have high speed access to storage 'cause of course the network bandwidth has increased to the point that the performance benefit of local storage is no longer really a driving factor to a classic Hadoop deployment. You can get that high speed access in a storage appliance mode with the resiliency at far less cost 'cause you don't need 3x replication, you just have about a 30% overhead for the software erasure coding. And now with your compete nodes, you can really choose and scale those nodes just for your workload purposes. So you're not bound by the number of nodes equal total storage required by storage per node, which is a classic, how big is my cluster calculation. 
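The classic "how big is my cluster" calculation, and the savings from swapping 3x replication for erasure coding, can be sketched like this. The node capacity, data size, and compute need are invented illustrations; the 3x replication factor and roughly 30% erasure-coding overhead are the figures quoted above:

```python
import math

def coupled_cluster_nodes(usable_tb, node_storage_tb, replication=3):
    """Classic HDFS sizing: node count forced by raw storage need."""
    raw_tb = usable_tb * replication
    return math.ceil(raw_tb / node_storage_tb)

def decoupled_raw_tb(usable_tb, erasure_overhead=0.30):
    """Storage-appliance model: ~30% erasure-coding overhead, no 3x copies."""
    return usable_tb * (1 + erasure_overhead)

# Example: 500 TB of usable data, 48 TB of local disk per node.
storage_driven_nodes = coupled_cluster_nodes(usable_tb=500, node_storage_tb=48)
compute_nodes_needed = 12  # suppose the workload only needs 12 nodes of compute
wasted_compute_nodes = storage_driven_nodes - compute_nodes_needed

raw_saved_tb = 500 * 3 - decoupled_raw_tb(500)  # raw capacity saved by decoupling
print(storage_driven_nodes, wasted_compute_nodes, raw_saved_tb)
```

In the coupled model, storage alone dictates the node count, so any gap between that and the compute the workload actually needs is over-provisioned capacity on one side or the other.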
That just doesn't work if you get over 10 nodes, 'cause now you're just starting to get to the point where you're wasting something right? You're either wasting storage capacity or typically you're wasting compute capacity 'cause you're over provisioned on one side or the other. >> So you're able to scale compute and storage independent and tune that for the workload and grow that resource efficiently, more efficiently? >> You can right size the compute and storage for your cluster, but also importantly is you gain the flexibility with that storage tier, that data plan can be used for other non-HDFS workloads. You can still have classic POSIX applications or you may have new object based applications and you can with a single copy of the data, one virtual file system, which could also be geographically distributed, serving both Hadoop and non-Hadoop workloads, so you're saving then additional replicas of the data from being required by being able to onboard that onto a common data layer. >> So that's a return on asset play. You got an asset that's more fungible across the application portfolio. You can get more value out of it. You don't have to dedicate it to this one workload and then over provision for another one when you got extra capacity sitting here. >> It's a TCO play, but it's also a time saver. It's going to get you time to insight faster 'cause you don't have to keep moving that data around. The time you spend copying data is time you should be spending getting insights from the data, so having a common data layer removes that delay. >> Okay, 'cause it's HDFS ready I don't have to essentially move data from my existing systems into this new stovepipe. >> Yeah, we just present it through the HDFS API as it lands in the file system from the original application. >> So now, all this talk about rings of flexibility, agility, etc, what about cloud? How does cloud fit into this strategy? 
What are you guys doing with your colleagues and cohorts at Bluemix, aka SoftLayer? You don't use that term anymore, but we do. When we get our bill it says SoftLayer still, but at any rate, you know what I'm talking about. The cloud with IBM, how does it relate to what you guys are doing in power systems? >> Well, the born-on-the-cloud philosophy of the IBM software analytics team is still very much the motto. So as you see in the data science experience, which was launched last year, born in the cloud, all our analytics packages, whether it be our BigInsights software or our business intelligence software like Cognos, our future generations are landing first in the cloud. And of course we have our whole arsenal of Watson based analytics and APIs available through the cloud. So what we're now seeing as well is we're taking those born-in-the-cloud offerings, but now also offering a lot of those in an on-premise model. So they can also participate in the hybrid model, so data science experience is now coming on premise, we're showing it at the booth here today. Bluemix has an on-premise version as well, and the same software library, BigInsights, Cognos, SPSS, are all available for on-prem deployment. So power is still the ideal place for hosting your on-prem data and to run your analytics close to the data, and now we can federate that through hybrid access to these elements running in the cloud. So the focus is really the cloud applications being able to leverage the power and System z based data through high speed connectors, and being able to build hybrid configurations where you're running your analytics where they most make sense based upon your performance requirements, data security and compliance requirements. And a lot of companies, of course, are still not comfortable putting all their jewels in the cloud, so typically there's going to be a mix and match.
We are expanding the footprint for cloud based offerings both in terms of power servers offered through SoftLayer, but also through other cloud providers; Nimbix is a partner we're working with right now who is actually offering our Power AI package. Power AI is a package of open source deep learning frameworks, packaged by IBM, optimized for Power, in an easily deployed package with IBM support available. And that could be deployed on premise in a power server, but it's also available on a pay-per-drink basis through the Nimbix cloud. >> All right, we covered a lot of ground here. We talked strategy, we talked strategic fit, which I guess is sort of an adjunct to strategy, we talked a little bit about the competition and where you differentiate, some of the deployment models, like cloud, other bits and pieces of your portfolio. Can we talk specifically about the announcements that you have here at this event, just maybe summarize for us?
At the storage layer, we have a work in progress with Hortonworks to certify the Spectrum Scale file system, which really unlocks the ability to offer this converged storage alternative to the classic Hadoop model. Spectrum Scale actually supports and provides advantages in a classic Hadoop model with local storage, or it can provide the flexibility of offering the same sort of multi-application support but in a scale-out model for storage. It also has the ability to form part of a storage appliance that we call Elastic Storage Server, which is a combination of power servers and high density storage enclosures, SSD, spinning disk, or flash, depending on the configuration, and that certification will now have that as an available storage appliance which could underpin either IBM Open Platform or HDP as a Hadoop data lake. But as I mentioned, not just for Hadoop, really for building a common data plane behind mixed analytics workloads that reduces your TCO through a converged storage footprint, but more importantly, provides you that flexibility of not having to create data copies to support multiple applications.

Published Date : Apr 6 2017

Harley Davis, IBM - IBM Interconnect 2017 - #ibminterconnect - #theCUBE


 

>> Announcer: Live, from Las Vegas, it's theCUBE. Covering Interconnect 2017. Brought to you by IBM. >> Okay, welcome back everyone we're here live in Las Vegas at the Mandalay Bay, theCUBE's exclusive three day coverage of IBM Interconnect 2017, I'm John Furrier. My co-host, Dave Velliante. Our next guest is Harley Davis, who's the VP of decision management at IBM. Welcome to theCUBE. >> Thank you very much, happy to be here. >> Thanks for your time today, you've got a hot topic, you've got a hot area, making decisions in real-time with data being cognitive, enterprise strong, and data first is really, really hard. So, welcome to theCUBE. What's your thoughts? Because we were talking before we came on about data, we all love, we're all data geeks but the value of the data is all contextual. Give us your color on the data landscape and really the important areas we should shine a light on, that customers are actively working to extract those insights. >> So, you know, traditionally, decisions have really been transactional, all about taking decisions on systems of record, but what's happening now is, we have the availability of all this data, streaming it in real-time, coming from systems of record, data about the past, data about the present, and then data about the future as well, so when you take into account predictive analytics models, machine learning, what you get is kind of data from the future if I can put it that way and what's interesting is how you put it all together, look for situations of risk, opportunity, is there a fraud that's happening now? Is there going to be a lack of resources at a hospital when a patient checks in? How do we put all that context together, look into the future and apply business policies to know what to do about it in real-time and that's really the differentiating use cases that people are excited about now and like you say, it's a real challenge to put that together but it's happening. 
>> It's happening, and that's, I think that's the key thing and there's a couple megatrends going on right now that's really propelling this. One is machine learning, two is the big data ecosystem as we call it, the big data ecosystem has always been, okay, Hadoop was the first wave, then you saw Spark, and then you're seeing that evolving now to a whole nother level moving data at rest and data in motion is a big conversation, how to do that together, not just I'm a batch only, or real-time only, the integration of those two. Then you combine that with the power of cloud and how fast cloud computing, with compute power, is accelerating, those two forces with machine learning, and IOT, it's just amazing. >> It's all coming together and what's interesting is how you bridge the gap, how you bring it all together, how you create a single system that manages in real-time all this information coming in, how you store it, how you look at, you know, history of events, systems of record and then apply situation detection to it to generate events in real-time. So, you know, one of the things that we've been working on in the decision management lab is a system called decision server insights, which is a big real-time platform, you send a stream of events in, it gets information from systems of records, you insert analytics, predictive analytics, machine learning models into it and then you write a series of situation detection rules that look at all that information and can say right now this is what's happening, I link it in with what's likely to happen in the future, for example I can say my predictive analytics model says based on this data, executed right now, this customer, this transaction is likely, 90% likely to be a fraud and then I can take all the customer information, I can apply my rule and I can apply my business policy to say well what do I do about that? Do I let it go through anyway? Because it's okay, do I reject it? Do I send it to a human analyst? 
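Those routing questions — let it through, reject it, or send it to a human analyst — reduce to business-policy rules applied on top of a model score. A minimal sketch of that pattern (every threshold, field name, and tier here is invented for illustration, not a real policy):

```python
def fraud_decision(score, amount, customer_tier):
    """Combine a predictive fraud score with business-policy rules.
    All thresholds and tiers are illustrative assumptions."""
    if score >= 0.90:
        # High-confidence fraud; policy may still route top customers to a human
        return "analyst" if customer_tier == "platinum" else "reject"
    if score >= 0.60 and amount > 10_000:
        return "analyst"  # uncertain and high-value: human review
    return "approve"      # low risk: let it through

print(fraud_decision(0.95, 500, "standard"))
print(fraud_decision(0.95, 500, "platinum"))
print(fraud_decision(0.70, 25_000, "standard"))
```

The point of keeping the rules separate from the model is exactly what's described above: analysts can add a new rule for a newly observed attack pattern in real time, without waiting to retrain the model.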
We got to put all that together. >> So that use case that you just described, that's happening today, that's state of the art today, so one of the challenges today, and we all know fraud detection's got much, much better in the last several years, it used to take, if you ever found it, it would take six months, right? And it's too late, but still a lot of false positives, that'll negate a transaction, now that's a business rule decision, right? But are we at the point where even that's going to get better and better and better? >> Well, absolutely. I mean the whole, there have been two main ways to do fraud detection in the past. The first one is kind of long scale predictive analytics that you train every few months and requires, you know, lots and lots of history of data but you don't get new use cases that come up in real-time, like you don't have the Ukrainian hacker who decides, you know, if I do a payment from this one website then I can grab a bunch of money right now and then you have the other alternative, which is having a bunch of human analysts who look for cases like that guy and put it in as business rules and what's interesting is to combine the two, to retrain the models in real-time, and still apply the knowledge that the human analysts can get in real-time, and that's happening every day in lots of companies now. >> And that idea of combining transactional data and analytics, you know, has become popularized over the last couple of years, one obvious use case there is ad-tech, right? Making offers to people, marketing, what's the state of that use case? >> Well, let's look at it from the positive perspective. 
What we are able to do now is take information about consumers from multiple sources, you can look at the interaction that you've had with them, let's say you're a financial services company, you get all sorts of information about a company, about a customer, sorry, from the CRM system, from the series of interactions you've had with them, from what they've looked at on your website, but you can also get additional information about them if you know them by their Twitter handle or other social media feeds, you can take information from their Twitter feeds, for example, apply some cognitive technology to extract information from that do sentiment analysis, do natural language processing, you get some sense of meaning about the tweets and then you can combine that in real-time in a system like the one I talked about to say ah, this is the moment, right here, where this guy's interested in a new car, we think he just got a promotion or a raise because he's now putting more money into the bank and we see tweets saying "oh I love that new Porsche 911, "can't wait to go look at it in the showroom," if we can put those things together in real-time, why not send him a proactive offer for a loan on a new car, or put him in touch with a dealer? >> No and sometimes as a consumer I want that, you know, when I'm looking for say, scarce tickets to a show or a play-off game or something and I want the best offer and I'm going to five or six different websites, and somebody were to make me an offer, "hey, here are better seats for a lower price," I would be thrilled. 
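The "put those things together in real-time" step amounts to joining signals from several systems — CRM, transactions, social sentiment — and applying a policy rule. A toy version of that next-best-offer logic (every field name and threshold is an invented illustration):

```python
def should_offer_car_loan(profile):
    """Toy next-best-offer rule combining banking and social signals.
    All keys and thresholds are assumptions for illustration."""
    return (
        profile["is_customer"]
        and profile["monthly_deposit_change"] > 0.15  # likely raise or promotion
        and profile["car_tweet_sentiment"] > 0.7      # positive car chatter
    )

hot_lead = {"is_customer": True, "monthly_deposit_change": 0.25,
            "car_tweet_sentiment": 0.85}
cold_lead = {"is_customer": True, "monthly_deposit_change": 0.0,
             "car_tweet_sentiment": 0.9}
print(should_offer_car_loan(hot_lead), should_offer_car_loan(cold_lead))
```

In practice the sentiment score would come from an NLP service rather than a precomputed field, but the combination step looks the same.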
>> So geographic information is interesting too for that, so let's say, for example, that you're, you're traveling to Napa Valley and let's say that we can detect that you just, you know, took out some money from the bank, from your ATM in Napa, now we know you're in Napa, now we know that you're a good customer of the bank, and we have a deal with a tour operator, a wine tour operator, so let's spontaneously propose a wine tour to you, give you a discount on that to keep you a good customer. >> Yeah, so relevant offers like that, as a consumer I'd be very interested in. All too often, at least lately, I feel like we're in the first and second innings of that type of, you know, system, where many of the offers that you get are just, wow, okay, for three weeks after I buy the dishwasher, I'm getting dishwasher ads, but it's getting better, you can sort of see it and feel it. >> You can see it getting a little better. I think this is where the combination of all these technologies with machine learning and predictive analytics really comes to the fore and where the new tools that we have available to data scientists, things like, you know, the data scientist experience that IBM offers and other tools, can help you produce a lot more segmented and targeted analytics models that can be combined with all the other information so that when you see that ad, you say oh, the bank really understands me. 
>> Harley, one of the things that people are working on right now and most customers, your customers and potential customers that we talk to is I got the insights coming, and I'm working on that, and we're pedaling as fast as we can, but I need actionable insight, this is a decision making thing, so decisions are now what people want to do, so that's what you do, so there's some stats out there that decision making can be less than 30 minutes based on good data, the life of the data, as short as six seconds, this speaks to the data in motion, humans aside of it, I might be on my mobile phone, I might be looking at some industrial equipment, whatever, I could be a decision maker in the data center, this is a core problem, what are you guys doing in this area, because this is really a core problem. Or an opportunity. >> Well this all about leveraging, you know, event driven architectures, Kafka, Spark and all the tools that work with it so that we can grab the data in real-time as it comes in, we can associate it with the rest of the context that's relevant for making a decision, so basically with action, when we talk about actionable insights, what are we talking about? 
We're talking about taking data in real-time, structured, unstructured data, having a framework for managing it, Kafka, Spark, something like decision server insights in ODM, whatever, applying cognitive technology to turn some of the unstructured data into structured data, applying machine learning, predictive analytics, tools like SPSS to create a kind of prediction of what happens in the future and then applying business rules, something like operational decision management, ODM, in order to apply business policies to the insights we've garnered from the rest of the cycle so that we can do something about it, that's decision manager, that's-- >> So you were saying earlier on the use case about, I get some event data, I bring it in to systems of record, I apply some rules to it, I mean, that doesn't sound very hard, I mean, it's almost as if that's happening now-- >> It's hard. >> Well it's hard, let me get, this is my whole point, this is not possible years ago so that's one point, I want to get some color from you on that because this is ungettable, most of the systems, we even go back ten, five years ago, we siloed, so now rule based stuff seems trivial, practically, okay, by some rules, but it's now possible to put this package together and I know it's hard but conceptually those are three concepts that some would say oh, why weren't we doing this before? >> It's been possible for a long time and we have, you know, we have plenty of customers who combine, you know, who do something as simple as when you get approved for a loan, that's based on a score, which is essentially a predictive analytics model combined with business rules that say approve, not approve, ask for more documentations and that's been done for years so it's been possible, what's even more enabled now is doing it in real-time, taking into account a much greater degree of information, having-- >> John: More data sources. 
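That end-to-end flow — event in, enrich from systems of record, score with a model, apply rules — can be sketched as a plain processing loop. In production this would sit on something like Kafka and Spark with a real trained model, but the shape is the same; every function and field below is a stand-in:

```python
def enrich(event, crm):
    """Join the incoming event with system-of-record context."""
    return {**event, **crm.get(event["customer_id"], {})}

def score(record):
    """Stand-in for a predictive model; returns a risk in [0, 1]."""
    return 0.9 if record["amount"] > 5_000 and record["new_device"] else 0.1

def decide(record, risk):
    """Business-policy rule applied to the scored, enriched record."""
    return "analyst" if risk > 0.8 else "approve"

crm = {"c1": {"tenure_years": 7}}  # system-of-record context
events = [  # the incoming real-time stream
    {"customer_id": "c1", "amount": 9_000, "new_device": True},
    {"customer_id": "c1", "amount": 40, "new_device": False},
]
decisions = [decide(r, score(r)) for r in (enrich(e, crm) for e in events)]
print(decisions)
```

Each stage maps to a component named above: the enrichment to systems-of-record lookups, the score to SPSS-style predictive analytics, and the decision to ODM-style business rules.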
>> Data sources, things like social media, things like sensors from IoT, connected car applications, all sorts of things like that and then retraining the models more frequently, so getting better information about the future, faster and faster. >> Give an example of some use cases that you're working with customers on because I think that's fascinating and I think I would agree with you that it's been possible before but the concepts are known, but now it's accelerating to a whole nother level. Talk about some of the use cases end-to-end that you guys have done with customers. >> Let's think about something like an airline, that wants to manage its operations and wants to help its passengers manage operational disruptions or changes. So what we want to do now is, take a series of events coming from all sorts of sources, and that can be basic operational data like the airplanes, what's the airplane, is it running late, is it not running late, is the connection running late, combining it with things about the weather, so information that we get about upcoming weather events from weather analytics models, and then turning that into predicting what's going to happen to this passenger through his journey in the future so that we can proactively notify him that he should be either, we can rebook him automatically on a flight, we can provide him, if we know he's going to be delayed, we can automatically provide him amenities, notify the staff at the airport where he's going to be blocked, because he's our platinum customer, we want to give him lounge access, we want to give him his favorite drink, so combine all this information together and that's a use case-- >> When's this going to happen? >> That's life, that's life. >> I want to fly that airline. Okay, so we've been talking a lot about-- >> Mr. American Airlines? I'm not going to put you on the spot there, hold up, that'll get you in trouble. >> Oh yeah, it's a real life use case. 
>> And said oh hey, you're not going to make your connection, thanks for letting me know. Okay, so, okay we were talking a lot about the way things used to be, the way things are, and the way things are going to be or actually are today, in that last example, and you talked about event driven workloads. One of the things we've been talking about, at SiliconANGLE and on theCUBE is, is workloads, with batch, interactive, Hadoop brought back batch, and now we have what you call, this event driven workloads, we call it the continuous workloads, right? >> All about data immersion, we all call it different things but it's the same thing. >> Right, and when we look at our forecast, we're like wow, this is really going to hit, it hasn't yet, but it's going to hit the steep part of the s-curve, what do you guys expect in terms of adoption for those types of workloads, is it going to be niche, is it going to be predominant? >> I think it should be predominant and I think companies want it to be predominant. What we still need, I think, is a further iteration on the technology and the ability to bring all these different things together. We have the technologies for the different components, we have machine learning technology, predictive analytics technology, business rules technology, event driven architecture technology, but putting it all together in a single framework, right now it's still a real, it's both a technology implementation challenge, and it's an organizational challenge because you have to have data scientists work with IT architects, work with operational people, work with business policy people and just organizationally, bringing everybody-- >> There's organizational gap. That's what you're talking about. >> Yeah, but every company wants it to happen, because they all see a competitive advantage in doing it this way. 
>> And what's some of the things that are, barriers being removed as you see them, because that is a consistent thing we're hearing, the products are getting better, but the organizational culture. >> The easy thing is the technology barriers, that's the thing, you know? That's kind of the easy thing to work on, how do we have single frameworks that bring together everything, that let you develop both the machine learning model, the business rules model, and optimization, resource optimization model in a single platform and manage it all together, that's, we're working on that, and that's going to be-- >> I'll throw a wrinkle into the conversation, hopefully a spark, pun intended. Open source and microservices and cloud native apps are coming, that are, with open source, it's actually coming in and fueling a lot more activity. This should be a helpful thing to your point about more data sources, how do you guys talk about that? Because that's something you have to be part of, enabling the inbound migration of new stuff. >> Yeah, we have, I mean, everything's part of the environment. It's been the case for a while that open source has been kind of the driver of a lot of innovation and we assimilate that, we can either assimilate it directly, help our customers use it via services, package it up and rebrand open source technology as services that we manage and we control and integrate it for, on behalf of our customers. >> Alright, last question for you. Future prediction, what's five years out? What's going to happen in your mind's eye, I'm not going to hold you, I mean IBM to this, you personally, just as you see some of this stuff unfolding, machine learning, we're expecting that to crank things up pretty quickly, I'm seeing cognitive, and cognitive to the core, really rocking and rolling here, so what's your, how'd you see the next five years playing out for decision making? 
>> The first thing is, I don't see Skynet ever happening, I think we're so-- >> Mark Benioff made a nice reference in the keynote about Terminator, I'm like no one pick up on that on Twitter. >> I don't think that's really, nearly impossible, as a scenario but of course what is going to happen and what we're seeing accelerating on a daily basis, is applying machine learning, cognitive technology to more and more aspects of our daily life but I see it, it's in a passive way, so when you're doing image recognition, that's passive, you have to tell the computer tell me what's in this image but you, the human, as the developer or the programmer, still has to kick that off and has to say okay, now that you've told me there's a cat in an image, what do I do about that and that's something a human still has to do and that's, you know, that's the thing that would be scary if our systems started saying we're going to do something on behalf of you because we understand humans completely and what they need so we're going to do it on your behalf, but that's not going to happen. >> So the role of the human is critical, paramount in all this. >> It's not going to go away, we decide what our business policies are and-- >> But isn't, well, autonomous vehicles are an example of that, but it's not a business policy, it's the car making a decision for us, cos we can't react fast enough. >> But the car is not going to tell you where you want to go. If it started, if you get in the car and it said I'm taking you to the doctor because you have a fever, maybe that will happen. (all laugh) >> That's kind of Skynet like. I'd be worried about that. It may make a recommendation. (all laugh) >> Hey, you want to go to the doctor, thank you, no I'm good. >> I really don't see Skynet happening but I do think we're going to get more and more intelligent observations from our systems and that's really cool. >> That's very cool. 
Harley, thanks so much for coming on theCUBE, sharing the insights, really appreciate it. theCUBE, getting the insights here at IBM Interconnect 2017, I'm John Furrier, stay with us for some more great interviews on day three here, in Las Vegas, more after this short break. (upbeat music)

Published Date : Mar 22 2017



Dinesh Nirmal, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Announcer] Live from New York, it's theCube, covering the IBM Machine Learning Launch Event brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to the Waldorf Astoria, everybody. This is theCube, the worldwide leader in live tech coverage. We're covering the IBM Machine Learning announcement: IBM bringing machine learning to its zMainframe, its private cloud. Dinesh Nirmal is here. He's the Vice President of Analytics at IBM and a Cube alum. Dinesh, good to see you again. >> Good to see you, Dave. >> So let's talk about ML. So we went through big data, the data lake, the data swamp, all this stuff with Hadoop. And now we're talking about machine learning and deep learning and AI and cognitive. Is it same wine, new bottle? Or is it an evolution of data and analytics? >> Good. So, Dave, let's talk about machine learning. Right. When I look at machine learning, there's three pillars. The first one is the product. I mean, you got to have a product, right? And you got to have a differentiated set of functions and features available for customers to build models. For example, Canvas. I mean, those are table stakes. You got to have a set of algorithms available. So that's the product piece. >> [Dave] Uh huh. >> But then there's the process, the process of taking that model that you built in a notebook and being able to operationalize it. Meaning, being able to deploy it. You know, I was talking to one of the customers today, and he was saying, "Machine learning is 20% fun and 80% elbow grease." Because operationalizing that model is not easy. Although they make it sound very simple, it's not. So if you take a banking, enterprise banking example, right? You build a model in the notebook. Some data scientist builds it. Now you have to take that and put it into your infrastructure or production environment, which has been there for decades. So you could have third party software that you cannot change.
You could have a set of rigid rules that is already there. You could have applications that were written in the '70s and '80s that nobody wants to touch. How do you all of a sudden take the model and infuse it in there? It's not easy. And so that is a tremendous amount of work. >> [Dave] Okay. >> The third pillar is the people, the expertise or the experience, the skills that need to come through, right. So the product is one. The process of operationalizing and getting it into your production environment is another piece. And then the people is the third one. So when I look at machine learning, right, those are the three key pillars that you need to have a successful, you know, experience of machine learning. >> Okay, let's unpack that a little bit. Let's start with the differentiation. You mentioned Canvas, but talk about IBM specifically. >> [Dinesh] Right. >> What's so great about IBM? What's the differentiation? >> Right, exactly. Really good point. So we have been on the predictive side for a very long time, right. I mean, it's not like we are coming into ML or AI or cognitive yesterday. We have been in that space for a very long time. We have SPSS predictive analytics available. So even if you look from all three pillars, from a product perspective, we are bringing in the product where we are giving a choice or a flexibility to use the language you want. So there are customers who only want to use R. They are religious R users. They don't want to hear about anything else. There are customers who want to use Python, you know. They don't want to use anything else. So how do we give that choice of languages to our customers, to say use any language you want? Or execution engines, right? Some folks want to use Spark as the execution engine. Some folks want to use R or Python, so we give that choice. Then you talked about Canvas.
There are folks who want to use the GUI portion of the Canvas, or a modeler, to build models, and there are, you know, the techie guys who want to use the notebook. So how do you give that choice? So it becomes kind of like a freedom or a flexibility or a choice that we provide. So that's the product piece, right? We do that. Then the other piece is productivity. So one of our customers, the CTO of (mumbles) TV, is going to come on stage with me during the main session, talk about how collaboration helped from an IBM machine learning perspective, because their data scientists are sitting in New York City, and our data scientists who are working with them are sitting in San Jose, California. And they were collaborating in real time using notebooks in our ML projects, where they can see, in real time, what changes their data scientists are making. They can Slack messages between each other. And that collaborative piece is what really helped us. So collaboration is one, right, from a productivity perspective. We introduced something called the Feedback Loop, by which your model can get trained. So today, you deploy a model. Its score could get degraded over time. Then you have to take it off-line and re-train, right? What we have done is we introduced the Feedback Loop, so when you deploy your model, we give you two endpoints. The first endpoint is, basically, a URI for you to plug into your application so that, when you run your application, it's able to call the scoring API. The second endpoint is the feedback endpoint, where you can choose how often to re-train the model. If you want it every three hours, or every six hours, you can do that. So we bring that flexibility, we bring that productivity into it. Then, the management of the models, right? Once you develop the model, you deploy the model. There's a life cycle involved there. How do we make sure that we give you the tools to manage the model?
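The two-endpoint pattern Dinesh describes, one URI for scoring and one for feedback-driven re-training, can be sketched roughly as follows. This is a simplified, hypothetical sketch; the class and method names are illustrative and not IBM's actual API, and the "model" is just a single approval cutoff.

```python
# Sketch of a deployed model with a scoring endpoint and a feedback
# endpoint that triggers re-training once enough labeled outcomes arrive.
# All names here are illustrative, not IBM Machine Learning's real API.

class DeployedModel:
    def __init__(self, retrain_threshold=3):
        self.threshold = 0.5            # toy "model": a single approval cutoff
        self.feedback_buffer = []
        self.retrain_threshold = retrain_threshold
        self.retrain_count = 0

    def score(self, applicant):
        """Scoring endpoint: the production application calls this."""
        return "approve" if applicant["credit"] >= self.threshold else "deny"

    def feedback(self, applicant, outcome):
        """Feedback endpoint: ground truth flows back; retrain once enough arrives."""
        self.feedback_buffer.append((applicant, outcome))
        if len(self.feedback_buffer) >= self.retrain_threshold:
            self._retrain()

    def _retrain(self):
        # Toy re-training step: raise the cutoff just above the average
        # credit score of applicants that actually defaulted.
        defaults = [a["credit"] for a, o in self.feedback_buffer if o == "default"]
        if defaults:
            self.threshold = sum(defaults) / len(defaults) + 0.01
        self.feedback_buffer.clear()
        self.retrain_count += 1
```

In a real deployment the feedback endpoint would be an HTTP URI and re-training would run on the schedule Dinesh mentions (every three or six hours) rather than on a sample count, but the life cycle is the same: score, collect ground truth, re-train, keep serving.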
So when you talk about differentiation, right, we are bringing differentiation on all three pillars. From a product perspective, with all the things I mentioned. From a deployment perspective, how do we make sure we have different choices of deployment, whether it's streaming, whether it's real time, whether it's batch. The Feedback Loop is another one: once you've deployed, how do we keep re-training it. And the last piece I talked about is the expertise, or the people, right? So we are today announcing the IBM Machine Learning Hub, which will become one place where our customers can go, ask questions, get education sessions, get training, right? Work together to build models. I'll give you an example: although we are announcing the IBM Machine Learning Hub today, we have been working with America First Credit Union for the last month or so. They approached us and said, you know, their underwriting takes a long time. All the knowledge is embedded in 15 to 20 human beings. And they want to make sure a machine is able to absorb that knowledge and make that decision in minutes, instead of the hours or days it takes today. >> [Dave] So, Stu, before you jump in, let me recap the portfolio. You know, you mentioned SPSS, expertise, choice. The collaboration, which I think you really stressed at the announcement last fall. The management of the models, so you can continuously improve it. >> Right. >> And then this knowledge base, what you're calling the hub. And I could argue, I guess, that if I take any one of those individual pieces, some of your competitors have them. Your argument would be that it's all there. >> It all comes together, right? And you have to make sure that all three pillars come together. And customers see great value when you have that. >> Dinesh, customers today are used to the deployment model on the public cloud, which is, "I want to activate a new service," you know. I just activate it, and it's there.
When I think about private cloud environments, private clouds are operationally faster, but it's usually not minutes or hours. It's usually more like months to deploy projects, which is still better than before big data, when it was, you know, 18 months to see if it works, and let's bring that down to a couple of months. Can you walk us through, for a customer who says, "Great, I love this approach, how long does it take?", what's the project life cycle of this? And how long will it take them to play around and pull some of these levers before they're getting productivity out of it? >> Right. So, really good question, Stu. Let me back up one step. So, in private cloud, we have a new initiative called Download and Go, where our goal is to have our desktop products be able to install on your personal desktop in less than five clicks, in less than fifteen minutes. That's the goal. So the other day, the team told me it's ready, that the first product is ready where you can go less than five clicks, fifteen minutes. I said the real test is I'm going to bring my son, who's five years old. Can he install it? And if he can install it, you know, we are good. And he did it. And I have a video to prove it, you know. So after the show, I will show you. Because when you talk about the private cloud side, or the on-premise side, it has been a long project cycle. What we want is that you should be able to take our product, install it, and get the experience in minutes. That's the goal. And when you talk about private cloud and public cloud, another differentiating factor is that now you get the strength of IBM public cloud combined with the private cloud, so you could, you know, train your model in public cloud, and score on private cloud. You have the same experience. Not many folks, not many competitors can offer that, right?
So that's another... >> [Stu] So if I get that right: if I as a customer have played around with machine learning in Bluemix, I'm going to have a similar look, feel, API? >> Exactly the same. So what you have in Bluemix, right, I mean, you have Watson in Bluemix, which, you know, has deep learning, machine learning, all those capabilities. What we have done is we have extracted the core capabilities of Watson on private cloud, and it's IBM Machine Learning. But the experience is the same. >> I want to talk about this notion of operationalizing analytics. And it ties, to me anyway, it ties into transformation. You mentioned going from the notebook to actually being able to embed analytics in the workflow of the business. Can you double click on that a little bit, and maybe give some examples of how that has helped companies transform? >> Right. So when I talk about operationalizing, when you look at machine learning, you have everything from data, which is the most critical piece, to building or deploying the model. A lot of times, the data itself is not clean. I'll give you an example, right. So >> OSYX. >> Yeah. When we are working with an insurance company, for example, in the data that comes in, if you just take gender, a lot of times the values are null. So we have to build another model to figure out if it's male or female, right? So in this case, for example, if somebody has done a prostate exam, obviously, he's a male. You know, we figured that. Or has a gynecology exam, it's a female. So there's a lot of work just to get that data cleansed. That's where I mentioned that, you know, machine learning is 20% fun, 80% elbow grease, because there's a lot of grease needed to make sure that you cleanse the data and get that right. That's the shaping piece of it. Then comes building the model, right.
And then, once you build the model on that data, comes the operationalization of that model, which in itself is huge, because how do you make sure that you infuse that model into your current infrastructure? That's where a lot of skill set, a lot of experience, and a lot of knowledge comes in, because, unless you are a start-up, you already have applications, programs, and third-party vendor applications that have been running for years, or decades, for that matter. So operationalization is a huge piece. Cleansing of the data is a huge piece. Getting the model right is another piece. >> And simplifying the whole process. I think about, I've got to ingest the data. I've now got to, you know, play with it, explore it. I've got to process it. And I've got to serve it to some, you know, some business need or application. And typically, those are separate processes, separate tools, maybe different personas that are doing that. Am I correct that your announcement in the Fall addressed that workflow? How is it being, you know, deployed and adopted in the field? How is it, again back to transformation, are you seeing that people are actually transforming their analytics processes and ultimately creating the outcomes that they expect? >> Huge. So good point. We announced the Data Science Experience in the Fall. And the customers who are going to speak with us today on stage are the customers who have been using that. So, for example, if you take AFCU, America First Credit Union, they worked with us. In two weeks, you know, talk about transformation, we were able to absorb the knowledge of their underwriters, you know, what (mumbles) is in, build that, get the features, and we were able to build a model in two weeks. And the model is predicting with 90% accuracy. That's what early tests are showing. >> [Dave] And you say that was in a couple of weeks. You developed that model. >> Yeah, yeah, right.
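The gender-imputation step Dinesh described a moment earlier, inferring a null gender field from procedure history, can be sketched with simple rules. This is an illustrative toy, not the insurer's actual pipeline: the field names and procedure codes are made up, and a real system might train a classifier rather than hand-write rules.

```python
# Minimal sketch of the data-cleansing step described above: fill in a
# missing gender value when a gender-specific procedure appears.
# Codes and field names are hypothetical.

PROCEDURE_GENDER_RULES = {
    "prostate_exam": "M",
    "gynecology_exam": "F",
}

def impute_gender(record):
    """Return the record with gender filled in where a rule applies."""
    if record.get("gender") is None:
        for proc in record.get("procedures", []):
            if proc in PROCEDURE_GENDER_RULES:
                record["gender"] = PROCEDURE_GENDER_RULES[proc]
                break
    return record

records = [
    {"id": 1, "gender": None, "procedures": ["prostate_exam"]},
    {"id": 2, "gender": "F", "procedures": []},
    {"id": 3, "gender": None, "procedures": ["x_ray"]},   # no rule fires: stays null
]
cleaned = [impute_gender(r) for r in records]
print([r["gender"] for r in cleaned])  # ['M', 'F', None]
```

Note that record 3 is left null rather than guessed, which matches the point of the anecdote: cleansing is mostly deciding what you can and cannot safely infer.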
So when we talk about transformation, right, we couldn't have done that a few years ago. We have transformed so that the different personas can collaborate with each other, and that's the collaboration piece I talked about. Real time. Be able to build a model, and put it to the test to see what kind of benefits they're getting. >> And you've obviously got edge cases where people get really sophisticated, but, you know, we were sort of talking off camera, and, you know, like the 80/20 rule, or maybe it's the 90/10, you say most use cases can be, you know, solved with regression and classification. Can you talk about that a little more? >> So, when we talk about machine learning, right, I would say 90% of it is regression or classification. I mean, there are edge cases around clustering and all those things, but linear regression or classification can solve most of our customers' problems, right? Whether it's fraud detection, or whether it's underwriting a loan, or whether you're trying to do sentiment analysis, you can kind of classify or do regression on it. So I would say that 90% of the cases can be covered. But like I said, most of the work is not about picking the right algorithm; it's also about cleansing the data. Picking the algorithm, then comes building the model, then comes deployment or operationalizing the model. So there's a step process that's involved, and each step involves some amount of work. So if I could make one more point on the technology and the transformation we have done: even picking the right algorithm, we automated, so you as a data scientist don't need to, you know, come in and figure it out. If I have 50 classifiers and each classifier has four parameter settings, that's 200 different combinations. Even if you take one hour on each combination, that's 200 hours, or about nine days, that it takes you to pick the right combination.
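The combinatorial search just described (50 candidate classifiers times four parameter settings, 200 combinations) is, at its simplest, an exhaustive grid search. Here is a minimal sketch, with a made-up deterministic scoring function standing in for the cross-validation a real system would run per combination:

```python
# Exhaustive grid search over (classifier, parameter) pairs.
# The evaluate() function is a toy stand-in for real training plus
# cross-validation, which is what makes each combination cost ~an hour.
import itertools

classifiers = [f"clf_{i}" for i in range(50)]   # 50 candidate algorithms
param_settings = [0.01, 0.1, 1.0, 10.0]         # 4 settings each -> 200 combinations

def evaluate(clf, param):
    # Deterministic toy score surface that peaks at ("clf_17", 0.1);
    # a real system would score each combination on held-out data.
    i = int(clf.split("_")[1])
    return -((i - 17) ** 2) - (param - 0.1) ** 2

combos = list(itertools.product(classifiers, param_settings))
best = max(combos, key=lambda c: evaluate(*c))
print(len(combos), best)  # 200 ('clf_17', 0.1)
```

An assistant like the "cognitive assistance for data science" Dinesh mentions next would presumably prune or guide this search rather than evaluate all 200 combinations exhaustively, which is where the days-to-minutes saving comes from.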
What we have done in IBM Machine Learning is something called cognitive assistance for data science, which will help you pick the right combination in minutes instead of days. >> So I can see how regression scales, and in the example you gave of classification, I can see how that scales, if you've got a, you know, fixed classification, or maybe 200 parameters, or whatever it is. What happens, how are people dealing with automating that classification as things change, as some kind of new disease or pattern pops up? How do they address that at scale? >> Good point. So as the data changes, the model needs to change, right? Because everything that model knows is based on the training data. Now, if the data has changed, if the symptoms of cancer or any disease have changed, obviously, you have to retrain that model. And that's where the feedback loop comes in, where we will automatically retrain the model based on the new data that's coming in. So you, as an end user, for example, don't need to worry about it, because we will take care of that piece also. We will automate that, also. >> Okay, good. And you've got a session this afternoon with, you said, two clients, right? AFCU and Kaden dot TV, and you're on, let's see, at 2:55. >> Right. >> So you folks watching the live stream, check that out. I'll give you the last word, you know, what shall we expect to hear there? Show a little leg on your discussion this afternoon. >> Right. So, obviously, I'm going to talk about the differentiating factors, what we are delivering in IBM Machine Learning, right? And I covered some of it. There's going to be much more. We are going to focus on how we are making freedom or flexibility available, how we are going to deliver productivity gains for our data scientists and developers. We are going to talk about trust, you know, the trust of data that we are bringing in.
Then I'm going to bring the customers in and talk about their experience, right? We are delivering a product, but we already have customers using it, so I want them to come on stage and share their experiences, because it's one thing to hear about it from us, but it's another thing when customers come and talk about it. And last but not least, we are going to announce our first release of IBM Machine Learning on Z, because if you look at transactional data today, 90% of it runs through Z. So they don't have to off-load the data to do analytics on it; we will make machine learning available so you can do training and scoring right there on Z for your real time analytics. >> Right. Extending that theme that we talked about earlier, Stu, bringing analytics and transactions together, which was a big theme of the z13 announcement two years ago. Now you're seeing, you know, machine learning coming to Z. The live stream starts at 2 o'clock. SiliconANGLE.com had an article up on the site this morning from Maria Deutscher on the IBM announcement, so check that out. Dinesh, thanks very much for coming back on theCube. Really appreciate it, and good luck today. >> Thank you. >> All right. Keep it right there, buddy. We'll be back with our next guest. This is theCube. We're live from the Waldorf Astoria for the IBM Machine Learning Event announcement. Right back.

Published Date : Feb 15 2017



Anjul Bhambri - IBM Information on Demand 2013 - theCUBE


 

>> Okay, welcome back to IBM's Information on Demand, live in Las Vegas. This is theCube, SiliconANGLE Media's flagship program. We go out to the events, extract the signal from the noise, talk to the thought leaders, get all the data and share that with you. Go to SiliconANGLE.com or Wikibon.org to get all the footage, and if you want to participate with us, we're rolling out our new crowd-activated innovation application called CrowdChat. Go to crowdchat.net/IBMiod, just log in with your Twitter handle or your LinkedIn and participate and share your voice. It's going to be an on-the-record transcript of theCube conversations. I'm John Furrier with SiliconANGLE, with my co-host. >> Hi buddy, I'm Dave Vellante, Wikibon.org. Thanks for watching. Anjul Bhambri is here. She's the Vice President of Big Data and Analytics at IBM and a many-time Cube guest. Welcome back, good to see you again. >> Thank you. >> So we were both down in New York City last week for Hadoop World. Really amazing to see how that industry has evolved. I mean, you guys, I've said it a number of times today, and I said this to you before: you superglued your analytics business to the Big Data meme and really created a new category. I don't know if that was by design or not, but it certainly happened. >> By design. >> Well, congratulations then, because I think that, you know, even a year, a year and a half ago, those two terms, big data and analytics, were sort of separate. Now it's really considered as one, right? >> Yeah, yeah. I think because initially, as businesses started getting really flooded with big data, dealing with the large volumes, dealing with structured, semi-structured or unstructured data, they were looking at, you know, how do you store and manage this data in a cost-effective manner. But if you're just only storing this data, that's useless, and now obviously people realize that there are insights in this
data that has to be gleaned, and there's technology available to do that. So customers are moving very quickly to that. It's not just about cost savings in terms of handling this data, but getting insights from it. So big data and analytics, you know, is becoming synonymous. >> What's interesting to me, Anjul, you know, just following this business, it's like there's a zillion different nails out there, and everybody has a hammer, and they're hitting the nail with their unique hammer. But it's like IBM has a lot of different hammers, so we could talk about that a little bit. You've got a very diverse portfolio. You don't try to force one particular solution on the client; it's sort of an "it depends" sort of answer. Can we talk about that a little bit? >> Yeah, sure. So in the context of big data, let's start with transactional data, right? That continues to be the number one source where there are very valuable insights to be gleaned. The volumes are growing: we have retailers that are now handling 2.5 million transactions per hour, a telco industry handling 10 billion call detail records every day. So when you look at that volume of transactions, obviously you need engines that can handle that, that can process, analyze and gain insights from this, so that you can do ad hoc analytics, run queries and get information out of this at the same speed at which this data is getting generated. So, you know, we announced BLU Acceleration, right, which is our in-memory column store, which gives you the power to handle these kinds of volumes and be able to really query and get value out of this very quickly. But now when you go beyond the structured data, or beyond transactional data, there is semi-structured and unstructured data. That's where, for data still at rest, we have BigInsights, which leverages Apache Hadoop,
open source, but we've built lots of capabilities on top of that, where we give customers the best of open source plus, at the same time, the ability to analyze this data. So we have text analytics capabilities, we provide machine learning algorithms, and we have provided integration so that customers can do predictive modeling on this data using SPSS, using open source languages like R. And in terms of visualization, they can visualize this data using Cognos, they can visualize it using MicroStrategy. So we are giving customers, like you said, it's not just one hammer that they have to use for every nail. >> The other aspect has been around real time, and we heard that a lot at Strata, right? I've been going to Strata since the beginning, and at that time, even though we were talking about real time, nobody else was; back in the Hadoop days it was one big batch job. So real time is now the hotbed of the conversation: Storm, these new technologies coming out, what YARN has done. It's been interesting. Have you seen the same thing? >> Yeah. And of course, you know, we have a very mature technology in that space: InfoSphere Streams for real-time analytics has been around for a long time. It was developed initially for the US government, so we've been in the space for longer than anybody else, and we have deployments in the telco space where tens of billions of call detail records are being processed and analyzed in real time. These telcos are using it to predict customer churn, to prevent customer churn, gaining all kinds of insights at extremely high throughput and very low latency. So it's good to see that other companies are recognizing the need for it and bringing other offerings out in this space. >> Yes, every time somebody says, oh, I want to go low latency and I want to use
Spark, you say, okay, no problem, we could do that. And Streams is interesting because, if I understand it, you're basically acting on the data, producing analytics, prior to persisting the data; it's all in memory. But at the same time, my question is, is it evolving so that you can now blend that sort of real-time activity with maybe some batch data? Can you talk about how that's evolving? >> Yeah, absolutely. So Streams is for, you know, as data is coming in, it can be processed and filtered; patterns can be seen in streams of data by correlating and connecting different streams of data, and based on certain events occurring, actions can be taken. Now, it is possible that all of this data doesn't need to be persisted, but there may be some aspects or some attributes of this data that need to be persisted. You could persist this data in a database, that is, use it as a way to populate your warehouse, or you could persist it in a Hadoop-based offering like BigInsights, where you can bring in other kinds of data and enrich the data. It's like data learns from data, and a different picture emerges, Jeff Jonas's puzzle, right? So that's very valid. So when we look at real time, it is about taking action in real time, but there is data that can be persisted from that, in both the warehouse as well as in something like BigInsights. >> I want to throw a term at you and see what this means to you. We're actually doing some crowd chats with IBM on this topic: the data economy. What does the data economy mean to you? What are customers doing with the data economy? >> Okay, so my take on this is that there are two aspects to it. One is that the cost of storing the data and analyzing and processing the data has gone down substantially. But the value in this data, because you can now process and analyze petabytes of this data, you can bring in not just
But the value in this data has gone up substantially, because you can now process and analyze petabytes of it, bringing in not just structured but semi-structured and unstructured data, and glean information from different types of data so that a different picture emerges. Previously a lot of this data was probably discarded, without people knowing there was useful information in it. So to the business, the value of the data has gone up. What they can do with it in terms of making business decisions, making their customers and consumers more satisfied, giving them the right products and services, and monetizing that data has gone up, while the cost of storing, analyzing, and processing it has gone down, which I think is fantastic. It's a huge win-win for businesses, and a huge win-win for consumers, because they're now getting products and services from businesses that they weren't getting before. That, to me, is the economy of data.

>> Dave Vellante: This is why, John, I think IBM is really going to kill it in this business, because they've got such a huge portfolio. If you look at where IOD has evolved: data management, information management, data governance, all the stuff on privacy. These were all cost items before. People looked at them as "I've got to deal with all this data," and now there's been a bit flip. IBM is in this wonderful position to take advantage of it. Of course, Ginni's trying to turn the battleship and get everybody aligned, but the moons and stars are aligning, and there's a tailwind.

>> John Furrier: We have a question on Twitter from Jim Lundy, analyst, former Gartner analyst with his own firm now. Shout-out to Jim. Thanks for watching, as always; he's a Cube alum, an avid watcher, and now a loyal member of the crowd chat community. The question is: "BLU Acceleration helps drive more data into actionable analytics and dashboards. Can IBM drive new deals with it?"
>> Dave Vellante: So the answer is yes, yes, yes, but can you elaborate on that for Jim?

>> Anjul Bhambri: Yeah. With BLU Acceleration, we've had customers that have evaluated BLU against SAP HANA and found that what BLU provides is ahead of what SAP HANA can provide. We have a number of accounts where people are going with it. The performance and throughput that BLU provides are very unique, and they're ahead of what anybody else has in the market, including SAP. Ultimately it's about value to the business, right? That's what we're trying to do: give our customers the right technology so they can deal with all of this data, get their arms around it, and get value from it quickly. That's really the essence here.

>> Dave Vellante: The wonderful part of Jim's question is, yes, it's driving new deals for sure: a new product, new deals. But maybe what he's asking is whether it's driving new footprints. In other words, traditional IBM accounts are doing deals, but are you able to drive new footprints?

>> Anjul Bhambri: Yeah. There are customers, and I'm not going to name names here, that have come to us that are new to IBM. So that's happening; that's net new business, and it's happening for all our big data offerings, because of the richness that's in the portfolio. Like you were saying, Dave, it's not that we have one hammer and we're going to use it for every nail that's out there. People are looking at BLU, at BigInsights for Hadoop, at Streams for real time, and with all of this comes the whole lifecycle management and governance piece. Security and privacy don't go away, so all the stuff that was relevant for relational data we're now able to bring to big data very quickly, which I think is of huge value to customers.
And as people are moving very quickly in this big data space, there's nobody else who can bring all of these assets together and provide an integrated platform.

>> Dave Vellante: To Jim's point, I know you don't want to name names, but how about some use cases these customers are tackling with BLU? What use cases are they solving?

>> Anjul Bhambri: From a use-case standpoint, people are seeing performance that's 30 to 32 times faster than what they saw when they weren't using an in-memory columnstore. Performance gains from 8x up to 25 or 32 times are huge, and that's getting more and more people attracted to this.

>> Dave Vellante: Let's take an industry. Financial services, for example: the big ones there are risk, where people want to know their credit risk, and obviously marketing, serving up ads. Fraud detection, you would think, is another one, in more real time.

>> Anjul Bhambri: Yes, those would be the segments. And of course retail, where, like I was saying, the number of transactions being handled is growing phenomenally. I gave one example of around 2.5 million transactions per hour, which was unheard of before, and there's information that has to be gleaned from that: leveraging it for demand forecasting, and for insights like giving customers the right coupons and making sure those coupons actually get used. It used to be that you got coupons in your mail; then the world changed and you got coupons after you'd done the transaction. Now, where we're seeing customers go is that when a customer walks into the store, that's where they get the coupons, based on which aisle they're in.
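The in-aisle coupon scenario boils down to joining a real-time location event against the shopper's stored purchase history. A minimal sketch, with an invented data model and matching rule (not any IBM product API):

```python
# Toy in-aisle coupon targeting: join a real-time location event with
# stored purchase history. Aisle contents, customer histories, and the
# overlap rule are all invented for illustration.
PURCHASE_HISTORY = {
    "cust-1": {"cereal", "coffee"},
    "cust-2": {"pet food"},
}
AISLE_PRODUCTS = {
    "aisle-3": {"coffee", "tea"},
    "aisle-7": {"pet food", "pet toys"},
}

def coupon_for(event):
    """Offer a coupon when the shopper's history overlaps the aisle."""
    history = PURCHASE_HISTORY.get(event["customer"], set())
    overlap = history & AISLE_PRODUCTS.get(event["aisle"], set())
    if overlap:
        return {"customer": event["customer"], "product": sorted(overlap)[0]}
    return None  # no relevant history, no coupon

print(coupon_for({"customer": "cust-1", "aisle": "aisle-3"}))
print(coupon_for({"customer": "cust-2", "aisle": "aisle-3"}))
```

The interesting design constraint is latency: the history lookup has to be fast enough to fire while the shopper is still standing in the aisle, which is why the transactional side is precomputed into an indexed structure here.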
It's a combination of the transactional data and the location data, and we're able to bring all of this together. So it's BLU combined with things like Streams and BigInsights that makes the use cases even more powerful and unique.

>> John Furrier: I like this new format of the crowd chat. Usually it's a one-hour crowd chat where thought leaders are pounding away, but this is more like a Reddit AMA, only better. A question coming in from Grant Case: one of the themes we've heard about in Makino was the lack of analytical talent. What is going on to contribute more value for an organization: skilling up the workforce, or implementing better software tools for knowledge workers?

>> Anjul Bhambri: Skills have definitely been a challenge in the industry, and the challenge got compounded with big data and the new technologies coming in. From the standpoint of what we're doing for data scientists, the people who are leveraging data to gain new insights, to explore and discover what other attributes they should be adding to their predictive models to improve their accuracy, there is a very rich set of tools for exploration and discovery. We have such capabilities from Cognos, and we have them with our Data Explorer.

>> Dave Vellante: Basically tooling for the modeling side, right?

>> Anjul Bhambri: Right, for the modeling, and for predictive and descriptive analytics. When you look at petabytes of data, before people even get to predictive there's a lot of value to be gleaned from descriptive analytics, and being able to do that at scale, on petabytes of data, was difficult before. Now it's possible, with excellent visualization, so the analytics is becoming interactive.
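The in-memory columnstore gains mentioned earlier come from storing each attribute contiguously, so a descriptive query touches only the columns it needs instead of every whole record. A pure-Python toy of the two layouts (invented data; a real columnstore adds compression, vectorized execution, and more):

```python
# Row layout: one dict per transaction.
rows = [
    {"store": "A", "amount": 10.0, "items": 3},
    {"store": "B", "amount": 25.0, "items": 1},
    {"store": "A", "amount": 5.0,  "items": 2},
]

# Column layout: one list per attribute, aligned by position.
columns = {
    "store":  [r["store"] for r in rows],
    "amount": [r["amount"] for r in rows],
    "items":  [r["items"] for r in rows],
}

def total_amount_row(rows):
    """Row store: every full row object is touched, even for one attribute."""
    return sum(r["amount"] for r in rows)

def total_amount_col(columns):
    """Column store: the scan reads only the 'amount' column."""
    return sum(columns["amount"])

print(total_amount_row(rows), total_amount_col(columns))  # 40.0 40.0
```

Both queries return the same answer; the columnar one simply reads a fraction of the data, which is the intuition behind the order-of-magnitude scan speedups cited in the conversation.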
It's not just batch anymore. You're able to do this in real time, ask the questions, and get the right answers, because running the models on petabytes of data and getting the results back is now possible. So interactive analytics is where this is going.

>> Dave Vellante: Another question Jim was asking: is IBM going around doing BLU Acceleration upgrades with all its existing clients? "Loan origination is a no-brainer upgrade." That's the kind of follow-up to what I'd asked: is that new accounts and new footprint, or is it expanding existing ones?

>> Anjul Bhambri: It's both. It's both.

>> Dave Vellante: What are the characteristics of a company that is successfully leveraging data?

>> Anjul Bhambri: Companies are realizing that their existing EDW, the enterprise data warehouse, needs to be expanded. Before, they had warehouses handling just structured data. From a technology standpoint they're augmenting that and building a logical data warehouse that takes care of not just structured but also semi-structured and unstructured data, augmenting the warehouses with Hadoop-based offerings like BigInsights and real-time offerings like Streams, so that from an IT standpoint they're ready to deal with, analyze, and gain information from all kinds of data. Now, from the standpoint of how you start the big data journey: the platform that we provide is plug-and-play, so there are different starting points for businesses. They may have started with warehouses; they bring in a poly-structured store with BigInsights and Hadoop; they're building social profiles from social and public data, which wasn't being done before, and matching those with the enterprise data that may be in CRM systems and master data management systems inside the enterprise.
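Matching an external social profile against internal CRM or MDM records is, at its simplest, a keyed join with some normalization. A toy sketch, with invented field names and a deliberately naive email-equality rule (real MDM matching uses much richer probabilistic rules):

```python
# Toy social-profile-to-CRM matching: normalize on email and join.
# Field names and the matching rule are invented for illustration.
CRM = [
    {"customer_id": 101, "email": "Pat@Example.com", "segment": "gold"},
    {"customer_id": 102, "email": "lee@example.com", "segment": "new"},
]

def normalize(email):
    """Canonicalize an email so trivially different forms still match."""
    return email.strip().lower()

CRM_BY_EMAIL = {normalize(r["email"]): r for r in CRM}

def enrich(profile):
    """Attach CRM attributes to a social profile when emails match."""
    match = CRM_BY_EMAIL.get(normalize(profile["email"]))
    if match is None:
        return {**profile, "customer_id": None}
    return {**profile, "customer_id": match["customer_id"],
            "segment": match["segment"]}

result = enrich({"handle": "@pat", "email": "pat@example.com "})
print(result["customer_id"], result["segment"])  # 101 gold
```

The enriched record is the kind of combined view the conversation describes: public social data joined to the master record, so downstream analytics sees one customer instead of two disconnected silhouettes.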
That creates quadrants of comparison, and they're gaining more insights about the customer based on master data management and on the social profiles they're building. So that's one big trend we're seeing. To take this journey, they have to take smaller bites, digest them, and get value in chunks, rather than trying to eat the whole pie at once. A lot of companies are starting with exploration proofs of concept, implementing certain use cases in four to six weeks, getting value, and then continuing to add more data sources and more applications.

>> Dave Vellante: There are those who would say those existing EDWs should be retired. You'd disagree with that?

>> Anjul Bhambri: No. I think businesses very much need that experience and expertise. It's not an either/or; it's not that the EDW goes away and a different kind of warehouse takes its place. It's an evolution.

>> Dave Vellante: But there's a tension there, wouldn't you say? An organizational tension between the newbies and the existing EDW crowd?

>> Anjul Bhambri: I'd say maybe three years ago there was a little bit of that, but I talk to a lot of customers, and I don't see it anymore. People understand, they know what's happening, and they're moving with the times. They know this evolution is where the market, the business, and the technology are going, and that they'll be made obsolete if they don't embrace it.

>> Dave Vellante: So as we get on in time, I want to ask you a personal question. What's going on with you these days within IBM, Anjul? You're in a hot area; you were just in New York last week. Tell us what's going on in your life these days. What are you looking at? What are you paying attention to? What's on your radar?
When you wake up and get to work, what are you thinking about? What's the big picture?

>> Anjul Bhambri: Big data has been really fascinating: lots of different kinds of applications in different industries. Working with customers in telco, healthcare, banking, and the financial sector has been very educational, so there's been a lot of learning, and that's very exciting. What's on my radar: we've done a lot of work helping customers develop their big data platforms on-premise, and now we're seeing more and more of a trend where people want to put this on the cloud. It's not that we haven't paid attention to the cloud, but in the coming months you're going to see more from us on how we help customers build both private and public cloud offerings, where they can provide analytics as a service to different lines of business by setting up those clouds. So cloud is certainly on my mind.

>> Dave Vellante: The SoftLayer acquisition filled what was a hole in the portfolio, and you guys have got to drive that.

>> Anjul Bhambri: So both SoftLayer and, of course, OpenStack from an infrastructure standpoint, with what's happening in open source. We're leveraging both of those, and like I said, you'll hear more about that.

>> Dave Vellante: OpenStack is key for you guys, as I say, because you have street cred when it comes to open source, with what you did in Linux and the great business you made out of that. Everybody will point at Oracle or IBM or HP and say, "Oh, they just want to sell us their stack." You've got to demonstrate that you're open, and OpenStack is a great way to do that, along with other initiatives. So that's something to be excited about.

>> Anjul Bhambri: Yeah.

>> Dave Vellante: Okay, well, thanks very much for coming on theCube. It's always a pleasure. Thank you.
>> Anjul Bhambri: Yeah, same here.

>> John Furrier: Great having you back. Thank you very much. Okay, we'll be right back, live here inside theCube at IBM Information on Demand. Hashtag #IBMIOD. Go to crowdchat.net/IBMIOD and join the conversation, where we're going to have an on-the-record crowd chat with the folks who aren't here on-site as well as those who are. We're here live in Las Vegas. I'm John Furrier with Dave Vellante. We'll be right back.

Published Date : Nov 5 2013
