James Scott, ICIT | CyberConnect 2017
>> Narrator: New York City, it's the Cube covering CyberConnect 2017 brought to you by Centrify and the Institute for Critical Infrastructure Technology. >> Welcome back, everyone. This is the Cube's live coverage in New York City's Grand Hyatt Ballroom for CyberConnect 2017 presented by Centrify. I'm John Furrier, the co-host of the Cube, and my co-host this week is Dave Vellante, my partner, co-founder, and co-CEO with me of SiliconAngle Media and the Cube. Our next guest is James Scott, who is the co-founder and senior fellow at ICIT. Welcome to the Cube. >> Thanks for having me. >> You guys are putting on this event, really putting the content together. Centrify, just so everyone knows, is underwriting the event but this is not a Centrify event. You guys are the key content partner, developing the content agenda. It's been phenomenal. It's an inaugural event so it's the first of its kind, bringing in industry, government, and practitioners all together, kind of up-leveling from the normal and good events like Black Hat and other events like RSA, which go into deep dives. Here it's a little bit different. Explain. >> Yeah, it is. We're growing. We're a newer think tank. We're less than five years old. The objective is to stay smaller. We have organizations, like Centrify, that came out of nowhere in D.C. Most of what we've done up until now has been purely federal and on the Hill, so what I do, I work in the intelligence community. I specialize in social engineering and then I advise in the Senate for the most part, some in the House. We're able to take these organizations into the Pentagon or wherever, and when we get a good read on them and when senators are like, "hey, can you bring them back in to brief us?" that's when we know we have a winner. So we started really creating a relationship with Tom Kemp, who's the CEO and founder over there, and Greg Cranley, who heads the federal division.
They're aggressively trying to be different as opposed to trying to be like everyone else, which makes it easy. If someone wants to do something, they have to be a fellow for us to do it, but if they want to do it, just like if they want to commission a paper, we just basically say, "okay, you can pay for it but we run it." Centrify has just been excellent. >> They get the community model. They get the relationship that you have with your constituents in the community. Trust matters, so you guys are happy to do this but more importantly, the content. You're held to a standard in your community. This is new, not to go in a different direction for a second, but this is what the community marketing model is. Stay true to your audience and trust. You're relied upon, so that's some balance that you guys have to do. >> The thing is we deal with Cylance and others. Cylance, for example, was the first to introduce machine learning and artificial intelligence to get past that mutating hash for endpoint security. They fit in really well in the intelligence community. The great thing about working with Centrify is they let us take the lead and they're very flexible and we just make sure they come out on top each time. The content, it's very content driven. In D.C., at our cocktail receptions, there's CIA, NSA, DARPA, NASA. >> You guys are the poster child of be big, think small. >> Exactly. Intimate. >> You say Centrify is doing things differently. They're not falling in line like a lemming. What do you mean by that? What is everybody doing that these guys are doing differently? >> I think in the federal space, I think commercial too, but you have to be willing to take a big risk to be different, so you have to be willing to pay a premium. If people work with us, they know they're going to pay a premium but we make sure they come out on top. What they do is, they'll tell us, Centrify will be like, "look, we're going to put x amount of dollars into a lunch.
"Here are the types of pedigree individuals "that we need there." Maybe they're not executives. Maybe they're the actual practitioners at DHS or whatever. The one thing that they do differently is they're aggressively trying to deviate from the prototype. That's what I mean. >> Like a vendor trying to sell stuff. >> Yeah and the thing is, that's why when someone goes to a Centrify event, I don't work for Centrify (mumbles). That's how they're able to attract. If you see, we have General Alexander. We've got major players here because of the content, because it's been different, and then the other players want to be on the stage with other players, you know what I mean. It almost becomes a competition for "hey, I was asked to come to an ICIT thing," you know, that sort of thing. That's what I mean. >> It's reputation. You guys have a reputation and you stay true to that. That's what I was saying. To me, I think this is the future of how things get done. When you have a community model, you're held to a standard with your community. If you cross the line on that standard, you head fake your community, that's the algorithm that brings you a balance, so you bring good stuff to the table and you vet everyone else on the other side, so it's just more of a collaboration, if you will. >> The themes here, what you'll see is within critical infrastructure, we try to gear this a little more towards the financial sector. We brought in, from Aetna, the guy who set up the FS-ISAC. Now he's with the health sector ISAC. For this particular geography in New York, we're trying to have it focus more around health sector and financial critical infrastructure. You'll see that. >> Alright, James, I've got to ask you. You're a senior fellow.
You're on the front lines with a great Rolodex, great relationships in D.C., and you're advising and leaned upon by people making policy, looking at the world and the general layout in which, the reality is shit's happening differently now so the world's got to change. Take us through a day in the life of some of the things you guys are seeing and what's the outlook? I mean, it's like a perfect storm of chaos, yet opportunity. >> It really depends. Each federal agency, we look at it from a Hill perspective, it comes down to really educating them. When I'm advising in the House, I know I'm going to be working with a different policy pedigree than a Senate committee policy expert, you know what I mean. You have to gauge the conversation depending on how new the office is, House, Senate, are they minority side, and then what we try to do is bring the issues that the private sector is having while simultaneously hitting the issues that the federal agency space is having. Usually, we'll have a needs list from the CSWEP at the different federal agencies for a particular topic like the Chinese APTs or the Russian APT. What we'll do is, we'll break down what the issue is. With Russia, for example, it's a combination of two types of exploits that are happening. You have the technical exploit, the malicious payload and vulnerability in a critical infrastructure network, and then profiling those actors. We also have another problem, the influence operations, which is why we started the Center for Cyber Influence Operations Studies. We've been asked repeatedly since the elections last year by the intelligence community, "tell us, explain this new propaganda." The interesting thing is the synergy: the two sides are exploiting and weaponizing the same vectors. While on the technical side, you're exploiting a vulnerability in a network with a technical exploit, with a payload, a compiled payload with a bunch of tools.
On the influence operations side, they're weaponizing the same social media platforms that you would use to distribute a payload here but only the... >> Content payload. Either way you have critical infrastructure. The payload being content, fake content or whatever content, has an underpinning of gamification, call it virality, network effect, and user psychology: people don't really open up the Facebook post, they just read the headline and picture. There's a dissonance campaign, or whatever they're running, that might not be critical to national security at that time but it's also a post. >> It shifts the conversation in a way where they can use, for example, right now all the rage with nation states is to use metadata, put it into big data analytics, come up with a psychographic algorithm, and go after critical infrastructure executives with elevated privileges. You can do anything with those guys. You can spearphish them. The Russian modus operandi is to call and act like a recruiter, have that first touch of contact be the phone call, which they're not expecting. "Hey, I got this job. "Keep it on the down low. Don't tell anybody. "I'm going to send you the job description. "Here's the PDF." Take it from there. >> How should we think about the different nation state actors? You mentioned Russia, China, there's Iran, North Korea. Lay it out for us. >> Each geography has a different vibe to their hacking. With Russia you have this stealth and sophistication and their hacking is just like their espionage. It's like playing chess. They're really good at making pawns feel like they're kings on the chessboard, so they're really good at recruiting insider threats. Bill Evanina is the head of counterintel. He's a bulldog. I know him personally. He's exactly what we need in that position. The Chinese hacking style is more smash and grab, very unsophisticated. They'll use a payload over and over again so forensically, it's easy to... >> Dave: Signatures.
>> Yeah, it is. >> More sharing on the tooling or whatever. >> They'll use code to the point of redundancy so it's like alright, the only reason they got in... Chinese get into a network, not because of sophistication, but because the network is not protected. Then you have the mercenary element, which is where China really thrives. Chinese PLA will hack for the nation state during the day, but they'll moonlight at night to North Korea, so North Korea, they have people who may consider themselves hackers but they're not code writers. They outsource. >> They're brokers, like general contractors. >> They're not sophisticated enough to carry out a real nation state attack. What they'll do is outsource to Chinese PLA members. Chinese PLA members will be like, "okay well, here's what I need for this job." Typically, what the Chinese will do, their loyalties are different than in the West, during the day they'll discover a vulnerability or a 0-day. They won't tell their boss right away. They'll capitalize off of it for a week. You do that, you go to jail over here. Russia, they'll kill you. China, somehow this is an accepted thing. They don't like it but it just happens. Then you have the Eastern European nations, and Russia still uses mercenary elements out of Moscow and St. Petersburg, so what they'll do is they will freelance, as well. That's when you get the sophisticated, Carbanak-style hack where they'll go into the financial sector. They'll monitor the situation. Learn the ins and outs of everything having to do with that particular SWIFT system or bank or whatever. They go in and those are the guys that are making millions of dollars on a breach. Hacking in general is a grind. It's a lot of vulnerabilities that work, but few work for long. Everybody is always thinking there's this omega code that they have. >> It's just brute force. You just pound it all day long. >> That's it and it's a grind. You might have something that you worked on for six months.
You're ready to monetize. >> What about South America? What's the vibe down there? Anything happening in there? >> Not really. There is nothing of substance that really affects us here. Again, if an organization is completely unprotected. >> John: Russia? China? >> Russia and China. >> What about our allies? >> GCHQ. >> Israel? What's the collaboration, coordination, snooping? What's the dynamic like there? >> We deal, mostly, with NATO and Five Eyes. I actually had dinner with NATO last night. Five Eyes is important because we share signals intelligence and most of the communications will go through Five Eyes, which is Canada, the United States, Australia, New Zealand, and the UK. Those are our five most important allies and then NATO after that, as far as I'm concerned, for cyber. You have the whole weaponization of space going on with SATCOM interception. We're dealing with that with NASA, DARPA. Not a lot is happening down in South America. The next big thing that we have to look at is the cyber caliphate. You have the Muslim Brotherhood that funds it. Their influence operations domestically are extremely strong. They have a lot of contacts on the Hill, which is a problem. You have ANTIFA. So there are two sides to this. You have the technical exploit but then the information warfare exploit. >> What about the bitcoin underbelly that started with the Silk Road, and you've seen a lot of bitcoin. Money laundering is a big deal, know your customer. Now regulation is part of big ICOs going on. Are you seeing any activity from those? Are they pulling from previous mercenary groups or are they arbitraging just more free? >> For updating bitcoin? >> The whole bitcoin networks. There's been an effort to commercialize (mumbles) so there's been a legitimate track to bring that on but yet there's still a lot of actors.
>> I think bitcoin is important to keep and if you look at the more black ops type hacking or payment stuff, bitcoin is an important element just as Tor is an important element, just as encryption is an important element. >> John: It's fundamental, actually. >> It's a necessity so when I hear people on the Hill, I have my researcher, I'm like, "any time you hear somebody trying to have "weakened encryption, backdoor encryption" the first thing, we add them to the briefing schedule and I'm like, "look, here's what you're proposing. "You're proposing that you outlaw math. "So what? Two plus two doesn't equal four. "What is it? Three and a half? "Where's the logic?" When you break it down for them like that, on the Hill in particular, they begin to get it. They're like, "well how do we get the intelligence community "or the FBI, for example, to get into this iPhone?" Civil liberties, you've got to take that into consideration. >> I got to ask you a question. I interviewed a guy, I won't say his name. He actually commented off the record, but he said to me, "you won't believe how dumb some of these state actors are "when it comes to cyber. "There's some super smart ones. "Specifically Iran and the Middle East, "they're really not that bright." He used an example, I don't know if it's true or not, that Stuxnet, I forget which one it was, there was a test and it got out of control and they couldn't pull it back and it revealed their hand, but it could've been something worse. His point was they actually screwed up their entire operation because they're doing some QA on their thing. >> I can't talk about Stuxnet but it's easy to get... >> In terms of how you test them, how do you QA your work? >> James: How do you review malware? (mumbles) >> You can't comment on the accuracy of Zero Days, the documentary? >> Next question. Here's what you find.
Some of these nation state actors, they saw what happened with our elections so they're like, "we have a really crappy offensive cyber program "but maybe we can thrive in influence operations "in propaganda and whatever." We're getting hit by everybody and 2020 is going to be, I don't even want to imagine. >> John: You think it's going to be out of control? >> It's going to be. >> I've got to ask this question, this came up. You're bringing up a really good point I think a lot of people aren't talking about but we've brought up a few times. I want to keep on getting it out there. In the old days, state-on-state actors used to do things, espionage, and everyone knew who they were and it was very important not to bring their queen out, if you will, too early, or reveal their moves. Now with WikiLeaks and public domain, a lot of these tools are being democratized so that they can covertly put stuff out in the open for enemies of our country to just attack us at will. Is that happening? I hear about it, meaning that I might be Russia or I might be someone else. I don't want to reveal my hand but hey, you ISIS guys out there, all you guys in the Middle East might want to use this great hack and put it out in the open. >> I think yeah. The new world order, I guess. The order of things, the power positions are completely flipped, B side, counter, whatever. It's completely not what the establishment was thinking it would be. What's happening is Facebook is no more relevant, I mean Facebook is more relevant than the UN. WikiLeaks has more information pulsating out of it than a CIA analyst, whatever. >> John: There's a democratization of the information? >> The thing is we're no longer a world that's divided by geographic lines in the sand that were drawn by these two guys that fought and lost a war 50 years ago.
We're now in a tribal chieftain digital society and we're separated by ideological variation, and so you have tribe members here in the US who have fellow tribe members in Israel, Russia, whatever. Look at Anonymous. Anonymous, I think everyone understands that's the biggest law enforcement honeypot there is, but you look at the ideological variation and it's hashtags and it's keywords and it's forums. That's the Senate. That's Congress. >> John: This is a new reality. >> This is reality. >> How do you explain that to senators? I was watching that on TV where they're trying to grasp what Facebook is and Twitter. (mumbles) Certainly Facebook knew what was going on. They're trying to play policy and they're new. They're newbies when it comes to policy. They don't have any experience on the Hill, now it's ramping up and they've had some help, but tech has never been an actor on the stage of policy formulation. >> We have a real problem. We're looking at outside threats as our national security threats, which is incorrect. You have dragnet surveillance capitalists. Here's the biggest threats we have. The weaponization of Facebook, Twitter, YouTube, Google, and search engines, like Comcast. They all have a censorship algorithm, which is how they monetize your traffic. It's censorship. You're signing your rights away and your free will when you use Google. You're not getting the right answer, you're getting the answer that coincides with an algorithm that they're meant to monetize and capitalize on. It's complete censorship. What's happening is, we had something that just passed, S.J. Res. 34, with no resistance whatsoever, blew my mind. What that allows is for a new actor, the ISPs, to curate metadata on their users and charge them their monthly fee as well. It's completely corrupt. These dragnet surveillance capitalists have become dragnet surveillance censorists. Is that a word? Censorists? I'll make it one. Now they've become dragnet surveillance propagandists.
That's why 2020 is up for grabs. >> (mumbles) We come from the same school here on this one, but here's the question. The younger generation, I asked a gentleman in the hallway on his way out, I said, "where's the cyber West Point? "Where are the Navy SEALs in this new digital culture?" He said, "oh yeah, some things." We're talking about the younger generation, the kids playing Call of Duty, Destiny. These are the guys out there, young kids coming up that will probably end up having multiple disciplinary skills. Where are they going to come from? So the question is, are we going to have a counterculture? We're almost feeling like what the 60s were to the 50s. Vietnam. I kind of feel like maybe if the security stuff doesn't get taken care of, a revolt is coming. You talk about dragnet censorship. You're talking about the lack of control and privacy. I don't mind giving Facebook my data to connect with my friends and see my Thanksgiving photos or whatever, but now I don't want fake news jammed down my throat. Anti-Trump and anti-Hillary spew. I didn't buy into that. I don't want that anymore. >> I think millennials, I have a 19-year-old son, my researchers, they're right out of grad school. >> John: What's the profile like? >> They have no trust whatsoever in the government and they laugh at legislation. They don't care any more about having their face on their Facebook page and all their most intimate details of last night's date and tomorrow's date with two different, whatever. They just don't... They loathe the traditional way of things. You got to talk to General Alexander today. We have a really good relationship with him, Hayden, Mike Rogers. There is a counterculture in the works but it's not going to happen overnight because we have a tech deficit here where we need foreign tech people just to make up for the deficit.
>> Bill Mann and I were talking, I heard the general basically, this is my interpretation, "if we don't get our shit together, "this is going to be an f'd up situation." That's what I heard him basically say. What Bill talked about was two scenarios: if industry and government don't share and come together, they're going to have stuff mandated on them by the government. Do you agree? >> I do. >> What's going to happen? >> The argument for regulation on the Hill is they don't want to stifle innovation, which makes sense, but then ISPs don't innovate at all. They're using 1980s technology, so why did you pass S.J. Res. 34? >> John: For access? >> I don't know because nation states just look at that as, "oh wow, another treasure trove of metadata "that we can weaponize. "Let's start psychographically charging alt-left "and alt-right, you know what I mean?" >> Hacks are inevitable. That seems to be the trend. >> You talked before, James, about threats. You mentioned weaponization of social. >> James: Social media. >> You mentioned another in terms of ISPs I think. >> James: Dragnet. >> What are the big threats? Weaponization of social. ISP metadata, obviously. >> Metadata, it really depends and that's the thing. That's what makes the advisory so difficult because you have to go between influence operations and the exploit, because the vectors are used for different things in different variations. >> John: Integrated model. >> It really is, and so with a question like that I'm like okay, so my biggest concern is the propaganda, political warfare, the information warfare. >> People are underestimating the value of how big that is, aren't they? They're oversimplifying the impact of info campaigns. >> Yeah because your reality is based off of... It's like this, influence operations. Traditional media, everybody is all about the narrative and controlling the narrative.
What Russia understands is to control the narrative, the most embryo state of the narrative is the meme. Control the meme, control the idea. If you control the idea, you control the belief system. Control the belief system, you control the narrative. Control the narrative, you control the population. No guns were fired, see what I'm saying? >> I was explaining to a friend on Facebook, I was getting into a rant on this. I used a very simple example. In the advertising world, they run millions of dollars of ad campaigns on car companies for post-car-purchase cognitive dissonance campaigns. Just to make you feel good about your purchase. In a way, that's what's going on and explains what's going on on Facebook. This constant reinforcement of these beliefs, whether it's for Trump or Hillary, all this stuff was happening. I saw it firsthand. That's just one small nuance but it's across a spectrum of memes. >> You have all these people, you have nation states, you have mercenaries, but the most potent force in this space, the most hyperevolving in influence operations, is the special interest group. The well-funded special interests. That's going to be a problem. 2020, I keep hitting that because I was doing an interview earlier. 2020 is going to be a tug of war for the psychological core of the population and it's free game. Dragnet surveillance capitalists will absolutely be dragnet surveillance propagandists. They will have the candidates that they're going to push. Now that can also work against them because mainstream media, Twitter, Facebook were completely against Trump, for example, and that worked to his advantage. >> We've seen this before. I'm a little bit older, but we are the same generation. Remember when they were going to open up CLECs? Remember the last mile for connectivity? That battle was won before it was even fought. What you're saying, if I get this right, the war and tug of war going on now is a big game.
If it's not played in one now, this jerry-rigging, gerrymandering of stuff could happen, so when people wake up and realize what's happened, the game has already been won. >> Yeah, your universe as you know it, your belief systems, what you hold to be true and self-evident. Again, the embryo. If you look back to the embryo introduction of that concept, whatever concept it is, to your mind it came from somewhere else. There are very few things that you believe that you came up with yourself. The digital space expedites that process and that's dangerous because now it's being weaponized. >> Back to the, who fixes this. Who's the watchdog on this? These ideas you're talking about, some of them, you're like, "man that guy has lost it, he's crazy." Actually, I don't think you're crazy at all. I think it's right on. Is there a media outlet watching it? Who's reporting on it? What even can grasp what you're saying? What's going on in D.C.? Can you share that perspective? >> Yeah, the people that get this are the intelligence community, okay? The problem is the way we advise is I will go in with one of the silos in the NSA and explain what's happening and how to do it. They'll turn around their computer and say, "show me how to do it. "How do you do a multi-vector campaign "with this meme and make it viral in 30 minutes." You have to be able to show them how to do it. >> John: We can do that. Actually we can't. >> That sort of thing, you have to be able to show them because there's not enough practitioners, we call them operators. When you're going in here, you're teaching them. >> The thing is if they have the metadata to your treasure trove, this is how they do it. I'll explain here. If they have the metadata, they know where the touch points are. It's a network effect model, just a distributive model. They can put content in certain subnetworks that they know have a reaction to the metadata, so they have the knowledge going in. It's not like they're scanning the whole world.
They're monitoring pockets like a drone, right? Once they get over the territory, then they do the acquired deeper targets and then go viral. That's basically how fake news works. >> See the problem is, you look at something like alt-right and ANTIFA. ANTIFA, just like Black Lives Matter, the initiatives may have started out with righteous intentions, just like take a knee. These initiatives, first stage is if it causes chaos, chaos is the op for a nation state in the US. That's the op. Chaos. That's the beginning and the end of an op. What happens is they will say, "oh okay look, this is ticking off all these other people "so let's fan the flame of this take a knee thing "hurt the NFL." Who cares? I don't watch football anyway but you know, take a knee. It's causing all this chaos. >> John: It's called trolling. >> What will happen is Russia and China, China has got their 13th Five-Year Plan, Russia has their foreign influence operations. They will fan that flame to exhaustion. Now what happens to the ANTIFA guy when he's a self-radicalized wound collector with a mental disorder? Maybe he's bipolar. Now with ANTIFA, he's experienced a heightened, more extreme variation of that particular ideology, so who steps in next? Cyber caliphate and Muslim Brotherhood. That's why we're going to have an epidemic. I can't believe, you know, ANTIFA is a domestic terrorist organization. It's shocking that the FBI is not taking this more seriously. What's happening now is the Muslim Brotherhood funds basically the cyber caliphate. The whole point of the cyber caliphate is to create awareness, instill the illusion of rampant xenophobia for recruiting. They have self-radicalized wound collectors with ANTIFA that are already extremists anyway. They're just looking for a reason to take that up a notch. That's when, cyber caliphate, they hook up with them with a hashtag. They respond and they create a relationship. >> John: They get the flywheel going.
>> They take them to a deep web forum, dark web forum, and start showing them how it works. You can do this. You can be part of something. This guy who was never even Muslim now is going under the ISIS moniker and he acts. He runs people over in New York. >> They fossilized their belief system. >> The whole point of the cyber caliphate is to find actors that are already in the self-radicalization phase, but what does it take psychologically and from a mentoring perspective, to get them to act? That's the cyber caliphate. >> This is the value of data and context in real time, using the current events to use that data, refuel their operation. It's data driven terrorism. >> What's the prescription that you're advising? >> I'm not a regulations kind of guy, but any time you're curating metadata like we're just talking about right now. Any time you have organizations, like Google, like Facebook, that have become so big, they are like their own nation state. That's a dangerous thing. The metadata curation. >> John: The value of the data is very big. That's the point. >> It is because what's happening... >> John: There's always a vulnerability. >> There's always a vulnerability and it will be exploited, and all that metadata, it's unscrubbed. I'm not worried about them selling metadata that's scrubbed. I'm worried about the nation state or the sophisticated actor that already has a remote access Trojan on the network and is exfiltrating in real time. That's the guy that I'm worried about because he can just say, "forget it, I'm going to target people that are at this phase." He knows how to write algorithms, comes up with a good psychographic algorithm, puts the data in there, and now he's like, "look, I'm only going to promote this concept "to people at this particular stage of self-radicalization "or sympathetic to the Kremlin."
We have a big problem on the college campuses with IP theft because of the Chinese Students and Scholars Associations, which are directly run by the Chinese Communist Party. >> I heard a rumor that Equifax's franchising strategy had partners on the VPN that were state sponsored. They weren't even hacking, they had full access. >> There's a reason that the Chinese are buying hotels. They bought the Waldorf Astoria. We do stuff with the UN and NATO, you can't even stay there anymore. I think it's still under construction but it's a no-no to stay there anymore. I mean western nations and allies, because they'll have bugs in the rooms. The WiFi that you use... >> Has fake certificates. >> Or there's a vulnerability that's left in that network, so the information for executives who have IP or PII or electronic health records, you know what I mean? You go to these places to stay overnight, as an executive, and you're compromised. >> Look what happened with Eugene Kaspersky. I don't know the real story. I don't know if you can comment, but someone sees that and says, "this guy used to have high level meetings "at the Pentagon weekly, monthly." Now he's persona non grata. >> He fell out of favor, I guess, right? It happens. >> James, great conversation. Thanks for coming on the Cube. Congratulations on the great work you guys are doing here at the event. I know the content has been well received. Certainly the keynotes we saw were awesome. CSOs, view from the government, from industry, congratulations. James Scott, who is the co-founder and senior fellow of ICIT, Internet Critical Infrastructure Technology. >> James: Institute of Critical Infrastructure Technology. >> T is for tech. >> And the Center for Cyber Influence Operations Studies. >> Good stuff. A lot of stuff going on (mumbles), exploits, infrastructure, it's all mainstream. It's the crisis of our generation.
There's a radical shift happening and the answers are all going to come from industry and government coming together. This is the Cube bringing the data, I'm John Furrier with Dave Vellante. Thanks for watching. More live coverage after this short break. (music)
Jean Francois Puget, IBM | IBM Machine Learning Launch 2017
>> Announcer: Live from New York, it's theCUBE, covering the IBM machine learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's the distinguished engineer for machine learning and optimization at IBM Analytics, CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys give one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. Algorithms, when you look at newspapers or blogs, social media, it's all about algorithms. Our view is that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. So before, you need to get data, to transform it, to make it usable for machine learning. And then, you run algorithms. These produce models, and then, you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then, you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operational system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks on theCUBE in the past, years ago actually, say, "You know what, algorithms are plentiful." I think he made the statement, I remember my friend Avi Mehta, "Algorithms are free. It's what you do with them that matters." >> Exactly, and that's why, I believe, open source won for machine learning algorithms.
Now the future is with open source, clearly. But it solves only a part of the problem you're facing if you want to put machine learning into action. So, exactly what you said. What you do with the results of an algorithm is key. And open source people don't care much about it, for good reasons. They are focusing on producing the best algorithm. We are focusing on creating value for our customers. It's different. >> In terms of, you mentioned open source a couple times, in terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source, how do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't even know if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source packages. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorite, well, let's just start with fraud. Fraud detection has become much, much better over the past certainly 10 years, but still not perfect. I don't know if perfection is achievable, but a lot of false positives. How will machine learning affect that? Can we expect as consumers even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, for this credit card transaction coming in, it has a high probability of being fraud. Or this one has a lower probability.
But then it's up to the designer of the overall application to make decisions, so what we recommend is to use machine learning for prediction, but not only that, and then use, maybe, (murmuring). For instance, if your machine learning model tells you this is a fraud with a high probability, say 90%, and this is a customer you know very well, a 10-year customer you know very well, then you can be confident that it's a fraud. Then the next one tells you this is a 70% probability, but it's a customer of only one week. In a week, we don't know the customer, so the confidence we can put in the machine learning should be low, and there you will not reject the transaction immediately. Maybe you don't approve it automatically, maybe you will send a one-time passcode, or you enter a separate vendor system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. Our approach with IBM, you don't see IBM speak much about artificial intelligence in general because we don't believe we're here to replace humans. We're here to assist humans, so we say augmented intelligence or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career as a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we'd have Terminator in the next decade, and this and that. And the same happened in the '60s, or it was after the '60s. And then, there was an AI winter, and we have a risk here of another AI winter because some people are just raising red flags that are not substantiated, I believe.
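The fraud flow Jean Francois described a moment ago — a 90% score from a 10-year customer is rejected, while a 70% score from a one-week customer gets a one-time passcode — can be sketched in a few lines. This is an illustrative sketch, not IBM's actual system; the thresholds and the tenure cutoff are invented assumptions.

```python
# Sketch: use the model's fraud probability as one input among others,
# not as the decision itself. Thresholds and tenure rule are made up.

def decide(fraud_probability: float, customer_tenure_days: int) -> str:
    """Return 'reject', 'step_up' (e.g. one-time passcode), or 'approve'."""
    # Trust the model more for long-standing customers: their history
    # gives the score more support. (365 days is an arbitrary cutoff.)
    well_known = customer_tenure_days >= 365

    if fraud_probability >= 0.9 and well_known:
        return "reject"   # high score, high confidence: block outright
    if fraud_probability >= 0.7:
        return "step_up"  # suspicious, or score not well supported
    return "approve"

print(decide(0.90, 10 * 365))  # the 10-year customer: reject
print(decide(0.70, 7))         # the one-week customer: step_up
print(decide(0.10, 7))         # routine transaction: approve
```

The point of the sketch is the last branch pair: the same probability leads to different actions depending on how much the business trusts the score.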
I don't think the technology's here that we can replace human decision-making altogether any time soon, but we can help. We can certainly make some people more proficient, more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, so we're automating a number of things, and maybe we won't need to have people do quality checks and just have an automated vision system detect defects. Sure, we're automating more and more, but this is not new, it has been going on for centuries. >> Well, the list evolved. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning, but it is interesting. You know, each time there is a capacity that a machine can automate, we basically redefine intelligence to exclude it, so you know. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb a stair faster than us? Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I'm now doing the analytics, the machine learning on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend, so we are not using the mainframe base compute system. We recommend using zIIPs, so additional specialty engines, to not overload, so it's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb, and you don't want to disturb takes a lot of different meanings. One that you just said, you don't want to slow down your operational processing because you're going to hurt your business. But you also want to be careful.
Say we have a payment system where there is a machine learning model predicting fraud probability as part of the system. You don't want a young, bright data scientist deciding that he had a great idea, a great model, and pushing his model into production without asking anyone. So you want to control that. That's why we insist we are providing governance that includes a lot of things like keeping track of how models were created and from which data sets, so lineage. We also want to have access control and not allow anyone to just deploy a new model because we make it easy to deploy, so we want to have role-based access, and only someone with executive approval, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers, they don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, the first solution's going to be on z/OS, what's the roadmap look like and what are some of those challenges of rolling this out to other private cloud solutions? >> We are going to ship IBM machine learning for Z this quarter. It starts with Spark ML as the base open source. This is interesting, but it's not all there is for machine learning. So that's how we start. We're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe in the press release it said down the road things like TensorFlow are coming, H2O. >> But Anaconda will ship next quarter, so we will leverage this when it's out.
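The governance idea above — deployment is easy, but role checks and lineage stand between a data scientist's new model and the production payment system — can be sketched as a small gate. The roles, the Model structure, and the audit log shape are invented for illustration; this is not IBM's actual governance API.

```python
# Hypothetical sketch of a role-based deployment gate with lineage and
# an audit trail. Role names and data structures are assumptions.

from dataclasses import dataclass, field

@dataclass
class Model:
    name: str
    trained_on: str                    # lineage: which data set produced it
    audit_log: list = field(default_factory=list)

# Only these roles may touch the production system (invented names).
DEPLOYERS = {"release_manager", "ops_lead"}

def deploy(model: Model, user: str, role: str) -> bool:
    """Record the attempt, and deploy only if the role is authorized."""
    allowed = role in DEPLOYERS
    model.audit_log.append((user, role, "deployed" if allowed else "denied"))
    return allowed

m = Model("fraud-scorer-v2", trained_on="cc_transactions_2016q4")
print(deploy(m, "bright_data_scientist", "data_scientist"))  # False: denied
print(deploy(m, "alice", "release_manager"))                 # True: deployed
print(m.audit_log)  # both attempts recorded, either way
```

The design point is that every attempt, denied or not, lands in the audit log alongside the model's lineage.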
Then indeed, we have a roadmap to include major open source, and the major open source packages are the ones from Anaconda (murmuring), mostly. Key deep learning, so TensorFlow and probably one or two additional, we're still discussing. One that I'm very keen on, it's called XGBoost, in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago also, or maybe a bit more, but not so long ago, and now, if you have structured data, it is the best choice today. It's really fast-moving, but so, we will support major deep learning packages and major classical learning packages like the ones from Anaconda or XGBoost. The other thing, we start with Z. We announced in the analyst session that we will have a Power version and a private cloud, meaning x86, version as well. I can't tell you when because it's not firm, but it will come. >> And in public cloud as well, I guess we'll, you've got components in the public cloud today like the Watson Data Platform that you've extracted and put here. >> We have extracted part of the Data Science Experience, so we've extracted notebooks and a graphical tool called ModelBuilder from DSX as part of IBM machine learning now, and we're going to add more of DSX as we go. But the goal is to really share code and function across private cloud and public cloud. As Rob Thomas defined it, we want with private cloud to offer all the features and functionality of public cloud, except that it would run inside a firewall. We are really developing machine learning and Watson machine learning on a common code base. It's an internal open source project.
We share code, and then we ship on different platforms. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more, some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning, you basically have two phases: one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. But training is a good fit for cloud because usually you need a large computing system, but only for a limited time, so elasticity's great. But then deployment, if you want to score a transaction in a CICS transaction, it has to run beside CICS, not in the cloud. If you want to score data on an IoT gateway, you want to score at the gateway, not in a data center. I would say that may not be what people think of first, but what will really drive the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches are becoming fitness measurement systems. You want to score your health data on the watch, not on the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. That's the value of machine learning for Z: if you want to score transactions happening on Z, you need to be running on Z. So it's clear, mainframe people, they don't want to hear about public cloud, so they will be the last ones moving. They have their reasons, but they like the mainframe because it is really, really secure and private. >> Dave: Public cloud's a dirty word. >> Yes, yes, for Z users. At least that's what I was told, and I could check with many people. But we know that in general the move is toward public cloud, so we want to help people, depending on where they are on their journey to the cloud.
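The train-in-the-cloud, score-at-the-edge split described above can be sketched simply: only the fitted parameters travel to the scoring side (a CICS region, an IoT gateway, a watch), and scoring needs no call back to the trainer. The weights and feature names below are invented for illustration, not from any real model.

```python
# Sketch: lightweight local scoring from a "shipped" model artifact.
# The coefficients are made-up assumptions, not a trained model.

import math

# The only thing that has to leave the training environment:
WEIGHTS = {"bias": -2.0, "amount_z": 1.5, "new_merchant": 0.8}

def score(amount_z: float, new_merchant: int) -> float:
    """Logistic scoring, dependency-light, runs anywhere the data is."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["amount_z"] * amount_z
         + WEIGHTS["new_merchant"] * new_merchant)
    return 1.0 / (1.0 + math.exp(-z))  # fraud probability in (0, 1)

print(round(score(0.0, 0), 3))  # typical amount, known merchant: low risk
print(round(score(3.0, 1), 3))  # unusual amount, new merchant: high risk
```

Training elsewhere and shipping only `WEIGHTS` keeps the scoring side small enough to sit beside the transaction, which is the whole argument for bringing the model to the data.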
>> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, be right back. (electronic keyboard music)
Bryan Smith, Rocket Software - IBM Machine Learning Launch - #IBMML - #theCUBE
>> Announcer: Live from New York, it's theCUBE, covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. We're here at the Waldorf Astoria covering the IBM Machine Learning Launch Event, bringing machine learning to the IBM Z. Bryan Smith is here, he's the vice president of R&D and the CTO of Rocket Software, powering the path to digital transformation. Bryan, welcome to theCUBE, thanks for coming on. >> Thanks for having me. >> So, Rocket Software, Waltham, Mass.-based, close to where we are, but a lot of people don't know about Rocket, so pretty large company, give us the background. >> It's been around for, this'll be our 27th year. Private company, we've been a partner of IBM's for the last 23 years. Almost all of that is in the mainframe space, or we focused on the mainframe space, I'll say. We have 1,300 employees, we call ourselves Rocketeers. It's spread around the world. We're really an R&D focused company. More than half the company is engineering, and it's spread across the world on every continent and most major countries. >> You're essentially OEM-ing your tools as it were. Is that right, no direct sales force? >> About half, there are different lenses to look at this, but about half of our go-to-market is through IBM with IBM-labeled, IBM-branded products. We've always been, for that side of the products, we've always been the R&D behind the products. The partnership, though, has really grown. It's more than just an R&D partnership now, now we're doing co-marketing, we're even doing some joint selling to serve IBM mainframe customers. The partnership has really grown over these last 23 years from just being the guys who write the code to doing much more. >> Okay, so how do you fit in this announcement? Machine learning on Z, where does Rocket fit? >> Part of the announcement today is a very important piece of technology that we developed.
We call it data virtualization. Data virtualization is really enabling customers to open their mainframe, to allow the data to be used in ways that it was never designed to be used. You might have these data structures that were designed 10, 20, even 30 years ago for a very specific application, but today they want to use them in a very different way, and so the traditional path is to take that data and copy it, to ETL it someplace else so they can get some new use out of it or build some new application. What data virtualization allows you to do is to leave that data in place but access it using APIs that developers want to use today. They want to use JSON access, for example, or they want to use SQL access. But they want to be able to do things like join across IMS, DB2, and VSAM, all with a single query, using an SQL statement. We can do that across relational and non-relational databases. It gets us out of this mode of having to copy data into some other data store through this ETL process; access the data in place. We call it moving the applications or the analytics to the data versus moving the data to the analytics or to the applications. >> Okay, so in this specific case, and I have said several times today, as Stu has heard me, two years ago IBM had a big theme around the z13 bringing analytics and transactions together; this sort of extends that. Great, I've got this transaction data that lives behind a firewall somewhere. Why the mainframe, why now? >> Well, I would pull back to where I said we see more companies and organizations wanting to move applications and analytics closer to the data. The data in many of these large companies, that core business-critical data, is on the mainframe, and so being able to do more real-time analytics without having to look at old data is really important. There's this term data gravity.
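The single-query join Bryan describes — one SQL statement spanning IMS, DB2, and VSAM with no ETL copy — can be illustrated with a toy stand-in. Here sqlite3 plays all three source roles behind one connection, the way the virtualization layer presents them; the table and column names are invented for illustration, not Rocket's actual schema.

```python
# Toy stand-in for a federated query: one SQL statement, three "sources",
# no copy into a separate analytics store. sqlite3 simulates the layer.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ims_customers (cust_id INTEGER, name TEXT);
    CREATE TABLE db2_accounts  (cust_id INTEGER, balance REAL);
    CREATE TABLE vsam_orders   (cust_id INTEGER, amount REAL);
    INSERT INTO ims_customers VALUES (1, 'Acme');
    INSERT INTO db2_accounts  VALUES (1, 5000.0);
    INSERT INTO vsam_orders   VALUES (1, 250.0), (1, 100.0);
""")

# One query joining all three virtualized sources in place.
row = con.execute("""
    SELECT c.name, a.balance, SUM(o.amount)
    FROM ims_customers c
    JOIN db2_accounts a ON a.cust_id = c.cust_id
    JOIN vsam_orders  o ON o.cust_id = c.cust_id
    GROUP BY c.cust_id
""").fetchone()
print(row)  # ('Acme', 5000.0, 350.0)
```

In the real product the three tables would be live mainframe data sets, and the developer would see only the SQL, which is the point of the abstraction.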
I love the visual that presents in my mind: you have these different masses, these different planets if you will, and the biggest, most massive planet in that solar system really is the data, and so it's pulling the smaller satellites, if you will, into this planet or this star by way of gravity, because data is the new currency, data is what the companies are running on. We're helping in this announcement with being able to unlock and open up all mainframe data sources, even some non-mainframe data sources, and using things like Spark that's running on the platform, that's running on z/OS, to access that data directly without having to write any special programming or any special code to get to all their data. >> And the preferred place to run all that data is on the mainframe obviously if you're a mainframe customer. One of the questions I guess people have is, okay, I get that, it's the transaction data that I'm getting access to, but if I'm bringing transaction and analytic data together, a lot of times that analytic data might be in social media, it might be somewhere else not on the mainframe. How do you envision customers dealing with that? Do you have tooling to help them do that? >> We do, so this data virtualization solution that I'm talking about is one that is mainframe resident, but it can also access other data sources. It can access DB2 on Linux and Windows, it can access Informix, it can access Cloudant, it can access Hadoop through IBM's BigInsights. Other feeds like Twitter, like other social media, it can pull that in. The case where you'd want to do that is where you're trying to take that data and integrate it with a massive amount of mainframe data. It's going to be much more highly performant by pulling this other small amount of data in, next to that core business data. >> I get the performance and I get the security of the mainframe, I like those two things, but what about the economics? >> Couple of things.
One, IBM, when they ported Spark to z/OS, did it the right way. They leveraged the architecture; it wasn't just a simple port of recompiling a bunch of open source code from Apache, it was rewriting it to be highly performant on the Z architecture, taking advantage of specialty engines. We've done the same with the data virtualization component that goes along with that Spark on z/OS offering; it also leverages the architecture. We actually have different binaries that we load depending on which architecture of machine we're running on, whether it be a z9, an EC12, or the big granddaddy of a z13. >> Bryan, can you speak to the developers? I think about, you're talking about all this mobile and Spark and everything like that. There's got to be certain developers that are like, "Oh my gosh, there's mainframe stuff. I don't know anything about that." How do you help bridge that gap between where it lives and the tools that they're using? >> The best example is talking about embracing this API economy. And so, developers really don't care where the stuff is at, they just want it to be easy to get to. They don't have to code up some specific interface or language to get to different types of data, right? IBM's done a great job with z/OS Connect in opening up the mainframe to the API economy with ReSTful interfaces, and so with z/OS Connect combined with Rocket data virtualization, you can come through that same z/OS Connect path using all those same ReSTful interfaces, pushing those APIs out to tools like Swagger, which the developers want to use, and not only can you get to the applications through z/OS Connect, but we're a service provider to z/OS Connect, allowing them to also get to every piece of data using those same ReSTful APIs. >> If I heard you correctly, the developer doesn't need to even worry about that it's on the mainframe or speak mainframe or anything like that, right?
That they simply see in their tool-set, again like Swagger, that they have data as well as different services that they can invoke using these very straightforward, simple ReSTful APIs. >> Can you speak to the customers you've talked to? You know, there's certain people out in the industry, I've had this conversation for a few years at IBM shows is there's some part of the market that are like, oh, well, the mainframe is this dusty old box sitting in a corner with nothing new, and my experience has been the containers and cool streaming and everything like that, oh well, you know, mainframe did virtualization and Linux and all these things really early, decades ago and is keeping up with a lot of these trends with these new type of technologies. What do you find in the customers that, how much are they driving forward on new technologies, looking for that new technology and being able to leverage the assets that they have? >> You asked a lot of questions there. The types of customers certainly financial and insurance are the big two, but that doesn't mean that we're limited and not going after retail and helping governments and manufacturing customers as well. What I find is talking with them that there's the folks who get it and the folks who don't, and the folks who get it are the ones who are saying, "Well, I want to be able "to embrace these new technologies," and they're taking things like open source, they're looking at Spark, for example, they're looking at Anaconda. Last week, we just announced at the Anaconda Conference, we stepped on stage with Continuum, IBM, and we, Rocket, stood up there talking about this partnership that we formed to create this ecosystem because the development world changes very, very rapidly. For a while, all the rage was JDBC, or all the rage was component broker, and so today it's Spark and Anaconda are really in the forefront of developers' minds. 
We're constantly moving to keep up with developers because that's where the action's happening. Again, they don't care where the data is housed as long as you can open that up. We've been playing with this concept that came from a research firm called two-speed IT, where you have maybe your core business that has been running for years, and it's designed to really be slow-moving, very high quality, it keeps everything running today, but they want to embrace some of these new technologies, they want to be able to roll out a brand-new app, and they want to be able to update that multiple times a week. And so, this two-speed IT says you're kind of breaking 'em off into two separate teams. You don't have to take your existing infrastructure team and say, "You must embrace every Agile "and every DevOps type of methodology." What we're seeing customers be successful with is this two-speed IT where you can fracture these two, and now you need to create some nice integration between those two teams, so things like data virtualization really help with that. It opens up and allows the development teams to very quickly access those assets, on the mainframe in this case, while allowing those developers to very quickly crank out an application, where quality is not that important but being very quick to respond and doing lots of A/B testing with customers is really critical. >> Waterfall still has its place. As a company that predominantly, or maybe even exclusively, is involved in mainframe, I'm struck by, it must've been 2008, 2009, Paul Maritz comes in and he says VMware, our vision is to build the software mainframe. And of course the world said, "Ah, that's, mainframe's dead," we've been hearing that forever. In many respects, I'd credit VMware; they built sort of a form of software mainframe, but now you hear a lot of talk, Stu, about going back to bare metal. You don't hear that talk on the mainframe.
Everything's virtualized, right, so it's kind of interesting to see, and IBM uses the language of private cloud. The mainframe's, we're joking, the original private cloud. My question is, your strategy as a company has always been focused on the mainframe, and going forward I presume it's going to continue to do that. What's your outlook for that platform? >> We're not exclusively mainframe, by the way. We have a good mix. >> Okay, I'm overstating that, then. It's half and half or whatever. You don't talk about it, 'cause you're a private company. >> Maybe a little more than half is mainframe-focused. >> Dave: Significant. >> It is significant. >> You've got a large proportion of the company on mainframe, z/OS. >> So we're bullish on the mainframe. We continue to invest more every year. We increase our investment every year, and in a software company, your investment is primarily people. We increase that by double digits every year. We have license revenue increases in the double digits every year. I don't know many other mainframe-based software companies that have that. But I think that comes back to the partnership that we have with IBM, because we are more than just a technology partner. We work on strategic projects with IBM. IBM will oftentimes stand up and say Rocket is a strategic partner that works with us on solving hard customer issues every day. We're bullish, we're investing more all the time. We're not backing away, we're not decreasing our interest or our bets on the mainframe. If anything, we're increasing them at a faster rate than we have in the past 10 years. >> And this trend of bringing analytics and transactions together is a huge mega-trend, I mean, why not do it on the mainframe? If the economics are there, which you're arguing that in many use cases they are, because of the value component as well, then the future looks pretty reasonable, wouldn't you say? >> I'd say it's very, very bright.
At the Anaconda Conference last week, I was coming up with an analogy for these folks. It's just a bunch of data scientists, right, and during most of the breaks and the receptions, they were just asking questions, "Well, what is a mainframe? "I didn't know that we still had 'em, "and what do they do?" So it was fun to educate them on that. But I was trying to show them an analogy with data warehousing where, say, in the mid-'90s it was perfectly acceptable to have a data warehouse separate from your transaction system. You would copy all this data over into the data warehouse. That was the model, right, and then slowly it became more important that the analytics or the BI against that data warehouse was looking at more real-time data. So then it became about efficiency: how do we replicate this faster, and how do we get closer to looking not at week-old data but day-old data? And so, I explained that to them and said the days of being able to do analytics against old, copied data are going away. As for ETL, we're bold enough to say that ETL is dead. ETL's future is very bleak. There's no place for it. It had its time, but now it's done, because with data virtualization you can access that data in place. I was telling these data scientists, as they talked about how they build their models, that their first step is always ETL. And so I told them this story, I said ETL is dead, and they just looked at me kind of strange. >> Dave: Now the first step is load. >> Yes, there you go, right, load it in there. But having access from these platforms directly to that data, you don't have to worry about any type of a delay. >> What you described, though, is still a common architecture where you've got, let's say, a Z mainframe, it's got an InfiniBand pipe to some Exadata warehouse or something like that, and so, IBM's vision was, okay, we can collapse that, we can simplify that, consolidate it.
SAP with HANA has a similar vision, we can do that. I'm sure Oracle's got their vision. What gives you confidence in IBM's approach and legs going forward? >> Probably due to the advances that we see in z/OS itself where handling mixed workloads, which it's just been doing for many of the 50 years that it's been around, being able to prioritize different workloads, not only just at the CPU dispatching, but also at the memory usage, also at the IO, all the way down through the channel to the actual device. You don't see other operating systems that have that level of granularity for managing mixed workloads. >> In the security component, that's what to me is unique about this so-called private cloud, and I say, I was using that software mainframe example from VMWare in the past, and it got a good portion of the way there, but it couldn't get that last mile, which is, any workload, any application with the performance and security that you would expect. It's just never quite got there. I don't know if the pendulum is swinging, I don't know if that's the accurate way to say it, but it's certainly stabilized, wouldn't you say? >> There's certainly new eyes being opened every day to saying, wait a minute, I could do something different here. Muscle memory doesn't have to guide me in doing business the way I have been doing it before, and that's this muscle memory I'm talking about of this ETL piece. >> Right, well, and a large number of workloads in mainframe are running Linux, right, you got Anaconda, Spark, all these modern tools. The question you asked about developers was right on. If it's independent or transparent to developers, then who cares, that's the key. That's the key lever this day and age is the developer community. You know it well. >> That's right. Give 'em what they want. They're the customers, they're the infrastructure that's being built. 
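Bryan's "ETL is dead" point from a moment ago is easy to see in miniature: a nightly ETL copy answers queries with data as of the last load, while a virtualized query reads the live source in place. A toy sketch, with plain Python dictionaries standing in for the mainframe source and the warehouse copy:

```python
# Live "system of record" (stand-in for data on the mainframe).
source = {"acct-1": 100, "acct-2": 250}

# ETL approach: extract/load a copy on a schedule, query the copy.
warehouse_copy = dict(source)          # last night's load

# A transaction lands after the load...
source["acct-1"] = 175

# ...so the warehouse copy answers with stale data,
etl_answer = warehouse_copy["acct-1"]          # yesterday's value

# ...while a virtualized query reads the source in place.
virtualized_answer = source["acct-1"]          # live value

print(etl_answer, virtualized_answer)
```

The gap between the two answers is exactly the "week-old data versus day-old data" progression Bryan describes, taken to its endpoint of no copy at all.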
>> Bryan, we'll give you the last word, bumper sticker on the event, Rocket Software, your partnership, whatever you choose. >> We're excited to be here, it's an exciting day to talk about machine learning on z/OS. I say we're bullish on the mainframe, we are, and we're especially bullish on z/OS, and that's what this event today is all about. That's where the data is, that's where we need the analytics running, that's where we need the machine learning running, that's where we need to get the developers to access the data live. >> Excellent, Bryan, thanks very much for coming to theCUBE. >> Bryan: Thank you. >> And keep right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from New York City. Be right back. (electronic keyboard music)
James Kobielus, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE
>> [Announcer] Live from New York, it's the Cube. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City everybody, this is the CUBE. We're here live at the IBM Machine Learning Launch Event. Bringing analytics and transactions together on Z, extending an announcement that IBM made a couple years ago, sort of laid out that vision, and now bringing machine learning to the mainframe platform. We're here with Jim Kobielus. Jim is the Director of IBM's Community Engagement for Data Science and a longtime CUBE alum and friend. Great to see you again James. >> Great to always be back here with you. Wonderful folks from the CUBE. You ask really great questions and >> Well thank you. >> I'm prepared to answer. >> So we saw you last week at Spark Summit, so back to back, you know, continuous streaming, machine learning, give us the lay of the land from your perspective on machine learning. >> Yeah, well, machine learning very much is at the heart of what modern application developers build, and that's really the core secret sauce in many of the most disruptive applications. So machine learning has become the core of, of course, what data scientists do day in and day out, or what they're asked to do, which is to build, essentially, artificial neural networks that can process big data and find patterns that couldn't normally be found using other approaches. And then, as Dinesh and Rob indicated, a lot of it's for regression analysis and classification and the other core things that data scientists have been doing for a long time, but machine learning has come into its own because of the potential for great automation of this function of finding patterns and correlations within data sets. So today at the IBM Machine Learning Launch Event, and we've already announced it, IBM Machine Learning for z/OS takes that automation promise to the next step.
And so we're real excited and there'll be more details today in the main event. >> One of the most fun interviews I had last year was with you, when we interviewed, I think it was 10 data scientists, rock star data scientists, and Dinesh had a quote, he said, "Machine learning is 20% fun, 80% elbow grease." And data scientists sort of echoed that last year. We spent 80% of our time wrangling data. >> [Jim] Yeah. >> It gets kind of tedious. You guys have made announcements to address that, is the needle moving? >> To some degree the needle's moving. Greater automation of data sourcing and preparation and cleansing is ongoing. Machine learning is being used for that function as well. But nonetheless there is still a lot of need in the data science, sort of, pipeline for a lot of manual effort. So if you look at the core of what machine learning is all about, supervised learning involves humans, meaning data scientists, training their algorithms with data, and that involves finding the right data and then of course doing the feature engineering, which is a very human and creative process. And then training the data and iterating through models to improve the fit of the machine learning algorithms to the data. In many ways there are still a lot of manual functions that need the expertise of data scientists to do right. There are a lot of ways to do machine learning wrong, you know; there are a lot of, as it were, tricks of the trade you have to learn just through trial and error. A lot of things like the new generation of generative adversarial models ride on machine learning, or deep learning in this case, multilayered, and they're not easy to get going and get working effectively the first time around.
I mean, with the first run of your training data set. So that's just an example of how, the fact is, there's a lot of functions that can't be fully automated yet in the whole machine learning process, but a great many can in fact, especially data preparation and transformation. It's being automated to a great degree, so that data scientists can focus on the more creative work that involves subject matter expertise, and really also application development, and working with larger teams of coders and subject matter experts and others, to be able to take the machine learning algorithms that have been proved out, have been trained, and apply them to all manner of applications to deliver some disruptive business value. >> James, can you expand for us a little bit on this democratization? Before, it was just data, but now with the machine learning, the analytics, you know, when we put these massive capabilities in the broader hands of the business analysts, the business people themselves, what are you seeing with your customers? What can they do now that they couldn't do before? Why is this such an exciting period of time for the leveraging of data analytics? >> I don't know that it's really an issue of now versus before. Machine learning has been around for a number of years. It's artificial neural networks at the very heart, and that got going actually in many ways in the late 50s, and it steadily improved in terms of sophistication and so forth. But what's going on now is that machine learning tools have become commercialized and refined to a greater degree, and now they're in a form in the cloud, like with IBM Machine Learning for the private cloud on z/OS, or Watson Machine Learning for the Bluemix public cloud. They're at a level of consumability that they've never been at before. With a software-as-a-service offering, you just pay for it, it's available to you. If you're a data scientist you begin doing work right away to build applications, derive quick value.
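The supervised-learning loop Jim describes, training a model on labeled data and iterating to improve its fit, can be shown in a few lines with no framework at all. This is a one-variable linear model fit by gradient descent on made-up data; it illustrates the idea only and is not any of IBM's tooling:

```python
# Labeled training data: y is roughly 2*x + 1, plus a little noise.
data = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.9)]

w, b = 0.0, 0.0          # model parameters, initially untrained
lr = 0.02                # learning rate

for step in range(2000):             # "iterating through models"
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y        # model error on one labeled example
        grad_w += 2 * err * x
        grad_b += 2 * err
    w -= lr * grad_w / len(data)     # nudge the parameters to improve fit
    b -= lr * grad_b / len(data)

print(round(w, 1), round(b, 1))      # lands close to the true 2 and 1
```

Everything around this loop, finding the right data, engineering the features, judging whether the fit is good enough, is the human, creative part Jim is pointing at.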
So in other words, the time to value on a machine learning project continues to shorten and shorten, due to the consumability, the packaging of these capabilities into cloud offerings and into other tools that are prebuilt to deliver success. That's what's fundamentally different now, and it's just an ongoing process. You sort of see the recent parallels with the business intelligence market. 10 years ago BI, meaning reporting and OLAP and so forth, was only for the, what we now call data scientists or the technical experts in that area. But in the last 10 years we've seen the business intelligence community and the industry, including IBM's tools, move toward more self-service, interactive visualization, visual design, BI and predictive analytics, you know, through our Cognos and SPSS portfolios. A similar dynamic is coming into the progress of machine learning, the democratization, to use your term, the more self-service model wherein everybody potentially will be able to do machine learning, to build machine learning and deep learning models, without a whole lot of university training. That day is coming and it's coming fairly rapidly. It's just a matter of the maturation of this technology in the marketplace. >> So I want to ask you, you're right, 1950s it was artificial neural networks or AI, sort of was invented I guess, the concept, and then in the late 70s and early 80s it was heavily hyped. It kind of died in the late 80s or in the 90s, you never heard about it even in the early 2000s. Why now, why is it here now? Is it because IBM's putting so much muscle behind it? Is it because we have Siri? What is it that has enabled that? >> Well, I wish that IBM putting muscle behind a technology could launch anything to success. And we've done a lot of things in that regard.
But the thing is, if you look back at the historical progress of AI, I mean, it's older than me and you in terms of when it got going in the middle 50s as a passion or a focus of computer scientists. What we had for most of the last half century is AI as expert systems, which were built on essentially programming: declarative rules defining how AI systems could process data under various scenarios. That didn't prove scalable. It didn't prove agile enough to learn on the fly from the statistical patterns within the data that you're trying to process. For face recognition and voice recognition, pattern recognition, you need statistical analysis, you need something along the lines of an artificial neural network that doesn't have to be pre-programmed. What's new now, since the turn of this century, is that AI has become predominantly focused not so much on declarative rules, the expert systems of old, but on statistical analysis, artificial neural networks that learn from the data. See, in the long historical sweep of computing, we have three eras of computing. The first era, before the second world war, was all electromechanical computing devices; IBM's start of course, like everybody's, was in that era. The business logic was burned into the hardware, as it were. The second era, from the second world war really to the present day, is all about software, programming; it's COBOL, Fortran, C, Java, where the business logic has to be developed, coded, by a cadre of programmers. Since the turn of this millennium, and really since the turn of this decade, it's all moved towards the third era, which is the cognitive era, where you're learning the business rules automatically from the data itself, and that involves machine learning at its very heart.
So most of what has been commercialized and most of what is being deployed in the real world as working, successful AI is all built on artificial neural networks and cognitive computing in the way that I laid out. You still need human beings in the equation; it can't be completely automated. There are things like unsupervised learning that take the automation of machine learning to a greater extent, but the bulk of machine learning is still supervised learning, where you have training data sets and you need experts, data scientists, to manage that whole process. Over time the question for supervised learning becomes who's going to label the training data sets, especially when you have so much data flooding in from the internet of things and social media and so forth. A lot of that is being outsourced to crowdsourcing environments in terms of the ongoing labeling of data for machine learning projects of all sorts. That trend will continue apace. So less and less of the actual labeling of the data for machine learning will need to be manually coded by data scientists or data engineers. >> So the more data the better, I would argue. You're going to disagree with that, which is good. Let's have a discussion [Jim Laughs]. In the enablement pie, I would say the profundity of Hadoop was two things. One is I can leave data where it is and bring code to data. >> [Jim] Yeah.
But if you're looking for some highly deepened rare nuances in terms of anomalies or outliers or whatever within your data set, you may only find those if you have a petabyte of data of the population of interest. So but if you're just looking for broad historical trends and to do predictions against broad trends, you may not need anywhere near that amount. I mean, if it's a large data set, you may only need five to 10% sample. >> So I love this conversation because people have been on the CUBE, Abi Metter for example said, "Dave, sampling is dead." Now a statistician said that's BS, no way. Of course it's not dead. >> Storage isn't free first of all so you can't necessarily save and process all the data. Compute power isn't free yet, memory isn't free yet, so forth so there's lots... >> You're working on that though. >> Yeah sure, it's asymptotically all moving towards zero. But the bottom line is if the underlying resources, including the expertise of your data scientists that's not for free, these are human beings who need to make a living. So you've got to do a lot of things. A, automate functions on the data science side so that your, these experts can radically improve their productivity. Which is why the announcement today of IBM machine learning is so important, it enables greater automation in the creation and the training and deployment of machine learning models. It is a, as Rob Thomas indicated, it's very much a multiplier of productivity of your data science teams, the capability we offer. So that's the core value. Because our customers live and die increasingly by machine learning models. And the data science teams themselves are highly inelastic in the sense that you can't find highly skilled people that easily at an affordable price if you're a business. And you got to make the most of the team that you have and help them to develop their machine learning muscle. 
>> Okay, I want to ask you to weigh in on one of Stu's favorite topics which is man versus machine. >> Humans versus mechanisms. Actually humans versus bots, let's, okay go ahead. >> Okay so, you know a lot of discussions, about, machines have always replaced humans for jobs, but for the first time it's really beginning to replace cognitive functions. >> [Jim] Yeah. >> What does that mean for jobs, for skill sets? The greatest, I love the comment, the greatest chess player in the world is not a machine. It's humans and machines, but what do you see in terms of the skill set shift when you talk to your data science colleagues in these communities that you're building? Is that the right way to think about it, that it's the creativity of humans and machines that will drive innovation going forward. >> I think it's symbiotic. If you take Watson, of course, that's a star case of a cognitive AI driven machine in the cloud. We use a Watson all the time of course in IBM. I use it all the time in my job for example. Just to give an example of one knowledge worker and how he happens to use AI and machine learning. Watson is an awesome search engine. Through multi-structure data types and in real time enabling you to ask a sequence of very detailed questions and Watson is a relevance ranking engine, all that stuff. What I've found is it's helped me as a knowledge worker to be far more efficient in doing my upfront research for anything that I might be working on. You see I write blogs and I speak and I put together slide decks that I present and so forth. So if you look at knowledge workers in general, AI as driving far more powerful search capabilities in the cloud helps us to eliminate a lot of the grunt work that normally was attended upon doing deep research into like a knowledge corpus that may be preexisting. 
And that way we can then ask more questions and more intelligent questions and really work through our quest for answers far more rapidly and entertain and rule out more options when we're trying to develop a strategy. Because we have all the data at our fingertips and we've got this expert resource increasingly in a conversational back and forth that's working on our behalf predictively to find what we need. So if you look at that, everybody who's a knowledge worker which is really the bulk now of the economy, can be far more productive cause you have this high performance virtual assistant in the cloud. I don't know that it's really going, AI or deep learning or machine learning, is really going to eliminate a lot of those jobs. It'll just make us far smarter and more efficient doing what we do. That's, I don't want to belittle, I don't want to minimize the potential for some structural dislocation in some fields. >> Well it's interesting because as an example, you're like the, you're already productive, now you become this hyper-productive individual, but you're also very creative and can pick and choose different toolings and so I think people like you it's huge opportunities. If you're a person who used to put up billboards maybe it's time for retraining. >> Yeah well maybe you know a lot of the people like the research assistants and so forth who would support someone like me and most knowledge worker organizations, maybe those people might be displaced cause we would have less need for them. In the same way that one of my very first jobs out of college before I got into my career, I was a file clerk in a court in Detroit, it's like you know, a totally manual job, and there was no automation or anything. You know that most of those functions, I haven't revisited that court in recent years, I'm sure are automated because you have this thing called computers, especially PCs and LANs and so forth that came along since then. 
So a fair amount of those kinds of feather bedding jobs have gone away and in any number of bureaucracies due to automation and machine learning is all about automation. So who knows where we'll all end up. >> Alright well we got to go but I wanted to ask you about... >> [Jim] I love unions by the way. >> And you got to meet a lot of lawyers I'm sure. >> Okay cool. >> So I got to ask you about your community of data scientists that you're building. You've been early on in that. It's been a persona that you've really tried to cultivate and collaborate with. So give us an update there. What's your, what's the latest, what's your effort like these days? >> Yeah, well, what we're doing is, I'm on a team now that's managing and bringing together all of our program for community engagement programs for really for across portfolio not just data scientists. That involves meet ups and hack-a-thons and developer days and user groups and so forth. These are really important professional forums for our customers, our developers, our partners, to get together and share their expertise and provide guidance to each other. And these are very very important for these people to become very good at, to help them, get better at what they do, help them stay up to speed on the latest technologies. Like deep learning, machine learning and so forth. 
So we take it very seriously at IBM that communities are really where customers can realize value and grow their human capital ongoing so we're making significant investments in growing those efforts and bringing them together in a unified way and making it easier for like developers and IT administrators to find the right forums, the right events, the right content, within IBM channels and so forth, to help them do their jobs effectively and machine learning is at the heart, not just of data science, but other professions within the IT and business analytics universe, relying more heavily now on machine learning and understanding the tools of the trade to be effective in their jobs. So we're bringing, we're educating our communities on machine learning, why it's so critically important to the future of IT. >> Well your content machine is great content so congratulations on not only kicking that off but continuing it. Thanks Jim for coming on the CUBE. It's good to see you. >> Thanks for having me. >> You're welcome. Alright keep it right there everybody, we'll be back with our next guest. The CUBE, we're live from the Waldorf-Astoria in New York City at the IBM Machine Learning Launch Event right back. (techno music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jim Kobielus | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Jim | PERSON | 0.99+ |
Dinesh | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
James | PERSON | 0.99+ |
80% | QUANTITY | 0.99+ |
James Kobielus | PERSON | 0.99+ |
20% | QUANTITY | 0.99+ |
Jim Laughs | PERSON | 0.99+ |
five | QUANTITY | 0.99+ |
Rob Thomas | PERSON | 0.99+ |
Detroit | LOCATION | 0.99+ |
1950s | DATE | 0.99+ |
last year | DATE | 0.99+ |
New York | LOCATION | 0.99+ |
New York City | LOCATION | 0.99+ |
10 data scientists | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Siri | TITLE | 0.99+ |
Dave | PERSON | 0.99+ |
10% | QUANTITY | 0.99+ |
5 megabytes | QUANTITY | 0.99+ |
Abi Metter | PERSON | 0.99+ |
two things | QUANTITY | 0.99+ |
first time | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
second | QUANTITY | 0.99+ |
90s | DATE | 0.99+ |
ZOS | TITLE | 0.99+ |
Rob | PERSON | 0.99+ |
last half century | DATE | 0.99+ |
today | DATE | 0.99+ |
early 2000s | DATE | 0.98+ |
Java | TITLE | 0.98+ |
one | QUANTITY | 0.98+ |
C | TITLE | 0.98+ |
10 years ago | DATE | 0.98+ |
first run | QUANTITY | 0.98+ |
late 80s | DATE | 0.98+ |
Watson | TITLE | 0.97+ |
late 70s | DATE | 0.97+ |
late 50s | DATE | 0.97+ |
zero | QUANTITY | 0.97+ |
IBM Machine Learning Launch Event | EVENT | 0.96+ |
early 80s | DATE | 0.96+ |
4trans | TITLE | 0.96+ |
second world war | EVENT | 0.95+ |
IBM Machine Learning Launch Event | EVENT | 0.94+ |
second era | QUANTITY | 0.94+ |
IBM Machine Learning Launch | EVENT | 0.93+ |
Stu | PERSON | 0.92+ |
first jobs | QUANTITY | 0.92+ |
middle 50s | DATE | 0.91+ |
couple years ago | DATE | 0.89+ |
agile | TITLE | 0.87+ |
petabyte | QUANTITY | 0.85+ |
BAL | TITLE | 0.84+ |
this decade | DATE | 0.81+ |
three eras | QUANTITY | 0.78+ |
last 10 years | DATE | 0.78+ |
this millennium | DATE | 0.75+ |
third era | QUANTITY | 0.72+ |
Dinesh Nirmal, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE
>> [Announcer] Live from New York, it's theCube, covering the IBM Machine Learning Launch Event brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to the Waldorf Astoria, everybody. This is theCube, the worldwide leader in live tech coverage. We're covering the IBM Machine Learning announcement. IBM bringing machine learning to its zMainframe, its private cloud. Dinesh Nirmal is here. He's the Vice President of Analytics at IBM and a Cube alum. Dinesh, good to see you again. >> Good to see you, Dave. >> So let's talk about ML. So we went through the big data, the data lake, the data swamp, all this stuff with Hadoop. And now we're talking about machine learning and deep learning and AI and cognitive. Is it same wine, new bottle? Or is it an evolution of data and analytics? >> Good. So, Dave, let's talk about machine learning. Right. When I look at machine learning, there's three pillars. The first one is the product. I mean, you got to have a product, right. And you got to have a differentiated set of functions and features available for customers to build models. For example, Canvas. I mean, those are table stakes. You got to have a set of algorithms available. So that's the product piece. >> [Dave] Uh huh. >> But then there's the process, the process of taking that model that you built in a notebook and being able to operationalize it. Meaning being able to deploy it. That is, you know, I was talking to one of the customers today, and he was saying, "Machine learning is 20% fun and 80% elbow grease." Because that operationalizing of that model is not easy. Although they make it sound very simple, it's not. So if you take a banking, enterprise banking example, right? You build a model in the notebook. Some data scientist builds it. Now you have to take that and put it into your infrastructure or production environment, which has been there for decades. So you could have third-party software that you cannot change.
You could have a set of rigid rules that is already there. You could have applications that were written in the '70s and '80s that nobody wants to touch. How do you all of a sudden take the model and infuse it in there? It's not easy. And so that is a tremendous amount of work. >> [Dave] Okay. >> The third pillar is the people, or the expertise, or the experience, the skills that need to come through, right. So the product is one. The process of operationalizing and getting it into your production environment is another piece. And then the people is the third one. So when I look at machine learning, right, those are three key pillars that you need to have to have a successful, you know, experience of machine learning. >> Okay, let's unpack that a little bit. Let's start with the differentiation. You mentioned Canvas, but talk about IBM specifically. >> [Dinesh] Right. What's so great about IBM? What's the differentiation? >> Right, exactly. Really good point. So we have been on the predictive side for a very long time, right. I mean, it's not like we are coming into ML or AI or cognitive yesterday. We have been in that space for a very long time. We have SPSS predictive analytics available. So even if you look from all three pillars, what we are doing is, from a product perspective, we are bringing in the product where we are giving a choice or a flexibility to use the language you want. So there are customers who only want to use R. They are religious R users. They don't want to hear about anything else. There are customers who want to use Python, you know. They don't want to use anything else. So how do we give that choice of languages to our customers, to say use any language you want. Or execution engines, right? Some folks want to use Spark as the execution engine. Some folks want to use R or Python, so we give that choice. Then you talked about Canvas.
There are folks who want to use the GUI portion of the Canvas or a modeler to build models, or there are, you know, techie folks who want to use the notebook. So how do you give that choice? So it becomes kind of like a freedom or a flexibility or a choice that we provide, so that's the product piece, right? We do that. Then the other piece is productivity. So one of the customers, the CTO of (mumbles) TV, is going to come on stage with me during the main session to talk about how collaboration helped from an IBM machine learning perspective, because their data scientists are sitting in New York City, and our data scientists who are working with them are sitting in San Jose, California. And they were collaborating in real time using notebooks in our ML projects, where they can see, in real time, what changes their data scientists are making. They can Slack messages between each other. And that collaborative piece is what really helped us. So collaboration is one, right, from a productivity piece. We introduced something called the Feedback Loop, by which your model can get re-trained. So today, you deploy a model. Its scoring accuracy could degrade over time. Then you have to take it off-line and re-train it, right? What we have done is, we introduced the Feedback Loop, so when you deploy your model, we give you two endpoints. The first endpoint is, basically, a URI for you to plug into your application so that when you, you know, run your application, it can call the scoring API. The second endpoint is this feedback endpoint, where you can choose to re-train the model. If you want it every three hours, if you want it to be every six hours, you can do that. So we bring that flexibility, we bring that productivity into it. Then, the management of the models, right? How do we make sure that once you develop the model, you deploy the model? There's a life cycle involved there. How do you make sure that we give you the tools to manage the model?
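The two-endpoint pattern Dinesh describes (one URI that applications call for scoring, and a feedback endpoint that re-trains the model as new labeled data arrives) can be sketched in a few lines. This is an illustrative sketch only, not the IBM Machine Learning API: the class, the method names, and the trivial threshold "model" are all invented for the example.

```python
# Illustrative sketch of the scoring + feedback-loop pattern described above.
# The class, method names, and the trivial threshold "model" are invented for
# the example. This is not the IBM Machine Learning API.

class DeployedModel:
    """A deployed model exposing two entry points: score() and feedback()."""

    def __init__(self, threshold=100.0, retrain_every=3):
        self.threshold = threshold          # the "model": flag amounts above this
        self.retrain_every = retrain_every  # retrain after this many feedback records
        self._feedback = []                 # accumulated (amount, was_fraud) pairs

    def score(self, amount):
        """First endpoint: applications call this to get a prediction."""
        return amount > self.threshold

    def feedback(self, amount, was_fraud):
        """Second endpoint: new labeled data flows in; the model retrains itself."""
        self._feedback.append((amount, was_fraud))
        if len(self._feedback) >= self.retrain_every:
            self._retrain()

    def _retrain(self):
        # Re-fit the threshold midway between the average legitimate and the
        # average fraudulent amounts seen since the last retrain.
        fraud = [a for a, f in self._feedback if f]
        legit = [a for a, f in self._feedback if not f]
        if fraud and legit:
            self.threshold = (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2
        self._feedback.clear()

model = DeployedModel()
print(model.score(150.0))        # True: above the initial threshold
model.feedback(20.0, False)
model.feedback(30.0, False)
model.feedback(60.0, True)       # third record triggers a retrain
print(model.score(50.0))         # now True: threshold dropped to (60 + 25) / 2 = 42.5
```

Scoring stays cheap on every call; the retraining cost is amortized over batches of feedback, which is the same trade the real feedback loop makes at much larger scale.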
So when you talk about differentiation, right? We are bringing differentiation on all three pillars. From a product perspective, with all the things I mentioned. From a deployment perspective, how do we make sure we have different choices of deployment, whether it's streaming, whether it's real time, whether it's batch? You can do deployment, right? The Feedback Loop is another one. Once you've deployed, how do we keep re-training it? And the last piece I talked about is the expertise or the people, right? So we are today announcing IBM Machine Learning Hub, which will become one place where our customers can go, ask questions, get education sessions, get training, right? Work together to build models. I'll give you an example. Although we are announcing the IBM Machine Learning Hub today, we have been working with America First Credit Union for the last month or so. They approached us and said, you know, their underwriting takes a long time. All the knowledge is embedded in 15 to 20 human beings. And they want to make sure a machine is able to absorb that knowledge and make that decision in minutes, instead of the hours or days it takes today. >> [Dave] So, Stu, before you jump in, let me sum up the portfolio. You know, you mentioned SPSS, expertise, choice. The collaboration, which I think you really stressed at the announcement last fall. The management of the models, so you can continuously improve it. >> Right. >> And then this knowledge base, what you're calling the hub. And I could argue, I guess, that if I take any one of those individual pieces, some of your competitors have them. Your argument would be it's all there. >> It all comes together, right? And you have to make sure that all three pillars come together. And customers see great value when you have that. >> Dinesh, customers today are used to kind of the deployment model on the public cloud, which is, "I want to activate a new service," you know. I just activate it, and it's there.
When I think about private cloud environments, private clouds are operationally faster, but it's usually not minutes or hours. It's usually more like months to deploy projects, which is still better than, you know, before big data it was, oh, okay, 18 months to see if it works, and let's bring that down to, you know, a couple of months. Can you walk us through what happens when, you know, a customer today says, "Great, I love this approach. How long does it take?" You know, what's kind of the project life cycle of this? And how long will it take them to play around and pull some of these levers before they're, you know, getting productivity out of it? >> Right. So, really good questions, Stu. So let me back up one step. So, on private cloud, we have a new initiative called Download and Go, where our goal is to have our desktop products install on your personal desktop in less than five clicks, in less than fifteen minutes. That's the goal. So the other day, you know, the team told me it's ready. That the first product is ready where you can go less than five clicks, fifteen minutes. I said the real test is I'm going to bring my son, who's five years old. Can he install it? And if he can install it, you know, we are good. And he did it. And I have a video to prove it, you know, so after the show I will show you. Because, when you talk about, you know, the private cloud side, or the on-premise side, it has been a long project cycle. What we want is, you should be able to take our product, install it, and get the experience in minutes. That's the goal. And when you talk about private cloud and public cloud, another differentiating factor is that now you get the strength of IBM public cloud combined with the private cloud, so you could, you know, train your model in public cloud, and score on private cloud. You have the same experience. Not many folks, not many competitors can offer that, right?
So that's another... >> [Stu] So if I get that right: if I as a customer have played around with the machine learning in Bluemix, I'm going to have a similar look, feel, API. >> Exactly the same. So what you have in Bluemix, right? I mean, you have Watson in Bluemix, which, you know, has deep learning, machine learning--all those capabilities. What we have done is, like, we have extracted the core capabilities of Watson onto private cloud, and it's IBM Machine Learning. But the experience is the same. >> I want to talk about this notion of operationalizing analytics. And it ties, to me anyway, it ties into transformation. You mentioned going from a notebook to actually being able to embed analytics in the workflow of the business. Can you double-click on that a little bit, and maybe give some examples of how that has helped companies transform? >> Right. So when I talk about operationalizing, when you look at machine learning, right? You have all the way from data, which is the most critical piece, to building or deploying the model. A lot of times, the data itself is not clean. I'll give you an example, right. So >> OSYX. >> Yeah. And when we are working with an insurance company, for example, the data that comes in. For example, if you just take gender, a lot of times the values are null. So we have to build another model to figure out if it's male or female, right? So in this case, for example, we have to say somebody has done a prostate exam. Obviously, he's a male. You know, we figured that. Or has a gynecology exam. It's a female. So we have to, you know, there's a lot of work just to get that data cleansed. So that's where I mentioned, you know, machine learning is 20% fun, 80% elbow grease, because there's a lot of grease there; you need to make sure that you cleanse the data. Get that right. That's the shaping piece of it. Then comes building the model, right.
And then, once you build the model on that data comes the operationalization of that model, which in itself is huge, because how do you make sure that you infuse that model into your current infrastructure? That is where a lot of skill, a lot of experience, and a lot of knowledge comes in, because, unless you are a start-up, right, you already have applications and programs and third-party vendor applications that have been running for years, or decades, for that matter. So, yeah, so that operationalization is a huge piece. Cleansing of the data is a huge piece. Getting the model right is another piece. >> And simplifying the whole process. I think about, I've got to ingest the data. I've now got to, you know, play with it, explore. I've got to process it. And I've got to serve it to some, you know, some business need or application. And typically, those are separate processes, separate tools, maybe different personas that are doing that. Am I correct that your announcement in the Fall addressed that workflow? How is it being, you know, deployed and adopted in the field? How is it, again back to transformation, are you seeing that people are actually transforming their analytics processes and ultimately creating the outcomes that they expect? >> Huge. So good point. We announced Data Science Experience in the Fall. And the customers who are going to speak with us today on stage are the customers who have been using that. So, for example, if you take AFCU, America First Credit Union, they worked with us. In two weeks, you know, talk about transformation, we were able to absorb the knowledge of their underwriters. You know, what (mumbles) is in. Build that, get the features. And we were able to build a model in two weeks. And the model is predicting with 90% accuracy. That's what early tests are showing. >> [Dave] And you say that was in a couple of weeks. You were, you developed that model. >> Yeah, yeah, right.
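The data-shaping step Dinesh described a moment ago (inferring a null gender field from gender-specific procedures elsewhere in the record) amounts to rule-based imputation. A minimal sketch, with invented field names and procedure codes rather than anything from a real insurance schema:

```python
# Sketch of the rule-based imputation described above: filling in a null
# gender field from other evidence in the record. Field names and procedure
# codes here are invented for illustration.

# Hypothetical mapping from gender-specific procedures to the gender they imply.
GENDER_BY_PROCEDURE = {
    "prostate_exam": "M",
    "gynecology_exam": "F",
}

def impute_gender(record):
    """Return the record with a null gender filled in where a rule applies."""
    if record.get("gender") is None:
        for proc in record.get("procedures", []):
            if proc in GENDER_BY_PROCEDURE:
                # Copy rather than mutate, so the raw data stays untouched.
                record = dict(record, gender=GENDER_BY_PROCEDURE[proc])
                break
    return record

claims = [
    {"id": 1, "gender": None, "procedures": ["prostate_exam", "blood_panel"]},
    {"id": 2, "gender": "F",  "procedures": ["blood_panel"]},
    {"id": 3, "gender": None, "procedures": ["blood_panel"]},  # no rule applies
]
cleaned = [impute_gender(c) for c in claims]
print([c["gender"] for c in cleaned])  # ['M', 'F', None]
```

In practice this kind of pass is one of many in the "80% elbow grease" stage; records that no rule can resolve (like the third one) are left null for a downstream model or a human to handle.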
So when we talk about transformation, right? We couldn't have done that a few years ago. We have transformed so that the different personas can collaborate with each other, and that's the collaboration piece I talked about. Real time. Be able to build a model, and put it to the test to see what kind of benefits they're getting. >> And you've obviously got edge cases where people get really sophisticated, but, you know, we were sort of talking off camera, and you know, like the 80/20 rule, or maybe it's the 90/10. You say most use cases can be, you know, solved with regression and classification. Can you talk about that a little more? >> So, so when we talk about machine learning, right? To me, I would say 90% of it is regression or classification. I mean, there are edge cases of clustering and all those things. But linear regression or classification can solve most of our customers' problems, right? So whether it's fraud detection. Or whether it's underwriting the loan. Or whether you're trying to do sentiment analysis. I mean, you can kind of classify or do regression on it. So I would say that 90% of the cases can be covered, but like I said, most of the work is not just about picking the right algorithm; it's also about cleansing the data. Picking the algorithm, then comes building the model. Then comes deployment or operationalizing the model. So there's a step process that's involved, and each step involves some amount of work. So if I could make one more point on the technology and the transformation we have done. So even with picking the right algorithm, we automated that, so you as a data scientist don't need to, you know, come in and figure it out. If I have 50 classifiers and each classifier has four parameters, that's 200 different combinations. Even if you take one hour on each combination, that's 200 hours, or nine days, to pick the right combination.
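The search-space arithmetic above (50 classifiers times four parameter settings is 200 combinations; at an hour apiece, roughly nine days) is exactly what automated model selection attacks: score every candidate on held-out data and keep the winner. A toy sketch of that idea, with deliberately trivial stand-in classifiers rather than anything from the actual product:

```python
# Toy sketch of automated algorithm selection: enumerate candidate
# (classifier, parameter) combinations, score each on held-out labeled data,
# and keep the winner. The candidates are trivial stand-ins for illustration.

def make_threshold_clf(t):
    return lambda x: x > t                 # predict positive above threshold t

def make_majority_clf(label):
    return lambda x: label                 # always predict the same class

# The candidate search space: several classifiers x several parameter settings.
candidates = {f"threshold>{t}": make_threshold_clf(t) for t in (10, 25, 40, 60)}
candidates.update({"always_True": make_majority_clf(True),
                   "always_False": make_majority_clf(False)})

# Held-out labeled data: (feature value, true label) pairs.
holdout = [(5, False), (15, False), (30, True), (50, True), (70, True)]

def select_best(candidates, holdout):
    """Return (name, accuracy) of the candidate with the best holdout accuracy."""
    def accuracy(clf):
        return sum(clf(x) == y for x, y in holdout) / len(holdout)
    best = max(candidates, key=lambda name: accuracy(candidates[name]))
    return best, accuracy(candidates[best])

print(select_best(candidates, holdout))  # ('threshold>25', 1.0)
```

A real system evaluates candidates in parallel and prunes unpromising ones early instead of exhaustively spending an hour on each, which is how days of search collapse into minutes.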
What we have done is, in IBM Machine Learning, we have something called cognitive assistance for data science, which will help you pick the right combination in minutes instead of days. >> So I can see how regression scales, and in the example you gave of classification, I can see how that scales. If you've got a, you know, fixed classification or maybe 200 parameters, or whatever it is, that scales. What happens, how are people dealing with, sort of, automating that classification as things change, as, say, some kind of new disease or pattern pops up? How do they address that at scale? >> Good point. So as the data changes, the model needs to change, right? Because everything that model knows is based on the training data. Now, if the data has changed, if the symptoms of cancer or any disease have changed, obviously, you have to retrain that model. And that's where the feedback loop I talked about comes in, where we will automatically retrain the model based on the new data that's coming in. So you, as an end user, for example, don't need to worry about it, because we will take care of that piece also. We will automate that, also. >> Okay, good. And you've got a session this afternoon with, you said, two clients, right? AFCU and Kaden dot TV, and you're on, let's see, at 2:55. >> Right. >> So you folks watching the live stream, check that out. I'll give you the last word, you know, what shall we expect to hear there? Show a little leg on your discussion this afternoon. >> Right. So, obviously, I'm going to talk about the differentiating factors, what we are delivering in IBM Machine Learning, right? And I covered some of it. There's going to be much more. We are going to focus on how we are making freedom or flexibility available. How we are going to deliver productivity, right? Gains for our data scientists and developers. We are going to talk about trust, you know, the trust of data that we are bringing in.
Then I'm going to bring the customers in and talk about their experience, right? We are delivering a product, but we already have customers using it, so I want them to come on stage and share their experiences. You know, it's one thing to hear about it from us, but it's another thing when customers come and talk about it. And last but not least, we are going to announce our first release of IBM Machine Learning on Z, because if you look at transactional data, 90% of it today runs through Z, so they don't have to off-load the data to do analytics on it. We will make machine learning available, so you can do training and scoring right there on Z for your real-time analytics. >> Right. Extending that theme that we talked about earlier, Stu, bringing analytics and transactions together, which was a big theme of the z13 announcement two years ago. Now you're seeing, you know, machine learning coming on Z. The live stream starts at 2 o'clock. SiliconANGLE dot com had an article up on the site this morning from Maria Deutscher on the IBM announcement, so check that out. Dinesh, thanks very much for coming back on theCube. Really appreciate it, and good luck today. >> Thank you. >> All right. Keep it right there, buddy. We'll be back with our next guest. This is theCube. We're live from the Waldorf Astoria for the IBM Machine Learning Event announcement. Right back.
Rob Thomas, IBM | IBM Machine Learning Launch
>> Narrator: Live from New York, it's theCUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody this is theCUBE, we're here at the IBM Machine Learning Launch Event, Rob Thomas is here, he's the general manager of the IBM analytics group. Rob, good to see you again. >> Dave, great to see you, thanks for being here. >> Yeah it's our pleasure. So two years ago, IBM announced the Z platform, and the big theme was bringing analytics and transactions together. You guys are sort of extending that today, bringing machine learning. So the news just hit three minutes ago. >> Rob: Yep. >> Take us through what you announced. >> This is a big day for us. The announcement is we are going to bring machine learning to private Clouds, and my observation is this, you look at the world today, over 90% of the data in the world cannot be googled. Why is that? It's because it's behind corporate firewalls. And as we've worked with clients over the last few years, sometimes they don't want to move their most sensitive data to the public Cloud yet, and so what we've done is we've taken the machine learning from IBM Watson, we've extracted that, and we're enabling that on private Clouds, and we're telling clients you can get the power of machine learning across any type of data, whether it's data in a warehouse, a database, unstructured content, email, you name it we're bringing machine learning everywhere. To your point, we were thinking about, so where do we start? And we said, well, what is the world's most valuable data? It's the data on the mainframe. It's the transactional data that runs the retailers of the world, the banks of the world, insurance companies, airlines of the world, and so we said we're going to start there because we can show clients how they can use machine learning to unlock value in their most valuable data. 
>> And which, you say private Cloud, of course, we're talking about the original private Cloud, >> Rob: Yeah. >> Which is the mainframe, right? >> Rob: Exactly. >> And I presume that you'll extend that to other platforms over time is that right? >> Yeah, I mean, we're going to think about every place that data is managed behind a firewall, we want to enable machine learning as an ingredient. And so this is the first step, and we're going to be delivering every quarter starting next quarter, bringing it to other platforms, other repositories, because once clients get a taste of the idea of automating analytics with machine learning, what we call continuous intelligence, it changes the way they do analytics. And, so, demand will be off the charts here. >> So it's essentially Watson ML extracted and placed on Z, is that right? And describe how people are going to be using this and who's going to be using it. >> Sure, so Watson on the Cloud today is IBM's Cloud platform for artificial intelligence, cognitive computing, augmented intelligence. A component of that is machine learning. So we're bringing that as IBM machine learning which will run today on the mainframe, and then in the future, other platforms. Now let's talk about what it does. What it is, it's a single-place unified model management, so you can manage all your models from one place. And we've got really interesting technology that we pulled out of IBM research, called CADS, which stands for the Cognitive Assistance for Data Scientist. And the idea behind CADS is, you don't have to know which algorithm to choose, we're going to choose the algorithm for you. You build your model, we'll decide based on all the algorithms available on open-source what you built for yourself, what IBM's provided, what's the best way to run it, and our focus here is, it's about productivity of data science and data scientists. 
No company has as many data scientists as they want, and so we've got to make the ones they do have vastly more productive, and so with technology like CADS, we're helping them do their job more efficiently and better. >> Yeah, CADS, we've talked about this in theCUBE before, it's like an algorithm to choose an algorithm, and it makes the best fit. >> Rob: Yeah. >> Okay. And you guys addressed some of the collaboration issues at your Watson Data Platform announcement last October, so talk about the personas who are asking you to give me access to mainframe data, and give me access to tooling that actually resides on this private Cloud. >> It's definitely a data science persona, but we see, I'd say, an emerging market where it's more the business analyst type that is saying, I'd really like to get at that data, but I haven't been able to do that easily in the past. So giving them a single pane of glass, if you will, with some light data science experience, where they can manage their models, using CADS to actually make it more productive. And then we have something called a feedback loop that's built into it, which is, you build a model running on Z, and as you get new data in, these are the largest transactional systems in the world, so there's data coming in every second. As you get new data in, that model is constantly updating. The model is learning from the data that's coming in, and it's becoming smarter. That's the whole idea behind machine learning in the first place. And that's what we've been able to enable here. Now, you and I have talked through the years, Dave, about IBM's investment in Spark. This is one of the first, I would say, world-class applications of Spark. We announced Spark on the mainframe last year; what we're bringing with IBM machine learning is leveraging Spark as an execution engine on the mainframe, and so I see this as Spark finally coming into the mainstream, when you talk about Spark accessing the world's greatest transactional data.
>> Rob, I wonder if you can help our audience kind of squint through a compare and contrast, public Cloud versus what you're offering today, 'cause one thing, the public Cloud keeps adding new services, and machine learning seemed like one of those areas that would get added, like IBM had done with a machine learning platform. Streaming, absolutely, you hear mobile streaming applications absolutely happened in the public Cloud. Is cost similar in private Cloud? Can I get all the services? How will IBM and your customer base keep up with that pace of innovation that we've seen from IBM and others in the public Cloud, on-prem? >> Yeah, so, look, my view is it's not an either-or. Because when you look at this valuable data, clients want to do some of it in public Cloud, and they want to keep a lot of it in the systems that they built on-prem. So our job is, how do we actually bridge that gap? So I see machine learning, like we've talked about, becoming much more of a hybrid capability over time, because the data they want to move to the Cloud, they should do that. The economics are great. The data, doing it on private Cloud, actually the economics are tremendous as well. And so we're delivering an elastic infrastructure on private Cloud as well that can scale like the public Cloud. So to me it's not either-or, it's about what everybody wants as Cloud features. They want the elasticity, they want a creatable interface, they want the economics of Cloud, and our job is to deliver that in both places. Whether it's on the public Cloud, which we're doing, or on the private Cloud. >> Yeah, one of the thought exercises I've gone through is, if you follow the data, and follow the applications, it's going to show you where customers are going to do things.
If you look at IoT, if you look at healthcare, there are lots of uses where it's going to be on-prem, it's going to be on the edge. I got to interview Walmart a couple of years ago at the IBM Edge show, and they leveraged Z globally for their sales, their enablement, and obviously they're not going to use AWS as their platform. What are the trends, what do you hear from your customers, how much of the data, are there reasons why it needs to stay at the edge? It's not just compliance and governance, but it's just because that's where the data is, and I think you were saying there's just so much data on the Z series itself compared to other environments. >> Yeah, and it's not just the mainframe, right? Let's be honest, there's just massive amounts of data that still sits behind corporate firewalls. And while I believe the end destination is that a lot of that will be on public Cloud, what do you do now? Because you can't wait until that future arrives. And so, the biggest change I've seen in the market in the last year is clients are building private Clouds. It's not traditional on-premise deployments; they're building an elastic infrastructure behind their firewall. You see it a lot in heavily-regulated industries, so financial services, where they're dealing with things like GDPR, any type of retailer who's dealing with things like PCI compliance. Heavily-regulated industries are saying, we want to move there, but we've got challenges to solve right now. And so, our mission is, we want to make data simple and accessible, wherever it is, on private Cloud or public Cloud, and help clients on that journey. >> Okay, so carrying through on that, so you're now unlocking access to mainframe data, great. If I have, say, a retail example, and I've got some data science, I'm building some models, I'm accessing the mainframe data, and if I have data that's elsewhere in the Cloud, how specifically, with regard to this announcement, will a practitioner execute on that?
>> Yeah, so, one is you could decide one place that you want to land your data and have it be resident, so you could do that. We have scenarios where clients are using Data Science Experience on the Cloud, but they're actually leaving the data behind their firewalls. So we don't require them to move the data; our model is one of flexibility in terms of how they want to manage their data assets. Which I think is unique in terms of IBM's approach to that. Others in the market say, if you want to use our tools, you have to move your data to our Cloud. Some of them even say, as you click through the terms, now we own your data, now we own your insights. That's not our approach. Our view is it's your data. If you want to run the applications in the Cloud and leave the data where it is, that's fine. If you want to move both to the Cloud, that's fine. If you want to leave both on private Cloud, that's fine. We have capabilities like Big SQL where we can actually federate data across public and private Clouds, so we're trying to provide choice and flexibility when it comes to this. >> And, Rob, in the context of this announcement, that example you gave would be done through APIs that allow me access to that Cloud data, is that right? >> Yeah, exactly, yes. >> Dave: Okay. >> So last year we announced something called Data Connect, which is basically, think of it as a bus between private and public Cloud. You can leverage Data Connect to seamlessly and easily move data. It's very high-speed; it uses our Aspera technology under the covers, so you can do that. >> Dave: A recent acquisition. >> Rob, IBM's been very active in open source engagement, in trying to help the industry sort out some of the challenges out there. Where do you see the state of the machine learning frameworks? Google of course has TensorFlow, we've seen Amazon pushing MXNet. Is IBM supporting all of them, or are there certain horses that you have strong feelings for?
What are your customers telling you? >> I believe in openness and choice. So with IBM Machine Learning you can choose your language: you can use Scala, you can use Java, you can use Python, with more to come. You can choose your framework. We're starting with Spark ML because that's where we have our competency and that's where we see a lot of client desire, but I'm open to clients using other frameworks over time as well, so we'll start to bring that in. I think the IT industry always wants to kind of put people into a box: this is the model you should use. That's not our approach. Our approach is, you can use the language, you can use the framework that you want, and through things like IBM Machine Learning, we give you the ability to tap this data that is your most valuable data. >> Yeah, the box today has just become this mosaic, and you have to provide access to all the pieces of that mosaic. One of the things that practitioners tell us is they struggle sometimes, and I wonder if you could weigh in on this, to invest either in improving the model or capturing more data, and they have limited budget. And I've had people tell me, no, you're way better off getting more data in; I've had people say, no no, now with machine learning we can advance the models. What are you seeing there, and what are you advising customers in that regard? >> So, compute has become relatively cheap, which is good. Data acquisition has become relatively cheap. So my view is, go full speed ahead on both of those. The value comes from the right algorithms and the right models. That's where the value is. And so I encourage clients to even think about separating your teams: you have one that's focused on data acquisition and how you do that, and another team that's focused on model development, algorithm development. Because otherwise, if you give somebody both jobs, they both get done halfway, typically.
And the value is from the right models, the right algorithms, so that's where we stress the focus. >> And models to date have been okay, but there's a lot of room for improvement. The two examples I like to use are retargeting, ad retargeting, which, as we all know as consumers, is not great. You buy something and then you get targeted for another week. And then fraud detection, which has actually, for the last ten years, been quite good, but there are still a lot of false positives. Where do you see IBM machine learning taking those practical use cases in terms of improving those models? >> Yeah, so why are there false positives? The issue typically comes down to the quality of the data, and the amount of data that you have. That's why. Let me give an example. So one of the clients that's going to be talking at our event this afternoon is Argus, who's focused on the healthcare space. >> Dave: Yeah, we're going to have him on here as well. >> Excellent. So Argus basically collects data across the healthcare space, payers, providers, pharmacy benefit managers, and their whole mission is, how do we cost-effectively serve different scenarios or different diseases, in this case diabetes, and how do we make sure patients are getting the right care at the right time? So they've got all that data on the mainframe, and they're constantly getting new data in. It could be about blood sugar levels, it could be about glucose, it could be about changes in blood pressure. Their models will get smarter over time because they built them with IBM Machine Learning, so that what's cost-effective today may not be the most effective or cost-effective solution tomorrow. But we're giving them that continuous intelligence as data comes in to do that. That is the value of machine learning.
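The "models get smarter over time as data arrives" idea Rob describes is, in essence, online (incremental) learning: the model takes a small update step on every new record instead of being retrained in batch. A minimal sketch in plain Python, purely for illustration; this is not IBM Machine Learning's API, and the single feature (a normalized reading), the labels, and the learning rate are all invented:

```python
import math

class OnlineLogReg:
    """Tiny online logistic regression: one gradient step per incoming
    record, so predictions keep adapting as new data streams in."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic gradient step on a single labeled record.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Hypothetical stream: feature is a normalized reading, label is "needs review".
stream = [([0.9], 1), ([0.1], 0), ([0.8], 1), ([0.2], 0)] * 50
model = OnlineLogReg(n_features=1)
for x, y in stream:
    model.update(x, y)  # the model improves with every record

print(model.predict_proba([0.85]))  # high reading: flagged (probability > 0.5)
print(model.predict_proba([0.15]))  # low reading: not flagged
```

The same shape, score, then fold the record into the model, is what gives the "continuous intelligence" property: tomorrow's predictions reflect everything seen up to tomorrow.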
I think sometimes people miss that point. They think it's just about making the data scientist's job easier; that productivity is part of it, but it's really about the veracity of the data and the fact that you're constantly updating your models. >> And the patient outcome there, I read through some of the notes earlier, is if I can essentially opt in to allow the system to adjudicate the medication or the claim, and if I do so, I can get that instantaneously or in near real-time, as opposed to having to wait through weeks and phone calls and haggling. Is that right, did I get that right? >> That's right, and look, there are two dimensions. It's the cost of treatment, so you want to optimize that, and then it's the effectiveness. And which one's more important? Well, they're both actually critically important. And so what we're doing with Argus is helping them build models, where they deploy this so that they're optimizing both of those. >> Right, and in that case, again, back to the personas, that would be, and you guys stressed this at your announcement last October, the data scientist, the data engineer, and I guess even the application developer, right? Involved in that type of collaboration. >> My hope would be, over time, when I talked about how we view machine learning as an ingredient everywhere that data is, that you embed machine learning into any applications that are built. And at that point you no longer need a data scientist per se for that case; you can just have the app developer incorporating it. Whereas for another tough challenge, like the one we discussed, that's where you need data scientists. So think about it: you need to divide and conquer the machine learning problem, where the data scientist can play, the business analyst can play, the app developers can play, the data engineers can play, and that's what we're enabling. >> And how does streaming fit in?
We talked earlier about this sort of batch, interactive, and now you have this continuous sort of workload. How does streaming fit? >> So we use streaming in a few ways. One is very high-speed data ingest; it's a good way to get data into the Cloud. We can also do analytics on the fly. So a lot of our use cases around streaming are where we actually build analytical models into the streaming engine, so that you're doing analytics on the fly. So I view that as a different side of the same coin. It's kind of based on your use case: if you need sub-millisecond response times and you constantly have data coming in, you need something like a streaming engine to do that. >> And it's actually consolidating that data pipeline, is what you described, which is big in terms of simplifying the complexity, this mosaic of Hadoop, for example, and that's a big value proposition of Spark. Alright, we'll give you the last word, you've got an audience outside waiting, big announcement today; final thoughts. >> You know, we've talked about machine learning for a long time. I'll give you an analogy. So in 1896, Charles Brady King is the first person to drive an automobile down the street in Detroit. It was 20 years later before Henry Ford actually turned it from a novelty into mass appeal. So it was like a 20-year incubation period before you could actually automate it, make it more cost-effective, make it simpler and easier. I feel like we're kind of in the same place here, where the data era in my mind began around the turn of the century. Companies came onto the internet and started to collect a lot more data. It's taken us a while to get to the point where we could actually make this really easy and do it at scale. And people have been wanting to do machine learning for years. It starts today. So we're excited about that.
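The "analytics on the fly" pattern Rob describes, building the analysis into the stream itself rather than landing data and batch-processing it later, can be sketched with a plain Python generator. This is a toy, not IBM Streams or Spark code; the window size, threshold, and values are invented:

```python
from collections import deque

def score_stream(events, window=5, threshold=2.0):
    """Score each event as it arrives: flag values that deviate sharply
    from the recent window. The model (a rolling z-score) lives inside
    the stream, so every event is labeled the moment it is ingested."""
    recent = deque(maxlen=window)
    for value in events:
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = var ** 0.5
            if std > 0 and abs(value - mean) / std > threshold:
                yield (value, "anomaly")
            else:
                yield (value, "ok")
        else:
            yield (value, "ok")  # not enough history yet
        recent.append(value)  # fold the event into the window after scoring

events = [10, 11, 10, 12, 11, 10, 55, 11, 10]
flags = list(score_stream(events))
print([v for v, label in flags if label == "anomaly"])  # -> [55]
```

A fraud-style spike is caught as it passes through, which is the point of pushing models into the streaming engine: no round trip to a batch system before acting.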
>> Yeah, and we saw the same thing with the steam engine; it was decades before it was actually perfected, and now the timeframe in our industry is compressed to years, sometimes months. >> Rob: Exactly. >> Alright, Rob, thanks very much for coming on theCUBE. Good luck with the announcement today. >> Thank you. >> Good to see you again. >> Thank you guys. >> Alright, keep it right there, everybody. We'll be right back with our next guest. We're live from the Waldorf Astoria, the IBM Machine Learning Launch Event. Be right back. [electronic music]
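A footnote on the Big SQL federation capability Rob mentioned earlier in the segment: the core idea is a single query that spans separate data stores without first copying everything to one place. A deliberately tiny illustration using two attached SQLite databases standing in for a private and a public store; Big SQL itself is far more capable, and the `claims` table here is invented:

```python
import sqlite3

# One connection, two attached stores standing in for private and public data.
conn = sqlite3.connect(":memory:")                  # "private cloud" store
conn.execute("ATTACH DATABASE ':memory:' AS pub")   # "public cloud" store

conn.execute("CREATE TABLE main.claims (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE pub.claims (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO main.claims VALUES (?, ?)", [(1, 120.0), (2, 75.5)])
conn.executemany("INSERT INTO pub.claims VALUES (?, ?)", [(3, 200.0)])

# A single "federated" query spans both stores; no data is copied between them.
rows = conn.execute(
    "SELECT id, amount FROM main.claims "
    "UNION ALL "
    "SELECT id, amount FROM pub.claims "
    "ORDER BY id"
).fetchall()
print(rows)  # -> [(1, 120.0), (2, 75.5), (3, 200.0)]
```

The flexibility Rob describes ("leave the data where it is") falls out of this shape: the query layer reaches into each store in place, instead of forcing a move to one cloud first.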
Kickoff - IBM Machine Learning Launch - #IBMML - #theCUBE
>> Narrator: Live from New York, it's The Cube covering the IBM Machine Learning Launch Event brought to you by IBM. Here are your hosts, Dave Vellante and Stu Miniman. >> Good morning everybody, welcome to the Waldorf Astoria. Stu Miniman and I are here in New York City, the Big Apple, for IBM's Machine Learning Event, #IBMML. We're fresh off Spark Summit, Stu, where we had The Cube, this by the way is The Cube, the worldwide leader in live tech coverage. We were at Spark Summit last week, George Gilbert and I, watching the evolution of so-called big data. Let me frame where we're at, Stu, and bring you into the conversation. The early days of big data were all about offloading the data warehouse and reducing the cost of the data warehouse. I often joke that the ROI of big data is reduction on investment, right? There are these big, expensive data warehouses. It was quite successful in that regard. What then happened is we started to throw all this data into the data warehouse. People would joke it became a data swamp, and you had a lot of tooling to try to clean the data warehouse, and a lot of transforming and loading, and the ETL vendors started to participate there in a bigger way. Then you saw the extension of these data pipelines to try to do more with that data. The Cloud guys have now entered in a big way. We're now entering the Cognitive Era, as IBM likes to refer to it. Others talk about AI and machine learning and deep learning, and that's really the big topic here today. What we can tell you is that the news goes out at 9:00am this morning, and it was well known that IBM is bringing machine learning to its mainframe, the z mainframe. Two years ago, Stu, IBM announced the z13, which was really designed to bring analytic and transaction processing together on a single platform. Clearly IBM is extending the useful life of the mainframe by bringing things like Spark, certainly what it did with Linux, and now machine learning to z.
I want to talk about Cloud, the importance of Cloud, and how that has really taken over the world of big data. Virtually every customer you talk to now is doing work on the Cloud. It's interesting to see now IBM unlocking its transaction base, its mission-critical data, to this machine learning world. What are you seeing around Cloud and big data? >> We've been digging into this big data space since before it was called big data. One of the early things that really got me interested and excited about it is, from the infrastructure standpoint, storage has always been one of those costs that we had to have, and with the massive amounts of data, the digital explosion we talked about, keeping all that information, or managing all that information, was a huge challenge. Big data was really that bit flip: how do we take all that information and make it an opportunity? How do we get new revenue streams? Dave, IBM has been at the center of this, looking at the higher-level pieces of not just storing data, but leveraging it. Obviously huge in analytics, lots of focus on everything from Hadoop and Spark and newer technologies, but digging in to how they can leverage up the stack, which is where IBM has done a lot of acquisitions in that space, and leveraging that. They want to make sure that they have a strong position both in Cloud, which was renamed, SoftLayer is now IBM Bluemix, with a lot of services including a machine learning service that leverages the Watson technology, and of course on-prem they've got the z and the Power solutions that you and I have covered for many years at the IBM Edge show. >> Machine learning obviously heavily leverages models. We've seen that in the early days of the data, data scientists would build models, and machine learning allows those models to be perfected over time. So there's this continuous process.
We're familiar with the world of batch, and then minicomputers brought in the world of interactive, so we're familiar with those types of workloads. Now we're talking about a new emergent workload, which is continuous. Continuous apps where you're streaming data in, which is what Spark is all about. The models that data scientists are building can constantly be improved. The key is automation, right? Being able to automate that whole process, and being able to collaborate between the data scientists, the data quality engineers, even the application developers. That's something that IBM really tried to address in its last big announcement in this area, which was in October of last year: the Watson Data Platform, what they called at the time DataWorks. So really trying to bring together those different personas in a way that they can collaborate and improve models on a continuous basis. The use cases that you often hear in big data, and certainly initially in machine learning, are things like fraud detection. Obviously ad serving has been a big data application for quite some time. In financial services, identifying good targets, identifying risk. What I'm seeing, Stu, is that the phase we're in now of this so-called big data and analytics world, now bringing in machine learning and deep learning, is about really improving on some of those use cases. For example, fraud detection has gotten much, much better. Ten years ago, let's say, it took many, many months, if you ever detected fraud at all. Now you get it in seconds, or sometimes minutes, but you also get a lot of false positives. Oops, sorry, the transaction didn't go through. Did you do this transaction? Yes, I did. Oh, sorry, you're going to have to redo it because it didn't go through. It's very frustrating for a lot of users. That will get better and better and better. We've all experienced retargeting from ads, and we know how crappy they are. That will continue to get better.
The big question that people have, and it goes back to Jeff Hammerbacher: the best minds of my generation are trying to get people to click on ads. When will we see big data really start to affect our lives in different ways, like patient outcomes? We're going to hear some of that today from folks in health care and pharma. Again, these are the things that people are waiting for. The other piece is, of course, IT. What are you seeing, in terms of IT, in the whole data flow? >> Yes, a big question we have, Dave, is where's the data? And therefore, where does it make sense to be able to do that processing? In big data we talked about, you've got massive amounts of data, can we move the processing to that data? With IoT, the day before, our CTO talked about how there's going to be massive amounts of data at the edge, and I don't have the time or the bandwidth or the need necessarily to pull that back to some kind of central repository; I want to be able to work on it there. Therefore there's going to be a lot of data worked at the edge. Peter Levine did a whole video talking about how, "Oh, Public Cloud is dead, it's all going to the edge." A little bit hyperbolic of a statement; we understand that there are plenty of use cases for both Public Cloud and for the edge. In fact we see Google pushing big on machine learning with TensorFlow, one of those machine learning frameworks out there that we expect a lot of people to be working on. Amazon is putting effort into the MXNet framework, which is once again an open-source effort. One of the things I'm looking at in the space, and I think IBM can provide some leadership here, is which frameworks are going to become popular across multiple scenarios. How many winners can there be for these frameworks? We already have multiple programming languages, multiple Clouds. How much of it is just API compatibility?
How much work is there, and where are the repositories of data going to be, and where does it make sense to do that predictive analytics, that advanced processing? >> You bring up a good point. Last year, last October, at Big Data NYC, we had a special segment with a data scientist panel. It was great. We had some rockstar data scientists on there, like Dez Blanchfield and Joe Caserta, and a number of others. They echoed what you always hear when you talk to data scientists: "We spend 80% of our time messing with the data, trying to clean the data, figuring out the data quality, and precious little time on the models, proving the models, and actually getting outcomes from those models." So things like Spark have simplified that whole process and unified a lot of the tooling around so-called big data. We're seeing Spark adoption increase. George Gilbert, in our part one and part two last week of the big data forecast from Wikibon, showed that we're still not on the steep part of the S-curve in terms of Spark adoption. Generically, we're talking about streaming as well, included in that forecast, but it's forecasting that increasingly those applications are going to become more and more important. It brings you back to what IBM's trying to do: bring machine learning to this critical transaction data. Again, to me, it's an extension of the vision that they put forth two years ago, bringing analytic and transaction data together, actually processing within that Private Cloud complex, which is essentially what this mainframe is. It's the original Private Cloud, right? You were saying off-camera, it's the original converged infrastructure. It's the original Private Cloud. >> The mainframe's still here, lots of Linux on it. We've covered it for many years: you want your cool Linux, Docker, containerized machine learning stuff, I can do that on the z series.
>> You want Python and Spark and R and Java, and all the popular programming languages. It makes sense. It's not like a huge growth platform, it's kind of flat, down, up in the product cycle, but it's alive and well, and a lot of companies run their businesses obviously on the z. We're going to be unpacking that all day. Some of the questions we have are: what about Cloud? Where does it fit? What about Hybrid Cloud? What are the specifics of this announcement? Where does it fit? Will it be extended? Where does it come from? How does it relate to other products within the IBM portfolio? And very importantly, how are customers going to be applying these capabilities to create business value? That's something that we'll be looking at with a number of the folks on today. >> Dave, another thing, it reminds me of two years ago when you and I did an event with the MIT Sloan school on The Second Machine Age with Andy McAfee and Erik Brynjolfsson, talking about, as machines can help with some of these analytics, some of this advanced technology, what happens to the people? Talk about health care: it's doctors plus machines most of the time. As these two professors say, it's racing with the machines. What is the impact on people? What's the impact on jobs? And productivity going forward. Really interesting, hot space. They talk about everything from autonomous vehicles to advanced health care and the like. This is right at the core of where the next generation of the economy and jobs are going to go. >> It's a great point, and no doubt that's going to come up today, and some of our segments will explore that. Keep it right there, everybody. We'll be here all day covering this announcement, talking to practitioners, talking to IBM executives and thought leaders, and sharing some of the major trends that are going on in machine learning, and the specifics of this announcement. Keep it right there, everybody. This is The Cube. We're live from the Waldorf Astoria. We'll be right back.
Christos Karamanolis, VMware | VMworld 2016
>> Narrator: Live from the Mandalay Bay Convention Center in Las Vegas, it's theCUBE covering VMworld 2016, brought to you by VMware and its ecosystem sponsors. Now here's your host, Stu Miniman. >> Welcome back to theCUBE here at VMworld 2016. Happy to welcome back to the program Christos Karamanolis, who's a Fellow and CTO of the VMware Storage and Availability business unit. Thank you for joining us again. >> Happy to be back. >> Storage is a big focus here. Big announcements around not only Virtual SAN, but everything happening in storage. Tell us what you've been working on in the last year. >> Yeah, quite a few things. As you know, Virtual SAN has become practically a mainstream product now, especially since we shipped Virtual SAN 6.2 back in March 2016, with a number of new enterprise-grade features for space efficiency and availability, features like erasure coding with RAID 5/6. The product is really taking off, especially in all-flash configurations, which are becoming the predominant model that our customers are using. So ultimately, of course, customers buy a new product like this, a hyper-converged product, because of the operational efficiencies it brings to their data centers. The way I present this is, you have the operational efficiency of public clouds in your private data center now. But this for me is just the stepping stone for an even longer-term, bolder vision we have around storage and data management. So, for the last several months now, I have been working on a new range of projects. The main theme there is moving up the stack from storage and the physical infrastructure implications it has, to data management: starting with data protection, and overall managing the life cycle of your data, for protection, for disaster recovery, for archival, so that you have tools to be able to effectively and efficiently discover your data and mine your data.
They can be used by new applications, including cloud native applications, and, even though this may sound a little controversial coming from a VMware CTO, even moving your data to public clouds and allowing application mobility freely between private and public clouds. >> Yeah, it's really interesting, and I wonder if you can unpack that a little bit for us. VMware, of course, is really dominant in the enterprise data center. We're trying to understand where VMware fits into the public cloud, and how you both support the existing ecosystem and move forward. >> Of course, there are challenges. There are many open questions. I do not claim that we have the answers to everything. But you do see that we put a lot of emphasis on that, because it is obvious that the IT world is evolving. Our own customers are gradually, slowly, but certainly, starting to incorporate public clouds into the bigger IT organizations that they have. So our goal is to start delivering value to our customers based on clouds, starting with what they have today in their data centers. Let me give you a specific example. In the case of Virtual SAN, we have some really cool tools for monitoring your infrastructure in a holistic way: compute, networking, and now storage. As part of that, we have solutions and tools that allow the customer to constantly monitor their converged infrastructure and its configuration, the cluster, the network, servers, controllers, down to individual devices, and we provide a lot of data to the customers, not only on the health but also on the performance of the infrastructure, data that the customer can use today to perform root cause analysis of potential issues and to decide how to optimize their infrastructure and workloads. But that is actually pretty sophisticated analysis. You cannot expect the hundreds of thousands of VMware customers to be able to do this kind of sophisticated analysis.
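The kind of analysis Christos describes, turning raw health and performance telemetry into findings a customer can act on, reduces, at its very simplest, to rules evaluated over fleet data. A deliberately naive sketch; every field name, threshold, and the string-based version compare here is invented for illustration, and VMware's actual analytics are far more sophisticated:

```python
# Hypothetical telemetry records of the kind a phone-home pipeline might collect.
telemetry = [
    {"device": "ssd-01", "wear_pct": 91, "firmware": "2.1"},
    {"device": "ssd-02", "wear_pct": 35, "firmware": "2.4"},
    {"device": "ctrl-7", "wear_pct": 0,  "firmware": "1.8"},
]

LATEST_FIRMWARE = "2.4"   # assumed baseline, e.g. derived from fleet-wide data
WEAR_LIMIT = 90           # flag devices nearing end of life

def suggestions(records):
    """Turn raw telemetry into actionable advice, instead of raw data."""
    out = []
    for r in records:
        if r["wear_pct"] >= WEAR_LIMIT:
            out.append(f"{r['device']}: approaching end of life, plan replacement")
        # Toy shortcut: lexicographic compare stands in for real version logic.
        if r["firmware"] < LATEST_FIRMWARE:
            out.append(f"{r['device']}: firmware {r['firmware']} is behind, consider upgrading")
    return out

for s in suggestions(telemetry):
    print(s)
```

The step up that Christos goes on to describe is exactly this move from handing customers the `telemetry` list to handing them the `suggestions` output, with the rules themselves learned and refined across many customers in the cloud.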
So what we're working on right now is a set of analytics tools that do all this data crunching and analysis, root cause analysis and evaluation of the infrastructure, on behalf of the customer. Instead of providing data, now we're providing answers and suggestions. Now, we want to be able to deliver those analytics at a very rapid cadence. So what we do is we develop all those things in VMware's cloud. We collect data from the customer side through telemetry, VMware's phone-home product, and we get all the data up in our cloud. We crunch the data on behalf of the customer, and we use really sophisticated methods, which will be evolving over time, and eventually we deliver feedback and suggestions to the customer at a high level that can be actionable. For example, we can point out that the firmware of certain controllers in the infrastructure is falling behind and may have problems, or point out that a certain SSD is getting close to the end of its life. Or, a more sophisticated thing: suggest reconfiguring your application with a different policy for data distribution to achieve better performance. The interesting thing is that we're going to be combining data from multiple sites, multiple customers, to be able to do this holistic analytics and say, you know what, based on trends I see at other customers, I suggest you also do that. Now, the really cool thing out of this is that the customer does not have to go and use yet another portal on a public cloud to take advantage of it; in fact, we send all that feedback through the vSphere UI on-premise to the customers. So you really have the best of both worlds. There is a big development of analytics going on, using, behind the scenes, a really complex cloud native application, delivered through the existing tools that the customers are used to on premise. So this is just one example. >> Christos,
could you give us a little bit of insight into the guiding light for your development process? Is it that kind of core customer that you're pulling in and working with? Is it a mandate from above that says, you know, hey, we need to build something more robust and move up the stack? What are some of the pieces that lead to the development that you do? >> You know, this is a very interesting point. I must start by stating that VMware has always been an engineering-driven company, and many of our products grew out of ideas that were invented by engineers while others thought they were not viable or not even feasible. Of course virtualization itself, in its early stages. But also features like vMotion or Storage vMotion, or even, you know, ideas like Virtual SAN, right? Claiming that you could do RAID 6 very effectively in software was something that was not really, you know, appreciated in the industry in the early stages. So a lot of the innovation is grassroots innovation. We have our engineers exposed directly to customers and customer problems. Of course, they also understand what is happening in the industry, the trends, whether that is in hardware, as is the case these days with the new generation of storage hardware that is emerging, or whether that is a trend among customers, for example, using public clouds in certain ways, whether that is for doing test and dev or for archiving their data. We observe those things, and then, through a grassroots process, all this gets amalgamated into some concrete ideas. I'm not saying that all those ideas result in products, but we definitely have a very open mind in letting engineers experiment and prove, sometimes, common sense to be wrong. So this is the process.
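To illustrate the "RAID in software" idea mentioned above in its simplest form, here is a single-parity sketch (RAID 5-style). Real RAID 6, as in Virtual SAN, adds a second, independently computed parity (typically Reed-Solomon) so it can survive two simultaneous failures; this toy version survives one.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# Three data blocks striped across devices, plus one parity block.
data = [b"ABCD", b"1234", b"wxyz"]
parity = xor_blocks(data)  # written to a separate device

# Simulate losing one data block and rebuilding it from the survivors:
# XOR of the remaining data blocks and the parity recovers the lost one.
lost = data[1]
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == lost)  # True
```

The point of the anecdote stands either way: doing this entirely in software on commodity servers, fast enough to matter, was the contested claim.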
This is how Virtual SAN started: a couple of us went to our CEO back then, Paul Maritz, and suggested we do this drastic thing called software storage, the idea that you can run the whole storage stack in software on the same servers that we virtualize and that run the VMs. So this is really how the process has always been working, and this is still the case, and we're very proud of this culture. This is one way we're actually attracting and retaining talent in the company. >> Yeah, I always love digging into some of the innovation processes. I had a good chat with Steve Herrod, the former CTO of VMware, where, if I remember right, one of the processes you use is called Flings, where you can actually get visibility from the outside into some of those kinds of trials and things that are going on that aren't yet fully supported. >> Absolutely. And that is still the case. Probably the best-known fling these days is the HTML5-based UI for ESX, which is used extensively, both internally in VMware, where it actually started as a tool for that purpose, but now widely by the community. And that fling gave us a lot of insight into how to evolve our mainstream user interface for vSphere proper, not just ESX. So this is exactly the alternative process that lets us test the waters and feel much more confident when we make bigger investments in the mainstream product. >> Right. Architecturally, VMware has been around for quite a while now. I had a good talk with Satyam Vaghani, who built VMFS, earlier today, and we were talking about, you know, new applications and new architectures. When VMFS was built, nobody was thinking about containers. They weren't thinking about applications like Hadoop or some of these more cloud-native applications. How do you take into consideration where things are going? How do these fit into, you know, kind of the traditional VMware vSphere? What things need to change? How do you look at kind of the code base?
>> Right. So first of all, VMFS, I must say, is probably the most mature and most widely adopted clustered file system in the industry. For over 10 years now, it has been used to virtualize enterprise-grade storage, storage area networks, and it is going to have a role for many years to come. But on the other hand, we are all technologists, and we understand that a product is designed with certain assumptions and constraints, and VMFS was designed back in the mid-2000s to address the requirements for virtualizing LUNs, you know, the traditional volumes that you'd be consuming from a disk array. Now the world is changing, right? We have a whole new generation of solid-state devices for storage. Software on commodity servers with commodity storage devices is becoming, as your own reports have been indicating, the predominant model of delivering storage in the enterprise data center and, of course, even in public clouds with hyper-scale storage. So the requirements are changing. You need a storage platform that can really take advantage of the very low latencies of those devices. I was at the Intel Developer Forum a couple of weeks ago, and Intel announced for the first time performance numbers for the new generation of NVMe devices that include the 3D XPoint technology under the covers. Latencies are at around 10 microseconds, right, and IOPS numbers that are in the several hundreds of thousands, if not millions. So, a complete game changer. And that is not the only company that is coming up with this technology. So you need to invest now in new technologies that can harness the capabilities of these new devices, lightweight protocols like NVMe. In fact, I see NVMe not just as a protocol to access a device; I can see a future for it beyond that.
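To put the 10-microsecond figure quoted above in perspective, a quick back-of-the-envelope calculation (a sketch, not VMware's numbers): throughput is bounded by concurrency divided by per-I/O latency, so a single outstanding request against a 10 µs device caps out around 100,000 IOPS, and the multi-million IOPS figures require deep queues plus a software path that adds only microseconds of overhead per I/O.

```python
def max_iops(service_time_us: float, queue_depth: int = 1) -> float:
    """Little's-law style upper bound: throughput = concurrency / latency."""
    return queue_depth / (service_time_us * 1e-6)

# One outstanding I/O against a 10 us device: at most ~100k IOPS.
print(f"{max_iops(10):,.0f}")                  # 100,000

# Approaching millions of IOPS requires parallelism (deep NVMe queues).
print(f"{max_iops(10, queue_depth=32):,.0f}")  # 3,200,000

# Software overhead matters at this scale: 5 us of stack overhead per I/O
# on a 10 us device cuts the single-queue ceiling by a third.
print(f"{max_iops(10 + 5):,.0f}")              # 66,667
```

This is why a legacy storage stack tuned for millisecond-class disks, where a few hundred microseconds of software overhead were invisible, becomes the bottleneck on these devices.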
I can see it replacing SCSI in the software stack. Soon, and I'm not committing to specific dates, but soon, we will be shipping a version of vSphere that comes with virtual NVMe in the guest, virtualization of NVMe. So you can see where we're heading, with NVMe becoming a predominant protocol for the transport and for virtualizing storage. >> Interesting. And we've got a long history of things that start in the guest; usually it then takes a lot of engineering work to get them down to the hypervisor itself. So, you know, without having to give away too much, is that a progression we should see sometime in the future for some of these new memory architectures? >> Certainly, certainly. Our ESX storage stack, the stack that is used by VMFS and by Virtual SAN, has been designed, again, for another era of storage. Now we are rewriting a lot of those things, and I cannot disclose too much detail, obviously, but I can tell you that it's going to be a very different software stack: much leaner, much more optimized for local, very fast devices, and ultimately NVMe is going to be a key technology in this new storage stack. >> All right, one last follow-up on that topic, thinking about these new memory architectures and what's going on. As of September 7th, Dell will acquire EMC, and there's the relationship between EMC and VMware. So could we expect some of these new memory technologies to be something that you'll work on even more closely with Dell EMC? >> That is definitely the case, irrespective of the deal between EMC and Dell, which, as you said, is going to be closing, it seems, pretty soon, from what I read in the newspapers. >> Michael confirmed it, so it's finally official. Some of the specifics are out.
>> Yes, we're moving ahead with these new technologies, and we're working closely with all the partners, Micron, Intel, and many of the other hardware vendors that are introducing such technologies, to incorporate them into our systems and into our software. For example, I see great opportunities for these very fast and durable, but still quite expensive, technologies to be used to store metadata. Things like deduplication tables, the kinds of metadata that, because of I/O amplification, have an impact on the performance perceived by the application. Moving metadata like that into those tiers is going to make a great difference in terms of performance consistency, latency, and predictability for the application. Now, thanks to the relationship with Dell and EMC, I can hope that some of these technologies will find their way into server platforms sooner rather than later, so all of us and our customers will benefit from that. >> All right. Christos, really appreciate getting the update from you. Lots happening in the storage world. One of my themes coming into this week was that if we can really simplify storage, we might actually have a storageless world. That doesn't mean it reduces the value of storage or its importance, but it's going to help users move beyond it. We'll be back with lots more coverage here from VMworld 2016. You're watching the Cube. >> Glad to be here. >> Narrator: Live from the Mandalay Bay Convention Center in Las Vegas, it's the Cube, covering VMworld 2016, brought to you by VMware and its ecosystem sponsors. Now here's your host, Stu Miniman. >> Welcome back to the Cube here at VMworld 2016. Happy to welcome back to the program Christos Karamanolis, who's a fellow and CTO of the VMware Storage and Availability Business Unit. Thank you for joining us again. >> Glad to be back.
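As a footnote on the deduplication metadata mentioned above, a toy content-addressed block store shows why the fingerprint index is hot metadata: it is consulted (and possibly updated) on every write, which is exactly why it benefits from a very fast, durable tier. The SHA-256 fingerprints and 4 KiB block size here are illustrative choices, not VMware's on-disk format.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size

class DedupStore:
    """Toy content-addressed block store. The fingerprint index is the
    kind of metadata the interview refers to: every write pays a lookup
    against it before any data is stored."""

    def __init__(self):
        self.index = {}   # fingerprint -> physical location (metadata)
        self.blocks = []  # physical block storage (data)

    def write(self, data: bytes) -> int:
        fp = hashlib.sha256(data).digest()
        loc = self.index.get(fp)
        if loc is None:              # new content: store it and index it
            loc = len(self.blocks)
            self.blocks.append(data)
            self.index[fp] = loc
        return loc                   # duplicate content maps to one copy

store = DedupStore()
a = store.write(b"x" * BLOCK_SIZE)
b = store.write(b"x" * BLOCK_SIZE)  # duplicate: no new physical block
c = store.write(b"y" * BLOCK_SIZE)
print(a, b, c, len(store.blocks))   # 0 0 1 2
```

If the index lives on a slow medium, every application write inherits that lookup latency, which is the amplification effect the answer describes.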