Scott Hebner, IBM | Change the Game: Winning With AI

>> Live from Times Square in New York City, it's theCUBE. Covering IBM's Change the Game: Winning With AI. Brought to you by IBM.

>> Hi, everybody, we're back. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. We're here with Scott Hebner, who's the VP of marketing for IBM analytics and AI. Scott, it's good to see you again, thanks for coming back on theCUBE.

>> It's always great to be here, I love doing these.

>> So one of the things we've been talking about for quite some time on theCUBE now, we've been following the whole big data movement since the early Hadoop days. And now AI is the big trend, and we always ask: is this old wine, new bottle? Or is it something substantive? And the consensus is, it's real, it's real innovation because of the data. What's your perspective?

>> I do think it's another one of these major waves, and if you kind of go back through time, there's been a series of them, right? We went from sort of centralized computing into client-server, and then we went from client-server into the whole world of e-business and the internet, back around the 2000 time frame or so. Then we went from internet computing to cloud, right? And I think the next major wave here, that next step, is AI. And machine learning, and applying all this intelligent automation to the entire system. So I think it's not just an evolution, it's a pretty big change that's occurring here. Particularly the value that it can provide businesses is pretty profound.

>> Well, it seems like that's the innovation engine for at least the next decade. It's not Moore's Law anymore, it's applying machine intelligence and AI to the data and then being able to actually operationalize that at scale. With the cloud-like model, whether it's on-prem or off-prem. Your thoughts on that?

>> Yeah, I mean I think that's right on, 'cause if you kind of think about what AI's going to do, in the end it's going to be about just making much better decisions.
Evidence-based decisions, your ability to get to data that was previously unattainable, right? 'Cause it can discover things in real time. So it's about decision making and it's about fueling better, and more intelligent, business processing. Right? But I think what's really driving, sort of under the covers of that, is this idea that, are clients really getting what they need from their data? 'Cause we all know that the data's exploding in terms of growth. And what we know from our clients and from studies is that only about 15% of business leaders believe that they're getting what they need from their data. Yet most businesses are sitting on about 80% of their data that's either inaccessible, unanalyzed, or untrusted, right? So what they're asking themselves is, how do we first unlock the value of all this data? And they know they have to do it in new ways, and I think the new ways are things like cloud-native architectures, containerization, things of that nature. Plus, artificial intelligence. So I think what the market is starting to tell us is, AI is the way to unlock the value of all this data. And it's time to really do something significant with it, otherwise it's just going to be marginal progress over time. They need to make big progress.

>> But data is plentiful, insights aren't. And part of your strategy has always been to bring insights out of that data and, obviously, to focus on client outcomes. But a big part of your role is not only communicating IBM's analytics and AI strategy, but also helping shape that strategy. How do you sort of summarize that strategy?

>> Well, we talk about the ladder to AI, 'cause when you look at the actual clients that are ahead of the game here, and the challenges that they've faced to get to the value of AI, what we've learned, very, very clearly, is that the hardest part of AI is actually making your data ready for AI. It's about the data.
It's sort of this notion that there's no AI without an information architecture, right? You have to build that architecture to make your data ready, 'cause bad data will be paralyzing to AI. And actually there was a great MIT Sloan study that they did earlier in the year that really dives into all these challenges, and if I remember correctly, about 81% of them said that the number one challenge they had is their data. Is their data ready? Do they know what data to get to? And that's really where it all starts. So we have this notion of the ladder to AI. It's several very prescriptive steps that we believe, through best practices, you need to actually take to get to AI. And once you get to AI, then it becomes about how you operationalize it in a way that it scales, that you have explainability, you have transparency, you have trust in what the model is. But it very much is a systematic approach here, and we believe clients are going to get there in a much faster way.

>> So the picture of the ladder here, it starts with collect, and that's kind of what we did with Hadoop, we collected a lot of data 'cause it was inexpensive. And then organizing it, it says, create a trusted analytics foundation. Still building that sort of framework. And then analyze, and actually start getting insights on demand. And then automation, that seems to be the big theme now. It's, how do I get automation? Whether it's through machine learning, infusing AI everywhere. Maybe blockchain is part of that automation, obviously. And then ultimately getting to the outcome, you call it trust, achieving trust and transparency, that's the outcome that we want here, right?

>> I mean, I think it all really starts with making your data simple and accessible. Which is about collecting the data. And doing it in a way that you can tap into all types of data, regardless of where it lives.
So the days of trying to move data around all over the place, or heavy-duty replication and integration, are over. Let it sit where it is, but be able to virtualize it and collect it and containerize it, so it can be more accessible and usable. And that kind of goes to the point that 80% of the enterprise data is inaccessible, right? So it all starts first with, are you getting all the data collected appropriately, and getting it into a form that you can use? And then we start feeding things in like IoT data, and sensors, and it becomes real-time data that you have to do this against, right? So notions of replicating and integrating and moving data around become not very practical. So that's step one. Step two is, once you collect all the data, it doesn't necessarily mean you trust it, right? So when we say trust, we're talking about business-ready data. Do people know what the data is? Are there business entities associated with it? Has it been cleansed, right? Have all the duplicates been taken out? What do you do in a situation where you have sources of data that are telling you different things? Like, I think we've all been on a treadmill where the phone, the watch, and the treadmill will actually tell you different distances. I mean, what's the truth? The whole notion of organizing is getting it ready to be used by the business, and applying the policies, the compliance, and all the protections that you need for that data. Step three is the ability to build out all this ability to analyze it. To do it at scale, right, and to do it in a way that everyone can leverage the data. So not just the business analysts, but you need to enable everyone through self-service. And that's the advancements that we're getting in new analytics capabilities that make mere mortals able to get to that data and do their analysis.
>> And if I could inject, the challenge with the sort of traditional decision support world is you had maybe two or three people that were, like, the data gods. You had to go through them, and they would get the analysis. And it's just, the agility wasn't there.

>> Right.

>> So you're democratizing that, putting it in the hands.

>> Absolutely.

>> Maybe the business user's not as much of an expert as the person who can build the cube, but they could find new use cases, and drive more value, right?

>> Actually, from a developer who needs to get access, and analytics infused into their applications, to the other end of the spectrum, which could be a marketing leader, a finance planner, someone who's planning budgets, a supply chain planner. Right, so it's that whole spectrum, not only allowing them to tap into and analyze the data and gain insights from it, but allowing them to customize how they do it and do it in a more self-service way. So that's the notion of scale, on-demand insights. It's really a cultural thing enabled through the technology. With that foundation, then you have the ability to start to infuse, where I think the real power starts to kick in here. So I mean, all that's kind of making your data ready for AI, right? Then you start to infuse machine learning everywhere. And that's when you start to build these models that are self-learning, that start to automate the ability to get to these insights, and to the data. And uncover what has previously been unattainable, right? And that's where the whole thing starts to become automated and more real time and more intelligent. And that's where those models then allow you to do things you couldn't do before, with the data they're saying they're not getting access to. And then of course, once you get the models, just because you have good models doesn't mean that they've been operationalized, that they've been embedded in applications, embedded in business process.
That you have trust and transparency and explainability of what it's telling you. And that top tier of the ladder is really about embedding it, right, into your business process in a way that you trust it. So we have a systematic set of approaches to that, best practices. And of course we have the portfolio that would help you step up that ladder.

>> So the fat middle of this bell curve, this kind of maturity curve, is kind of the organize and analyze phase, that's probably where most people are today. And what's the big challenge of getting up that ladder, is it the algorithms, what is it?

>> Well, I think it clearly, as with most movements like this, starts with culture and skills, right? And the ability to just change the game within an organization. But putting that aside, I think what's really needed here is an information architecture that's based in the agility of a cloud-native platform, that gives you the productivity, and truly allows you to leverage your data wherever it resides. So whether it's in the private cloud, the public cloud, on premises, dedicated, no matter where it sits, you want to be able to tap into all that data. 'Cause remember, the challenge with data is it's always changing. I don't mean the sources, but the actual data. So you need an architecture that can handle all that. Once you stabilize that, then you can start to apply better analytics to it. And so yeah, I think you're right. That is sort of the bell curve here. And with that foundation, that's when the power of infusing machine learning and deep learning and neural networks, I mean those kinds of AI technologies and models, into it all just takes it to a whole new level. But you can't do those models until you have those bottom tiers under control.

>> Right, setting that foundation. Building that framework.

>> Exactly.

>> And then applying.
>> What developers of AI applications, particularly those that have been successful, have told us pretty clearly is that building the actual algorithms is not necessarily the hard part. The hard part is making all the data ready for that. And in fact I was reading a survey the other day of actual data scientists and AI developers, and 60% of them said the thing they hate the most is all the data collection, data prep. 'Cause it's so hard. And so a big part of our strategy is just to simplify that. Make it simple and accessible so that you can really focus on what you want to do and where the value is, which is building the algorithms and the models, and getting those deployed.

>> Big challenge and hugely important. I mean, IBM is a 100-year-old company that's going through its own digital transformation. You know, we've had Inderpal Bhandari on talking about how to essentially put data at the core of the company. It's a real hard problem for a lot of companies who were not born, you know, five or seven years ago. And so, putting data at that core and putting human expertise around it, as opposed to maybe having whatever as the core, humans or the plant or the manufacturing facility, that's a big change for a lot of organizations. Now at the end of the day, IBM sells strategy, but the analytics group, you're in the software business, so what offerings do you have to help people get there?

>> Well, in the collect step, it's essentially our hybrid data management portfolio. So think DB2, DB2 Warehouse, DB2 Event Store, which is about IoT data. And that's where big data and Hadoop and all that, with Hortonworks, fits in. So building the ability to access all this data, virtualize it, do things like Queryplex, things of that nature, is where that all sits.

>> Queryplex being the data virtualization capability.

>> Yeah.

>> Get to the data no matter where it is.
>> Define a query and don't worry about where it resides, we'll figure that out for you, kind of thought, right? In the organize step, that is InfoSphere, so that's basically the unified governance and integration part of our portfolio. So again, that is taking the collected data and organizing it, and making sure you're compliant with whatever policies. And making it, you know, business ready, right? And so InfoSphere's where you should look to understand that portfolio better. When you get into scale and analytics on demand, that's Cognos Analytics, that's our Planning Analytics portfolio. And that's essentially the business analytics part of all this. And some data science tools like SPSS, if you're doing statistical analysis, and SPSS Modeler, if you're doing statistical modeling, things of that nature, right? When you get into the automate and the ML everywhere, that's Watson Studio, which is the integrated development environment, right? Not just for IBM Watson; it has a huge array of open technologies in it, like TensorFlow and Python and all those kinds of things. So that's the development environment, and Watson Machine Learning is the runtime that will allow you to run those models anywhere. So those are the two big pieces of that. And then from there you'll see IBM building out more and more of what we already have. But we have Watson applications, like Watson Assistant, Watson Discovery. We have a huge portfolio of Watson APIs for everything from tone to speech, things of that nature. And then the ability to infuse that all into the business processes. That's sort of where you're going to see IBM heading in the future here.

>> I love how you brought that home, and we talked about the ladder and it's more than just a PowerPoint slide. It actually is fundamental to your strategy, it maps to your offerings. So you can get the heads nodding with the customers.
Where are you on this maturity curve, here's how we can help with products and services. And then the other thing I'll mention, you know, we kind of learned when we spoke to some others this week, and we saw some of your announcements previously, is the Red Hat component, which allows you to bring that cloud experience no matter where you are. And you've got technologies to do that, obviously. You know, Red Hat, you guys have been sort of birds of a feather in open source. Because your data is going to live wherever it lives, whether it's on-prem, whether it's in the cloud, whether it's at the edge, and you want to bring sort of a common model. Whether it's containers, Kubernetes, being able to bring that cloud experience to the data. Your thoughts on that?

>> And this is where the big deal comes in. For each one of those tiers, so the DB2 family, InfoSphere, business analytics, Cognos and all that, and Watson Studio, you can get started, purchase those technologies and start to use them, right, as individual products or software as a service. What we're also doing, and this is the more important step into the future, is building all those capabilities into one integrated, unified cloud platform. That's called IBM Cloud Private for Data. Think of that as a unified, collaborative team environment for AI and data science. Completely built on a cloud-native architecture of containers and microservices. That will support a multi-cloud environment. So, IBM Cloud, other clouds, you mentioned Red Hat with OpenShift. So over time, by adopting IBM Cloud Private for Data, you'll get those steps of the ladder all integrated into one unified environment. So you have the ability to buy the unified environment, get involved in that, and it's all integrated, a no-assembly-required kind of thought. Or you could assemble it by buying the individual components, or some combination of both.
So a big part of the strategy is a great deal of flexibility in how you acquire these capabilities and deploy them in your enterprise. There's no one size fits all. We give you a lot of flexibility to do that.

>> And that's a true hybrid vision. I don't have to have just IBM and IBM Cloud, you're recognizing other clouds out there, you're not exclusive like some companies, and that's really important.

>> It's a multi-cloud strategy, it really is, it's a multi-cloud strategy. And that's exactly what we need. We recognize that with most businesses, there are very few that have standardized on only one cloud provider, right? Most of them have multiple clouds, and then a breakdown of dedicated, private, public. And so our strategy is to enable this capability, think of it as a cloud data platform for AI, across all these clouds, regardless of what you have.

>> All right, Scott, thanks for taking us through the strategy. I've always loved talking to you 'cause you're a clear thinker, and you explain things really well in simple terms. A lot of complexity here, but it is really important as the next wave sets up. So thanks very much for your time.

>> Great, always great to be here, thank you.

>> All right, good to see you. All right, thanks for watching, everybody. We are now going to bring it back to CubeNYC, so thanks for watching and we will see you in the afternoon. We've got the panel, the influencer panel, that I'll be running with Peter Burris and John Furrier. So keep it right there, we'll be right back. (upbeat music)

Published Date : Sep 13 2018

Action Item Quick Take | Jim Kobielus - Mar 2018

(Upbeat music) (Coughs)

>> Hi, I'm Peter Burris with another Wikibon Action Item Quick Take. Jim Kobielus, IBM's up to some good with new tooling for managing data. What's going on?

>> Yes, Peter, it's not brand-new tooling, but it's important because it actually is a foreshadowing of what's going to be universal. I think it's a capability for programming the Unigrid, as we've been discussing. Essentially, this week at the IBM Signature event, Sam Lightstone of IBM discussed with Dave Vellante a product they have called Queryplex, which has been on the market for a year or more. Essentially it's a data virtualization environment for distributed query processing in a mesh fabric. And what's important to understand about Queryplex, in a Unigrid context, is that it enables late-binding distributed computation to find the lowest-latency path across fairly complex edge clouds. So, to speed up queries no matter where the data may reside and so forth, in a fairly real-time, dynamic fashion. So I think the important things to know about Queryplex are that it prioritizes connections with the lowest latency, based on ongoing computations that are performed, and that it is able to distribute this computation to find the lowest-latency path across the network, to prevent the computation controller from being a bottleneck. I think that's a fundamental architectural capability we're going to see more of with the advent or the growth of the Unigrid as a broad concept for building up a distributed cloud computing environment.

>> And very importantly, there are still a lot of applications that run the business on top of IBM machines. Jim Kobielus, thanks very much for talking about IBM Queryplex and some of the next steps coming. This is Peter Burris with another Wikibon Action Item Quick Take. (upbeat music)

Published Date : Mar 2 2018

Wrap | Machine Learning Everywhere 2018

>> Narrator: Live from New York, it's theCUBE. Covering Machine Learning Everywhere: Build Your Ladder to AI. Brought to you by IBM.

>> Welcome back to IBM's Machine Learning Everywhere: Build Your Ladder to AI. Along with Dave Vellante, John Walls here, wrapping up here in New York City. Just about done with the programming here in Midtown. Dave, let's just take a step back. We've heard a lot, seen a lot, talked to a lot of folks today. First off, tell me, AI. We've heard some optimistic outlooks, some, I wouldn't say pessimistic, but some folks saying, "Eh, hold off." Not as daunting as some might think. So just your take on the artificial intelligence conversation we've heard so far today.

>> I think generally, John, that people don't realize what's coming. I think the industry in general, our industry, the technology industry, the consumers of technology, the businesses that are out there, they're steeped in the past, that's what they know. They know what they've done, they know the history, and they're looking at that as past equals prologue. Everybody knows that's not the case, but I think it's hard for people to envision what's coming, and what the potential of AI is. Having said that, Jennifer Shin is a near-term pessimist on the potential for AI, and rightly so. There are a lot of implementation challenges. But as we said at the open, I'm very convinced that we are now entering a new era. The Hadoop big data industry is going to pale in comparison to what we're seeing. And we're already seeing very clear glimpses of it. The obvious things are Airbnb and Uber, and the disruptions that are going on with Netflix and over-the-top programming, and how Google has changed advertising, and how Amazon is changing and has changed retail. But what you can see, and again, the best examples are Apple getting into financial services, moving into healthcare, trying to solve that problem. Amazon buying a grocer.
The rumor that I heard about Amazon potentially buying Nordstrom, which my wife said is a horrible idea. (John laughs) But think about it: the fact that they can do that is a function of the fact that they are a digital-first company. They're built around data, and they can take those data models and apply them to different places. Who would have thought, for example, that Alexa would be so successful? That Siri is not so great?

>> Alexa's become our best friend.

>> And it came out of the blue. And it seems like Google has a pretty competitive piece there, but I can almost guarantee that doing this with our thumbs is not the way in which we're going to communicate in the future. It's going to be some kind of natural language interface that's going to rely on artificial intelligence and machine learning and the like. And so, I think it's hard for people to envision what's coming, other than fast-forwarding to where machines take over the world and Stephen Hawking and Elon Musk say, "Hey, we should be concerned." Maybe they're right, but not in the next 10 years.

>> You mentioned Jennifer, we were talking about her and the influencer panel, and we've heard from others as well, it's a combination of human intelligence and artificial intelligence. That combination's more powerful than just artificial intelligence, and so there is a human component to this. So, for those who might be on the edge of their seat a little bit, or looking at this from a slightly more concerned perspective, maybe that's not the case. Maybe not necessary, is what you're thinking.

>> I guess at the end of the day, the question is, "Is the world going to be a better place with all this AI? Are we going to be more prosperous, more productive, healthier, safer on the roads?" I am an optimist, I come down on the side of yes. I would not want to go back to the days where I didn't have GPS. That's worth it to me.

>> Can you imagine, right?
If you did that now, you go back five years, just five years from where we are now, back to where we were. Waze was nowhere, right?

>> All the downside of these things, I feel, is offset by that. And I do think it's incumbent upon the industry to try to deal with the problem, especially with young people, the blue light problem.

>> John: The addictive issue.

>> That's right. But I feel like those downsides are manageable, and the upsides are of enough value that society is going to continue to move forward. And I do think that humans and machines are going to continue to coexist, at least in the near- to mid- to reasonably long-term. But the question is, "What can machines do that humans can't do?" And "What can humans do that machines can't do?" And the answer to that changes every year. It's like I said earlier, not too long ago, machines couldn't climb stairs. They can now, robots can climb stairs. Can they negotiate? Can they identify cats? Who would've imagined that all these cats on the Internet would've led to facial recognition technology? It's improving very, very rapidly. So, I guess my point is that that is changing very rapidly, and there's no question it's going to have an impact on society and an impact on jobs, and all those other negative things that people talk about. To me, the key is, how do we embrace that and turn it into an opportunity? And it's about education, it's about creativity, it's about having multi-talented disciplines that you can tap. So we talked about this earlier, not just being an expert in marketing, but being an expert in marketing with digital as an understanding in your toolbox. So it's that two-tool star that I think is going to emerge. And maybe it's more than two tools. So that's how I see it shaping up. And the last thing is disruption, we talked a lot about disruption. I don't think there's any industry that's safe.
Colin was saying, "Well, certain industries that are highly regulated-" In some respects, I can see those taking longer. But I see those as the most ripe for disruption. Financial services, healthcare. Can't we solve the HIPAA challenge? We can't get access to our own healthcare information. Well, things like artificial intelligence and blockchain, we were talking off-camera about blockchain, those things, I think, can help solve the challenge of, maybe I can carry around my health profile, my medical records. I don't have access to them, it's hard to get them. So can things like artificial intelligence improve our lives? I think there's no question about it.

>> What about, on the other side of the coin, if you will, the misuse concerns? There are a lot of great applications. There are a lot of great services. As you pointed out, a lot of positive, a lot of upside here. But as opportunities become available and technology develops, you run the risk of somebody crossing the line for nefarious means. And there's a lot more at stake now because there's a lot more of us out there, if you will. So, how do you balance that?

>> There's no question that's going to happen. And it has to be managed. But even if you could stop it, I would say you shouldn't, because the benefits are going to outweigh the risks. And again, the question we asked the panelists, "How far can we take machines? How far can we go?" That's question number one. Number two is, "How far should we go?" We're not even close to the "should we go" yet. We're still on the, "How far can we go?" Jennifer was pointing out, I can't get my password reset 'cause I've got to call somebody. That problem will be solved.

>> So, you're saying it's more of a practical consideration now than an ethical one, right now?

>> Right now.
More so, and there are certainly still ethical considerations, don't get me wrong, but I see light at the end of the privacy tunnel. I see artificial intelligence, as well as analytics, helping us solve credit card fraud and things of that nature. Autonomous vehicles are just fascinating, right? Culturally, we talked about that, you know, we learned how to drive a stick shift. (both laugh) It's a funny story you told me.

>> Not going to worry about that anymore, right?

>> But it was an exciting time in our lives, so there's a cultural downside of that. I don't know what the highway death toll number is, but it's enormous. If cell phones caused that many deaths, we wouldn't be using them. So that's a problem that I think things like artificial intelligence and machine intelligence can solve. And then the other big thing that we talked about is, I see a huge gap between traditional companies and these born-in-the-cloud, born-data-oriented companies. We talked about the top five companies by market cap. Microsoft, Amazon, Facebook, Alphabet, which is Google, who am I missing?

>> John: Apple.

>> Apple, right. And those are pretty much data companies. Apple's got the data from the phones, Google, we know where they get their data, et cetera, et cetera. Traditional companies, however, their data resides in silos. Jennifer talked about this, Craig as well as Colin. Data resides in silos, it's hard to get to. It's a very human-driven business and the data is bolted on. With the companies that we just talked about, it's a data-driven business, and the humans have expertise to exploit that data, which is very important. So there's a giant skills gap in existing companies. There are data silos. The other thing we touched on is, where does innovation come from? Innovation drives value drives disruption. So the innovation comes from data. He or she who has the best data wins.
It comes from artificial intelligence, and the ability to apply artificial intelligence and machine learning. And I think something that we take for granted a lot, but it's cloud economics. And it's more than just, and somebody, one of the folks mentioned this in the interview, it's more than just putting stuff in the cloud. It's certainly managed services, that's part of it. But it's also economies of scale. It's marginal economics that are essentially zero. It's speed, it's low latency. It's, and again, global scale. You combine those things, data, artificial intelligence, and cloud economics, that's where the innovation is going to come from. And if you think about what Uber's done, what Airbnb has done, where Waze came from, they were picking and choosing from the best digital services out there, and then developing their own software from what my colleague Dave Misheloff calls this matrix. And, just to repeat, that matrix is, the vertical matrix is industries. The horizontal matrix are technology platforms, cloud, data, mobile, social, security, et cetera. They're building companies on top of that matrix. So, how you leverage the matrix is going to determine your future. Whether or not you get disrupted, whether you're the disruptor or the disruptee. It's not just about, we talked about this at the open. Cloud, SaaS, mobile, social, big data. They're kind of yesterday's news. It's now new artificial intelligence, machine intelligence, deep learning, machine learning, cognitive. We're still trying to figure out the parlance. You could feel the changes coming. I think this matrix idea is very powerful, and how that gets leveraged in organizations ultimately will determine the levels of disruption. But every single industry is at risk. Because every single industry is going digital, digital allows you to traverse industries. We've said it many times today. Amazon went from bookseller to content producer to grocer- >> John: To grocer now, right? 
>> To maybe high-end retailer. Content company, Apple with Apple Pay and companies getting into healthcare, trying to solve healthcare problems. The future of warfare, you live in the Beltway. The future of warfare and cybersecurity are just coming together. One of the biggest issues I think we face as a country is we have fake news, we're seeing the weaponization of social media, as James Scott said on theCUBE. So, all these things are coming together that I think are going to make the last 10 years look tame. >> Let's just switch over to the currency of AI, data. And we've talked to, Sam Lightstone today was talking about the database querying that they've developed with the Plex product. Some fascinating capabilities now that make it a lot richer, a lot more meaningful, a lot more relevant. And that seems to be, really, an integral step to making that stuff come alive and really making it applicable to improving your business. Because they've come up with some fantastic new ways to squeeze data that's relevant out, and get it out to the user. >> Well, if you think about what I was saying earlier about data as a foundational core and human expertise around it, versus what most companies are, is human expertise with data bolted on or data in silos. What was interesting about Queryplex, I think they called it, is it essentially virtualizes the data. Well, what does that mean? That means I can have data in place, but I can have access to that data, I can democratize that data, make it accessible to people so that they can become data-driven, data is the core. Now, what I don't know, and I don't know enough, just heard about it today, I missed that announcement, I think they announced it a year ago. He mentioned DB2, he mentioned Netezza. Most of the world is not on DB2 and Netezza even though IBM customers are. I think they can get to Hadoop data stores and other data stores, I just don't know how wide that goes, what the standards look like. 
He joked about the standards as, the great thing about standards is- >> There are a lot of 'em. (laughs) >> There's always another one you can pick if this one fails. And he's right about that. So, that was very interesting. And so, this is again, the question, can traditional companies close that machine learning, machine intelligence, AI gap? Close being, close the gap that the big five have created. And even the small guys, small guys like Uber and Airbnb, and so forth, but even those guys are getting disrupted. The Airbnbs and the Ubers, right? Again, blockchain comes in and you say, "Why do I need a trusted third party called Uber? Why can't I do this on the blockchain?" I predict you're going to see even those guys get disrupted. And I'll say something else, it's hard to imagine that a Google or a Facebook can be unseated. But I feel like we may be entering an era where this is their peak. Could be wrong, I'm an Apple customer. I don't know, I'm not as enthralled as I used to be. They got trillions in the bank. But is it possible that open source and blockchain and the citizen developer, the weekend and nighttime developers, can actually attack that engine of growth for the last 10 years, 20 years, and really break that monopoly? The Internet has basically become an oligopoly where five companies, six companies, whatever, 10 companies kind of control things. Is it possible that open source software, AI, cryptography, all this activity could challenge the status quo? Being in this business as long as I have, things never stay the same. Leaders come, leaders go. >> I just want to say, never say never. You don't know. >> So, it brings it back to IBM, which is interesting to me. It was funny, I was asking Rob Thomas a question about disruption, and I think he misinterpreted it. I think he was thinking that I was saying, "Hey, you're going to get disrupted by all these little guys." IBM's been getting disrupted for years. They know how to reinvent. 
A lot of people criticize IBM, how many quarters they haven't had growth, blah, blah, blah, but IBM's made some big, big bets on the future. People criticize Watson, but it's going to be really interesting to see how all this investment that IBM has made is going to pay off. They were early on. People in the Valley like to say, "Well, the Facebooks, and even Amazon, Google, they got the best AI. IBM is not there with them." But think about what IBM is trying to do versus what Google is doing. They're very consumer-oriented, solving consumer problems. Consumers have really led the consumerization of IT, that's true, but none of those guys are trying to solve cancer. So IBM is talking about some big, hairy, audacious goals. And I'm not as pessimistic as some others you've seen in the trade press, it's popular to do. So, bringing it back to IBM, I saw IBM as trying to disrupt itself. The challenge IBM has, is it's got a lot of legacy software products that it has purchased over the years. And it's got to figure out how to get through those. So, things like Queryplex allow them to create abstraction layers. Things like Bluemix allow them to bring together their hundreds and hundreds and hundreds of SaaS applications. That takes time, but I do see IBM making some big investments to disrupt themselves. They've got a huge analytics business. We've been covering them for quite some time now. They're a leader, if not the leader, in that business. So, their challenge is, "Okay, how do we now apply all these technologies to help our customers create innovation?" What I like about the IBM story is they're not out saying, "We're going to go disrupt industries." Silicon Valley has a bifurcated disruption agenda. On the one hand, they're trying to, cloud, and SaaS, and mobile, and social, very disruptive technologies. On the other hand, is Silicon Valley going to disrupt financial services, healthcare, government, education? I think they have plans to do so. 
Are they going to be able to execute that dual disruption agenda? Or are the consumers of AI and the doers of AI going to be the ones who actually do the disrupting? We'll see, I mean, Uber's obviously disrupted taxis, Silicon Valley company. Is that too much to ask Silicon Valley to do? That's going to be interesting to see. So, my point is, IBM is not trying to disrupt its customers' businesses, and it can point to Amazon trying to do that. Rather, it's saying, "We're going to enable you." So it could be really interesting to see what happens. You're down in DC, Jeff Bezos spent a lot of time there at the Washington Post. >> We just want the headquarters, that's all we want. We just want the headquarters. >> Well, to the point, if you've got such a growing company monopoly, maybe you should set up an HQ2 in DC. >> Three of the 20, right, for a DC base? >> Yeah, he was saying the other day that, maybe we should think about enhancing, he didn't call it social security, but the government, essentially, helping people plan for retirement and the like. I heard that and said, "Whoa, is he basically telling us he's going to put us all out of jobs?" (both laugh) So, that, if I'm a customer of Amazon's, I'm kind of scared. So, one of the things they should absolutely do is spin out AWS, I think that helps solve that problem. But, back to IBM, Ginni Rometty was very clear at the World of Watson conference, the inaugural one, that we are not out trying to compete with our customers. I would think that resonates with a lot of people. >> Well, to be continued, right? Next month, back with IBM again? Right, three days? >> Yeah, I think third week in March. Monday, Tuesday, Wednesday, theCUBE's going to be there. Next week we're in the Bahamas. This week, actually. >> Not as a group taking vacation. Actually a working expedition. >> No, it's that blockchain conference. Actually, it's this week, what am I saying next week? 
>> Although I'm happy to volunteer to grip on that shoot, by the way. >> Flying out tomorrow, it's happening fast. >> Well, enjoyed this, always good to spend time with you. And good to spend time with you as well. So, you've been watching theCUBE, machine learning everywhere. Build your ladder to AI. Brought to you by IBM. Have a good one. (techno music)

Published Date : Feb 27 2018


Sam Lightstone, IBM | Machine Learning Everywhere 2018

>> Narrator: Live from New York, it's the Cube. Covering Machine Learning Everywhere: Build Your Ladder to AI. Brought to you by IBM. >> And welcome back here to New York City. We're at IBM's Machine Learning Everywhere: Build Your Ladder to AI, along with Dave Vellante, John Walls, and we're now joined by Sam Lightstone, who is an IBM fellow in analytics. And Sam, good morning. Thanks for joining us here once again on the Cube. >> Yeah, thanks a lot. Great to be back. >> Yeah, great. Yeah, good to have you here on kind of a moldy New York day here in late February. So we're talking, obviously data is the new norm, is what certainly, have heard a lot about here today and of late here from IBM. Talk to me about, in your terms, of just when you look at data and evolution and to where it's now become so central to what every enterprise is doing and must do. I mean, how do you do it? Give me a 30,000-foot level right now from your prism. >> Sure, I mean, from a super, if you just stand back, like way far back, and look at what data means to us today, it's really the thing that is separating companies one from the other. How much data do they have and can they make excellent use of it to achieve competitive advantage? And so many companies today are about data and only data. I mean, I'll give you some like really striking, disruptive examples of companies that are tremendously successful household names and it's all about the data. So the world's largest transportation company, or personal taxi, can't call it taxi, but (laughs) but, you know, Uber-- >> Yeah, right. >> Owns no cars, right? The world's largest accommodation company, Airbnb, owns no hotels, right? The world's largest distributor of motion pictures owns no movie theaters. So these companies are disrupting because they're focused on data, not on the material stuff. Material stuff is important, obviously. Somebody needs to own a car, somebody needs to own a way to view a motion picture, and so on. 
But data is what differentiates companies more than anything else today. And can they tap into the data, can they make sense of it for competitive advantage? And that's not only true for companies that are, you know, cloud companies. That's true for every company, whether you're a bricks-and-mortar organization or not. Now, one level of that data is to simply look at the data and ask questions of the data, the kinds of data that you already have in your mind. Generating reports, understanding who your customers are, and so on. That's sort of a fundamental level. But the deeper level, the exciting transformation that's going on right now, is the transformation from reporting and what we'll call business intelligence, the ability to take those reports and that insight on data and to visualize it in the way that human beings can understand it, and go much deeper into machine learning and AI, cognitive computing where we can start to learn from this data and learn at the pace of machines, and to drill into the data in a way that a human being cannot because we can't look at bajillions of bytes of data on our own, but machines can do that and they're very good at doing that. So it is a huge, that's one level. The other level is, there's so much more data now than there ever was because there's so many more devices that are now collecting data. And all of us, you know, every one of our phones is collecting data right now. Your cars are collecting data. I think there's something like 60 sensors on every car that rolls off the manufacturing line today. 60. So it's just a wild time and a very exciting time because there's so much untapped potential. And that's what we're here about today, you know. Machine learning, tapping into that unbelievable potential that's there in that data. >> So you're absolutely right on. I mean the data is foundational, or must be foundational in order to succeed in this sort of data-driven world. 
But it's not necessarily the center of the universe for a lot of companies. I mean, it is for the big data, you know, guys that we all know. You know, the top market cap companies. But so many organizations, they're sort of, human expertise is at the center of their universe, and data is sort of, oh yeah, bolt on, and like you say, reporting. >> Right. >> So how do they deal with that? Do they get one big giant DB2 instance and stuff all the data in there, and infuse it with MI? Is that even practical? How do they solve this problem? >> Yeah, that's a great question. And there's, again, there's a multi-layered answer to that. But let me start with the most, you know, one of the big changes, one of the massive shifts that's been going on over the last decade is the shift to cloud. And people think of the shift to cloud as, well, I don't have to own the server. Someone else will own the server. That's actually not the right way to look at it. I mean, that is one element of cloud computing, but it's not, for me, the most transformative. The big thing about the cloud is the introduction of fully-managed services. It's not just you don't own the server. You don't have to install, configure, or tune anything. Now that's directly related to the topic that you just raised, because people have expertise, domains of expertise in their business. Maybe you're a manufacturer and you have expertise in manufacturing. If you're a bank, you have expertise in banking. You may not be a high-tech expert. You may not have deep skills in tech. So one of the great elements of the cloud is that now you can use these fully managed services and you don't have to be a database expert anymore. You don't have to be an expert in tuning SQL or JSON, or yadda yadda. Someone else takes care of that for you, and that's the elegance of a fully managed service, not just that someone else has got the hardware, but they're taking care of all the complexity. And that's huge. 
The other thing that I would say is, you know, the companies that are really like the big data houses, they got lots of data, they've spent the last 20 years working so hard to converge their data into larger and larger data lakes. And some have been more successful than others. But everybody has found that that's quite hard to do. Data is coming in from many places, in many different repositories, and trying to consolidate, you know, rip the data out, constantly ripping it out and replicating into some data lake where you, or data warehouse where you can do your analytics, is complicated. And it means in some ways you're multiplying your costs because you have the data in its original location and now you're copying it into yet another location. You've got to pay for that, too. So you're multiplying costs. So one of the things I'm very excited about at IBM is we've been working on this new technology that we've now branded as IBM Queryplex. And that gives us the ability to query data across all of these myriad sources as if they are in one place. As if they are a single consolidated data lake, and make it all look like (snaps) one repository. And not only does it appear to the application as one repository, but it actually taps into the processing power of every one of those data sources. So if you have 1,000 of them, we'll bring to bear the power of 1,000 data sources and all that computing and all that memory on these analytics problems. >> Well, give me an example why that matters, of what would be a real-world application of that. >> Oh, sure, so there, you know, there's a couple of examples. I'll give you two extremes, two different extremes. One extreme would be what I'll call enterprise, enterprise data consolidation or virtualization, where you're a large institution and you have several of these repositories. Maybe you got some IBM repositories like DB2. Maybe you've got a little bit of Oracle and a little bit of SQL Server. 
Maybe you've got some open source stuff like Postgres or MySQL. You got a bunch of these and different departments use different things, and it develops over decades and to some extent you can't even control it, (laughs) right? And now you just want to get analytics on that. You just, what's this data telling me? And as long as all that data is sitting in these, you know, dozens or hundreds of different repositories, you can't tell, unless you copy it all out into a big data lake, which is expensive and complicated. So Queryplex will solve that problem. >> So it's sort of a virtual data store. >> Yeah, and one of the terms, many different terms that are used, but one of the terms that's used in the industry is data virtualization. So that would be a suitable terminology here as well. To make all that data in hundreds, thousands, even millions of possible data sources, appear as one thing, it has to tap into the processing power of all of them at once. Now, that's one extreme. Let's take another extreme, which is even more extreme, which is the IoT scenario, Internet of Things, right? Internet of Things. Imagine you've, have devices, you know, shipping containers and smart meters on buildings. You could literally have 100,000 of these or a million of these things. They're usually small; they don't usually have a lot of data on them. But they can store, usually, couple of months of data. And what's fascinating about that is that most analytics today are really on the most recent you know, 48 hours or four weeks, maybe. And that time is getting shorter and shorter, because people are doing analytics more regularly and they're interested in, just tell me what's going on recently. >> I got to geek out here, for a second. >> Please, well thanks for the warning. (laughs) >> And I know you know things, but I'm not a, I'm not a technical person, but I've been a molt. I've been around a long time. A lot of questions on data virtualization, but let me start with Queryplex. 
The name is really interesting to me. When I, and you're a database expert, so I'm going to tap your expertise. When I read the Google Spanner paper, I called up my colleague David Floyer, who's an ex-IBM, I said, "This is like global Sysplex. It's a global distributed thing." And he goes, "Yeah, kind of." And I got very excited. And then my eyes started bleeding when I read the paper, but the name, Queryplex, is it a play on Sysplex? Is there-- >> It's actually, there's a long story. I don't think I can say the story on-air, but we, suffice it to say we wanted to get a name that was legally usable and also descriptive. >> Dave: Okay. >> And we went through literally hundreds and hundreds of permutations of words and we finally landed on Queryplex. But, you know, you mentioned Google Spanner. I probably should spend a moment to differentiate how what we're doing is-- >> Great, if you would. >> A different kind of thing. You know, on Google Spanner, you put data into Google Spanner. With Queryplex, you don't put data into it. >> Dave: Don't have to move it. >> You don't have to move it. You leave it where it is. You can have your data in DB2, you can have it in Oracle, you can have it in a flat file, you can have an Excel spreadsheet, and you know, think about that. An Excel spreadsheet, a collection of text files, comma delimited text files, SQL Server, Oracle, DB2, Netezza, all these things suddenly appear as one database. So that's the transformation. It's not about we'll take your data and copy it into our system, this is about leave your data where it is, and we're going to tap into your (snaps) existing systems for you and help you see them in a unified way. So it's a very different paradigm than what others have done. Part of the reason why we're so excited about it is we're, as far as we know, nobody else is really doing anything quite like this. 
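As a rough illustration of the "leave your data where it is" idea described above, here is a minimal Python sketch that answers one logical query across a comma-delimited text source and a relational database without first copying either into a shared store. It is not Queryplex and uses no IBM API; the sample data and the `scan_csv`, `scan_sqlite`, and `unified_totals` helpers are hypothetical names invented for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a comma-delimited text file (held in memory here).
CSV_TEXT = "region,amount\neast,100\nwest,250\n"

def make_sqlite_source():
    """Hypothetical source 2: a relational database with its own table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 40), ("north", 75)])
    return conn

def scan_csv(text):
    """Yield (region, amount) rows read directly from the text source."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["region"], int(row["amount"])

def scan_sqlite(conn):
    """Yield (region, amount) rows by querying the database in place."""
    yield from conn.execute("SELECT region, amount FROM sales")

def unified_totals(sources):
    """Answer one logical query (total amount per region) across all sources."""
    totals = {}
    for rows in sources:
        for region, amount in rows:
            totals[region] = totals.get(region, 0) + amount
    return totals

conn = make_sqlite_source()
totals = unified_totals([scan_csv(CSV_TEXT), scan_sqlite(conn)])
print(sorted(totals.items()))
```

Note that this sketch streams every row to one place to do the arithmetic, so it shows only the unified view. It does not push any of the work down to the sources themselves, which is the part Sam emphasizes.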
>> And is that what gets people to the 21st century, basically, is that they have all these legacy systems and yet the conversion is much simpler, much more economical for them? >> Yeah, exactly. It's economical, it's fast. (snaps) You can deploy this in, you know, a very small amount of time. And we're here today talking about machine learning and it's a very good segue to point out in order to get to high-quality AI, you need to have a really strong foundation of an information architecture. And for the industry to show up, as some have done over the past decade, and keep telling people to re-architect their data infrastructure, keep modifying their databases and creating new databases and data lakes and warehouses, you know, it's just not realistic. And so we want to provide a different path. A path that says we're going to make it possible for you to have superb machine learning, cognitive computing, artificial intelligence, and you don't have to rebuild your information architecture. We're going to make it possible for you to leverage what you have and do something special. >> This is exciting. I wasn't aware of this capability. And we were talking earlier about the cloud and the managed service component of that as a major driver of lowering cost and complexity. There's another factor here, which is, we talked about moving data-- >> Right. >> And that's one of the most expensive components of any infrastructure. If I got to move data and the transmission costs and the latency, it's virtually impossible. Speed of light's still up. I know you guys are working on speed of light, but (Sam laughs) you'll eventually get there. >> Right. >> Maybe. But the other thing about cloud economics, and this relates to sort of Queryplex. There's this API economy. You've got virtually zero marginal costs. When you were talking, I was writing these down. You got global scale, it's never down, you've got this network effect working for you. 
Are you able to, are the standards there? Are you able to replicate those sort of cloud economics the APIs, the standards, that scale, even though you're not in control of this, there's not a single point of control? Can you explain sort of how that magic works? >> Yeah, well I think the API economy is for real and it's very important for us. And it's very important that, you know, we talk about API standards. There's a beautiful quote I once heard. The beautiful thing about standards is there's so many to choose from. (All laugh) And the reality is that, you know, you have standards that are official standards, and then you have the de facto standards because something just catches on and nobody blessed it. It just got popular. So that's a big part of what we're doing at IBM is being at the forefront of adopting the standards that matter. We made a big, a big investment in being Spark compatible, and, in fact, even with Queryplex. You can issue Spark SQL against Queryplex even though it's not a Spark engine, per se, but we make it look and feel like it can be Spark SQL. Another critical point here, when we talk about the API economy, and the speed of light, and movement to the cloud, and these topics you just raised, the friction of the Internet is an unbelievable friction. (John laughs) It's unbelievable. I mean, you know, when you go and watch a movie over the Internet, your home connection is just barely keeping up. I mean, you're pushing it, man. So a gigabyte, you know, a gigabyte an hour or something like that, right? Okay, and if you're a big company, maybe you have a fatter pipe. But not a lot fatter. I mean, not orders of, you're talking incredible friction. And what that means is that it is difficult for people, for companies, to en masse, move everything to the cloud. It's just not happening overnight. 
And, again, in the interest of doing the best possible service to our customers, that's why we've made it a fundamental element of our strategy in IBM to be a hybrid, what we call hybrid data management company, so that the APIs that we use on the cloud, they are compatible with the APIs that we use on premises. And whether that's software or private cloud. You've got software, you've got private cloud, you've got public cloud. And our APIs are going to be consistent across, and applications that you code for one will run on the other. And you can, that makes it a lot easier to migrate at your leisure when you're ready. >> Makes a lot of sense. That way you can bring cloud economics and the cloud operating model to your data, wherever the data exists. Listening to you speak, Sam, it reminds me, do you remember when Bob Metcalfe who I used to work with at IDG, predicted the collapse of the Internet? He predicted that year after year after year, in speech after speech, that it was so fragile, and you're bringing back that point of, guys, it's still, you know, a lot of friction. So that's very interesting, (laughs) as an architect. >> You think Bob's going to be happy that you brought up that he predicted the Internet was going to be its own demise? (Sam laughs) >> Well, he did it in-- >> I'm just saying. >> I'm staying out of it, man. >> He did it as a lightning rod. >> As a talking-- >> To get the industry to respond, and he had a big enough voice so he could do that. >> That it worked, right. But so I want to get back to Queryplex and the secret sauce. Somehow you're creating this data virtualization capability. What's the secret sauce behind it? >> Yeah, so I think, we're not the first to try, by the way. Actually this problem-- >> Hard problem. >> Of all these data sources all over the place, you try to make them look like one thing. People have been trying to figure out how to do that since like the '70s, okay, so, but-- >> Dave: Really hasn't worked. 
>> And it hasn't worked. And really, the reason why it hasn't worked is that there's been two fundamental strategies. One strategy is, you have a central coordinator that tries to speak to each of these data sources. So I've got, let's say, 10,000 data sources. I want to have one coordinator tap into each of them and have a dialogue. And what happens is that that coordinator, a server, an agent somewhere, becomes a network bottleneck. You were talking about the friction of the Internet. This is a great example of friction. One coordinator trying to speak to, you know, n collaborators becomes a point of friction. And it also becomes a point of friction not only in the Internet, but also in the computation, because he ends up doing too much of the work. There's too many things that cannot be done at the, at these edge repositories, aggregations, and joins, and so on. So all the aggregations and joins get done by this one sucker who can't keep up. >> Dave: The queue. >> Yeah, so there's a big queue, right. So that's one strategy that didn't work. The other strategy that people tried was sort of an n-squared topology where every data source tries to speak to every other data source. And that doesn't scale as well. So what we've done in Queryplex is something that we think is unique and much more organic where we try to organize the universe or constellation of these data sources so that every data source speaks to a small number of peers but not a large number of peers. And that way no single source is a bottleneck, either in network or in computation. That's one trick. And the second trick is we've designed algorithms that can truly be distributed. So you can do joins in a distributed manner. You can do aggregation in a distributed manner. These are things, you know, when I say aggregation, I'm talking about simple things like a sum or an average or a median. These are super popular in, in analytic queries. 
Everybody wants to do a sum or an average or a median, right? But in the past, those things were hard to do in a distributed manner, getting all the participants in this universe to do some small incremental piece of the computation. So it's really these two things. Number one, this organic, dynamically forming constellation of devices. Dynamically forming in a way that is latency aware. So if I'm a, if I represent a data source that's joining this universe or constellation, I'm going to try to find peers who I have a fast connection with. If all the universe of peers were out there, I'll try to find ones that are fast. And the second is having algorithms that we can all collaborate on. Those two things change the game. >> We're getting the two minute sign, and this is fascinating stuff. But so, how do you deal with the data consistency problem? You hear about eventual consistency and people using atomic clocks and-- >> Right, so Queryplex, you know, there's a reason we call it Queryplex not Dataplex. Queryplex is really a read-only operation. >> Dave: Oh, there you go. >> You've got all these-- >> Problem solved. (laughs) >> Problem solved. You've got all these data sources. They're already doing their, they already have data's coming in how it's coming in. >> Dave: Simple and brilliant. >> Right, and we're not changing any of that. All we're saying is, if you want to query them as one, you can query them as one. I should say a few words about the machine learning that we're doing here at the conference. We've talked about the importance of an information architecture and how that lays a foundation for machine learning. But one of the things that we're showing and demonstrating at the conference today, or at the showcase today, is how we're actually putting machine learning into the database. Create databases that learn and improve over time, learn from experience. 
In 1952, Arthur Samuel was a researcher at IBM who first had one of the most fundamental breakthroughs in machine learning when he created a machine learning algorithm that will play checkers. And he programmed this checker playing game of his so it would learn over time. And then he had a great idea. He programmed it so it would play itself, thousands and thousands and thousands of times over, so it would actually learn from its own mistakes. And, you know, the evolution since then. Deep Blue playing chess and so on. The Watson Jeopardy game. We've seen tremendous potential in machine learning. We're putting it into the database so databases can be smarter, faster, more consistent, and really just out of the box (snaps) performing. >> I'm glad you brought that up. I was going to ask you, because the legend Steve Mills once said to me, I had asked him a question about in-memory databases. He said ever since databases have been around, in-memory databases have been around. But ML-infused databases are new. >> Sam: That's right, something totally new. >> Dave: Yeah, great. >> Well, you mentioned Deep Blue. Looking forward to having Garry Kasparov on a little bit later on here. And I know he's speaking as well. But fascinating stuff that you've covered here, Sam. We appreciate the time here. >> Thank you, thanks for having me. >> And wish you continued success, as well. >> Thank you very much. >> Sam Lightstone, IBM fellow joining us here live on theCUBE. We're back with more here from New York City right after this. (electronic music)
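
As an illustration of the two ideas Sam describes above (each data source talking to a small number of low-latency peers, and aggregations computed in small pieces across the participants), here is a minimal Python sketch. It is not Queryplex code and makes no claim about how IBM implements this: the Peer class, pick_peers, and distributed_average are hypothetical names invented for the example, and the peer graph is simplified to a tree.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical data source holding some local rows and a few child peers."""
    name: str
    values: list[float]
    children: list["Peer"] = field(default_factory=list)

    def partial(self) -> tuple[float, int]:
        """Return (sum, count) for this peer plus everything below it.

        Each peer combines only its own rows with its children's partial
        results, so no single node has to see every row.
        """
        total, count = sum(self.values), len(self.values)
        for child in self.children:
            child_total, child_count = child.partial()
            total += child_total
            count += child_count
        return total, count


def pick_peers(latencies_ms: dict[str, float], k: int) -> list[str]:
    """Assumed latency-aware choice: keep the k candidates with the lowest latency."""
    return sorted(latencies_ms, key=latencies_ms.get)[:k]


def distributed_average(root: Peer) -> float:
    """Average over all peers, built from the (sum, count) partials."""
    total, count = root.partial()
    return total / count if count else 0.0


# Small usage example with made-up data.
leaf = Peer("c", [6.0])
tree = Peer("root", [1.0, 2.0], [Peer("a", [3.0], [leaf]), Peer("b", [4.0, 5.0])])
average = distributed_average(tree)
nearest = pick_peers({"a": 30.0, "b": 5.0, "c": 12.0}, k=2)
```

Sums and averages decompose this way because a (sum, count) pair can be merged at every hop. A median cannot be rebuilt from such pairs, so it would need a different distributed technique, which this sketch does not attempt.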

Published Date : Feb 27 2018


