James Kobielus, Wikibon | The Skinny on Machine Intelligence
>> Announcer: From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now here's your host, Dave Vellante.
>> In the early days of big data and Hadoop, the focus was really on operational efficiency, where ROI was largely centered on reduction of investment. Fast forward 10 years and you're seeing a plethora of activity around machine learning, deep learning, artificial intelligence, and deeper business integration as a function of machine intelligence. Welcome to this Cube conversation, The Skinny on Machine Intelligence. I'm Dave Vellante, and I'm excited to have Jim Kobielus here, up from the District area. Jim, great to see you. Thanks for coming into the office today.
>> Thanks a lot, Dave. Yes, great to be here in beautiful Marlboro, Massachusetts.
>> Yes, so you know, Jim, when you think about all the buzzwords in this big data business, I have to ask you: is this just sort of same wine, new bottle, when we talk about all this AI and machine intelligence stuff?
>> It's actually new wine. But of course there are various bottles, they have different vintages, and much of that wine is still quite tasty, so let me break it out for you, the skinny on machine intelligence. AI as a buzzword and as a set of practices really goes back, of course, to the early post-World War II era, to Alan Turing and the Imitation Game and so forth. There were other developers, theorists, and academics in the '40s, '50s, and '60s who pioneered this field, so we don't want to give Alan Turing too much credit, but he was clearly a mathematician who laid down the theoretical framework for much of what we now call artificial intelligence. When you look at artificial intelligence as an ever-evolving set of practices, though, it began in an area focused on deterministic rules, rule-driven expert systems, and that was really the state of the art of AI for a long, long time. So you had expert systems in a variety of areas that became useful in business, science, government, and so forth. Cut ahead to the turn of the millennium, we are now in the 21st century, and what's different, the new wine, is big data: larger and larger data sets that can reveal great insights, patterns, and correlations that might be highly useful if you have the right statistical modeling tools and approaches to surface those patterns in an automated or semi-automated fashion. So one of the core areas is what we now call machine learning, which really means using statistical models to infer correlations, anomalies, trends, and so forth in the data itself. The core approach for machine learning is something called artificial neural networks, which essentially build a statistical model along the lines of how, at a very high level, the nervous system is made up, with neurons connected by synapses and so forth. The statistical analog of the neuron is called a perceptron. The whole theoretical framework of perceptrons actually got started in the 1950s with the first flush of AI, but it didn't become a practical reality until after the turn of this millennium, really after the turn of this particular decade, 2010, when we started to see very large big data sets emerge and new approaches for managing it all, like Hadoop, come to the fore.
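To make the perceptron idea just described concrete, here is a minimal sketch in Python with NumPy. It is purely illustrative: the two inputs, the weights, and the bias are made-up values, not anything taken from the conversation, and a trained perceptron would learn its weights from data rather than have them hand-picked.

```python
import numpy as np

def perceptron(x, weights, bias):
    """Weighted sum of inputs followed by a step activation: fire (1) or not (0)."""
    z = np.dot(weights, x) + bias   # the "synaptic" weighting of the inputs
    return 1 if z > 0 else 0        # threshold activation

# Toy example: two inputs, hand-picked illustrative weights
w = np.array([0.6, 0.4])
b = -0.5
print(perceptron(np.array([1.0, 0.0]), w, b))  # -> 1, the neuron "fires"
print(perceptron(np.array([0.0, 0.0]), w, b))  # -> 0, it does not
```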
We've also seen artificial neural nets get more sophisticated in their capabilities, and a new approach to machine learning, artificial neural networks with deeper layers of perceptrons, or neurons, called deep learning, has come to the fore. With deep learning, you have new algorithms like convolutional neural networks, recurrent neural networks, and generative adversarial networks. These are different ways of surfacing higher-level abstractions in the data, for example for face recognition, object recognition, voice recognition, and so forth. These all depend on this new state of the art for machine learning called deep learning. So what we have now in the year 2017 is quite a mania for all things AI. Much of it is focused on deep learning, and much of it is focused on tools that your average data scientist or developer increasingly can use and get very productive with, to build these models, train and test them, and deploy them into working applications. Going forward, things like autonomous vehicles would be impossible without this.
>> Right, and we'll get to some of that. But so you're saying that machine learning is essentially math that infers patterns from data. And is it new math, or math that's been around for a while?
>> Yeah, and inferring patterns from data has been done for a long time with software. We have some established approaches that in many ways predate the current vogue for neural networks: support vector machines, decision trees, Bayesian logic. These are different statistical approaches for inferring patterns and correlations in the data. They haven't gone away; they're a big part of the overall AI space, but it's a growing area that I've only skimmed the surface of.
>> And they've been around for many, many years, like SVM for example. Okay, now describe further, add some color to deep learning. You've painted a picture of deep layers of these machine learning algorithms, a network with some depth to it, but help us better understand the difference between machine learning and deep learning, and then ultimately AI.
>> Yeah, well, machine learning generally is inferring patterns from data, as I said, and artificial neural networks, of which deep learning networks are one subset, are the core approach. Artificial neural networks can be two or more layers of perceptrons, or neurons, that relate to each other in terms of their activation according to various mathematical functions. So when you look at an artificial neural network, it's basically doing very complex math through a combination of what they call scalar functions, like multiplication and so forth, and then non-linear functions, like cosine and tangent, all of that math playing together in these deep structures that are triggered by data. Data input is processed according to activation functions that set and reset the weights among all the various neural processing elements, which ultimately output something, the insight or the intelligence you're looking for, like a yes or no: is this a face or not a face that these incoming bits are representing? Or it might present output in terms of categories: what category of face is this, a man, a woman, a child, or whatever.
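As a rough illustration of that layered math, the sketch below chains a linear (scalar) transform, a tanh non-linearity, and a second layer that scores a few categories. Everything here is hypothetical: the layer sizes, the random weights, and the three category labels are stand-ins, and a real network would learn its weights from training data rather than draw them at random.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2):
    """Two-layer forward pass: linear transforms plus non-linear activations."""
    h = np.tanh(W1 @ x + b1)                       # hidden layer: weights, then tanh non-linearity
    logits = W2 @ h + b2                           # output layer: one score per category
    return np.exp(logits) / np.exp(logits).sum()   # softmax: probabilities over categories

# Hypothetical sizes: 4 input features, 3 hidden neurons, 3 output categories
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)

x = np.array([0.2, -0.1, 0.7, 0.05])               # stand-in for pixel or feature data
print(forward(x, W1, b1, W2, b2))                  # e.g. probabilities for man / woman / child
```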
What I'm getting at is that deep learning is more layers of these neural processing elements, specialized to various functions, so you can abstract higher-level phenomena from the data. It's not just, "Is this a face?" If it's a scene-recognition deep learning network, it might recognize that this is a face that corresponds to a person named Dave, who also happens to be the father in this particular family scene, and by the way, this is a family scene, which the deep learning network is also able to ascertain. Those are the higher-level abstractions that deep learning algorithms of various sorts are built to identify in an automated way.
>> Okay, and these in your view all fit under the umbrella of artificial intelligence, or is that sort of an uber field that we should be thinking of?
>> Yeah, artificial intelligence as the broad envelope essentially refers to any number of approaches that help machines to think like humans. When you say "think like humans," what does that actually mean? To do predictions like humans, to look for anomalies or outliers like a human might, to separate figure from ground in a scene, to identify the correlations or trends in a given scene. Like I said, to do categorization or classification based on what they're seeing in a given frame or what they're hearing in a given speech sample. All these cognitive processes only skim the surface of what AI is all about automating to a great degree. And when I say cognitive, I'm also referring to affective processes like emotion detection, another set of processes that goes on in our heads, or our hearts, that AI based on deep learning is able to handle. Different types of artificial neural networks are specialized to particular functions, and they can only perform those functions if, A, they've been built and optimized for those functions, and B, they've been trained with actual data from the phenomenon of interest. Training the algorithms with actual data to determine how effective they are is the key linchpin of the process, because without training you don't know whether an algorithm is effective for its intended purpose. So at Wikibon, what we're focused on is the whole development process, the DevOps cycle, for all things AI, where training the models through a process called supervised learning is absolutely essential to ascertaining the quality of the network you've built.
>> So that's the calibration and the iteration to increase the accuracy and, like I say, the quality of the outcome. Okay, what are some of the practical applications you're seeing for AI, and ML, and DL?
>> Well, chat bots, voice recognition in general, Siri and Alexa, and so forth. Without machine learning, without deep learning to do speech recognition, those can't work, right? It's showing up in pretty much every field now, for example in IT service management tools of all sorts. When you have a large network that's logging data at the server level, at the application level, and so forth, those data logs are too large, too complex, and changing too fast for humans to be able to identify the patterns related to issues and faults and incidents. So AI, machine learning, deep learning is being used to fathom those anomalies in an automated fashion, either to alert a human to take action, like an IT administrator, or to trigger a response workflow, either human or automated.
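To illustrate that supervised-learning linchpin in the IT service management setting just described, here is a hedged sketch using scikit-learn, which is an assumption on our part; the conversation names no toolkit. The "log-derived" features and their labels are synthetic stand-ins for real telemetry, and the decision tree is simply one of the classical approaches mentioned earlier; the point is training on labeled data and checking quality on held-out data before trusting the model to raise alerts.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic labeled stand-in for log features: [requests/min, avg latency ms, error count]
normal = rng.normal(loc=[200, 50, 2], scale=[20, 5, 1], size=(900, 3))
incidents = rng.normal(loc=[220, 400, 60], scale=[30, 50, 10], size=(100, 3))
X = np.vstack([normal, incidents])
y = np.array([0] * 900 + [1] * 100)   # 1 = a fault/incident a human has labeled

# Supervised learning: fit to labeled history, then check quality on held-out data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))

# In production, a flagged reading would alert an admin or trigger a response workflow
print("new reading is an incident?", bool(model.predict([[210, 480, 95]])[0]))
```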
So AI within IT service management is a hot, hot topic, and we're seeing a lot of vendors incorporate that capability into their tools. Like I said, in the broader world we live in, take face recognition and Facebook: when I load a new picture of myself or my family, or even one with some friends or my brothers in it, Facebook knows lickety-split whether it's my brother Tom or my wife or whoever, because of face recognition, which obviously depends, well, it's not obvious to everybody, depends on deep learning algorithms running inside Facebook's big data infrastructure. They're able to know this immediately. We see this all around us now, speech recognition, face recognition, and we just take it for granted, but it's done through the magic of AI.
>> I want to get to the development angle that you specialize in. Part of the reason you came to Wikibon is to really focus on that whole application development angle. But before we get there, I want to follow the data for a bit, 'cause you mentioned that was really the catalyst for the resurgence in AI, and last week at the Wikibon research meeting we talked about this three-tiered model: the edge piece, then something in the middle which is the aggregation point for all this edge data, and then the cloud, which is where I guess all the deep modeling occurs. So sort of a three-tier model for the data flow.
>> Jim: Yes.
>> So I wonder if you could comment on that in the context of AI. It means more data, more opportunities I guess for machine learning and digital twins, and all this other cool stuff that's going on. But I'm really interested in how that is going to affect application development and the programming model. John Furrier has a phrase he uses: "Data is the new development kit." Well, if you've got all this data distributed all over the place, that changes the application development model, at least you think it does. So I wonder if you could comment on that edge explosion, the data explosion as a result, and what it means for application development.
>> Right, so more and more deep learning algorithms are being pushed to edge devices, by which I mean smartphones and smart appliances like the ones that incorporate Alexa and so forth. What we're talking about is the algorithms themselves being put into CPUs and FPGAs and ASICs and GPUs. All of that is getting embedded in everything we use. More and more devices have the ability either to be autonomous in terms of making decisions independent of us, or simply to serve as augmentation vehicles for whatever we happen to be doing, thanks to the power of deep learning at the client. Okay, so when deep learning algorithms are embedded in, say, an internet of things edge device, what the deep learning algorithms are doing is, A, ingesting data through the sensors of that device, and B, making inferences, deep learning algorithm-driven inferences, based on that data. It might be speech recognition, face recognition, environmental sensing, being able to sense geospatially where you are and whether you're in a hospitable climate for whatever. And then the inferences might drive what we call actuation. Now in the autonomous vehicle scenario, the autonomous vehicle is equipped with all manner of sensors, LiDAR and sonar and GPS and so forth, and it's taking readings all the time.
It's making inferences either autonomously or in conjunction with inferences being made by deep learning and machine learning algorithms executing in those intermediary hubs you described, or back in the cloud, or in a combination of all of that. But ultimately, the results of all those analytics, all those deep learning models, feed what we call the actuation of the car itself. Should it stop? Should it put on the brakes because it's about to hit a wall? Should it turn right, turn left, slow down because it happens to have entered a new speed zone? All of the decisions, the actions that the edge device takes, and a car would be an edge device in this scenario, are being driven by ever more complex algorithms that are trained by data. Now, let's stay with the autonomous vehicle because that's an extreme case of a very powerful edge device. To train an autonomous vehicle you of course need lots and lots of data, acquired from, A, prototypes that you, a Google or a Tesla or whoever you might be, have deployed into the field or that your customers are using, and B, proving grounds, like the one out by my old stomping ground in Ann Arbor, a proving ground for the auto industry's self-driving vehicles, where you gather enough real training data from the operation of these vehicles in various simulated scenarios and so forth. This data is used to build, iterate, and refine the algorithms, the deep learning models, that drive the various operations of not only the vehicles in isolation but the vehicles operating as a fleet within an entire end-to-end transportation system. So what I'm getting at is, if you look at that three-tier model, the edge device is the car, running under its own algorithms; the middle tier, the hub, might be a hub controlling a particular zone within a traffic system, like in my neck of the woods it might be a hub controlling congestion management among self-driving vehicles in eastern Fairfax County, Virginia; and then the cloud itself might be managing an entire fleet of vehicles, say a fleet under the control of an Uber or whoever is managing its own cars from a cloud-based center. So when you look at the tiering model, deep learning analytics will increasingly be performed through this tiered model, and not just for self-driving vehicles, because the edge device needs to make decisions based on local data, the hub needs to make decisions based on a wider view of data across a wider range of edge entities, and the cloud has responsibility and visibility for making deep-learning-driven determinations for some larger swath. And the cloud might be managing both the deep-learning-driven edge devices as well as monitoring other related systems that the self-driving network needs to coordinate with, like the government or the police.
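A plain-Python sketch of that three-tier flow, purely illustrative: the class names, fields, and thresholds below are hypothetical, and a real deployment would run trained deep learning models at each tier rather than simple rules. The edge layer ingests a reading and actuates, the hub aggregates one traffic zone, and the cloud rolls up a fleet-wide view.

```python
from dataclasses import dataclass

@dataclass
class SensorReading:
    vehicle_id: str
    speed: float               # m/s
    obstacle_distance: float   # metres, e.g. from lidar/sonar (hypothetical fields)

def edge_decide(reading: SensorReading) -> str:
    """Tier 1 - the vehicle itself: actuate immediately on local data."""
    return "brake" if reading.obstacle_distance < 10 else "cruise"

def hub_aggregate(readings: list[SensorReading]) -> dict:
    """Tier 2 - a zone hub: a wider view across the vehicles in one zone."""
    avg_speed = sum(r.speed for r in readings) / len(readings)
    return {"vehicles": len(readings), "avg_speed": avg_speed,
            "congested": avg_speed < 5 and len(readings) > 50}

def cloud_fleet_view(zone_reports: list[dict]) -> dict:
    """Tier 3 - the cloud: visibility across all zones and the whole fleet."""
    return {"zones": len(zone_reports),
            "congested_zones": sum(1 for z in zone_reports if z["congested"])}

# Example flow: local actuation, zone rollup, fleet rollup
readings = [SensorReading("car-1", 12.0, 8.0), SensorReading("car-2", 14.0, 40.0)]
print([edge_decide(r) for r in readings])   # ['brake', 'cruise']
print(cloud_fleet_view([hub_aggregate(readings)]))
```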
>> So envisioning that three-tier model then, how does the programming paradigm change and evolve as a result?
>> Yeah, the programming paradigm is that the modeling itself, the building, training, and iterating of the models, generally will stay centralized. To do all these functions, I mean to do the modeling, training, and iteration of these models, you need teams of data scientists and other developers who are adept at statistical modeling, who are adept at acquiring the training data and labeling it, labeling is an important function there, and who are adept at developing and deploying one model after another in an iterative fashion through DevOps, through a standard release pipeline with version controls and governance built in. That really needs to be a centralized function, and it's also very compute- and data-intensive, so you need storage resources and large clouds full of high-performance computing to handle these functions over and over. Now, the edge devices themselves will feed in the data, just the data, that goes into the centralized platform where the training and the modeling are done. So what we're going to see is more and more centralized modeling and training with decentralized execution of the actual inferences that are driven by those models; that's the way it works in this distributed environment.
>> It's the Holy Grail. All right, Jim, we're out of time, but thanks very much for helping us unpack and giving us the skinny on machine learning.
>> Jim: It's a fat stack.
>> Great to have you in the office, and to be continued. Thanks again.
>> Jim: Sure.
>> All right, thanks for watching, everybody. This is Dave Vellante with Jim Kobielus, and you're watching theCUBE at the Marlboro offices. See ya next time. (upbeat music)
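As a rough sketch of that centralized-training, decentralized-inference pattern, assuming scikit-learn and NumPy (the conversation prescribes no specific stack, and the data here is synthetic): a model is trained centrally, only its learned parameters are shipped as a release artifact, and a lightweight edge-side function runs the inference with nothing but those parameters.

```python
import json
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# --- Centralized side: data scientists train and version the model in the cloud ---
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

# Export only the learned parameters, e.g. as a release-pipeline artifact
artifact = json.dumps({"coef": model.coef_[0].tolist(),
                       "intercept": float(model.intercept_[0])})

# --- Edge side: a lightweight device runs inference using only the parameters ---
params = json.loads(artifact)

def edge_infer(features):
    """Decentralized execution: score one reading with the centrally trained weights."""
    z = np.dot(params["coef"], features) + params["intercept"]
    return 1 if 1 / (1 + np.exp(-z)) > 0.5 else 0

print(edge_infer(X[0]))   # same decision the centralized model would make
```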