Machine Learning Applied to Computationally Difficult Problems in Quantum Physics


 

>> My name is Franco Nori. It is a great pleasure to be here, and I thank you for attending this meeting. I will be talking about some of the work we are doing within the NTT-PHI group. I would like to thank the organizers for putting together this very interesting event. The topics studied by NTT-PHI are very exciting, and I am glad to be part of this great team. Let me first start with a brief overview of just a few interactions between our team and other groups within NTT-PHI. After this brief overview of these interactions, I am going to start talking about machine learning and neural networks applied to computationally difficult problems in quantum physics. The first question I would like to raise is the following: is it possible to have decoherence-free interaction between qubits? The solution proposed some years ago by a postdoc, a visitor, and myself was to study decoherence-free interaction between giant atoms made of superconducting qubits in the context of waveguide quantum electrodynamics. The theoretical prediction was confirmed by a very nice experiment performed by Will Oliver's group at MIT, published a few months ago in Nature under the title "Waveguide quantum electrodynamics with superconducting artificial giant atoms." This is the first joint MIT-Michigan Nature paper during this NTT-PHI grant period, and we are very pleased with it. I look forward to having additional collaborations like this one, also with other NTT-PHI groups. Another collaboration inside NTT-PHI concerns the quantum Hall effect in rapidly rotating polariton condensates. This work is mainly driven by two people, Michael Fraser and Yoshihisa Yamamoto; they are the main driving forces of this project, and it has been great fun.
We are also interacting inside the NTT-PHI environment with the groups of Marandi at Caltech, McMahon at Cornell, Oliver at MIT, and, as I mentioned before, Fraser and Yamamoto at NTT; others at NTT-PHI are also very welcome to interact with us. NTT-PHI is interested in various topics, including how to use neural networks to solve computationally difficult and important problems. Let us now look at one example of using neural networks to study computationally hard problems. Everything we will be talking about today is mostly work in progress, to be extended and improved in the future. The first example I would like to discuss is a topological quantum phase transition retrieved through manifold learning, which is a variant of machine learning. This work was done in collaboration with Che, Gneiting, and Liu, all members of the group; a preprint is available on the arXiv. Some groups are studying quantum-enhanced machine learning, where machine learning is supposed to run on actual quantum computers to exploit exponential speed-ups and quantum error correction. We are not working on that kind of thing; we are doing something different. We are studying how to apply machine learning to quantum problems. For example, how to identify quantum phases and phase transitions, which we shall be talking about right now. Or how to perform quantum state tomography in a more efficient manner; that is another work of ours, which I will be showing later on. And how to assist experimental data analysis, which is a separate project that we recently published but that I will not discuss today: experiments can produce massive amounts of data, and machine learning can help to understand the huge tsunami of data these experiments provide. Machine learning can be either supervised or unsupervised. Supervised learning requires human-labeled data. So here the blue dots have one label and the red dots have a different label.
And the question is whether new data corresponds to the blue category or the red category. Many machine learning introductions use the example of identifying cats and dogs; that is the typical example. However, there are also cases where no labels are provided. You are looking at the cluster structure, and you need to define a metric, a distance between the different points, to be able to group them together into clusters. Manifold learning is ideally suited to problems that are nonlinear and unsupervised. When you use principal component analysis along the principal axes, shown here in green, you can identify a simple structure with a linear projection: along this axis you get the red dots in one area and the blue dots down here. But in general you could get red, green, yellow, and blue dots arranged in a complicated manner, and the correlations are better seen when you do a nonlinear embedding. In unsupervised learning the colors represent similarities, not labels, because there are no prior labels here. We are interested in using machine learning to identify topological quantum phases. This requires looking at the actual phases and their boundaries, starting from a set of Hamiltonians or wave functions. Recall that this is difficult to do because there is no symmetry breaking and there are no local order parameters, and in complicated cases you cannot compute the topological properties analytically, while numerically it is very hard. Machine learning is therefore enriching the toolbox for studying topological quantum phase transitions. Before our work, there were quite a few groups looking at supervised machine learning. The shortcomings are that you need prior knowledge of the system and that the data must be labeled for each phase; this is needed in order to train the neural networks.
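The linear-projection idea just described can be sketched in a few lines of code. This is an illustrative example, not from the talk: a tiny principal component analysis in NumPy on a toy dataset of two clusters, where the cluster positions and noise level are made-up assumptions; projecting onto the first principal axis already separates the two simple clusters, which is exactly the case where a linear method suffices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian clusters ("blue" and "red" dots) in 3-D feature space.
blue = rng.normal(loc=[0, 0, 0], scale=0.5, size=(100, 3))
red = rng.normal(loc=[4, 4, 4], scale=0.5, size=(100, 3))
X = np.vstack([blue, red])

# Principal component analysis via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]  # projection onto the first principal axis

# For clusters that are linearly separated, this 1-D linear projection
# already splits them: the two groups land on opposite sides of zero.
print(pc1[:100].mean(), pc1[100:].mean())
```

For data with nonlinear structure (say, one class curled around another), no such single linear axis exists, which is where the nonlinear embeddings discussed next come in.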
More recently, in the past few years, there has been an increased push toward unsupervised learning and nonlinear embeddings. One of the shortcomings we have seen is that these works all use the Euclidean distance, which is a natural way to construct the similarity matrix, but we have shown that it is suboptimal: the Chebyshev distance provides better performance. The difficulty is that detecting topological quantum phase transitions is a challenge, because there are no local order parameters. A few years ago, three or so, we thought that machine learning might provide effective methods for identifying topological features, and in the past two years several groups have been moving in this direction. We have shown that one type of machine learning, called manifold learning, can successfully retrieve topological quantum phase transitions in momentum and real space. We have also shown that if you use the Chebyshev distance between data points, as opposed to the Euclidean distance, you sharpen the characteristic features of these topological quantum phases in momentum space. Afterwards, a so-called diffusion map or isometric map can be applied to implement the dimensionality reduction and to learn about these phases and phase transitions in an unsupervised manner. So this is a summary of this work on how to characterize and study topological phases. The examples we used are canonical models like the SSH model, the QWZ model, and the quenched SSH model. We looked at momentum space and real space, and we found that the method works very well in all of these models. Moreover, it provides implications and demonstrations for learning also in real space, where the topological invariants could be either unknown or hard to compute.
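The two ingredients just mentioned, a similarity matrix built from a distance metric and a diffusion-map embedding, can be sketched as follows. This is an illustrative NumPy version, not the procedure from the paper: the toy data, the Gaussian kernel, and the median-based kernel width are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))  # toy feature vectors (stand-ins for states)

# Pairwise distances: Euclidean (L2) vs Chebyshev (L-infinity, max component).
diff = X[:, None, :] - X[None, :, :]
d_eucl = np.sqrt((diff ** 2).sum(-1))
d_cheb = np.abs(diff).max(-1)

# The Chebyshev distance never exceeds the Euclidean one; it keeps only the
# single largest feature difference, which can sharpen contrasts.
assert (d_cheb <= d_eucl + 1e-12).all()

# Diffusion map: Gaussian similarity matrix, row-normalized into a Markov
# transition matrix; the top non-trivial eigenvectors give the embedding.
eps = np.median(d_cheb) ** 2  # kernel width (a common heuristic)
K = np.exp(-d_cheb ** 2 / eps)
P = K / K.sum(axis=1, keepdims=True)
vals, vecs = np.linalg.eig(P)
order = np.argsort(-vals.real)
embedding = vecs.real[:, order[1:3]]  # 2-D diffusion embedding
print(embedding.shape)
```

In the unsupervised setting, clusters in this low-dimensional embedding (rather than prior labels) are what identify the distinct phases.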
So it provides insight in both momentum space and real space, and the capability of manifold learning in exploring topological quantum phase transitions is very good, especially when you have a suitable metric. This is one area we would like to keep working on: topological phases and how to detect them. Of course, there are other problems where neural networks can be useful for solving computationally hard and important problems in quantum physics. One of them is quantum state tomography, which is important for evaluating the quality of state-production experiments. The problem is that quantum state tomography scales really badly: it is practically impossible to perform beyond roughly 20 qubits, and for much larger systems, forget it, it is not going to work. So we have a very important procedure, quantum state tomography, which cannot be done at scale because of a computationally hard bottleneck. Machine learning is designed to efficiently handle big data, so the question we were asking a few years ago was: can machine learning help us to overcome this bottleneck in quantum state tomography? This became a project called eigenstate extraction with neural-network tomography, with the student Melkani and a research scientist of the group, Clemens Gneiting; I will be brief in summarizing it now. The specific machine learning paradigm is standard artificial neural networks, which in the past couple of years have been shown to be successful for tomography of pure states. Our approach is to carry this over to mixed states, by successively reconstructing the eigenstates of the mixed state. So it is an iterative procedure where you slowly approach the desired target state. If you wish to see more details, this has been recently published in Physical Review A and was selected as an Editors' Suggestion; I mean, some of the referees liked it.
So tomography is very hard to do, but it is important, and machine learning with neural networks can help us achieve mixed-state tomography using an iterative eigenstate reconstruction. Why is it so challenging? Because you are trying to reconstruct quantum states from measurements. For a single qubit there are just a few Pauli matrices, so there are very few measurements to make; when you have N qubits, the N appears in the exponent, the number of measurements grows exponentially, and this exponential scaling makes the computation very difficult. It is prohibitively expensive for large system sizes. The bottleneck is this exponential dependence on the number of qubits; by the time you get to 20 or 24 qubits, it is impossible. It gets even worse: experimental data is noisy, and therefore you need to consider maximum-likelihood estimation in order to reconstruct the quantum state that fits the measurements best, and again this is expensive. There was a seminal work some time ago on ion traps where the post-processing for eight qubits took an entire week. Different ideas were proposed, compressed sensing to reduce the number of measurements, linear regression, et cetera, but they all have problems, and you quickly hit a wall; there is no way to avoid it. Indeed, the initial estimate was that doing tomography for a 14-qubit state would take centuries, and you cannot support a graduate student for a century, because you need to pay their retirement benefits, and it simply gets complicated. So a team here some time ago looked at the question of how to do a full reconstruction of 14-qubit states within four hours; actually it was 3.3 hours. Many experimental groups told us that this was a very popular paper to read and study, because they wanted to do fast quantum state tomography; they could not support a student for one or two centuries.
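The exponential scaling is easy to see numerically. An N-qubit density matrix has 4^N − 1 independent real parameters, so the number of Pauli-string expectation values needed for full tomography grows like 4^N; the short illustration below simply tabulates this count for the system sizes mentioned in the talk.

```python
# The N-qubit Pauli strings are tensor products of I, X, Y, Z on each qubit,
# so there are 4**N of them; a density matrix has 4**N - 1 independent real
# parameters to estimate. The count explodes with the number of qubits.
for n in (1, 2, 8, 14, 20):
    print(f"{n:2d} qubits -> {4 ** n:,} Pauli strings")
```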
They wanted to get the results quickly. To obtain these density matrices, you need to do all these measurements; with N qubits the number of expectation values grows like four to the N, because the number of Pauli strings becomes much larger, and maximum likelihood makes it even more time consuming. This is the paper by the group in Innsbruck with the one-week post-processing, and there have been speed-ups by different groups down to hours, including the work on 14-qubit tomography in four hours using linear regression. But the next question is: can machine learning help with quantum state tomography? Can it give us the tools to take the next step and improve it even further? The standard setup is this one here: a neural network has some inputs, x1, x2, x3, some weighting factors, and an output function phi, a nonlinear activation function that could be a Heaviside step, sigmoid, piecewise linear, logistic, or hyperbolic tangent. This creates a decision boundary in input space, where you get, say, the red dots on the left and the blue dots on the right, with some separation between them. You could have two layers, three layers, or any number of layers, either shallow or deep. This allows you to approximate any continuous function. You train on data via some cost-function minimization. There are different varieties of neural nets; we are looking at a so-called restricted Boltzmann machine. Restricted means that the input-layer spins are not talking to each other and the output-layer spins are not talking to each other. We got reasonably good results with an input layer and an output layer, no hidden layer, where the probability of finding a spin configuration is given by the Boltzmann factor. So we try to leverage pure-state tomography for mixed-state tomography, by an iterative process where you start here.
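To make the Boltzmann-factor idea concrete, here is a minimal sketch of a restricted Boltzmann machine in NumPy, small enough that the partition function can be summed exactly. It is an illustration, not the network from the talk: the layer sizes, binary units, and random parameters are assumptions for the example.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
n_v, n_h = 3, 2  # sizes of the two layers
a = rng.normal(size=n_v)  # biases of the first layer
b = rng.normal(size=n_h)  # biases of the second layer
W = rng.normal(size=(n_v, n_h))  # couplings BETWEEN the layers only:
# "restricted" means spins within the same layer do not talk to each other.

def boltzmann_weight(v, h):
    # Unnormalized probability of a joint spin configuration (Boltzmann factor).
    return np.exp(a @ v + b @ h + v @ W @ h)

# For this tiny model we can enumerate all 2**n_v * 2**n_h configurations
# and normalize by the partition function Z.
configs = [(np.array(v), np.array(h))
           for v in product([0, 1], repeat=n_v)
           for h in product([0, 1], repeat=n_h)]
Z = sum(boltzmann_weight(v, h) for v, h in configs)
probs = [boltzmann_weight(v, h) / Z for v, h in configs]
print(sum(probs))
```

In tomography applications the model parameters are trained so that these probabilities reproduce the measurement statistics; for realistic sizes the exact sum over configurations is replaced by sampling.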
So the mixed states are in the blue area, and the pure states are on the boundary here. The initial state is here, and with the iterative process you get closer and closer to the actual mixed state; eventually, once you get here, you make the final jump inside. So you are looking at the dominant eigenstate, which is the closest pure state; you then compute some measurements and run an iterative algorithm that makes you approach the desired state. After you do that, you can compare the results with data. We had data for four to eight trapped-ion qubits, where approximate W states were produced, and the dominant eigenstate is reliably reconstructed for N equal to four, five, six, seven, and eight. For the eigenvalues we are still working, because we are getting some results that are not as accurate as we would like; that is still work in progress, but for the eigenstates it is working really well. The cost scaling is beneficial: it goes like N times R, as opposed to N squared, and the most relevant information on the quality of the state production is retrieved directly. This works for flexible rank. So it is possible to extract the eigenstates with neural-network tomography; it is cost-effective and scalable, delivers the most relevant information about state generation, and is an interesting and viable use case for machine learning in quantum physics. More recently we have also been working on quantum state tomography using conditional generative adversarial networks, with a master's student, a PhD student, and two former postdocs. CGAN refers to these conditional generative adversarial networks. In this framework you have two neural networks that are essentially dueling, competing with each other: one of them is called the generator, the other is called the discriminator, and they learn multi-modal models from the data.
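The notion of iteratively approaching the dominant eigenstate of a mixed state can be illustrated with plain power iteration on a density matrix. This is a stand-in sketch, not the neural-network procedure from the talk: the rank-3 test state, its spectrum, and the iteration count are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

# Build a random rank-3 mixed state rho = sum_i p_i |psi_i><psi_i| (3 qubits).
dim = 2 ** 3
p = np.array([0.7, 0.2, 0.1])  # assumed spectrum; trace(rho) = 1
states = []
for _ in range(3):
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    states.append(psi / np.linalg.norm(psi))
rho = sum(pi * np.outer(s, s.conj()) for pi, s in zip(p, states))

# Iteratively approach the dominant eigenstate (the closest pure state):
# repeatedly apply rho to a random vector and renormalize.
v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
v /= np.linalg.norm(v)
for _ in range(200):
    v = rho @ v
    v /= np.linalg.norm(v)

# Check against exact diagonalization: fidelity with the true dominant state.
w, V = np.linalg.eigh(rho)
fidelity = abs(np.vdot(V[:, -1], v)) ** 2
print(fidelity)
```

The neural-network version replaces the explicit matrix with a parameterized ansatz trained on measurement data, but the spirit, peeling off the dominant eigenstate first, is the same.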
We then improved this by adding a custom neural network layer that enables converting the outputs of any standard neural network into a physical density matrix. To reconstruct the density matrix, the generator and discriminator networks must train against each other on data using standard gradient-based methods. We demonstrated that our quantum state tomography with these adversarial networks can reconstruct an optical quantum state with very high fidelity, orders of magnitude faster and from less data than standard maximum-likelihood methods. So we are excited about this. We also showed that this adversarial-network tomography can reconstruct a quantum state in a single evaluation of the generator network if it has been pre-trained on similar quantum states; otherwise it requires some additional training. All of this is still work in progress, with some preliminary results written up, but we are continuing. I would like to thank all of you for attending this talk, and thanks again for the invitation.

Published Date : Sep 26 2020


Tapping Vertica's Integration with TensorFlow for Advanced Machine Learning


 

>> Paige: Hello, everybody, and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled "Tapping Vertica's Integration with TensorFlow for Advanced Machine Learning." I'm Paige Roberts, Opensource Relations Manager at Vertica, and I'll be your host for this session. Joining me is Vertica Software Engineer, George Larionov. >> George: Hi. >> Paige: (chuckles) That's George. So, before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait; just type your question or comment in the question box below the slides and click submit. As soon as a question occurs to you, go ahead and type it in, and there will be a Q&A session at the end of the presentation. We'll answer as many questions as we're able to during that time, and any questions we don't get to, we'll do our best to answer offline. Alternatively, you can visit the Vertica forum to post your questions there after the session. Our engineering team is planning to join the forums to keep the conversation going, so you can ask an engineer afterwards, just as if it were a regular in-person conference. Also, a reminder: you can maximize your screen by clicking the double-arrow button in the lower right corner of the slides. And, before you ask, yes, this virtual session is being recorded, and it will be available to view by the end of this week. We'll send you a notification as soon as it's ready. Now, let's get started. Over to you, George. >> George: Thank you, Paige. So, I've been introduced. I'm a Software Engineer at Vertica, and today I'm going to be talking about a new feature, Vertica's integration with TensorFlow. First, I'm going to go over what TensorFlow and neural networks are. Then, I'm going to talk about why integrating with TensorFlow is a useful feature, and, finally, I'm going to talk about the integration itself and give an example.
So, as we get started here, what is TensorFlow? TensorFlow is an opensource machine learning library, developed by Google, and it's actually one of many such libraries. The whole point of libraries like TensorFlow is to simplify the process of working with neural networks, such as creating, training, and using them, so that it's available to everyone, as opposed to just a small subset of researchers. So, neural networks are computing systems that allow us to solve various tasks. Traditionally, computing algorithms were designed completely from the ground up by engineers like me, and we had to manually sift through the data and decide which parts are important for the task and which are not. Neural networks aim to solve this problem, a little bit, by sifting through the data themselves, automatically finding traits and features which correlate to the right results. So, you can think of it as neural networks learning to solve a specific task by looking through the data, without having human beings sit and sift through the data themselves. There are a couple of necessary parts to getting a trained neural model, which is the final goal. By the way, a neural model is the same as a neural network; those are synonymous. First, you need this light blue circle, an untrained neural model, which is pretty easy to get in TensorFlow, and, in addition to that, you need your training data. This involves both training inputs and training labels, and I'll talk about exactly what those two things are on the next slide. But, basically, you need to train your model with the training data, and, once it is trained, you can use your trained model to predict on new inputs, just the purple circle, and it will predict the labels for you. You don't have to label the data anymore. So, training a neural network can be thought of as teaching a person how to do something.
For example, if I want to learn to speak a new language, let's say French, I would probably hire some sort of tutor to help me with that task, and I would need a lot of practice constructing and saying sentences in French, along with a lot of feedback from my tutor on whether my pronunciation, grammar, et cetera, is correct. That would take me some time, but, finally, hopefully, I would be able to learn the language and speak it without any feedback and get it right. In a very similar manner, a neural network needs to practice on example training data first, and, along with that data, it needs labeled data. In this case, the labeled data is analogous to the tutor: it is the correct answers, so that the network can learn what those look like. But, ultimately, the goal is to predict on unlabeled data, which is analogous to me knowing how to speak French. So, I went over most of the bullets. A neural network needs a lot of practice, and, to do that, it needs a lot of good labeled data. Finally, since a neural network needs to iterate over the training data many, many times, it needs a powerful machine which can do that in a reasonable amount of time. So, here's a quick checklist of what you need if you have a specific task that you want to solve with a neural network. The first thing you need is a powerful machine for training; we discussed why this is important. Then, you need TensorFlow installed on the machine, of course, and you need a dataset and labels for your dataset. Now, this dataset can be hundreds of examples, thousands, sometimes even millions. I won't go into that, because the dataset size really depends on the task at hand, but if you have these four things, you can train a good neural network that will predict whatever result you want it to predict.
So, we've talked about neural networks and TensorFlow, but the question is: if we already have a lot of built-in machine-learning algorithms in Vertica, then why do we need TensorFlow? To answer that question, let's look at this dataset. This is a pretty simple toy dataset with 20,000 points, but it simulates a more complex dataset with two different classes which are not related in a simple way. The existing machine-learning algorithms that Vertica already has mostly fail on this pretty simple dataset. Linear models can't really draw a good line separating the two types of points; Naïve Bayes also performs pretty badly; and even the Random Forest algorithm, which is a pretty powerful algorithm, gets only 80% accuracy with 300 trees. However, a neural network with only two hidden layers gets 99% accuracy in about ten minutes of training. So, I hope that's a pretty compelling reason to use neural networks, at least sometimes. As an aside, there are plenty of tasks that do fit the existing machine-learning algorithms in Vertica. That's why they're there, and if one of the tasks you want to solve fits one of the existing algorithms, then I would recommend using that algorithm, not TensorFlow, because, while neural networks have their place and are very powerful, it's often easier to use an existing algorithm, if possible. Okay, so, now that we've talked about why neural networks are needed, let's talk about integrating them with Vertica. Neural networks are best trained using GPUs, Graphics Processing Units, which are, basically, a different kind of processing unit than a CPU. GPUs are good for training neural networks because they excel at doing many, many simple operations at the same time, which is needed for a neural network to be able to iterate through the training data many times. However, Vertica runs on CPUs and cannot run on GPUs at all, because that's not how it was designed.
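To make concrete why classes that are "not related in a simple way" defeat linear models, here is a tiny illustration (not the dataset from the talk): the classic XOR pattern cannot be separated by any single line, but a network with one hidden layer handles it. The hand-picked weights below are assumptions chosen for clarity; a trained network would find equivalent ones.

```python
import numpy as np

# XOR: the smallest example of two classes no single line can separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

step = lambda z: (z > 0).astype(int)  # simple threshold activation

# A single linear unit (no hidden layer) gets at most 3 of these 4 points
# right, no matter its weights. One hidden layer fixes this.
# Hand-picked weights: hidden unit 1 computes (x1 OR x2),
# hidden unit 2 computes (x1 AND x2), and the output is OR minus 2*AND.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -2.0])
b2 = -0.5

h = step(X @ W1 + b1)    # hidden layer activations
out = step(h @ W2 + b2)  # network output, matches y
print(out)
```

The hidden layer bends the decision boundary; with enough hidden units and smooth activations, this is what lets a two-hidden-layer network reach 99% on the toy dataset where linear models fail.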
So, to train our neural networks, we have to go outside of Vertica, and exporting a small batch of training data is pretty simple, so that's not really a problem. But, given this information, why do we even need Vertica? If we train outside, then why not do everything outside of Vertica? To answer that question, here is a slide that Philips was nice enough to let us use. This is an example of a production system at Philips. It consists of two branches. On the left, we have a branch with historical device log data, and this can be thought of as a bunch of training data. All that data goes through some data integration and data analysis. Basically, this is where you train your models, whether or not they are neural networks, but, for the purpose of this talk, this is where you would train your neural network. And, on the right, we have a branch which has live device log data coming in from various MRI machines, CAT scan machines, et cetera, and this is a ton of data. These machines are constantly running, they're constantly on, and there are a bunch of them, so data just keeps streaming in. We don't want this data to have to take any unnecessary detours, because that would greatly slow down the whole system. So, the data in the right branch goes through an already trained predictive model, which needs to be pretty fast, and, finally, it allows Philips to do some maintenance on these machines before they actually break, which helps Philips, obviously, and definitely the medical industry as well. So, I hope this slide helped explain the complexity of a live production system and why it might not be reasonable to train your neural networks directly in the system with the live device log data. A quick summary of just the neural networks section: neural networks are powerful, but they need a lot of processing power to train, which can't really be provided in a production pipeline. However, they are cheap and fast to predict with.
Prediction with a neural network does not require a GPU anymore. And, they can be very useful in production, so we do want them there; we just don't want to train them there. So, the question is, now, how do we get neural networks into production? We have, basically, two options. The first option is to take the data and export it to our machine with TensorFlow, our powerful GPU machine, or we can take our TensorFlow model and put it where the data is; in this case, let's say that that is Vertica. So, I'm going to go through some pros and cons of these two approaches. The first one is bringing the data to the analytics. The pros of this approach are that TensorFlow is already installed and running on this GPU machine, and we don't have to move the model at all. The cons, however, are that we have to transfer all the data to this machine, and if that data is big, gigabytes, terabytes, et cetera, then that becomes a huge bottleneck, because you can only transfer in small quantities, since GPU machines tend to not be that big. Furthermore, TensorFlow prediction doesn't actually need a GPU, so you would end up paying for an expensive GPU for no reason. It's not parallelized, because you just have one GPU machine, and you can't put your production system on this GPU machine, as we discussed. So, you're left with good results, but not fast and not where you need them. Now, let's look at the second option: bringing the analytics to the data. The pros of this approach are that we can integrate with our production system. It's low impact, because prediction is not processor intensive. It's cheap, or, at least, pretty much as cheap as your system was before. It's parallelized, because Vertica was always parallelized, which we'll talk about in the next slide. There's no extra data movement.
You get the benefit of model management in Vertica, meaning, if you import multiple TensorFlow models, you can keep track of their various attributes, when they were imported, et cetera. And, the results are right where you need them, inside your production pipeline. The two cons are that TensorFlow is limited to just prediction inside Vertica, and, if you want to retrain your model, you need to do that outside of Vertica and then reimport. So, just as a recap of parallelization: everything in Vertica is parallelized and distributed, and TensorFlow is no exception. When you import your TensorFlow model to your Vertica cluster, it gets copied to all the nodes, automatically, and TensorFlow will run in fenced mode, which means that if the TensorFlow process fails, for whatever reason, even though it shouldn't, Vertica itself will not crash, which is obviously important. And, finally, prediction happens on each node. There are multiple threads of TensorFlow processes running, processing different little bits of data, which is much faster than processing the data line by line, because it all happens in a parallelized fashion. And, so, the result is fast prediction. So, here's an example which I hope is a little closer to what everyone is used to than the usual machine learning TensorFlow example. This is the Boston housing dataset, or, rather, a small subset of it. On the left, we have the input data, to go back to, I think, the first slide, and, on the right, is the training label. The input data consists of lines, where each line is a plot of land in Boston, along with various attributes, such as the level of crime in that area, how much industry is in that area, whether it's on the Charles River, et cetera, and, on the right, we have as the labels the median house value in that plot of land. And, so, the goal is to put all this data into the neural network and, finally, get a model which can...
I don't know, which can predict on new incoming data and predict a good housing value for that data. Now, I'm going to go through, step by step, how to actually use TensorFlow models in Vertica. So, the first step I won't go into much detail on, because there are countless tutorials and resources online on how to use TensorFlow to train a neural network, so that's the first step. The second step is to save the model in TensorFlow's 'frozen graph' format. Again, this information is available online. The third step is to create a small, simple JSON file describing the inputs and outputs of the model, and what data type they are, et cetera. And, this is needed for Vertica to be able to translate from TensorFlow land into Vertica SQL land, so that it can use a SQL table instead of the input format TensorFlow usually takes. So, once you have your model file and your JSON file, you want to put both of those files in a directory on a node, any node, in a Vertica cluster, and name that directory whatever you want your model to ultimately be called inside of Vertica. So, once you do that, you can go ahead and import that directory into Vertica. So, this import models function already exists in Vertica. All we added was a new category to be able to import. So, what you need to do is specify the path to your neural network directory and specify that the model's category is TensorFlow. Once you successfully import, in order to predict, you run this brand new predict TensorFlow function, so, in this case, we're predicting on everything from the input table, which is what the star means. The model name is Boston housing net, which is the name of your directory, and, then, there's a little bit of boilerplate. And, the two identifiers, ID and value, after the AS are just the names of the columns of your outputs, and, finally, the Boston housing data is whatever SQL table you want to predict on that fits the input type of your network.
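The packaging steps above (a model directory holding the frozen graph plus a JSON descriptor, then one SQL statement to import and one to predict) can be sketched as follows. The JSON field names and the exact parameter spellings are illustrative assumptions reconstructed from the talk, not verified against Vertica's documentation, so treat them as a sketch rather than a reference:

```python
import json
import os
import tempfile

# Sketch of the packaging step: a model directory holding the frozen
# graph plus a JSON file describing the model's inputs and outputs.
# The descriptor field names below are illustrative, not Vertica's
# exact schema -- check the Vertica docs for the real layout.
model_dir = os.path.join(tempfile.mkdtemp(), "boston_housing_net")
os.makedirs(model_dir)

descriptor = {
    "input": [{"name": "dense_input", "type": "float", "dims": [13]}],
    "output": [{"name": "dense_output", "type": "float", "dims": [1]}],
}
with open(os.path.join(model_dir, "tf_model_desc.json"), "w") as f:
    json.dump(descriptor, f, indent=2)

# The two SQL statements described in the talk: import the directory
# as a TENSORFLOW-category model, then predict straight from a table.
# Parameter names here are assumptions based on the talk's wording.
import_sql = (
    "SELECT IMPORT_MODELS('{}' "
    "USING PARAMETERS category='TENSORFLOW');".format(model_dir)
)
predict_sql = (
    "SELECT PREDICT_TENSORFLOW(* USING PARAMETERS "
    "model_name='boston_housing_net') "
    "AS (id, value) FROM boston_housing_data;"
)
print(import_sql)
print(predict_sql)
```

The directory name becomes the model name inside Vertica, which is why the sketch names the temp directory `boston_housing_net` to match the talk's example.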
And, this will output a bunch of predictions. In this case, values of houses that the network thinks are appropriate for all the input data. So, just a quick summary. So, we talked about what is TensorFlow and what are neural networks, and, then, we discussed that TensorFlow works best on GPUs because its workload has very specific characteristics. That is, TensorFlow works best for training on GPUs, while Vertica is designed to use CPUs, and it's really good at storing and accessing a lot of data quickly. But, it's not very well designed for having neural networks trained inside of it. Then, we talked about how neural models are powerful, and we want to use them in our production flow. And, since prediction is fast, we can go ahead and do that, but we just don't want to train there, and, finally, I presented Vertica TensorFlow integration, which allows importing a trained neural model, a trained TensorFlow model, into Vertica and predicting on all the data that is inside Vertica with a few simple lines of SQL. So, thank you for listening. I'm going to take some questions, now.
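The recap's point that prediction needs no GPU is easy to see in miniature: once training has fixed the weights, a forward pass is nothing but multiply-adds, which a CPU handles fine. A minimal sketch with a made-up two-unit network (all weights and inputs invented for illustration, not taken from the talk's Boston housing model):

```python
# A trained network's forward pass is just fixed-weight arithmetic.
# The tiny network below is invented purely for illustration.
W = [[0.5, -0.2], [0.1, 0.4]]   # hidden-layer weights (assumed)
b = [0.1, -0.1]                 # hidden-layer biases (assumed)
v = [1.0, -1.0]                 # output-layer weights (assumed)

def predict_row(x):
    # One ReLU hidden layer feeding a single linear output unit.
    hidden = [max(0.0, sum(w * xi for w, xi in zip(w_row, x)) + bi)
              for w_row, bi in zip(W, b)]
    return sum(vi * hi for vi, hi in zip(v, hidden))

# Scoring rows is a cheap loop of multiply-adds; no GPU involved.
predictions = [predict_row(x) for x in [[1.0, 2.0], [0.0, 0.0], [3.0, -1.0]]]
```

Training those weights is the expensive, GPU-friendly part; applying them, as here, is the kind of light work that can run inside the database, next to the data.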

Published Date : Mar 30 2020

Wrap | Machine Learning Everywhere 2018


 

>> Narrator: Live from New York, it's theCUBE. Covering machine learning everywhere. Build your ladder to AI. Brought to you by IBM. >> Welcome back to IBM's Machine Learning Everywhere. Build your ladder to AI, along with Dave Vellante, John Walls here, wrapping up here in New York City. Just about done with the programming here in Midtown. Dave, let's just take a step back. We've heard a lot, seen a lot, talked to a lot of folks today. First off, tell me, AI. We've heard some optimistic outlooks, some, I wouldn't say pessimistic, but some folks saying, "Eh, hold off." Not as daunting as some might think. So just your take on the artificial intelligence conversation we've heard so far today. >> I think generally, John, that people don't realize what's coming. I think the industry, in general, our industry, technology industry, the consumers of technology, the businesses that are out there, they're steeped in the past, that's what they know. They know what they've done, they know the history and they're looking at that as past equals prologue. Everybody knows that's not the case, but I think it's hard for people to envision what's coming, and what the potential of AI is. Having said that, Jennifer Shin is a near-term pessimist on the potential for AI, and rightly so. There are a lot of implementation challenges. But as we said at the open, I'm very convinced that we are now entering a new era. The Hadoop big data industry is going to pale in comparison to what we're seeing. And we're already seeing very clear glimpses of it. The obvious things are Airbnb and Uber, and the disruptions that are going on with Netflix and over-the-top programming, and how Google has changed advertising, and how Amazon is changing and has changed retail. But what you can see, and again, the best examples are Apple getting into financial services, moving into healthcare, trying to solve that problem. Amazon buying a grocer. 
The rumor that I heard about Amazon potentially buying Nordstrom, which my wife said is a horrible idea. (John laughs) But the fact that they can do that is a function of them being a digital-first company. They are built around data, and they can take those data models and apply them to different places. Who would have thought, for example, that Alexa would be so successful? That Siri is not so great? >> Alexa's become our best friend. >> And it came out of the blue. And it seems like Google has a pretty competitive piece there, but I can almost guarantee that doing this with our thumbs is not the way in which we're going to communicate in the future. It's going to be some kind of natural language interface that's going to rely on artificial intelligence and machine learning and the like. And so, I think it's hard for people to envision what's coming, other than fast forward where machines take over the world and Stephen Hawking and Elon Musk say, "Hey, we should be concerned." Maybe they're right, not in the next 10 years. >> You mentioned Jennifer, we were talking about her and the influencer panel, and we've heard from others as well, it's a combination of human intelligence and artificial intelligence. That combination's more powerful than just artificial intelligence, and so, there is a human component to this. So, for those who might be on the edge of their seat a little bit, or looking at this from a slightly more concerning perspective, maybe not the case. Maybe not necessary, is what you're thinking. >> I guess at the end of the day, the question is, "Is the world going to be a better place with all this AI? Are we going to be more prosperous, more productive, healthier, safer on the roads?" I am an optimist, I come down on the side of yes. I would not want to go back to the days where I didn't have GPS. That's worth it to me. >> Can you imagine, right?
If you did that now, you go back five years, just five years from where we are now, back to where we were. Waze was nowhere, right? >> All the downside of these things, I feel is offset by that. And I do think it's incumbent upon the industry to try to deal with the problem, especially with young people, the blue light problem. >> John: The addictive issue. >> That's right. But I feel like those downsides are manageable, and the upsides are of enough value that society is going to continue to move forward. And I do think that humans and machines are going to continue to coexist, at least in the near- to mid-term, and the reasonable long term. But the question is, "What can machines do that humans can't do?" And "What can humans do that machines can't do?" And the answer to that changes every year. It's like I said earlier, not too long ago, machines couldn't climb stairs. They can now, robots can climb stairs. Can they negotiate? Can they identify cats? Who would've imagined that all these cats on the Internet would've led to facial recognition technology? It's improving very, very rapidly. So, I guess my point is that that is changing very rapidly, and there's no question it's going to have an impact on society and an impact on jobs, and all those other negative things that people talk about. To me, the key is, how do we embrace that and turn it into an opportunity? And it's about education, it's about creativity, it's about having multi-talented disciplines that you can tap. So we talked about this earlier, not just being an expert in marketing, but being an expert in marketing with digital as an understanding in your toolbox. So it's that two-tool star that I think is going to emerge. And maybe it's more than two tools. So that's how I see it shaping up. And the last thing is disruption, we talked a lot about disruption. I don't think there's any industry that's safe.
Colin was saying, "Well, certain industries that are highly regulated-" In some respects, I can see those taking longer. But I see those as the most ripe for disruption. Financial services, healthcare. Can't we solve the HIPAA challenge? We can't get access to our own healthcare information. Well, things like artificial intelligence and blockchain, we were talking off-camera about blockchain, those things, I think, can help solve the challenge of, maybe I can carry around my health profile, my medical records. I don't have access to them, it's hard to get them. So can things like artificial intelligence improve our lives? I think there's no question about it. >> What about, on the other side of the coin, if you will, the misuse concerns? There are a lot of great applications. There are a lot of great services. As you pointed out, a lot of positive, a lot of upside here. But as opportunities become available and technology develops, you run the risk of somebody crossing the line for nefarious means. And there's a lot more at stake now because there's a lot more of us out there, if you will. So, how do you balance that? >> There's no question that's going to happen. And it has to be managed. But even if you could stop it, I would say you shouldn't, because the benefits are going to outweigh the risks. And again, the question we asked the panelists, "How far can we take machines? How far can we go?" That's question number one, number two is, "How far should we go?" We're not even close to the "should we go" yet. We're still on the, "How far can we go?" Jennifer was pointing out, I can't get my password reset 'cause I got to call somebody. That problem will be solved. >> So, you're saying it's more of a practical consideration now than an ethical one, right now? >> Right now.
Moreso, and there's certainly still ethical considerations, don't get me wrong, but I see light at the end of the privacy tunnel, I see artificial intelligence as, well, analytics is helping us solve credit card fraud and things of that nature. Autonomous vehicles are just fascinating, right? Both culturally, we talked about that, you know, we learned how to drive a stick shift. (both laugh) It's a funny story you told me. >> Not going to worry about that anymore, right? >> But it was an exciting time in our lives, so there's a cultural downside of that. I don't know what the highway death toll number is, but it's enormous. If cell phones caused that many deaths, we wouldn't be using them. So that's a problem that I think things like artificial intelligence and machine intelligence can solve. And then the other big thing that we talked about is, I see a huge gap between traditional companies and these born-in-the-cloud, born-data-oriented companies. We talked about the top five companies by market cap. Microsoft, Amazon, Facebook, Alphabet, which is Google, who am I missing? >> John: Apple. >> Apple, right. And those are pretty much very much data companies. Apple's got the data from the phones, Google, we know where they get their data, et cetera, et cetera. Traditional companies, however, their data resides in silos. Jennifer talked about this, Craig, as well as Colin. Data resides in silos, it's hard to get to. It's a very human-driven business and the data is bolted on. With the companies that we just talked about, it's a data-driven business, and the humans have expertise to exploit that data, which is very important. So there's a giant skills gap in existing companies. There's data silos. The other thing we touched on this is, where does innovation come from? Innovation drives value drives disruption. So the innovation comes from data. He or she who has the best data wins. 
It comes from artificial intelligence, and the ability to apply artificial intelligence and machine learning. And I think something that we take for granted a lot, but it's cloud economics. And it's more than just, and somebody, one of the folks mentioned this on the interview, it's more than just putting stuff in the cloud. It's certainly managed services, that's part of it. But it's also economies of scale. It's marginal economics that are essentially zero. It's speed, it's low latency. It's, and again, global scale. You combine those things, data, artificial intelligence, and cloud economics, that's where the innovation is going to come from. And if you think about what Uber's done, what Airbnb have done, where Waze came from, they were picking and choosing from the best digital services out there, and then developing their own software from this, what I say my colleague Dave Misheloff calls this matrix. And, just to repeat, that matrix is, the vertical matrix is industries. The horizontal matrix are technology platforms, cloud, data, mobile, social, security, et cetera. They're building companies on top of that matrix. So, it's how you leverage the matrix is going to determine your future. Whether or not you get disrupted, whether your the disruptor or the disruptee. It's not just about, we talked about this at the open. Cloud, SaaS, mobile, social, big data. They're kind of yesterday's news. It's now new artificial intelligence, machine intelligence, deep learning, machine learning, cognitive. We're still trying to figure out the parlance. You could feel the changes coming. I think this matrix idea is very powerful, and how that gets leveraged in organizations ultimately will determine the levels of disruption. But every single industry is at risk. Because every single industry is going digital, digital allows you to traverse industries. We've said it many times today. Amazon went from bookseller to content producer to grocer- >> John: To grocer now, right? 
>> To maybe high-end retailer. Content company, Apple with Apple Pay and companies getting into healthcare, trying to solve healthcare problems. The future of warfare, you live in the Beltway. The future of warfare and cybersecurity are just coming together. One of the biggest issues I think we face as a country is we have fake news, we're seeing the weaponization of social media, as James Scott said on theCUBE. So, all these things are coming together that I think are going to make the last 10 years look tame. >> Let's just switch over to the currency of AI, data. And we've talked to, Sam Lightstone today was talking about the database querying that they've developed with the Plex product. Some fascinating capabilities now that make it a lot richer, a lot more meaningful, a lot more relevant. And that seems to be, really, an integral step to making that stuff come alive and really making it applicable to improving your business. Because they've come up with some fantastic new ways to squeeze data that's relevant out, and get it out to the user. >> Well, if you think about what I was saying earlier about data as a foundational core and human expertise around it, versus what most companies are, is human expertise with data bolted on or data in silos. What was interesting about Queryplex, I think they called it, is it essentially virtualizes the data. Well, what does that mean? That means i can have data in place, but I can have access to that data, I can democratize that data, make it accessible to people so that they can become data-driven, data is the core. Now, what I don't know, and I don't know enough, just heard about it today, I missed that announcement, I think they announced it a year ago. He mentioned DB2, he mentioned Netezza. Most of the world is not on DB2 and Netezza even though IBM customers are. I think they can get to Hadoop data stores and other data stores, I just don't know how wide that goes, what the standards look like. 
He joked about the standards as, the great thing about standards is- >> There are a lot of 'em. (laughs) >> There's always another one you can pick if this one fails. And he's right about that. So, that was very interesting. And so, this is again, the question, can traditional companies close that machine learning, machine intelligence, AI gap? Close being, close the gap that the big five have created. And even the small guys, small guys like Uber and Airbnb, and so forth, but even those guys are getting disrupted. The Airbnbs and the Ubers, right? Again, blockchain comes in and you say, "Why do I need a trusted third party called Uber? "Why can't I do this on the blockchain?" I predict you're going to see even those guys get disrupted. And I'll say something else, it's hard to imagine that a Google or a Facebook can be unseated. But I feel like we may be entering an era where this is their peak. Could be wrong, I'm an Apple customer. I don't know, I'm not as enthralled as I used to be. They got trillions in the bank. But is it possible that opensource and blockchain and the citizen developer, the weekend and nighttime developers, can actually attack that engine of growth for the last 10 years, 20 years, and really break that monopoly? The Internet has basically become an oligopoly where five companies, six companies, whatever, 10 companies kind of control things. Is it possible that opensource software, AI, cryptography, all this activity could challenge the status quo? Being in this business as long as I have, things never stay the same. Leaders come, leaders go. >> I just want to say, never say never. You don't know. >> So, it brings it back to IBM, which is interesting to me. It was funny, I was asking Rob Thomas a question about disruption, and I think he misinterpreted it. I think he was thinking that I was saying, "Hey, you're going to get disrupted by all these little guys." IBM's been getting disrupted for years. They know how to reinvent. 
A lot of people criticize IBM, how many quarters they haven't had growth, blah, blah, blah, but IBM's made some big, big bets on the future. People criticizing Watson, but it's going to be really interesting to see how all this investment that IBM has made is going to pay off. They were early on. People in the Valley like to say, "Well, the Facebooks, and even Amazon, "Google, they got the best AI. "IBM is not there with them." But think about what IBM is trying to do versus what Google is doing. They're very consumer-oriented, solving consumer problems. Consumers have really led the consumerization of IT, that's true, but none of those guys are trying to solve cancer. So IBM is talking about some big, hairy, audacious goals. And I'm not as pessimistic as some others you've seen in the trade press, it's popular to do. So, bringing it back to IBM, I saw IBM as trying to disrupt itself. The challenge IBM has, is it's got a lot of legacy software products that have purchased over the years. And it's got to figure out how to get through those. So, things like Queryplex allow them to create abstraction layers. Things like Bluemix allow them to bring together their hundreds and hundreds and hundreds of SaaS applications. That takes time, but I do see IBM making some big investments to disrupt themselves. They've got a huge analytics business. We've been covering them for quite some time now. They're a leader, if not the leader, in that business. So, their challenge is, "Okay, how do we now "apply all these technologies to help "our customers create innovation?" What I like about the IBM story is they're not out saying, "We're going to go disrupt industries." Silicon Valley has a bifurcated disruption agenda. On the one hand, they're trying to, cloud, and SaaS, and mobile, and social, very disruptive technologies. On the other hand, is Silicon Valley going to disrupt financial services, healthcare, government, education? I think they have plans to do so. 
Are they going to be able to execute that dual disruption agenda? Or are the consumers of AI and the doers of AI going to be the ones who actually do the disrupting? We'll see, I mean, Uber's obviously disrupted taxis, Silicon Valley company. Is that too much to ask Silicon Valley to do? That's going to be interesting to see. So, my point is, IBM is not trying to disrupt its customers' businesses, and it can point to Amazon trying to do that. Rather, it's saying, "We're going to enable you." So it could be really interesting to see what happens. You're down in DC, Jeff Bezos spent a lot of time there at the Washington Post. >> We just want the headquarters, that's all we want. We just want the headquarters. >> Well, to the point, if you've got such a growing company monopoly, maybe you should set up an HQ2 in DC. >> Three of the 20, right, for a DC base? >> Yeah, he was saying the other day that, maybe we should think about enhancing, he didn't call it social security, but the government, essentially, helping people plan for retirement and the like. I heard that and said, "Whoa, is he basically "telling us he's going to put us all out of jobs?" (both laugh) So, that, if I'm a customer of Amazon's, I'm kind of scary. So, one of the things they should absolutely do is spin out AWS, I think that helps solve that problem. But, back to IBM, Ginni Rometty was very clear at the World of Watson conference, the inaugural one, that we are not out trying to compete with our customers. I would think that resonates to a lot of people. >> Well, to be continued, right? Next month, back with IBM again? Right, three days? >> Yeah, I think third week in March. Monday, Tuesday, Wednesday, theCUBE's going to be there. Next week we're in the Bahamas. This week, actually. >> Not as a group taking vacation. Actually a working expedition. >> No, it's that blockchain conference. Actually, it's this week, what am I saying next week? 
>> Although I'm happy to volunteer to grip on that shoot, by the way. >> Flying out tomorrow, it's happening fast. >> Well, enjoyed this, always good to spend time with you. And good to spend time with you as well. So, you've been watching theCUBE, machine learning everywhere. Build your ladder to AI. Brought to you by IBM. Have a good one. (techno music)

Published Date : Feb 27 2018

Machine Learning Panel | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE. Covering machine learning everywhere. Build your ladder to AI. Brought to you by IBM. Welcome back to New York City. Along with Dave Vellante, I'm John Walls. We continue our coverage here on theCUBE of machine learning everywhere. Build your ladder to AI, IBM our host here today. We put together, occasionally at these events, a panel of esteemed experts with deep perspectives on a particular subject. Today our influencer panel is comprised of three well-known and respected authorities in this space. Glad to have Colin Sumpter here with us. He's the man with the mic, by the way. He's going to talk first. But, Colin is an IT architect with CrowdMole. Thank you for being with us, Colin. Jennifer Shin, those of you on theCUBE, you're very familiar with Jennifer, a long time Cuber. Founded 8 Path Solutions, on the faculty at NYU and Cal Berkeley, and also with us is Craig Brown, a big data consultant. And a home game for all of you guys, right, more or less here we are in the city. So, thanks for having us, we appreciate the time. First off, let's just talk about the title of the event, Build Your Path... Or Your Ladder, excuse me, to AI. What are those steps on that ladder, Colin? The fundamental steps that you've got to jump on, or step on, in order to get to that true AI environment? >> In order to get to that true AI environment, John, is a matter of mastering or organizing your information well enough to perform analytics. That'll give you two choices to do either linear regression or supervised classification, and then you actually have enough organized data to talk to your team and organize your team around that data to begin that ladder to successively benefit from your data science program. >> Want to take a stab at it, Jennifer? >> So, I would say, compute, right? 
You need to have the right processing, or at least the ability to scale out to be able to process the algorithm fast enough to be able to find value in your data. I think the other thing is, of course, the data source itself. Do you have the right data to answer the questions you want to answer? So, I think, without those two things, you'll either have a lot of great data that you can't process in time, or you'll have a great process or a great algorithm that has no real information, so your output is useless. I think those are the fundamental things you really do need to have any sort of AI solution built. >> I'll take a stab at it from the business side. They have to adopt it first. They have to believe that this is going to benefit them and that the effort that's necessary in order to build into the various aspects of algorithms and data subjects is there, so I think adopting the concept of machine learning and the development aspects that it takes to do that is a key component to building the ladder. >> So this just isn't a toe in the water, right? You got to dive in the deep end, right? >> Craig: Right. >> It gets to culture. If you look at most organizations, not the big five market-cap companies, but most organizations, data is not at their core. Humans are at their core, human expertise and data is sort of bolted on, but that has to change, or they're going to get disrupted. Data has to be at the core, maybe the human expertise leverages that data. What are you guys seeing with end customers in terms of their readiness for this transformation? >> What I'm seeing customers spending time on right now is getting out of the silos. So, when you speak culture, that's primarily what the culture surrounds. They develop applications with functionality as a silo, and data specific to that functionality is the component in which they look at data.
They have to get out of that mindset and look at the data holistically, and ultimately, in these events, looking at it as an asset. >> The data is a shared resource. >> Craig: Right, correct. >> Okay, and again, with the exception of the... Whether it's Google, Facebook, obviously, but the Ubers, the AirBNB's, etc... With the exception of those guys, most customers aren't there. Still, the data is in silos, they've got myriad infrastructure. Your thoughts, Jennifer? >> I'm also seeing sort of a disconnect between the operationalizing team, the team that runs these codes, or has a real business need for it, and sometimes you'll see corporations with research teams, and there's sort of a disconnect between what the researchers do and what these operations, or marketing, whatever domain it is, what they're doing in terms of a day to day operation. So, for instance, a researcher will look really deep into these algorithms, and may know a lot about deep learning in theory, in theoretical world, and might publish a paper that's really interesting. But, that application part where they're actually being used every day, there's this difference there, where you really shouldn't have that difference. There should be more alignment. I think actually aligning those resources... I think companies are struggling with that. >> So, Colin, we were talking off camera about RPA, Robotic Process Automation. Where's the play for machine intelligence and RPA? Maybe, first of all, you could explain RPA. >> David, RPA stands for Robotic Process Automation. That's going to enable you to grow and scale a digital workforce. Typically, it's done in the cloud. 
The way RPA, Robotic Process Automation, plays into machine learning and data science, is that it allows you to outsource business processes to compensate for the lack of human expertise that's available in the marketplace, because you need competency to enable the technology to take advantage of these new benefits coming in the market. And, when you start automating some of these processes, you can keep pace with the innovation in the marketplace and allow the human expertise to gradually grow into these new data science technologies. >> So, I was mentioning some of the big guys before. Top five market capped companies: Google, Amazon, Apple, Facebook, Microsoft, all digital. Microsoft you can argue, but still, pretty digital, pretty data oriented. My question is about closing that gap. In your view, can companies close that gap? How can they close that gap? Are you guys helping companies close that gap? It's a wide chasm, it seems. Thoughts? >> The thought on closing the chasm is... presenting the technology to the decision-makers. What we've learned is that... you don't know what you don't know, so it's impossible to find the new technologies if you don't have the vocabulary to just begin a simple research of these new technologies. And, to close that gap, it really comes down to the awareness, events like theCUBE, webinars, different educational opportunities that are available to line of business owners, directors, VP's of systems and services, to begin that awareness process, finding consultants... begin that pipeline enablement to begin allowing the business to take advantage and harness data science, machine learning and what's coming. >> One of the things I've noticed is that there's a lot of information out there, like everyone has a webinar, everyone has tutorials, but there's a lot of overlap. There aren't that many very sophisticated documents you can find about how to implement it in real world conditions. 
They all tend to use the same core data set, a lot of these machine learning tutorials you'll find, which is hilarious because the data set's actually very small. And I know where it comes from, just from having the expertise, but it's not something I'd ever use in the real world. The level of skill you need to be able to do any of these methodologies. But that's what's out there. So, there's a lot of information, but they're kind of at a rudimentary level. They're not really at that sophisticated level where you're going to learn enough to deploy in real world conditions. One of the things I'm noticing is, with the technical teams, with the data science team, machine learning teams, they're kind of using the same methodologies I used maybe 10 years ago. Because the management who manage these teams are not technical enough. They're business people, so they don't understand how to guide them, how to explain hey maybe you shouldn't do that with your code, because that's actually going to cause a problem. You should use parallel code, you should make sure everything is running in parallel so compute's faster. But, if these younger teams are actually learning for the first time, they make the same mistakes you made 10 years ago. So, I think, what I'm noticing is that lack of leadership is partly one of the reasons, and also the assumption that a non-technical person can lead the technical team. >> So, it's just not skillset on the worker level, if you will. It's also knowledge base on the decision-maker level. That's a bad place to be, right? So, how do you get into the door to a business like that? Obviously, and we've talked about this a little bit today, that some companies say, "We're not data companies, we're not digital companies, we sell widgets." Well, yeah but you sell widgets and you need this to sell more widgets. And so, how do you get into the door and talk about this problem that Jennifer just cited? You're signing the checks, man. 
You're going to have to get up to speed on this otherwise you're not going to have checks to sign in three to five years, you're done! >> I think that speaks to use cases. I think that, and what I'm actually seeing at customers, is that there's a disconnect in understanding between the executive teams and the low-level technical teams on what the use case actually means to the business. Some of the use cases are operational in nature. Some of the use cases are data in nature. There's no real conformity on what does the use case mean across the organization, and that understanding isn't there. And so, the CIO's, the CEO's, the CTO's think that, "Okay, we're going to achieve a certain level of capability if we do a variety of technological things," and the business is looking to effectively improve some or bring some efficiency to business processes. At each level within the organization, the understanding is at the level at which the discussions are being made. And so, I'm in these meetings with senior executives and we have lots of ideas on how we can bring efficiencies and some operational productivity with technology. And then we get in a meeting with the data stewards and "What are these guys talking about? They don't understand what's going on at the data level and what data we have." And then that's where the data quality challenges come into the conversation, so I think that, to close that chasm, we have to figure out who needs to be in the room to effectively help us build the right understanding around the use cases and then bring the technology to those use cases then actually see within the organization how we're affecting that. >> So, to change the questioning here... I want you guys to think about how capable can we make machines in the near term, let's talk next decade near term. Let's say next decade. How capable can we make machines and are there limits to what we should do? >> That's a tough one. 
Although you want to go next decade, we're still faced with some of the challenges today in terms of, again, that adoption, the use case scenarios, and then what my colleagues are saying here about the various data challenges and dev ops and things. So, there's a number of things that we have to overcome, but if we can get past those areas in the next decade, I don't think there's going to be much of a limit, in my opinion, as to what the technology can do and what we can ask the machines to produce for us. As Colin mentioned, with RPA, I think that the capability is there, right? But, can we also ultimately, as humans, leverage that capability effectively? >> I get this question a lot. People are really worried about AI and robots taking over, and all of that. And I go... Well, let's think about the example. We've all been online, probably over the weekend, maybe it's 3 or 4 AM, checking your bank account, and you get an error message your password is wrong. And we swear... And I've been there where I'm like, "No, no my password's right." And it keeps saying that the password is wrong. Of course, then I change it, and it's still wrong. Then, the next day when I login, I can login, same password, because they didn't put a great error message there. They just defaulted to wrong password when it's probably a server that's down. So, there are these basics or processes that we could be improving which no one's improving. So you think in that example, how many customer service reps are going to be contacted to try to address that? How many IT teams? So, for every one of these bad technologies that are out there, or technologies that are not being run efficiently or run in a way that makes sense, you actually have maybe three people that are going to be contacted to try to resolve an issue that actually maybe could have been avoided to begin with. 
I feel like it's optimistic to say that robots are going to take over, because you're probably going to need more people to put band-aids on bad technology and bad engineering, frankly. And I think that's the reality of it. If we had hoverboards, that would be great, you know? For a while, we thought we did, right? But we found out, oh it's not quite hoverboards. I feel like that might be what happens with AI. We might think we have it, and then go oh wait, it's not really what we thought it was. >> So there are real limits, certainly in the near to mid to maybe even long term, that are imposed. But you're an optimist. >> Yeah. Well, not so much with AI but everything else, sure. (laughing) AI, I'm a little bit like, "Well, it would be great, but I'd like basic things to be taken care of every day." So, I think the usefulness of technology is not something anyone's talking about. They're talking about this advancement, that advancement, things people don't understand, don't know even how to use in their life. Great, great is an idea. But, what about useful things we can actually use in our real life? >> So block and tackle first, and then put some reverses in later, if you will, to switch over to football. We were talking about it earlier, just about basics. Fundamentals, get your fundamentals right and then you can complement on that with supplementary technologies. Craig, Colin? >> Jen made some really good points and brought up some very good points, and so has... >> John: Craig. >> Craig, I'm sorry. (laughing) >> Craig: It's alright. >> 10 years out, Jen and Craig spoke to false positives. And false positives create a lot of inefficiency in businesses. So, when you start using machine learning and AI 10 years from now, maybe there's reduced false positives that have been scored in real time, allowing teams not to have their time consumed and their business resources consumed trying to resolve false positives. 
These false positives have a business value that, today, some businesses might not be able to record. In financial services, banks count money not lent. But, in everyday business, a lot of businesses aren't counting the monetary consequences of false positives and the drag it has on their operational ability and capacity. >> I want to ask you guys about disruption. If you look at where the disruption, the digital disruptions, have taken place, obviously retail, certainly advertising, certainly content businesses... There are some industries that haven't been highly disrupted: financial services, insurance, we were talking earlier about aerospace, defense rather. Is any business, any industry, safe from digital disruption? >> There are. Certain industries are just highly regulated: healthcare, financial services, real estate, transactional law... These are extremely regulated technologies, or businesses, that are... I don't want to say susceptible to technology, but they can be disrupted at a basic level, operational efficiency, to make these things happen, these business processes happen more rapidly, more accurately. >> So you guys buy that? There's some... I'd like to get a little debate going here. >> So, I work with the government, and the government's trying to change things. I feel like that's kind of a sign because they tend to be a little bit slower than, say, other private industries, or private companies. They have data, they're trying to actually put it into a system, meaning like if they have files... I think that, at some point, I got contacted about putting files that they found, like birth records, right, marriage records, that they found from 100-plus years ago and trying to put that into the system. By the way, I did look into it, there was no way to use AI for that, because there was no standardization across these files, so they have half a million files, but someone's probably going to manually have to enter that in. 
The reality is, I think because there's a demand for having things be digital, we aren't likely to see a decrease in that. We're not going to have one industry that goes, "Oh, your files aren't digital." Probably because they also want to be digital. The companies themselves, the employees themselves, want to see that change. So, I think there's going to be this continuous move toward it, but there's the question of, "Are we doing it better?" Is it better than, say, having it on paper sometimes? Because sometimes I just feel like it's easier on paper than to have to look through my phone, look through the app. There's so many apps now! >> (laughing) I got my index cards still, Jennifer! Dave's got his notebook! >> I'm not sure I want my ledger to be on paper... >> Right! So I think that's going to be an interesting thing when people take a step back and go like, "Is this really better? Is this actually an improvement?" Because I don't think all things are better digital. >> That's a great question. Will the world be a better, more prosperous place... Uncertain. Your thoughts? >> I think the competition is probably the driver as to who has to do this now, who's not safe. The organizations that are heavily regulated or compliance-driven can actually use that as the reasoning for not jumping into the barrel right now, and letting it happen in other areas first, watching the technology mature-- >> Dave: Let's wait. >> Yeah, let's wait, because that's traditionally how they-- >> Dave: Good strategy in your opinion? >> It depends on the entity but I think there's nothing wrong with being safe. There's nothing wrong with waiting for a variety of innovations to mature. What level of maturity, I think, is the perspective that probably is another discussion for another day, but I think that it's okay. I don't think that everyone should jump in. Get some lessons learned, watch how the other guys do it. I think that safety is in the eyes of the beholder, right? 
But some organizations are just competition fierce and they need a competitive edge and this is where they get it. >> When you say safety, do you mean safety in making decisions, or do you mean safety in protecting data? How are you defining safety? >> Safety in terms of when they need to launch, and look into these new technologies as a basis for change within the organization. >> What about the other side of that point? There's so much more data about it, so much more behavior about it, so many more attitudes, so on and so forth. And there is privacy issues and security issues and all that... Those are real challenges for any company, and becoming exponentially more important as more is at stake. So, how do companies address that? That's got to be absolutely part of their equation, as they decide what these future deployments are, because they're going to have great, vast reams of data, but that's a lot of vulnerability too, isn't it? >> It's as vulnerable as they... So, from an organizational standpoint, they're accustomed to these... These challenges aren't new, right? We still see data breaches. >> They're bigger now, right? >> They're bigger, but we still see occasionally data breaches in organizations where we don't expect to see them. I think that, from that perspective, it's the experiences of the organizations that determine the risks they want to take on, to a certain degree. And then, based on those risks, and how they handle adversity within those risks, from an experience standpoint they know ultimately how to handle it, and get themselves to a place where they can figure out what happened and then fix the issues. And then the others watch while these risk-takers take on these types of scenarios. >> I want to underscore this whole disruption thing and ask... We don't have much time, I know we're going a little over. I want to ask you to pull out your Hubble telescopes. 
Let's make a 20 to 30 year view, so we're safe, because we know we're going to be wrong. I want a sort of scale of 1 to 10, high likelihood being 10, low being 1. Maybe sort of rapid fire. Do you think large retail stores are going to mostly disappear? What do you guys think? >> I think the way that they are structured, the way that they interact with their customers might change, but you're still going to need them because there are going to be times where you need to buy something. >> So, six, seven, something like that? Is that kind of consensus, or do you feel differently Colin? >> I feel retail's going to be around, especially fashion because certain people, and myself included, I need to try my clothes on. So, you need a location to go to, a physical location to actually feel the material, experience the material. >> Alright, so we kind of have a consensus there. It's probably no. How about driving-- >> I was going to say, Amazon opened a book store. Just saying, it's kind of funny because they got... And they opened the book store, so you know, I think what happens is people forget over time, they go, "It's a new idea." It's not so much a new idea. >> I heard a rumor the other day that their next big acquisition was going to be, not Neiman Marcus. What's the other high end retailer? >> Nordstrom? >> Nordstrom, yeah. And my wife said, "Bad idea, they'll ruin it." Will driving and owning your own car become an exception? >> Driving and owning your own car... >> Dave: 30 years now, we're talking. >> 30 years... Sure, I think the concept is there. I think that we're looking at that. IOT is moving us in that direction. 5G is around the corner. So, I think the makings of it is there. So, since I can dare to be wrong, yeah I think-- >> We'll be on 10G by then anyway, so-- >> Automobiles really haven't been disrupted, the car industry. But you're forecasting, I would tend to agree. Do you guys agree or no, or do you think that culturally I want to drive my own car? 
>> Yeah, I think people, I think a couple of things. How well engineered is it? Because if it's badly engineered, people are not going to want to use it. For instance, there are people who could take public transportation. It's the same idea, right? Everything's autonomous, you'd have to follow in line. There's going to be some system, some order to it. And you might go-- >> Dave: Good example, yeah. >> You might go, "Oh, I want it to be faster. I don't want to be in line with that autonomous vehicle. I want to get there faster, get there sooner." And there are people who want to have that control over their lives, but they're not subject to things like schedules all the time and that's their constraint. So, I think if the engineering is bad, you're going to have more problems and people are probably going to go away from wanting to be autonomous. >> Alright, Colin, one for you. Will robots and maybe 3D printing, for example RPA, will it reverse the trend toward offshore manufacturing? >> 30 years from now, yes. I think robotic process engineering, eventually you're going to be at your cubicle or your desk, or whatever it is, and you're going to be able to print office supplies. >> Do you guys think machines will make better diagnoses than doctors? Ohhhhh. >> I'll take that one. >> Alright, alright. >> I think yes, to a certain degree, because if you look at the... problems with diagnosis, right now they miss it and I don't know how people, even 30 years from now, will be different from that perspective, where machines can look at quite a bit of data about a patient in split seconds and say, "Hey, the likelihood of you recurring this disease is nil to none, because here's what I'm basing it on." I don't think doctors will be able to do that. Now, again, daring to be wrong! (laughing) >> Jennifer: Yeah so-- >> Don't tell your own doctor either. (laughing) >> That's true. If anything happens, we know, we all know. I think it depends. 
So maybe 80%, some middle percentage might be the case. I think extreme outliers, maybe not so much. You think about anything that's programmed into an algorithm, someone probably identified that disease, a human being identified that as a disease, made that connection, and then it gets put into the algorithm. I think what will happen is that, for the 20% that isn't being done well by machine, you'll have people who are more specialized being able to identify the outlier cases from, say, the standard. Normally, if you have certain symptoms, you have a cold, those are kind of standard ones. If you have this weird sort of thing where there's new variables, environmental variables for instance, your environment can actually lead to you having cancer. So, there's other factors other than just your body and your health that's going to actually be important to think about when diagnosing someone. >> John: Colin, go ahead. >> I think machines aren't going to out-decision doctors. I think doctors are going to work well with machine learning. For instance, there's a published document of Watson doing the research of a team of four in 10 minutes, when it normally takes a month. So, those doctors, to bring up Jen and Craig's point, are going to have more time to focus in on what the actual symptoms are, to resolve the outcome of patient care and patient services in a way that benefits humanity. >> I just wish that, Dave, that you would have picked a shorter horizon that... 30 years, 20 I feel good about our chances of seeing that. 30 I'm just not so sure, I mean... For the two old guys on the panel here. >> The consensus is 20 years, not so much. But beyond 10 years, a lot's going to change. >> Well, thank you all for joining this. I always enjoy the discussions. Craig, Jennifer and Colin, thanks for being here with us here on theCUBE, we appreciate the time. Back with more here from New York right after this. You're watching theCUBE. (upbeat digital music)

Published Date : Feb 27 2018



Garry Kasparov | Machine Learning Everywhere 2018


 

>> [Narrator] Live from New York, it's theCube, covering Machine Learning Everywhere. Build your ladder to AI, brought to you by IBM. >> Welcome back here to New York City as we continue at IBM's Machine Learning Everywhere, build your ladder to AI, along with Dave Vellante, I'm John Walls. It is now a great honor of ours to have I think probably and arguably the greatest chess player of all time, Garry Kasparov now joins us. He's currently the chairman of the Human Rights Foundation, political activist in Russia as well some time ago. Thank you for joining us, we really appreciate the time, sir. >> Thank you for inviting me. >> We've been looking forward to this. Let's just, if you would, set the stage for us. Artificial Intelligence obviously quite a hot topic. The maybe not conflict, the complementary nature of human intelligence. There are people on both sides of the camp. But you see them as being very complementary to one another. >> I think that's natural development in this industry that will bring together humans and machines. Because this collaboration will produce the best results. Our abilities are complementary. The humans will bring creativity and intuition and other typical human qualities like human judgment and strategic vision while machines will add calculation, memory, and many other abilities that they have been acquiring quickly. >> So there's room for both, right? >> Yes, I think it's inevitable because no machine will ever reach 100% perfection. Machines will be coming closer and closer, 90%, 92, 94, 95. But there's still room for humans because at the end of the day even with this massive power you have guide it. You have to evaluate the results and at the end of the day the machine will never understand when it reaches the territory of diminishing returns. It's very important for humans actually to identify. So what is the task? 
I think it's a mistake that is made by many pundits that they automatically transfer the machine's expertise for the closed systems into the open-ended systems. Because in every closed system, whether it's the game of chess, the game of Go, video games like Dota, or anything else where humans already define the parameters of the problem, machines will perform phenomenally. But if it's an open-ended system then machine will never identify what is the sort of the right question to be asked. >> Don't hate me for this question, but it's been reported, now I don't know if it's true or not, that at one point you said that you would never lose to a machine. My question is how capable can we make machines? First of all, is that true? Did you maybe underestimate the power of computers? How capable do you think we can actually make machines? >> Look, in the 80s when the question was asked I was much more optimistic because we saw very little at that time from machines that could make me, world champion at the time, worry about machines' capability of defeating me in the real chess game. I underestimated the pace it was developing. I could see something was happening, was cooking, but I thought it would take longer for machines to catch up. As I said in my talk here is that we should simply recognize the fact that everything we do while knowing how we do that, machines will do better. Any particular task that humans perform, machines will eventually surpass us. >> What I love about your story, I was telling you off-camera about when we had Erik Brynjolfsson and Andrew McAfee on, you're the opposite of Samuel P. Langley to me. You know who Samuel P. Langley is? >> No, please. >> Samuel P. Langley, do you know who Samuel P. Langley is? He was the gentleman that, you guys will love this, that the government paid. I think it was $50,000 at the time, to create a flying machine. But the Wright Brothers beat him to it, so what did Samuel P. 
Langley do after the Wright Brothers succeeded? He quit. But after you lost to the machine you said you know what? I can beat the machine with other humans, and created what is now the best chess player in the world, is my understanding. It's not a machine, but it's a combination of machines and humans. Is that accurate? >> Yes, in chess actually, we could demonstrate how the collaboration can work. Now in many areas people rely on the lessons that have been revealed, learned from what I call advanced chess. That in this team, human plus machine, the most important element of success is not the strengths of the human expert. It's not the speed of the machine, but it's a process. It's an interface, so how you actually make them work together. In the future I think that will be the key of success because we have very powerful machines, those AIs, intelligent algorithms. All of them will require very special treatment. That's why also I use this analogy with the right fuel for Ferrari. We will have expert operators, I call them the shepherds, that will have to know exactly what are the requirements of this machine or that machine, or that group of algorithms to guarantee that we'll be able by our human input to compensate for their deficiencies. Not the other way around. >> What led you to that response? Was it your competitiveness? Was it your vision of machines and humans working together? >> I thought I could last longer as the undefeated world champion. Ironically, in 1997 when you just look at the game and the quality of the game and try to evaluate Deep Blue's real strength, I think I was objective, I was stronger. Because today you can analyze these games with much more powerful computers. I mean any chess app on your laptop. I mean you cannot really compare with Deep Blue. That's natural progress. But as I said, it's not about solving the game, it's not about objective strengths. It's about your ability to actually perform at the board. 
I just realized while we could compete with machines for a few more years, and that's great, it did take place. I played two more matches in 2003 with a German program. Not as publicized as the IBM match. Both ended as a tie and I think they were probably stronger than Deep Blue, but I knew it would just be over, maybe a decade. How can we make chess relevant? For me it was very natural. I could see this immense power of calculations, brute force. On the other side I could see us having qualities that machines will never acquire. How about bringing together and using chess as a laboratory to find the most productive ways for human-machine collaboration? >> What was the difference in, I guess, processing power basically, or processing capabilities? You played the match, this is 1997. You played the match on standard time controls which allow you or a player a certain amount of time. How much time did Deep Blue, did the machine take? Or did it take its full time to make considerations as opposed to what you exercised? >> Well it's the standard time control. I think you should explain to your audience at that time it was a seven-hour game. It's what we call classical chess. We have rapid chess that is under one hour. Then you have blitz chess which is five to ten minutes. That was a normal time control. It's worth mentioning that other computers they were beating human players, myself included, in blitz chess. In the very fast chess. We still thought that with more time we could have sort of a bigger comfort zone just to contemplate the machine's plans and actually to create real problems that machine would not be able to solve. Again, more time helps humans but at the end of the day it's still about your ability not to crack under pressure because there's so many things that could take you off your balance, and machine doesn't care about it. At the end of the day machine has a steady hand, and steady hand wins. >> Emotion doesn't come into play. 
>> It's not about absolute strength, but it's about guaranteeing that it will play at a certain level for the entire game. While the human game maybe at one point could go a bit higher. But at the end of the day when you look at the average it's still lower. I played many world championship matches and I analyze the games, games played at the highest level. I can tell you that even the best games played by humans at the highest level, they include not necessarily big mistakes, but inaccuracies that are irrelevant when humans are facing humans, because if I make a mistake, a tiny mistake, then I can expect you to return the favor. Against the machine, that's it. Humans cannot play at the same level throughout the whole game. The concentration, the vigilance are not required when humans face humans. Psychologically when you have a strong machine, a machine good enough to play with a steady hand, the game's over. >> I want to point out too, just so we get the record straight for people who might not be intimately familiar with your record, you were ranked number one in the world from 1986 to 2005 for all but three months. Three months, that's three decades. >> Two decades. >> Well 80s, 90s, and naughts, I'll give you that. (laughing) That's unheard of, that's phenomenal. >> Just going back to your previous question about why I went looking for some new form of chess. It's one of the key lessons I learned from my childhood thanks to my mother, who spent her life just helping me to become who I am, who I was after my father died when I was seven. It's about always trying to make a difference. It's not just about winning, it's about making a difference. It led me to kind of a new motto in my professional life. That is, it's all about my own quality of the game. As long as I'm challenging my own excellence I will never be short of opponents. For me the defeat was just a kick, a push. So let's come up with something new. Let's find a new challenge. 
Let's find a way to turn this defeat, the lessons from this defeat, into something more practical. >> Love it, I mean I think in your book I think, was it John Henry, the famous example. (all men speaking at once) >> He won, but he lost. >> Motivation wasn't competition, it was advancing society and creativity, so I love it. Another thing I just want, a quick aside, you mentioned performing under pressure. I think it was in the 1980s, it might have been in the opening of your book. You talked about playing multiple computers. >> [Garry] Yeah, in 1985. >> In 1985 and you were winning all of them. There was one close match, but the computer's name was Kasparov and you said I've got to beat this one because people will think that it's rigged or I'm getting paid to do this. So well done. >> I always mention this exhibition I played in 1985 against 32 chess-playing computers because the importance of this event was not just that I won all the games, but that nobody was surprised. I have to admit that the fact that I could win all the games against these 32 chess-playing computers, which were only chess-playing machines so they did nothing else, probably boosted my confidence that I would never be defeated even by more powerful machines. >> Well I love it, that's why I asked the question how far can we take machines? We don't know, like you said. >> Why should we bother? I see so many new challenges that we will be able to take, and challenges that we abandoned like space exploration or deep ocean exploration because they were too risky. We couldn't actually calculate all the odds. Great, now we have AI. It's all about increasing our risk because we could actually measure against this phenomenal power of AI that will help us to find the right path. >> I want to follow up on some other commentary. Brynjolfsson and McAfee basically put forth the premise, look, machines have always replaced humans. 
But this is the first time in history that they have replaced humans in terms of cognitive tasks. They also posited, look, there's no question that it's affecting jobs. But they put forth the prescription, which I think as an optimist you would agree with, that it's about finding new opportunities. It's about bringing creativity in, complementing the machines and creating new value. As an optimist, I presume you would agree with that. >> Absolutely, I'm always saying jobs do not disappear, they evolve. It's an inevitable part of the technological progress. We come up with new ideas and every disruptive technology destroys some industries but creates new jobs. So basically we see jobs shifting from one industry to another. Like from agriculture to manufacturing, from manufacturing to other sectors, cognitive tasks. But now there will be something else. I think the market will change, the job market will change quite dramatically. Again I believe that we will have to look for riskier jobs. We will have to start doing things that we abandoned 30, 40 years ago because we thought they were too risky. >> Back to the book you were talking about, Deep Thinking, where machine intelligence ends and human intelligence begins, you talked about courage. We need fail-safes in place, but you also need that human element of courage like you said, to accept risk and take risk. >> Now it probably will be easier, but also as I said the machines will force a lot of talent actually to move into other areas that were not as attractive because there were other opportunities. There's so many what I call raw cognitive tasks that are still financially attractive. I hope AI will close many loops. We'll see talent moving into areas where we just have to open new horizons. I think it's very important just to remember, with technological progress, especially when you're talking about disruptive technology, it's more about unintended consequences. 
The flight to the moon was, just psychologically, important: the Space Race, the Cold War. But it was also about GPS, about so many side effects that in the 60s were not yet appreciated but eventually created the world we have now. I don't know what the consequences of us flying to Mars will be. Maybe something will happen, maybe on one of the asteroids we will just find sort of a new substance that will replace fossil fuel. What I know is that it will happen, because when you look at human history, all these great explorations ended up with unintended consequences as the main result. Not what was originally planned as the number one goal. >> We've been talking about where innovation comes from today. Essentially it's a combination of data plus being able to apply artificial intelligence. And of course there's cloud economics as well. Is that reasonable? I think about something you said, I believe, in the past, that you didn't have the advantage of seeing Deep Blue's moves, but it had the advantage of studying your moves. You didn't have all the data, it had the data. How does data fit into the future? >> Data is vital, data is fuel. That's why I think we need to find some of the most effective ways of collaboration between humans and machines. Machines can mine the data. For instance, it's a breakthrough in instantly mining data and human language. Now we could see even more effective tools to help us to mine the data. But at the end of the day it's: why are we doing that? What's the purpose? What matters to us, so why do we want to mine this data? Why do we want to do it here and not there? It seems at first sight that the human responsibilities are shrinking. I think it's the opposite. We don't have to move too much, but a tiny shift, just, you know, a percentage of a degree of an angle, could actually make a huge difference when the bullet reaches the target. The same with AI. 
More power actually offers opportunities to start making tiny adjustments that could have massive consequences. >> Opens up a big, that's why you like augmented intelligence. >> I think artificial is sci-fi. >> What's artificial about it, I don't understand. >> Artificial, it's an easy sell because it's sci-fi. But augmented is what it is, because our intelligent machines are making us smarter. The same way as the technology in the past made us stronger and faster. >> It's not artificial horsepower. >> It's created from something. >> Exactly, it's created from something. Even if the machines can adjust their own code, fine. It still will be confined within the parameters of the tasks. They cannot go beyond that because again they can only answer questions. They can only give you answers. We provide the questions, so it's very important to recognize that we will be in the leading role. That's why I use the term shepherds. >> How do you spend your time these days? You're obviously writing, you're speaking. >> Writing, speaking, traveling around the world because I have to show up at many conferences. AI now is a very hot topic. Also, as you mentioned, I'm the Chairman of the Human Rights Foundation. My responsibility is to help people, dissidents around the world, who are fighting for their principles and for freedom. Our organization runs the largest dissident gathering in the world. It's called the Freedom Forum. We have the tenth anniversary, the tenth event, this May. >> It has been a pleasure. Garry Kasparov, live on theCube. Back with more from New York City right after this. (lively instrumental music)

Published Date : Feb 27 2018


Madhu Kochar, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE covering Machine Learning Everywhere, Build Your Ladder To AI, brought to you by IBM. (techy music playing) >> Welcome back to New York City as we continue here at IBM's Machine Learning Everywhere, Build Your Ladder To AI bringing it to you here on theCUBE, of course the rights to the broadcast of SiliconANGLE Media and Dave Vellante joins me here. Dave, good morning once again to you, sir. >> Hey, John, good to see you. >> And we're joined by Madhu Kochar, who is the Vice President of Analytics Development and Client Success at IBM, I like that, client success. Good to see you this morning, thanks for joining us. >> Yeah, thank you. >> Yeah, so let's bring up a four letter / ten letter word, governance, that some people just cringe, right, right away, but that's very much in your wheelhouse. Let's talk about that in terms of what you're having to be aware of today with data and all of a sudden these great possibilities, right, but also on the other side, you've got to be careful, and I know there's some clouds over in Europe as well, but let's just talk about your perspective on governance and how it's important to get it all under one umbrella. >> Yeah, so I lead product development for IBM analytics, governance, and integration, and like you said, right, governance has... Every time you talk that, people cringe and you think it's a dirty word, but it's not anymore, right. Especially when you want to tie your AI ladder story, right, there is no AI without information architecture, no AI without IA, and if you think about IA, what does that really mean? It means the foundation of that is data and analytics. Now, let's look deeper, what does that really mean, what is data analytics? Data is coming at us from everywhere, right, and there's records... The data shows there's about 2.5 quintillion bytes of data getting generated every single day, raw data from everywhere. 
How are we going to make sense out of it, right, and from that perspective it is just so important that you understand this type of data, what is the type of data, what's the classification of this means in a business. You know, when you are running your business, there's a lot of cryptic fields out there, what is the business terms assigned to it and what's the lineage of it, where did it come from. If you do have to do any analytics, if data scientists have to do any analytics on it they need to understand where did it actually originated from, can I even trust this data. Trust is really, really important here, right, and is the data clean, what is the quality of this data. The data is coming at us all raw formats from IOT sensors and such. What is the quality of this data? To me, that is the real definition of governance. Right, it's not just about what we used to think about compliance, yes, that's-- >> John: Like rolling a rag. >> Right, right. >> But it's all about being appropriate with all the data you have coming in. >> Exactly, I call it governance 2.0 or governance for insights, because that's what it needs to be all about. Right, compliance, yes indeed, with GDPR and other things coming at us it's important, but I think the most critical is that we have to change the term of governance into, like, this is that foundation for your AI ladder that is going to help us really drive the right insights, that's my perspective. >> I want to double click on that because you're right, I mean, it is kind of governance 2.0. It used to be, you know, Enron forced a lot of, you know, governance and the Federal Rules of Civil Procedure forced a lot of sort of even some artificial governance, and then I think organization, especially public companies and large organizations said, "You know what, we can't just do "this as a band-aid every time." You know, now GDPR, many companies are not ready for GDPR, we know that. 
Having said that, because it is, went through governance 1.0, many companies are not panicked. I mean, they're kind of panicking because May is coming, (laughs) but they've been through this before. >> Madhu: Mm-hm. >> Do you agree with that premise, that they've got at least the skillsets and the professionals to, if they focus, they can get there pretty quickly? >> Yeah, no, I agree with that, but I think our technology and tools needs to change big time here, right, because regulations are coming at us from all different angles. Everybody's looking to cut costs, right? >> Dave: Right. >> You're not going to hire more people to sit there and classify the data and say, "Hey, is this data ready for GDPR," or for Basel or for POPI, like in South Africa. I mean, there's just >> Dave: Yeah. >> Tons of things, right, so I do think the technology needs to change, and that's why, you know, in our governance portfolio, in IBM information server, we have infused machine learning in it, right, >> Dave: Hm. >> Where it's automatically you have machine learning algorithms and models understanding your data, classifying the data. You know, you don't need humans to sit there and assign terms, the business terms to it. We have compliance built into our... It's running actually on machine learning. You can feed in taxonomy for GDPR. It would automatically tag your data in your catalog and say, "Hey, this is personal data, "this is sensitive data, or this data "is needed for these type of compliance," and that's the aspect which I think we need to go focus on >> Dave: Mm-hm. >> So the companies, to your point, don't shrug every time they hear regulations, that it's kind of built in-- >> Right. >> In the DNA, but technologies have to change, the tools have to change. >> So, to me that's good news, if you're saying the technology and the tools is the gap. 
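The auto-tagging Madhu describes, where the catalog itself flags personal and sensitive fields against a GDPR taxonomy, can be sketched in miniature. This is an illustrative stand-in, not the IBM Information Server API; the column names, sample values, patterns, and tag labels are all invented:

```python
import re

# Hypothetical column samples pulled from a data lake (invented data).
COLUMNS = {
    "cust_eml": ["alice@example.com", "bob@example.org"],
    "txn_amt": ["19.99", "250.00"],
    "natl_id": ["812-44-1029", "455-90-2211"],
}

# Simple pattern-based classifiers standing in for the learned models
# the speaker describes; a production catalog would train these instead.
PATTERNS = {
    "personal:email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "sensitive:national_id": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def classify_column(values):
    """Tag a column when most of its sampled values match a known class."""
    for tag, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= 0.8:
            return tag
    return "unclassified"

# The "catalog" ends up with a compliance tag per column, no human labeling.
catalog = {name: classify_column(vals) for name, vals in COLUMNS.items()}
print(catalog)
```

A real system would learn its classifiers from labeled examples rather than hard-code patterns, but the flow is the same: sample values in, compliance tags out.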
You know, we always talk about people, process, and technology the bromide is, but it's true, people and process are the really-- >> Madhu: Mm-hm. >> Hard pieces of it. >> Madhu: Mm-hm. >> Technology comes and goes >> Madhu: Mm-hm. >> And people kind of generally get used to that. So, I'm inferring from your comments that you feel as though governance, there's a value component of governance now >> Yeah, yeah. >> It's not just a negative risk avoidance. It can be a contributor to value. You mentioned the example of classification, which I presume is auto-classification >> Madhu: Yes. >> At the point of use or creation-- >> Madhu: Yes. >> Which has been a real nagging problem for decades, especially after FRCP, Federal Rules of Civil Procedure, where it was like, "Ugh, we can't figure "this out, we'll do email archiving." >> Madhu: Mm-hm. >> You can't do this manually, it's just too much data-- >> Yeah. >> To your point, so I wonder if you could talk a little bit about governance and its contribution to value. >> Yeah, so this is good question. I was just recently visiting some large banks, right, >> Dave: Mm-hm. >> And normally, the governance and compliance has always been an IT job, right? >> Dave: Right. >> And they figure out bunch of products, you know, you can download opensource and do other things to quickly deliver data or insights to their business groups, right, and for business to further figure out new business models and such, right. So, recently what has happened is by doing machine learning into governance, you're making your IT guys the heroes because now they can deliver stuff very quickly, and the business guys are starting to get those insights and their thoughts on data is changing, you know, and recently I was talking with these banks where they're like, "Can you come and talk to "our CFOs because I think the policies," the cultural change you referred to then, maybe the data needs to be owned by businesses. >> Dave: Hm. 
>> No longer an IT thing, right? So, governance I feel like, you know, governance and integration I feel like is a glue which is helping us drive that cultural change in the organizations, bringing IT and the business groups together to further drive the insights. >> So, for years we've been talking about information as a liability or an asset, and for decades it was really viewed as a liability, get rid of it if you can. You have to keep it for seven years, then get rid of it, you know. That started to change, you know, with the big data movement, >> Madhu: Yeah. >> But there was still sort of... It was hard, right, but what I'm hearing now is increasingly, especially of the businesses sort of owning the data, it's becoming viewed as an asset. >> Madhu: Yes. >> You've got to manage the liabilities, we got that, but now how do we use it to drive business value. >> Yeah, yeah, no, exactly, and that's where I think our focus in IBM analytics, with machine learning and automation, and truly driving that insights out of the data. I mean, you know, people... We've been saying data is a natural resource. >> Dave: Mm-hm. >> It's our bloodline, it's this and that. It truly is, you know, and talking to the large enterprises, everybody is in their mode of digital transformation or transforming, right? We in IBM are doing the same things. Right, we're eating our own, drinking our own champagne (laughs). >> John: Not the Kool-Aid. >> You know, yeah, yeah. >> John: Go right to the dog. >> Madhu: Yeah, exactly. >> Dave: No dog smoothie. (laughs) >> Drinking our own champagne, and truly we're seeing transformation in how we're running our own business as well. >> Now what, there are always surprises. There are always some, you know, accidents kind of waiting to happen, but in terms of the IOT, you know, have got these millions, right, of sensors-- >> Madhu: Mm-hm. 
>> You know, feeding data in, and what, from a governance perspective, is maybe a concern about, you know, an unexpected source or an unexpected problem or something where yeah, you have great capabilities, but with those capabilities might come a surprise or two in terms of protecting data, and a machine might provide perhaps a little more insight than you might've expected. So, I mean, just looking down the road from your perspective, you know, is there anything along those lines that you're putting up flags for just to keep an eye on to see what new inputs might create new problems for you? >> Yeah, no, for sure, I mean, we're always looking at how do we further do innovation, how do we disrupt ourselves and make sure that data doesn't become our enemy, right, I mean it's... You know, as we are talking about AI, people are starting to ask a lot of questions about ethics and other things, too, right. So, very critical, so obviously when you focus on governance, the point of that is let's take the manual stuff out, make it much faster, but part of the governance is that we're protecting you, right. That's part of that security and understanding of the data, it's all about that you don't end up in jail. Right, that's the real focus in terms of our technology, in terms of the way we're looking at it. >> So, maybe help our audience a little bit. So, I described at our open, AI is sort of the umbrella and machine learning is the math and the algorithms-- >> Madhu: Yeah. >> That you apply to train systems to do things maybe better than, maybe better than humans can do, and then there's deep learning, which is, you know, neural nets and so forth, but am I understanding that you've essentially... First of all, is that sort of, I know it's rudimentary, but is it reasonable, and then it sounds like you've infused ML into your software. >> Madhu: Yes. 
>> And so I wonder if you could comment on that and then describe from the client's standpoint what skills they need to take advantage of that, if any. >> Oh, yeah, no, so embedding ML into a software, like a packaged software which gets delivered to our client, people don't understand actually how powerful that is, because your data, your catalog, is learning. It's continuously learning from the system itself, from the data itself, right, and that's very exciting. The value to the clients really is it cuts their cost big time. Let me give you an example: in a large organization today, for example, if they have maybe 22,000-some terms, normally it would take them close to six months for one application, with a team of 20, to sit there and assign the right business glossary terms to their data. (laughs) So, by now doing machine learning in our software, we can do this in days, even in hours, obviously depending on the quantity of the data in the organization. That's the value, so the value to the clients is cutting down that. They can take those folks and go focus on some, you know, bigger value-add applications and others and take advantage of that data. 
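The six-months-to-hours speedup Madhu cites comes from automating exactly this kind of matching. Below is a minimal sketch of assigning business glossary terms to cryptic physical column names via abbreviation expansion and word overlap; the abbreviation table, glossary, and column names are hypothetical examples, not IBM's:

```python
# Invented abbreviation dictionary and business glossary for illustration.
ABBREVIATIONS = {"cust": "customer", "acct": "account", "bal": "balance",
                 "dob": "date of birth", "nbr": "number"}

GLOSSARY = ["Customer Account Number", "Account Balance", "Date Of Birth"]

def expand(column_name):
    """Expand a cryptic column name like 'cust_acct_nbr' into plain words."""
    words = []
    for token in column_name.lower().split("_"):
        words.extend(ABBREVIATIONS.get(token, token).split())
    return set(words)

def suggest_term(column_name):
    """Return the glossary term sharing the most words with the column."""
    words = expand(column_name)
    scored = [(len(words & set(term.lower().split())), term) for term in GLOSSARY]
    score, best = max(scored)
    return best if score > 0 else None

print(suggest_term("cust_acct_nbr"))
```

The point is not the scoring trick, which a shipping product would replace with trained models, but that the per-column work a team of 20 did by hand becomes a function call.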
>> Broader question, one of the good things about things like GDPR is it forces, it puts a deadline on there and we all know, "Give me a deadline and I'll hit it," so it sort of forces action. >> Madhu: Mm-hm. >> And that's good, we've talked about the value that you can bring to an organization from a data perspective, but there's a whole non-governance component of data orientation. How do you see that going, can the governance initiatives catalyze sort of what I would call a... You know, people talk about a data driven organization. Most companies, they may say they are data driven but they're really not foundational. >> Mm-hm. >> Can governance initiatives catalyze that transformation to a data driven organization, and if so, how? >> Yeah, no, absolutely, right. So, the example I was sharing earlier with talking to some of the large financial institutes, where the business guys, you know, outside of IT are talking about how important it is for them to get the data really real time, right, and self-service. They don't want to be dependent on either opening a work ticket for somebody in IT to produce data for them and god forbid if somebody's out on vacation they can never get that. >> Dave: Right. >> We don't live in that world anymore, right. It's online, it's real time, it's all, you know, self-service type of aspects, which the business, the data scientists building new analytic models are looking for that. So, for that, data is the key, key core foundation in governance. The way I explained it earlier, it's not just about compliance. That is going to lead to that transformation for every client, it's the core. They will not be successful without that. >> And the attributes are changing. Not only is it self-service, it's pervasive-- >> Madhu: Yeah. >> It's embedded, it's aware, it's anticipatory. Am I overstating that? >> Madhu: No. >> I mean, is the data going to find me? 
>> Yeah, you know, (laughs) that's a good way to put it, you know, so no, you're at the, I think you got it. This is absolutely the right focus, and the companies and the enterprises who understand this and use the right technology to fix it that they'll win. >> So, if you have a partner that maybe, if it is contextual, I mean... >> Dave: Yeah. >> So, also make it relevant-- >> Madhu: Yes. >> To me and help me understand its relevance-- >> Madhu: Yes. >> Because maybe as a, I hate to say as a human-- >> Madhu: Yes. >> That maybe just don't have that kind of prism, but can that, does that happen as well, too? >> Madhu: Yeah, no. >> John: It can put up these white flags and say, "Yeah, this is what you need." >> Yeah, no, absolutely, so like the focus we have on our natural language processing, for example, right. If you're looking for something you don't have to always know what your SQL is going to be for a query to do it. You just type in, "Hey, I'm looking for "some customer retention data," you know, and it will go out and figure it out and say, "Hey, are you looking for churn analysis "or are you looking to do some more promotions?" It will learn, you know, and that's where this whole aspect of machine learning and natural language processing is going to give you that contextual aspect of it, because that's how the self-service models will work. >> Right, what about skills, John asked me at the open about skillsets and I want to ask a general question, but then specifically about governance. I would make the assertion that most employees don't have the multidimensional digital skills and domain expertise skills today. >> Yeah. >> Some companies they do, the big data companies, but in governance, because it's 2.0, do you feel like the skills are largely there to take advantage of the innovations that IBM is coming out with? 
I think, generally, my personal opinion is the way the technology's moving, the way we are getting driven by a lot of disruptions which are happening around us, I think we don't have the right skills out there, right. We all have to retool, I'm sure all of us in our career have done this all the time. You know, so (laughs) to me, I don't think we have it. So, building the right tools, the right technologies and enabling the teams out there to retool themselves so they can actually focus on innovation in their own enterprises is going to be critical, and that's why I really think the more burden we can take off from the IT groups, the more we can make them smarter and have them do their work faster. It will help give them that time to go see, hey, what's the next big disruption in their organization. >> Is it fair to say that traditionally governance has been a very people-intensive activity? >> Mm-hm. >> Will governance, you know, in the next, let's say decade, become essentially automated? >> That's my desire, and with the product-- >> Dave: That's your job. >> That's my job, and I'm actually really proud of what we have done thus far and where we are heading. So, next time when we meet we will be talking maybe governance 3.0, I don't know, right. (laughs) Yeah, that's the thing, right? I mean, I think you hit it on the nail, that this is, we got to take a lot of human-intensive stuff out of our products, and the more automation we can do, the more smarts we can build in. I coined this term like, hey, we've got to build smarter metadata, right? >> Dave: Right. >> Metadata is all about data about your data, right? That needs to become smarter: think about having a universe where you don't have to sit there and connect the dots and say, "I want to move from here to there." The system already knows it, it understands certain behaviors, it knows what your applications are going to do and it kind of automatically does it for you. 
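The "connect the dots" behavior Madhu describes for smarter metadata can be illustrated with a toy lineage graph: the catalog records which job reads which dataset and writes which, and the system itself traces the path from "here to there." The dataset and job names below are invented, and a production catalog would hold far richer metadata:

```python
from collections import deque

# Hypothetical lineage edges: each key flows into the listed targets.
LINEAGE = {
    "crm.customers": ["etl.cleanse_customers"],
    "etl.cleanse_customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["mart.churn_features"],
    "pos.transactions": ["warehouse.fact_sales"],
}

def find_path(source, target):
    """Breadth-first search over lineage edges, so nobody has to
    trace the hops from source system to data mart by hand."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in LINEAGE.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no lineage connects the two

print(find_path("crm.customers", "mart.churn_features"))
```

The same traversal answers the trust question raised earlier in the conversation: a data scientist asking "where did this come from, can I trust it" is asking for exactly this path.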
No more science fiction, I think it can happen. (laughs) >> Do you think we'll ever have more metadata than data... (laughs) >> Actually, somebody did ask me that question: as we're building data lakes, what do we do about metadata? No, I think we will not have that problem for a while, we'll make it smarter. >> Dave: Going too fast, right. >> You're right. >> But it is, it's like working within your workforce and you're telling people, you know, "You're a treasure hunter and we're going to give you a better map." >> Madhu: Yeah. >> So, governance is your better map, so trust me. >> Madhu: Hey, I like that, maybe I'll use it next time. >> Yeah, but it's true, it's like are you saying governance is your friend here-- >> Madhu: Yes. >> And we're going to fine-tune your search, we're going to make you a more efficient employee, we're going to make you a smarter person and you're going to be able to contribute in a much better way, but it's almost enforced, but let it be your friend, not your foe. >> Yes, yeah, be your differentiator, right. >> But my takeaway is it's fundamental, it's embedded. You know, you're doing this now with less thinking. Security's got to get to the same place, but for years security was, "Ugh, it slows me down," and now people are like, "Help me," right? >> Madhu: Mm-hm. >> And I think the same dynamic is true here, embedded governance in my business. Not a bolt-on, not an afterthought. It's fundamental and foundational to my organization. >> Madhu: Yeah, absolutely. >> Well, Madhu, thank you for the time. We mentioned at the outset of the interview that if you want to say hi to your kids, that's your camera right there. Do you want to say hi to your kids real quick? >> Yeah, hi Mohed, Kepa, I love you so much. (laughs) >> All right. >> Thank you. >> So, they know where mom is. (laughs) New York City at IBM's Machine Learning Everywhere, Build Your Ladder To AI. Thank you for joining us, Madhu Kochar. >> Thank you, thank you. 
>> Back with more here from New York in just a bit, you're watching theCUBE. (techy music playing)

Published Date : Feb 27 2018



Rob Thomas, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI, brought to you by IBM. >> Welcome back to New York City. theCUBE continues our coverage here at IBM's event, Machine Learning Everywhere: Build Your Ladder to AI. And with us now is Rob Thomas, who is the vice president of, or general manager, rather, of IBM Analytics. Sorry about that, Rob. Good to have you with us this morning. Good to see you, sir. >> Great to see you, John. Dave, great to see you as well. >> Great to see you. >> Well, let's just talk about the event first. Great lineup of guests. We're looking forward to visiting with several of them here on theCUBE today. But let's talk about, first off, the general theme of what you're trying to communicate and where you sit in terms of that ladder to success in the AI world. >> So, maybe start by stepping back: we saw you guys a few times last year. Once in Munich, I recall, another one in New York, and the theme of both of those events was the data science renaissance. We started to see data science picking up steam in organizations. We also talked about machine learning. The great news is that, in that timeframe, machine learning has really become a real thing in terms of actually being implemented into organizations, and changing how companies run. And that's what today is about: basically showcasing a bunch of examples, not only from our clients, but also from within IBM, of how we're using machine learning to run our own business. And the thing I always remind clients when I talk to them is, machine learning is not going to replace managers, but I think managers that use machine learning will replace managers that do not. And what you see today is a bunch of examples of how that's true, because it gives you superpowers. If you've automated a lot of the insight, data collection, decision making, it makes you a more powerful manager, and that's going to change a lot of enterprises. 
>> It seems like a no-brainer, right? I mean, or a must-have. >> I think there's a, there's always that, sometimes there's a fear factor. There is a culture piece that holds people back. We're trying to make it really simple in terms of how we talk about the day, and the examples that we show, to get people comfortable, to kind of take a step onto that ladder back to the company. >> It's conceptually a no-brainer, but it's a challenge. You wrote a blog and it was really interesting. It was, one of the clients said to you, "I'm so glad I'm not in the technology industry." And you went, "Uh, hello?" (laughs) "I've got news for you, you are in the technology industry." So a lot of customers that I talk to feel like, meh, you know, in our industry, it's really not getting disrupted. That's kind of taxis and retail. We're in banking and, you know, but, digital is disrupting every industry and every industry is going to have to adopt ML, AI, whatever you want to call it. Can traditional companies close that gap? What's your take? >> I think they can, but, I'll go back to the word I used before, it starts with culture. Am I accepting that I'm a technology company, even if traditionally I've made tractors, as an example? Or if traditionally I've just been you know, selling shirts and shoes, have I embraced the role, my role as a technology company? Because if you set that culture from the top, everything else flows from there. It can't be, IT is something that we do on the side. It has to be a culture of, it's fundamental to what we do as a company. There was an MIT study that said, data-driven cultures drive productivity gains of six to 10 percent better than their competition. You can't, that stuff compounds, too. So if your competitors are doing that and you're not, not only do you fall behind in the short term but you fall woefully behind in the medium term. And so, I think companies are starting to get there but it takes a constant push to get them focused on that. 
>> So if you're a tractor company, you've got human expertise around making tractors and messaging and marketing tractors, and then, and data is kind of there, sort of a bolt-on, because everybody's got to be data-driven, but if you look at the top companies by market cap, you know, we were talking about it earlier. Data is foundational. It's at their core, so, that seems to me to be the hard part, Rob, I'd like you to comment in terms of that cultural shift. How do you go from sort of data in silos and, you know, not having cloud economics and, that are fundamental, to having that dynamic, and how does IBM help? >> You know, I think, to give companies credit, I think most organizations have developed some type of data practice or discipline over the last, call it five years. But most of that's historical, meaning, yeah, we'll take snapshots of history. We'll use that to guide decision making. You fast-forward to what we're talking about today, just so we're on the same page, machine learning is about, you build a model, you train a model with data, and then as new data flows in, your model is constantly updating. So your ability to make decisions improves over time. That's very different from, we're doing historical reporting on data. And so I think it's encouraging that companies have kind of embraced that data discipline in the last five years, but what we're talking about today is a big next step and what we're trying to break it down to what I call the building blocks, so, back to the point on an AI ladder, what I mean by an AI ladder is, you can't do AI without machine learning. You can't do machine learning without analytics. You can't do analytics without the right data architecture. So those become the building blocks of how you get towards a future of AI. And so what I encourage companies is, if you're not ready for that AI leading edge use case, that's okay, but you can be preparing for that future now. That's what the building blocks are about. 
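Rob's working definition of machine learning above, a model trained on data whose "ability to make decisions improves over time" as new data flows in, rather than historical batch reporting, can be sketched with a minimal online learner. This is an illustrative pure-Python sketch, not IBM's tooling: a linear model updated one observation at a time by stochastic gradient descent, so its predictions improve as the data streams in.

```python
# Minimal online-learning sketch: instead of retraining on historical
# snapshots, the model takes one small gradient step per new observation.

class OnlineLinearModel:
    def __init__(self, n_features: int, lr: float = 0.05):
        self.w = [0.0] * n_features  # weights, updated as data arrives
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate for each SGD step

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        """One SGD step on squared error as a new (x, y) pair flows in."""
        err = self.predict(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err

model = OnlineLinearModel(n_features=1)
# Simulate a data stream drawn from y = 2x + 1; the model is never
# retrained in batch, it just keeps updating and gets steadily better.
for _ in range(200):
    for x in [0.0, 0.5, 1.0, 1.5, 2.0]:
        model.update([x], 2 * x + 1)
```

After consuming the stream, the model's weights approximate the underlying relationship, which is the contrast Rob draws with static historical reporting: the same object keeps improving as data keeps arriving.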
>> You know, I think we've got, you know, Jeremiah Owyang on a little bit later, but I was reading something that he had written about gut and instinct, from the C-Suite, and how that's how companies were run, right? You had your CEO, your president, they made decisions based on their guts or their instincts. And now, you've got this whole new objective tool out there that's gold, and it's kind of taking some of the gut and instinct out of it, in a way, and maybe there are people who still can't quite grasp that, that maybe their guts and their instincts, you know, what their gut tells them, you know, is one thing, but there's pretty objective data that might indicate something else. >> Moneyball for business. >> A little bit of a clash, I mean, is there a little bit of a clash in that respect? >> I think you'd be surprised by how much decision making is still pure opinion. I mean, I see that everywhere. But we're heading more towards what you described for sure. One of the clients talking here today, AMC Networks, I think is a great example of a company that you wouldn't think of as a technology company, primarily a content producer, they make great shows, but they've kind of gone that extra step to say, we can integrate data sources from third parties, our own data about viewer habits, we can do that to change our relationship with advertisers. Like, that's a company that's really embraced this idea of being a technology company, and you can see it in their results, and so, results are not a coincidence in this world anymore. It's about a practice applied to data, leveraging machine learning, on a path towards AI. If companies are doing that, they're going to be successful. >> And we're going to have Vitaly from AMC on, but so there's a situation where they have embraced it, they've dealt with that culture, and data has become foundational. Now, I'm interested as to what their journey looked like. What are you seeing with clients? 
How do they break down the silos of data that have been built up over decades? >> I think of it almost like a maturity curve. The rule I talk about is 40-40-20, where 40% of organizations are really using data just to optimize costs right now. That's okay, but that's on the lower end of the maturity curve. 40% are saying, all right, I'm starting to get into data science. I'm starting to think about how I extend to new products, new services, using data. And then 20% are on the leading edge. And that's where I'd put AMC Networks, by the way, because they've done unique things with integrating data sets and building models so that they've automated a lot of what used to be painstakingly long processes, internal processes to do it. So you've got this 40-40-20 of organizations in terms of their maturity on this. If you're not on that curve right now, you have a problem. But I'd say most are somewhere on that curve. If you're in the first 40% and, right now, data for you is just about optimizing cost, you're going to be behind. If you're not behind right now, you're going to be behind in the next year, and that's a problem. So I'd kind of encourage people to think about what it takes to be in the next 40%. Ultimately you want to be in the 20% that's actually leading this transformation. 
Some have not yet, the client that you talked to. Do you see, technology companies generally, Silicon Valley companies specifically, as being able to pull off a sort of disruption of not only technologies but also industries and where does IBM play there? You've made a sort of, Ginni in particular has made a deal about, hey, we're not going to compete with our customers. So talking about this sort of dual disruption agenda, one on the technology side, one within industries that Apple's getting into financial services and, you know, Amazon getting into grocery, what's your take on that and where does IBM fit in that world? >> So, I mean, IBM has been in Silicon Valley for a long time, I would say probably longer than 99.9% of the companies in Silicon Valley, so, we've got a big lab there. We do a lot of innovation out of there. So love it, I mean, the culture of the valley is great for the world because it's all about being the challenger, it's about innovation, and that's tremendous. >> No fear. >> Yeah, absolutely. So, look, we work with a lot of different partners, some who are, you know, purely based in the valley. I think they challenge us. We can learn from them, and that's great. I think the one, the one misnomer that I see right now, is there's a undertone that innovation is happening in Silicon Valley and only in Silicon Valley. And I think that's a myth. Give you an example, we just, in December, we released something called Event Store which is basically our stab at reinventing the database business that's been pretty much the same for the last 30 to 40 years. And we're now ingesting millions of rows of data a second. We're doing it in a Parquet format using a Spark engine. Like, this is an amazing innovation that will change how any type of IOT use case can manage data. Now ... people don't think of IBM when they think about innovations like that because it's not the only thing we talk about. 
We don't have, the IBM website isn't dedicated to that single product because IBM is a much bigger company than that. But we're innovating like crazy. A lot of that is out of what we're doing in Silicon Valley and our labs around the world and so, I'm very optimistic on what we're doing in terms of innovation. >> Yeah, in fact, I think, rephrase my question. I was, you know, you're right. I mean people think of IBM as getting disrupted. I wasn't posing it, I think of you as a disruptor. I know that may sound weird to some people but in the sense that you guys made some huge bets with things like Watson on solving some of the biggest, world's problems. And so I see you as disrupting sort of, maybe yourselves. Okay, frame that. But I don't see IBM as saying, okay, we are going to now disrupt healthcare, disrupt financial services, rather we are going to help our, like some of your comp... I don't know if you'd call them competitors. Amazon, as they say, getting into content and buying grocery, you know, food stores. You guys seems to have a different philosophy. That's what I'm trying to get to is, we're going to disrupt ourselves, okay, fine. But we're not going to go hard into healthcare, hard into financial services, other than selling technology and services to those organizations, does that make sense? >> Yeah, I mean, look, our mission is to make our clients ... better at what they do. That's our mission, we want to be essential in terms of their journey to be successful in their industry. So frankly, I love it every time I see an announcement about Amazon entering another vertical space, because all of those companies just became my clients. Because they're not going to work with Amazon when they're competing with them head to head, day in, day out, so I love that. 
So us working with these companies to make them better through things like Watson Health, what we're doing in healthcare, it's about making companies who have built their business in healthcare, more effective at how they perform, how they drive results, revenue, ROI for their investors. That's what we do, that's what IBM has always done. >> Yeah, so it's an interesting discussion. I mean, I tend to agree. I think Silicon Valley maybe should focus on those technology disruptions. I think that they'll have a hard time pulling off that dual disruption and maybe if you broadly define Silicon Valley as Seattle and so forth, but, but it seems like that formula has worked for decades, and will continue to work. Other thoughts on sort of the progression of ML, how it gets into organizations. You know, where you see this going, again, I was saying earlier, the parlance is changing. Big data is kind of, you know, mm. Okay, Hadoop, well, that's fine. We seem to be entering this new world that's pervasive, it's embedded, it's intelligent, it's autonomous, it's self-healing, it's all these things that, you know, we aspire to. We're now back in the early innings. We're late innings of big data, that's kind of ... But early innings of this new era, what are your thoughts on that? >> You know, I'd say the biggest restriction right now I see, we talked before about somehow, sometimes companies don't have the desire, so we have to help create the desire, create the culture to go do this. Even for the companies that have a burning desire, the issue quickly becomes a skill gap. And so we're doing a lot to try to help bridge that skill gap. Let's take data science as an example. There's two worlds of data science that I would describe. There's clickers, and there's coders. Clickers want to do drag and drop. They will use traditional tools like SPSS, which we're modernizing, that's great. We want to support them if that's how they want to work and build models and deploy models. 
>> There's also this world of coders. These are people who want to do all their data science in ML, and Python, and Scala, and R, like, that's what they want to do. And so we're supporting them through things like Data Science Experience, which is built on Jupyter. It's all open-source tooling, it's designed for coders. The reason I think that's important, it goes back to the point on skill sets. There is a skill gap in most companies. So if you walk in and you say, this is the only way to do this thing, you've kind of excluded half the companies because they say, I can't play in that world. So we are intentionally going after a strategy that says, there's a segmentation in skill types. In places where there's a gap, we can help you fill that gap. That's how we're thinking about them. 
>> So your advice, then, as you're talking to your clients, I mean you're also talking to their workforce. In a sense, then, your advice to them is, you know, join, jump in the wave, right? You've got your, you can't straddle, you've got to go. >> And you've got to experiment, you've got to try things. Ultimately, organizations are going to gravitate to things that they like using in terms of an approach or a methodology or a tool. But that comes with experimentation, so people need to get out there and try something. >> Maybe we could talk about developers a little bit. We were talking to Dinesh earlier and you guys of course have focused on data scientists, data engineers, obviously developers. And Dinesh was saying, look, many, if not most, of the 10 million Java developers out there, they're not, like, focused around the data. That's really the data scientist's job. But then, my colleague John Furrier says, hey, data is the new development kit. You know, somebody said recently, you know, Andreessen's comment, "software is eating the world." Well, data is eating software. So if Furrier is right and that comment is right, it seems like developers increasingly have to become more data aware, fundamentally. Blockchain developers clearly are more data focused. What's your take on the developer community, where they fit into this whole AI, machine learning space? >> I was just in Las Vegas yesterday and I did a session with a bunch of our business partners. ISVs, so software companies, mostly a developer audience, and the discussion I had with them was around, you're doing, you're building great products, you're building great applications. But your product is only as good as the data and the intelligence that you embed in your product. Because you're still putting too much of a burden on the user, as opposed to having everything happen magically, if you will. 
So that discussion was around, how do you embed data, embed AI, into your products and do that at the forefront versus, you deliver a product and the client has to say, all right, now I need to get my data out of this application and move it somewhere else so I can do the data science that I want to do. That's what I see happening with developers. It's kind of ... getting them to think about data as opposed to just thinking about the application development framework, because that's where most of them tend to focus. >> Mm, right. >> Well, we've talked about, well, earlier on about the governance, so just curious, with Madhu, which I'll, we'll have that interview in just a little bit here. I'm kind of curious about your take on that, is that it's a little kinder, gentler, friendlier than maybe some might look at it nowadays because of some organization that it causes, within your group and some value that's being derived from that, that more efficiency, more contextual information that's, you know, more relevant, whatever. When you talk to your clients about meeting rules, regs, GDPR, all these things, how do you get them to see that it's not a black veil of doom and gloom but it really is, really more of an opportunity for them to cash in? >> You know, my favorite question to ask when I go visit clients is I say, I say, just show of hands, how many people have all the data they need to do their job? To date, nobody has ever raised their hand. >> Not too many hands up. >> The reason I phrased it that way is, that's fundamentally a governance challenge. And so, when you think about governance, I think everybody immediately thinks about compliance, GDPR, types of things you mentioned, and that's great. But there's two use cases for governance. One is compliance, the other one is self service analytics. 
Because if you've done data governance, then you can make your data available to everybody in the organization because you know you've got the right rules, the right permissions set up. That will change how people do their jobs and I think sometimes governance gets painted into a compliance corner, when organizations need to think about it as, this is about making data accessible to my entire workforce. That's a big change. I don't think anybody has that today. Except for the clients that we're working with, where I think we've made good strides in that. >> What's your sort of number one, two, and three, or pick one, advice for those companies that as you blogged about, don't realize yet that they're in the software business and the technology business? For them to close the ... machine intelligence, machine learning, AI gap, where should they start? >> I do think it can be basic steps. And the reason I say that is, if you go to a company that hasn't really viewed themselves as a technology company, and you start talking about machine intelligence, AI, like, everybody like, runs away scared, like it's not interesting. So I bring it back to building blocks. For a client to be great in data, and to become a technology company, you really need three platforms for how you think about data. You need a platform for how you manage your data, so think of it as data management. You need a platform for unified governance and integration, and you need a platform for data science and business analytics. And to some extent, I don't care where you start, but you've got to start with one of those. And if you do that, you know, you'll start to create a flywheel of momentum where you'll get some small successes. Then you can go in the other area, and so I just encourage everybody, start down that path. Pick one of the three. Or you may already have something going in one of them, so then pick one where you don't have something going. 
Just start down the path, because, those building blocks, once you have those in place, you'll be able to scale AI and ML in the future in your organization. But without that, you're going to always be limited to kind of a use case at a time. >> Yeah, and I would add, this is, you talked about it a couple times today, is that cultural aspect, that realization that in order to be data driven, you know, buzzword, you have to embrace that and drive that through the culture. Right? >> That starts at the top, right? Which is, it's not, you know, it's not normal to have a culture of, we're going to experiment, we're going to try things, half of them may not work. And so, it starts at the top in terms of how you set the tone and set that culture. >> IBM Think, we're less than a month away. CUBE is going to be there, very excited about that. First time that you guys have done Think. You've consolidated all your big, big events. What can we expect from you guys? >> I think it's going to be an amazing show. To your point, we thought about this for a while, consolidating to a single IBM event. There's no question just based on the response and the enrollment we have so far, that was the right answer. We'll have people from all over the world. A bunch of clients, we've got some great announcements that will come out that week. And for clients that are thinking about coming, honestly the best thing about it is all the education and training. We basically build a curriculum, and think of it as a curriculum around, how do we make our clients more effective at competing with the Amazons of the world, back to the other point. And so I think we build a great curriculum and it will be a great week. >> Well, if I've heard anything today, it's about, don't be afraid to dive in at the deep end, just dive, right? Get after it and, looking forward to the rest of the day. Rob, thank you for joining us here and we'll see you in about a month! >> Sounds great. >> Right around the corner. 
>> All right, Rob Thomas joining us here from IBM Analytics, the GM at IBM Analytics. Back with more here on theCUBE. (upbeat music)


SUMMARY :

Build Your Ladder to AI, brought to you by IBM. Good to have you with us this morning. Dave, great to see you as well. and where you sit in terms of that ladder And what you see today is a bunch of examples I mean, or a must-have. onto that ladder back to the company. So a lot of customers that I talk to And so, I think companies are starting to get there to be the hard part, Rob, I'd like you to comment You fast-forward to what we're talking about today, and it's kind of taking some of the gut But we're heading more towards what you described for sure. Now, I'm interested as to what their journey look like. to think about what it takes to be in the next 40%. That's where you want it to go, right? I want to ask you a question. So love it, I mean, the culture of the valley for the last 30 to 40 years. but in the sense that you guys made some huge bets in terms of their journey to be successful Big data is kind of, you know, mm. create the culture to go do this. The coder audience is all the Millennials. for that group in the middle to say, you know, you know, join, jump in the wave, right? so people need to get out there and try something. and you guys of course have focused on data scientists, that you embed in your product. When you talk to your clients about have all the data they need to do their job? And so, when you think about governance, and the technology business? And to some extent, I don't care where you start, that in order to be data driven, you know, buzzword, Which is, it's not, you know, it's not normal CUBE is going to be there, very excited about that. I think it's going to be an amazing show. and we'll see you in about a month! from IBM Analytics, the GM at IBM Analytics.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Amazon | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
December | DATE | 0.99+
Rob Thomas | PERSON | 0.99+
New York | LOCATION | 0.99+
Dinesh | PERSON | 0.99+
AMC Networks | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Jeremiah Owyang | PERSON | 0.99+
Silicon Valley | LOCATION | 0.99+
Rob | PERSON | 0.99+
20 years | QUANTITY | 0.99+
Dave | PERSON | 0.99+
Munich | LOCATION | 0.99+
IBM Analytics | ORGANIZATION | 0.99+
Las Vegas | LOCATION | 0.99+
MIT | ORGANIZATION | 0.99+
10 million | QUANTITY | 0.99+
Apple | ORGANIZATION | 0.99+
20% | QUANTITY | 0.99+
last year | DATE | 0.99+
Furrier | PERSON | 0.99+
AMC | ORGANIZATION | 0.99+
One | QUANTITY | 0.99+
yesterday | DATE | 0.99+
six | QUANTITY | 0.99+
New York City | LOCATION | 0.99+
GDPR | TITLE | 0.99+
40% | QUANTITY | 0.99+
both | QUANTITY | 0.99+
three | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Seattle | LOCATION | 0.99+
Scala | TITLE | 0.99+
two use cases | QUANTITY | 0.99+
today | DATE | 0.99+
Python | TITLE | 0.98+
Andreessen | PERSON | 0.98+
both sides | QUANTITY | 0.98+
two | QUANTITY | 0.98+
Watson Health | ORGANIZATION | 0.98+
millions of rows | QUANTITY | 0.98+
five years | QUANTITY | 0.97+
next year | DATE | 0.97+
less than a month | QUANTITY | 0.97+
Madhu | PERSON | 0.97+
Amazons | ORGANIZATION | 0.96+

Vitaly Tsivin, AMC | Machine Learning Everywhere 2018


 

>> Voiceover: Live from New York it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI. Brought to you by IBM. (upbeat techno music) >> Welcome back to New York City as theCUBE continues our coverage here at IBM's Machine Learning Everywhere: Build Your Ladder to AI. Along with Dave Vellante, I'm John Walls. We're now joined by Vitaly Tsivin, who is Executive Vice President at AMC Networks. And Vitaly, thanks for joining us here this morning. >> Thank you. >> I don't know how this interview is going to go, frankly. Because we've got a die-hard Yankee fan in our guest, and a Red Sox fan who bleeds Red Sox Nation. Can you guys get along for about 15 minutes? >> Dave: Maybe about 15. >> I'm glad there's a bit of space between us. >> Dave: It's given us the off-season and the Yankees have done so well. I'll be humble. Okay? (John laughs) We'll wait and see. >> All right. Just in case, I'm ready to jump in if we have to separate here. But it is good to have you here with us this morning. Thanks for making the time. First off, talk about AMC Networks a little bit. So, five U.S. networks. You said multiple international networks and great presence there. But you've had to make this transition to becoming a data company, in essence. You have content and you're merging it with data. How has that gone for you? And how have you done that? >> First of all, you make me happy when you say that AMC Networks has made a transition to be a data company. So, we haven't. We are using data to help our primary business, which is obviously broadcasting our content to our viewers. But yes, we use data to help to tune our business, to follow the lead that viewers are giving us. As you can imagine, in the last so many years, viewers have actually been dictating how they want to watch. Whether it's streaming video rather than just turning their satellite boxes or TV boxes on, and pretty much dictating what content they want to watch. 
So, we have to follow, we have to adjust and be at the cutting edge for all of our business. And this is where data comes into play. >> How did you get there? You must have done a lot of testing, right? I mean, I remember when binge watching didn't even exist, and then all of a sudden now everybody drops 10 episodes at once. Was that a lot of A-B testing? Just analyzing data? How does a company like yours come to that realization? Or is it just, wow, the competition is doing it, we should too. Explain how -- >> Vitaly: Interesting. So, when I speak to executives, I always tell them that business intelligence and data analytics for any company is almost like an iceberg. So, you can actually see the top of it, and you enjoy it very much, but there's so much underwater. So, that's what you're referring to, which is that in order to be able to deliver that premium thing that's the tip of the iceberg, we have to have state of the art data management platforms. We have to curate our own first-party data. We have to acquire meaningful third-party data. We have to mingle it all together. We have to employ optimization and predictive algorithms on top of that. We have to employ statistics, and arm the business with data-driven decisions. And then it all comes to fruition. >> Now, your company's been around for a while. You've got an application -- You're a developer. You're an application development executive. So, you've sort of made your personal journey. I'm curious as to how the company made its journey. How did you close that gap between the data platforms that we all know, the Googles, the Facebooks, etc., for which data is the central part of the organization, to where you used to be? Which probably was building, looking back, doing a lot of business intelligence, decision support, and a lot of sort of asynchronous activities. How did you get from there to where you are today? >> Makes sense. So, I've been with AMC Networks for four years. 
Prior to that I'd been with Disney, ABC, ESPN for four to six years, doing roughly the same thing. So, number one, we're utilizing ever rapidly changing technologies to get us to the right place. Number two is, during those four years with AMC, we've employed various tactics. Some of them are called data democratization. So, that's actually not only getting the right data sources and processing them correctly, but actually arming everyone in the company with immediate, easy access to this data. Because the entire business, the data business, is all about insights. So, the insights -- And if you think of the business, if you for a minute separate business and business intelligence, then business doesn't want to know too much about business intelligence. What they want is insights on a silver plate that will tell them what to do next. Now, that's the hardest thing, you can imagine, right? And so the search and drive for those insights has to come from every business person in the organization. Now, obviously, you don't expect them to build their own statistical algorithms, see the results, and employ machine learning themselves. But if you arm them with that data at the tip of their fingers, they'll make many better decisions on a daily basis, which means that they're actually coming up with their own small insights. So, there are small insights, big insights, and they're all extremely valuable. >> A big part of that is cultural as well, that mindset. Many companies that I work with, their data is very siloed. I don't know if that was the case with your firm, maybe less prior to your joining. I'd be curious as to how you've achieved that cultural mindset shift. Cause a lot of times, people try to keep their own data. They don't want to share it. They want to keep it in a silo, gain political power. How did you address that? >> Vitaly: Absolutely. 
In one of my conversations with the president, we were discussing the fact that if you could go back in time three years and show people how they would be talking about data in their organization today, they would be shocked. They wouldn't believe it. So, absolutely. So, culturally, educationally, bringing everyone into the place where they can understand data. They can take advantage of the data. It's an undertaking. But we are successful in doing that. >> Help me out here. Maybe I just need a little translation here, or simplification. So, you think about AMC. You've got programming. You've got your line up. I come on, I click, I go, I watch a movie and I enjoy it, or watch my program, whatever. So, now in this new world of viewer habits changing, my behaviors are changing. What have you done? What have you looked for in terms of data, and what is it telling you about me, that has now allowed you to modify your business and adapt to that? So, I mean, how does data drive that on a day-to-day basis in terms of how I access your programming? >> So, a good example of that would be something we call TV Everywhere. So, you said it yourself, obviously users or viewers are used to watching television when the shows were provided via television. So, with new technologies, with streaming opportunities, today they want to watch when they want to watch, and what they want to watch. So, one of the ways we accommodate them with that is that we don't do just television; we are on every available platform today and we are allowing viewers to watch our content on demand, digitally, when they want to watch it. So, that is one of the ways we are reacting to it. And so, that puts us in the position of one of the B-to-C type of businesses, where we're now speaking directly to our consumers, not just via the television. 
So, we're broadcasting, they're watching, which means that we understand how they watch and we try to react accordingly. Which is something that Netflix brags about, that they know the patterns, and they actually kind of promote their business on that, so we're in that business too. >> Can you describe your innovation formula, if you will? How do you go about innovating? Obviously, there's data, there's technology. Presumably, there's infrastructure that scales. You have to be able to scale and have massive speed and infrastructure that heals itself. All those other things. But what's your innovation formula? How would you describe it? >> So, the formula is simple. It starts with business. I'm fortunate that the business has a desire to innovate. So, formulating goals is something that drives us to respond to it. So, we don't just walk around the thing, and look around and say, "Let's innovate." So, we follow the business goals with innovation. A good example is when we promote our shows. So, the major portion of our marketing campaigns falls on our own air. So, we promote our shows to our AMC viewers or WE tv viewers. When we do that, we try to optimize our campaigns to the highest level possible, to get the most ROI out of that. And so, we've succeeded, and we managed today to get about 30% ROI on that, which lets us either just do better with our promotional campaigns or reallocate that time for other businesses. >> You were saying that after the first question, or during responding to the first question, about you saying we're really not ... We're a content company still. And we have incorporated data, but you really aren't, Dave and I have talked about this a lot, everybody's a data company now, in a way. Because you have to be. Cause you've got this hugely competitive landscape that you're operating in, right? In terms of getting more eyeballs. >> That's right. >> So, it's got to be no longer just a part of what you do or a section of what you do. 
It's got to be embedded in what you do. Does it not? >> Oh, it absolutely is. I still think that it's a bit premature to call AMC Networks a data company. But to a degree, every company today is a data company. And with the culture change over the years, where I used to solicit requests and go about implementing them, today it's more of a prioritization of work, because every department in the company got educated to the degree that they all want to get better. And they all want those insights from the data. They want their parts of the business to be improved. And we're venturing into new businesses. And it's quite a bit in demand. >> So, is it your aspiration to become a data company? Or is it more a data-driven sort of TV network? How would you sort of view that? >> I'd like to say data-driven TV network. Of course. >> Dave: Okay. >> It's more in tune with reality. >> And so, talk about aligning with the business goals. That's kind of your starting point. You were talking earlier about a gut feel. We were joking about baseball. Moneyball for business. So, you're a data person. The data doesn't lie, etc. But insights sometimes are hard. They don't just pop out. Is that true? Do you see that changing, the time to insight, and from insight to decision, going to compress? What do you see there? >> The search for insights will never stop. And the further along we are in that journey, the better we are going to be as a company. The data business depends so much on technologies. So, when technologies mature, and we manage to employ them on a timely basis, we simply get better from that. So, a good example is machine learning. There are a ton of optimizations, optimization algorithms, forecasting algorithms that we put in place. So, for a while that was the pinnacle of our deliveries. Now, machine learning is maturing today. We are able, or trying, to be in tune with the audience that is changing its behavior. 
So, the patterns that we would be looking for manually in the past, the machine is now looking for those patterns. So, that's the perfect example of our strengths catching up with reality. What I'm hoping for, and that's where the future is, is that one day we won't just be reacting, using machine learning, to changes in behavior patterns. We are actually going to be ahead of those patterns and anticipate those changes to come, and react properly. >> I was going to say, yeah, what is the next step? Because you said that you are reacting. >> Vitaly: I was ahead of your question. >> Yeah, you were. (laughter) So, I'm going to go ahead and re-ask it. >> Dave: Data guy. (laughter) >> But you've got to get to that next step of not just anticipating but almost creating, right, in your way. Creating new opportunities, creating new data to develop these insights, into almost shaping viewer behavior, right? >> Vitaly: Totally. So, like I said, optimization is one avenue that we pursue and continue to pursue. Forecasting is another. But I'm talking about true predictability. I mean, something that goes beyond just saying how our show will do. Even beyond which show would do better. >> John: Can you do that? Even to the point of saying these are the elements that have been successful for this genre and for this size of audience, and therefore, as we develop programming, whether it's in script and casting, whatever. I mean, take it all the way down to that micro-level, to developing almost these ideal, optimal programs that are going to be better received by your audience. >> Look, it's not a big secret. Every company that is in the content business is trying to get as many The Walking Deads as they can in their portfolio. Is there a direct path to success? Probably not, otherwise everyone would have been-- >> John: Overdoing it. >> Yeah, would be doing that. But yeah, so those are the most critical and difficult insights to get ahold of, and we're working toward that. 
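The shift described here, from rules a person writes by hand to patterns the machine finds on its own, can be sketched in a few lines. This is only a toy illustration: the viewer counts, the window size, and the tolerance are all invented for the example, and it stands in for, rather than reproduces, whatever AMC's actual pipeline does.

```python
# Instead of hand-coding a rule ("flag any day below X viewers"),
# let the recent data define what "normal" is and flag deviations from it.

def rolling_mean(values, window):
    """Mean of the trailing `window` values at each position (None until full)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

def flag_shifts(daily_viewers, window=7, tolerance=0.25):
    """Flag days whose viewership deviates from the trailing weekly mean
    by more than `tolerance` (as a fraction). Returns (day_index, value) pairs."""
    baseline = rolling_mean(daily_viewers, window)
    flagged = []
    for i, (v, b) in enumerate(zip(daily_viewers, baseline)):
        if b is not None and abs(v - b) / b > tolerance:
            flagged.append((i, v))
    return flagged

# Hypothetical daily viewer counts: a stable week, then a sudden jump.
views = [100, 102, 98, 101, 99, 103, 100, 150]
print(flag_shifts(views))  # only the final day's jump is flagged
```

Here the "insight" (day 7 looks abnormal) falls out of the data itself rather than out of a threshold someone chose in advance, which is the reactive pattern-finding being described; anticipating the shift before it happens would take a forecasting model on top of this.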
>> Are you finding that your predictive capabilities are getting meaningfully better? Maybe you could talk about that a little bit in terms of predicting those types of successes. Or is it still a lot of trial and error? >> I'd like to say they are meaningfully better. (laughter) Look, we do, there are obviously interesting findings. There are sometimes setbacks, and we learn from them, and we move forward. >> Okay, as good as the weather or better? Or worse? (laughs) >> Depends on the morning and the season. (laughter) >> Vitaly, how have your success measurements changed as we enter this world of digital and machine learning and artificial intelligence? And if so, how? >> Well, they become more and more challenging and complex. Like, I gave the example of data democratization. It was such an interesting and telling company-wide initiative. And at the time, it felt like a true achievement when everybody got access to their data on their desktops and laptops. When we look back now a few years, it was a walk in the park to achieve. So, the more complex data and objectives we set in front of ourselves, and the more educated people in the company become, the more challenging it is to deliver and take the next step. And we strive to do that. >> I wonder if I can ask you a question from a developer's perspective. You obviously understand the developer mindset. We were talking to Dennis earlier. He's like, "Yeah, you know, it's really the data scientists that are loving the data, taking a bath in it. The data engineers and so forth." And I was kind of pushing on that, saying, "Well, but eventually the developers have to be data-oriented. Data is the new development kit." What's your take? I mean, granted, of the 10 million Java developers, most of them are not focused on the data per se. Will that change? Is that changing? >> So, first of all, I want to separate out the classical IT that you just referred to, which is developers. 
Because this discipline has been well established, whether it's Waterfall or Agile. So, every company has those departments and they serve companies well. Business intelligence is a different animal. So, most of the work, if not all of the work we do, is more of an R&D type of work. It is impossible to say, in three months I'll arrive with the model that will transform this business. So, we're driving there. That's the major distinction between the two. Is it the right path for some of the data-oriented developers to move on from, let's say, IT disciplines and into BI disciplines? I would highly encourage that, because the job is so much more challenging, so interesting. There's very little routine, as we said. It's actually challenge, challenge, and challenge. And, you know, you look at the news the way I do, and you see that data scientist has become the number one desired job in America. I hope that there will be more and more people in that space, because, as with every other department, we're struggling to find good people, the right people, for the space. And even within that space, you have, as you mentioned, data engineers. You have data scientists or statisticians. And now it's maturing to the point that you have people who are above and beyond that. Those who can actually envision models, not just execute on them. >> Are you investigating blockchain and playing around with that at all? Is there an application in your business? >> It hasn't matured fully yet in our hands, but we're looking into it. >> And the reason I ask is that it seems to me that blockchain developers are data-oriented. And those two worlds, in my view, are coming together. But it's early days. >> Look, I mean, we are in the R&D space. And like I said, we don't know exactly, we can't fully commit to a delivery. But it's always a balance between being practical and dreaming. So, if I were to say, you know, let me jump into blockchain right now and be ahead of the game. Maybe. 
But then my commitments are going to be sort of farther ahead, and I'm trying to be pragmatic. >> Before we let you go, I've got to give you 30 seconds on your Yankees. How do you feel about the season coming up? >> As with every season, I'm super-excited. And I can't wait until the season starts. >> We're always excited when pitchers and catchers show up. >> That's right. (laughter) >> If I were a Yankee fan, I'd be excited too. I must admit. >> Nobody's lost a game. >> That's right. >> Vitaly, thank you for being with us here. We appreciate it. And continued success at AMC Networks. >> Thank you for having me. >> Back with more on theCUBE right after this. (upbeat techno music)

Published Date : Feb 27 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
AMC | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Disney | ORGANIZATION | 0.99+
Vitaly | PERSON | 0.99+
Vitaly Tsivin | PERSON | 0.99+
Dennis | PERSON | 0.99+
AMC Networks | ORGANIZATION | 0.99+
Vitaly Tsivan | PERSON | 0.99+
ABC | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
John Walls | PERSON | 0.99+
John | PERSON | 0.99+
America | LOCATION | 0.99+
10 episodes | QUANTITY | 0.99+
Netflix | ORGANIZATION | 0.99+
Red Sox | ORGANIZATION | 0.99+
ESPN | ORGANIZATION | 0.99+
first question | QUANTITY | 0.99+
four years | QUANTITY | 0.99+
30 seconds | QUANTITY | 0.99+
10 million | QUANTITY | 0.99+
IBM | ORGANIZATION | 0.99+
Yankees | ORGANIZATION | 0.99+
New York City | LOCATION | 0.99+
two | QUANTITY | 0.99+
Googles | ORGANIZATION | 0.99+
Facebooks | ORGANIZATION | 0.99+
Yankee | ORGANIZATION | 0.99+
today | DATE | 0.99+
six years | QUANTITY | 0.99+
five | QUANTITY | 0.99+
Red Sox Nation | ORGANIZATION | 0.99+
first | QUANTITY | 0.99+
One | QUANTITY | 0.98+
three months | QUANTITY | 0.98+
one | QUANTITY | 0.98+
two worlds | QUANTITY | 0.96+
about 15 minutes | QUANTITY | 0.96+
First | QUANTITY | 0.96+
The Walking Deads | TITLE | 0.96+
Machine Learning Everywhere: Build Your Ladder to AI | TITLE | 0.93+
this morning | DATE | 0.92+
four | QUANTITY | 0.91+
about 30% | QUANTITY | 0.91+
about 15 | QUANTITY | 0.9+
Number two | QUANTITY | 0.88+
Java | TITLE | 0.88+
2018 | DATE | 0.81+
one avenue | QUANTITY | 0.81+
Agile | TITLE | 0.81+
New York | LOCATION | 0.81+
Executive Vice President | PERSON | 0.79+
three years | QUANTITY | 0.73+
one of the ways | QUANTITY | 0.72+
U.S. | LOCATION | 0.67+
Machine Learning Everywhere | TITLE | 0.63+
number one | QUANTITY | 0.63+
theCUBE | TITLE | 0.59+
Voiceover | TITLE | 0.56+
theCUBE | ORGANIZATION | 0.43+
years | QUANTITY | 0.35+

Sam Lightstone, IBM | Machine Learning Everywhere 2018


 

>> Narrator: Live from New York, it's the Cube. Covering Machine Learning Everywhere: Build Your Ladder to AI. Brought to you by IBM. >> And welcome back here to New York City. We're at IBM's Machine Learning Everywhere: Build Your Ladder to AI, along with Dave Vellante, John Walls, and we're now joined by Sam Lightstone, who is an IBM fellow in analytics. And Sam, good morning. Thanks for joining us here once again on the Cube. >> Yeah, thanks a lot. Great to be back. >> Yeah, great. Yeah, good to have you here on kind of a moldy New York day here in late February. So we're talking, obviously, data is the new norm, which is what we've certainly heard a lot about here today, and of late, from IBM. Talk to me, in your terms, about data and its evolution, and how it's now become so central to what every enterprise is doing and must do. I mean, how do you do it? Give me a 30,000-foot level right now from your prism.
But data is what differentiates companies more than anything else today. And can they tap into the data, can they make sense of it for competitive advantage? And that's not only true for companies that are, you know, cloud companies. That's true for every company, whether you're a bricks and mortars organization or not. Now, one level of that data is to simply look at the data and ask questions of the data, the kinds of data that you already have in your mind. Generating reports, understanding who your customers are, and so on. That's sort of a fundamental level. But the deeper level, the exciting transformation that's going on right now, is the transformation from reporting and what we'll call business intelligence, the ability to take those reports and that insight on data and to visualize it in the way that human beings can understand it, and go much deeper into machine learning and AI, cognitive computing where we can start to learn from this data and learn at the pace of machines, and to drill into the data in a way that a human being cannot because we can't look at bajillions of bytes of data on our own, but machines can do that and they're very good at doing that. So it is a huge, that's one level. The other level is, there's so much more data now than there ever was because there's so many more devices that are now collecting data. And all of us, you know, every one of our phones is collecting data right now. Your cars are collecting data. I think there's something like 60 sensors on every car that rolls of the manufacturing line today. 60. So it's just a wild time and a very exciting time because there's so much untapped potential. And that's what we're here about today, you know. Machine learning, tapping into that unbelievable potential that's there in that data. >> So you're absolutely right on. I mean the data is foundational, or must be foundational in order to succeed in this sort of data-driven world. 
But it's not necessarily the center of the universe for a lot of companies. I mean, it is for the big data, you know, guys that we all know. You know, the top market cap companies. But so many organizations, they're sort of, human expertise is at the center of their universe, and data is sort of, oh yeah, a bolt-on, and like you say, reporting. >> Right. >> So how do they deal with that? Do they get one big giant DB2 instance and stuff all the data in there, and infuse it with ML? Is that even practical? How do they solve this problem? >> Yeah, that's a great question. And there's, again, there's a multi-layered answer to that. But let me start with the most, you know, one of the big changes, one of the massive shifts that's been going on over the last decade, is the shift to cloud. And people think of the shift to cloud as, well, I don't have to own the server. Someone else will own the server. That's actually not the right way to look at it. I mean, that is one element of cloud computing, but it's not, for me, the most transformative. The big thing about the cloud is the introduction of fully managed services. It's not just that you don't own the server. You don't have to install, configure, or tune anything. Now that's directly related to the topic that you just raised, because people have expertise, domains of expertise in their business. Maybe you're a manufacturer and you have expertise in manufacturing. If you're a bank, you have expertise in banking. You may not be a high-tech expert. You may not have deep skills in tech. So one of the great elements of the cloud is that now you can use these fully managed services and you don't have to be a database expert anymore. You don't have to be an expert in tuning SQL or JSON, or yadda yadda. Someone else takes care of that for you, and that's the elegance of a fully managed service, not just that someone else has got the hardware, but that they're taking care of all the complexity. And that's huge. 
The other thing that I would say is, you know, the companies that are really like the big data houses, they've got lots of data, and they've spent the last 20 years working so hard to converge their data into larger and larger data lakes. And some have been more successful than others. But everybody has found that that's quite hard to do. Data is coming in many places, in many different repositories, and trying to consolidate it, you know, rip the data out, constantly ripping it out and replicating it into some data lake or data warehouse where you can do your analytics, is complicated. And it means in some ways you're multiplying your costs, because you have the data in its original location and now you're copying it into yet another location. You've got to pay for that, too. So you're multiplying costs. So one of the things I'm very excited about at IBM is we've been working on this new technology that we've now branded as IBM Queryplex. And that gives us the ability to query data across all of these myriad sources as if they are in one place. As if they are a single consolidated data lake, and make it all look like (snaps) one repository. And not only does it appear to the application as one repository, it actually taps into the processing power of every one of those data sources. So if you have 1,000 of them, we'll bring to bear the power of 1,000 data sources and all that computing and all that memory on these analytics problems. >> Well, give me an example of why that matters. What would be a real-world application of that? >> Oh, sure, so there, you know, there's a couple of examples. I'll give you two extremes, two different extremes. One extreme would be what I'll call enterprise data consolidation or virtualization, where you're a large institution and you have several of these repositories. Maybe you've got some IBM repositories like DB2. Maybe you've got a little bit of Oracle and a little bit of SQL Server. 
Maybe you've got some open source stuff like Postgres or MySQL. You've got a bunch of these, and different departments use different things, and it develops over decades, and to some extent you can't even control it, (laughs) right? And now you just want to get analytics on that. You just, what's this data telling me? And as long as all that data is sitting in these, you know, dozens or hundreds of different repositories, you can't tell, unless you copy it all out into a big data lake, which is expensive and complicated. So Queryplex will solve that problem. >> So it's sort of a virtual data store. >> Yeah, and one of the terms, many different terms that are used, but one of the terms that's used in the industry is data virtualization. So that would be a suitable terminology here as well. To make all that data in hundreds, thousands, even millions of possible data sources appear as one thing, it has to tap into the processing power of all of them at once. Now, that's one extreme. Let's take another extreme, which is even more extreme, which is the IoT scenario, Internet of Things, right? Internet of Things. Imagine you have devices, you know, shipping containers and smart meters on buildings. You could literally have 100,000 of these, or a million of these things. They're usually small; they don't usually have a lot of data on them. But they can store, usually, a couple of months of data. And what's fascinating about that is that most analytics today are really on the most recent, you know, 48 hours or four weeks, maybe. And that time is getting shorter and shorter, because people are doing analytics more regularly and they're interested in, just tell me what's going on recently. >> I've got to geek out here, for a second. >> Please, well thanks for the warning. (laughs) >> And I know you know things, but I'm not a, I'm not a technical person, but I've been around a long time. A lot of questions on data virtualization, but let me start with Queryplex. 
The name is really interesting to me. When I, and you're a database expert, so I'm going to tap your expertise. When I read the Google Spanner paper, I called up my colleague David Floyer, who's an ex-IBM, I said, "This is like global Sysplex. "It's a global distributed thing," And he goes, "Yeah, kind of." And I got very excited. And then my eyes started bleeding when I read the paper, but the name, Queryplex, is it a play on Sysplex? Is there-- >> It's actually, there's a long story. I don't think I can say the story on-air, but we, suffice it to say we wanted to get a name that was legally usable and also descriptive. >> Dave: Okay. >> And we went through literally hundreds and hundreds of permutations of words and we finally landed on Queryplex. But, you know, you mentioned Google Spanner. I probably should spend a moment to differentiate how what we're doing is-- >> Great, if you would. >> A different kind of thing. You know, on Google Spanner, you put data into Google Spanner. With Queryplex, you don't put data into it. >> Dave: Don't have to move it. >> You don't have to move it. You leave it where it is. You can have your data in DB2, you can have it in Oracle, you can have it in a flat file, you can have an Excel spreadsheet, and you know, think about that. An Excel spreadsheet, a collection of text files, comma delimited text files, SQL Server, Oracle, DB2, Netezza, all these things suddenly appear as one database. So that's the transformation. It's not about we'll take your data and copy it into our system, this is about leave your data where it is, and we're going to tap into your (snaps) existing systems for you and help you see them in a unified way. So it's a very different paradigm than what others have done. Part of the reason why we're so excited about it is we're, as far as we know, nobody else is really doing anything quite like this. 
>> And is that what gets people to the 21st century, basically, is that they have all these legacy systems and yet the conversion is much simpler, much more economical for them? >> Yeah, exactly. It's economical, it's fast. (snaps) You can deploy this in, you know, a very small amount of time. And we're here today talking about machine learning and it's a very good segue to point out in order to get to high-quality AI, you need to have a really strong foundation of an information architecture. And for the industry to show up, as some have done over the past decade, and keep telling people to re-architect their data infrastructure, keep modifying their databases and creating new databases and data lakes and warehouses, you know, it's just not realistic. And so we want to provide a different path. A path that says we're going to make it possible for you to have superb machine learning, cognitive computing, artificial intelligence, and you don't have to rebuild your information architecture. We're going to make it possible for you to leverage what you have and do something special. >> This is exciting. I wasn't aware of this capability. And we were talking earlier about the cloud and the managed service component of that as a major driver of lowering cost and complexity. There's another factor here, which is, we talked about moving data-- >> Right. >> And that's one of the most expensive components of any infrastructure. If I got to move data and the transmission costs and the latency, it's virtually impossible. Speed of light's still up. I know you guys are working on speed of light, but (Sam laughs) you'll eventually get there. >> Right. >> Maybe. But the other thing about cloud economics, and this relates to sort of Queryplex. There's this API economy. You've got virtually zero marginal costs. When you were talking, I was writing these down. You got global scale, it's never down, you've got this network effect working for you. 
Are you able to, are the standards there? Are you able to replicate those sort of cloud economics the APIs, the standards, that scale, even though you're not in control of this, there's not a single point of control? Can you explain sort of how that magic works? >> Yeah, well I think the API economy is for real and it's very important for us. And it's very important that, you know, we talk about API standards. There's a beautiful quote I once heard. The beautiful thing about standards is there's so many to choose from. (All laugh) And the reality is that, you know, you have standards that are official standards, and then you have the de facto standards because something just catches on and nobody blessed it. It just got popular. So that's a big part of what we're doing at IBM is being at the forefront of adopting the standards that matter. We made a big, a big investment in being Spark compatible, and, in fact, even with Queryplex. You can issue Spark SQL against Queryplex even though it's not a Spark engine, per se, but we make it look and feel like it can be Spark SQL. Another critical point here, when we talk about the API economy, and the speed of light, and movement to the cloud, and these topics you just raised, the friction of the Internet is an unbelievable friction. (John laughs) It's unbelievable. I mean, you know, when you go and watch a movie over the Internet, your home connection is just barely keeping up. I mean, you're pushing it, man. So a gigabyte, you know, a gigabyte an hour or something like that, right? Okay, and if you're a big company, maybe you have a fatter pipe. But not a lot fatter. I mean, not orders of, you're talking incredible friction. And what that means is that it is difficult for people, for companies, to en masse, move everything to the cloud. It's just not happening overnight. 
And, again, in the interest of doing the best possible service to our customers, that's why we've made it a fundamental element of our strategy in IBM to be a hybrid, what we call hybrid data management company, so that the APIs that we use on the cloud, they are compatible with the APIs that we use on premises. And whether that's software or private cloud. You've got software, you've got private cloud, you've got public cloud. And our APIs are going to be consistent across, and applications that you code for one will run on the other. And you can, that makes it a lot easier to migrate at your leisure when you're ready. >> Makes a lot of sense. That way you can bring cloud economics and the cloud operating model to your data, wherever the data exists. Listening to you speak, Sam, it reminds me, do you remember when Bob Metcalfe who I used to work with at IDG, predicted the collapse of the Internet? He predicted that year after year after year, in speech after speech, that it was so fragile, and you're bringing back that point of, guys, it's still, you know, a lot of friction. So that's very interesting, (laughs) as an architect. >> You think Bob's going to be happy that you brought up that he predicted the Internet was going to be its own demise? (Sam laughs) >> Well, he did it in-- >> I'm just saying. >> I'm staying out of it, man. >> He did it as a lightning rod. >> As a talking-- >> To get the industry to respond, and he had a big enough voice so he could do that. >> That it worked, right. But so I want to get back to Queryplex and the secret sauce. Somehow you're creating this data virtualization capability. What's the secret sauce behind it? >> Yeah, so I think, we're not the first to try, by the way. Actually this problem-- >> Hard problem. >> Of all these data sources all over the place, you try to make them look like one thing. People have been trying to figure out how to do that since like the '70s, okay, so, but-- >> Dave: Really hasn't worked. 
>> And it hasn't worked. And really, the reason why it hasn't worked is that there have been two fundamental strategies. One strategy is, you have a central coordinator that tries to speak to each of these data sources. So I've got, let's say, 10,000 data sources. I want to have one coordinator tap into each of them and have a dialogue. And what happens is that that coordinator, a server, an agent somewhere, becomes a network bottleneck. You were talking about the friction of the Internet. This is a great example of friction. One coordinator trying to speak to, you know, all those collaborators becomes a point of friction. And it also becomes a point of friction not only in the Internet, but also in the computation, because he ends up doing too much of the work. There are too many things that cannot be done at these edge repositories: aggregations, and joins, and so on. So all the aggregations and joins get done by this one sucker who can't keep up. >> Dave: The queue. >> Yeah, so there's a big queue, right. So that's one strategy that didn't work. The other strategy that people tried was sort of an N-squared topology where every data source tries to speak to every other data source. And that doesn't scale either. So what we've done in Queryplex is something that we think is unique and much more organic, where we try to organize the universe or constellation of these data sources so that every data source speaks to a small number of peers but not a large number of peers. And that way no single source is a bottleneck, either in network or in computation. That's one trick. And the second trick is we've designed algorithms that can truly be distributed. So you can do joins in a distributed manner. You can do aggregation in a distributed manner. These are things, you know, when I say aggregation, I'm talking about simple things like a sum or an average or a median. These are super popular in analytic queries.
Everybody wants to do a sum or an average or a median, right? But in the past, those things were hard to do in a distributed manner, getting all the participants in this universe to do some small incremental piece of the computation. So it's really these two things. Number one, this organic, dynamically forming constellation of devices. Dynamically forming in a way that is latency-aware. So if I'm a, if I represent a data source that's joining this universe or constellation, I'm going to try to find peers who I have a fast connection with. If all the universe of peers were out there, I'll try to find ones that are fast. And the second is having algorithms that we can all collaborate on. Those two things change the game. >> We're getting the two minute sign, and this is fascinating stuff. But so, how do you deal with the data consistency problem? You hear about eventual consistency and people using atomic clocks and-- >> Right, so Queryplex, you know, there's a reason we call it Queryplex, not Dataplex. Queryplex is really a read-only operation. >> Dave: Oh, there you go. >> You've got all these-- >> Problem solved. (laughs) >> Problem solved. You've got all these data sources. They already have data coming in however it's coming in. >> Dave: Simple and brilliant. >> Right, and we're not changing any of that. All we're saying is, if you want to query them as one, you can query them as one. I should say a few words about the machine learning that we're doing here at the conference. We've talked about the importance of an information architecture and how that lays a foundation for machine learning. But one of the things that we're showing and demonstrating at the conference today, or at the showcase today, is how we're actually putting machine learning into the database. Creating databases that learn and improve over time, learn from experience.
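Stepping back to the distributed aggregation Lightstone describes: sums, counts, averages, minimums, and maximums decompose cleanly into partial results that peers can merge (a median does not, which is part of why these queries were historically hard to distribute). A minimal sketch of that decomposition, purely illustrative of the general technique rather than Queryplex's actual algorithms:

```python
# Sketch of distributed aggregation: each data source computes a tiny
# partial summary of its own rows, and any peer can merge the partials
# without ever seeing the raw data. Illustrative only; not Queryplex code.

def partial_aggregate(rows):
    """Runs locally at one data source: constant-size summary of its rows."""
    return {"sum": sum(rows), "count": len(rows),
            "min": min(rows), "max": max(rows)}

def merge(partials):
    """Runs at any peer: combine summaries; raw rows never move."""
    out = {"sum": 0, "count": 0, "min": float("inf"), "max": float("-inf")}
    for p in partials:
        out["sum"] += p["sum"]
        out["count"] += p["count"]
        out["min"] = min(out["min"], p["min"])
        out["max"] = max(out["max"], p["max"])
    return out

# Three "data sources" holding disjoint slices of one logical table.
partials = [partial_aggregate(s) for s in [[10, 20, 30], [40, 50], [60]]]
result = merge(partials)
print(result["sum"] / result["count"])  # global average -> 35.0
```

Because merging is associative, the peers in the constellation can combine partials pairwise along fast connections instead of funneling everything through one coordinator, which is exactly the bottleneck described above.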
In 1952, Arthur Samuel was a researcher at IBM who had one of the first fundamental breakthroughs in machine learning when he created a machine learning algorithm that would play checkers. And he programmed this checkers-playing game of his so it would learn over time. And then he had a great idea. He programmed it so it would play itself, thousands and thousands and thousands of times over, so it would actually learn from its own mistakes. And, you know, the evolution since then. Deep Blue playing chess and so on. The Watson Jeopardy game. We've seen tremendous potential in machine learning. We're putting it into the database so databases can be smarter, faster, more consistent, and really just out of the box (snaps) performing. >> I'm glad you brought that up. I was going to ask you, because the legend Steve Mills once said to me, I had asked him a question about in-memory databases. He said ever since databases have been around, in-memory databases have been around. But ML-infused databases are new. >> Sam: That's right, something totally new. >> Dave: Yeah, great. >> Well, you mentioned Deep Blue. Looking forward to having Garry Kasparov on a little bit later on here. And I know he's speaking as well. But fascinating stuff that you've covered here, Sam. We appreciate the time here. >> Thank you, thanks for having me. >> And wish you continued success, as well. >> Thank you very much. >> Sam Lightstone, IBM Fellow, joining us here live on the Cube. We're back with more here from New York City right after this. (electronic music)

Published Date : Feb 27 2018

Dinesh Nirmal, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI. Brought to you by IBM. >> Welcome back to Midtown, New York. We are at Machine Learning Everywhere: Build Your Ladder to AI being put on by IBM here in late February in the Big Apple. Along with Dave Vellante, I'm John Walls. We're now joined by Dinesh Nirmal, who is the Vice President of Analytics Development and Site Executive at the IBM Silicon Valley lab, soon. Dinesh, good to see you, this morning, sir. >> Thank you, John. >> Fresh from California. You look great. >> Thanks. >> Alright, you've talked about this, and it's really your world: data, the new normal. Explain that. When you say it's the new normal, what exactly... How is it transforming, and what are people having to adjust to in terms of the new normal. >> So, if you look at data, I would say each and every one of us has become a living data set. Our age, our race, our salary. What our likes or dislikes, every business is collecting every second. I mean, every time you use your phone, that data is transmitted somewhere, stored somewhere. And, airlines for example, is looking, you know, what do you like? Do you like an aisle seat? Do you like to get home early? You know, all those data. >> All of the above, right? >> And petabytes and zettabytes of data is being generated. So now, businesses' challenge is that how do you take that data and make insights out of it to serve you as a better customer. That's where I've come to, but the biggest challenge is that, how do you deal with this tremendous amount of data? That is the challenge. And creating sites out of it. >> That's interesting. I mean, that means the definition of identity is really... For decades it's been the same, and what you just described is a whole new persona, identity of an individual. 
>> And now, you take the data, and you add some compliance or provisioning like GDPR on top of it, all of a sudden how do-- >> John: What is GDPR? For those who might not be familiar with it. >> Dinesh: That's the regulatory term that's used by EU to make sure that, >> In the EU. >> If me as a customer come to an enterprise, say, I don't want any of my data stored, it's up to you to go delete that data completely, right? That's the term that's being used. And that goes into effect in May. How do you make sure that that data gets completely deleted by that time the customer has... How do you get that consent from the customer to go do all those... So there's a whole lot of challenges, as data multiplies, how do you deal with the data, how do you create insights to the data, how do you create consent on the data, how do you be compliant on that data, how do you create the policies that's needed to generate that data? All those things need to be... Those are the challenges that enterprises face. >> You bring up GDPR, which, for those who are not familiar with it, actually went into effect last year but the fines go into effect this year, and the fines are onerous, like 4% of turnover, I mean it's just hideous, and the question I have for you is, this is really scary for companies because they've been trying to catch up to the big data world, and so they're just throwing big data projects all over the place, which is collecting data, oftentimes private information, and now the EU is coming down and saying, "Hey you have to be able to, if requested, delete that." A lot of times they don't even know where it is, so big challenge. Are you guys, can you help? >> Yeah, I mean, today if you look at it, the data exists all over the place. I mean, whether it's in your relational database or in your Hadoop, unstructured data, whereas you know, optics store, it exists everywhere. 
And you have to have a way to say where the data is, and the customer has to have given consent for you to look at the data, for you to delete the data, all those things. We have tools that we have built, and we have been in the business for a very long time, for example our governance catalog, where you can see all the data sources, the policies that are associated with it, the compliance, all those things. So you can look through that catalog and see which data is GDPR compliant, which data is not, which data you can delete, which data you cannot. >> We were just talking in the open, Dave made the point that many companies, you need all-stars, not just somebody who has a specialty in one particular area, but maybe somebody who's in a particular regiment and they've got to wear about five different hats. So how do you democratize data to the point that you can make these all-stars? Across all kinds of different business units or different focuses within a company, because all of a sudden people have access to great reams of information. I've never had to worry about this before. But now, you've got to spread that wealth out and make everybody valuable. >> Right, really good question. Like I said, the data is existing everywhere, and most enterprises don't want to move the data. Because it's a tremendous effort to move from an existing place to another one and make sure the applications work and all those things. We are building a data virtualization layer, a federation layer, whereby, if you are, let's say, a business unit, you want to get access to that data. Now you can use that federation, or data virtualization, layer without moving data, to go and grab that small piece of data. If you're a data scientist, let's say, you want only a very small piece of data that exists in your enterprise. You can go after, without moving the data, just pick that data, do your work, and build a model, for example, based on that data.
So that data virtualization layer really helps because it's basically an SQL statement, if I were to simplify it. That can go after the data that exists, whether it's at relational or non-relational place, and then bring it back, have your work done, and then put that data back into work. >> I don't want to be a pessimist, because I am an optimist, but it's scary times for companies. If they're a 20th century organization, they're really built around human expertise. How to make something, how to transact something, or how to serve somebody, or consult, whatever it is. The 21st century organization, data is foundational. It's at the core, and if my data is all over the place, I wasn't born data-driven, born in the cloud, all those buzzwords, how do traditional organizations catch up? What's the starting point for them? >> Most, if not all, enterprises are moving into a data-driven economy, because it's all going to be driven by data. Now it's not just data, you have to change your applications also. Because your applications are the ones that's accessing the data. One, how do you make an application adaptable to the amount of data that's coming in? How do you make accuracy? I mean, if you're building a model, having an accurate model, generating accuracy, is key. How do you make it performant, or govern and self-secure? That's another challenge. How do you make it measurable, monitor all those things? If you take three or four core tenets, that's what the 21st century's going to be about, because as we augment our humans, or developers, with AI and machine learning, it becomes more and more critical how do you bring these three or four core tenets into the data so that, as the data grows, the applications can also scale. >> Big task. If you look at the industries that have been disrupted, taxis, hotels, books, advertising. >> Dinesh: Retail. >> Retail, thank you. 
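The federation layer Dinesh simplifies to "basically an SQL statement" can be pictured as one query fanned out to several independent stores, with the results merged on the way back and no data copied into a central lake. The toy below uses in-memory SQLite databases as stand-ins for the different repositories; it sketches the idea, not IBM's product, and the table and data are invented for the example:

```python
import sqlite3

# Toy federation layer: push one SQL statement to several independent
# stores and merge the results, leaving the data where it lives.

def make_source(rows):
    """Stand-in for an independent repository (DB2, Postgres, a CSV...)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

sources = [
    make_source([("east", 100.0), ("east", 50.0)]),
    make_source([("west", 200.0)]),
]

def federated_query(sql):
    """Run the same statement on every source and union the results.
    (A real engine would also merge partial aggregates when one group
    spans several sources; the groups here are disjoint for simplicity.)"""
    out = []
    for db in sources:
        out.extend(db.execute(sql).fetchall())
    return out

rows = federated_query("SELECT region, SUM(amount) FROM sales GROUP BY region")
print(sorted(rows))  # -> [('east', 150.0), ('west', 200.0)]
```

The point of the pattern is the one Dinesh makes: the application sees a single logical database, while each repository does its own share of the scanning and aggregating in place.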
Maybe less now, and you haven't seen that disruption yet in banks, insurance companies, certainly parts of government, defense, you haven't seen a big disruption yet, but it's coming. If you've got the data all over the place, you said earlier that virtually every company has to be data-driven, but a lot of companies that I talk to say, "Well, our industry is kind of insulated," or "Yeah, we're going to wait and see." That seems to me to be the recipe for disaster, what are your thoughts on that? >> I think the disruption will come from three angles. One, AI. Definitely that will change the way, blockchain, another one. When you say, we haven't seen in the financial side, blockchain is going to change that. Third is quantum computing. The way we do compute is completely going to change by quantum computing. So I think the disruption is coming. Those are the three, if I have to predict into the 21st century, that will change the way we work. I mean, AI is already doing a tremendous amount of work. Now a machine can basically look at an image and say what it is, right? We have Watson for cancer oncology, we have 400 to 500,000 patients being treated by Watson. So AI is changing, not just from an enterprise perspective, but from a socio-economic perspective and a from a human perspective, so Watson is a great example for that. But yeah, disruption is happening as we speak. >> And do you agree that foundational to AI is the data? >> Oh yeah. >> And so, with your clients, like you said, you described it, they've got data all over the place, it's all in silos, not all, but much of it is in silos. How does IBM help them be a silo-buster? >> Few things, right? One, data exists everywhere. How do you make sure you get access to the data without moving the data, that's one. But if you look at the whole lifecycle, it's about ingesting the data, bringing the data, cleaning the data, because like you said, data becomes the core. Garbage in, garbage out. 
You cannot get good models unless the data is clean. So there's that whole process, I would say if you're a data scientist, probably 70% of your time is spent on cleaning the data, making the data ready for building a model or for a model to consume. And then once you build that model, how do you make sure that the model gets retrained on a regular basis, how do you monitor the model, how do you govern the model, so that whole aspect goes in. And then the last piece is visualizational reporting. How do you make sure, once the model or the application is built, how do you create a report that you want to generate or you want to visualize that data. The data becomes the base layer, but then there's a whole lifecycle on top of it based on that data. >> So the formula for future innovation, then, starts with data. You add in AI, I would think that cloud economics, however we define that, is also a part of that. My sense is most companies aren't ready, what's your take? >> For the cloud, or? >> I'm talking about innovation. If we agree that innovation comes from the data plus AI plus you've got to have... By cloud economics I mean it's an API economy, you've got massive scale, those kinds of, to compete. If you look at the disruptions in taxis and retail, it's got cloud economics underneath it. So most customers don't really have... They haven't yet even mastered cloud economics, yet alone the data and the AI component. So there's a big gap. >> It's a huge challenge. How do we take the data and create insights out of the data? And not just existing data, right? The data is multiplying by the second. Every second, petabytes or zettabytes of data are being generated. So you're not thinking about the data that exists within your enterprise right now, but now the data is coming from several different places. Unstructured data, structured data, semi-structured data, how do you make sense of all of that? 
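The lifecycle Dinesh walks through, ingest, clean, train, then monitor and retrain, can be sketched in a few lines. The "model" below is just a running mean and the drift threshold is an invented number; this only illustrates the shape of the loop, not IBM's tooling or any particular modeling technique:

```python
# Toy model lifecycle: clean the ingested data, train, watch for drift
# in fresh data, retrain. All names and thresholds here are illustrative.

def clean(records):
    """The time-consuming step: drop nulls and out-of-range values."""
    return [r for r in records if r is not None and 0 <= r <= 1000]

def train(records):
    """Trivial model: predict the historical mean."""
    return sum(records) / len(records)

def needs_retraining(model, fresh, tolerance=10.0):
    """Flag drift: retrain when fresh data strays too far from the model."""
    return abs(sum(fresh) / len(fresh) - model) > tolerance

raw = [100, None, 110, -5, 90, 5000]   # nulls and outliers, as ingested
data = clean(raw)                       # -> [100, 110, 90]
model = train(data)                     # -> 100.0

incoming = [150, 160, 155]              # fresh data has drifted upward
if needs_retraining(model, incoming):
    model = train(clean(raw + incoming))
print(model)                            # -> 127.5
```

Even in this toy, the cleaning and monitoring code outweighs the "model" itself, which mirrors the point that most of a data scientist's time goes to the data, not the algorithm, and why the deployment and retraining loop is where enterprises get stuck.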
That is the challenge the customers face, and, if you have existing data, on top of the newcoming data, how do you predict what do you want to come out of that. >> It's really a pretty tough conundrum that some companies are in, because if you're behind the curve right now, you got a lot of catching up to do. So you think that we have to be in this space, but keeping up with this space, because the change happens so quickly, is really hard, so we have to pedal twice as fast just to get in the game. So talk about the challenge, how do you address it? How do you get somebody there to say, "Yep, here's your roadmap. "I know it's going to be hard, "but once you get there you're going to be okay, "or at least you're going to be on a level playing field." >> I look at the three D's. There's the data, there's the development of the models or the applications, and then the deployment of those models or applications into your existing enterprise infrastructure. Not only the data is changing, but that development of the models, the tools that you use to develop are also changing. If you look at just the predictive piece, I mean look from the 80's to now. You look at vanilla machine learning versus deep learning, I mean there's so many tools available. How do you bring it all together to make sense which one would you use? I think, Dave, you mentioned Hadoop was the term from a decade ago, now it's about object store and how do you make sure that data is there or JSON and all those things. Everything is changing, so how do you bring, as an enterprise, you keep up, afloat, on not only the data piece, but all the core infrastructure piece, the applications piece, the development of those models piece, and then the biggest challenge comes when you have to deploy it. Because now you have a model that you have to take and deploy in your current infrastructure, which is not easy. 
Because you're infusing machine learning into your legacy applications, your third-party software, software that was written in the 60's and 70's, it's not an easy task. I was at a major bank in Europe, and the CTO mentioned to me that, "Dinesh, we built our model in three weeks. "It has been 11 months, we still haven't deployed it." And that's the reality. >> There's a cultural aspect too, I think. I think it was Rob Thomas, I was reading a blog that he wrote, and he said that he was talking to a customer saying, "Thank god I'm not in the technology industry, "things change so fast I could never, "so glad I'm not a software company." And Rob's reaction was, "Uh, hang on. (laughs) "You are in the technology business, "you are a software company." And so there's that cultural mindset. And you saw it with GE, Jeffrey Immelt said, "I went to bed an industrial giant, "woke up a software company." But look at the challenges that industrial giant has had transforming, so... They need partners, they need people that have done this before, they need expertise and obviously technology, but it's people and process that always hold it up. >> I mean technology is one piece, and that's where I think companies like IBM make a huge difference. You understand enterprise. Because you bring that wealth of knowledge of working with them for decades and they understand your infrastructure, and that is a core element, like I said the last piece is the deployment piece, how do you bring that model into your existing infrastructure and make sure that it fits into that architecture. And that involves a tremendous amount of work, skills, and knowledge. >> Job security. (all laugh) >> Dinesh, thanks for being with us this morning, we appreciate that and good luck with the rest of the event, here in New York City. Back with more here on theCUBE, right after this. (calming techno music)

Published Date : Feb 27 2018

Kickoff: John Walls and Dave Vellante | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE! Covering Machine Learning Everywhere: Build Your Ladder To AI. Brought to you by IBM. >> Well, good morning! Welcome here on theCUBE. Along with Dave Vellante, I'm John Walls. We're in Midtown New York for IBM's Machine Learning Everywhere: Build Your Ladder To AI. Great lineup of guests we have for you today, looking forward to bringing them to you, including world champion chess master Garry Kasparov a little bit later on. It's going to be fascinating. Dave, glad you're here. Dave, good to see you, sir. >> John, always a pleasure. >> How you been? >> Up from DC, you know, I was in your area last week doing some stuff with John Furrier, but I've been great. >> Stopped by the White House, drop in? >> You know, I didn't this time. No? >> No. >> Dave: My son, as you know, goes to school down there, so when I go by my hotel, I always walk by the White House, I wave. >> Just in case, right? >> No reciprocity. >> Same deal, we're in the same boat. Let's talk about what we have coming up here today. We're talking about this digital transformation that's going on within multiple industries. But you have an interesting take on it that it's a different wave, and it's a bigger wave, and it's an exciting wave right now, that digital is creating. >> Look at me, I've been around for a long time. I think we're entering a new era. You know, the great thing about theCUBE is you go to all these events, you hear the innovations, and we started theCUBE in 2010. The Big Data theme was just coming in, and it appeared, everybody was very excited. Still excited, obviously, about the data-driven concept. But we're now entering a new era. It's like every 10 years, the parlance in our industry changes. It was cloud, Big Data, SaaS, mobile, social. It just feels like, okay, we're here. We're doing that now. That's sort of a daily ritual. We used to talk about how it's early innings. It's not anymore. It's the late innings for those. 
I think the industry is changing. The descriptors of what we're entering are autonomous, pervasive, self-healing, intelligent. When you infuse artificial intelligence, I'm not crazy about that name, but when you infuse that throughout the landscape, things start to change. Data is at the center of it, but I think, John, we're going to see the parlance change. IBM, for example, uses cognitive. People use artificial intelligence. I like machine intelligence. We're still trying to figure out the names. To me, it's an indicator that things are changing. It's early innings now. What we're seeing is a whole new set of opportunities emerging, and if you think about it, it's based on this notion of digital services, where data is at the center. That's something that I want to poke at with the folks at IBM and our guests today. How are people going to build new companies? You're certainly seeing it with the likes of Uber, Airbnb, Waze. It's built on these existing cloud and security, off-the-shelf, if you will, horizontal technologies. How are new companies going to be built, what industries are going to be disrupted? Hint, every industry. But really, the key is, how will existing companies keep pace? That's what I really want to understand. >> You said, every industry's going to be disrupted, which is certainly, I think, an exciting prospect in some respects, but a little scary to some, too, right? Because they think, "No, we're fat and happy "and things are going well right now in our space, "and we know our space better than anybody." Some of those leaders might be thinking that. But as you point out, digital technology has transformed to the extent now that there's nobody safe, because you just slap this application in, you put this technology in, and I'm going to change your business overnight. >> That's right. Digital means data, data is at the center of this transformation.
A colleague of mine, David Moschella, has come up with this concept of the matrix, and what the matrix is is a set of horizontal technology services. Think about cloud, or SaaS, or security, or mobile, social, all the way up the stack through data services. But when you look at the companies like Airbnb and Uber and, certainly, what Google is doing, and Facebook, and others, they're building services on top of this matrix. The matrix is comprised of vertical slices by industry and horizontal slices of technology. Disruptors are cobbling together through software and data new sets of services that are disrupting industries. The key to this, John, in my view, anyway, is that, historically, within healthcare or financial services, or insurance, or manufacturing, or education, those were very siloed. But digital and data allows companies and disruptors to traverse silos like never before. Think about it. Amazon buying Whole Foods. Apple getting into healthcare and financial services. You're seeing these big giants disrupt all of these different industries, and even smaller guys, there's certainly room for startups. But it's all around the data and the digital transformation. >> You spoke about traditional companies needing to convert, right? Needing to get caught up, perhaps, or to catch up with what's going on in that space. What do you do with your workforce in that case? You've got a bunch of great, hardworking people, embedded legacy. You feel good about where you are. And now you're coming to that workforce and saying, "Here's a new hat." >> I think that's a great question. I think the concern that one would have for traditional companies is, data is not foundational for most companies. It's not at their core. The vast majority of companies, the core are the people. You hear it all the time. "The people are our greatest asset." That, I hate to say it, but it's somewhat changing. 
If you look at the top five companies by market cap, their greatest asset is their data, and the people are surrounding that data. They're very, very important because they know how to leverage that data. But if you look at most traditional companies, people are at their core. Data is kind of, "Oh, we got this bolt-on," or it's in a bunch of different silos. The big question is, how do they close that gap? You're absolutely right. The key is skillsets, and the skills have to be, you know, we talk about five-tool baseball players. You're a baseball fan, as am I. Well, you need multi-tool players, those that understand not only the domain of whether it's marketing or sales or operational expertise or finance, but they also require digital expertise. They know, for example, if you're a marketing professional, they know how to do hypertargeting. They know how to leverage social. They know how to do SEO, all these digital skills, and they know how to get information that's relevant and messaging out into the marketplace and permeate that. And so, we're entering, again, this whole new world that's highly scalable, highly intelligent, pervasive, autonomous. We're going to talk about that today with a lot of their guests, with a lot of our guests, that really are kind of futurists and have thought through, I think, the changes that are coming. >> You can't have a DH anymore, right, that's what you're saying? You need a guy that can play the field. >> Not only play the field, not only a utility player, but somebody who's a utility player, but great. Best of breed at all these different skillsets. >> Machine learning, we haven't talked much about that, and another term, right, that certainly has different definitions, but certainly real specific applications to what's going on today. We'll talk a lot about ML today. 
Your thoughts about that, and how that squares into the artificial intelligence picture, and what we're doing with all those machines out there that are churning 24/7. >> Yeah, so, real quick, I know we're tight on time here. Artificial intelligence to me is the umbrella. Machine learning is the application of math and algorithms to solve a particular problem or answer a particular question. And then there's deep learning, which is highly focused neural networks that go deeper and deeper and deeper, and become auto-didactic, self-learning, in a manner. Those are just the very quick and rudimentary description. Machine learning to me is the starting point, and that's really where organizations really want to start to learn and begin to close the gap. >> A lot of ground to cover, and we're going to do that for you right here on theCUBE as we continue our coverage of Machine Learning Everywhere: Your Ladder To AI, coming up here, IBM hosting us in Midtown, New York. Back with more here on theCUBE in just a bit. (fast electronic music)
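Dave's thumbnail taxonomy above, with AI as the umbrella and machine learning as "the application of math and algorithms to answer a particular question," can be made concrete with a toy sketch. Nothing below is from the event or any IBM product; it is a minimal, illustrative logistic-regression classifier trained by gradient descent, using only the standard library:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit w, b for p(y=1|x) = sigmoid(w*x + b) by per-sample gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y   # gradient of the log-loss
            w -= lr * err * x
            b -= lr * err
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Toy, invented data: "transactions" above 5 units are labeled positive.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
print(predict(w, b, 1.5), predict(w, b, 8.5))  # 0 1 on this separable toy set
```

Deep learning, in Dave's framing, would replace this single sigmoid unit with many stacked layers of them; the core of "math and algorithms answering a question" stays the same.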

Published Date : Feb 27 2018

Wrap Up - IBM Machine Learning Launch - #IBMML - #theCUBE


 

(jazzy intro music) [Narrator] Live from New York, it's the Cube! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. This is theCUBE, the leader in live tech coverage. We've been covering, all morning, the IBM Machine Learning announcement. Essentially what IBM did is they brought Machine Learning to the z platform. My co-host and I, Stu Miniman, have been talking to a number of guests, and we're going to do a quick wrap here. You know, Stu, my take is, when we first heard about this, and the world first heard about this, we were like, "Eh, okay, that's nice, that's interesting." But what it underscores is IBM's relentless effort to continue to keep z relevant. We saw it with the early Linux stuff, we're now seeing it with all the OpenSource and Spark tooling. You're seeing IBM make big positioning efforts to bring analytics and transactions together, and the simple point is, a lot of the world's really important data runs on mainframes. You were just quoting some stats, which were pretty interesting. >> Yeah, I mean, Dave, you know, one of the biggest challenges we know in IT is migrating. Moving from one thing to another is really tough. I love the comment from Barry Baker. Well, if I need to change my platform, by the time I've moved it, that whole digital transformation, we've missed that window. It's there. We know how long that takes: months, quarters. I was actually watching Twitter, and it looks like Chris Maddern is here. Chris was the architect of Venmo, which my younger sisters, all the millennials that I know, everybody uses Venmo. He's here, and he was like, "Almost all the banks, airlines, and retailers "still run on mainframes in 2017, and it's growing. "Who knew?" 
You've got a guy here that's developing really cool apps that was finding this interesting, and that's an angle I've been looking at today, Dave, is how do you make it easy for developers to leverage these platforms that are already there? The developers aren't going to need to care whether it's a mainframe or a cloud or x86 underneath. IBM is giving you the options, and as a number of our guests said, they're not looking to solve all the problems here. Here's taking this really great, new type of application using Machine Learning and making it available on that platform that so many of their customers already use. >> Right, so we heard a little bit of roadmap here: the ML for z goes GA in Q1, and then we don't have specific timeframes, but we're going to see Power platform pick this up. We heard from Jean-Francois Puget that they'll have an x86 version, and then obviously a cloud version. It's unclear what that hybrid cloud will look like. It's a little fuzzy right now, but that's something that we're watching. Obviously a lot of the model development and training is going to live in the cloud, but the scoring is going to be done locally is how the data scientists like to think about these things. So again, Stu, more mainframe relevance. We've got another cycle coming soon for the mainframe. We're two years into the z13. When IBM has mainframe cycles, it tends to give a little bump to earnings. Now, granted, a smaller and smaller portion of the company's business is mainframe, but still, mainframe drags a lot of other software with it, so it remains a strategic component. So one of the questions we get a lot is what's IBM doing in so-called hardware? Of course, IBM says it's all software, but we know they're still selling boxes, right? So, all the hardware guys, EMC, Dell, IBM, HPE, et cetera. A lot of software content, but it's still a hardware business. So there's really two platforms there: there's the z and there's the Power. 
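The "model development and training in the cloud, scoring done locally" split discussed here hinges on the trained model being portable. A minimal, hypothetical sketch of that hand-off follows; the feature names and the plain-JSON exchange format are invented for illustration and are not the actual Watson ML format:

```python
import json

# "Cloud side": pretend a model has been trained and reduced to plain
# coefficients. Feature names and format are invented placeholders.
trained = {"weights": {"balance": 0.0001, "recent_txns": -0.02}, "bias": 0.1}
payload = json.dumps(trained)        # the only thing that travels

# "Platform side": deserialize once, then score locally, next to the data.
model = json.loads(payload)

def score(features):
    return model["bias"] + sum(model["weights"][name] * value
                               for name, value in features.items())

print(round(score({"balance": 1000, "recent_txns": 5}), 6))  # prints 0.1
```

The point of the pattern is that the large, often sensitive scoring data never travels; only the small trained model does.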
And those are both strategic to IBM. It sold its x86 business because it didn't see it as strategic. They just put Bob Picciano in charge of the Power business, so there's obviously real commitments to those platforms. Will they make a dent in the market share numbers? Unclear. It looks like it's steady as she goes, not dramatic increase in share. >> Yeah, and Dave, I didn't hear anybody come in here and say this offering is going to say, well let me dump x86 and go buy mainframe. That's not the target that I heard here. I would have loved to hear a little bit more as to where this fits into the broader IOT strategy. We talked a little bit on the intro, Dave. There's a lot of reasons why data's going to stick at the edge when we look at the numbers. For the huge growth of public cloud, the amount of data in public cloud hasn't caught up to the equivalent of what it would be in data centers itself. What I mean by that is, we usually spend, say 30% on average for storage costs inside a data center. If we look at public cloud, it's more around 10%. So, at AWS Reinvent, I talked to a number of the ecosystem partners, that started to see things like data lakes starting to appear in the cloud. This solution isn't in the data lake family, but it's with the analytics and everything that's happening with streaming and machine learning. It's large repositories of data and huge transactions of data that are happening in the mainframe, and just trying to squint through where all the data lives, and the new waves of technologies coming in. We heard how this can tie into some of the mobile and streaming activities that aren't on the mainframe, so that it can pull them into the other decisions, but some broader picture that I'm sure IBM will be able to give in the future. >> Well, normally you would expect a platform that is however many decades old the mainframe is, after the whole mainframe downsizing trend, you would expect there would be a managed decline in that business. 
I mean, you're seeing it in a lot of places now. We've talked about this, with things like Symmetrix, right? You minimize and focus the R&D investments, and you try to manage cost, you manage the decline of the business. IBM has almost sort of flipped that. They say, okay, we've got DB2, we're going to continue to invest in that platform. We've got our major subsystems, we're going to enhance the platform with Open Source technologies. We've got a big enough base that we can continue to mine perpetually. The more interesting thing to me about this announcement is it underscores how IBM is leveraging its analytics platform. So, we saw the announcement of the Watson Data Platform last September, which was sort of this end-to-end data pipeline collaboration between different personas, which is quite unique in the marketplace, a lot of differentiation there. Still some services. Last week at Spark Summit, I talked to some of the users and some of the partners of the Watson Data Platform. They said it's great, we love it, it's probably the most robust in the marketplace, but it's still a heavy lift. It still requires a fair amount of services, and IBM's still pushing those services. So a large portion of the company is still a services company. So, not surprising there, but as I've said many, many times, the challenge IBM has is to really drive that software business, simplify the deployment and management of that software for its customers, which is something that I think it's working hard on doing. And the other thing is you're seeing IBM leverage those platforms, those analytics platforms, into different hardware segments, or hardware/cloud segments, whether it's Bluemix, z, Power, so, pushing it out through the organization. IBM still has a stack, like Oracle has a stack, so wherever it can push its own stack, it's going to do that, cuz the margins are better. At the same time, I think it understands very well, it's got to have open source choice.
>> Yeah, absolutely, and that's something we heard loud and clear here, Dave, which is what we expect from IBM: choice of language, choice of framework. When I hear the public cloud guys, it's like, "Oh, well here's kind of the main focus we have, "and maybe we'll have a little bit of choice there." Absolutely the likes of Google and Amazon are working with open source, but at least first blush, when I look at things, it looks like once IBM fleshes this out -- and as we've said, it's the Spark to start and others that they're adding on -- but IBM could have a broader offering than I expect to see from some of the public cloud guys. We'll see. As you know, Dave, Google's got their cloud event in a couple of weeks in San Francisco. We'll be covering that, and of course Amazon, you expect their regular cadence of announcements that they'll make. So, definitely a new front in the Cloud Wars as it were, for machine learning. >> Excellent! Alright, Stu, we got to wrap, cuz we're broadcasting the livestream. We got to go set up for that. Thanks, I really appreciate you coming down here and co-hosting with me. Good event. >> Always happy to come down to the Big Apple, Dave. >> Alright, good. Alright, thanks for watching, everybody! So, check out SiliconAngle.com, you'll get all the new from this event and around the world. Check out SiliconAngle.tv for this and other CUBE activities, where we're going to be next. We got a big spring coming up, end of winter, big spring coming in this season. And check out WikiBon.com for all the research. Thanks guys, good job today, that's a wrap! We'll see you next time. This is theCUBE, we're out. (jazzy music)

Published Date : Feb 15 2017

Barry Baker, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Narrator] Live from New York, it's theCUBE! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Hi everybody, we're back, this is theCUBE. We're live at the IBM Machine Learning Launch Event. Barry Baker is here, he's the Vice President of Offering Management for z Systems. Welcome to theCUBE, thanks for coming on! >> Well, it's my first time, thanks for having me! >> A CUBE newbie, alright! Let's get right into it! >> [Barry Baker] Go easy! >> So, two years ago, January of 2015, we covered the z13 launch. The big theme there was bringing analytics and transactions together, z13 being the platform for that. Today, we're hearing about machine learning on mainframe. Why machine learning on mainframe, Barry? >> Well, for one, it is all about the data on the platform, and the applications that our clients have on the platform. And it becomes a very natural fit for predictive analytics and what you can get from machine learning. So whether you're trying to do churn analysis or fraud detection at the moment of the transaction, it becomes a very natural place for us to inject what is pretty advanced capability from a machine learning perspective into the mainframe environment. We're not trying to solve all analytics problems on the mainframe, we're not trying to become a data lake, but for the applications and the data that reside on the platform, we believe it's a prime use case that our clients are waiting to adopt. >> Okay, so help me think through the use case of I have all this transaction data on the mainframe. Not trying to be a data lake, but I've got this data lake elsewhere, that might be useful for some of the activity I want to do. How do I do that? I'm presuming I'm not extracting my sensitive transaction data and shipping it into the data lake. So, how am I getting access to some of that social data or other data? 
>> Yeah, and we just saw an example in the demo pad before, whereby the bulk of the data you want to perform scoring on, and also the machine learning on to build your models, is resident on the mainframe, but there does exist data out there. In the example we just saw, it was social data. So the demo that was done was how you can take and use IBM Bluemix and get at key pieces of social data. Not a whole mass of the volume of unstructured data that lives out there. It's not about bringing that to the platform and doing machine learning on it. It's about actually taking a subset of that data, a filtered subset that makes sense to be married with the bigger data set that sits on the platform. And so that's how we envision it. We provide a number of ways to do that through the IBM Machine Learning offering, where you can marry data sources from different places. But really, the bulk of the data needs to be on z and on the platform for it to make sense to have this workload running there. >> Okay. One of the big themes, of course, that IBM puts forth is platform modernization, application modernization. I think it kind of started with Linux on z? Maybe there were other examples, but that was a big one. I don't know what the percentage is, but a meaningful percentage of workloads running on z are Linux-based, correct? >> Yeah, so, the way I would view it is it's still today that the majority of workload on the platform is z/OS based, but Linux is one of our fastest growing workloads on the platform. And it is about how do you marry and bring other capabilities and other applications closer to the systems of record that is sitting there on z/OS. >> So, last week, at AnacondaCON, you announced Anaconda on z, certainly Spark, a lot of talk on Spark. Give us the update on the sort of tooling. >> We recognized a few years back that Spark was going to be key to our platform longer-term. So, contrary to what people have seen from z in the past, we jumped on it fast. 
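The demo pattern Barry describes at the start of this answer, filtering the external social stream down to the slice that matches on-platform records and then marrying the two before scoring, can be sketched in a few lines. All field names, sentiment values, and the scoring rule below are invented for illustration; this is not the IBM Machine Learning API:

```python
# Hypothetical sketch: filter a large external social source down to
# known customers, then marry it with records that stay on the platform.

transactions = [   # stand-in for data resident on z
    {"cust_id": 1, "recent_txns": 14},
    {"cust_id": 2, "recent_txns": 2},
]

social_feed = [    # stand-in for a much larger external source
    {"cust_id": 1, "sentiment": -0.8},
    {"cust_id": 2, "sentiment": 0.4},
    {"cust_id": 3, "sentiment": -0.9},   # no matching record; filtered out
]

known = {t["cust_id"] for t in transactions}
# Only the filtered subset moves; the bulk of the data never leaves.
sentiment = {s["cust_id"]: s["sentiment"] for s in social_feed
             if s["cust_id"] in known}

def churn_score(txn, sent):
    """Toy rule marrying an on-platform feature with a social feature."""
    raw = 0.5 - 0.01 * txn["recent_txns"] - 0.3 * sent
    return max(0.0, min(1.0, raw))

scores = {t["cust_id"]: churn_score(t, sentiment.get(t["cust_id"], 0.0))
          for t in transactions}
print(scores)
```

The filter-then-join order is the whole point: the subset of social data that travels is sized by the on-platform records it will be married to, not by the size of the social stream.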
We view it as an enabling technology, an enabling piece of infrastructure that allows for analytics solutions to be built and brought to market really rapidly. And the machine learning announcement today is proof of that. In a matter of months, we've been able to take the cloud-based IBM Watson Machine Learning offering and have the big chunk of it run on the mainframe, because of the investment we made in Spark a year and a half ago, two years ago. We continue to invest in Spark; we're at the 2.0.2 level. The announcement last week around Anaconda is, again, how do we continue to bring the right infrastructure, from an analytics perspective, onto the platform. And you'll see later, maybe in the session, where the roadmap for ML isn't just based on Spark. The roadmap for ML also requires us to go after and provide new runtimes and new languages on the platform, like Python and Anaconda in particular. So, it's a coordinated strategy where we're laying the foundation on the infrastructure side to enable the solutions from the analytics unit. >> Barry, when I hear about streaming, it reminds me of the general discussion we've been having with customers about digital transformation. How does mainframe fit into that digital mandate that you hear from customers? >> That's a great, great question. From our perspective, we've come out of the woods of many of our discussions with clients being about, "I need to move off the platform," and rather, "I need to actually leverage this platform, because the time it's going to take me to move off this platform, by the time I do that, digital's going to overwash me and I'm going to be gone." So the very first step that our clients take, and some of our leading clients take, on the platform for digital transformation, is moving toward standard RESTful APIs, taking z/OS Connect Enterprise Edition, putting that in front of their core, mission-critical applications and data stores, and enabling those assets to be exposed externally.
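From the consuming side, an asset exposed through an API layer such as z/OS Connect EE looks like any other JSON-over-HTTP endpoint. The URL, payload, and token below are invented for illustration; the sketch only constructs the request rather than sending it:

```python
import json
import urllib.request

# Hypothetical endpoint fronted by an API layer; path and payload invented.
url = "https://api.example-bank.com/accounts/12345/balance"

req = urllib.request.Request(
    url,
    data=json.dumps({"currency": "USD"}).encode("utf-8"),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer <token>"},
    method="POST",
)
# urllib.request.urlopen(req) would actually send it; the consuming app
# never needs to know that a COBOL program answers on the other side.
print(req.get_method(), req.full_url)
```

That opacity is the point of the API channel: the mobile or web front end sees a standard RESTful contract, not the platform behind it.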
And what's happening is those clients then build out new engaging mobile web apps that are then coming directly back to the mainframe at those high-value assets. But in addition, what that is driving is a whole other set of interaction patterns that we're actually able to see on the mainframe in how they're being used. So, opening up the API channel is the first step our clients are taking. Next is how do they take the 200 billion lines of COBOL code that is out there in the wild, running on these systems, and how do they over time modernize it? And we have some leading clients that are doing very tight integration whereby they have a COBOL application, and as they want to make changes to it, we give them the ability to make changes in it, but do it in Java, or do it in another language, a more modern language, tightly integrated with the COBOL runtime. So, we call that progressive modernization. It's not about coming in and replacing the whole app and rewriting that thing. That's one next step on the journey, and then as the clients start to do that, they start to really need to lay down a continuous integration, continuous delivery tool chain, building a whole dev ops end-to-end flow. That's kind of the path that our clients are on for really getting much faster and getting more productivity out of their development side of things. And in turn, the platform is now becoming a platform that they can deliver results on, just like they could on any other platform. >> That's big because a lot of customers used to complain, well, I can't get COBOL skills or, you know, and so IBM's answer was often, well, we got 'em. You can outsource it to us, and that's not always the preferred approach, so glad to hear you're addressing that. On the dev ops discussion, you know, a lot of times dev ops is about breaking stuff. How about the mainframe workload's all about not breaking stuff, so waterfall, more traditional methodologies are still appropriate.
Can you help us understand how customers are dealing with that sort of schism? >> Yeah, I think dev ops, some people would come at it and say, that's just about moving fast and breaking some eggs and cleaning up the mess and then moving forward from there, but from our perspective, that's not it, right? That can't be it for our customers, because the criticality of these systems will not allow that. So our dev ops model is not so much about move fast and break some eggs; it's about moving fast in smaller increments and establishing clear chains and a clear pipeline with automated test suites getting executed and run at each phase of the pipeline before you move to production. So, we're not going to... Our approach is not to compromise on quality as you move towards dev ops, and we have, internally, our major subsystems, right? So, CICS, IMS, DB2. They're all on their own journey to deliver and move towards continuous integration and dev ops internally. So, we're eating our own... We're dogfooding this here, right? We're building our own teams around this and we're not seeing a decline in quality. In fact, as we start to really move testing to the left, as they call it, shift-left testing, right? Earlier in the cycle you regression test. We are seeing better quality come out of that effort. >> You put forth this vision, as I said, at the top of this segment, this vision of bringing data, analytics, and transactions together. That was the z13 announcement. But the reality is, a lot of customers would have their mainframe and then they'd have, you know, some other data warehouse, maybe an InfiniBand pipe to that data warehouse; that was their approximation of real time. So, the vision that you put forth was to consolidate that. And has that happened? Are you starting to do that? What are they doing with the data warehouse? >> So, we're starting to see it. I mean, and frankly, we have clients that struggle with that model, right?
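The dev ops model Barry laid out a moment ago, smaller increments with automated test suites gating each phase before promotion, is at its core a simple control flow. The stage names and gate checks below are invented placeholders, not IBM tooling:

```python
def run_pipeline(stages):
    """Run (name, gate) pairs in order; halt at the first failing gate."""
    promoted = []
    for name, gate in stages:
        if not gate():
            return promoted, name      # name of the stage that was blocked
        promoted.append(name)
    return promoted, None

healthy = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("regression-tests", lambda: True),   # "shift-left" testing lives here
    ("deploy", lambda: True),
]
broken = [("build", lambda: True), ("unit-tests", lambda: False)]

print(run_pipeline(healthy))  # (['build', 'unit-tests', 'regression-tests', 'deploy'], None)
print(run_pipeline(broken))   # (['build'], 'unit-tests')
```

A failing gate halts promotion rather than letting a broken increment reach production, which is how "move fast" coexists with workloads that cannot tolerate broken eggs.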
And that's precisely why we have a very strong point of view that says, if this is data that you're going to get value from, from an analytics perspective, and you can use it on the platform, moving it off the platform is going to create a number of challenges for you. And we've seen it first hand. We've seen companies that ETL the data off the platform. They end up with 9, 10, 12 copies of the data. As soon as you do that, the data is old, it's stale, and so any insights you derive are then going to be potentially old and stale as well. The other side of it is, our customers in the industries that are heavy users of the mainframe, finance, banking, healthcare. These are heavily regulated industries that are getting more regulated. And they're under more pressure to ensure governance and meet the various regulation needs. As soon as you start to move that data off the platform, your problem just got that much harder. So, we are seeing a shift in approaches, and it's going to take some time for clients to get past this, right? Because enterprise data warehouse is a pretty big market and there's a lot of them out there, but we're confident that for specific use cases, it makes a great deal of sense to leave the data where it is, bring the analytics as close to that data as possible, and leverage the insight right there at the point of impact as opposed to pushing it off. >> How about the economics? So, I have talked, certainly talked to customers that understand it for a lot of the work that they're doing. Doing it on the Z platform is more cost effective than maybe trying to manage a bunch of, you know, bespoke x86 boxes, no question. But at the end of the day, there's still that CAPEX. What is IBM doing to help customers, sort of, absorb, you know, the costs and bring together, more aggressively, analytic and transaction data? 
>> Yeah, so, in agreement, 100%. I think we can create the best technology in the world, but if we don't close on the financials, it's not going to go anywhere, it's not going to move. So, from an analytics perspective, just starting at the ground level with Spark, even underneath the Spark layer, there are things we've done in the hardware to accelerate performance, and so that's one layer. Then you move into Spark. Well, Spark is running on our Java, our JDK, and it takes advantage of being offloaded to the zIIP specialty processors. So, those processors alone are lower cost than general purpose processors. We then have additionally thought this through, in terms of working with clients and seeing that, you know, a typical use case for running Spark on the platform, they require three or four zIIPs and then a hundred, two hundred gig of additional memory. We've come at that and said, let's do a bundled offer that comes in and says, for that workload, we're going to come in with a different price point for you. So, the other side of it is, we've been delivering, over the last couple of years, ways to isolate workload from a software license cost perspective, right. 'Cause the other knock that people will say is, as I add new workload, it impacts all the rest of my software. Well, no. There are multiple paths forward for you to isolate that workload, add new workload to the platform, and not have it impact your existing MLC charges, so we continue to actually evolve that and make that easier to do, but that's something we're very focused on. >> But that's more than just, sort of, an LPAR or... >> Yeah, so there's other ways we could do that with... (mumbles) We're IBM, so there's acronyms, right. 
So there's zCAP and there's all the other pricing mechanisms that we can take advantage of to help you. You know, the way I simply say it is, we have to enable for new workload, we need to enable the pricing to be supportive of growth, right, not protective, and so we are very focused on, how do we do this in the right way that clients can adopt it, take advantage of the capabilities, and also do it in a cost effective way. >> And what about security? That's another big theme that you guys have put forth. What's new there? >> Yeah, so we have a lot underway from the security perspective. I'm going to say stay tuned, more to come there, but there's a heavy investment, again, going back to what our clients are struggling with and what we hear day in and day out, which is around, how do I do encryption pervasively across the platform for all of the data being managed by the system, how do I do that with ease, and how do I do that without having to drive changes at the application layer, without having to drive operational changes. How do I enable these systems to get that much more secure with ease and at low cost. >> Right, because if you... In an ideal world you'd encrypt everything, but there's a cost of doing that. There are some downstream nuances with things like compression. >> Yup. >> And so forth so... Okay, so more to come there. We'll stay tuned. >> More to come. >> Alright, we'll give you the final word. Big day for you guys, so congratulations on the announcement. You've got a bunch of customers who're comin' in very shortly. >> Yeah, no... We're extremely excited to be here. We think that the combination of IBM Systems, working with the IBM Analytics team to put forward an offering that pulls key aspects of Watson and delivers it on the mainframe, is something that will get noticed and actually solve some real challenges, so we're excited. >> Great. Barry, thanks very much for coming to theCUBE, appreciate it. >> Thanks for having me. 
Thanks for going easy on me. >> You're welcome. Keep it right there. We'll be back with our next guest, right after this short break. (techno music)
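The stage-gated flow Baker describes, with automated test suites run at each phase of the pipeline before anything moves to production, can be sketched minimally in Python. The stage names and checks below are illustrative assumptions, not IBM's actual pipeline:

```python
# Minimal sketch of a stage-gated delivery pipeline: a change advances
# to the next phase only when that phase's automated test suite passes,
# so a defect caught early ("shift-left") never reaches production.

def run_pipeline(change, stages):
    """Run `change` through (name, test_suite) stages in order.

    Each test_suite is a list of predicates over the change. Returns the
    list of stages passed and whether the change reached production.
    """
    passed = []
    for name, suite in stages:
        if not all(check(change) for check in suite):
            return passed, False          # gate failed: stop promoting
        passed.append(name)
    return passed, True                   # cleared every gate

# Illustrative change record and per-stage gates (assumed names).
change = {"compiles": True, "unit_ok": True, "regression_ok": False}
stages = [
    ("build",       [lambda c: c["compiles"]]),
    ("integration", [lambda c: c["unit_ok"]]),
    ("regression",  [lambda c: c["regression_ok"]]),  # fails here
    ("production",  []),
]

print(run_pipeline(change, stages))  # (['build', 'integration'], False)
```

A real pipeline would execute actual test commands at each gate; the point is only that promotion is conditional at every phase rather than a single big-bang release.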

Published Date : Feb 15 2017


Jean Francois Puget, IBM | IBM Machine Learning Launch 2017


 

>> Announcer: Live from New York, it's theCUBE, covering the IBM machine learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's the distinguished engineer for machine learning and optimization at IBM Analytics, a CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys launch one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. Algorithms, when you look at newspapers or blogs, social media, it's all about algorithms. Our view is that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. Before, you need to get data, to transform it, to make it usable for machine learning. And then, you run algorithms. These produce models, and then, you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then, you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operational system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks in theCUBE in the past who have said, years ago actually, "You know what, algorithms are plentiful." I remember my friend Avi Mehta made the statement, "Algorithms are free. It's what you do with them that matters." >> Exactly. I believe that, undoubtedly, open source won for machine learning algorithms. 
Now the future is with open source, clearly. But it solves only a part of the problem you're facing if you want to put machine learning into action. So, exactly what you said. What you do with the results of the algorithm is key. And open source people don't care much about it, for good reasons. They are focusing on producing the best algorithm. We are focusing on creating value for our customers. It's different. >> In terms of, you mentioned open source a couple times, in terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source, how do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't even know if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source packages. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorite, well, let's just start with fraud. Fraud detection has become much, much better over the past 10 years, certainly, but it's still not perfect. I don't know if perfection is achievable, but there are a lot of false positives. How will machine learning affect that? Can we expect as consumers even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, this credit card transaction coming in has a high probability to be fraud. Or this one has a lower probability. 
But then it's up to the designer of the overall application to make decisions, so what we recommend is to use machine learning predictions, but not only that, and then use, maybe, (murmuring). For instance, if your machine learning model tells you this is a fraud with a high probability, say 90%, and this is a customer you know very well, a 10-year customer, then you can be confident that it's a fraud. Then the next prediction tells you there is a 70% probability, but it's a customer of one week. In a week, we don't know the customer, so the confidence we can put in the machine learning should be low, and there you will not reject the transaction immediately. Maybe you don't approve it automatically, maybe you send a one-time passcode, or you enter a separate verification system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. With IBM, you don't see us speak much about artificial intelligence in general, because we don't believe we're here to replace humans. We're here to assist humans, so we say augmented intelligence, or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career as a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we would have Terminator in the next decade, and this and that. And the same happened in the '60s, or just after the '60s. And then, there was an AI winter, and we have a risk here of having another AI winter, because some people are just raising red flags that are not substantiated, I believe. 
I don't think the technology is here that we can replace human decision-making altogether any time soon, but we can help. We can certainly make some professions more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, so we're automating a number of things, and maybe we won't need to have people do quality checks and just have an automated vision system detect defects. Sure, so we're automating more and more, but this is not new, it has been going on for centuries. >> Well, the list evolves. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning, but it is interesting. You know, each time there is a capability that a machine can automate, we basically redefine intelligence to exclude it, so you know. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb a stair faster than us? Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I'm now doing the analytics, the machine learning on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend, so we are not using the mainframe's base compute system. We recommend using zIIPs, so additional specialty processors, to not overload, so it's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb, and "you don't want to disturb" takes a lot of different meanings. One that you just said: you don't want to slow down your operational processing, because you're going to hurt your business. But you also want to be careful. 
Say we have a payment system where there is a machine learning model predicting fraud probability, as part of the system. You don't want a young, bright data scientist deciding that he has a great idea, a great model, and pushing his model into production without asking anyone. So you want to control that. That's why we insist, we are providing governance that includes a lot of things, like keeping track of how models were created from which data sets, so lineage. We also want to have access control and not allow just anyone to deploy a new model because we make it easy to deploy, so we want to have role-based access, and only someone with some executive role, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers, they don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, the first solution's going to be on z/OS, what does the roadmap look like, and what are some of the challenges of rolling this out to other private cloud solutions? >> We are going to ship this quarter IBM machine learning for Z. It starts with Spark ML as the base open source. This is interesting, but it's not all there is for machine learning. So that's how we start. We're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe the press release said down the road things like TensorFlow are coming, H2O. >> Anaconda we announced for next quarter, so we will leverage this when it's out. 
Then indeed, we have a roadmap to include the major open source packages, so the major ones are those from Anaconda (murmuring), mostly. For deep learning, TensorFlow and probably one or two additional; we're still discussing. One that I'm very keen on is called XGBoost, in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago also, or maybe a bit more, but not so long ago, and now, if you have structured data, it is the best choice today. It's really fast-moving, but so, we will support the major deep learning packages and the major classical learning packages, like the ones from Anaconda or XGBoost. The other thing is we start with Z. We announced in the analyst session that we will have a Power version and a private cloud, meaning XTC69X version as well. I can't tell you when because it's not firm, but it will come. >> And in public cloud as well, I guess, you've got components in the public cloud today, like the Watson Data Platform, that you've extracted and put here. >> We have extracted part of the Data Science Experience, so we've extracted notebooks and a graphical tool called ModelBuilder from DSX as part of IBM machine learning now, and we're going to add more of DSX as we go. But the goal is to really share code and function across private cloud and public cloud. As Rob Thomas defined it, we want, with private cloud, to offer all the features and functionality of public cloud, except that it will run inside a firewall. We are really developing machine learning and Watson machine learning on a common code base. It's an internal open source project. 
We share code, and then we ship on different platforms. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more, some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning, you basically have two phases: one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. Training is a good fit for cloud because usually you need a large computing system but only for a limited time, so elasticity's great. But then deployment, if you want to score transactions in a CICS transaction, it has to run beside CICS, not in the cloud. If you want to score data on an IoT gateway, you want to score on the gateway, not in a data center. I would say that may not be what people think of first, but what will really drive the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches are turning into health measurement systems. You want to score your health data on the watch, not on the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. That's the value of machine learning for Z: if you want to score transactions happening on Z, you need to be running on Z. So it's clear, mainframe people, they don't want to hear about public cloud, so they will be the last ones moving. They have their reasons, but they like the mainframe because it is really, really secure and private. >> Dave: Public cloud's a dirty word. >> Yes, yes, for Z users. At least that's what I was told, and I could check with many people. But we know that in general the move is toward public cloud, so we want to help people, depending on their journey, to the cloud. 
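The scoring-as-one-input flow Puget describes, where the model's fraud probability is weighed against how well the business knows the customer, can be sketched in a few lines of Python. The thresholds, the tenure cutoff, and the action names are illustrative assumptions, not IBM's actual rules:

```python
# Hedged sketch of "prediction as one input to the decision": the model's
# fraud probability is combined with how long we have known the customer.
# All numbers and action names below are assumptions for illustration.

def decide(fraud_probability, customer_tenure_days):
    """Return an action for a card transaction: the same score leads to a
    hard reject for a long-known customer but only a step-up check (e.g.
    a one-time passcode) for a customer we have known for a week."""
    well_known = customer_tenure_days >= 365   # enough history to trust the model
    if fraud_probability >= 0.9 and well_known:
        return "reject"
    if fraud_probability >= 0.7:
        return "step_up"                       # e.g. send a one-time passcode
    return "approve"

print(decide(0.90, customer_tenure_days=3650))  # 10-year customer -> reject
print(decide(0.70, customer_tenure_days=7))     # 1-week customer  -> step_up
print(decide(0.10, customer_tenure_days=3650))  # low score        -> approve
```

The design point is the one made in the interview: the model's output is never the decision itself, only one input to a business rule that a human designed and can govern.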
>> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, be right back. (electronic keyboard music)

Published Date : Feb 15 2017


Bryan Smith, Rocket Software - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Announcer: Live from New York, it's theCUBE, covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. We're here at the Waldorf Astoria covering the IBM Machine Learning Launch Event, bringing machine learning to the IBM Z. Bryan Smith is here, he's the vice president of R&D and the CTO of Rocket Software, powering the path to digital transformation. Bryan, welcome to theCUBE, thanks for coming on. >> Thanks for having me. >> So, Rocket Software, Waltham, Mass. based, close to where we are, but a lot of people don't know about Rocket, so pretty large company, give us the background. >> It's been around for, this'll be our 27th year. Private company, we've been a partner of IBM's for the last 23 years. Almost all of that is in the mainframe space, or we focused on the mainframe space, I'll say. We have 1,300 employees, we call ourselves Rocketeers. It's spread around the world. We're really an R&D focused company. More than half the company is engineering, and it's spread across the world on every continent and most major countries. >> You're essentially OEM-ing your tools as it were. Is that right, no direct sales force? >> About half, there are different lenses to look at this, but about half of our go-to-market is through IBM with IBM-labeled, IBM-branded products. We've always been, for that side of products, we've always been the R&D behind the products. The partnership, though, has really grown. It's more than just an R&D partnership now, now we're doing co-marketing, we're even doing some joint selling to serve IBM mainframe customers. The partnership has really grown over these last 23 years from just being the guys who write the code to doing much more. >> Okay, so how do you fit into this announcement? Machine learning on Z, where does Rocket fit? >> Part of the announcement today is a very important piece of technology that we developed. 
We call it data virtualization. Data virtualization is really about enabling customers to open their mainframe to allow the data to be used in ways that it was never designed to be used. You might have these data structures that were designed 10, 20, even 30 years ago for a very specific application, but today they want to use the data in a very different way, and so the traditional path is to take that data and copy it, to ETL it someplace else so they can get some new use out of it or build some new application. What data virtualization allows you to do is to leave that data in place but access it using APIs that developers want to use today. They want to use JSON access, for example, or they want to use SQL access. But they want to be able to do things like join across IMS, DB2, and VSAM, all with a single query, using an SQL statement. We can do that across relational databases and non-relational databases. It gets us out of this mode of having to copy data into some other data store through this ETL process. Access the data in place; we call it moving the applications or the analytics to the data versus moving the data to the analytics or to the applications. >> Okay, so in this specific case, and I have said several times today, as Stu has heard me, two years ago IBM had a big theme around the z13 of bringing analytics and transactions together, and this sort of extends that. Great, I've got this transaction data that lives behind a firewall somewhere. Why the mainframe, why now? >> Well, I would pull back to where I said we see more companies and organizations wanting to move applications and analytics closer to the data. The data in many of these large companies, that core business-critical data, is on the mainframe, and so, being able to do more real time analytics without having to look at old data is really important. There's this term data gravity. 
I love the visual that presents in my mind: you have these different masses, these different planets if you will, and the biggest, most massive planet in that solar system really is the data, and so it's pulling the smaller satellites, if you will, into this planet or this star by way of gravity, because data is a new currency, data is what companies are running on. We're helping in this announcement by being able to unlock and open up all mainframe data sources, even some non-mainframe data sources, and using things like Spark that's running on the platform, running on z/OS, to access that data directly without having to write any special programming or any special code to get to all their data. >> And the preferred place to run all that data is on the mainframe, obviously, if you're a mainframe customer. One of the questions I guess people have is, okay, I get that, it's the transaction data that I'm getting access to, but if I'm bringing transaction and analytic data together, a lot of times that analytic data might be in social media, it might be somewhere else, not on the mainframe. How do you envision customers dealing with that? Do you have tooling to help them do that? >> We do, so this data virtualization solution that I'm talking about is one that is mainframe resident, but it can also access other data sources. It can access DB2 on Linux and Windows, it can access Informix, it can access Cloudant, it can access Hadoop through IBM's BigInsights. Other feeds, like Twitter, like other social media, it can pull that in. The case where you'd want to do that is where you're trying to take that data and integrate it with a massive amount of mainframe data. It's going to be much more highly performant by pulling this other small amount of data in, next to that core business data. >> I get the performance and I get the security of the mainframe, I like those two things, but what about the economics? >> Couple of things. 
One, IBM, when they ported Spark to z/OS, did it the right way. They leveraged the architecture; it wasn't just a simple port of recompiling a bunch of open source code from Apache, it was rewriting it to be highly performant on the Z architecture, taking advantage of specialty engines. We've done the same with the data virtualization component that goes along with that Spark on z/OS offering; it also leverages the architecture. We actually have different binaries that we load depending on which architecture of machine we're running on, whether it be a z9, an EC12, or the big granddaddy of a z13. >> Bryan, can you speak to the developers? I think about, you're talking about all this mobile and Spark and everything like that. There's got to be certain developers that are like, "Oh my gosh, there's mainframe stuff. I don't know anything about that." How do you help bridge that gap between where it lives and the tools that they're using? >> The best example is talking about embracing this API economy. And so, developers really don't care where the stuff is at, they just want it to be easy to get to. They don't have to code up some specific interface or language to get to different types of data, right? IBM's done a great job with z/OS Connect in opening up the mainframe to the API economy with ReSTful interfaces, and so with z/OS Connect combined with Rocket data virtualization, you can come through that same z/OS Connect path using all those same ReSTful interfaces, pushing those APIs out to tools like Swagger, which developers want to use, and not only can you get to the applications through z/OS Connect, but we're a service provider to z/OS Connect, allowing them to also get to every piece of data using those same ReSTful APIs. >> If I heard you correctly, the developer doesn't need to even worry about whether it's on the mainframe or speak mainframe or anything like that, right? >> The goal is that they never do. 
That they simply see in their tool-set, again like Swagger, that they have data as well as different services that they can invoke using these very straightforward, simple ReSTful APIs. >> Can you speak to the customers you've talked to? You know, there's certain people out in the industry, and I've had this conversation for a few years at IBM shows, where some part of the market is like, oh, well, the mainframe is this dusty old box sitting in a corner with nothing new, and my experience has been, with the containers and cool streaming and everything like that, oh well, you know, mainframe did virtualization and Linux and all these things really early, decades ago, and is keeping up with a lot of these trends with these new types of technologies. What do you find in the customers, how much are they driving forward on new technologies, looking for that new technology and being able to leverage the assets that they have? >> You asked a lot of questions there. The types of customers, certainly financial and insurance are the big two, but that doesn't mean that we're limited and not going after retail and helping governments and manufacturing customers as well. What I find is talking with them that there's the folks who get it and the folks who don't, and the folks who get it are the ones who are saying, "Well, I want to be able "to embrace these new technologies," and they're taking things like open source, they're looking at Spark, for example, they're looking at Anaconda. Last week, we just announced at the Anaconda Conference, we stepped on stage with Continuum, IBM, and we, Rocket, stood up there talking about this partnership that we formed to create this ecosystem because the development world changes very, very rapidly. For a while, all the rage was JDBC, or all the rage was component broker, and so today it's Spark and Anaconda that are really in the forefront of developers' minds. 
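The ReSTful path described above — a developer reaching virtualized mainframe data through plain HTTP calls without knowing what sits behind them — can be sketched in a few lines. Note the base URL, service path, and response shape below are invented placeholders for illustration, not a real z/OS Connect or Rocket API.

```python
import json
from urllib.parse import urljoin, urlencode

# Hypothetical gateway base URL -- a placeholder, not a real endpoint.
# To the developer, virtualized mainframe data is just another ReST resource.
BASE_URL = "https://zosconnect.example.com/"

def build_query_url(service: str, **params) -> str:
    """Compose the ReST URL a developer would call, e.g. via Swagger docs."""
    qs = urlencode(sorted(params.items()))
    return urljoin(BASE_URL, f"api/{service}") + (f"?{qs}" if qs else "")

def parse_accounts(payload: str) -> list:
    """Decode the JSON body a virtualized mainframe query might return."""
    return json.loads(payload)["accounts"]

# Sample response shaped like a virtualized DB2/VSAM query result (made up).
sample = '{"accounts": [{"id": "A1", "balance": 250.0}, {"id": "A2", "balance": 90.5}]}'
rows = parse_accounts(sample)
print(build_query_url("accounts", region="NY", limit=2))
print(len(rows), rows[0]["id"])
```

The point of the sketch is that nothing in it is mainframe-specific: the same GET-and-parse code works whether the data lives in DB2, VSAM, or Hadoop behind the virtualization layer.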
We're constantly moving to keep up with developers because that's where the action's happening. Again, they don't care where the data is housed as long as you can open that up. We've been playing with this concept that came up from some research firm called two-speed IT where you have maybe your core business that has been running for years, and it's designed to really be slow-moving, very high quality, it keeps everything running today, but they want to embrace some of their new technologies, they want to be able to roll out a brand-new app, and they want to be able to update that multiple times a week. And so, this two-speed IT says, you're kind of breaking 'em off into two separate teams. You don't have to take your existing infrastructure team and say, "You must embrace every Agile "and every DevOps type of methodology." What we're seeing customers be successful with is this two-speed IT where you can fracture these two, and now you need to create some nice integration between those two teams, so things like data virtualization really help with that. It opens up and allows the development teams to very quickly access those assets on the mainframe in this case while allowing those developers to very quickly crank out an application where quality is not that important, where being very quick to respond and doing lots of AB testing with customers is really critical. >> Waterfall still has its place. As a company that predominately, or maybe even exclusively is involved in mainframe, I'm struck by, it must've been 2008, 2009, Paul Maritz comes in and he says, VMWare, our vision is to build the software mainframe. And of course the world said, "Ah, that's, mainframe's dead," we've been hearing that forever. In many respects, I credit VMWare, they built sort of a form of software mainframe, but now you hear a lot of talk, Stu, about going back to bare metal. You don't hear that talk on the mainframe. 
Everything's virtualized, right, so it's kind of interesting to see, and IBM uses the language of private cloud. The mainframe's, we're joking, the original private cloud. My question is, your strategy as a company has been always focused on the mainframe, and going forward I presume it's going to continue to do that. What's your outlook for that platform? >> We're not exclusively mainframe, by the way. We're not, we have a good mix. >> Okay, I was overstating that, then. It's half and half or whatever. You don't talk about it, 'cause you're a private company. >> Maybe a little more than half is mainframe-focused. >> Dave: Significant. >> It is significant. >> You've got a large proportion of the company on mainframe, z/OS. >> So we're bullish on the mainframe. We continue to invest more every year. We invest, we increase our investment every year, and so in a software company, your investment is primarily people. We increase that by double digits every year. We have license revenue increases in the double digits every year. I don't know many other mainframe-based software companies that have that. But I think that comes back to the partnership that we have with IBM because we are more than just a technology partner. We work on strategic projects with IBM. IBM will oftentimes stand up and say Rocket is a strategic partner that works with us solving hard customer issues every day. We're bullish, we're investing more all the time. We're not backing away, we're not decreasing our interest or our bets on the mainframe. If anything, we're increasing them at a faster rate than we have in the past 10 years. >> And this trend of bringing analytics and transactions together is a huge mega-trend, I mean, why not do it on the mainframe? If the economics are there, which you're arguing that in many use cases they are, because of the value component as well, then the future looks pretty reasonable, wouldn't you say? >> I'd say it's very, very bright. 
At the Anaconda Conference last week, I was coming up with an analogy for these folks. It's just a bunch of data scientists, right, and during most of the breaks and the receptions, they were just asking questions, "Well, what is a mainframe? "I didn't know that we still had 'em, "and what do they do?" So it was fun to educate them on that. But I was trying to show them an analogy with data warehousing where, say that in the mid-'90s it was perfectly acceptable to have a separate data warehouse separate from your transaction system. You would copy all this data over into the data warehouse. That was the model, right, and then slowly it became more important that the analytics or the BI against that data warehouse was looking at more real time data. So then it became more efficiencies and how do we replicate this faster, and how do we get closer to, not looking at week-old data but day-old data? And so, I explained that to them and said the days of being able to do analytics against old data that's copied are going away. ETL, we're also bullish to say that ETL is dead. ETL's future is very bleak. There's no place for it. It had its time, but now it's done because with data virtualization you can access that data in place. I was telling these folks as they're talking about, these data scientists, as they're talking about how they look at their models, their first step is always ETL. And so I told them this story, I said ETL is dead, and they just look at me kind of strange. >> Dave: Now the first step is load. >> Yes, there you go, right, load it in there. But having access from these platforms directly to that data, you don't have to worry about any type of a delay. >> What you described, though, is still common architecture where you've got, let's say, a Z mainframe, it's got an InfiniBand pipe to some exit data warehouse or something like that, and so, IBM's vision was, okay, we can collapse that, we can simplify that, consolidate it. 
SAP with HANA has a similar vision, we can do that. I'm sure Oracle's got their vision. What gives you confidence in IBM's approach and legs going forward? >> Probably due to the advances that we see in z/OS itself in handling mixed workloads, which it's just been doing for many of the 50 years that it's been around, being able to prioritize different workloads, not only just at the CPU dispatching, but also at the memory usage, also at the IO, all the way down through the channel to the actual device. You don't see other operating systems that have that level of granularity for managing mixed workloads. >> And the security component, that's what to me is unique about this so-called private cloud, and I say, I was using that software mainframe example from VMWare in the past, and it got a good portion of the way there, but it couldn't get that last mile, which is, any workload, any application with the performance and security that you would expect. It just never quite got there. I don't know if the pendulum is swinging, I don't know if that's the accurate way to say it, but it's certainly stabilized, wouldn't you say? >> There's certainly new eyes being opened every day to saying, wait a minute, I could do something different here. Muscle memory doesn't have to guide me in doing business the way I have been doing it before, and that's this muscle memory I'm talking about with this ETL piece. >> Right, well, and a large number of workloads on the mainframe are running Linux, right, you've got Anaconda, Spark, all these modern tools. The question you asked about developers was right on. If it's independent or transparent to developers, then who cares, that's the key. That's the key lever this day and age, the developer community. You know it well. >> That's right. Give 'em what they want. They're the customers, they're the infrastructure that's being built. 
>> Bryan, we'll give you the last word, bumper sticker on the event, Rocket Software, your partnership, whatever you choose. >> We're excited to be here, it's an exciting day to talk about machine learning on z/OS. I say we're bullish on the mainframe, we are, and we're especially bullish on z/OS, and that's what this event today is all about. That's where the data is, that's where we need the analytics running, that's where we need the machine learning running, that's where we need to get the developers to access the data live. >> Excellent, Bryan, thanks very much for coming to theCUBE. >> Bryan: Thank you. >> And keep right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from New York City. Be right back. (electronic keyboard music)

Published Date : Feb 15 2017


Steven Astorino, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Announcer: Live from New York, it's the CUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City everybody, this is The CUBE, the leader in live tech coverage. We're here at the IBM Machine Learning Launch Event, bringing machine learning to the Z platform. Steve Astorino is here, he's the VP for Development for the IBM Private Cloud Analytics Platform. Steve, good to see you, thanks for coming on. >> Hi, how are you? >> Good thanks, how you doing? >> Good, good. >> Down from Toronto. So this is your baby. >> It is. >> This product right? >> It is. So you developed this thing in the labs and now you point it at platforms. So talk about, sort of, what's new here today specifically. >> So today we're launching and announcing our machine learning, our IBM machine learning product. It's really a new solution that allows, obviously, machine learning to be automated, and for data scientists and line of business, business analysts to work together and create models, to be able to apply machine learning, do predictions, and build new business models in the end. To provide better services for their customers. >> So how is it different than what we knew as Watson machine learning? Is it the same product pointed at Z or is it different? >> It's a great question. So Watson is our cloud solution, it's our cloud brand, so we're building something on private cloud for the private cloud customers and enterprises. Same product built for private cloud as opposed to public cloud. Think of it more as a branding, and Watson is sort of a bigger solution set in the cloud. >> So it's your product, your baby, what's so great about it? How does it compare with what else is in the marketplace? Why should we get excited about this product? >> Actually, a bunch of things. 
It's great from many angles. What we're trying to do, obviously, it's based on open source, it's an open platform, just like what we've been talking about with the other products that we've been launching over the last six months to a year. It's based on Spark, you know, we're bringing in all the open source technology to your fingertips. As well, we're integrating with IBM's top-notch research and capabilities that we're driving in-house, integrating them together and being able to provide one experience to be able to do machine learning. That's at a very high level. Also, if you think about it, there are three things that we're calling out. There's freedom, basically being able to choose what tools you want to use, what environments you want to use, what language you want to use, whether it's Python, Scala, R. There's productivity. So we really enable and make it simple to be productive and build these machine learning models that an application developer can then leverage and use within their application. The other one is trust. IBM is very well known for its enterprise level capabilities, whether it's governance, whether it's trust of the data, how to manage the data, but also more importantly, we're creating something called The Feedback Loop which allows the models to stay current, and the data scientists, the administrators, know when these models, for example, are degrading. To make sure it's giving you the right outcome. >> OK, so you mention it's built on Spark. When I think about the efforts to build a data pipeline I think I've got to ingest the data, I've got to explore, I've got to process it and clean it up and then I've got to ultimately serve whomever, the business. >> Right, right. >> What pieces of that does Spark unify and simplify? >> So we leveraged Spark to be able to, obviously, do the analytics. When you're building a model you, one, have your choice of tooling that you want to use, whether it's programmatic or not. 
That's one of the value propositions we're bringing forward. But then we create these models, we train them, we evaluate them, we leverage Spark for that. Then obviously, we're trying to bring the models where the data is. So one of the key value propositions is we operationalize these models very simply and quickly. Just at a click of a button you can say, hey, deploy this model now, and we deploy it right where the data is; in this case we're launching it on the mainframe first. So Spark on the mainframe, we're deploying the model there and you can score the model directly in Spark on the mainframe. That's a huge value add, you get better performance. >> Right, okay, just in terms of differentiation from the competition, you're the only company I think, providing machine learning on Z, so. >> Definitely, definitely. >> That's pretty easy, but in terms of the capabilities that you have, how are you different from the competition? When you talk to clients and they say, well, what about this vendor or that vendor, how do you respond? >> So let me talk about one of the research technologies that we're launching as part of this called CADS, Cognitive Assistant for Data Scientists. This is a feature where essentially it takes the complexity out of building a model: you give it the algorithms you want to work with and the CADS assistant basically returns which one performs the best. Now, all of a sudden you have the best model to use without having to go and spend, potentially weeks, on figuring out which one that is. So that's a huge value proposition. >> So automating the choice of the algorithm, an algorithm to choose the algorithm. What have you found in terms of its level of accuracy, in terms of the best fit? >> Actually it works really well. 
And in fact we have a live demo that we'll be doing today, where it shows CADS coming back with a 90% accurate model in terms of the data that we're feeding it and the outcome it will give you in terms of what model to use. It works really well. >> Choosing an algorithm is not like choosing a programming language, right? There's bias: if I like Scala or R or whatever, Java, Python, okay fine, I've got skill sets associated with that. Algorithm choice is one that's more scientific, I guess? >> It is more scientific, it's based on the statistical algorithm, and the selection of the algorithm or the model itself is a huge deal because that's where you're going to drive your business. If you're offering a new service, that's where you're providing that solution from, so it has to be the right algorithm, the right model, so that you can build that more efficiently. >> What are you seeing as the big barriers to customers adopting machine learning? >> I think everybody, I mean, it's the hottest thing around right now, everybody wants machine learning, it's a huge buzz. The hardest thing is they know they want it, but don't really know how to apply it into their own environment, or they think they don't have the right skills. So that's actually one of the things that we're going after, to be able to enable them to do that. We're, for example, working on building different industry-based examples to showcase here's how you would use it in your environment. So last year when we did the Watson data platform we did a retail example; now today we're doing a finance example, a churn example with customers potentially churning and leaving a bank. So we're looking at all those different scenarios, and then also we're creating hubs, locations, which we're launching and announcing today; actually Dinesh will be doing that. 
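The CADS behavior described earlier — hand the assistant a set of candidate algorithms and get back the best performer — boils down to a model-selection loop over a held-out validation set. This toy sketch uses made-up threshold "models" in place of real statistical algorithms, purely to illustrate the selection idea, not how CADS is actually implemented.

```python
# Candidate "models" are simple threshold classifiers -- stand-ins for the
# statistical algorithms a data scientist would hand to an assistant.
def make_threshold_model(t):
    return lambda x: 1 if x >= t else 0

def accuracy(model, data):
    """Fraction of (feature, label) pairs the model classifies correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

candidates = {
    "threshold_0.3": make_threshold_model(0.3),
    "threshold_0.5": make_threshold_model(0.5),
    "threshold_0.8": make_threshold_model(0.8),
}

# Held-out validation set: (feature, label) pairs, invented for the demo.
validation = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

# Score every candidate and keep the best -- the selection loop itself.
scores = {name: accuracy(m, validation) for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])
```

In practice this loop is what tools like scikit-learn's grid search automate; the "assistant" framing just moves the search from the data scientist's hands into the product.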
There is a hub in Silicon Valley that will allow customers to come in and work with us, and we help them figure out how they can leverage machine learning. It is a great way to interact with our customers and be able to do that. >> So Steve, nirvana is, and you gave that example, the retail example in September, when you launched Watson Data Platform, the nirvana in this world is you can use data, and maybe put in an offer, or save a patient's life, or effect an outcome in real time. So the retail example was just that. If I recall, you were making an offer in real time, it was very fast, a live demo, it wasn't just a fakey. The example on churn, is the outcome to affect that customer's decision so that they don't leave? Is that it? >> Yes, pretty much. Essentially what we are looking at is, we're using live data, we're using social media data, bringing in Twitter sentiment about a particular individual for example, and trying to predict if this customer, if this user, is happy with the service that they are getting or not. So for example, people will go and socialize, oh, I went to this bank and I hated this experience, or they really got me upset or whatever. Bringing that data from Twitter, so open data, and merging it with the bank's data, banks have a lot of data they can leverage and monetize. And then making an assessment using machine learning to predict: is this customer going to leave me or not? What probability do they have that they are going to leave me or not, based on the machine learning model. The example or scenario we are using now, if we think they are going to leave us, we're going to make special offers to them. It's a way to enhance your service for those customers. So that they don't leave you. >> So operationalizing that would be a call center has some kind of dashboard that says red, green, yellow, boom, here's an offer that you should make, and that's done in near real time. In fact, real time is before you lose the customer. 
That's as good a definition as anything else. >> But it's actually real time, and we call it the scoring of the data, so as the transaction is coming in, you can actually make that assessment in real time. It's called in-transaction scoring, where you can make that call right on the fly and be able to determine whether this customer is at risk or not. And then be able to make smarter decisions about the service you are providing, on whether you want to offer something better. >> So is the primary use case for this those streams, you know, whether it be the Twitter data you mentioned, maybe IoT, or can we point machine learning at just archives of data and things written historically, or is it mostly the streams? >> It's both, of course, and machine learning is based on historical data, right, and that's how the models are built. The more historical data you have, the more accurately you'll pick the right model, and you'll get a better prediction of what's going to happen next time. So it's exactly, it's both. >> How are you helping customers with that initial fit? My understanding is, how big of a data set do you need, do I have enough to really model what I have, how do you help customers work through that? >> So in my opinion it's obvious to a certain extent: the more data you have as your sample set, the more accurate your model is going to be. So if we have one that's too small, your prediction is going to be inaccurate. It really depends on the scenario, it depends on how many features or fields you're looking at within your dataset. It depends on many things, and it's variable depending on the scenario, but in general you want to have a good chunk of historical data that you can build expertise on, right. 
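A minimal sketch of the in-transaction churn scoring described above: a pre-trained logistic model combines the bank's transaction features with a social-sentiment signal and flags at-risk customers as each transaction arrives. The weights, feature names, and threshold below are invented for illustration, not taken from any real trained model.

```python
import math

# Invented weights standing in for a pre-trained logistic churn model.
WEIGHTS = {"neg_sentiment": 2.0, "balance_drop": 1.5, "support_calls": 0.8}
BIAS = -2.5

def churn_probability(features: dict) -> float:
    """Logistic (sigmoid) score: probability the customer will leave."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def score_transaction(features: dict, threshold: float = 0.5) -> str:
    """In-transaction decision: act before the customer is lost."""
    p = churn_probability(features)
    return "make_retention_offer" if p >= threshold else "ok"

# Two incoming transactions with made-up feature values.
happy = {"neg_sentiment": 0.1, "balance_drop": 0.0, "support_calls": 0.0}
angry = {"neg_sentiment": 0.9, "balance_drop": 1.0, "support_calls": 2.0}
print(score_transaction(happy), score_transaction(angry))
```

The "red, green, yellow" dashboard from the conversation is just this decision function surfaced to a call-center agent; the same scoring call can run inside the transaction path itself.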
>> So you've worked on both the Watson Services in the public cloud and now this private cloud, is there any differentiation, or do you see significant use case differences between those two, or is it just kind of where the data lives and we're going to do similar activities there? >> So it is similar. At the end of the day, we're trying to provide similar products on both public cloud and private cloud. But for this specific case, we're launching it on the mainframe, so that's a different angle at this. But we know that's where the biggest banks, the insurance companies, the biggest retailers in the world are, and that's where the biggest transactions are running, and we really want to help them leverage machine learning and get their services to the next level. I think it's going to be a huge differentiator for them. >> Steve, you gave an example before of Twitter sentiment data. How would that fit into this announcement? So I've got this ML on Z, and what's the API into the Twitter data? How does that sort of all get ingested and consolidated? >> So we allow hooks to be able to access data from different sources, bring in data. That is part of the ingest process. Then once you have that data there in data frames in the machine learning product, now you're feeding it into a statistical algorithm to figure out what the best prediction is going to be, and the best model's going to be. >> I have a slide that you guys are sharing on the data scientist workflow. It starts with ingestion, selection, preparation, generation, transform, model. It's a complex set of tasks, and typically, historically, at least in the last five or six years, different tools to do each of those. And not just different tools, multiples of different tools. That you had to cobble together. If I understand it correctly, the Watson Data Platform was designed to really consolidate that and simplify that, provide collaboration tools for different personas, so my question is this. 
Because you were involved in that product as well. And I was excited about it when I saw it, I talked to people about it, sometimes I hear the criticism of, well, IBM just took a bunch of legacy products, threw them together, threw an abstraction layer on top, and is now going to wrap a bunch of services around it. Is that true? >> Absolutely not. Actually, you may have heard a while back IBM made a big shift to a design-first methodology. So with the Watson Data Platform, the Data Science Experience, we started with a design-first approach. We looked at this, we said what do we want the experience to be, which persona do we want to target. Then we understood what we wanted the experience to be, and then we leveraged the IBM analytics portfolio to be able to feed in and provide and integrate those services together to fit into that experience. So it's not a dumping ground for, I'll take this product, it's part of Watson Data Platform; not at all the case. It was design first, and then integrate for that experience. >> OK, but there are some so-called legacy products in there, but you're saying you picked the ones that were relevant, and then was there additional design done? >> There was a lot of work involved to take them from a traditional product, to be able to componentize, create a microservice architecture, I mean the whole works, to be able to redesign it and fit into this new experience. >> So microservices architecture, runs on cloud, I think it only runs on cloud today, right? >> Correct, correct. >> OK, maybe roadmap without getting too specific. What should we be paying attention to in the future? >> Right now we're doing our first release. Definitely we want to target any platform behind the firewall. So we don't have specific dates, but now we started with machine learning on the mainframe and we want to be able to target the other platforms behind the firewall and the private cloud environment. 
Definitely we should be looking at that. Our goal, and I talked about the feedback loop a little bit, is that essentially once you deploy the model, we actually look at that model; you can schedule an evaluation, automatically, within the machine learning product. To be able to say, this model is still good enough. And if it's not, we automatically flag it, and we look at the retraining process and redeployment process to make sure you always have the most up to date model. So this is truly machine learning, where it requires very little to no intervention from a human. We're going to continue down that path and continue that automation in providing those capabilities, so there's a bigger roadmap, there's a lot of things we're looking at. >> Our big data analyst George Gilbert has talked about how you had batch and you had interactive, and now the sort of emergent workload is this continuous, streaming data. How do you see the adoption? First of all, is it a valid assertion? That there is a new class of workload, and then how do you see that adoption occurring? Is it going to be a dominant force over the next 10 years? >> Yeah, I think so. Like I said, there is a huge buzz around machine learning in general and artificial intelligence, deep learning, all of these terms you hear about. I think users and customers are getting more comfortable with understanding how they're going to leverage this in their enterprise. This real-time streaming of data and being able to do analytics on the fly and machine learning on the fly, it's a big deal and it will really help them be more competitive in their own space with the services we're providing. >> OK Steve, thanks very much for coming on The CUBE. We'll give you the last word. The event, a very intimate event, a lot of customers coming in very shortly here, in just a couple of hours. Give us the bumper sticker. 
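The feedback loop outlined above — schedule an evaluation of the deployed model, flag it when accuracy degrades, trigger retraining — can be sketched as a simple scheduled check. The stand-in model, the fresh labeled data, and the tolerance value are all assumptions for illustration, not details of the actual product.

```python
def evaluate(model, labeled_data):
    """Accuracy of the deployed model on freshly labeled data."""
    return sum(model(x) == y for x, y in labeled_data) / len(labeled_data)

def check_model(model, fresh_data, baseline_accuracy, tolerance=0.05):
    """One scheduled feedback-loop pass: re-score and flag degradation."""
    current = evaluate(model, fresh_data)
    degraded = current < baseline_accuracy - tolerance
    return {"accuracy": current, "retrain": degraded}

# Stand-in deployed model and drifted data: the last example's label no
# longer matches what the model predicts, pulling accuracy down.
model = lambda x: 1 if x > 0 else 0
fresh = [(1, 1), (2, 1), (-1, 0), (3, 0)]
report = check_model(model, fresh, baseline_accuracy=0.95)
print(report)
```

Running this check on a schedule, and kicking off retraining whenever `retrain` comes back true, is the "very little to no intervention from a human" loop in miniature.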
>> All of that's very exciting, this is a big deal for us. Whenever IBM does a signature moment it's a big deal for us, and we've got something cool to talk about, so we're very excited about that. Lots of clients are coming, so there's an entire session this afternoon, which will be live streamed as well. So it's great, I think we have a differentiating product and we're already getting that feedback from our customers. >> Well congratulations, I love the cadence that you're on. We saw some announcements in September, we're here in February, and I expect we're going to see more innovation coming out of your labs in Toronto and across IBM, so thank you very much for coming on The CUBE. >> Thank you. >> You're welcome. OK, keep it right there everybody, we'll be back with our next guest right after this short break. This is The CUBE, we're live from New York City. (energetic music)

Published Date : Feb 15 2017

SUMMARY :

Brought to you by IBM. for the IBM Private So this is your baby. and now you point it at platforms. and create models to be able for the private cloud the last six months to a year. the data, I've got to explore, So Spark on the mainframe, from the competition, you're the best model to use without So automating the of the data that we're feeding it Algorithm choice is one that's and the selection and be able to do that. the retail example in September, when you based on the machine learning model. boom heres an offer that you should make, and be able to determine on historical data, the more accurate the more data you have as your sample set, in the public cloud and and get their services to the next level. to this announcement. and the best model's going to be. and is now going to wrap a the experience to be, I mean the whole works attention to in the future? to make sure you always and then how do you see and machine learning on the fly. We'll give you the last word. So it's great, I think we and cross IBM so thank you very This is The CUBE we're

James Kobielus, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Announcer] Live from New York, it's the Cube. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City everybody, this is the CUBE. We're here live at the IBM Machine Learning Launch Event. Bringing analytics and transactions together on Z, extending an announcement that IBM made a couple years ago, sort of laid out that vision, and now bringing machine learning to the mainframe platform. We're here with Jim Kobielus. Jim is the Director of IBM's Community Engagement for Data Science and a long time CUBE alum and friend. Great to see you again James. >> Great to always be back here with you. Wonderful folks from the CUBE. You ask really great questions and >> Well thank you. >> I'm prepared to answer. >> So we saw you last week at Spark Summit so back to back, you know, continuous streaming, machine learning, give us the lay of the land from your perspective of machine learning. >> Yeah well machine learning very much is at the heart of what modern application developers build and that's really the core secret sauce in many of the most disruptive applications. So machine learning has become the core of, of course, what data scientists do day in and day out or what they're asked to do which is to build, essentially artificial neural networks that can process big data and find patterns that couldn't normally be found using other approaches. And then as Dinesh and Rob indicated a lot of it's for regression analysis and classification and the other core things that data scientists have been doing for a long time, but machine learning has come into its own because of the potential for great automation of this function of finding patterns and correlations within data sets. So today at the IBM Machine Learning Launch Event, and we've already announced it, IBM Machine Learning for ZOS takes that automation promised to the next step. 
And so we're real excited and there'll be more details today in the main event. >> One of the most fun interviews I had last year was with you, when we interviewed, I think it was 10 data scientists, rock star data scientists, and Dinesh had a quote, he said, "Machine learning is 20% fun, 80% elbow grease." And data scientists sort of echoed that last year. We spent 80% of our time wrangling data. >> [Jim] Yeah. >> It gets kind of tedious. You guys have made announcements to address that, is the needle moving? >> To some degree the needle's moving. Greater automation of data sourcing and preparation and cleansing is ongoing. Machine learning is being used for that function as well. But nonetheless there is still a lot of need in the data science, sort of, pipeline for a lot of manual effort. So if you look at the core of what machine learning is all about, supervised learning involves humans, meaning data scientists, to train their algorithms with data, and so that involves finding the right data and then of course doing the feature engineering, which is a very human and creative process. And then there's training the data and iterating through models to improve the fit of the machine learning algorithms to the data. In many ways there's still a lot of manual functions that need the expertise of data scientists to do it right. There's a lot of ways to do machine learning wrong, you know, there's a lot of, as it were, tricks of the trade you have to learn just through trial and error. A lot of things, like the new generation of generative adversarial models, ride on machine learning or deep learning, in this case multilayered, and they're not easy to get going and get working effectively the first time around.
I mean with the first run of your training data set, so that's just an example of how, the fact is, there's a lot of functions that can't be fully automated yet in the whole machine learning process, but a great many can in fact, especially data preparation and transformation. It's being automated to a great degree, so that data scientists can focus on the more creative work that involves subject matter expertise and really also application development and working with larger teams of coders and subject matter experts and others, to be able to take the machine learning algorithms that have been proved out, have been trained, and to drive them into all manner of applications to deliver some disruptive business value. >> James, can you expand for us a little bit on this democratization? Before it was just data, but now the machine learning, the analytics, you know, when we put these massive capabilities in the broader hands of the business analysts, the business people themselves, what are you seeing with your customers, what can they do now that they couldn't do before? Why is this such an exciting period of time for the leveraging of data analytics? >> I don't know that it's really an issue of now versus before. Machine learning has been around for a number of years. It's artificial neural networks at the very heart, and that got going actually in many ways in the late 50s and it steadily improved in terms of sophistication and so forth. But what's going on now is that machine learning tools have become commercialized and refined to a greater degree and now they're in a form in the cloud, like with IBM Machine Learning for the private cloud on ZOS, or Watson Machine Learning for the Bluemix public cloud. They're at a level of consumability that they've never been at before. With a software as a service offering you just, you pay for it, it's available to you. If you're a data scientist you begin doing work right away to build applications, derive quick value.
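The supervised-learning loop Jim describes, labeled data, a training pass, then iterating to improve the fit, can be sketched in a few lines. This is a minimal illustration only, not IBM's tooling; the perceptron, the toy clusters, and the labels below are all invented for the example.

```python
# Minimal supervised learning: iterate over labeled data to improve the fit.
# A tiny perceptron stands in for any trainable model (illustrative only).

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Fit weights so that sign(w.x + b) matches each +1/-1 label."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):                 # iterate to improve the fit
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:                   # misclassified: nudge the weights
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy labeled data: two separable clusters (the "blue dots" vs "red dots").
X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (-1.0, -0.8), (-0.9, -1.1), (-1.2, -1.0)]
y = [1, 1, 1, -1, -1, -1]

model = train_perceptron(X, y)
accuracy = sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)
```

The feature engineering and data-finding steps Jim stresses happen before a loop like this ever runs, which is where most of the "elbow grease" goes.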
So in other words, the time to value on a machine learning project continues to shorten and shorten, due to the consumability, the packaging of these capabilities into cloud offerings and into other tools that are prebuilt to deliver success. That's what's fundamentally different now and it's just an ongoing process. You sort of see the recent parallels with the business intelligence market. 10 years ago BI was reporting and OLAP and so forth, and was only for the, what we now call data scientists or the technical experts and all that area. But in the last 10 years we've seen the business intelligence community and the industry, including IBM's tools, move toward more self service, interactive visualization, visual design, BI and predictive analytics, you know, through our Cognos and SPSS portfolios. A similar dynamic is coming into the progress of machine learning, the democratization, to use your term, the more self service model wherein everybody potentially will be able to do machine learning, to build machine learning and deep learning models without a whole lot of university training. That day is coming and it's coming fairly rapidly. It's just a matter of the maturation of this technology in the marketplace. >> So I want to ask you, you're right, 1950s it was artificial neural networks or AI, sort of was invented I guess, the concept, and then in the late 70s and early 80s it was heavily hyped. It kind of died in the late 80s or in the 90s, you never heard about it even in the early 2000s. Why now, why is it here now? Is it because IBM's putting so much muscle behind it? Is it because we have Siri? What is it that has enabled that? >> Well I wish that IBM putting muscle behind a technology could launch anything to success. And we've done a lot of things in that regard.
But the thing is, if you look back at the historical progress of AI, I mean, it's older than me and you in terms of when it got going in the middle 50s as a passion or a focus of computer scientists. What we had for most of the last half century is AI or expert systems that were built on essentially programming, right, declarative rules defining how AI systems could process data under various scenarios. That didn't prove scalable. It didn't prove agile enough to learn on the fly from the statistical patterns within the data that you're trying to process. For face recognition and voice recognition, pattern recognition, you need statistical analysis, you need something along the lines of an artificial neural network that doesn't have to be pre-programmed. That's what's new now, since the turn of this century, is that AI has become predominantly focused not so much on declarative rules, the expert systems of old, but on statistical analysis, artificial neural networks that learn from the data. See, in the long historical sweep of computing, we have three eras of computing. The first era, before the second world war, was all electromechanical computing devices. IBM's start, of course, like everybody's, was in that era. The business logic was burned into the hardware, as it were. The second era, from the second world war really to the present day, is all about software, programming, it's COBOL, Fortran, C, Java, where the business logic has to be developed, coded by a cadre of programmers. Since the turn of this millennium and really since the turn of this decade, it's all moved towards the third era, which is the cognitive era, where you're learning the business rules automatically from the data itself, and that involves machine learning at its very heart.
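The contrast Jim draws, business rules declared by a programmer versus rules learned from data, can be made concrete in a toy sketch. Both "classifiers" below are invented for illustration; the learned threshold is the simplest possible stand-in for statistical learning.

```python
# Second era vs third era, in miniature (all data and rules invented).

def expert_rule(x):
    # Second era: the business logic is declared by a programmer.
    return "spam" if x > 5.0 else "ham"

def learn_threshold(points, labels):
    # Third era: the rule is learned from statistical patterns in labeled data.
    spam = [x for x, y in zip(points, labels) if y == "spam"]
    ham = [x for x, y in zip(points, labels) if y == "ham"]
    return (min(spam) + max(ham)) / 2      # midpoint between the two classes

X = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
y = ["ham", "ham", "ham", "spam", "spam", "spam"]
t = learn_threshold(X, y)                  # learned from data, not declared

def learned(x):
    return "spam" if x > t else "ham"
```

If the data shifts, the declared rule stays wrong until a programmer edits it, while the learned rule only needs retraining, which is the scalability point made above.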
So most of what has been commercialized and most of what is being deployed in the real world as working, successful AI is all built on artificial neural networks and cognitive computing in the way that I laid out. Where, you still need human beings in the equation, it can't be completely automated. There's things like unsupervised learning that take the automation of machine learning to a greater extent, but the bulk of machine learning is still supervised learning, where you have training data sets and you need experts, data scientists, to manage that whole process. Over time supervised learning is evolving towards the question of who's going to label the training data sets, especially when you have so much data flooding in from the internet of things and social media and so forth. A lot of that is being outsourced to crowdsourcing environments in terms of the ongoing labeling of data for machine learning projects of all sorts. That trend will continue apace. So less and less of the actual labeling of the data for machine learning will need to be manually coded by data scientists or data engineers. >> So the more data the better. See I would argue, in the enablement pie. You're going to disagree with that which is good. Let's have a discussion [Jim Laughs]. In the enablement pie, I would say the profundity of Hadoop was two things. One is I can leave data where it is and bring code to data. >> [Jim] Yeah. >> 5 megabytes of code to a petabyte of data, but the second was the dramatic reduction in the cost to store more data, hence my statement of the more data the better, but you're saying, meh maybe not. Certainly for compliance and other things you might not want to have data lying around. >> Well it's an open issue. How much data do you actually need to find the patterns of interest to you, the correlations of interest to you? Sampling of your data set, a 10% sample or whatever, in most cases that might be sufficient to find the correlations you're looking for.
But if you're looking for some deep and rare nuances in terms of anomalies or outliers or whatever within your data set, you may only find those if you have a petabyte of data of the population of interest. But if you're just looking for broad historical trends and to do predictions against broad trends, you may not need anywhere near that amount. I mean, if it's a large data set, you may only need a five to 10% sample. >> So I love this conversation because people have been on the CUBE, Abi Metter for example said, "Dave, sampling is dead." Now a statistician said that's BS, no way. Of course it's not dead. >> Storage isn't free first of all, so you can't necessarily save and process all the data. Compute power isn't free yet, memory isn't free yet, so forth, so there's lots... >> You're working on that though. >> Yeah sure, it's asymptotically all moving towards zero. But the bottom line is the underlying resources, including the expertise of your data scientists, that's not free, these are human beings who need to make a living. So you've got to do a lot of things. A, automate functions on the data science side so that these experts can radically improve their productivity. Which is why the announcement today of IBM Machine Learning is so important, it enables greater automation in the creation and the training and deployment of machine learning models. It is, as Rob Thomas indicated, very much a multiplier of the productivity of your data science teams, the capability we offer. So that's the core value. Because our customers live and die increasingly by machine learning models. And the data science teams themselves are highly inelastic in the sense that you can't find highly skilled people that easily at an affordable price if you're a business. And you got to make the most of the team that you have and help them to develop their machine learning muscle.
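The sampling point in the exchange above is easy to sanity-check: on a synthetic population with a broad linear trend, a 10% sample estimates the correlation almost as well as the full data set. The population, the seed, and the tolerance below are all illustrative, not a statistical rule.

```python
# Sketch: a 10% sample recovers a broad-trend correlation almost exactly.
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(42)
# Synthetic population with a strong linear trend plus noise.
xs = [random.gauss(0, 1) for _ in range(100_000)]
ys = [2 * x + random.gauss(0, 1) for x in xs]

full = pearson(xs, ys)                               # the "petabyte" answer
idx = random.sample(range(len(xs)), len(xs) // 10)   # a 10% sample
sample = pearson([xs[i] for i in idx], [ys[i] for i in idx])
```

For rare anomalies the picture flips, as Jim says: an outlier that appears once in a million records will usually be absent from a 10% sample.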
>> Okay, I want to ask you to weigh in on one of Stu's favorite topics which is man versus machine. >> Humans versus mechanisms. Actually humans versus bots, let's, okay go ahead. >> Okay so, you know a lot of discussions, about, machines have always replaced humans for jobs, but for the first time it's really beginning to replace cognitive functions. >> [Jim] Yeah. >> What does that mean for jobs, for skill sets? The greatest, I love the comment, the greatest chess player in the world is not a machine. It's humans and machines, but what do you see in terms of the skill set shift when you talk to your data science colleagues in these communities that you're building? Is that the right way to think about it, that it's the creativity of humans and machines that will drive innovation going forward. >> I think it's symbiotic. If you take Watson, of course, that's a star case of a cognitive AI driven machine in the cloud. We use a Watson all the time of course in IBM. I use it all the time in my job for example. Just to give an example of one knowledge worker and how he happens to use AI and machine learning. Watson is an awesome search engine. Through multi-structure data types and in real time enabling you to ask a sequence of very detailed questions and Watson is a relevance ranking engine, all that stuff. What I've found is it's helped me as a knowledge worker to be far more efficient in doing my upfront research for anything that I might be working on. You see I write blogs and I speak and I put together slide decks that I present and so forth. So if you look at knowledge workers in general, AI as driving far more powerful search capabilities in the cloud helps us to eliminate a lot of the grunt work that normally was attended upon doing deep research into like a knowledge corpus that may be preexisting. 
And that way we can then ask more questions and more intelligent questions and really work through our quest for answers far more rapidly and entertain and rule out more options when we're trying to develop a strategy. Because we have all the data at our fingertips and we've got this expert resource increasingly in a conversational back and forth that's working on our behalf predictively to find what we need. So if you look at that, everybody who's a knowledge worker which is really the bulk now of the economy, can be far more productive cause you have this high performance virtual assistant in the cloud. I don't know that it's really going, AI or deep learning or machine learning, is really going to eliminate a lot of those jobs. It'll just make us far smarter and more efficient doing what we do. That's, I don't want to belittle, I don't want to minimize the potential for some structural dislocation in some fields. >> Well it's interesting because as an example, you're like the, you're already productive, now you become this hyper-productive individual, but you're also very creative and can pick and choose different toolings and so I think people like you it's huge opportunities. If you're a person who used to put up billboards maybe it's time for retraining. >> Yeah well maybe you know a lot of the people like the research assistants and so forth who would support someone like me and most knowledge worker organizations, maybe those people might be displaced cause we would have less need for them. In the same way that one of my very first jobs out of college before I got into my career, I was a file clerk in a court in Detroit, it's like you know, a totally manual job, and there was no automation or anything. You know that most of those functions, I haven't revisited that court in recent years, I'm sure are automated because you have this thing called computers, especially PCs and LANs and so forth that came along since then. 
So a fair amount of those kinds of feather bedding jobs have gone away and in any number of bureaucracies due to automation and machine learning is all about automation. So who knows where we'll all end up. >> Alright well we got to go but I wanted to ask you about... >> [Jim] I love unions by the way. >> And you got to meet a lot of lawyers I'm sure. >> Okay cool. >> So I got to ask you about your community of data scientists that you're building. You've been early on in that. It's been a persona that you've really tried to cultivate and collaborate with. So give us an update there. What's your, what's the latest, what's your effort like these days? >> Yeah, well, what we're doing is, I'm on a team now that's managing and bringing together all of our program for community engagement programs for really for across portfolio not just data scientists. That involves meet ups and hack-a-thons and developer days and user groups and so forth. These are really important professional forums for our customers, our developers, our partners, to get together and share their expertise and provide guidance to each other. And these are very very important for these people to become very good at, to help them, get better at what they do, help them stay up to speed on the latest technologies. Like deep learning, machine learning and so forth. 
So we take it very seriously at IBM that communities are really where customers can realize value and grow their human capital ongoing so we're making significant investments in growing those efforts and bringing them together in a unified way and making it easier for like developers and IT administrators to find the right forums, the right events, the right content, within IBM channels and so forth, to help them do their jobs effectively and machine learning is at the heart, not just of data science, but other professions within the IT and business analytics universe, relying more heavily now on machine learning and understanding the tools of the trade to be effective in their jobs. So we're bringing, we're educating our communities on machine learning, why it's so critically important to the future of IT. >> Well your content machine is great content so congratulations on not only kicking that off but continuing it. Thanks Jim for coming on the CUBE. It's good to see you. >> Thanks for having me. >> You're welcome. Alright keep it right there everybody, we'll be back with our next guest. The CUBE, we're live from the Waldorf-Astoria in New York City at the IBM Machine Learning Launch Event right back. (techno music)

Published Date : Feb 15 2017


Dinesh Nirmal, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Announcer] Live from New York, it's theCube, covering the IBM Machine Learning Launch Event brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to the Waldorf Astoria, everybody. This is theCube, the worldwide leader in live tech coverage. We're covering the IBM Machine Learning announcement. IBM bringing machine learning to its Z mainframe, its private cloud. Dinesh Nirmal is here. He's the Vice President of Analytics at IBM and a Cube alum. Dinesh, good to see you again. >> Good to see you, Dave. >> So let's talk about ML. So we went through the big data, the data lake, the data swamp, all this stuff with Hadoop. And now we're talking about machine learning and deep learning and AI and cognitive. Is it same wine, new bottle? Or is it an evolution of data and analytics? >> Good. So, Dave, let's talk about machine learning. Right. When I look at machine learning, there's three pillars. The first one is the product. I mean, you got to have a product, right. And you got to have a differentiated set of functions and features available for customers to build models. For example, Canvas. I mean, those are table stakes. You got to have a set of algorithms available. So that's the product piece. >> [Dave] Uh huh. >> But then there's the process, the process of taking that model that you built in a notebook and being able to operationalize it. Meaning able to deploy it. That is, you know, I was talking to one of the customers today, and he was saying, "Machine learning is 20% fun and 80% elbow grease." Because that operationalizing of that model is not easy. Although they make it sound very simple, it's not. So if you take a banking, enterprise banking example, right? You build a model in the notebook. Some data scientists build it. Now you have to take that and put it into your infrastructure or production environment, which has been there for decades. So you could have third party software that you cannot change.
You could have a set of rigid rules that already is there. You could have applications that were written in the 70's and 80's that nobody wants to touch. How do you all of a sudden take the model and infuse it in there? It's not easy. And so that is a tremendous amount of work. >> [Dave] Okay. >> The third pillar is the people, or the expertise or the experience, the skills that need to come through, right. So the product is one. The process of operationalizing and getting it into your production environment is another piece. And then the people is the third one. So when I look at machine learning, right. Those are three key pillars that you need to have to have a successful, you know, experience of machine learning. >> Okay, let's unpack that a little bit. Let's start with the differentiation. You mentioned Canvas, but talk about IBM specifically. >> [Dinesh] Right. What's so great about IBM? What's the differentiation? >> Right, exactly. Really good point. So we have been in the predictive side for a very long time, right. I mean, it's not like we are coming into ML or AI or cognitive yesterday. We have been in that space for a very long time. We have SPSS predictive analytics available. So even if you look from all three pillars, what we are doing is we are, from a product perspective, we are bringing in the product where we are giving a choice or a flexibility to use the language you want. So there are customers who only want to use R. They are religious R users. They don't want to hear about anything else. There are customers who want to use Python, you know. They don't want to use anything else. So how do we give that choice of languages to our customers, to say use any language you want. Or execution engines, right? Some folks want to use Spark as the execution engine. Some folks want to use R or Python, so we give that choice. Then you talked about Canvas.
There are folks who want to use the GUI portion of the Canvas or a modeler to build models, or there are, you know, techie guys that we'll approach who want to use a notebook. So how do you give that choice? So it becomes kind of like a freedom or a flexibility or a choice that we provide, so that's the product piece, right? We do that. Then the other piece is productivity. So one of the customers, the CTO of (mumbles) TV's going to come on stage with me during the main session, talk about how collaboration helped from an IBM machine learning perspective, because their data scientists are sitting in New York City, and our data scientists who are working with them are sitting in San Jose, California. And they were collaborating in real time using notebooks in our ML projects, where they can see, in real time, what changes their data scientists are making. They can Slack messages to each other. And that collaborative piece is what really helped us. So collaboration is one. Right from a productivity piece. We introduced something called Feedback Loop, whereby your model can get retrained. So today, you deploy a model. Its score could degrade over time. Then you have to take it off-line and re-train, right? What we have done is like we introduced the Feedback Loops, so when you deploy your model, we give you two endpoints. The first endpoint is, basically, a URI, for you to plug into your application when you, you know, run your application to call the scoring API. The second endpoint is this feedback endpoint, where you can choose to re-train the model. If you want three hours, if you want it to be six hours, you can do that. So we bring that flexibility, we bring that productivity into it. Then, the management of the models, right? How do we make sure that once you develop the model, you deploy the model. There's a life cycle involved there. How do you make sure that we enable, give you the tools to manage the model?
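The Feedback Loop Dinesh describes, one endpoint for scoring and one for feeding observed outcomes back so the model retrains on a schedule, can be sketched generically. Everything below (the class name, the mean-based "model", the retrain-after-N trigger) is invented for illustration; the actual IBM product exposes this pattern as two REST endpoints, not as an in-process class.

```python
# Generic sketch of the scoring-endpoint + feedback-endpoint pattern.
# Names and the trivial mean-based model are invented for illustration.

class FeedbackLoopModel:
    """Predicts the running mean; retrains after every `retrain_every` feedbacks."""

    def __init__(self, retrain_every=3):
        self.history = [10.0]            # seed training data (illustrative)
        self.retrain_every = retrain_every
        self.pending = []
        self.mean = sum(self.history) / len(self.history)

    def score(self, _features=None):
        # "Scoring endpoint": return the currently deployed model's prediction.
        return self.mean

    def feedback(self, observed):
        # "Feedback endpoint": bank the observed outcome; retrain on schedule.
        self.pending.append(observed)
        if len(self.pending) >= self.retrain_every:
            self.history.extend(self.pending)
            self.pending.clear()
            self.mean = sum(self.history) / len(self.history)  # retrain

model = FeedbackLoopModel()
before = model.score()                   # prediction from the seed data only
for outcome in [20.0, 30.0, 40.0]:       # deployed model keeps receiving truth
    model.feedback(outcome)
after = model.score()                    # prediction after scheduled retraining
```

The point of the design is that the deployed model never has to be taken off-line: scoring keeps serving the old model until the feedback channel triggers the next retrain.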
So when you talk about differentiation, right? We are bringing differentiation on all three pillars. From a product perspective, with all the things I mentioned. From a deployment perspective. How do we make sure we have different choices of deployment, whether it's streaming, whether it's realtime, whether it's batch. You can do deployment, right? The Feedback Loop is another one. Once you deployed, how do we keep re-training it. And the last piece I talked about is the expertise or the people, right? So we are today announcing IBM Machine Learning Hub, which will become one place where our customers can go, ask questions, get education sessions, get training, right? Work together to build models. I'll give you an example, that although we are announcing hub, the IBM Machine Learning Hub today, we have been working with America First Credit Union for the last month or so. They approached us and said, you know, their underwriting takes a long time. All the knowledge is embedded in 15 to 20 human beings. And they want to make sure a machine should be able to absorb that knowledge and make that decision in minutes. So it takes hours or days. >> [Dave] So, Stu, before you jump in, so I got, put the portfolio. You know, you mentioned SPSS, expertise, choice. The collaboration, which I think you really stressed at the announcement last fall. The management of the models, so you can continuously improve it. >> Right. >> And then this knowledge base, what you're calling the hub. And I could argue, I guess, that if I take any one of those individual pieces, there, some of your competitors have them. Your argument would be it's all there. >> It all comes together, right? And you have to make sure that all three pillars come together. And customers see great value when you have that. >> Dinesh, customers today are used to kind of the deployment model on the public cloud, which is, "I want to activate a new service," you know. I just activate it, and it's there. 
When I think about private cloud environments, private clouds are operationally faster, but it's usually not minutes or hours. It's usually more like months to deploy projects, which is still better than, you know, kind of, I think, before big data, it was, you know, oh, okay, 18 months to see if it works, and let's bring that down to, you know, a couple of months. Can you walk us through what happens when, you know, a customer today says, "Great, I love this approach. How long does it take?" You know, what's kind of the project life cycle of this? And how long will it take them to play around and pull some of these levers before they're, you know, getting productivity out of it? >> Right. So, really good questions, Stu. So let me back up one step. So, in private cloud, we have a new initiative called Download and Go, where our goal is to have our desktop products be able to install on your personal desktop in less than five clicks, in less than fifteen minutes. That's the goal. So the other day, you know, the team told me it's ready. That the first product is ready where you can go less than five clicks, fifteen minutes. I said the real test is I'm going to bring my son, who's five years old. Can he install it, and if he can install it, you know, we are good. And he did it. And I have a video to prove it, you know. So after the show, I will show you. Because, when you talk about, you know, the private cloud side, or the on-premise side, it has been a long project cycle. What we want is like you should be able to take our product, install it, and get the experience in minutes. That's the goal. And when you talk about private cloud and public cloud, another differentiating factor is that now you get the strength of IBM public cloud combined with the private cloud, so you could, you know, train your model in public cloud, and score on private cloud. You have the same experience. Not many folks, not many competitors can offer that, right?
So that's another... >> [Stu] So if I get that right. If I as a customer have played around with the machine learning in Bluemix, I'm going to have a similar look, feel, API. >> Exactly the same, so what you have in Bluemix, right? I mean, so you have the Watson in Bluemix, which, you know, has deep learning, machine learning--all those capabilities. What we have done is, like, we have extracted the core capabilities of Watson on private cloud, and it's IBM Machine Learning. But the experience is the same. >> I want to talk about this notion of operationalizing analytics. And it ties, to me anyway, it ties into transformation. You mentioned going from Notebook to actually being able to embed analytics in the workflow of the business. Can you double click on that a little bit, and maybe give some examples of how that has helped companies transform? >> Right. So when I talk about operationalizing, when you look at machine learning, right? You have all the way from data, which is the most critical piece, to building or deploying the model. A lot of times, data itself is not clean. I'll give you an example, right. So, when we are working with an insurance company, for example, the data that comes in. For example, if you just take gender, a lot of times the values are null. So we have to build another model to figure out if it's male or female, right? So in this case, for example, we have to say somebody has done a prostate exam. Obviously, he's a male. You know, we figured that. Or has a gynecology exam. It's a female. So we have to, you know, there's a lot of work just to get that data cleansed. So that's where I mentioned it's, you know, machine learning is 20% fun, 80% elbow grease because it's a lot of grease there that you need to make sure that you cleanse the data. Get that right. That's the shaping piece of it. Then comes building the model, right. 
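The cleansing step Dinesh describes — inferring a null gender field from procedure history — can be sketched as a rule-based imputation pass. This is an illustrative sketch only: the field names, procedure codes, and rules below are hypothetical stand-ins, not the insurer's actual schema.

```python
# Rule-based imputation for a missing 'gender' field, inferred from
# procedure history -- a sketch of the "80% elbow grease" cleansing step.

# Hypothetical mapping from procedure code to the gender it implies.
PROCEDURE_GENDER = {
    "prostate_exam": "M",
    "gynecology_exam": "F",
}

def impute_gender(record):
    """Return the record with 'gender' filled in when a procedure implies it."""
    if record.get("gender"):          # already populated -- leave it alone
        return record
    for proc in record.get("procedures", []):
        implied = PROCEDURE_GENDER.get(proc)
        if implied:
            record["gender"] = implied
            break
    return record

claims = [
    {"id": 1, "gender": None, "procedures": ["prostate_exam"]},
    {"id": 2, "gender": "F",  "procedures": []},
    {"id": 3, "gender": None, "procedures": ["x_ray"]},   # stays unknown
]
cleaned = [impute_gender(r) for r in claims]
```

In practice this rule table would itself be a learned model, as Dinesh notes ("we have to build another model to figure out if it's male or female"); the hand-written rules just show the shape of the shaping work.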
And then, once you build the model on that data comes the operationalization of that model, which in itself is huge because how do you make sure that you infuse that model into your current infrastructure, which is where a lot of skill set, a lot of experience, and a lot of knowledge comes in, because you want to make sure, unless you are a start-up, right? You already have applications and programs and third-party vendors' applications that have been running for years, or decades, for that matter. So, yeah, so operationalization's a huge piece. Cleansing of the data is a huge piece. Getting the model right is another piece. >> And simplifying the whole process. I think about, I've got to ingest the data. I've now got to, you know, play with it, explore. I've got to process it. And I've got to serve it to some, you know, some business need or application. And typically, those are separate processes, separate tools, maybe different personas that are doing that. Am I correct that your announcement in the Fall addressed that workflow? How is it being, you know, deployed and adopted in the field? How is it, again back to transformation, are you seeing that people are actually transforming their analytics processes and ultimately creating outcomes that they expect? >> Huge. So good point. We announced Data Science Experience in the Fall. And the customers who are going to speak with us today on stage are the customers who have been using that. So, for example, if you take AFCU, America First Credit Union, they worked with us. In two weeks, you know, talk about transformation, we were able to absorb the knowledge of their underwriters. You know, what (mumbles) is in. Build that, get the features. And were able to build a model in two weeks. And the model is predicting with 90% accuracy. That's what early tests are showing. >> [Dave] And you say that was in a couple of weeks. You were, you developed that model. >> Yeah, yeah, right. 
So when we talk about transformation, right? We couldn't have done that a few years ago. We have transformed where the different personas can collaborate with each other, and that's the collaboration piece I talked about. Real time. Be able to build a model, and put it in the test to see what kind of benefits they're getting. >> And you've obviously got edge cases where people get really sophisticated, but, you know, we were sort of talking off camera, and you know like the 80/20 rule, or maybe it's the 90/10. You say most use cases can be, you know, solved with regression and classification. Can you talk about that a little more? >> So, so when we talk about machine learning, right? To me, I would say 90% of it is regression or classification. I mean, there are edge cases of clustering and all those things. But linear regression or a classification can solve most of our customers' problems, right? So whether it's fraud detection. Or whether it's underwriting the loan. Or whether you're trying to do sentiment analysis. I mean, you can kind of classify or do regression on it. So I would say that 90% of the cases can be covered, but like I said, most of the work is not about picking the right algorithm, but it's also about cleansing the data. Picking the algorithm, then comes building the model. Then comes deployment or operationalizing the model. So there's a step process that's involved, and each step involves some amount of work. So if I could make one more point on the technology and the transformation we have done. So even with picking the right algorithm, we automated, so you as a data scientist don't need to, you know, come in and figure out if I have 50 classifiers and each classifier has four parameters. That's 200 different combinations. Even if you take one hour on each combination, that's 200 hours or nine days that it takes you to pick the right combination. 
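Dinesh's back-of-the-envelope search cost can be checked directly. Note that the 200 comes from pairing 50 classifiers with four parameter settings each (not a full cross-product of values), and 200 hours is a little over eight days, which he rounds up to nine:

```python
# Back-of-the-envelope cost of exhaustive model search, as described:
# 50 classifiers x 4 parameter settings each, one hour per combination.
classifiers = 50
settings_per_classifier = 4
hours_per_combination = 1

combinations = classifiers * settings_per_classifier   # 200 combinations
total_hours = combinations * hours_per_combination     # 200 hours
total_days = total_hours / 24                          # ~8.3 days, i.e. roughly the nine days quoted
```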
What we have done is, like, in IBM Machine Learning we have something called cognitive assistance for data science, which will help you pick the right combination in minutes instead of days. >> So I can see how regression scales, and in the example you gave of classification, I can see how that scales. If you've got a, you know, fixed classification or maybe 200 parameters, or whatever it is, that scales. What happens, how are people dealing with, sort of, automating that classification as things change, as, you know, some kind of new disease or pattern pops up? How do they address that at scale? >> Good point. So as the data changes, the model needs to change, right? Because everything that model knows is based on the training data. Now, if the data has changed, the symptoms of cancer or any disease have changed, obviously, you have to retrain that model. And that's where the feedback loop I talked about comes in, where we will automatically retrain the model based on the new data that's coming in. So you, as an end user, for example, don't need to worry about it because we will take care of that piece also. We will automate that, also. >> Okay, good. And you've got a session this afternoon with, you said, two clients, right? AFCU and Kaden dot TV, and you're on, let's see, at 2:55. >> Right. >> So you folks watching the live stream, check that out. I'll give you the last word, you know, what shall we expect to hear there? Show a little leg on your discussion this afternoon. >> Right. So, obviously, I'm going to talk about the differentiating factors, what we are delivering in IBM Machine Learning, right? And I covered some of it. There's going to be much more. We are going to focus on how we are making freedom or flexibility available. How we are going to deliver productivity gains, right? For our data scientists and developers. We are going to talk about trust, you know, the trust of data that we are bringing in. 
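The feedback loop Dinesh describes — automatic retraining as new data arrives, so the model tracks symptoms or patterns that change — can be sketched with a toy model that refits whenever a batch of fresh observations accrues. The batch size and the running-mean "model" here are illustrative stand-ins, not IBM Machine Learning internals.

```python
# Minimal feedback loop: a trivial "model" (a learned decision threshold)
# that is automatically retrained whenever enough new data arrives.

class FeedbackLoopModel:
    def __init__(self, retrain_every=3):
        self.training_data = []
        self.pending = []            # new observations since the last retrain
        self.retrain_every = retrain_every
        self.threshold = 0.0         # the "model": a mean used as a threshold

    def _fit(self):
        data = self.training_data
        self.threshold = sum(data) / len(data) if data else 0.0

    def observe(self, value):
        """Ingest one new observation; retrain once a batch accrues."""
        self.pending.append(value)
        if len(self.pending) >= self.retrain_every:
            self.training_data.extend(self.pending)
            self.pending.clear()
            self._fit()              # the automatic retrain step

    def score(self, value):
        return value > self.threshold

model = FeedbackLoopModel(retrain_every=3)
for v in [1.0, 2.0, 3.0]:        # first batch: threshold becomes 2.0
    model.observe(v)
for v in [10.0, 11.0, 12.0]:     # drifted data: threshold moves to 6.5
    model.observe(v)
```

The end user never calls `_fit` directly; retraining rides along with ingestion, which is the point Dinesh makes about the platform taking care of that piece.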
Then I'm going to bring the customers in and talk about their experience, right? We are delivering a product, but we already have customers using it, so I want them to come on stage and share the experiences of, you know, it's one thing you hear about that from us, but it's another thing that customers come and talk about it. So, and the last but not least is we are going to announce our first release of IBM Machine Learning on Z because if you look at 90% of the transactional data, today, it runs through Z, so they don't have to off-load the data to do analytics on it. We will make machine learning available, so you can do training and scoring right there on Z for your real time analytics, so. >> Right. Extending that theme that we talked about earlier, Stu, bringing analytics and transactions together, which is a big theme of the Z 13 announcement two years ago. Now you're seeing, you know, machine learning coming on Z. The live stream starts at 2 o'clock. Silicon Angle dot com had an article up on the site this morning from Maria Doucher on the IBM announcement, so check that out. Dinesh, thanks very much for coming back on theCube. Really appreciate it, and good luck today. >> Thank you. >> All right. Keep it right there, buddy. We'll be back with our next guest. This is theCube. We're live from the Waldorf Astoria for the IBM Machine Learning Event announcement. Right back.

Published Date : Feb 15 2017

Rob Thomas, IBM | IBM Machine Learning Launch


 

>> Narrator: Live from New York, it's theCUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody this is theCUBE, we're here at the IBM Machine Learning Launch Event, Rob Thomas is here, he's the general manager of the IBM analytics group. Rob, good to see you again. >> Dave, great to see you, thanks for being here. >> Yeah it's our pleasure. So two years ago, IBM announced the Z platform, and the big theme was bringing analytics and transactions together. You guys are sort of extending that today, bringing machine learning. So the news just hit three minutes ago. >> Rob: Yep. >> Take us through what you announced. >> This is a big day for us. The announcement is we are going to bring machine learning to private Clouds, and my observation is this, you look at the world today, over 90% of the data in the world cannot be googled. Why is that? It's because it's behind corporate firewalls. And as we've worked with clients over the last few years, sometimes they don't want to move their most sensitive data to the public Cloud yet, and so what we've done is we've taken the machine learning from IBM Watson, we've extracted that, and we're enabling that on private Clouds, and we're telling clients you can get the power of machine learning across any type of data, whether it's data in a warehouse, a database, unstructured content, email, you name it we're bringing machine learning everywhere. To your point, we were thinking about, so where do we start? And we said, well, what is the world's most valuable data? It's the data on the mainframe. It's the transactional data that runs the retailers of the world, the banks of the world, insurance companies, airlines of the world, and so we said we're going to start there because we can show clients how they can use machine learning to unlock value in their most valuable data. 
>> And which, you say private Cloud, of course, we're talking about the original private Cloud, >> Rob: Yeah. >> Which is the mainframe, right? >> Rob: Exactly. >> And I presume that you'll extend that to other platforms over time is that right? >> Yeah, I mean, we're going to think about every place that data is managed behind a firewall, we want to enable machine learning as an ingredient. And so this is the first step, and we're going to be delivering every quarter starting next quarter, bringing it to other platforms, other repositories, because once clients get a taste of the idea of automating analytics with machine learning, what we call continuous intelligence, it changes the way they do analytics. And, so, demand will be off the charts here. >> So it's essentially Watson ML extracted and placed on Z, is that right? And describe how people are going to be using this and who's going to be using it. >> Sure, so Watson on the Cloud today is IBM's Cloud platform for artificial intelligence, cognitive computing, augmented intelligence. A component of that is machine learning. So we're bringing that as IBM machine learning which will run today on the mainframe, and then in the future, other platforms. Now let's talk about what it does. What it is, it's a single-place unified model management, so you can manage all your models from one place. And we've got really interesting technology that we pulled out of IBM research, called CADS, which stands for the Cognitive Assistance for Data Scientist. And the idea behind CADS is, you don't have to know which algorithm to choose, we're going to choose the algorithm for you. You build your model, we'll decide based on all the algorithms available on open-source what you built for yourself, what IBM's provided, what's the best way to run it, and our focus here is, it's about productivity of data science and data scientists. 
No company has as many data scientists as they want, and so we've got to make the ones they do have vastly more productive, and so with technology like CADS, we're helping them do their job more efficiently and better. >> Yeah, CADS, we've talked about this in theCUBE before, it's like an algorithm to choose an algorithm, and makes the best fit. >> Rob: Yeah. >> Okay. And you guys addressed some of the collaboration issues at your Watson data platform announcement last October, so talk about the personas who are asking you to give me access to mainframe data, and give me, to tooling that actually resides on this private Cloud. >> It's definitely a data science persona, but we see, I'd say, an emerging market where it's more the business analyst type that is saying I'd really like to get at that data, but I haven't been able to do that easily in the past. So giving them a single pane of glass if you will, with some light data science experience, where they can manage their models, using CADS to actually make it more productive. And then we have something called a feedback loop that's built into it, which is you build a model running on Z, as you get new data in, these are the largest transactional systems in the world so there's data coming in every second. As you get new data in, that model is constantly updating. The model is learning from the data that's coming in, and it's becoming smarter. That's the whole idea behind machine learning in the first place. And that's what we've been able to enable here. Now, you and I have talked through the years, Dave, about IBM's investment in Spark. This is one of the first, I would say, world-class applications of Spark. We announced Spark on the mainframe last year, what we're bringing with IBM machine learning is leveraging Spark as an execution engine on the mainframe, and so I see this as Spark is finally coming into the mainstream, when you talk about Spark accessing the world's greatest transactional data. 
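The idea behind CADS as Rob describes it — you build the model, the system picks the best algorithm for you — amounts to fitting each candidate algorithm and keeping the one that scores best on held-out data. A toy version with three hand-rolled candidates (the selection loop is a sketch of the concept; the actual CADS internals are not described in this transcript):

```python
# Toy algorithm selection: fit each candidate on training data, score it
# on a holdout set, and keep the winner -- the idea behind CADS-style
# assistance, where the data scientist never picks the algorithm by hand.

def mean_predictor(train):
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def last_value_predictor(train):
    last = train[-1][1]
    return lambda x: last

def linear_predictor(train):
    # Slope and intercept by least squares over (x, y) pairs.
    n = len(train)
    sx = sum(x for x, _ in train)
    sy = sum(y for _, y in train)
    sxx = sum(x * x for x, _ in train)
    sxy = sum(x * y for x, y in train)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return lambda x: slope * x + intercept

def select_algorithm(candidates, train, holdout):
    """Return the candidate with the lowest squared error on the holdout."""
    def error(fit):
        model = fit(train)
        return sum((model(x) - y) ** 2 for x, y in holdout)
    return min(candidates, key=error)

train = [(1, 2.0), (2, 4.1), (3, 5.9)]
holdout = [(4, 8.0), (5, 10.1)]
best = select_algorithm([mean_predictor, last_value_predictor, linear_predictor],
                        train, holdout)
```

Here the linear fit wins because the holdout data lies close to a straight line; on different data the selection would land elsewhere, which is exactly why automating the choice helps productivity.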
>> Rob, I wonder if you can help our audience kind of squint through a compare and contrast, public Cloud versus what you're offering today, 'cause one thing, public Cloud adding new services, machine learning seemed like one of those areas that we would add, like IBM had done with a machine learning platform. Streaming, absolutely you hear mobile streaming applications absolutely happened in the public Cloud. Is cost similar in private Cloud? Can I get all the services? How will IBM and your customer base keep up with that pace of innovation that we've seen from IBM and others in the public Cloud, on prem? >> Yeah, so, look, my view is it's not an either or. Because when you look at this valuable data, clients want to do some of it in public Cloud, they want to keep a lot of it in the system that they built on prem. So our job is, how do we actually bridge that gap? So I see machine learning like we've talked about becoming much more of a hybrid capability over time because the data they want to move to the Cloud, they should do that. The economics are great. The data, doing it on private Cloud, actually the economics are tremendous as well. And so we're delivering an elastic infrastructure on private Cloud as well that can scale like the public Cloud. So to me it's not either or, it's about what everybody wants as Cloud features. They want the elasticity, they want a creatable interface, they want the economics of Cloud, and our job is to deliver that in both places. Whether it's on the public Cloud, which we're doing, or on the private Cloud. >> Yeah, one of the thought exercises I've gone through is if you follow the data, and follow the applications, it's going to show you where customers are going to do things. 
If you look at IOT, if you look at healthcare, there's lots of uses where it's going to be on prem, it's going to be on the edge. I got to interview Walmart a couple of years ago at the IBM Edge show, and they leveraged Z globally for their sales, their enablement, and obviously they're not going to use AWS as their platform. What's the trend, what do you hear from your customers, how much of the data, are there reasons why it needs to stay at the edge? It's not just compliance and governance, but it's just because that's where the data is, and I think you were saying there's just so much data on the Z series itself compared to other environments. >> Yeah, and it's not just the mainframe, right? Let's be honest, there's just massive amounts of data that still sits behind corporate firewalls. And while I believe the end destination is a lot of that will be on public Cloud, what do you do now? Because you can't wait until that future arrives. And so the biggest change I've seen in the market in the last year is clients are building private Clouds. It's not traditional on-premise deployments, it's, they're building an elastic infrastructure behind their firewall. You see it a lot in heavily-regulated industries, so financial services where they're dealing with things like GDPR, any type of retailer who's dealing with things like PCI compliance. Heavily-regulated industries are saying, we want to move there, but we've got challenges to solve right now. And so, our mission is, we want to make data simple and accessible, wherever it is, on private Cloud or public Cloud, and help clients on that journey. >> Okay, so carrying through on that, so you're now unlocking access to mainframe data, great. If I have, say, a retail example, and I've got some data science, I'm building some models, I'm accessing the mainframe data, if I have data that's elsewhere in the Cloud, how specifically with regard to this announcement will a practitioner execute on that? 
Yeah, so, one is you could decide one place that you want to land your data and have it be resident, so you could do that. We have scenarios where clients are using Data Science Experience on the Cloud, but they're actually leaving the data behind the firewalls. So we don't require them to move the data, so our model is one of flexibility in terms of how they want to manage their data assets. Which I think is unique in terms of IBM's approach to that. Others in the market say, if you want to use our tools, you have to move your data to our Cloud; some of them even say as you click through the terms, now we own your data, now we own your insights. That's not our approach. Our view is it's your data: if you want to run the applications in the Cloud, leave the data where it is, that's fine. If you want to move both to the Cloud, that's fine. If you want to leave both on private Cloud, that's fine. We have capabilities like Big SQL where we can actually federate data across public and private Clouds, so we're trying to provide choice and flexibility when it comes to this. 
What are your customers telling you? >> I believe in openness and choice. So with IBM machine learning you can choose your language, you can use Scala, you can use Java, you can use Python, more to come. You can choose your framework. We're starting with Spark ML because that's where we have our competency and that's where we see a lot of client desire. But I'm open to clients using other frameworks over time as well, so we'll start to bring that in. I think the IT industry always wants to kind of put people into a box. This is the model you should use. That's not our approach. Our approach is, you can use the language, you can use the framework that you want, and through things like IBM machine learning, we give you the ability to tap this data that is your most valuable data. >> Yeah, the box today has just become this mosaic and you have to provide access to all the pieces of that mosaic. One of the things that practitioners tell us is they struggle sometimes, and I wonder if you could weigh in on this, to invest either in improving the model or capturing more data and they have limited budget, and they said, okay. And I've had people tell me, no, you're way better off getting more data in, I've had people say, no no, now with machine learning we can advance the models. What are you seeing there, what are you advising customers in that regard? >> So, computes become relatively cheap, which is good. Data acquisitions become relatively cheap. So my view is, go full speed ahead on both of those. The value comes from the right algorithms and the right models. That's where the value is. And so I encourage clients, even think about maybe you separate your teams. And you have one that's focused on data acquisition and how you do that, and another team that's focused on model development, algorithm development. Because otherwise, if you give somebody both jobs, they both get done halfway, typically. 
And the value is from the right models, the right algorithms, so that's where we stress the focus. >> And models to date have been okay, but there's a lot of room for improvement. Like the two examples I like to use are retargeting, ad retargeting, which, as we all know as consumers is not great. You buy something and then you get targeted for another week. And then fraud detection, which is actually, for the last ten years, quite good, but there's still a lot of false positives. Where do you see IBM machine learning taking that practical use case in terms of improving those models? >> Yeah, so why are there false positives? The issue typically comes down to the quality of data, and the amount of data that you have that's why. Let me give an example. So one of the clients that's going to be talking at our event this afternoon is Argus who's focused on the healthcare space. >> Dave: Yeah, we're going to have him on here as well. >> Excellent, so Argus is basically, they collect data across payers, they're focused on healthcare, payers, providers, pharmacy benefit managers, and their whole mission is how do we cost-effectively serve different scenarios or different diseases, in this case diabetes, and how do we make sure we're getting the right care at the right time? So they've got all that data on the mainframe, they're constantly getting new data in, it could be about blood sugar levels, it could be about glucose, it could be about changes in blood pressure. Their models will get smarter over time because they built them with IBM machine learning so that what's cost-effective today may not be the most effective or cost-effective solution tomorrow. But we're giving them that continuous intelligence as data comes in to do that. That is the value of machine learning. 
I think sometimes people miss that point, they think it's just about making the data scientists' job easier, that productivity is part of it, but it's really about the veracity of the data and that you're constantly updating your models. >> And the patient outcome there, I read through some of the notes earlier, is if I can essentially opt in to allow the system to adjudicate the medication or the claim, and if I do so, I can get that instantaneously or in near real-time as opposed to having to wait weeks and phone calls and haggling. Is that right, did I get that right? >> That's right, and look, there's two dimensions. It's the cost of treatment, so you want to optimize that, and then it's the effectiveness. And which one's more important? Well, they're both actually critically important. And so what we're doing with Argus is building, helping them build models where they deploy this so that they're optimizing both of those. >> Right, and in the case, again, back to the personas, that would be, and you guys stressed this at your announcement last October, it's the data scientist, it's the data engineer, it's the, I guess even the application developer, right? Involved in that type of collaboration. >> My hope would be over time, when I talked about we view machine learning as an ingredient across everywhere that data is, is you want to embed machine learning into any applications that are built. And at that point you no longer need a data scientist per se, for that case, you can just have the app developer that's incorporating that. Whereas another tough challenge like the one we discussed, that's where you need data scientists. So think about, you need to divide and conquer the machine learning problem, where the data scientist can play, the business analyst can play, the app developers can play, the data engineers can play, and that's what we're enabling. 
We talked earlier about this sort of batch, interactive, and now you have this continuous sort of work load. How does streaming fit? >> So we use streaming in a few ways. One is very high-speed data ingest, it's a good way to get data into the Cloud. We also can do analytics on the fly. So a lot of our use case around streaming where we actually build analytical models into the streaming engine so that you're doing analytics on the fly. So I view that as, it's a different side of the same coin. It's kind of based on your use case, how fast you're ingesting data if you're, you know, sub-millisecond response times, you constantly have data coming in, you need something like a streaming engine to do that. >> And it's actually consolidating that data pipeline, is what you described which is big in terms of simplifying the complexity, this mosaic of a dupe, for example and that's a big value proposition of Spark. Alright, we'll give you the last word, you've got an audience outside waiting, big announcement today; final thoughts. >> You know, we talked about machine learning for a long time. I'll give you an analogy. So 1896, Charles Brady King is the first person to drive an automobile down the street in Detroit. It was 20 years later before Henry Ford actually turned it from a novelty into mass appeal. So it was like a 20-year incubation period where you could actually automate it, you could make it more cost-effective, you could make it simpler and easy. I feel like we're kind of in the same thing here where, the data era in my mind began around the turn of the century. Companies came onto the internet, started to collect a lot more data. It's taken us a while to get to the point where we could actually make this really easy and to do it at scale. And people have been wanting to do machine learning for years. It starts today. So we're excited about that. 
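Rob's "analytics on the fly" — analytical models built into the streaming engine so each event is scored as it arrives rather than in a batch — can be sketched as a generator pipeline with the model sitting inline with ingest. The fraud-style threshold and event shape below are illustrative assumptions, not any IBM streaming API.

```python
# Scoring events as they arrive, rather than batching them up: the model
# sits inline in the stream and tags each event on the way through.

def event_stream():
    """Stand-in for a high-speed ingest source (e.g. transactions)."""
    for amount in [12.50, 8.00, 950.00, 15.25, 1200.00]:
        yield {"amount": amount}

def score_inline(events, threshold=500.0):
    """Attach a model decision to every event as it flows past."""
    for event in events:
        event["flagged"] = event["amount"] > threshold   # the inline "model"
        yield event

scored = list(score_inline(event_stream()))
flagged = [e["amount"] for e in scored if e["flagged"]]
```

Because both stages are generators, nothing is buffered: each event is scored the moment it is produced, which is the sub-millisecond-response shape Rob describes.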
>> Yeah, and we saw the same thing with the steam engine, it was decades before it actually was perfected, and now the timeframe in our industry is compressed to years, sometimes months. >> Rob: Exactly. >> Alright, Rob, thanks very much for coming on theCUBE. Good luck with the announcement today. >> Thank you. >> Good to see you again. >> Thank you guys. >> Alright, keep it right there, everybody. We'll be right back with our next guest, we're live from the Waldorf Astoria, the IBM Machine Learning Launch Event. Be right back. [electronic music]


Kickoff - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Narrator: Live from New York, it's The Cube covering the IBM Machine Learning Launch Event brought to you by IBM. Here are your hosts, Dave Vellante and Stu Miniman. >> Good morning everybody, welcome to the Waldorf Astoria. Stu Miniman and I are here in New York City, the Big Apple, for IBM's Machine Learning Event #IBMML. We're fresh off Spark Summit, Stu, where we had The Cube, this by the way is The Cube, the worldwide leader in live tech coverage. We were at Spark Summit last week, George Gilbert and I, watching the evolution of so-called big data. Let me frame, Stu, where we're at and bring you into the conversation. The early days of big data were all about offloading the data warehouse and reducing the cost of the data warehouse. I often joke that the ROI of big data is reduction on investment, right? There's these big, expensive data warehouses. It was quite successful in that regard. What then happened is we started to throw all this data into the data lake. People would joke it became a data swamp, and you had a lot of tooling to try to clean the data lake, and a lot of transforming and loading, and the ETL vendors started to participate there in a bigger way. Then you saw the extension of these data pipelines to try to do more with that data. The Cloud guys have now entered in a big way. We're now entering the Cognitive Era, as IBM likes to refer to it. Others talk about AI and machine learning and deep learning, and that's really the big topic here today. What we can tell you is that the news goes out at 9:00am this morning, and it was well known that IBM's bringing machine learning to its mainframe, the z mainframe. Two years ago, Stu, IBM announced the z13, which was really designed to bring analytic and transaction processing together on a single platform. Clearly IBM is extending the useful life of the mainframe by bringing things like Spark, certainly what it did with Linux, and now machine learning into z.
I want to talk about Cloud, the importance of Cloud, and how that has really taken over the world of big data. Virtually every customer you talk to now is doing work on the Cloud. It's interesting to see now IBM unlocking its transaction base, its mission-critical data, to this machine learning world. What are you seeing around Cloud and big data? >> We've been digging into this big data space since before it was called big data. One of the early things that really got me interested and excited about it is, from the infrastructure standpoint, storage has always been one of those costs that we had to have, and with the massive amounts of data, the digital explosion we talked about, keeping all that information, or managing all that information, was a huge challenge. Big data was really that bit flip. How do we take all that information and make it an opportunity? How do we get new revenue streams? Dave, IBM has been at the center of this, looking at the higher-level pieces of not just storing data, but leveraging it. Obviously huge in analytics, lots of focus on everything from Hadoop and Spark and newer technologies, but digging in to how they can leverage up the stack, which is where IBM has done a lot of acquisitions in that space, and wants to make sure that they have a strong position both in Cloud, which was renamed: SoftLayer is now IBM Bluemix, with a lot of services including a machine learning service that leverages the Watson technology. And of course on-prem they've got the z and the Power solutions that you and I have covered for many years at the IBM Med show. >> Machine learning obviously heavily leverages models. We saw in the early days of big data that data scientists would build models, and machine learning allows those models to be perfected over time. So there's this continuous process.
We're familiar with the world of batch, and then the minicomputer brought in the world of interactive, so we're familiar with those types of workloads. Now we're talking about a new emergent workload, which is continuous. Continuous apps where you're streaming data in, what Spark is all about. The models that data scientists are building can constantly be improved. The key is automation, right? Being able to automate that whole process, and being able to collaborate between the data scientist, the data quality engineers, even the application developers. That's something that IBM really tried to address in its last big announcement in this area, which was in October of last year: the Watson Data Platform, what they called at the time DataWorks. So really trying to bring together those different personas in a way that they can collaborate together and improve models on a continuous basis. The use cases that you often hear in big data, and certainly initially in machine learning, are things like fraud detection. Obviously ad serving has been a big data application for quite some time. In financial services, identifying good targets, identifying risk. What I'm seeing, Stu, is that the phase that we're in now of this so-called big data and analytics world, and now bringing in machine learning and deep learning, is to really improve on some of those use cases. For example, fraud's gotten much, much better. Ten years ago, let's say, it took many, many months, if you ever detected fraud. Now you get it in seconds, or sometimes minutes, but you also get a lot of false positives. Oops, sorry, the transaction didn't go through. Did you do this transaction? Yes, I did. Oh, sorry, you're going to have to redo it because it didn't go through. It's very frustrating for a lot of users. That will get better and better and better. We've all experienced retargeting from ads, and we know how crappy they are. That will continue to get better.
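The continuous workload described above, a fraud model that keeps adapting as transactions stream through rather than waiting for an offline retrain, can be sketched with a simple online update rule. This is a toy scorer for illustration only, not IBM's implementation: it keeps an exponentially weighted mean and variance of transaction amounts and flags large deviations.

```python
class OnlineFraudScorer:
    """Maintains running statistics of transaction amounts and flags
    outliers. Every observation also updates the model, so the
    'normal' baseline adapts continuously -- no offline retraining."""
    def __init__(self, alpha=0.1, z_threshold=3.0):
        self.alpha = alpha              # how fast the baseline adapts
        self.z_threshold = z_threshold  # deviations beyond this are flagged
        self.mean = None
        self.var = 1.0

    def observe(self, amount):
        if self.mean is None:           # first transaction seeds the baseline
            self.mean = amount
            return False
        z = abs(amount - self.mean) / (self.var ** 0.5 or 1.0)
        flagged = z > self.z_threshold
        # Exponentially weighted update of mean and variance on every event.
        delta = amount - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        return flagged

scorer = OnlineFraudScorer()
flags = [scorer.observe(a) for a in [20, 22, 19, 21, 20, 500, 21]]
# Only the 500 outlier is flagged; the baseline then absorbs it.
```

Tuning `alpha` and `z_threshold` is exactly the false-positive trade-off discussed above: a looser threshold misses fraud, a tighter one declines legitimate transactions.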
The big question that people have, and it goes back to Jeff Hammerbacher: the best minds of my generation are trying to get people to click on ads. When will we see big data really start to affect our lives in different ways, like patient outcomes? We're going to hear some of that today from folks in health care and pharma. Again, these are the things that people are waiting for. The other piece is, of course, IT. What are you seeing, in terms of IT, in the whole data flow? >> Yes, a big question we have, Dave, is where's the data? And therefore, where does it make sense to be able to do that processing? In big data we talked about, you've got massive amounts of data; can we move the processing to that data? With IT, the day before, our CTO talked about how there's going to be massive amounts of data at the edge, and I don't have the time or the bandwidth or the need necessarily to pull that back to some kind of central repository. I want to be able to work on it there. Therefore there's going to be a lot of data worked at the edge. Peter Levine did a whole video talking about how, "Oh, Public Cloud is dead, it's all going to the edge." A little bit hyperbolic, but we understand that there's plenty of use cases for both the Public Cloud and the edge. In fact, we see Google pushing big on machine learning with TensorFlow; it's got one of those machine learning frameworks out there that we expect a lot of people to be working on. Amazon is putting effort into the MXNet framework, which is once again an open-source effort. One of the things I'm looking at in this space, and I think IBM can provide some leadership here, is what frameworks are going to become popular across multiple scenarios. How many winners can there be for these frameworks? We already have multiple programming languages, multiple Clouds. How much of it is just API compatibility?
How much work is there, and where are the repositories of data going to be, and where does it make sense to do that predictive analytics, that advanced processing? >> You bring up a good point. Last year, last October, at Big Data NYC, we had a special segment with a data scientist panel. It was great. We had some rockstar data scientists on there, like Dez Blanchfield and Joe Caserta, and a number of others. They echoed what you always hear when you talk to data scientists: "We spend 80% of our time messing with the data, trying to clean the data, figuring out the data quality, and precious little time on the models, improving the models, and actually getting outcomes from those models." So things like Spark have simplified that whole process and unified a lot of the tooling around so-called big data. We're seeing Spark adoption increase. George Gilbert, in part one and part two of our big data forecast from Wikibon last week, showed that we're still not on the steep part of the S-curve in terms of Spark adoption. Generally, we're talking about streaming as well, included in that forecast, but it's forecasting that increasingly those applications are going to become more and more important. It brings you back to what IBM's trying to do: bring machine learning to this critical transaction data. Again, to me, it's an extension of the vision that they put forth two years ago, bringing analytic and transaction data together, actually processing within that Private Cloud complex, which is essentially what this mainframe is. It's the original Private Cloud, right? You were saying off-camera, it's the original converged infrastructure. It's the original Private Cloud. >> The mainframe's still here, lots of Linux on it. We've covered it for many years; you want your cool Linux, Docker, containerized, machine learning stuff, I can do that on the z series.
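The data scientists' "80% of our time messing with the data" complaint quoted above usually means passes like this one: normalizing field names, coercing types, and quarantining rows that can't be repaired. The record layout here is made up purely for illustration.

```python
def clean(records):
    """Typical pre-modeling cleanup: normalize field names and casing,
    coerce types, and quarantine rows that can't be repaired.
    The 'customer'/'amount' fields are hypothetical."""
    cleaned, rejected = [], []
    for r in records:
        # Normalize keys: strip whitespace, lowercase.
        row = {k.strip().lower(): v for k, v in r.items()}
        try:
            row["amount"] = float(str(row["amount"]).replace(",", ""))
            row["customer"] = str(row["customer"]).strip().title()
        except (KeyError, ValueError):
            rejected.append(r)   # quarantine instead of silently dropping
            continue
        cleaned.append(row)
    return cleaned, rejected

raw = [
    {"Customer ": " alice smith", "Amount": "1,200.50"},
    {"customer": "BOB JONES", "amount": "300"},
    {"customer": "carol", "amount": "n/a"},   # unparseable, gets quarantined
]
good, bad = clean(raw)
```

Tools like Spark unify this step with the modeling that follows; the logic is the same, just distributed across a cluster instead of a single list.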
>> You want Python and Spark and R and, you know, Java, and all the popular programming languages. It makes sense. It's not like a huge growth platform, it's kind of flat, down, up in the product cycle, but it's alive and well, and a lot of companies run their businesses obviously on the z. We're going to be unpacking that all day. Some of the questions we have are: what about Cloud? Where does it fit? What about Hybrid Cloud? What are the specifics of this announcement? Where does it fit? Will it be extended? Where does it come from? How does it relate to other products within the IBM portfolio? And very importantly, how are customers going to be applying these capabilities to create business value? That's something that we'll be looking at with a number of the folks on today. >> Dave, another thing: it reminds me of two years ago, when you and I did an event with the MIT Sloan school on The Second Machine Age, with Andy McAfee and Erik Brynjolfsson, talking about, as machines can help with some of these analytics and some of this advanced technology, what happens to the people? Talk about health care: it's doctors plus machines most of the time. As these two professors say, it's racing with the machines. What is the impact on people? What's the impact on jobs? And productivity going forward? Really interesting hot space. They talk about everything from autonomous vehicles to advanced health care and the like. This is right at the core of where the next generation of the economy and jobs are going to go. >> It's a great point, and no doubt that's going to come up today, and some of our segments will explore that. Keep it right there, everybody. We'll be here all day covering this announcement, talking to practitioners, talking to IBM executives and thought leaders, and sharing some of the major trends that are going on in machine learning and the specifics of this announcement. Keep it right there, everybody. This is The Cube. We're live from the Waldorf Astoria. We'll be right back.

Published Date : Feb 15 2017


Opening Panel | Generative AI: Hype or Reality | AWS Startup Showcase S3 E1


 

(light airy music) >> Hello, everyone, welcome to theCUBE's presentation of the AWS Startup Showcase, AI and machine learning: "Top Startups Building Generative AI on AWS." This is season three, episode one of the ongoing series covering the exciting startups from the AWS ecosystem, talking about AI and machine learning. We have three great guests: Bratin Saha, Vice President of Machine Learning and AI Services at Amazon Web Services; Tom Mason, the CTO of Stability AI; and Aidan Gomez, CEO and co-founder of Cohere. Two practitioners doing startups, and AWS. Gentlemen, thank you for opening up this session, this episode. Thanks for coming on. >> Thank you. >> Thank you. >> Thank you. >> So the topic is hype versus reality. I think we'd all say the hype is great, but the reality's here. I want to get into it. Generative AI's got all the momentum, it's going mainstream, it's kind of come out from behind the ropes, it's now mainstream. We saw the success of ChatGPT, which opened up everyone's eyes, but there's so much more going on. Let's jump in and get your early perspectives on what should people be talking about right now. What are you guys working on? We'll start with AWS. What's the big focus right now for you guys as you come into this market that's highly active, highly hyped up, but people see value right out of the gate? >> You know, we have been working on generative AI for some time. In fact, last year we released CodeWhisperer, which is about using generative AI for software development, and a number of customers are using it and getting real value out of it. So generative AI is now something that's mainstream, that can be used by enterprise users. And we have also been partnering with a number of other companies. So, you know, stability.ai, we've been partnering with them a lot. We want to be partnering with other companies as well.
And we do three things, you know: first is providing the most efficient infrastructure for generative AI. And that is where, you know, things like Trainium, things like Inferentia, things like SageMaker come in. And then next is the set of models, and then the third is the kind of applications, like CodeWhisperer and so on. So, you know, it's early days yet, but clearly there's a lot of amazing capabilities that will come out, and something that, you know, our customers are starting to pay a lot of attention to. >> Tom, talk about your company and what your focus is, and why the Amazon Web Services relationship's important for you. >> So yeah, we're primarily committed to making incredible open source foundation models, and obviously Stable Diffusion's been our kind of first big model there, which we trained all on AWS. We've been working with them over the last year and a half to develop, obviously, a big cluster, and bring all that compute to training these models at scale, which has been a really successful partnership. And we're excited to take it further this year as we develop the commercial strategy of the business and build out, you know, the ability for enterprise customers to come and get all the value from these models that we think they can get. So we're really excited about the future. We've got a hugely exciting pipeline for this year, with new modalities and video models and wonderful things, and trying to solve images once and for all, and get the kind of general value proposition correct for customers. So it's a really exciting time, and very honored to be part of it. >> It's great to see some of your customers doing so well out there. Congratulations to your team. Appreciate that. Aidan, let's get into what you guys do. What does Cohere do? What are you excited about right now? >> Yeah, so Cohere builds large language models, which are the backbone of applications like ChatGPT and GPT-3.
We're extremely focused on solving the issues with adoption for enterprise. So it's great that you can make a super flashy demo for consumers, but it takes a lot to actually get it into billion-user products and large global enterprises. So about six months ago, we released our command models, which are some of the best that exist for large language models. And in December, we released our multilingual text understanding models, and that's on over a hundred different languages, and it's trained on, you know, authentic data directly from native speakers. And so we're super excited to continue pushing this into enterprise and solving those barriers for adoption, making this transformation a reality. >> Just real quick, while I've got you there, on the new products coming out: where are we in the progress? People see some of the new stuff out there right now. There's so much more headroom. Can you just scope out in your mind what that looks like, from a headroom standpoint? Okay, we see ChatGPT. "Oh yeah, it writes my papers for me, does some homework for me." I mean okay, yawn, maybe people say that, (Aidan chuckles) or people are excited, or people are blown away. I mean, it's helped theCUBE out, it helps me, you know, speed up my write-ups a little bit, but it's not always perfect. >> Yeah, at the moment it's like a writing assistant, right? And it's still super early in the technology's trajectory. I think it's fascinating and it's interesting, but its impact is still really limited. I think in the next year, like within the next eight months, we're going to see some major changes. You've already seen the very first hints of that with stuff like Bing Chat, where you augment these dialogue models with an external knowledge base. So now the models can be kept up to date to the millisecond, right? Because they can search the web and they can see events that happened a millisecond ago.
But that's still limited in the sense that when you ask the question, what can these models actually do? Well they can just write text back at you. That's the extent of what they can do. And so the real project, the real effort, that I think we're all working towards is actually taking action. So what happens when you give these models the ability to use tools, to use APIs? What can they do when they can actually affect change out in the real world, beyond just streaming text back at the user? I think that's the really exciting piece. >> Okay, so I wanted to tee that up early in the segment 'cause I want to get into the customer applications. We're seeing early adopters come in, using the technology because they have a lot of data, they have a lot of large language model opportunities and then there's a big fast follower wave coming behind it. I call that the people who are going to jump in the pool early and get into it. They might not be advanced. Can you guys share what customer applications are being used with large language and vision models today and how they're using it to transform on the early adopter side, and how is that a tell sign of what's to come? >> You know, one of the things we have been seeing both with the text models that Aidan talked about as well as the vision models that stability.ai does, Tom, is customers are really using it to change the way you interact with information. You know, one example of a customer that we have, is someone who's kind of using that to query customer conversations and ask questions like, you know, "What was the customer issue? How did we solve it?" And trying to get those kinds of insights that was previously much harder to do. And then of course software is a big area. You know, generating software, making that, you know, just deploying it in production. Those have been really big areas that we have seen customers start to do. 
You know, looking at documentation: instead of, you know, searching for stuff and so on, you just have an interactive way in which you can look at the documentation for a product. You know, all of this goes to where we need to take the technology. Part of which is, you know, the models have to be there, but they have to work reliably in a production setting, at scale, with privacy, with security, and you know, making sure all of this is happening is going to be really key. That is what, you know, we at AWS are looking to do, which is work with partners like Stability and others in the open source, and really take all of these and make them available at scale to customers, where they work reliably. >> Tom, Aidan, what are your thoughts on this? Where are customers landing on these first use cases, or set of low-hanging-fruit use cases or applications? >> Yeah, so I think the first group of adopters that really found product-market fit were the copywriting companies. So one great example of that is HyperWrite. Another one is Jasper. And so for Cohere, that's the tip of the iceberg; there's a very long tail of usage from a bunch of different applications. HyperWrite is one of our customers; they help beat writer's block by drafting blog posts, emails, and marketing copy. We also have a global audio streaming platform, which is using us to power a search engine that can comb through podcast transcripts in a bunch of different languages. Then a global apparel brand, which is using us to transform how they interact with their customers through a virtual assistant, and two dozen global news outlets who are using us for news summarization. So really, these large language models can be deployed all over the place, into every single industry sector; language is everywhere. It's hard to think of any company on Earth that doesn't use language. So it's, very, very- >> We're doing it right now. We got the language coming in. >> Exactly.
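The knowledge-base augmentation and interactive-documentation use cases described above are usually built as retrieval before generation: find the most relevant documents, then prepend them to the model's prompt. A minimal sketch of the retrieval half, with toy word-overlap scoring standing in for the dense embeddings a real system would use:

```python
def tokenize(text):
    return set(text.lower().split())

def retrieve(query, documents, k=1):
    """Rank knowledge-base documents by word overlap with the query.
    Real systems use embedding similarity; overlap keeps the sketch
    dependency-free."""
    def overlap(doc):
        q, d = tokenize(query), tokenize(doc)
        return len(q & d) / (len(q | d) or 1)
    return sorted(documents, key=overlap, reverse=True)[:k]

def build_prompt(query, documents):
    """Prepend retrieved context so the model answers from fresh facts
    instead of whatever was in its training data."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

kb = [
    "The launch event is at the Waldorf Astoria in New York.",
    "Machine learning models can be trained on the mainframe.",
]
prompt = build_prompt("where is the launch event", kb)
```

Because the knowledge base, not the model weights, holds the facts, updating the system is just updating `kb`; this is how answers stay current "to the millisecond."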
>> We'll transcribe this puppy. All right. Tom, on your side, what do you see the- >> Yeah, we're seeing some amazing applications of it, and you know, I guess that's partly been because of the growth in the open source community, and some of these applications have come from there that are then triggering this secondary wave of innovation, which is coming a lot from, you know, controllability and explainability of the model. But we've got companies like, you know, Jasper, which Aidan mentioned, who are using stable diffusion for image generation in blog creation, content creation. We've got Lensa, you know, which exploded, and is built on top of stable diffusion for fine tuning, so people can bring themselves and their pets and, you know, everything into the models. So we've now got fine-tuned stable diffusion at scale, which has democratized, you know, that process, which is really fun to see. Lensa, you know, exploded; I think it was the largest growing app in the App Store at one point. And lots of other examples like NightCafe and Lexica and Playground. So seeing lots of cool applications. >> So many applications; we'll probably be a customer for all you guys. We'll definitely talk after. But the challenges are there for people adopting. They want to get into what you guys see as the challenges that turn into opportunities. How do you see customers adopting generative AI applications? For example, we have massive amounts of transcripts, timed up to all the videos. I don't even know what to do. Do I just, do I code my API there? So, everyone has this problem; every vertical has these use cases. What are the challenges for people getting into this and adopting these applications? Is it figuring out what to do first? Or is it a technical setup? Do they stand up stuff, do they just go to Amazon? What do you guys see as the challenges?
>> I think, you know, the first thing is coming up with where you think you're going to reimagine your customer experience by using generative AI. You know, Aidan and Tom talked about a number of these, and you know, you pick up one or two of these and get them robust. And then once you have them, you know, we have models, and we'll have more models on AWS, these large language models that Aidan was talking about. Then you go in and start using these models and testing them out and seeing whether they fit the use case or not. In many situations, like you said, John, our customers want to say, "You know, I know you've trained these models on a lot of publicly available data, but I want to be able to customize it for my use cases. Because, you know, there's some knowledge that I have created and I want to be able to use that." And then in many cases, and I think Aidan mentioned this, you know, you need these models to be up to date. Like, you can't have them going stale. And in those cases, you augment it with a knowledge base, and you know, you have to make sure that these models are not hallucinating. And so you need to be able to do the right kind of responsible AI checks. So, you know, you start with a particular use case, and there are a lot of them. Then, you know, you can come to AWS, and then look at one of the many models we have, and you know, we are going to have more models for other modalities as well. And then, you know, play around with the models. We have a playground kind of thing where you can test these models on some data, and then you will probably want to bring your own data, customize it to your own needs, do some of the testing to make sure that the model is giving the right output, and then just deploy it. And you know, we have a lot of tools. >> Yeah. >> To make this easy for our customers. >> How should people think about large language models?
Because do they think about it as something that they tap into with their IP or their data? Or is it a large language model that they apply into their system? Is the interface that way? What does the interaction look like? >> In many situations, you can use these models out of the box. But typically, in most other situations, you will want to customize them with your own data or with your own expectations. So the typical use case would be, you know, these models are exposed through APIs. So the typical use case would be, you know, you're using these APIs a little bit for testing and getting familiar, and then there will be an API that will allow you to train this model further on your data. So you use that API, you know, and make sure you augment it with the knowledge base. So then you use those APIs to customize the model, and then just deploy it in an application. You know, like Tom was mentioning, a number of companies are using these models. So once you have it, then you know, you again use an endpoint API and use it in an application. >> All right, I love the example. I want to ask Tom and Aidan, because, like, most of my experience with Amazon Web Services, in 2007, I would stand up an EC2 instance, put my code on there, play around, and if it didn't work out, I'd shut it down. Is that a similar dynamic we're going to see with machine learning, where developers just kind of log in and stand up infrastructure and play around, and then have a cloud-like experience? >> So I can go first. So I mean, we obviously, with AWS, work really closely with the SageMaker team, a fantastic platform there for ML training and inference. And you know, going back to your point earlier, you know, where the data is, is hugely important for companies. Many companies bringing their models to their data, in AWS or on-premise, for them is hugely important. Having the models be, you know, open source makes them explainable and transparent to the adopters of those models.
So, you know, we are really excited to work with the SageMaker team over the coming year to bring companies to that platform and make the most of our models. >> Aidan, what's your take on developers? Do they just need to have a team in place if they want to interface with you guys? Let's say they can start learning; what have they got to do to set up? >> Yeah, so I think for Cohere, our product makes it much, much easier for people to get started and start building; it solves a lot of the productionization problems. But of course with SageMaker, like Tom was saying, I think that lowers the barrier even further, because it solves problems like data privacy. So I want to underline what Bratin was saying earlier: when you're fine-tuning or when you're using these models, you don't want your data being incorporated into someone else's model. You don't want it being used for training elsewhere. And so the ability to solve, for enterprises, that data privacy and that security guarantee has been hugely important for Cohere, and that's very easy to do through SageMaker. >> Yeah. >> But the barriers for using this technology are coming down super quickly. And so for developers, it's just becoming completely intuitive. I love this, there's this quote from Andrej Karpathy. He was saying, "It really wasn't on my 2022 list of things to happen that English would become, you know, the most popular programming language." And so the barrier is coming down- >> Yeah. >> Super quickly, and it's exciting to see. >> It's going to be awesome for all the companies here, and then we'll do more; we're probably going to see an explosion of startups, already seeing that: the maps, ecosystem maps, the landscape maps are happening. So this is happening, and I'm convinced it's not yesterday's chatbot, it's not yesterday's AIOps. It's a whole other ballgame. So I have to ask you guys the final question before we kick off the companies showcasing here.
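The customization loop Bratin outlined earlier, test a hosted model through its API, fine-tune it on your own data, then call the resulting endpoint from an application, can be sketched with a stub client. Every class and method name below is hypothetical; real SDKs (SageMaker, Cohere, and so on) have their own names and signatures.

```python
class StubModelClient:
    """Hypothetical client illustrating the try -> customize -> deploy
    loop; stands in for a real hosted-model SDK."""
    def __init__(self, base_model):
        self.base_model = base_model
        self.custom_suffix = ""

    def generate(self, prompt):
        # Stand-in for a hosted inference call against the endpoint.
        return f"[{self.base_model}{self.custom_suffix}] response to: {prompt}"

    def fine_tune(self, examples):
        # Stand-in for a fine-tuning job over the customer's own data.
        self.custom_suffix = f"-ft-{len(examples)}ex"
        return self

# 1. Test the base model, playground-style.
client = StubModelClient("base-llm")
draft = client.generate("summarize this support ticket")

# 2. Customize on your own labeled data.
client.fine_tune([("ticket text", "summary")] * 50)

# 3. Deploy: the application now calls the customized endpoint.
answer = client.generate("summarize this support ticket")
```

The point of the shape, not the names, is that inference and fine-tuning are both just API calls, which is what makes the "log in and play around" experience possible.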
How do you guys gauge success of generative AI applications? Is there a lens to look through and say, okay, how do I see success? It could be just getting a win, or is it a bigger picture? Bratin, we'll start with you. How do you gauge success for generative AI? >> You know, ultimately it's about bringing business value to our customers, and making sure that those customers are able to reimagine their experiences by using generative AI. Now the way to get there is, of course, to deploy those models in a safe, effective manner, and ensuring that all of the robustness and the security guarantees and the privacy guarantees are all there. And we want to make sure that this transitions from something that's great demos to actual at-scale products, which means making them work reliably all of the time, not just some of the time. >> Tom, what's your gauge for success? >> Look, I think this, we're seeing a completely new form of ways to interact with data, to make data intelligent, and directly to bring in new revenue streams into business. So if businesses can use our models to leverage that and generate completely new revenue streams and ultimately bring incredible new value to their customers, then that's fantastic. And we hope we can power that revolution. >> Aidan, what's your take? >> Yeah, reiterating Bratin and Tom's point, I think that value in the enterprise and value in market is like a huge, you know, it's the goal that we're striving towards. I also think that, you know, the value to consumers and actual users and the transformation of the surface area of technology to create experiences like ChatGPT that are magical, and it's the first time in human history we've been able to talk to something compelling that's not a human. I think that in itself is just extraordinary and so exciting to see. >> It really brings up a whole another category of markets. B2B, B2C, it's B2D, business to developer. 
Because I think this is kind of the big trend, the consumers have to win. The developers coding the apps, it's a whole another sea change. Reminds me, everyone used the "Moneyball" movie as an example during the big data wave, then, you know, the value of data. There's a scene in "Moneyball" at the end, where Billy Beane's getting the offer from the Red Sox, then the owner says to him, "If every team's not rebuilding their teams based upon your model, they'll be dinosaurs." I think that's the same with AI here. Every company will need to think about their business model and how they operate with AI. So it'll be a great run. >> Completely agree. >> It'll be a great run. >> Yeah. >> Aidan, Tom, thank you so much for sharing about your experiences at your companies and congratulations on your success, and it's just the beginning. And Bratin, thanks for coming on representing AWS. And thank you, we appreciate what you do. Thank you. >> Thank you, John. Thank you, Aidan. >> Thank you John. >> Thanks so much. >> Okay, let's kick off season three, episode one. I'm John Furrier, your host. Thanks for watching. (light airy music)
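The adoption path the panel describes (test a hosted model through its API, fine-tune it on your own data, then deploy the customized endpoint in your application) can be sketched in miniature. This is a hedged toy sketch: the `ModelEndpoint` class and its `generate`/`fine_tune` methods are hypothetical stand-ins for a vendor SDK, not a real API.

```python
# Toy sketch of the workflow described in the panel: try a hosted model
# out of the box, fine-tune on your own data, then deploy the tuned model.
# `ModelEndpoint` is a hypothetical stand-in, not a real vendor SDK.

class ModelEndpoint:
    """Stand-in for a hosted foundation-model API endpoint."""
    def __init__(self, base_knowledge):
        self.knowledge = dict(base_knowledge)

    def generate(self, prompt):
        # Out-of-the-box behavior: answer only from general knowledge.
        return self.knowledge.get(prompt, "I don't know")

    def fine_tune(self, examples):
        # Customization step: augment the base model with your own data,
        # returning a new endpoint and leaving the base model untouched.
        tuned = ModelEndpoint(self.knowledge)
        tuned.knowledge.update(examples)
        return tuned

# 1. Test the base model out of the box.
base = ModelEndpoint({"What is S3?": "Object storage."})
print(base.generate("What is our refund policy?"))  # base model can't answer

# 2. Fine-tune on company data, then 3. deploy the tuned endpoint in the app.
tuned = base.fine_tune({"What is our refund policy?": "30 days, no questions."})
print(tuned.generate("What is our refund policy?"))
```

The point Aidan and Bratin make about data privacy shows up here too: fine-tuning produces a separate customized endpoint, so your examples never flow back into the shared base model.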

Published Date : Mar 9 2023



AI Meets the Supercloud | Supercloud2


 

(upbeat music) >> Okay, welcome back everyone to the Supercloud 2 event, live here in Palo Alto, theCUBE Studios live stage performance, virtually syndicating it all over the world. I'm John Furrier with Dave Vellante here as CUBE alumni, and special influencer guest, Howie Xu, VP of Machine Learning at Zscaler, also part-time as a CUBE analyst 'cause he is that good. Comes on all the time. You're basically a CUBE analyst as well. Thanks for coming on. >> Thanks for inviting me. >> John: Technically, you're not really a CUBE analyst, but you're kind of like a CUBE analyst. >> Happy New Year to everyone. >> Dave: Great to see you. >> Great to see you, Dave and John. >> John: We've been talking about ChatGPT online. You wrote a great post about it being more like Amazon, not like Google. >> Howie: More than just Google Search. >> More than Google Search. Oh, it's going to compete with Google Search, which it kind of does a little bit, but more its infrastructure. So a clever point, good segue into this conversation, because this is kind of the beginning of these kinds of next gen things we're going to see. Things where it's like an obvious next gen, it's getting real. Kind of like seeing the browser for the first time, the Mosaic browser. Whoa, this internet thing's real. I think this is that moment, and Supercloud-like enablement is coming. So this has been a big part of the Supercloud kind of theme. >> Yeah, you talk about Supercloud, you talk about, you know, AI, ChatGPT. I really think the ChatGPT is really another Netscape moment, the browser moment. Because if you think about internet technology, right? It was brewing for 20 years before the early 90s. Not until you had a, you know, browser, people realize, "Wow, this is how wonderful this technology could do." Right? You know, all the wonderful things. Then you have Yahoo and Amazon. I think we have brewing, you know, the AI technology for, you know, quite some time. Even then, you know, neural networks, deep learning. 
But not until ChatGPT came along, people realize, "Wow, you know, the user interface, user experience could be that great," right? So I really think, you know, if you look at the last 30 years, there is a browser moment, there is iPhone moment. I think ChatGPT moment is as big as those. >> Dave: What do you see as the intersection of things like ChatGPT and the Supercloud? Of course, the media's going to focus, journalists are going to focus on all the negatives and the privacy. Okay. You know we're going to get by that, right? Always do. Where do you see the Supercloud and sort of the distributed data fitting in with ChatGPT? Does it use that as a data source? What's the link? >> Howie: I think there are number of use cases. One of the use cases, we talked about why we even have Supercloud because of the complexity, because of the, you know, heterogeneous nature of different clouds. In order for me as a developer, in order for me to create applications, I have so many things to worry about, right? It's a complexity. But with ChatGPT, with the AI, I don't have to worry about it, right? Those kind of details will be taken care of by, you know, the underlying layer. So we have been talking about on this show, you know, over the last, what, year or so about the Supercloud, hey, defining that, you know, API layer spanning across, you know, multiple clouds. I think that will be happening. However, for a lot of the things, that will be more hidden, right? A lot of that will be automated by the bots. You know, we were just talking about it right before the show. One of the profound statement I heard from Adrian Cockcroft about 10 years ago was, "Hey Howie, you know, at Netflix, right? You know, IT is just one API call away." That's a profound statement I heard about a decade ago. I think next decade, right? You know, the IT is just one English language away, right? So when it's one English language away, it's no longer as important, API this, API that. 
You still need API just like hardware, right? You still need all of those things. That's going to be more hidden. The high level thing will be more, you know, English language or the language, right? Any language for that matter. >> Dave: And so through language, you'll tap services that live across the Supercloud, is what you're saying? >> Howie: You just tell what you want, what you desire, right? You know, the bots will help you to figure out where the complexity is, right? You know, like you said, a lot of criticism about, "Hey, ChatGPT doesn't do this, doesn't do that." But if you think about how to break things down, right? For instance, right, you know, ChatGPT doesn't have Microsoft stock price today, obviously, right? However, you can ask ChatGPT to write a program for you, retrieve the Microsoft stock price, (laughs) and then just run it, right? >> Dave: Yeah. >> So the thing to think about- >> John: It's only going to get better. It's only going to get better. >> The thing people kind of unfairly criticize ChatGPT is it doesn't do this. But can you not break down humans' task into smaller things and get complex things to be done by the ChatGPT? I think we are there already, you know- >> John: That to me is the real game changer. That's the assembly of atomic elements at the top of the stack, whether the interface is voice or some programmatic gesture based thing, you know, wave your hand or- >> Howie: One of the analogy I used in my blog was, you know, each person, each professional now is a quarterback. And we suddenly have, you know, a lot more linebacks or you know, any backs to work for you, right? For free even, right? You know, and then that's sort of, you should think about it. You are the quarterback of your day-to-day job, right? Your job is not to do everything manually yourself. >> Dave: You call the play- >> Yes. >> Dave: And they execute. Do your job. >> Yes, exactly. >> Yeah, all the players are there. 
All the elves are in the North Pole making the toys, Dave, as we say. But this is the thing, I want to get your point. This change is going to require a new kind of infrastructure-software relationship, a new kind of operating runtime, a new kind of assembler, a new kind of loader, linker things. These very operating-systems kinds of concepts. >> Data intensive, right? How to process the data, how to, you know, process such gigantic data in parallel, right? That's actually a tough job, right? So if you think about ChatGPT, why OpenAI is ahead of the game, right? You know, Google may not want to acknowledge it, right? It's not necessarily that they, you know, don't have enough data scientists, but the software engineering pieces, you know, behind it, right? To train the model, to actually do all those things in parallel, to do all those things in a cost-effective way. So I think, you know, a lot of those still- >> Let me ask you a question. Let me ask you a question because we've had this conversation privately, but I want to do it while we're on stage here. Where are all the alpha geeks and developers and creators and entrepreneurs going to gravitate to? You know, in every wave, you see it in crypto, all the alphas went into crypto. Now I think with ChatGPT, you're going to start to see, like, "Wow, it's that moment." A lot of people are going to, you know, scrum and do startups. CTOs will invent stuff. There's a lot of invention, a lot of computer science and customer requirements to figure out. That's new. Where are the alpha entrepreneurs going to go to? What do you think they're going to gravitate to? If you could point to the next layer to enable this super environment, super app environment, Supercloud. 'Cause there's a lot to do to enable what you just said. >> Howie: Right. You know, if you think about using the internet as the analogy, right? You know, in the early 90s, internet came along, browser came along. You had two kinds of companies, right? 
One is Amazon, the other one is walmart.com. And then there were company, like maybe GE or whatnot, right? Really didn't take advantage of internet that much. I think, you know, for entrepreneurs, suddenly created the Yahoo, Amazon of the ChatGPT native era. That's what we should be all excited about. But for most of the Fortune 500 companies, your job is to surviving sort of the big revolution. So you at least need to do your walmart.com sooner than later, right? (laughs) So not be like GE, right? You know, hand waving, hey, I do a lot of the internet, but you know, when you look back last 20, 30 years, what did they do much with leveraging the- >> So you think they're going to jump in, they're going to build service companies or SaaS tech companies or Supercloud companies? >> Howie: Okay, so there are two type of opportunities from that perspective. One is, you know, the OpenAI ish kind of the companies, I think the OpenAI, the game is still open, right? You know, it's really Close AI today. (laughs) >> John: There's room for competition, you mean? >> There's room for competition, right. You know, you can still spend you know, 50, $100 million to build something interesting. You know, there are company like Cohere and so on and so on. There are a bunch of companies, I think there is that. And then there are companies who's going to leverage those sort of the new AI primitives. I think, you know, we have been talking about AI forever, but finally, finally, it's no longer just good, but also super useful. I think, you know, the time is now. >> John: And if you have the cloud behind you, what do you make the Amazon do differently? 'Cause Amazon Web Services is only going to grow with this. It's not going to get smaller. There's more horsepower to handle, there's more needs. >> Howie: Well, Microsoft already showed what's the future, right? You know, you know, yes, there is a kind of the container, you know, the serverless that will continue to grow. 
But the future is really not about- >> John: Microsoft's shown the future? >> Well, showing that, you know, working with OpenAI, right? >> Oh okay. >> They already said that, you know, we are going to have ChatGPT service. >> $10 billion, I think they're putting it. >> $10 billion putting, and also open up the Open API services, right? You know, I actually made a prediction that Microsoft future hinges on OpenAI. I think, you know- >> John: They believe that $10 billion bet. >> Dave: Yeah. $10 billion bet. So I want to ask you a question. It's somewhat academic, but it's relevant. For a number of years, it looked like having first mover advantage wasn't an advantage. PCs, spreadsheets, the browser, right? Social media, Friendster, right? Mobile. Apple wasn't first to mobile. But that's somewhat changed. The cloud, AWS was first. You could debate whether or not, but AWS okay, they have first mover advantage. Crypto, Bitcoin, first mover advantage. Do you think OpenAI will have first mover advantage? >> It certainly has its advantage today. I think it's year two. I mean, I think the game is still out there, right? You know, we're still in the first inning, early inning of the game. So I don't think that the game is over for the rest of the players, whether the big players or the OpenAI kind of the, sort of competitors. So one of the VCs actually asked me the other day, right? "Hey, how much money do I need to spend, invest, to get, you know, another shot to the OpenAI sort of the level?" You know, I did a- (laughs) >> Line up. >> That's classic VC. "How much does it cost me to replicate?" >> I'm pretty sure he asked the question to a bunch of guys, right? >> Good luck with that. (laughs) >> So we kind of did some napkin- >> What'd you come up with? (laughs) >> $100 million is the order of magnitude that I came up with, right? You know, not a billion, not 10 million, right? So 100 million. >> John: Hundreds of millions. >> Yeah, yeah, yeah. 
100 million order of magnitude is what I came up with. You know, we can get into details, you know, in other sort of the time, but- >> Dave: That's actually not that much if you think about it. >> Howie: Exactly. So when he heard me articulating why is that, you know, he's thinking, right? You know, he actually, you know, asked me, "Hey, you know, there's this company. Do you happen to know this company? Can I reach out?" You know, those things. So I truly believe it's not a billion or 10 billion issue, it's more like 100. >> John: And also, your other point about referencing the internet revolution as a good comparable. The other thing there is online user population was a big driver of the growth of that. So what's the equivalent here for online user population for AI? Is it more apps, more users? I mean, we're still early on, it's first inning. >> Yeah. We're kind of the, you know- >> What's the key metric for success of this sector? Do you have a read on that? >> I think the, you know, the number of users is a good metrics, but I think it's going to be a lot of people are going to use AI services without even knowing they're using it, right? You know, I think a lot of the applications are being already built on top of OpenAI, and then they are kind of, you know, help people to do marketing, legal documents, you know, so they're already inherently OpenAI kind of the users already. So I think yeah. >> Well, Howie, we've got to wrap, but I really appreciate you coming on. I want to give you a last minute to wrap up here. In your experience, and you've seen many waves of innovation. You've even had your hands in a lot of the big waves past three inflection points. And obviously, machine learning you're doing now, you're deep end. Why is this Supercloud movement, this wave of Supercloud and the discussion of this next inflection point, why is it so important? For the folks watching, why should they be paying attention to this particular moment in time? 
Could you share your super clip on Supercloud? >> Howie: Right. So this is simple from my point of view. So why do you even have cloud to begin with, right? IT is too complex, too complex to operate or too expensive. So there's a newer model. There is a better model, right? Let someone else operate it, there is elasticity out of it, right? That's great. Until you have multiple vendors, right? Many vendors even, you know, we're talking about kind of how to make multiple vendors look like the same, but frankly speaking, even one vendor has, you know, thousand services. Now it's kind of getting, what Kid was talking about what, cloud chaos, right? It's the evolution. You know, the history repeats itself, right? You know, you have, you know, next great things and then too many great things, and then people need to sort of abstract this out. So it's almost that you must do this. But I think how to abstract this out is something that at this time, AI is going to help a lot, right? You know, like I mentioned, right? A lot of the abstraction, you don't have to think about API anymore. I bet 10 years from now, you know, IT is one language away, not API away. So think about that world, right? So Supercloud in, in my opinion, sure, you kind of abstract things out. You have, you know, consistent layers. But who's going to do that? Is that like we all agreed upon the model, agreed upon those APIs? Not necessary. There are certain, you know, truth in that, but there are other truths, let bots take care of, right? Whether you know, I want some X happens, whether it's going to be done by Azure, by AWS, by GCP, bots will figure out at a given time with certain contacts with your security requirement, posture requirement. I'll think that out. >> John: That's awesome. And you know, Dave, you and I have been talking about this. We think scale is the new ratification. If you have first mover advantage, I'll see the benefit, but scale is a huge thing. OpenAI, AWS. >> Howie: Yeah. 
Every day, we are using OpenAI. Today, we are labeling data for them. So you know, that's a little bit of the- (laughs) >> John: Yeah. >> First mover advantage that other people don't have, right? So it's kind of scary. So I'm very sure that Google is a little bit- (laughs) >> When we do our super AI event, you're definitely going to be keynoting. (laughs) >> Howie: I think, you know, we're talking about Supercloud, you know, before long, we are going to talk about super intelligent cloud. (laughs) >> I'm super excited, Howie, about this. Thanks for coming on. Great to see you, Howie Xu. Always a great analyst for us contributing to the community. VP of Machine Learning at Zscaler, industry legend and friend of theCUBE. Thanks for coming on and sharing really, really great advice and insight into what this next wave means. This Supercloud is the next wave. "If you're not on it, you're driftwood," says Pat Gelsinger. So you're going to see a lot more discussion. We'll be back more here live in Palo Alto after this short break. >> Thank you. (upbeat music)
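Howie's stock-price example describes a concrete pattern: when the model lacks live data, ask it to *write* a program that computes the answer, then execute that program. A minimal sketch of that generate-then-execute loop follows; the "model" here is simulated with a canned code string, where a real system would call an LLM API and sandbox the execution.

```python
# Hedged sketch of the pattern Howie describes: the model writes a small
# program for a task it can't answer directly, and the caller runs it.
# `ask_model_for_code` simulates the LLM with a canned response; it is a
# hypothetical stand-in, not a real API.

def ask_model_for_code(task):
    # Simulated LLM reply: source code that computes the answer.
    canned = {
        "average of the last three prices": (
            "def answer(prices):\n"
            "    return sum(prices[-3:]) / 3\n"
        )
    }
    return canned[task]

def run_generated_code(source, prices):
    namespace = {}
    exec(source, namespace)  # a real system would sandbox this step
    return namespace["answer"](prices)

code = ask_model_for_code("average of the last three prices")
print(run_generated_code(code, [310.0, 320.0, 330.0]))  # -> 320.0
```

The design point is the one Howie makes on stage: the complex task is broken into smaller pieces (generate code, execute it, use the result), so the model's missing live knowledge stops being a blocker.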

Published Date : Feb 17 2023



Oracle Announces MySQL HeatWave on AWS


 

>>Oracle continues to enhance my sequel Heatwave at a very rapid pace. The company is now in its fourth major release since the original announcement in December 2020. 1 of the main criticisms of my sequel, Heatwave, is that it only runs on O. C I. Oracle Cloud Infrastructure and as a lock in to Oracle's Cloud. Oracle recently announced that heat wave is now going to be available in AWS Cloud and it announced its intent to bring my sequel Heatwave to Azure. So my secret heatwave on AWS is a significant TAM expansion move for Oracle because of the momentum AWS Cloud continues to show. And evidently the Heatwave Engineering team has taken the development effort from O. C I. And is bringing that to A W S with a number of enhancements that we're gonna dig into today is senior vice president. My sequel Heatwave at Oracle is back with me on a cube conversation to discuss the latest heatwave news, and we're eager to hear any benchmarks relative to a W S or any others. Nippon has been leading the Heatwave engineering team for over 10 years and there's over 100 and 85 patents and database technology. Welcome back to the show and good to see you. >>Thank you. Very happy to be back. >>Now for those who might not have kept up with the news, uh, to kick things off, give us an overview of my sequel, Heatwave and its evolution. So far, >>so my sequel, Heat Wave, is a fully managed my secret database service offering from Oracle. Traditionally, my secret has been designed and optimised for transaction processing. So customers of my sequel then they had to run analytics or when they had to run machine learning, they would extract the data out of my sequel into some other database for doing. Unlike processing or machine learning processing my sequel, Heat provides all these capabilities built in to a single database service, which is my sequel. He'd fake So customers of my sequel don't need to move the data out with the same database. 
They can run transaction processing and predicts mixed workloads, machine learning, all with a very, very good performance in very good price performance. Furthermore, one of the design points of heat wave is is a scale out architecture, so the system continues to scale and performed very well, even when customers have very large late assignments. >>So we've seen some interesting moves by Oracle lately. The collaboration with Azure we've we've covered that pretty extensively. What was the impetus here for bringing my sequel Heatwave onto the AWS cloud? What were the drivers that you considered? >>So one of the observations is that a very large percentage of users of my sequel Heatwave, our AWS users who are migrating of Aurora or so already we see that a good percentage of my secret history of customers are migrating from GWS. However, there are some AWS customers who are still not able to migrate the O. C. I to my secret heat wave. And the reason is because of, um, exorbitant cost, which was charges. So in order to migrate the workload from AWS to go see, I digress. Charges are very high fees which becomes prohibitive for the customer or the second example we have seen is that the latency of practising a database which is outside of AWS is very high. So there's a class of customers who would like to get the benefits of my secret heatwave but were unable to do so and with this support of my secret trip inside of AWS, these customers can now get all the grease of the benefits of my secret he trip without having to pay the high fees or without having to suffer with the poorly agency, which is because of the ws architecture. >>Okay, so you're basically meeting the customer's where they are. So was this a straightforward lifted shift from from Oracle Cloud Infrastructure to AWS? >>No, it is not because one of the design girls we have with my sequel, Heatwave is that we want to provide our customers with the best price performance regardless of the cloud. 
So when we decided to offer my sequel, he headed west. Um, we have optimised my sequel Heatwave on it as well. So one of the things to point out is that this is a service with the data plane control plane and the console are natively running on AWS. And the benefits of doing so is that now we can optimise my sequel Heatwave for the E. W s architecture. In addition to that, we have also announced a bunch of new capabilities as a part of the service which will also be available to the my secret history of customers and our CI, But we just announced them and we're offering them as a part of my secret history of offering on AWS. >>So I just want to make sure I understand that it's not like you just wrapped your stack in a container and stuck it into a W s to be hosted. You're saying you're actually taking advantage of the capabilities of the AWS cloud natively? And I think you've made some other enhancements as well that you're alluding to. Can you maybe, uh, elucidate on those? Sure. >>So for status, um, we have taken the mind sequel Heatwave code and we have optimised for the It was infrastructure with its computer network. And as a result, customers get very good performance and price performance. Uh, with my secret he trade in AWS. That's one performance. Second thing is, we have designed new interactive counsel for the service, which means that customers can now provision there instances with the council. But in addition, they can also manage their schemas. They can. Then court is directly from the council. Autopilot is integrated. The council we have introduced performance monitoring, so a lot of capabilities which we have introduced as a part of the new counsel. The third thing is that we have added a bunch of new security features, uh, expose some of the security features which were part of the My Secret Enterprise edition as a part of the service, which gives customers now a choice of using these features to build more secure applications. 
And finally, we have extended my secret autopilot for a number of old gpus cases. In the past, my secret autopilot had a lot of capabilities for Benedict, and now we have augmented my secret autopilot to offer capabilities for elderly people. Includes as well. >>But there was something in your press release called Auto thread. Pooling says it provides higher and sustained throughput. High concerns concerns concurrency by determining Apple number of transactions, which should be executed. Uh, what is that all about? The auto thread pool? It seems pretty interesting. How does it affect performance? Can you help us understand that? >>Yes, and this is one of the capabilities of alluding to which we have added in my secret autopilot for transaction processing. So here is the basic idea. If you have a system where there's a large number of old EP transactions coming into it at a high degrees of concurrency in many of the existing systems of my sequel based systems, it can lead to a state where there are few transactions executing, but a bunch of them can get blocked with or a pilot tried pulling. What we basically do is we do workload aware admission control and what this does is it figures out, what's the right scheduling or all of these algorithms, so that either the transactions are executing or as soon as something frees up, they can start executing, so there's no transaction which is blocked. The advantage to the customer of this capability is twofold. A get significantly better throughput compared to service like Aurora at high levels of concurrency. So at high concurrency, for instance, uh, my secret because of this capability Uh oh, thread pulling offers up to 10 times higher compared to Aurora, that's one first benefit better throughput. 
The second advantage is that the throughput of the system never drops, even at high levels of concurrency, whereas in the case of Aurora, the throughput goes up, but then, at high concurrencies, let's say starting at, uh, a level of 500 or something, depending upon the underlying shape they're using, the throughput just drops, whereas with MySQL HeatWave the throughput never drops. Now, the ramification for the customer is that if the throughput is not going to drop, the user can start off with a small shape, get the performance, and be assured that even if the workload increases, they will never get performance which is worse than what they're getting with lower levels of concurrency. So this leads to customers provisioning a shape which is just right for them. And if they need to, they can, uh, go with a larger shape. But they don't, like, you know, overpay. So those are the two benefits: better performance, and sustained performance regardless of the level of concurrency. >>So how do we quantify that? I know you've got some benchmarks. How can you share comparisons with other cloud databases? Especially interested in Amazon's own databases, which are obviously very popular. And are you publishing those again on GitHub, as you have done in the past? Take us through the benchmarks. >>Sure. So benchmarks are important because that gives customers a sense of what performance to expect and what price performance to expect. So we have run a number of benchmarks. And yes, all these benchmarks are available on GitHub for customers to take a look at. So we have performance results on all the three classes of workloads: OLTP, analytics, and machine learning. So let's start with OLTP. For OLTP, primarily because of the auto thread pooling feature, we show that for the TPC-C 10-gigabyte dataset at high levels of concurrency, HeatWave offers up to 10 times better throughput, and this performance is sustained, whereas in the case of Aurora, the performance really drops. 
So that's the first thing: at, uh, ten terabytes... sorry, 10 gigabytes, TPC-C, at high concurrency, the throughput is 10 times better than Aurora. For analytics, we have done a comparison of MySQL HeatWave in AWS compared with Redshift, Snowflake, and Google BigQuery. We find that the price performance of MySQL HeatWave compared to Redshift is seven times better. So MySQL HeatWave in AWS provides seven times better price performance than Redshift. That's a very, uh, interesting result to us. Which means that customers of Redshift are really going to take the service seriously, because they're gonna get seven times better price performance. And this is all running in AWS, so compared... >>Okay, carry on. >>And then I was gonna say, compared to, like, Snowflake, uh, in AWS we offer 10 times better price performance. And compared to Google BigQuery, we offer 12 times better price performance. And this is based on a 4-terabyte TPC-H workload. Results are available on GitHub. And then the third category is machine learning, and for machine learning, uh, for training, the performance of MySQL HeatWave is 25 times faster compared to Redshift. So for all the three workloads we have benchmark results, and all of these scripts are available on GitHub. >>Okay, so you're comparing, uh, MySQL HeatWave on AWS to Redshift and Snowflake on AWS. And you're comparing MySQL HeatWave on AWS to BigQuery, obviously running on Google. Um, you know, one of the things Oracle has done in the past, when you give the price performance, and I've always tried to call fouls: you're, like, doubling your price for running the Oracle database, uh, not HeatWave, but Oracle Database, on AWS, and then you'll show how it's so much cheaper on Oracle, and we'll be like, okay, come on. But they're not doing that here. You're basically taking MySQL HeatWave on AWS. I presume you're using the same pricing for whatever EC2 or whatever else you're using. 
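A quick aside on the arithmetic: "price performance" is just work done per dollar, so a 7x figure like the one quoted against Redshift falls out of dividing two such ratios. The numbers below are purely illustrative, not the published benchmark figures.

```python
def price_performance_ratio(queries_per_hour, dollars_per_hour):
    """Higher is better: useful work done per dollar spent."""
    return queries_per_hour / dollars_per_hour

# Illustrative numbers only, not measured results:
heatwave = price_performance_ratio(queries_per_hour=1400, dollars_per_hour=10)
other_db = price_performance_ratio(queries_per_hour=400, dollars_per_hour=20)

advantage = heatwave / other_db
print(advantage)  # -> 7.0
```

Note this is why a system can claim a large price-performance edge even when its raw speed advantage is smaller: both the throughput numerator and the cost denominator contribute.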
Storage, um, reserved instances. That's apples to apples on AWS. And you have to obviously do some kind of mapping for Google, for BigQuery. Can you just verify that for me? >>We are being more than fair, on two dimensions. The first thing is, when I'm talking about the price performance for analytics, right, for, uh, MySQL HeatWave, the cost I'm talking about for MySQL HeatWave is the cost of running transaction processing, analytics, and machine learning. So it's a fully loaded cost for the case of MySQL HeatWave. Whereas when I'm talking about Redshift, when I'm talking about Snowflake, I'm just talking about the cost of these databases for running analytics only; it's not including the source database, which may be Aurora or some other database, right? So that's the first aspect: that for, uh, HeatWave, it's the cost for running all three kinds of workloads, whereas for the competition, it's only for running analytics. The second thing is that for those other services, whether it's Redshift or Snowflake, right, we're talking about the one-year, fully-paid-up-front cost, right? So that's what most of the customers would pay. Many of the customers would pay that; they will sign a one-year contract and pay all the costs ahead of time, because they get a discount. So we're using that price, and in the case of Snowflake, the cost we're using is their standard edition price, not the enterprise edition price. So yes, uh, we're more than fair in this comparison. >>Yeah, I think that's an important point. I saw an analysis by Marc Staimer on Wikibon, where he was doing the TCO comparisons. And I mean, if you have to use two separate databases and two separate licences, and you have to do ETL-ing and all the labour associated with that, that's a big deal, and you're not even including that aspect in your comparison. So that's pretty impressive. To what do you attribute that? 
You know, given that, unlike, uh, OCI, within the AWS cloud you don't have as much control over the underlying hardware. >>So look, hardware is one aspect. Okay, so there are three things which give us this advantage. The first thing is, uh, we have designed HeatWave for a scale-out architecture. So we came up with new algorithms; one of the design points for HeatWave is a massively partitioned architecture, which leads to a very high degree of parallelism. So that's how HeatWave was built. That's the first part. The second thing is that, although we don't have control over the hardware, the second design point for HeatWave is that it is optimised for commodity cloud and commodity infrastructure. So we know, say, what compute we get, how much network bandwidth we get, how much, like, object store bandwidth we get in AWS, and we have tuned HeatWave for that. That's the second point. And the third thing is MySQL Autopilot, which provides machine-learning-based automation. So what it does is that, as the user's workload is running, it learns from it, it improves, uh, various parameters in the system. So the system keeps getting better as it learns from more and more queries. And this is the third thing, uh, as a result of which we get a significant edge over the competition. >>Interesting. I mean, look, any ISV can go on any cloud and take advantage of it. And that's, uh, I love it. We live in a new world. How about machine learning workloads? What did you see there in terms of performance and benchmarks? >>Right. So for machine learning, we offer three capabilities: training, which is fully automated, inference, and explanations. So one of the things which many of our customers, coming from the enterprise, told us is that explanations are very important to them, because, uh, customers want to know why the system, uh, chose a certain prediction. 
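The "massively partitioned architecture" mentioned earlier in that answer can be illustrated with a toy parallel aggregation: split the data into many partitions, scan each partition independently, then combine the partial results. This is a sketch of the general scale-out pattern, not HeatWave's actual engine.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, n_parts):
    """Split data into n_parts roughly equal partitions."""
    return [data[i::n_parts] for i in range(n_parts)]

def scan_partition(part):
    # Each partition is scanned independently and in parallel;
    # here the "query" is a filtered aggregate over even values.
    return sum(x for x in part if x % 2 == 0)

data = list(range(1000))
parts = partition(data, n_parts=16)
with ThreadPoolExecutor(max_workers=16) as pool:
    partials = list(pool.map(scan_partition, parts))
total = sum(partials)  # combine the per-partition partial aggregates
print(total)  # same answer as a single-threaded scan of all the data
```

The design point is that adding partitions (and the cores to scan them) scales the work nearly linearly, because no partition depends on another until the cheap final combine step.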
So we offer explanations for all models which have been trained by HeatWave. That's the first thing. Now, one of the interesting things about training is that training is usually the most expensive phase of machine learning. So we have spent a lot of time improving the performance of training. So we have a bunch of techniques which we have developed inside of Oracle to improve the training process. For instance, we have, uh, meta-learned proxy models, which really give us an advantage. We use adaptive sampling. We have, uh, invented techniques for parallelizing the hyperparameter search. So as a result of a lot of this work, our training is about 25 times faster compared to Redshift ML, and all the data is, uh, inside the database. All this processing is being done inside the database, so it's much faster; it is inside the database. And I want to point out that there is no additional charge for the HeatWave customers, because we're using the same cluster; we're not invoking another service. So all of these machine learning capabilities are being offered at no additional charge inside the database, and at a performance which is significantly faster than Redshift ML. >>Are you taking advantage of, or is there any, uh, need... not need, but any advantage that you can get by exploiting things like Graviton? We've talked about that a little bit in the past. Or Trainium. Um, you just mentioned training, so custom silicon that AWS is doing. You're taking advantage of that? Do you need to? Can you give us some insight there? >>So there are two things, right? We're always evaluating what are the choices we have from a hardware perspective, what's there for us to leverage, right? And, like, all the things you mention, we have considered them. But there are two things to consider. One is, HeatWave is a memory-intensive system. So for HeatWave, memory is the dominant cost. The processor is a portion of the cost, but memory is the dominant cost. 
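"Adaptive sampling", as mentioned a moment ago, generally means training on a small sample first and growing the sample only while results keep improving. The sketch below is my own generic illustration of that technique, not Oracle's internal code; the scoring function, growth factor, and stopping tolerance are all made-up stand-ins.

```python
def train_and_score(sample):
    """Stand-in for a real training run; returns a quality score.
    Here the score improves with sample size but saturates."""
    return 1.0 - 1.0 / (1 + len(sample) / 100)

def adaptive_sample_train(data, start=100, growth=2, tol=0.01):
    """Grow the training sample only while the score keeps improving
    by more than `tol`; stop early once gains become marginal."""
    size = start
    best = train_and_score(data[:size])
    while size < len(data):
        size = min(size * growth, len(data))
        score = train_and_score(data[:size])
        if score - best < tol:  # marginal gain too small: stop growing
            break
        best = score
    return size, best

data = list(range(100_000))
used, score = adaptive_sample_train(data)
print(used, len(data))  # stops well short of the full dataset
```

The speedup comes from never paying for the full dataset when a fraction of it already determines which model configuration wins; the parallel hyperparameter search mentioned in the same breath is a separate, complementary optimisation.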
So what we have evaluated and found is that the current shape which we are using is going to provide our customers with the best price performance. That's the first thing. The second thing is that there are opportunities at times when we can use a specialised processor for accelerating the workload a bit. But then it becomes a matter of the cost to the customer. The advantage of our current architecture is that, on the same hardware, customers are getting very good OLTP performance, very good analytics performance, and very good machine learning performance. If we were to go with a specialised processor, it may accelerate machine learning, but then it's an additional cost which the customers would need to pay. So we are very sensitive to the customers' request, which is usually to provide very good performance at a very low cost. And we feel that the current design we have is providing customers very good performance and very good price performance. >>So part of that is architectural, the memory-intensive nature of HeatWave. The other is AWS pricing. If AWS pricing were to flip, it might make more sense for you to take advantage of something like, like, Trainium. Okay, great. Thank you. And welcome back to the benchmarks. Benchmarks, sometimes they're artificial, right? A car can go from 0 to 60 in two seconds, but I might not be able to experience that level of performance. Do you have any real-world numbers from customers that have used MySQL HeatWave on AWS, and how they look at performance? >>Yes, absolutely. So the MySQL HeatWave service on AWS has been in preview, like, since November, right? So we have a lot of customers who have tried the service. And what actually we have found is that many of these customers, um, are planning to migrate from Aurora to MySQL HeatWave. And what they find is that the performance difference is actually much more pronounced than what I was talking about. 
Because with Aurora, the performance is actually much poorer compared to, uh, like, what I've talked about. So in some of these cases, the customers found improvement from 60 times to 240 times, right? So HeatWave was 100 to 240 times faster. It was much less expensive. And the third thing, which is, you know, noteworthy, is that customers don't need to change their applications. So if you ask the top three reasons why customers are migrating, it's because of this: no change to the application, much faster, and it is cheaper. So in some cases, like Johnny Bytes, what they found is that the performance of their applications for the complex queries was about 60 to 90 times faster. Then we had 60 Technologies; what they found is that the performance of HeatWave compared to Aurora was 139 times faster. So, yes, we do have many such examples from real workloads from customers who have tried it. And all across, what we find is that it offers better performance, lower cost, and a single database, such that it is compatible with all existing MySQL-based applications and workloads. >>Really impressive. The analysts I talk to, they're all gaga over HeatWave, and I can see why. Okay, last question. Maybe two in one. Uh, what's next in terms of new capabilities that customers are going to be able to leverage, and any other clouds that you're thinking about? We talked about that upfront, but... >>So in terms of the capabilities, you have seen that we have been, you know, non-stop attending to the feedback from the customers and reacting to it. And also, we have been innovating, like, organically. So that's something which is going to continue. So, yes, you can fully expect that HeatWave will not rest and will continue to innovate. And with respect to the other clouds, yes, we are planning to support MySQL HeatWave on Azure, and this is something that will be announced in the near future. Great. >>All right, thank you. Really appreciate the overview. 
Congratulations on the work. Really exciting news that you're moving MySQL HeatWave into other clouds. It's something that we've been expecting for some time. So it's great to see you guys, uh, making that move, and as always, great to have you on theCUBE. >>Thank you for the opportunity. >>All right. And thank you for watching this special CUBE conversation. I'm Dave Vellante, and we'll see you next time.

Published Date : Sep 14 2022

Luis Ceze, OctoML | Amazon re:MARS 2022


 

(upbeat music) >> Welcome back, everyone, to theCUBE's coverage here live on the floor at AWS re:MARS 2022. I'm John Furrier, host for theCUBE. Great event, machine learning, automation, robotics, space, that's MARS. It's part of the re-series of events, re:Invent's the big event at the end of the year, re:Inforce, security, re:MARS, really intersection of the future of space, industrial, automation, which is very heavily DevOps machine learning, of course, machine learning, which is AI. We have Luis Ceze here, who's the CEO co-founder of OctoML. Welcome to theCUBE. >> Thank you very much for having me in the show, John. >> So we've been following you guys. You guys are a growing startup funded by Madrona Venture Capital, one of your backers. You guys are here at the show. This is a, I would say small show relative what it's going to be, but a lot of robotics, a lot of space, a lot of industrial kind of edge, but machine learning is the centerpiece of this trend. You guys are in the middle of it. Tell us your story. >> Absolutely, yeah. So our mission is to make machine learning sustainable and accessible to everyone. So I say sustainable because it means we're going to make it faster and more efficient. You know, use less human effort, and accessible to everyone, accessible to as many developers as possible, and also accessible in any device. So, we started from an open source project that began at University of Washington, where I'm a professor there. And several of the co-founders were PhD students there. We started with this open source project called Apache TVM that had actually contributions and collaborations from Amazon and a bunch of other big tech companies. And that allows you to get a machine learning model and run on any hardware, like run on CPUs, GPUs, various GPUs, accelerators, and so on. It was the kernel of our company and the project's been around for about six years or so. Company is about three years old. 
And we grew from Apache TVM into a whole platform that essentially supports any model on any hardware, cloud and edge. >> So is the thesis that, when it first started, that you want to be agnostic on platform? >> Agnostic on hardware, that's right. >> Hardware, hardware. >> Yeah. >> What was it like back then? What kind of hardware were you talking about back then? Cause a lot's changed, certainly on the silicon side. >> Luis: Absolutely, yeah. >> So take me through the journey, 'cause I could see the progression. I'm connecting the dots here. >> So once upon a time, yeah, no... (both chuckling) >> I walked in the snow with my bare feet. >> You have to be careful because if you wake up the professor in me, then you're going to be here for two hours, you know. >> Fast forward. >> The abridged version here is that, clearly, machine learning has shown to actually solve real interesting, high value problems. And where machine learning runs in the end, it becomes code that runs on different hardware, right? And when we started Apache TVM, which stands for tensor virtual machine, at that time it was just beginning to start using GPUs for machine learning. We already saw that, with a bunch of machine learning models popping up and CPUs and GPUs starting to be used for machine learning, it was clear that there would come an opportunity to run everywhere. >> And GPUs were coming fast. >> GPUs were coming, and a huge diversity of CPUs, of GPUs and accelerators now, and the ecosystem and the system software that maps models to hardware is still very fragmented today. So hardware vendors have their own specific stacks. So Nvidia has its own software stack, and so does Intel, AMD. And honestly, I mean, I hope I'm not being, you know, too controversial here to say that it kind of looks like the mainframe era. We had tight coupling between hardware and software. You know, if you bought IBM hardware, you had to buy IBM OS and IBM database, IBM applications, it all tightly coupled. 
And if you want to use IBM software, you had to buy IBM hardware. So that's kind of like what machine learning systems look like today. If you buy a certain big name GPU, you've got to use their software. Even if you use their software, which is pretty good, you have to buy their GPUs, right? So, but you know, we wanted to help peel away the model and the software infrastructure from the hardware to give people choice, the ability to run the models where it best suits them. Right? So that includes picking the best instance in the cloud, that's going to give you the right, you know, cost properties, performance properties, or you might want to run it on the edge. You might run it on an accelerator. >> What year was that roughly, when you were doing this? >> We started that project in 2015, 2016. >> Yeah. So that was pre-conventional wisdom. I think TensorFlow wasn't even around yet. >> Luis: No, it wasn't. >> It was, I'm thinking like 2017 or so. >> Luis: Right. >> So that was the beginning of, okay, this is opportunity. AWS, I don't think they had released some of the nitro stuff that Hamilton was working on. So, they were already kind of going that way. It's kind of like converging. >> Luis: Yeah. >> The space was happening, exploding. >> Right. And the way that was dealt with, and to this day, you know, to a large extent as well, is by backing machine learning models with a bunch of hardware-specific libraries. And we were some of the first ones to say, like, you know what, let's take a compilation approach: take a model and compile it to very efficient code for that specific hardware. And what underpins all of that is using machine learning for machine learning code optimization. Right? But it was way back when. We can talk about where we are today. >> No, let's fast forward. >> That's the beginning of the open source project. >> But that was a fundamental belief, a worldview there. 
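The compilation approach described above can be caricatured in a few lines: one abstract model, lowered by per-target rules to different hardware-specific implementations, instead of being hard-wired to one vendor's library. This is a conceptual toy, not TVM's real API; every name in it (targets, kernels, ops) is invented for illustration.

```python
# A toy "compiler": one abstract op graph, many hardware targets.
# Kernel names below are illustrative placeholders, not real libraries.
MODEL = [("matmul", 64), ("relu", None), ("matmul", 10)]

LOWERING_RULES = {
    "cpu":   {"matmul": "cblas_sgemm",      "relu": "vectorized_max0"},
    "gpu":   {"matmul": "cuda_gemm_kernel", "relu": "cuda_relu_kernel"},
    "accel": {"matmul": "npu_mac_array",    "relu": "npu_activation"},
}

def compile_model(model, target):
    """Lower each abstract op to the chosen target's implementation."""
    rules = LOWERING_RULES[target]
    return [rules[op] for op, _ in model]

for target in LOWERING_RULES:
    print(target, compile_model(MODEL, target))
```

Supporting a new chip then means adding one row of lowering rules, not rewriting the model; in a real compiler like TVM, machine learning additionally searches over many possible lowerings to pick the fastest one, which is the "ML for ML code optimization" point above.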
I mean, you had a worldview that was logical when you compare it to the mainframe, but not obvious to the machine learning community. Okay, good call, check. Now let's fast forward, okay. Evolution, we'll go through the speed of the years. More chips are coming, you got GPUs, and seeing what's going on in AWS. Wow! Now it's booming. Now I got unlimited processors, I got silicon on chips, I got, everywhere >> Yeah. And what's interesting is that the ecosystem got even more complex, in fact. Because now there's a cross product between machine learning models, frameworks like TensorFlow, PyTorch, Keras, and so on, and then hardware targets. So how do you navigate that? What we want here, our vision is to say, folks should focus, people should focus on making the machine learning models do what they want to do, that solves a problem of high value to them. Right? So model deployment should be completely automatic. Today, it's very, very manual to a large extent. So once you're serious about deploying a machine learning model, you've got to get a good understanding of where you're going to deploy it, how you're going to deploy it, and then, you know, pick out the right libraries and compilers, and we automated the whole thing in our platform. This is why you see the tagline, the booth is right there, like bringing DevOps agility for machine learning, because our mission is to make that fully transparent. >> Well, I think that, first of all, I use that line here, cause I'm looking at it here live on camera. People can't see, but it's like, I use it on a couple of my interviews because the word agility is very interesting, because that's kind of the test on any kind of approach these days. Agility could be, and I talked to the robotics guys, just having their product be more agile. 
I talked to Pepsi here just before you came on, they had this large scale data environment because they built an architecture, but that fostered agility. So again, this is an architectural concept, it's a systems' view of agility being the output, and removing dependencies, which I think what you guys were trying to do. >> Only part of what we do. Right? So agility means a bunch of things. First, you know-- >> Yeah explain. >> Today it takes a couple months to get a model from, when the model's ready, to production, why not turn that in two hours. Agile, literally, physically agile, in terms of walk off time. Right? And then the other thing is give you flexibility to choose where your model should run. So, in our deployment, between the demo and the platform expansion that we announced yesterday, you know, we give the ability of getting your model and, you know, get it compiled, get it optimized for any instance in the cloud and automatically move it around. Today, that's not the case. You have to pick one instance and that's what you do. And then you might auto scale with that one instance. So we give the agility of actually running and scaling the model the way you want, and the way it gives you the right SLAs. >> Yeah, I think Swami was mentioning that, not specifically that use case for you, but that use case generally, that scale being moving things around, making them faster, not having to do that integration work. >> Scale, and run the models where they need to run. Like some day you want to have a large scale deployment in the cloud. You're going to have models in the edge for various reasons because speed of light is limited. We cannot make lights faster. So, you know, got to have some, that's a physics there you cannot change. There's privacy reasons. You want to keep data locally, not send it around to run the model locally. So anyways, and giving the flexibility. >> Let me jump in real quick. 
I want to ask this specific question because you made me think of something. So we're just having a data mesh conversation. And one of the comments that's come out of a few of these data-as-code conversations is data's the product now. So if you can move data to the edge, which everyone's talking about, you know, why move data if you don't have to, but I can move a machine learning algorithm to the edge. Cause it's costly to move data. I can move compute, everyone knows that. But now I can move machine learning to anywhere else and not worry about integrating on the fly. So the model is the code. >> It is the product. >> Yeah. And since you said, the model is the code, okay, now we're talking even more here. So machine learning models today are not treated as code, by the way. So they do not have any of the typical properties of code: whenever you write a piece of code and you run the code, you don't even think about what CPU it runs on, what kind of instance it runs on. But with a machine learning model, you do. So what we have done is create this fully transparent, automated way of allowing you to treat your machine learning models as if they were a regular function that you call, and then that function could run anywhere. >> Yeah. >> Right. >> That's why-- >> That's better. >> Bringing DevOps agility-- >> That's better. >> Yeah. And you can use existing-- >> That's better, because I can run it on the Artemis too, in space. >> You could, yeah. >> If they have the hardware. (both laugh) >> And that allows you to run your existing, continue to use your existing DevOps infrastructure and your existing people. >> So I have to ask you, cause since you're a professor, this is like a masterclass on theCube. Thank you for coming on, Professor. (Luis laughing) I'm a hardware guy. I'm building hardware for Boston Dynamics, Spot, the dog, that's the diversity in hardware, it tends to be purpose driven. 
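The "model as a regular function" idea just discussed can be sketched as a thin wrapper: the model is exposed as a plain callable, with the execution target chosen at deploy time rather than baked in. The class, the target names, and the `deploy` method are all my own illustration, not OctoML's product API.

```python
class DeployableModel:
    """Toy wrapper: a model exposed as a plain function, with the
    execution target chosen at deploy time rather than hard-coded."""

    def __init__(self, fn):
        self.fn = fn
        self.target = "local"

    def deploy(self, target):
        # e.g. "cloud-c6i", "edge-arm", "local" -- illustrative names
        self.target = target
        return self

    def __call__(self, x):
        # In a real system this would dispatch to the compiled artifact
        # for self.target; here every target runs the same Python fn.
        return self.fn(x)

model = DeployableModel(lambda x: x * 2).deploy("edge-arm")
print(model(21))  # the caller never thinks about CPUs or instances
```

The point is the interface: once a model behaves like an ordinary function, existing DevOps tooling (CI/CD, versioning, rollbacks) applies to it unchanged, which is the agility claim being made here.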
I got a spaceship, I'm going to have hardware on there. >> Luis: Right. >> It's generally viewed in the community here, that everyone I talk to and other communities, open source is going to drive all software. That's a check. But the scale and integration is super important. And they're also recognizing that hardware is really about the software. And they even said on stage, here. Hardware is not about the hardware, it's about the software. So if you believe that to be true, then your model checks all the boxes. Are people getting this? >> I think they're starting to. Here is why, right. A lot of companies that were hardware first, that thought about software too late, aren't making it. Right? There's a large number of hardware companies, AI chip companies that aren't making it. Probably some of them that won't make it, unfortunately just because they started thinking about software too late. I'm so glad to see a lot of the early, I hope I'm not just doing our own horn here, but Apache TVM, the infrastructure that we built to map models to different hardware, it's very flexible. So we see a lot of emerging chip companies like SiMa.ai's been doing fantastic work, and they use Apache TVM to map algorithms to their hardware. And there's a bunch of others that are also using Apache TVM. That's because you have, you know, an opening infrastructure that keeps it up to date with all the machine learning frameworks and models and allows you to extend to the chips that you want. So these companies pay attention that early, gives them a much higher fighting chance, I'd say. >> Well, first of all, not only are you backable by the VCs cause you have pedigree, you're a professor, you're smart, and you get good recruiting-- >> Luis: I don't know about the smart part. >> And you get good recruiting for PhDs out of University of Washington, which is not too shabby computer science department. But they want to make money. The VCs want to make money. >> Right. 
>> So you have to make money. So what's the pitch? What's the business model? >> Yeah. Absolutely. >> Share with us what you're thinking there. >> Yeah. The value of using our solution is shorter time to value for your model, from months to hours. Second, you shrink OpEx, operating expenses, because you don't need a specialized, expensive team. Talk about expensive: expensive engineers who can understand machine learning hardware and software engineering to deploy models. You don't need those teams if you use this automated solution, right? Then you reduce that. And also, in the process of actually getting a model and getting it specialized to the hardware, making it hardware aware, we're talking about a very significant performance improvement that leads to lower cost of deployment in the cloud. We're talking about very significant reduction in costs in cloud deployment. And also enabling new applications on the edge that weren't possible before. It creates, you know, latent value opportunities. Right? So, that's the high level value pitch. But how do we make money? Well, we charge for access to the platform. Right? >> Usage. Consumption. >> Yeah, and value based. Yeah, so it's consumption and value based. So it depends on the scale of the deployment. If you're going to deploy a machine learning model at a larger scale, chances are that it produces a lot of value. So then we'll capture some of that value in our pricing scale. >> So, you have a direct sales force then to work those deals. >> Exactly. >> Got it. How many customers do you have? Just curious. >> So we started, the SaaS platform just launched now. So we started onboarding customers. We've been building this for a while. We have a bunch of, you know, partners that we can talk about openly, like, you know, revenue generating partners, that's fair to say. We work closely with Qualcomm to enable Snapdragon on TVM and hence our platform. We're close with AMD as well, enabling AMD hardware on the platform. 
We've been working closely with two hyperscaler cloud providers that-- >> I wonder who they are. >> I don't know who they are, right. >> Both start with the letter A. >> And they're both here, right. What is that? >> They both start with the letter A. >> Oh, that's right. >> I won't give it away. (laughing) >> Don't give it away. >> One has three, one has four. (both laugh) >> I'm guessing, by the way. >> Then we have customers in the, actually, early customers have been using the platform from the beginning, in the consumer electronics space in Japan, you know, self-driving car technology as well. As well as some AI-first companies whose core value, the core business, comes from AI models. >> So, serious, serious customers. They got deep tech chops. They're integrating, they see this as a strategic part of their architecture. >> That's what I call AI native, exactly. But now there's, we have several enterprise customers in line now, we've been talking to. Of course, because now we launched the platform, now we started onboarding and exploring how we're going to serve it to these customers. But it's pretty clear that our technology can solve a lot of other pain points right now. And we're going to work with them as early customers to go and refine it. >> So, do you sell to the little guys, like us? Will we be customers if we wanted to be? >> You could, absolutely, yeah. >> What do we have to do, have machine learning folks on staff? >> So, here's what you're going to have to do. Since you can see the booth, others can't. No, but they can certainly, you can try our demo. >> OctoML. >> And you should look at the transparent AI app that's compiled and optimized with our flow, and deployed and built with our flow. That allows you to get your image and do style transfer. You know, you can get you and a pineapple and see what you look like with a pineapple texture. >> We got a lot of transcript and video data. >> Right. Yeah. Right, exactly.
So, you can use that. Then there's a very clear-- >> But I could use it. You're not blocking me from using it. Everyone's, it's pretty much democratized. >> You can try the demo, and then you can request access to the platform. >> But you get a lot of more serious, deeper customers. But you can serve anybody, what you're saying. >> Luis: We can serve anybody, yeah. >> All right, so what's the vision going forward? Let me ask this. When did people start getting the epiphany of removing the machine learning from the hardware? Was it recently, a couple years ago? >> Well, on the research side, we helped start that trend a while ago. I don't need to repeat that. But I think the vision that's important here, I want the audience here to take away, is that there's a lot of progress being made in creating machine learning models. So, there's fantastic tools to deal with training data, and creating the models, and so on. And now there's a bunch of models that can solve real problems there. The question is, how do you very easily integrate that into your intelligent applications? Madrona Venture Group has been very vocal and investing heavily in intelligent applications, both end-user applications as well as enablers. So we see ourselves as an enabler of that, because it's so easy to use our flow to get a model integrated into your application. Now, any regular software developer can integrate that. And that's just the beginning, right? Because, you know, now we have CI/CD integration to keep your models up to date, to continue to integrate, and then there's more downstream support for other features that you normally have in regular software development. >> I've been thinking about this for a long, long, time. And I think this whole code, no one thinks about code. Like, I write code, I'm deploying it. I think this idea of machine learning as code, independent of other dependencies, is really amazing. It's so obvious now that you say it. What are the choices now?
Let's just say that, I buy it, I love it, I'm using it. Now what do I got to do if I want to deploy it? Do I have to pick processors? Are there verified platforms that you support? Is there a short list? Is there every piece of hardware? >> We actually can help you. I hope we're not saying we can do everything in the world here, but we can help you with that. So, here's how. When you have the model in the platform you can actually see how this model runs on any instance of any cloud, by the way. So we support all the three major cloud providers. And then you can make decisions. For example, if you care about latency, your model has to run in, at most, 50 milliseconds, because you're going to have interactivity. And then, after that, you don't care if it's faster. All you care is that, is it going to run cheap enough. So we can help you navigate. And we're also going to make it automatic. >> It's like tire kicking in the dealer showroom. >> Right. >> You can test everything out, you can see the simulation. Are they simulations, or are they real tests? >> Oh, no, we run all in real hardware. So, we have, as I said, we support any instances of any of the major clouds. We actually run on the cloud. But we also support a select number of edge devices today, like ARMs and Nvidia Jetsons. And we have the OctoML cloud, which is a bunch of racks with a bunch of Raspberry Pis and Nvidia Jetsons, and very soon, a bunch of mobile phones there too, that can actually run the real hardware, and validate it, and test it out, so you can see that your model runs performantly and economically enough in the cloud. And it can run on the edge devices-- >> You're a machine learning as a service. Would that be accurate? >> That's part of it, because we're not doing the machine learning model itself. You come with a model and we make it deployable and make it ready to deploy. So, here's why it's important. Let me try.
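(An editorial aside: the selection logic Luis describes, keep only the targets that meet the latency budget, then take the cheapest, reduces to a few lines. This is a sketch with made-up instance names and numbers, not OctoML's actual implementation.)

```python
# Hypothetical benchmark results: (target, latency_ms, cost_per_hour).
# Names and figures are illustrative only.
benchmarks = [
    ("cloud-a.large-gpu", 12.0, 3.20),
    ("cloud-a.small-cpu", 48.0, 0.10),
    ("cloud-b.medium-cpu", 65.0, 0.08),
    ("edge.jetson", 40.0, 0.05),
]

def pick_target(benchmarks, latency_budget_ms):
    """Keep only targets that meet the latency budget, then take the
    cheapest. Beyond the budget, extra speed adds nothing; only cost
    matters."""
    feasible = [b for b in benchmarks if b[1] <= latency_budget_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda b: b[2])

print(pick_target(benchmarks, 50.0))  # -> ('edge.jetson', 40.0, 0.05)
```

In practice a platform would measure these numbers on real hardware, as described in the interview, rather than take them as given.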
There's a large number of really interesting companies that do API models, as in API as a service. You have an NLP model, you have computer vision models, where you call an API endpoint in the cloud. You send an image and you get a description, for example. But it is using a third party. Now, if you want to have your model on your infrastructure but have the same convenience as an API, you can use our service. So, today, chances are that, if you have a model that you know that you want to do, there might not be an API for it; we actually automatically create the API for you. >> Okay, so that's why I get the DevOps agility for machine learning is a better description. Cause it's not, you're not providing the service. You're providing the service of deploying it, like DevOps infrastructure as code. You're now ML as code. >> It's your model, your API, your infrastructure, but all of the convenience of having it ready to go, fully automatic, hands off. >> Cause I think what's interesting about this is that it brings the craftsmanship back to machine learning. Cause it's a craft. I mean, let's face it. >> Yeah. I want human brains, which are very precious resources, to focus on building those models that are going to solve business problems. I don't want these very smart human brains figuring out how to scrub this into actually getting run the right way. This should be automatic. That's why we use machine learning, for machine learning to solve that. >> Here's an idea for you. We should write a book called, The Lean Machine Learning. Cause the lean startup was all about DevOps. >> Luis: We'd call it machine leaning. No, that's not going to work. (laughs) >> Remember when iteration was the big mantra. Oh, yeah, iterate. You know, that was from DevOps. >> Yeah, that's right. >> This code allowed for standing up stuff fast, double down, we all know the history, what it turned out. That was a good value for developers. >> I couldn't agree more.
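(An editorial aside: the "your model, your API" idea above, wrapping an arbitrary model function behind a JSON-in/JSON-out interface, can be sketched as follows. The function names and payload shape are hypothetical, not OctoML's actual generated API.)

```python
import json

def dummy_model(image_ref: bytes) -> dict:
    # Stand-in for a compiled model; a real one would run inference
    # on the referenced input.
    return {"caption": "a pineapple", "confidence": 0.9}

def make_api(model_fn):
    """Wrap any model function behind a JSON-in/JSON-out handler,
    the general shape of an auto-generated inference API."""
    def handler(request_body: bytes) -> bytes:
        payload = json.loads(request_body)
        result = model_fn(payload["input"].encode())
        return json.dumps({"result": result}).encode()
    return handler

api = make_api(dummy_model)
response = api(json.dumps({"input": "img-001"}).encode())
print(response)
```

A real deployment would put this handler behind an HTTP server and the compiled model behind `model_fn`; the point is that the API surface is independent of which hardware the model was compiled for.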
If you don't mind me building on that point. You know, something we see at OctoML, but we also see at Madrona as well. Seeing that there's a trend towards best in breed for each one of the stages of getting a model deployed. From the data aspect of creating the data, and then to the model creation aspect, to the model deployment, and even model monitoring. Right? We develop integrations with all the major pieces of the ecosystem, such that you can integrate, say, with model monitoring to go and monitor how a model is doing. Just like you monitor how code is doing in deployment in the cloud. >> It's evolution. I think it's a great step. And again, I love the analogy to the mainstream. I lived during those days. I remember the monolithic proprietary stacks, and then, you know, the OSI model kind of blew it open. But that OSI stack never went full stack, and it only stopped at TCP/IP. So, I think the same thing's going on here. You see some scalability around it to try to uncouple it, free it. >> Absolutely. And sustainability and accessibility to make it run faster and make it run on any device that you want, by any developer. So, that's the tagline. >> Luis Ceze, thanks for coming on. Professor. >> Thank you. >> I didn't know you were a professor. That's great to have you on. It was a masterclass in DevOps agility for machine learning. Thanks for coming on. Appreciate it. >> Thank you very much. Thank you. >> Congratulations, again. All right. OctoML here on theCube. Really important. Uncoupling the machine learning from the hardware, specifically. That's only going to make this space faster and safer, and more reliable. And that's where the whole theme of re:MARS is. Let's see how they fit in. I'm John for theCube. Thanks for watching. More coverage after this short break. >> Luis: Thank you. (gentle music)

Published Date : Jun 24 2022


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Luis Ceze | PERSON | 0.99+
Qualcomm | ORGANIZATION | 0.99+
Luis | PERSON | 0.99+
2015 | DATE | 0.99+
John | PERSON | 0.99+
John Furrier | PERSON | 0.99+
Boston Dynamics | ORGANIZATION | 0.99+
two hours | QUANTITY | 0.99+
Nvidia | ORGANIZATION | 0.99+
2017 | DATE | 0.99+
Japan | LOCATION | 0.99+
Madrona Venture Capital | ORGANIZATION | 0.99+
AMD | ORGANIZATION | 0.99+
one | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
three | QUANTITY | 0.99+
IBM | ORGANIZATION | 0.99+
One | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
four | QUANTITY | 0.99+
2016 | DATE | 0.99+
University of Washington | ORGANIZATION | 0.99+
Today | DATE | 0.99+
Pepsi | ORGANIZATION | 0.99+
Both | QUANTITY | 0.99+
yesterday | DATE | 0.99+
First | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Second | QUANTITY | 0.99+
today | DATE | 0.99+
SiMa.ai | ORGANIZATION | 0.99+
OctoML | TITLE | 0.99+
OctoML | ORGANIZATION | 0.99+
Intel | ORGANIZATION | 0.98+
one instance | QUANTITY | 0.98+
DevOps | TITLE | 0.98+
Madrona Venture Group | ORGANIZATION | 0.98+
Swami | PERSON | 0.98+
Madrona | ORGANIZATION | 0.98+
about six years | QUANTITY | 0.96+
Spot | ORGANIZATION | 0.96+
The Lean Machine Learning | TITLE | 0.95+
first | QUANTITY | 0.95+
theCUBE | ORGANIZATION | 0.94+
ARMs | ORGANIZATION | 0.94+
pineapple | ORGANIZATION | 0.94+
Raspberry Pis | ORGANIZATION | 0.92+
TensorFlow | TITLE | 0.89+
Snapdragon | ORGANIZATION | 0.89+
about three years old | QUANTITY | 0.89+
a couple years ago | DATE | 0.88+
two hyperscaler cloud providers | QUANTITY | 0.88+
first ones | QUANTITY | 0.87+
one of | QUANTITY | 0.85+
50 milliseconds | QUANTITY | 0.83+
Apache TVM | ORGANIZATION | 0.82+
both laugh | QUANTITY | 0.82+
three major cloud providers | QUANTITY | 0.81+

Chris Samuels, Slalom & Bethany Petryszak Mudd, Experience Design | Snowflake Summit 2022


 

(upbeat music) >> Good morning. Welcome back to theCUBE's continuing coverage of Snowflake Summit 22, live from Las Vegas. Lisa Martin, here with Dave Villante. We are at Caesar's Forum, having lots of great conversations. As I mentioned, this is just the start of day two; a tremendous amount of content yesterday. I'm coming at you today. Two guests join us from Slalom, now, we've got Chris Samuels, Principal Machine Learning, and Bethany Mudd, Senior Director, Experience Design. Welcome to theCube, guys. >> Hi, thanks for having us. >> Thank you. >> So, Slalom and Snowflake, over 200 joint customers, over 1,800 plus engagements, lots of synergies there, partnership. We're here today to talk about intelligent products. Talk to us about what- how do you define intelligent products, and then kind of break that down? >> Yeah, I can, I can start with the simple version, right? So, when we think about intelligent products, what they're doing, is they're doing more than they were explicitly programmed to do. So, instead of having a developer write all of these rules and have, "If this, then that," right, we're using data, and real time insights to make products that are more performant and improving over time. >> Chris: Yeah, it's really bringing together an ecosystem of a series of things to have integrated capabilities working together that themselves offer constant improvement, better understanding, better flexibility, and better usability, for everyone involved. >> Lisa: And there are four pillars of intelligent products, so let's walk through those: technology, intelligence, experiences, and operations. >> Sure.
So for technology, like most modern data architectures, it has sort of a data component and it has a modern cloud platform, but here, the key is sort of things being self-contained and decoupled, such that there's better integration time, better iteration time, more cross use, and more extensibility and scalability with the cloud native portion of that. >> And the intelligence piece? >> The intelligence piece is the data that's been processed by machine learning algorithms, or by predictive analytics, that provides sort of the most valuable, or most insightful, inferences or conclusions. So, by bringing together again the tech and the intelligence, those are, you know, sort of two of the pillars that begin to move forward and enable sort of the other two pillars, which are- >> Experiences and operations. >> Yeah. >> Perfect. >> And if we think about those, all of the technology, all of the intelligence in the world, doesn't mean anything if it doesn't actually work for people. Without use, there is no value. So, as we're designing these products, we want to make sure that they're supporting people. As we're automating, there are still people accountable for those tasks. There are still impacts to people in the real world. So, we want to make sure that we're doing that intentionally. So, we're building the greater good. >> Yeah. And from the operations perspective, you can think of traditional DevOps becoming MLOps, where there's an overall platform and a framework in place to manage not only the software components of it, but the overall workflow, and the data flow, and the model life cycle, such that we have tools and people from different backgrounds and different teams developing and maintaining this, more than you would previously see with something like product engineering. >> Dave: Can you guys walk us through an example of how you work with a customer?
I'm envisioning, you know, meeting with a lot of yellow stickies, and prioritization, and I don't know if that's how it works, but take us through, like, the start and the sequence. >> You have my heart, I am a workshop lover. Anytime you have the scratch off, like, lottery stickers on something, you know it's a good one. But, as we think about our approach, we typically start with either a discovery or mobilize phase. We're really, we're starting by gathering context, and really understanding the business, the client, the users, and that full path to value. Who are all the teams that are going to have to come together and start working together to deliver this intelligent product? And once we've got that context, we can start solutioning and ideating on that. But, really it comes down to making sure that we've earned the right, and we've got the smarts, to move into the space intelligently. >> Yeah, and, truly, the intelligent product itself is sort of tied to the use case. The business knows what is potentially the most valuable here. And so, by communicating and working and co-creating with the business, we can define, then, okay, here are the use cases, and here is where machine learning and the overall intelligent product can maybe add more disruptive value than others. By saying, let's pretend that, you know, maybe your ML model or your predictive analytics is like a dial that we could turn up to 11. Which one of those dials, turned up to 11, could add the most value or disruption to your business? And therefore, you know, how can we prioritize and then work toward that pie-in-the-sky goal. >> Okay. So the client comes and says, "This is the outcome we want." Okay, and then you help them. You gather the right people, sort of extract all the little, you know, pieces of knowledge, and then help them prioritize so they can focus. And then what? >> Yeah. So, from there we're going to take the approach that seeing is solving.
We want to make sure that we get the right voices in the room, and we've got the right alignment. So, we're going to map out everything. We're going to diagram what that experience is going to look like, how technology's going to play into it, all of the roles and actors involved. We're going to draw a map of the ecosystem that everyone can understand, whether you're in marketing, or the IT sort of area, once again, so we can get crisp on that outcome and how we're going to deliver it. And, from there, we start building out that roadmap and backlog, and we deliver iteratively. So, by not thinking of things as getting to the final product after a three year push, we really want to shrink those build, measure, and learn loops. So, we're getting all of that feedback and we're listening and evolving and growing the same way that our products are. >> Yeah. Something like an intelligent product is pretty heady. So it's a pretty heavy concept to talk about. And so, the question becomes, "What is the outcome that ultimately needs to be achieved?" And then, who, from where in the business, across the different potentially business product lines or business departments, needs to be brought together? What data needs to be brought together? Such that the people, the stakeholders, can understand how they themselves can shape the product, how the product itself can be shaped. And therefore, what is the ultimate outcome, collectively, for everybody involved? 'Cause while your data might be fueling, you know, finances or someone else's intelligence and that kind of thing, bringing it all together allows for a more seamless product that might benefit more of the overall structure of the organization. >> Can you talk a little bit about how Slalom and Snowflake are enabling, like a customer example? A customer to take that data, flex that muscle, and create intelligent products that delight and surprise their customers? >> Chris: Yeah, so here's a great story.
We worked to co-create with Kawasaki Heavy Industries. So, we created an intelligent product with them to enable safer rail travel, more efficient preventative maintenance, and more efficient, real-time track status feedback to the rail operators. So, in this case, yeah, the intelligent product itself was, "Okay, how do you create a better rail monitoring service?" And while that itself was the primary driver of the data, multiple other parts of the organization are using sort of the intelligent product as part of their now daily routine, whether it's from the preventative maintenance perspective, or it's from route usage, route prediction. Or, indeed, helping KHI move forward into making trains a more software-centered set of products in the future. >> So, taking that example, I would imagine when you're running- like I'm going to call that a project. I hope that's okay. So, when I'm running a project, I would imagine that sometimes you run into, "Oh, wow. Okay." To really be successful at this, the company- project versus whole house. The company doesn't have the right data architecture, the right skills or the right, you know, data team. Now, is it as simple as, oh yeah, just put it all into Snowflake? I doubt it. So how do you, do you encounter that often? How do you deal with that? >> Bethany: It's a journey. So, I think it's really about making sure we're meeting clients where they are. And I think that's something that we actually do pretty well. So, as we think about delivery, co-creation and co-delivering is a huge part of our model. So, we want to make sure that we have the client teams with us. So, as we start thinking about intelligent products, it can be incorporating a small feature, with subscription based services. It doesn't have to be creating your own model and sort of going deep. It really does come down to, like, what value do you want to get out of this? Right? >> Yeah.
It is important that it is a journey, right? So, it doesn't have to be, okay, there's a big bang applied to you and your company's tech ecosystem. You can just start by saying, "Okay, how will I bring my data together in a data lake? How do I see across my different pillars of excellence in my own business?" And then, "How do I manage, potentially, this in an overall MLOps platform such that it can be sustainable and gather more insights and improve itself with time, and therefore be more impactful to the ultimate users of the tool?" 'Cause again, as Bethany said, without use, these things are just tools on the shelf somewhere that have little value. >> So, it's a journey, as you both said, completely agree with that. It's a journey that's getting faster and faster. Because, I mean, we've seen so much acceleration in the last couple of years; the consumer demands have massively changed. >> Bethany: Absolutely. >> In every industry, how do Slalom and Snowflake come together to help businesses define the journey, but also accelerate it, so that they can stay ahead or get ahead of the competition? >> Yeah. So, one thing I think is interesting about the technology field right now is I feel like we're at the point where it's not the technology or the tools that's limiting us or, you know, constraining what we can build, it's our imaginations. Right? And, when I think about intelligent products and all of the things that are capable, that you can achieve with AI and ML, that's not widely known. There's so much tech jargon. And, we put all of those statistical words on it, and you know the things you don't know. And, instead, really, what we're doing is we're providing different ways to learn and grow. So, I think if we can demystify and humanize some of that language, I really would love to see all of these companies better understand the crayons and the tools in their toolbox. >> Speaking from a creative perspective, I love it.
>> No, and I'll do the tech nerd bit. So, there is- you're right. There is a portion where you need to bring data together, and tech together, and that kind of thing. So, something like Snowflake is a great enabler for how to actually bring the data of multiple parts of an organization together into, you know, a data warehouse, or a data lake, and then be able to manage that sort of in an MLOps platform, particularly with some of the press that Snowflake has put out this week. Things becoming more Python-native, allowing for more ML experimentation, and some more native insights on the platform, rather than going off the Snowflake platform to do some of that kind of thing. Makes Snowflake an incredibly valuable portion of the data management and of the tech and of the engineering of the overall product. >> So, I agree, Bethany, lack of imagination sometimes is the barrier; we get so down into the weeds. But there's also lack of skills, as mentioned, the organizational, you know, structural issues, politics, you know, whatever it is, you know, specific agendas. How do you guys help with that? Can, will you bring in, you know, resources to help and fill gaps? >> Yeah, so we will bring in a cross-disciplinary team of experts. So, you will see an experience designer, as well as your ML architects, as well as other technical architects, and what we call solution owners, because we want to make sure that we've got a lot of perspectives, so we can see that problem from a lot of different angles. The other thing that we're bringing in is a repeatable process, a repeatable engineering methodology, which, when you zoom out and you look at it, it doesn't seem like that big of a deal. But, what we're doing is we're training against it.
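(An editorial aside on the Python-native direction Chris mentions: Snowflake's Snowpark work, announced around this period, lets you define Python functions directly in SQL. A minimal generic example follows; it is not from the interview, and the function name is illustrative.)

```sql
-- A Python scalar UDF defined inside Snowflake (illustrative sketch).
CREATE OR REPLACE FUNCTION plus_one(x INT)
RETURNS INT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'plus_one'
AS
$$
def plus_one(x):
    return x + 1
$$;

SELECT plus_one(41);  -- returns 42
```

The point is the one Chris makes: the Python logic runs on the platform, next to the data, rather than requiring the data to leave Snowflake.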
We're building tools, we're building templates, we're re-imagining what our deliverables look like for intelligent products, just so we're not only speeding up the development and getting to those outcomes faster, but we're also continuing to grow, and we can gift those things to our clients, and help support them as well. >> And not only that, what we do at Slalom is we want to think about transition from the beginning. And so, by having all the stakeholders in the room from the earliest point, both the business stakeholders, the technical stakeholders, if they have data scientists, if they have engineers, whoever is going to be taking this and maintaining this intelligent product long after we're gone, because again, we will transition, and someone else will be taking over the maintenance of this team. One, they will understand, you know, early from the beginning the path that it is on, and be more capable of maintaining this, and two, understand sort of the ethical concerns behind, okay, here's how parts of your system affect other parts of the system. And, you know, sometimes ML gets some bad press because it's misapplied, or there are concerns, or models or data are used outside of context. And there are, you know, there are potentially some ill effects to be had. By bringing those people together much earlier, it allows for the business to truly understand, and the stakeholders to ask the questions that need to be continually asked, to evaluate: is this the right thing to do? How does my part affect the whole? And, how do I have an overall impact that is positive and is something, you know, truly being done most effectively. >> So, that's that knowledge transfer. I hesitate to even say that because it makes it sound so black and white, because you're co-creating here. But, essentially, you're, you know, to use the cliche, you're teaching them how to fish. Not, you know, going to do the ongoing fishing for them, so.
>> Lisa: That thought diversity is so critical, as is the internal alignment. Last question for you guys, before we wrap here: where can customers go to get started? Do they engage Slalom, Snowflake? Can they do both? >> Chris: You definitely can. We can come through. I mean, we're fortunate that Snowflake has blessed us with the title of partner of the year again, for the fifth time. >> Lisa: Congratulations. >> Thank you, thank you. We are incredibly humbled in that. So, we do a lot of work with Snowflake. You could certainly come to Slalom, any one of our local markets, or Build or Emerge. We'll definitely work together. We'll figure out what the right team is. We'll have lots and lots of conversations, because it is most important for you, as a set of business stakeholders, to define what is right for you and what you need. >> Yeah. Good stuff, you guys, thank you so much for joining Dave and me, talking about intelligent products, what they are, how you co-design them, and the impact that data can make with customers if they really bring the right minds together and get creative. We appreciate your insights and your thoughts. >> Thank you. >> Thanks for having us guys. Yeah. >> All right. For Dave Villante, I am Lisa Martin. You're watching theCUBE's coverage, day two, Snowflake Summit 22, from Las Vegas. We'll be right back with our next guest. (upbeat music)

Published Date : Jun 15 2022


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Chris | PERSON | 0.99+
Dave Villante | PERSON | 0.99+
Bethany | PERSON | 0.99+
Lisa | PERSON | 0.99+
Chris Samuels | PERSON | 0.99+
Kawasaki Heavy Industries | ORGANIZATION | 0.99+
Bethany Mudd | PERSON | 0.99+
Las Vegas | LOCATION | 0.99+
Two guests | QUANTITY | 0.99+
two pillars | QUANTITY | 0.99+
Slalom | ORGANIZATION | 0.99+
three year | QUANTITY | 0.99+
KHI | ORGANIZATION | 0.99+
today | DATE | 0.99+
fifth time | QUANTITY | 0.99+
Bethany Petryszak Mudd | PERSON | 0.99+
both | QUANTITY | 0.98+
Python | TITLE | 0.98+
Snowflake | ORGANIZATION | 0.98+
two | QUANTITY | 0.98+
Snowflake Summit 22 | EVENT | 0.98+
yesterday | DATE | 0.98+
over 200 joint customers | QUANTITY | 0.97+
theCUBE | ORGANIZATION | 0.97+
day two | QUANTITY | 0.97+
theCube | ORGANIZATION | 0.97+
this week | DATE | 0.96+
Snowflake Summit 2022 | EVENT | 0.96+
one thing | QUANTITY | 0.96+
Snowflake | TITLE | 0.95+
One | QUANTITY | 0.94+
over 1,800 plus engagements | QUANTITY | 0.93+
Slalom | PERSON | 0.92+
one | QUANTITY | 0.83+
Slalom | TITLE | 0.83+
four | QUANTITY | 0.82+
11 | QUANTITY | 0.78+
up | QUANTITY | 0.76+
two of the pillars | QUANTITY | 0.7+
Machine Learning | ORGANIZATION | 0.68+
Caesar's Forum | LOCATION | 0.6+
last couple of | DATE | 0.56+
years | QUANTITY | 0.42+