SiliconANGLE News | Swami Sivasubramanian Extended Version
(bright upbeat music) >> Hello, everyone. Welcome to SiliconANGLE News, breaking story here. Amazon Web Services is expanding their relationship with Hugging Face, breaking news here on SiliconANGLE. I'm John Furrier, SiliconANGLE reporter, founder, and also co-host of theCUBE. And I have with me Swami, from Amazon Web Services, vice president of database, analytics, and machine learning with AWS. Swami, great to have you on for this breaking news segment on AWS's big news. Thanks for coming on and taking the time. >> Hey, John, pleasure to be here. >> You know- >> Looking forward to it. >> We've had many conversations on theCUBE over the years, we've watched Amazon really move fast into large data modeling, SageMaker became a smashing success, obviously you've been on this for a while. Now with OpenAI's ChatGPT, a lot of buzz going mainstream takes it from behind the curtain, inside the ropes, if you will, in the industry, to the mainstream. And so this is a big moment, I think, in the industry, and I want to get your perspective, because your news with Hugging Face, I think, is another tell sign that we're about to tip over into a new accelerated growth around making AI application aware, application centric, more programmable, with more API access. What's the big news with AWS and Hugging Face, you know, what's going on with this announcement? >> Yeah. First of all, we're very excited to announce our expanded collaboration with Hugging Face, because with this partnership, our goal, as you all know, I mean, Hugging Face, I consider them like the GitHub for machine learning. And with this partnership, Hugging Face and AWS will be able to democratize AI for a broad range of developers, not just specific deep AI startups. And now with this, we can accelerate the training, fine-tuning, and deployment of these large language models and vision models from Hugging Face in the cloud.
And the broader context, when you step back and see what customer problem we are trying to solve with this announcement: essentially, these foundation models are now used to create a huge number of applications, such as text summarization, question answering, search, image generation, creative work, other things. And these are all things we are seeing in the likes of these ChatGPT-style applications. But there is a broad range of enterprise use cases that we don't even talk about. And it's because these kinds of transformative, generative AI capabilities and models are not available to, I mean, millions of developers. And because either training these models from scratch can be very expensive or time consuming and needs deep expertise, or, more importantly, they don't need these generic models, they need them to be fine-tuned for their specific use cases. And one of the biggest complaints we hear is that these models, when they try to use them for real production use cases, they are incredibly expensive to train and incredibly expensive to run inference on, to use at production scale. And unlike web-search-style applications, where the margins can be really huge, here in production use cases in enterprises, you want efficiency at scale. That's where Hugging Face and AWS share a mission. And by integrating with Trainium and Inferentia, we're able to handle cost-efficient training and inference at scale, and I'll deep dive on it. And by teaming up on the SageMaker front, the time it takes to build these models and fine-tune them is also coming down. So that's what makes this partnership very unique as well. So I'm very excited. >> I want to get into the time savings and the cost savings as well on the training and inference, it's a huge issue, but before we get into that, just how long have you guys been working with Hugging Face?
I know there's a previous relationship, this is an expansion of that relationship, can you comment on what's different about what's happened before and then now? >> Yeah. So, Hugging Face, we have had a great relationship in the past few years as well, where they have actually made their models available to run on AWS in an easy fashion. In fact, their BLOOM project was something many of our customers even used. BLOOM, for context, is their open source project which builds a GPT-3-style model. And now with this expanded collaboration, Hugging Face selected AWS for their next generation of generative AI models, building on their highly successful BLOOM project as well. And the nice thing is, now, by direct integration with Trainium and Inferentia, you get cost savings in a really significant way; for instance, Trn1 can provide up to 50% cost-to-train savings, and Inferentia can deliver up to 60% better cost, and 4x higher throughput than (indistinct). Now, these models, especially as they train their next generation of generative AI models, are going to be not only more accessible to all the developers who use them in the open, they'll be a lot cheaper as well. And that's what makes this moment really exciting, because we can't democratize AI unless we make it broadly accessible and cost efficient and easy to program and use as well. >> Yeah. >> So very exciting. >> I'll get into the SageMaker and CodeWhisperer angle in a second, but you hit on some good points there. One, accessibility, which is what I call the democratization, which is getting this into the hands of developers, and/or AI to develop, we'll get into that in a second. So, access to coding and Git reasoning is a whole nother wave.
But the three things I know you've been working on, I want to put in buckets here and comment: one, I know you've, over the years, been working on saving time to train, that's a big point, you mentioned some of those stats; also cost, 'cause now cost is an equation on, you know, bundling, whether you're uncoupling hardware and software, that's a big issue. Where do I find the GPUs? Where's the horsepower cost? And then also sustainability. You've mentioned that in the past, is there a sustainability angle here? Can you talk about those three things: time, cost, and sustainability? >> Certainly. So if you look at it from the AWS perspective, we have been supporting customers doing machine learning for years. Just for broader context, Amazon has been doing ML for the past two decades, right from the early days of ML-powered recommendations to now actually supporting all kinds of generative AI applications. If you look at even generative AI applications within Amazon, take Amazon search: when you go search for a product and so forth, we have a team called M5 within Amazon search that helps bring these large language models into creating highly accurate search results. And these are created with really large models, with tens of billions of parameters, scaling to thousands of training jobs every month, and trained on a large amount of hardware. And this is an example of a really good large language foundation model application running at production scale, and also, of course, Alexa, which uses a large generative model as well. And they actually even had a research paper that showed that they do better in accuracy than other systems like GPT-3 and whatnot. And we also touched on things like CodeWhisperer, which uses generative AI to improve developer productivity, but in a responsible manner, because some studies show 40% of generated code has serious security flaws in it.
This is where we didn't just do generative AI, we combined it with automated reasoning capabilities, which is a very, very useful technique to identify these issues, and coupled them so that it produces highly secure code as well. Now, all these learnings taught us a few things, which is what you put in these three buckets. We have more than 100,000 customers using our ML and AI services, including leading startups in the generative AI space, like Stability AI, AI21 Labs, or Hugging Face, or even Alexa, for that matter. They care about, I put them in three dimensions: one is around cost, which we touched on with Trainium and Inferentia, where Trainium provides up to 50% better cost savings, but the other aspect is, Trainium is a lot more power efficient as well compared to traditional alternatives. And Inferentia is also better in terms of throughput, when it comes to what it is capable of. It is able to deliver up to 3x higher compute performance and 4x higher throughput compared to its previous generation, and it is extremely cost efficient and power efficient as well. >> Well. >> Now, the second element that really is important is, at the end of the day, developers deeply value the time it takes to build these models, and they don't want to build models from scratch. And this is where SageMaker, which, even going by Kaggle surveys, is the number one enterprise ML platform, comes in. What it did for traditional machine learning, where tens of thousands of customers use SageMaker today, including the ones I mentioned, is that what used to take months to build these models has dropped down to now a matter of days, if not less. Now, with generative AI, the cost of building these models, if you look at the landscape, the model parameter size has jumped by more than 1,000x in the past three years. And that means the training is a really big distributed systems problem. How do you actually scale this model training?
How do you actually ensure that you utilize these machines efficiently? Because these machines are very expensive, let alone that they consume a lot of power. So, this is where SageMaker's capability to build, automatically train, tune, and deploy models really comes in, especially with its distributed training infrastructure, and those are some of the reasons why some of the leading generative AI startups are actually leveraging it, because they do not want a giant infrastructure team which is constantly tuning and fine-tuning, and keeping these clusters alive. >> It sounds a lot like what startups were doing with the cloud in the early days: no data center, you move to the cloud. So, this is the trend we're seeing, right? You guys are making it easier for developers with Hugging Face, I get that. I love that GitHub for machine learning; large language models are complex and expensive to build, but not anymore, you got Trainium and Inferentia, developers can get faster time to value, but then you got the transformers, data sets, tokenizer libraries, all that optimized for generative AI. This is a perfect storm for startups. Jon Turow, a former AWS person, who used to work, I think, for you, is now a VC at Madrona Venture Group; he and I were talking about the generative AI landscape, it's exploding with startups. Every alpha entrepreneur out there is seeing this as the next frontier, that's the 20 mile stairs, the next 10 years is going to be huge. What is the big thing that's happened? 'Cause some people were saying, the founder of Yquem said, "Oh, the startups won't be real, because they don't all have AI experience." John Markoff, former New York Times writer, told me that, with AI, there's so much work done, this is going to explode, accelerate really fast, because it's almost like it's been waiting for this moment. What's your reaction?
>> I actually think there is going to be an explosion of startups, not because they need to be AI startups, but because now, finally, AI is really accessible, or going to be accessible, so that they can create remarkable applications, either for enterprises, or for disrupting how customer service is being done, or how creative tools are being built. And I mean, this is going to change in many ways. When we think about generative AI, we always like to think of how it generates school homework or art or music or whatnot, but when you look at the practical side, generative AI is actually being used across various industries. I'll give an example, like Autodesk. Autodesk is a customer who runs on AWS and SageMaker. They already have an offering that enables generative design, where designers can generate many structural designs for products, whereby you give a specific set of constraints and it can actually generate a structure accordingly. And we see a similar kind of trend across various industries, where it can be around creative media editing or various others. I have a strong sense that literally, in the next few years, just like now, where conventional machine learning is embedded in every application, every mobile app that we see, it is pervasive, and we don't even think twice about it, the same way almost all apps are built on cloud, generative AI is going to be part of every startup, and they are going to create remarkable experiences without actually needing these deep generative AI scientists. But you won't get that until you actually make these models accessible. And I also don't think one model is going to rule the world; you want these developers to have access to a broad range of models. Just go back to the early days of deep learning. Everybody thought it was going to be one framework that would rule the world, and it has been changing, from Caffe to TensorFlow to PyTorch to various other things.
And I have a suspicion it is going to be the same here, so we have to enable developers where they are. >> You know, Dave Vellante and I have been riffing on this concept called supercloud, and a lot of people have co-opted it to be multicloud, but we really were getting at this whole next layer on top of, say, AWS. You guys are the most comprehensive cloud, you guys are a supercloud, and even Adam and I are talking about ISVs evolving to ecosystem partners. I mean, your top customers have ecosystems building on top of it. This feels like a whole nother AWS. How are you guys leveraging the history of AWS, which, by the way, had the same trajectory: startups came in, they didn't want to provision a data center, the heavy lifting, all the things that have made Amazon successful culturally. And the day one thinking is: provide the undifferentiated heavy lifting, and make it faster for developers to program code. AI's got the same thing. How are you guys taking this to the next level, because now, this is an opportunity for the competition to change the game and take it over? This is, I'm sure, a conversation; you guys have a lot of things going on in AWS that make you unique. What's the internal and external positioning around how you take it to the next level? >> I mean, so I agree with you that generative AI has a very, very strong potential in terms of what it can enable in next generation applications. But this is where Amazon's experience and expertise in putting these foundation models to work internally really has helped us quite a bit. If you look at it, amazon.com search is a very, very important application, in terms of the customer impact, the number of customers who use that application openly, and the amount of dollar impact it has for an organization. And we have been doing it silently for a while now.
And the same thing is true for Alexa too, which not only uses it for natural language understanding, it even leverages it for creating stories and various other examples. And now, our approach to it from AWS is, we actually look at it in terms of the same three tiers like we did in machine learning, because when you look at generative AI, we genuinely see three sets of customers. One is the really deep technical expert practitioner startups. These are the startups that are creating the next generation models, the likes of Stability AI, or Hugging Face with BLOOM, or AI21. And they generally want to build their own models, and they want the best price performance for their infrastructure for training and inference. That's where our investments in silicon and hardware and networking innovations, where Trainium and Inferentia really play a big role, come in. And we clearly deliver that, and that is one. The second, middle tier is where I do think developers don't want to spend time building their own models; rather, they actually want the model to be tuned to their data. They don't need their models to create high school homework or various other things. What they generally want is: hey, I have this data from my enterprise that I want to fine-tune and make it really work only for this, and make it work remarkably; it can be for text summarization, to generate a report, or it can be for better Q&A, and so forth. This is where our investments in the middle tier with SageMaker, and our partnerships with Hugging Face and AI21 and Cohere, are all going to be very meaningful. And you'll see us investing; I mean, you already talked about CodeWhisperer, which is in open preview, but we are also partnering with a whole lot of top ISVs, and you'll see more on this front to enable the next wave of generative AI apps too, because this is an area where we do think a lot of innovation is yet to be done.
It's like day one for us in this space, and we want to enable that huge ecosystem to flourish. >> You know, one of the things Dave Vellante and I were talking about in our first podcast we just did on Friday, we're going to do it weekly, is we highlighted the AI ChatGPT example as a horizontal use case, because everyone loves it, people are using it in all their different verticals, and horizontal scalable cloud plays perfectly into it. So I have to ask you, as you look at what AWS is going to bring to the table, a lot's changed over the past 13 years with AWS, a lot more services are available, how should someone rebuild or re-platform and refactor their application or business with AI, with AWS? What are some of the tools that you see and recommend? Is it Serverless, is it SageMaker, CodeWhisperer? What do you think's going to shine brightly within the AWS stack, if you will, or service list, that's going to be part of this? As you mentioned CodeWhisperer and SageMaker, what else should people be looking at as they start tinkering and getting all these benefits, and scale up their apps? >> You know, if I were a startup, first, I would really work backwards from the customer problem I'm trying to solve, and pick and choose the parts where I don't need to deal with the undifferentiated heavy lifting. And that's where the answer is going to change; it's not going to be one-size-fits-all. On the compute front, if you can actually go completely serverless, I will always recommend it for running your apps, because it takes care of all the undifferentiated heavy lifting. And on the data side, that's where we provide a whole variety of databases, right from relational, to non-relational like DynamoDB, and so forth. And of course, we also have a deep analytical stack, where data directly flows from our relational databases into data lakes and data warehouses.
And you can get value from it, along with partnerships with various analytical providers. The area where I do think fundamentally things are changing, in what people can do, is with CodeWhisperer. I was literally trying to program some code for sending a message through Twilio, and I was about to pull up the documentation, and in my IDE, I just typed a comment saying, let's try sending a message through Twilio, or let's actually update a Route 53 record. All I had to do was type in just a comment, and it actually started generating the subroutine. And it is going to be a huge time saver, if I were a developer. And the goal for us is not to do it just for AWS developers, and not to just generate the code, but to make sure the code is actually highly secure and follows the best practices. So, it's not always about machine learning, it's augmenting with automated reasoning as well. And generative AI is going to be changing not just how people write code, but also how it actually gets built and used as well. You'll see a lot more stuff coming on this front. >> Swami, thank you for your time. I know you're super busy. Thank you for sharing on the news and giving commentary. Again, I think this is an AWS moment and an industry moment: heavy lifting, accelerated value, agility. AIOps is going to be probably redefined here. Thanks for sharing your commentary. And we'll see you next time, I'm looking forward to doing more follow up on this. It's going to be a big wave. Thanks. >> Okay. Thanks again, John, always a pleasure. >> Okay. This is SiliconANGLE's breaking news commentary. I'm John Furrier with SiliconANGLE News, as well as host of theCUBE. Swami, who's a leader in AWS, has been on theCUBE multiple times. We've been tracking how Amazon's journey has just been exploding the past five years, and in particular, the past three. You heard the numbers, great performance, great reviews.
This is a watershed moment, I think, for the industry, and it's going to be a lot of fun for the next 10 years. Thanks for watching. (bright music)
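Swami's CodeWhisperer anecdote above, typing a comment in the IDE and having a subroutine generated, can be illustrated with a small hand-written sketch. To be clear, this is not actual CodeWhisperer output: the function name and structure are illustrative assumptions, the URL shape follows Twilio's public REST API, and no network call is made here.

```python
from urllib.parse import urlencode

# Hypothetical illustration of the kind of helper a code assistant might
# generate from a one-line comment such as:
#   # send an SMS through Twilio's REST API
def build_twilio_sms_request(account_sid, auth_token, to, from_, body):
    """Build (url, form_data, auth) for Twilio's Messages endpoint; does not send."""
    url = (
        "https://api.twilio.com/2010-04-01/Accounts/"
        f"{account_sid}/Messages.json"
    )
    # Twilio's Messages resource takes form-encoded To/From/Body fields
    form_data = urlencode({"To": to, "From": from_, "Body": body})
    auth = (account_sid, auth_token)  # HTTP basic auth pair
    return url, form_data, auth

url, data, auth = build_twilio_sms_request(
    "ACXXXXXXXX", "token", "+15551230000", "+15559870000", "hello")
```

An actual call would then POST `data` to `url` with an HTTP client; the point of the anecdote is that the comment alone was enough to get a working draft of this boilerplate.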
Sri Satish Ambati, H2O.ai | CUBE Conversation, May 2020
>> Connecting with thought leaders all around the world, this is a CUBE Conversation. Hi, everybody, this is Dave Vellante of theCUBE, and welcome back to my CXO series. I've been running this really since the start of the COVID-19 crisis, to understand how leaders are dealing with this pandemic. Sri Ambati is here, he's the CEO and founder of H2O.ai. Sri, it's great to see you again, thanks for coming on. >> Thank you for having us. >> Yeah, so this pandemic has obviously given people fits, no question, but it's also given opportunities for companies to kind of reassess where they are. Automation is a huge watchword; flexibility, business resiliency, and people who maybe hadn't really fully leaned into things like the cloud and AI and automation are now realizing, wow, we have no choice, it's about survival. Your thoughts as to what you're seeing in the marketplace? >> Thanks for having us. I think, first of all, kudos to the frontline health workers who have been tirelessly saving lives across the country and the world, and what we're really doing is a fraction of what we could have done, or should be doing, to stave off the next big pandemic. But that apart, I usually tend to say BC is before COVID. So if the world was thinking about going digital, after COVID-19 they have been forced to go digital, and as a result, you're seeing tremendous transformation across our customers, and a lot of them applying it to kind of reinvent their business models, in ways that allow them to scale as effortlessly as they can using digital means. >> So, think about doctors and diagnoses: machines, in some cases, are helping doctors make diagnoses, sometimes making even better diagnoses, (mumbles) is informing. There's been a lot of talk about the models, you know how...
Yeah, I know you've been working with a lot of healthcare organizations, so you're probably familiar with, you know, the Medium post, The Hammer and the Dance, and people criticize the models, but of course, they're just models, right? And you iterate models, and machine intelligence can help us improve. So, in this, you know, you talk about BC and post-C, how have you seen the data and machine intelligence informing the models and improving what we know about this pandemic? I mean, it changes literally daily. What are you seeing? >> Yeah, and I think it started with Wuhan, and we saw the best application of AI in trying to trace, literally from Alipay to WeChat, and track down the first folks who were spreading it across China and then eventually the rest of the world. I think contact tracing, for example, has become a really interesting problem. Supply chain has been disrupted like never before. We're beginning to see customers trying to reinvent their distribution mechanisms in the second order effects of COVID, and the prime example is hospital staffing: how many ventilators, in the first few weeks of the COVID crisis as it evolved in the US. We were busy predicting, working with some of the local healthcare communities, to predict how staffing in hospitals would work, how many PPE and ventilators would be needed, and so forth, and when the peak surge would be; those were the beginning problems, and many of our customers began to build these models and iterate and improve, and kind of educate the community to practice social distancing, and that led to a lot of flattening of the curve. And when you're talking flattening the curve, you're really talking about data science and analytics, in public speak.
That led to kind of the next level. Now that we have somewhat brought a semblance of order to the reaction to COVID, I think what we are beginning to figure out is: is there going to be a second surge, what elective procedures that were postponed will be top of mind for customers, and so this is the kind of thing that hospitals are beginning to plan out for the second half of the year. And as businesses try to open up, certain things were highly correlated to surging cases, such as cleaning supplies, for example, the obvious one, or pantry buying. So retailers are beginning to see what online stores are doing well: e-commerce, online purchases, electronic goods. And so everyone essentially started working from home, and so homes needed to have the same kind of bandwidth that offices and commercial enterprises needed to have. And so a lot of interesting things: on one side you saw airlines go away, on the other side you saw the likes of Zoom and video take off. So you're kind of seeing a real digital divide happening, and AI is here to play a very good role in figuring out how to enhance your profitability as you're looking at planning out the next two years. >> Yeah, you know, and obviously, these things, they get partisan, it gets political. I mean, our job as an industry is to report, your job is to help people understand, I mean, let the data inform, and then let public policy, you know, fight it out. So who are some of the people that you're working with, you know, as a result of COVID-19? What's some of the work that H2O has done? I want to better understand what role you are playing. >> So one of the things, we're kind of privileged as a company to come into the crisis with a strong balance sheet and the ability to actually have the right kind of momentum behind the company in terms of great talent, and so we have 10% of the world's top data scientists, in the form of Kaggle Grandmasters, in the company.
And so we put most of them to work, and they started collecting data sets, curating data sets and making them more qualitative, picking up public data sources. For example, there's a tremendous amount of job loss out there; figuring out which are the more difficult sectors in the economy. And then we started looking at the exodus from the cities; we're looking at mobility data that's publicly available, mobility data through the data exchanges. You're able to find which cities, which rural areas; as New Yorkers left the city, which places did they go to, and the same to say, when Californians left Los Angeles, which are the new places they have settled in? These are the places which are now busy places for the same kind of items that you need to sell if you're a retailer. But if you go one step further, we started engaging with FEMA, we started engaging with the universities, like Imperial College London or Berkeley, and started figuring out how best to improve the models and automate them. The SEIR model, the most popular epidemiological model, we added into our Driverless AI product as a recipe, and made it accessible to our customers, to customers in healthcare who are trying to predict where the surge is likely to come. But it's mostly about information, right? So the AI, at the end of it, is all about intelligence and being prepared. Predictive is all about being prepared, and that's kind of what we did, in general: lots of blog articles, and working with the largest health organizations and starting to kind of inform them on the most stable models. What we found, to our not-so-much surprise, is that the simplest, very interpretable models are actually the most widely usable, because historical data is actually no longer as effective. You need to build a model that you can quickly understand, and retrain again in the feedback loop of backtesting that model against what really happened. >> Yeah, so I want to double down on that.
So really, two things I want to understand, if you have visibility on it, sounds like you do. Just in terms of the surge and the comeback, you know, kind of what those models say, based upon, you know, we have some advance information coming from the global markets, for sure, but it seems like every situation is different. What's the data telling you? Just in terms of, okay, we're coming into the spring and the summer months, maybe it'll come down a little bit. Everybody says it... We fully expect it to come back in the fall; go back to college, don't go back to college. What is the data telling you at this point in time, with an understanding that, you know, we're still iterating every day? >> Well, I think, I mean, we're not epidemiologists, but at the same time, what we've seen is that the science of it is a highly local, very hyper-local response to COVID-19. Santa Clara, which is just a county, I mean, is different from San Francisco, right, sort of. So you're beginning to see, like we saw, Brooklyn is very different, and the Bronx very different from Manhattan. So you're seeing a very, very local response to this disease, and I'm talking about the US. You see the likes of Brazil, which we're worried about, has picked up quite a bit of cases now. I think the silver lining, I would say, is that China is up and running to a large degree; a large number of our user base there are back active, you can see the traffic patterns there. So two months after their last surge of cases, the business and economic activity is back and thriving. And so, you can kind of estimate from that, that this can be done, where you can actually contain the rise of active cases, and it will take masking of the entire community, masking and a healthy dose of increased testing.
One of our offices is in Prague, and the Czech Republic has done an incredible job in trying to contain this; they've essentially masked everybody, and as a result they're back thinking about opening offices and schools later this month. So I think that's a very, very local response, a hyper-local response; no one country and no one community is symmetrical with the others, and I think we have a unique situation where in the United States you have a very, very highly connected world, a highly connected economy, and I think we have quite a problem on our hands on how to safeguard our economy while also safeguarding life. >> Yeah, so you can't just take Norway and apply it, or South Korea and apply it, every situation is different. And then I want to ask you about, you know, the economy in terms of, you know, how much can AI actually, you know, how can it work in this situation where you have, you know, for example, okay, so the Fed, yes, it started doing asset buys back in 2008, but still, very hard to predict. I mean, at the time of this interview, you know, Stock Market up 900 points, very difficult to predict that, but some event happens in the morning, somebody, you know, Powell says something positive and it goes crazy. But just sort of even modeling out the V recovery, the W recovery, deep recession, the comeback. You have to have enough data, do you not? In order for AI to be reasonably accurate? How does it work? And at what pace can you iterate and improve on the models? >> So I think that's exactly where, I would say, continuous modeling and continuous learning, that's where the vision of the world is headed: data is coming in, you build a model, and then you iterate, try it out and come back. That kind of rapid, continuous learning would probably be needed for all our models, as opposed to the typical, I'm pushing a model to production once a year, or once every quarter.
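The continuous-modeling loop Sri describes — refit as new data arrives, back-test each forecast against what actually happened — can be sketched in miniature. This is a toy forecaster on made-up data with a level shift, not H2O's implementation; it only illustrates why a model refit on a sliding window beats one frozen "once a year."

```python
# Continuous modeling sketch: refit on a sliding window of recent data,
# back-testing one-step-ahead forecasts against the observations that
# actually arrive. Window size and data are illustrative assumptions.

def rolling_forecasts(series, window=10):
    """One-step-ahead forecasts, each refit on the last `window` points."""
    preds = []
    for t in range(window, len(series)):
        recent = series[t - window:t]
        preds.append(sum(recent) / len(recent))  # "refit": windowed mean
    return preds

def static_forecasts(series, window=10):
    """Forecasts from a model fit once on the first `window` points."""
    frozen = sum(series[:window]) / window
    return [frozen] * (len(series) - window)

def backtest_mae(preds, series, window=10):
    """Mean absolute error of forecasts against realized values."""
    actual = series[window:]
    return sum(abs(p - a) for p, a in zip(preds, actual)) / len(actual)

# A regime change: the level jumps from 10 to 50 halfway through.
series = [10.0] * 50 + [50.0] * 50
mae_rolling = backtest_mae(rolling_forecasts(series), series)
mae_static = backtest_mae(static_forecasts(series), series)
```

After the shift, the rolling model converges to the new level within one window, while the once-fitted model keeps predicting the old regime forever — the "push a model once a quarter" failure mode.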
I think what we're beginning to see is the kind of thing companies are beginning to plan out. A lot of people lost their jobs in the last couple of months, right? And so upskilling and trying to bring these jobs back, both from the manufacturing side, but also, we lost a lot of jobs in transportation and the airline and hotel industries. So it's trying to now bring back the sense of confidence, and that will take a lot more testing, a lot more masking, a lot more social empathy. I think, well, some of the things that we are missing while we are socially distant, we know that we are so connected as a species, we need to kind of start having that empathy: we need to wear a mask, not for ourselves, but for our neighbors and people we may run into. And I think that same kind of thinking has to kind of pervade before we can open up the economy in a big way. The data, I mean, we can do a lot of transfer learning, right? There are new methods; you can try to model it similar to 1918, where we had a second bump, or a lot of little bumps, and that's kind of where your W-shaped pieces come in. But governments are trying very hard in seeing stimulus dollars being pumped through banks. So some of the use cases we're looking at for banks are: which small or medium business, especially in unsecured lending, which business to lend to? There are so many applications that have come to banks across the world, it's not just in the US, and banks are caught up with the problem of which to fund, and what's the growing concern for this business: are they really accurate about the number of employees they are saying they have? Then the next-level problems, on forbearance and mortgages, that side of things is coming up at some of these banks as well.
So they're looking at, for example, one of the problems one of our customers, Wells Fargo, has: the question of which branch to open, right? That itself needs a different kind of modeling. So everything has become very highly segmented models, and so AI is absolutely not just a good-to-have, it has become a must-have for most of our customers in how to go about their business. >> I want to talk a little bit about your business, you have been on a mission to democratize AI since the beginning, open source. Explain your business model, how you guys make money, and then I want to help people understand basic theoretical comparisons and current affairs. >> Yeah, that's great. I think the last time we spoke was probably at the Spark Summit. I think, Dave, we were talking about Sparkling Water and H2O, our open source platforms, which are premier platforms for democratizing machine learning and math at scale, and that's been a tremendous brand for us. Over the last couple of years, we have essentially built a platform called Driverless AI, which is licensed software that automates machine learning models. We took the best practices of all these data scientists and combined them to essentially build recipes that allow people to build the best forecasting models, best fraud-prevention models or the best recommendation engines, and so we started augmenting traditional data scientists with this automatic machine learning, called AutoML, that essentially allows them to build models without necessarily having the same level of talent as these great Kaggle Grand Masters. And so that has democratized, allowed ordinary companies to start producing models of high caliber and high quality that would otherwise have been the pedigree of Google, Microsoft or Amazon, or some of these top-tier AI houses like Netflix and others. So what we've done is democratize not just the algorithms at the open source level.
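AutoML's core loop — fit several candidate models, score each on held-out data, keep the winner — can be shown in miniature. The candidates and data below are toy assumptions for illustration; H2O's actual AutoML searches far larger model spaces with cross-validation and ensembling.

```python
# Miniature AutoML: fit candidate forecasters, score each on a holdout
# split, return the best. Candidates and data are illustrative toys.

def fit_mean(train):
    mu = sum(train) / len(train)
    return lambda x: mu

def fit_last(train):
    last = train[-1]
    return lambda x: last

def fit_linear(train):
    # Ordinary least squares on (index, value) pairs, closed form.
    n = len(train)
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(train) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, train))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x

def automl(series, holdout=5):
    """Score each candidate on the holdout tail; return winner + scores."""
    train, valid = series[:-holdout], series[-holdout:]
    candidates = {"mean": fit_mean, "last": fit_last, "linear": fit_linear}
    scores = {}
    for name, fit in candidates.items():
        model = fit(train)
        preds = [model(len(train) + i) for i in range(holdout)]
        scores[name] = sum(abs(p - a) for p, a in zip(preds, valid))
    best = min(scores, key=scores.get)
    return best, scores

series = [2.0 * t + 1.0 for t in range(30)]  # a clean linear trend
best, scores = automl(series)
```

On trending data the holdout score immediately exposes the naive baselines, which is exactly the leverage AutoML gives a team without a Kaggle Grand Master: the search and scoring are automated, not the judgment.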
Now, we've made it easy for rapid adoption of AI across every branch inside a large organization, and also across smaller organizations which don't have access to the same kind of talent. Now, the third level of what we've brought to market is the ability to augment data sets, especially public and private data sets, the alternative data sets that can increase the signal. And that's where we've started working on a new platform called Q, again, more licensed software. And to give you an idea there from a business model standpoint, now the majority of our software sales is coming from closed source software. And so we've made that transition; we still make our open source widely accessible, we continue to improve it, a large chunk of the teams are improving it and participating in building the communities, but I think from a business model standpoint, as of last year, 51% of our revenues are now coming from closed source software, and that share is continuing to grow. >> And this is the point I wanted to get to. So you know, the open source model was, you know, Red Hat the one company that, you know, succeeded wildly, and it was: put it out there open source, come up with a service, maintain the software, you've got to buy the subscription, okay, fine. And everybody thought that, you know, you were going to do that; they thought that's what Databricks was going to do, and that changed. But I want to take two examples, Hortonworks, which kind of took the Red Hat model, and Cloudera, which does IP. And neither really lived up to the expectation, but now there seems to be sort of a new breed, I mentioned you guys, Databricks, there are others, that seem to be working.
You with your licensed software model, Databricks with a managed service, and so it's becoming clear that there's got to be some level of IP that can be licensed in order to really thrive in the open source community, to be able to fund the committers that you have to put forth to open source. I wonder if you could give me your thoughts on that narrative. >> So on Driverless AI, which is the closed platform I mentioned, we opened up the layers in open source as recipes. So for example, different companies handle their zip codes differently, right? The domain-specific recipes, we put about 150 of them in open source, again on top of our Driverless AI platform, and the idea there is that open source is about freedom, right? It's not a philosophy, it's not a business model; it allows freedom for rapid adoption of a platform and complete democratization and commodification of a space. And that allows a small company like ours to compete at the level of a SAS or a Google or a Microsoft, because you have the same level of voice as a very large company, and you're focused on using code as a community-building exercise as opposed to a business model, right? So that's kind of the heart of open source: allowing that freedom for our end users and the customers to innovate at the same level as a Silicon Valley company or one of these large tech giants building software. So it's really a maker culture, as opposed to a consumer culture, around software. Now, if you look at the Red Hat model, and the others who have tried to replicate that, the difficult part there was: if the product is very good, customers are self-sufficient, and if it becomes a standard, then customers know how to use it.
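The recipe idea — small, open units of domain logic (like zip-code handling) layered onto a closed platform — is essentially a plugin registry. A minimal sketch of that pattern follows; all names here are hypothetical illustrations, not H2O's actual recipe API.

```python
# Minimal plugin-registry sketch of "recipes": open, swappable pieces of
# domain-specific feature engineering registered onto a platform core.
# All names and fields are hypothetical, for illustration only.

RECIPES = {}

def recipe(name):
    """Decorator that registers a feature-engineering function by name."""
    def register(fn):
        RECIPES[name] = fn
        return fn
    return register

@recipe("zip_prefix")
def zip_prefix(row):
    # Different companies handle zip codes differently; this recipe
    # keeps only the 3-digit regional prefix as a coarser feature.
    return {"zip_region": row["zip"][:3]}

@recipe("income_bucket")
def income_bucket(row):
    return {"income_band": "high" if row["income"] > 100_000 else "low"}

def apply_recipes(row, names):
    """Platform side: run the selected recipes over one record."""
    features = dict(row)
    for name in names:
        features.update(RECIPES[name](row))
    return features

row = {"zip": "94301", "income": 120_000}
features = apply_recipes(row, ["zip_prefix", "income_bucket"])
```

The platform stays proprietary while the domain layer stays open: anyone can publish a new `@recipe`, which is the community-building move described here.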
If the product is crippled or difficult to use, then you put in a lot of services, and that's where you saw the classic Hadoop companies get pulled into a lot of services, which is a reasonably difficult business to scale. So what we chose instead was a great product that builds a fantastic brand around AI, and for us to see thousands of companies which are not AI-first, and even more companies adopting AI and talking about AI in a major way, that was possible because of open source. If we had chosen closed source, and many of our peers did, they all vanished. So that's kind of how open source is really about building the ecosystem, and having the patience to build a company that takes 10, 20 years to build. And what we are expecting, unfortunately, is a fast rise up to become unicorns. In that race, you essentially sacrifice building a long ecosystem play, and that's what we chose to do instead, and that took a little longer. Now, if you think about how you truly monetize open source, it takes a little longer and is a much more difficult sales machine to scale. Our open source business is actually a reasonably positive-EBITDA business, because it makes more money than we spend on it. But trying to teach sales teams how to sell open source, that's a rate-limiting step. And that, and also explaining to the investors how open source is being invested in as you go closer to the IPO markets, that's why we chose: let's go into the licensed software model and scale that as a regular business. >> So I've said a few times, it's kind of ironic that this pandemic is hitting as we're entering a new decade. We're exiting the era, the many, many decades, of Moore's Law being the source of innovation, and now it's a combination of data, applying machine intelligence, and being able to scale with cloud.
Well, my question is, what should we expect out of AI this decade, if those are sort of the three, the cocktail of innovation, if you will? Is it really just about, I suggest, automating, you know, businesses, giving them more agility, flexibility, you know, etc.? Or should we expect more from AI this decade? >> Well, I mean, if you think about the decade from 2010, 2011, that was defined by "software is eating the world," right? And now you can say software is the world, right? I mean, pretty much almost all companies are digital. And AI is eating software, right? A lot of cloud transitions are happening, and now happening at a much faster rate, but cloud and AI are kind of leading; AI is essentially one of the biggest drivers of cloud adoption for many of our customers. So in the enterprise world, you're seeing the rebuilding of a lot of fast, data-driven applications that use AI; instead of rule-based software, you're beginning to see pattern-based, machine-learned software, and you're seeing that in spades. And of course, that is just the tip of the iceberg. AI has been with us for 100 years, and it's going to be with us for another hundred years, right? It is fundamentally a math movement, and a math movement at the beginning of a century leads to 100 years of phenomenal discovery. So AI is essentially making discoveries faster. AI is producing entertainment, AI is producing music, AI is producing choreography, you're seeing AI in every walk of life: AI summarization of Zoom meetings, right, you're beginning to see a lot of AI-enabled ETF picking of stocks. You're beginning to see, we reprice 20,000 corporate bonds every 15 seconds using H2O AI.
And so, one of our customers is one of the fastest growing stocks; mostly, AI is powering a lot of these insights in a fast-changing world which is globally connected. No one of us is able to combine all the multiple dimensions that are changing, and AI has that incredible opportunity to be a partner for every... For a hospital looking at how the second half will look, for physicians looking at what the sentiment is... What is the surge to expect? What is the market demand, looking at the sentiment of the customers? AI is the ultimate Moneyball in business, and I think it's just showing its depth at this point. >> Yeah, I mean, I think you're right on. I mean, basically AI is going to convert every software, every application, or those tools aren't going to have much use. Sri, we've got to go, but thanks so much for coming to theCUBE, and the great work you guys are doing. Really appreciate your insights, stay safe, and best of luck to you guys. >> Likewise, thank you so much. >> Welcome, and thank you for watching everybody, this is Dave Vellante for the CXO series on theCUBE. We'll see you next time. All right, we're clear. All right.
Sri Satish Ambati, H2O.ai | CUBE Conversation, May 2020
>> Starting the record, Dave in five, four, three. Hi, everybody, this is Dave Vellante of theCUBE, and welcome back to my CXO series. I've been running this really since the start of the COVID-19 crisis to understand how leaders are dealing with this pandemic. Sri Ambati is here, he's the CEO and founder of H2O.ai. Sri, it's great to see you again, thanks for coming on. >> Thank you for having us. >> Yeah, so this pandemic has obviously given people fits, no question, but it's also given opportunities for companies to kind of reassess where they are. Automation is a huge watchword; flexibility, business resiliency, and people who maybe hadn't really fully leaned into things like the cloud and AI and automation are now realizing, wow, we have no choice, it's about survival. Your thought as to what you're seeing in the marketplace. >> Thanks for having us. I think, first of all, kudos to the frontline health workers who have been relentlessly saving lives across the country and the world, and what we're really doing is a fraction of what we could have done, or should be doing, to stave off the next big pandemic. But that apart, I usually tend to say BC is "before COVID." So if the world was thinking about going digital, after COVID-19 they have been forced to go digital, and as a result you're seeing tremendous transformation across our customers, and a lot of application of AI to kind of go in and reinvent their business models, allowing them to scale as effortlessly as they can using digital means. >> So, think about doctors and diagnoses: machines, in some cases, are helping doctors make diagnoses, sometimes making even better diagnoses, (mumbles) is informing. There's been a lot of talk about the models, you know how...
Yeah, I know you've been working with a lot of healthcare organizations, you're probably familiar with, you know, the Medium post, The Hammer and the Dance, and people criticize the models; of course, they're just models, right? And you iterate models, and machine intelligence can help us improve. So in this, you know, you talk about BC and post-C, how have you seen the data and machine intelligence informing the models and improving what we know about this pandemic? I mean, it's changed literally daily. What are you seeing? >> Yeah, and I think it started with Wuhan, and we saw the best application of AI in trying to trace, literally from Alipay to WeChat, track down the first folks who were spreading it across China and then eventually the rest of the world. I think contact tracing, for example, has become a really interesting problem. Supply chains have been disrupted like never before. We're beginning to see customers trying to reinvent their distribution mechanisms in the second-order effects of COVID, and the prime center is hospital staffing: how many ventilators were needed in the first few weeks of the COVID crisis as it evolved in the US. We were busy working with some of the local healthcare communities to predict how staffing in hospitals will work, how many PPE and ventilators will be needed and so forth, and how quickly, and when the peak surge will be. Those were the beginning problems, and many of our customers have begun to do these models and iterate and improve, and kind of educate the community to practice social distancing, and that led to a lot of flattening the curve. And when you're talking flattening the curve, you're really talking about data science and analytics in public speak.
That led to kind of the next level: now that we have brought somewhat of a semblance of order to the reaction to COVID, I think what we are beginning to figure out is, is there going to be a second surge? Which elective procedures that were postponed will be top of mind for customers? And so this is the kind of thing that hospitals are beginning to plan out for the second half of the year. And as businesses try to open up, certain things were highly correlated to a surge in cases, such as cleaning supplies, for example, the obvious one, or pantry buying. So retailers are beginning to see what online stores are doing well: e-commerce, online purchases, electronic goods. And so everyone essentially started working from home, and so homes needed to have the same kind of bandwidth that offices and commercial enterprises needed to have. And so, a lot of interesting effects: on one side you saw airlines go away, on the other side you saw the likes of Zoom and video take off. So you're kind of seeing a real divide, the digital divide, and that's happening, and AI is here to play a very good role in figuring out how to enhance your profitability as you're planning out the next two years. >> Yeah, you know, and obviously these things, they get partisan, it gets political. I mean, our job as an industry is to report, your job is to help people understand, I mean, let the data inform and then let public policy, you know, fight it out. So who are some of the people that you're working with, you know, as a result of COVID-19? What's some of the work that H2O has done? I want to better understand what role you are playing. >> So one of the things, we're kind of privileged as a company to come into the crisis with a strong balance sheet and the ability to actually have the right kind of momentum behind the company in terms of great talent, and so we have 10% of the world's top data scientists, in the form of Kaggle Grand Masters, in the company.
And so we put most of them to work, and they started collecting data sets, curating data sets and making them more qualitative, picking up public data sources, for example, there's a tremendous amount of job loss out there, figuring out which are the more difficult kind of sectors in the economy and then we started looking at exodus from the cities, we're looking at mobility data that's publicly available, mobility data through the data exchanges, you're able to find which cities which rural areas, did the New Yorkers as they left the city, which places did they go to, and what's to say, Californians when they left Los Angeles, which are the new places they have settled in? These are the places which are now busy places for the same kind of items that you need to sell if you're a retailer, but if you go one step further, we started engaging with FEMA, we start engaging with the universities, like Imperial College London or Berkeley, and started figuring out how best to improve the models and automate them. The SaaS model, the most popular SaaS model, we added that into our Driverless AI product as a recipe and made that accessible to our customers in testing, to customers in healthcare who are trying to predict where the surge is likely to come. But it's mostly about information right? So the AI at the end of it is all about intelligence and being prepared. Predictive is all about being prepared and that's kind of what we did with general, lots of blogs, typical blog articles and working with the largest health organizations and starting to kind of inform them on the most stable models. What we found to our not so much surprise, is that the simplest, very interpretable models are actually the most widely usable, because historical data is actually no longer as effective. You need to build a model that you can quickly understand and retry again to the feedback loop of back testing that model against what really happened. >> Yeah, so I want to double down on that. 
So really, two things I want to understand, if you have visibility on it, sounds like you do. Just in terms of the surge and the comeback, you know, kind of what those models say, based upon, you know, we have some advanced information coming from the global market, for sure, but it seems like every situation is different. What's the data telling you? Just in terms of, okay, we're coming into the spring and the summer months, maybe it'll come down a little bit. Everybody says it... We fully expect it to come back in the fall, go back to college, don't go back to college. What is the data telling you at this point in time with an understanding that, you know, we're still iterating every day? >> Well, I think I mean, we're not epidemiologists, but at the same time, the science of it is a highly local response, very hyper local response to COVID-19 is what we've seen. Santa Clara, which is just a county, I mean, is different from San Francisco, right, sort of. So you beginning to see, like we saw in Brooklyn, it's very different, and Bronx, very different from Manhattan. So you're seeing a very, very local response to this disease, and I'm talking about US. You see the likes of Brazil, which we're worried about, has picked up quite a bit of cases now. I think the silver lining I would say is that China is up and running to a large degree, a large number of our user base there are back active, you can see the traffic patterns there. So two months after their last research cases, the business and economic activity is back and thriving. And so, you can kind of estimate from that, that this can be done where you can actually contain the rise of active cases and it will take masking of the entire community, masking and the healthy dose of increase in testing. 
One of our offices is in Prague, and Czech Republic has done an incredible job in trying to contain this and they've done essentially, masked everybody and as a result they're back thinking about opening offices, schools later this month. So I think that's a very, very local response, hyper local response, no one country and no one community is symmetrical with other ones and I think we have a unique situation where in United States you have a very, very highly connected world, highly connected economy and I think we have quite a problem on our hands on how to safeguard our economy while also safeguarding life. >> Yeah, so you can't just, you can't just take Norway and apply it or South Korea and apply it, every situation is different. And then I want to ask you about, you know, the economy in terms of, you know, how much can AI actually, you know, how can it work in this situation where you have, you know, for example, okay, so the Fed, yes, it started doing asset buys back in 2008 but still, very hard to predict, I mean, at this time of this interview you know, Stock Market up 900 points, very difficult to predict that but some event happens in the morning, somebody, you know, Powell says something positive and it goes crazy but just sort of even modeling out the V recovery, the W recovery, deep recession, the comeback. You have to have enough data, do you not? In order for AI to be reasonably accurate? How does it work? And how does at what pace can you iterate and improve on the models? >> So I think that's exactly where I would say, continuous modeling, instead of continuously learning continuous, that's where the vision of the world is headed towards, where data is coming, you build a model, and then you iterate, try it out and come back. That kind of rapid, continuous learning would probably be needed for all our models as opposed to the typical, I'm pushing a model to production once a year, or once every quarter. 
I think what we're beginning to see is the kind of where companies are beginning to kind of plan out. A lot of people lost their jobs in the last couple of months, right, sort of. And so up scaling and trying to kind of bring back these jobs back both into kind of, both from the manufacturing side, but also lost a lot of jobs in the transportation and the kind of the airlines slash hotel industries, right, sort of. So it's trying to now bring back the sense of confidence and will take a lot more kind of testing, a lot more masking, a lot more social empathy, I think well, some of the things that we are missing while we are socially distant, we know that we are so connected as a species, we need to kind of start having that empathy for we need to wear a mask, not for ourselves, but for our neighbors and people we may run into. And I think that kind of, the same kind of thinking has to kind of parade, before we can open up the economy in a big way. The data, I mean, we can do a lot of transfer learning, right, sort of there are new methods, like try to model it, similar to the 1918, where we had a second bump, or a lot of little bumps, and that's kind of where your W shaped pieces, but governments are trying very well in seeing stimulus dollars being pumped through banks. So some of the US case we're looking for banks is, which small medium business in especially, in unsecured lending, which business to lend to, (mumbles) there's so many applications that have come to banks across the world, it's not just in the US, and banks are caught up with the problem of which and what's growing the concern for this business to kind of, are they really accurate about the number of employees they are saying they have? Do then the next level problem or on forbearance and mortgage, that side of the things are coming up at some of these banks as well. 
So they're looking at which, what's one of the problems that one of our customers Wells Fargo, they have a question which branch to open, right, sort of that itself, it needs a different kind of modeling. So everything has become a very highly good segmented models, and so AI is absolutely not just a good to have, it has become a must have for most of our customers in how to go about their business. (mumbles) >> I want to talk a little bit about your business, you have been on a mission to democratize AI since the beginning, open source. Explain your business model, how you guys make money and then I want to help people understand basic theoretical comparisons and current affairs. >> Yeah, that's great. I think the last time we spoke, probably about at the Spark Summit. I think Dave and we were talking about Sparkling Water and H2O or open source platforms, which are premium platforms for democratizing machine learning and math at scale, and that's been a tremendous brand for us. Over the last couple of years, we have essentially built a platform called Driverless AI, which is a license software and that automates machine learning models, we took the best practices of all these data scientists, and combined them to essentially build recipes that allow people to build the best forecasting models, best fraud prevention models or the best recommendation engines, and so we started augmenting traditional data scientists with this automatic machine learning called AutoML, that essentially allows them to build models without necessarily having the same level of talent as these Greek Kaggle Grand Masters. And so that has democratized, allowed ordinary companies to start producing models of high caliber and high quality that would otherwise have been the pedigree of Google, Microsoft or Amazon or some of these top tier AI houses like Netflix and others. So what we've done is democratize not just the algorithms at the open source level. 
Now, we've made it easy for rapid adoption of AI across every branch inside a company, a large organization, but also across smaller organizations which don't have access to the same kind of talent. At the third level, what we've brought to market is the ability to augment data sets, especially public and private data sets, the alternative data sets that can increase the signal. That's where we've started working on a new platform called Q, again more licensed software. To give you an idea from a business model standpoint, the majority of our software sales is now coming from closed source software. So we've made that transition. We still make our open source widely accessible, we continue to improve it, and a large chunk of the teams are improving it and participating in building the communities, but from a business model standpoint, as of last year, 51% of our revenues are coming from closed source software, and that share is continuing to grow. >> And this is the point I wanted to get to. The open source model was, you know, Red Hat was the one company that succeeded wildly at it: put it out there open source, come up with a service, maintain the software, you've got to buy the subscription, okay, fine. And everybody thought you were going to do that; they thought that's what Databricks was going to do, and that changed. But I want to take two examples: Hortonworks, which kind of took the Red Hat model, and Cloudera, which does IP. Neither really lived up to the expectation, but now there seems to be a new breed, I mentioned you guys, Databricks, there are others, that seems to be working.
You with your licensed software model, Databricks with a managed service; it's becoming clear that there's got to be some level of IP that can be licensed in order to really thrive in the open source community, to be able to fund the committers that you have to put forth to open source. I wonder if you could give me your thoughts on that narrative. >> So on Driverless AI, which is the closed-source platform I mentioned, we opened up the layers in open source as recipes. For example, different companies build their zip codes differently, right, so for the domain-specific recipes, we put about 150 of them in open source, again on top of our Driverless AI platform. And the idea there is that open source is about freedom, right? It's not a philosophy, it's not a business model; it allows freedom for rapid adoption of a platform and complete democratization and commodification of a space. That allows a small company like ours to compete at the level of a SAS or a Google or a Microsoft, because you have the same level of voice as a very large company, and you're focused on using code as a community-building exercise as opposed to a business model, right? So that's the heart of open source: allowing that freedom for our end users and customers to innovate at the same level that a Silicon Valley company or one of these large tech giants builds software. It's really a maker culture, as opposed to a consumer culture, around software. Now, if you look at the Red Hat model, and the others who have tried to replicate it, the difficult part there was this: if the product is very good, customers are self-sufficient, and if it becomes a standard, then customers know how to use it.
If the product is crippled or difficult to use, then you put a lot of services around it, and that's where you saw the classic Hadoop companies get pulled into a lot of services, which is a reasonably difficult business to scale. What we chose instead was a great product that builds a fantastic brand around AI (we had one of the first .ai domains), and seeing thousands of companies that are not AI-first, and even more companies, adopting AI and talking about AI in a major way was possible because of open source. If we had chosen closed source, as many of our peers did, well, they all vanished. So open source is really about building the ecosystem and having the patience to build a company, which takes 10, 20 years. What the market expects, unfortunately, is a fast rise to become unicorns. In that race, you essentially sacrifice building a long ecosystem play. That's what we chose to do instead, and it took a little longer. Now, if you think about how you truly monetize open source, it takes a little longer, and it's a much more difficult sales machine to scale. Our open source business is actually a reasonably positive-EBITDA business, because it makes more money than we spend on it. But teaching sales teams how to sell open source, that's the rate-limiting step. And that, along with explaining to investors how open source is invested in as you get closer to the IPO markets, is why we chose to go into a licensed software model and scale that as a regular business. >> So I've said a few times, it's kind of ironic that this pandemic is hitting just as we're entering a new decade. We're exiting the many, many decades of Moore's law being the source of innovation, and now it's a combination of data, applying machine intelligence, and being able to scale with cloud.
Well, my question is, what should we expect out of AI this decade, if those are the three parts of the cocktail of innovation, if you will? Is it really just about automating businesses, giving them more agility, flexibility, etc.? Or should we expect more from AI this decade? >> Well, if you think about the decade starting in 2010, 2011, that was defined by "software is eating the world," right? And now you can say software is the world, right? Pretty much all companies are digital. And AI is eating software, right? (mumbling) A lot of cloud transitions are happening, and now happening at a much faster rate, but cloud and AI go together; AI is essentially one of the biggest drivers of cloud adoption for many of our customers. So in the enterprise world, you're seeing a rebuilding of a lot of fast, data-driven applications that use AI; instead of rule-based software, you're beginning to see pattern-driven, machine-learning-based software, and you're seeing that in spades. And of course, that is just the tip of the iceberg. AI has been with us for 100 years, and it's going to be ahead of us for another hundred years, right? It is fundamentally a math movement, and a math movement at the beginning of a century leads to 100 years of phenomenal discovery. So AI is essentially making discoveries faster. AI is producing entertainment, AI is producing music, AI is producing choreography; you're seeing AI in every walk of life: AI summarization of Zoom meetings, a lot of AI-enabled ETF picking of stocks. You're beginning to see it everywhere; we reprice 20,000 corporate bonds every 15 seconds using H2O AI.
And one of our customers is among the fastest growing stocks; AI is powering a lot of these insights in a fast-changing world which is globally connected. None of us is able to combine all the multiple dimensions that are changing, and AI has that incredible opportunity to be a partner for everyone... (mumbling) For a hospital looking at what the second half will look like, for physicians looking at what the sentiment is... What is the surge to expect? What is the market demand, looking at the sentiment of the customers? AI is the ultimate Moneyball in business, and I think it's just showing its depth at this point. >> Yeah, I think you're right on. Basically, AI is going to convert every piece of software, every application, or those tools aren't going to have much use. Sri, we've got to go, but thanks so much for coming on theCUBE, and for the great work you guys are doing. Really appreciate your insights. Stay safe, and best of luck to you guys. >> Likewise, thank you so much. >> Welcome, and thank you for watching, everybody. This is Dave Vellante for the CXO series on theCUBE. We'll see you next time. All right, we're clear. All right.
Sreesha Rao, Niagara Bottling & Seth Dobrin, IBM | Change The Game: Winning With AI 2018
>> Live from Times Square in New York City, it's theCUBE, covering IBM's Change the Game: Winning with AI. Brought to you by IBM. >> Welcome back to the Big Apple, everybody. I'm Dave Vellante, and you're watching theCUBE, the leader in live tech coverage, and we're here covering a special presentation of IBM's Change the Game: Winning with AI. IBM's got an analyst event going on here at the Westin today in the theater district. They've got 50-60 analysts here. They've got a partner summit going on, and then tonight, at Terminal 5 on the West Side Highway, they've got a customer event, a lot of customers there. We've talked earlier today about the hard news. Seth Dobrin is here. He's the Chief Data Officer of IBM Analytics, and he's joined by Sreesha Rao, who is the Senior Manager of IT Applications at California-based Niagara Bottling. Gentlemen, welcome to theCUBE. Thanks so much for coming on. >> Thank you, Dave. >> Well, thanks, Dave, for having us. >> Yes, always a pleasure, Seth. We've known each other for a while now. I think we met in the snowstorm in Boston that sparked something a couple years ago. >> Yep. When we were both trapped there. >> Yep, and at that time, we spent a lot of time talking about your internal role as the Chief Data Officer, working closely with Inderpal Bhandari, and what you guys are doing inside of IBM. I want to talk a little bit more about your other half, which is working with clients and the Data Science Elite Team, and we'll get into what you're doing with Niagara Bottling, but let's start there: in terms of that side of your role, give us the update. >> Yeah, like you said, we spent a lot of time talking about how IBM is implementing the CDO role.
While we were doing that internally, I spent quite a bit of time flying around the world, talking to our clients over the last 18 months since I joined IBM, and we found a consistent theme with all the clients, in that they needed help learning how to implement data science, AI, machine learning, whatever you want to call it, in their enterprise. There's a fundamental difference between doing these things at a university or as part of a Kaggle competition and doing them in an enterprise, so we felt really strongly that it was important for the future of IBM that all of our clients become successful at it, because what we don't want is for them, in two years, to go, "Oh my God, this whole data science thing was a scam. We haven't made any money from it." And it's not because the data science thing is a scam. It's because the way they're doing it is not conducive to business. So we set up this team we call the Data Science Elite Team, and what this team does is sit with clients around a specific use case for 30, 60, 90 days, really about 3 or 4 sprints, depending on the material, the client, and how long it takes, and we help them learn, through this use case, how to use Python, R, and Scala on our platform, obviously, because we're here to make money too, to implement these projects in their enterprise. Now, because it's built completely on open source, if they're not happy with what the product looks like, they can take their toys and go home afterwards. It's on us to prove the value as part of this, but there's a key point here: my team is not measured on sales. They're measured on adoption of AI in the enterprise, and that creates a different behavior for them. They're really about "make the enterprise successful," right, not "sell this software." >> Yeah, compensation drives behavior. >> Yeah, yeah. >> So, at this point, I ask, "Well, do you have any examples?" So Sreesha, let's turn to you.
(laughing softly) Niagara Bottling -- >> As a matter of fact, Dave, we do. (laughing) >> Yeah, so you're not a bank with a trillion dollars in assets under management. Tell us about Niagara Bottling and your role. >> Well, Niagara Bottling is the biggest private-label bottled water manufacturing company in the U.S. We make bottled water for Costcos, Walmarts, and major national grocery retailers. These are the customers we service, and as with all large customers, they're demanding, and we provide bottled water at relatively low cost and high quality. >> Yeah, so I used to have a CIO consultancy. We worked with every CIO up and down the East Coast, really got into a lot of organizations, and I always observed that it was really the heads of application that drove AI, because they were the glue between the business and IT, and that's really where you sit in the organization, right? >> Yes. My role is to support the business and business analytics, as well as some of the distribution technologies and planning technologies at Niagara Bottling. >> So take us through the project if you will. What were the drivers? What were the outcomes you envisioned? And we can kind of go through the case study. >> So the current project where we leveraged IBM's help was a stretch-wrapper project. We produce, obviously, cases of bottled water. These are stacked into pallets and then shrink wrapped or stretch wrapped with a stretch wrapper, and this project is to be able to save money by trying to optimize the amount of stretch wrap that goes around a pallet. We need to be able to maintain the structural stability of the pallet while it's transported from the manufacturing location to our customer's location, where it's unwrapped and the cases are used. >> And over breakfast we were talking. You guys produce 2833 bottles of water per second. >> Wow. (everyone laughs)
The manufacturing line is a high speed manufacturing line, and we have a lights-out policy where everything runs in an automated fashion with raw materials coming in from one end and the finished goods, pallets of water, going out. It's called pellets to pallets. Pellets of plastic coming in through one end and pallets of water going out through the other end. >> Are you sitting on top of an aquifer? Or are you guys using sort of some other techniques? >> Yes, in fact, we do bore wells and extract water from the aquifer. >> Okay, so the goal was to minimize the amount of material that you used but maintain its stability? Is that right? >> Yes, during transportation, yes. So if we use too much plastic, we're not optimally, I mean, we're wasting material, and cost goes up. We produce almost 16 million pallets of water every single year, so that's a lot of shrink wrap that goes around those, so what we can save in terms of maybe 15-20% of shrink wrap costs will amount to quite a bit. >> So, how does machine learning fit into all of this? >> So, machine learning is way to understand what kind of profile, if we can measure what is happening as we wrap the pallets, whether we are wrapping it too tight or by stretching it, that results in either a conservative way of wrapping the pallets or an aggressive way of wrapping the pallets. >> I.e. too much material, right? >> Too much material is conservative, and aggressive is too little material, and so we can achieve some savings if we were to alternate between the profiles. >> So, too little material means you lose product, right? >> Yes, and there's a risk of breakage, so essentially, while the pallet is being wrapped, if you are stretching it too much there's a breakage, and then it interrupts production, so we want to try and avoid that. We want a continuous production, at the same time, we want the pallet to be stable while saving material costs. 
>> Okay, so you're trying to find that ideal balance, and how much variability is in there? Is it a function of distance and how many touches it has? Maybe you can share that. >> Yes, so each pallet takes about 16-18 wraps of the stretch wrapper going around it, and that's how much material is laid out: about 250 grams of plastic. So we're trying to optimize the gram weight, which is the amount of plastic that goes around each pallet. >> So it's about predicting how much plastic is enough without having breakage and disrupting your line. So they had labeled data that was, "if we stretch it this much, it breaks. If we don't stretch it this much, it doesn't break," but then it was about predicting what's good enough, avoiding both of those extremes, right? >> Yes. >> So it's a truly predictive and iterative model that we've built with them. >> And you're obviously injecting data in terms of the trip to the store as well, right? You're taking that into consideration in the model, right? >> Yeah, that's mainly to make sure that the pallets are stable during transportation. >> Right. >> And that already determines how much containment force is required when you stretch and wrap each pallet. So that's one of the variables that is measured, but the input is the amount of material that is being used, in terms of gram weight. We are trying to minimize that. So that's what the whole machine learning exercise was. >> And the data comes from where? Is it observation, maybe instrumented? >> Yeah, the instruments. Our stretch-wrapper machines have an Ignition platform, which is a SCADA platform that allows us to measure all of these variables. We would be able to get machine variable information from those machines and then, hopefully, one day automate that process, so the feedback loop that says "On this profile, we've not had any breaks.
We can continue," or if there have been frequent breaks on a certain profile or machine setting, then we can change that dynamically as the product is moving through the manufacturing process. >> Yeah, so think of it as a kind of traditional manufacturing production-line optimization and prediction problem, right? It's minimizing waste while maximizing the output and throughput of the production line. When you optimize a production line, the first step is to predict what's going to go wrong, and then the next step is to add prescriptive optimization to say, "Using the constraints that the predictive models give us, how do we maximize the output of the production line?" This is not a unique situation. It's a unique material that we haven't really worked with, but they had some really good data on this material and how it behaves, and that's key. As you know, Dave, and probably most of the people watching this know, labeled data is the hardest part of doing machine learning, along with building features from that labeled data, and they had some great data for us to start with. >> Okay, so you're collecting data at the edge, essentially, then you're using that to feed the models, which are running, I don't know, where's it running, your data center? Your cloud? >> Yeah, in our data center, there's an instance of DSX Local. >> Okay. >> That we stood up. Most of the data is running through that. We build the models there. And then our goal is to be able to deploy to the edge, where we can complete the loop in terms of the feedback that happens. >> And iterate. (Sreesha nods) >> And DSX Local, is that Data Science Experience Local? >> Yes. >> Slash Watson Studio, so they're the same thing. >> Okay, now, what role did IBM and the Data Science Elite Team play? You could take us through that. >> So, as we discussed earlier, adopting data science is not that easy.
It requires understanding of data science itself, the tools and techniques, and IBM brought that as a part of the Data Science Elite Team. They brought both the tools and the expertise so that we could get on that journey towards AI. >> And it's not a "do the work for them." It's a "teach to fish," and so my team sat side by side with the Niagara Bottling team, and we walked them through the process, so it's not a consulting engagement in the traditional sense. It's how do we help them learn how to do it? So it's side by side with their team. Our team sat there and walked them through it. >> For how many weeks? >> We've had about two sprints already, and we're entering the third sprint. It's been about 30-45 days between sprints. >> And you have your own data science team. >> Yes. Our team is coming up to speed using this project. They've been trained but they needed help with people who have done this, been there, and have handled some of the challenges of modeling and data science. >> So it accelerates that time to --- >> Value. >> Outcome and value and is a knowledge transfer component -- >> Yes, absolutely. >> It's occurring now, and I guess it's ongoing, right? >> Yes. The engagement is unique in the sense that IBM's team came to our factory, understood what that process, the stretch-wrap process looks like so they had an understanding of the physical process and how it's modeled with the help of the variables and understand the data science modeling piece as well. Once they know both side of the equation, they can help put the physical problem and the digital equivalent together, and then be able to correlate why things are happening with the appropriate data that supports the behavior. >> Yeah and then the constraints of the one use case and up to 90 days, there's no charge for those two. Like I said, it's paramount that our clients like Niagara know how to do this successfully in their enterprise. >> It's a freebie? >> No, it's no charge. 
Free makes it sound too cheap. (everybody laughs) >> But it's obviously part of a broader arrangement with buying hardware and software, or whatever it is. >> Yeah, it's a strategy for us to help make sure our clients are successful, and I want to minimize the activation energy to do that, so there's no charge, and the only requirements from the client are that it's a real use case, that they at least match the resources I put on the ground, and that they sit with us, do things like this, and act as a reference and talk about the team, our offerings, and their experiences. >> So you've got to have skin in the game obviously, be an IBM customer. There's got to be some commitment to some kind of business relationship. How big was the collective team for each, if you will? >> So IBM had 2-3 data scientists. (Dave takes notes) Niagara matched that, 2-3 analysts. There were some working with the machines who were familiar with them, and others who were more familiar with the data acquisition and data modeling. >> So each of these engagements, they cost us about $250,000 all in, so they're quite an investment we're making in our clients. >> I bet. I mean, 2-3 people over many, many weeks of super-geek time. So you're bringing in hardcore data scientists, math wizzes, stats wizzes, data hackers, developers --- >> Data viz people, yeah, the whole stack. >> And the level of skills that Niagara has? >> We've got actual employees who are responsible for production, our manufacturing analysts, who help in troubleshooting problems. If there are breakages, they go analyze why that's happening. Now they have data to tell them what to do about it, and that's the whole journey we are on: trying to quantify with the help of data, and being able to connect our systems with data, systems and models that help us analyze what happened, why it happened, and what to do before it happens. >> Your team must love this because they're sort of elevating their skills.
They're working with rock star data scientists. >> Yes. >> And we've talked about this before. A point that was made here is that it's really important in these projects to have people acting as product owners, if you will, subject matter experts who are on the front line and do this every day, and not just for the subject matter expertise. I'm sure there are executives that understand it, but when you're done with the model, there's no better way to drive this cultural change of adopting these things than bringing it to the floor and having one of the peers they respect talk about it, instead of some guy or lady sitting up in the ivory tower saying "thou shalt." >> Now, you don't know the outcome yet. It's still early days, but you've got a model built that you've got confidence in, and then you can iterate that model. What's your expectation for the outcome? >> We're hoping that preliminary results help us get up the learning curve of data science and how to leverage data to be able to make decisions. So that's our idea. There are obviously optimal settings that we can use, but it's going to be a trial-and-error process. And through that, as we collect data, we can understand what settings are optimal and what we should be using in each of the plants. And if the plants decide, hey, they have a subjective preference for one profile versus another, with the data we are capturing we can measure when they deviated from what we specified. We have a lot of learning coming from the approach that we're taking. You can't control things if you don't measure them first. >> Well, your objectives are to transcend this one project and to do the same thing across. >> And to do the same thing across, yes. >> Essentially pay for it with a quick return. That's the way to do things these days, right? >> Yes.
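That quick return is easy to sanity-check from figures quoted earlier in the conversation: roughly 16 million pallets a year, about 250 grams of film per pallet, and a hoped-for 15-20% reduction. This is back-of-envelope arithmetic only; actual savings would depend on film prices and real production numbers:

```python
# Back-of-envelope film savings from figures quoted in the interview:
# ~16M pallets/year, ~250 g of stretch film per pallet, a 15-20% cut.
pallets_per_year = 16_000_000
grams_per_pallet = 250

film_tonnes = pallets_per_year * grams_per_pallet / 1_000_000  # grams -> metric tonnes
savings_low = 0.15 * film_tonnes
savings_high = 0.20 * film_tonnes

print(f"~{film_tonnes:,.0f} t of film per year; "
      f"a 15-20% cut saves roughly {savings_low:,.0f}-{savings_high:,.0f} t")
```

That works out to about 4,000 tonnes of film a year, so even the low end of the hoped-for reduction is hundreds of tonnes of plastic, which is why the project can pay for itself quickly.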
>> You've got narrower, small projects that'll give you a quick hit, and then you leverage that expertise across the organization to drive more value. >> Yes. >> Love it. What a great story, guys. Thanks so much for coming to theCUBE and sharing. >> Thank you. >> Congratulations. You must be really excited. >> It's a fun project. I appreciate it. >> Thanks for having us, Dave. I appreciate it. >> Pleasure, Seth. Always great talking to you, and keep it right there, everybody. You're watching theCUBE. We're live from New York City, here at the Westin Hotel. #cubenyc Check out ibm.com/winwithai and Change the Game: Winning with AI tonight. We'll be right back after a short break. (minimal upbeat music)
couple years ago | DATE | 0.87+ |
last 18 months | DATE | 0.87+ |
Westin Hotel | ORGANIZATION | 0.83+ |
pallet | QUANTITY | 0.83+ |
#cubenyc | LOCATION | 0.82+ |
2833 bottles of water per second | QUANTITY | 0.82+ |
the Game: Winning with AI | TITLE | 0.81+ |
Seth Dobrin, IBM | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined by, next, one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was back in the fall at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so at the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank and we spent a lot of time talking about governance. Still doing a lot with governance, especially with GDPR coming up. But we've really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. So it's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space.
And so this is set up from the perspective of we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data scientist on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do that all themselves. So I think it absolutely is a team sport. And it really takes a data engineer or someone with deep expertise in there, that also understands machine learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists because then you get a lot of data scientists, right? People who aren't really data scientists, and so if you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important because in most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental.
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today is what we've been doing with statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today I did during my PhD 20 years ago, right? It was just with a different perspective. And you applied those types of, they were more static, right? So I would build a model to predict something, and it was only for that. I really didn't apply it beyond that, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior, so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were doing years ago is all the same; we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that, but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry. >> I'm interested in it from a skillset standpoint. You were saying how you're advertising to bring on these roles. I was at the Women in Data Science Conference with theCUBE just a couple of days ago, and we hear so much excitement about the role of data scientists. It's so horizontal.
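An aside on Seth's point above that machine learning creates a self-fulfilling prophecy: because a deployed model shifts the very behavior it was trained on, production systems typically watch for that shift and trigger a refit. A minimal, stdlib-only sketch; the threshold and score values here are invented for illustration:

```python
from statistics import mean

def needs_retraining(training_scores, live_scores, tolerance=0.1):
    """Flag drift: the live score distribution has moved away from
    what the model saw at training time, so the model should be refit."""
    return abs(mean(live_scores) - mean(training_scores)) > tolerance

# Scores the model produced on its training data, versus scores seen
# in production after the model has been nudging user behavior.
trained_on = [0.42, 0.40, 0.44, 0.41]
observed_now = [0.60, 0.58, 0.63, 0.61]
print(needs_retraining(trained_on, observed_now))  # drift of about 0.19 exceeds 0.1
```

In practice the comparison would be a proper distribution test (KS statistic, population stability index) over model inputs as well as outputs, but the loop is the same: monitor, detect the shift, refit.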
People have the opportunity to make impact in policy change, healthcare, etc. So the hard skills, the soft skills, mathematician, what are some of the other elements that you would look for, or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician but someone that has communication skills, collaboration, empathy, what are some of those, openness, to not lead data down a certain, what do you see as the right mix there of a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise. A good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case, because any problem worth solving is not one model, any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down into discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that. Right? And so having these soft skills to be able to tease that out in a way, and really help people understand that their way of thinking about this may or may not be right. And doing that in a way that's not offensive. And there's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We've talked about it in the past; the clock's ticking, and in May the fines go into effect.
The relationship between data science, machine learning, GDPR, is it going to help us solve this problem? This is a nightmare for people. And many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before. How important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact GDPR has on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? So it really requires consent to be understandable and very explicit. And it allows people to be able to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. Traditional machine learning, that's not too much of a problem. You can look at the features and say these features, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable AI, or XAI, becomes really, really important. And there were some talks earlier today at Strata about how you apply machine learning, traditional machine learning, to interpret your deep learning or black box AI. So that's really going to be important, those two things, in terms of how they affect data science. >> Well, you mentioned the black box. I mean, do you think we'll ever resolve the black box challenge? Or is it really that people are just going to be comfortable that what happens inside the box, how you got to that decision is okay? >> So I'm inherently both cynical and optimistic.
(chuckles) But I think there's a lot of things that we looked at five years ago and said there's no way we'll ever be able to do them, that we can do today. And so while I don't know how we're going to get to be able to explain this black box with XAI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). But you know it's a dog. >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time, and thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update in the next few months from you. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE live from our event Big Data SV down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching, stick around, we'll be right back after a short break.
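On Seth's earlier point that with traditional machine learning "you can look at the features and say this contributed 20%, this contributed 50%": for a linear scoring model that readout is literally weight times value, normalized to a percentage. A stdlib-only sketch with made-up weights and features, not any particular production model:

```python
def contributions(weights, x):
    """Per-feature share of a linear model's score, as percentages
    of the total absolute contribution."""
    raw = {name: weights[name] * x[name] for name in weights}
    total = sum(abs(v) for v in raw.values())
    return {name: round(100 * abs(v) / total, 1) for name, v in raw.items()}

# Hypothetical features for a credit decision.
weights = {"income": 0.5, "debt_ratio": -1.2, "tenure_years": 0.3}
applicant = {"income": 1.0, "debt_ratio": 0.5, "tenure_years": 2.0}
print(contributions(weights, applicant))
# each feature's share of this applicant's score, summing to 100
```

Deep models offer no such direct readout, which is why the XAI work Seth mentions (surrogate models, attribution methods) exists at all.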
SUMMARY :
brought to you by SiliconANGLE Media Welcome back to theCUBE's continuing coverage Always fun being with you guys. Yeah, so last time you were chatting But really started to ramp up my team So it's all of our software franchise. What's the mixture of-- and other companies do data science in the enterprise. that you just mentioned? And to do that, you need decision optimization. So in the early days of when we, And so that's really the difference I think any good statistician People have the opportunity to make impact there's going to be a series of sprints to deliver that. in the time we have remaining. And so what does that mean when you build a model Or is it really that people are just going to be comfortable ever be able to do them that we can do today. I mean, somebody said to me the other day, But you know it's a dog. But you know it's a dog. I love that you just brought up dogs as we're ending. Good seeing you again. We want to thank you for watching theCUBE
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Seth | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
20% | QUANTITY | 0.99+ |
50% | QUANTITY | 0.99+ |
TD Bank | ORGANIZATION | 0.99+ |
San Jose | LOCATION | 0.99+ |
two people | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
IBM Analytics | ORGANIZATION | 0.99+ |
two things | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
one model | QUANTITY | 0.99+ |
five years | QUANTITY | 0.98+ |
20 years ago | DATE | 0.98+ |
Big Data SV | EVENT | 0.98+ |
five years ago | DATE | 0.98+ |
GDPR | TITLE | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
Strata Data Conference | EVENT | 0.97+ |
today | DATE | 0.97+ |
first statistics | QUANTITY | 0.95+ |
CUBE | ORGANIZATION | 0.94+ |
Women in Data Science Conference | EVENT | 0.94+ |
both | QUANTITY | 0.94+ |
Chief Data Officers Summit | EVENT | 0.93+ |
Big Data SV 2018 | EVENT | 0.93+ |
couple of days ago | DATE | 0.93+ |
years | DATE | 0.9+ |
Forager Eatery | ORGANIZATION | 0.9+ |
first | QUANTITY | 0.86+ |
Watson | TITLE | 0.86+ |
Officers Summit | EVENT | 0.74+ |
Data Officer | PERSON | 0.73+ |
SV | EVENT | 0.71+ |
President | PERSON | 0.68+ |
Strata | TITLE | 0.67+ |
Big Data | ORGANIZATION | 0.66+ |
earlier today | DATE | 0.65+ |
Silicon Valley | LOCATION | 0.64+ |
years | QUANTITY | 0.6+ |
Chief | EVENT | 0.44+ |
Kaggle | ORGANIZATION | 0.43+ |
Arik Pelkey, Pentaho - BigData SV 2017 - #BigDataSV - #theCUBE
>> Announcer: Live from San Jose, California, it's theCUBE covering Big Data Silicon Valley 2017. >> Welcome back, everyone. We're here live in Silicon Valley in San Jose for Big Data SV in conjunction with Strata + Hadoop. Three days of coverage here in Silicon Valley and Big Data. It's our eighth year covering Hadoop and the Hadoop ecosystem, now expanding beyond just Hadoop into AI, machine learning, IoT, and cloud computing; all this compute is really making it happen. I'm John Furrier with my co-host George Gilbert. Our next guest is Arik Pelkey, who is the senior director of product marketing at Pentaho, which we've covered many times, including their event at PentahoWorld. Thanks for joining us. >> Thank you for having me. >> So, in following you guys we've seen that Pentaho was once an independent company, bought by Hitachi, but still an independent group within Hitachi. >> That's right, very much so. >> Okay, so you guys have some news. Let's just jump into the news. You guys announced some of the machine learning. >> Exactly, yeah. So, Arik Pelkey, Pentaho. We are a data integration and analytics software company. You mentioned you've been doing this for eight years. We have been at Big Data for the past eight years as well. In fact, we were one of the first vendors to support Hadoop back in the day, so we've been along for the journey ever since then. What we're announcing today is really exciting. It's a set of machine learning orchestration capabilities, which allows data scientists, data engineers, and data analysts to really streamline their data science processes. Everything from ingesting new data sources through data preparation, feature engineering, which is where a lot of data scientists spend their time, through tuning their models, which can still be programmed in R, in Weka, in Python, and any other kind of data science tool of choice.
What we do is we help them deploy those models inside of Pentaho as a step inside of Pentaho, and then we help them update those models as time goes on. So, really what this is doing is it's streamlining. It's making them more productive so that they can focus their time on things like model building rather than data preparation and feature engineering. >> You know, it's interesting. The market is really active right now around machine learning, and even just last week at Google Next, which is their cloud event, they had made the acquisition of Kaggle, which is kind of an open data science community. You mentioned the three categories: data engineer, data science, data analyst. Almost on a progression, super geek to business facing, and there's different approaches. One of the comments from the CEO of Kaggle on the acquisition, which we wrote up at SiliconANGLE, was, and I found this fascinating, I want to get your commentary and reaction to it: he says the data science tools are as early as generations ago, meaning that all the advances in open source and tooling and software development are far along, but data science is still at that early stage and is going to get better. So, what's your reaction to that, because this is really the demand we're seeing: a lot of heavy lifting going on in the data science world, yet there's a lot of runway of more stuff to do. What is that more stuff? >> Right. Yeah, we're seeing the same thing. Last week I was at the Gartner Data and Analytics conference, and the take there from one of their lead machine learning analysts was that this is still really early days for data science software. So, there's a lot of Apache projects out there. There's a lot of other open source activity going on, but there are very few vendors that bring to the table an integrated kind of full platform approach to the data science workflow, and that's what we're bringing to market today.
Let me be clear, we're not trying to replace R, or Python, or MLlib, because those are the tools of the data scientists. They're not going anywhere. They spent eight years in their PhD program working with these tools. We're not trying to change that. >> They're fluent with those tools. >> Very much so. They're also spending a lot of time doing feature engineering. Some research reports say between 70 and 80% of their time. What we bring to the table is a visual drag and drop environment to do feature engineering in a much faster, more efficient way than before. So, there's a lot of different kind of disparate, siloed applications out there that all do interesting things on their own, but what we're doing is we're trying to bring all of those together. >> And the trends are: reduce the time it takes to do stuff, and take away some of those tasks that you can use machine learning for. What unique capabilities do you guys have? Talk about that for a minute, just what Pentaho is doing that's unique and added value to those guys. >> So, the big thing is I keep going back to the data preparation part. I mean, that's 80% of their time and that's still a really big challenge. There's other vendors out there that focus on just the data science kind of workflow, but where we're really unique is around being able to accommodate very complex data environments, and being able to onboard data. >> Give me an example of those environments. >> Geospatial data combined with data from your ERP or your CRM system, and all kinds of different formats. So, there might be 15 different data formats that need to be blended together and standardized before any of that can really happen. That's the complexity in the data. So, Pentaho, very consistent with everything else that we do outside of machine learning, is all about helping our customers solve those very complex data challenges before doing any kind of machine learning. One example is a customer called Caterpillar Marine Asset Intelligence.
So, they're doing predictive maintenance onboard container ships and on ferries. So, they're taking data from hundreds and hundreds of sensors onboard these ships, combining that kind of operational sensor data together with geospatial data, and then they're serving up predictive maintenance alerts, if you will, or giving signals when it's time to replace an engine or replace a compressor or something like that. >> Versus waiting for it to break. >> Versus waiting for it to break, exactly. That's one of the real differentiators: that very complex data environment. And then I was starting to move toward the other differentiator, which is our end to end platform, which allows customers to deliver these analytics in an embedded fashion. So, kind of full circle, being able to send that signal, but not to an operational system, which is sometimes a challenge because you might have to rewrite the code. Deploying models is a really big challenge; within Pentaho, because it is this fully integrated application, you can deploy the models within Pentaho and not have to jump out into a mainframe environment or something like that. So, I'd say the differentiators are very complex data environments, and then this end to end approach where deploying models is much easier than ever before. >> Perhaps, let's talk about alternatives that customers might see. You have a tool suite, and others might have to put together a suite of tools. Maybe tell us, the geeky version would be the impedance mismatch. You know, like the chasms you'd find between each tool where you have to glue them together, so what are some of those pitfalls? >> One of the challenges is, you have these data scientists working in silos oftentimes. You have data analysts working in silos, you might have data engineers working in silos. One of the big pitfalls is not really collaborating enough to the point where they can do all of this together. So, that's a really big area where we see pitfalls.
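Stepping back to the shipboard example above: stripped to its core, a "replace the compressor" signal is a statistic or model watching sensor streams against an operating envelope, instead of waiting for the failure. A toy sketch with invented vibration readings and limits, nothing like the actual Caterpillar system:

```python
from statistics import mean

def maintenance_alert(readings, window=3, limit=7.0):
    """Alert when the rolling average of a sensor stream exceeds
    its safe operating limit, before the component actually fails."""
    return mean(readings[-window:]) > limit

engine_vibration = [5.1, 5.3, 5.2, 6.8, 7.4, 7.9]  # trending upward
print(maintenance_alert(engine_vibration))  # rolling average of about 7.4 exceeds 7.0
```

A real deployment would blend hundreds of sensors with the geospatial context mentioned above and use a trained model rather than a fixed threshold, but the shape of the signal is the same.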
>> Is it binary not collaborating, or is it that the round trip takes so long that the quality or number of collaborations is so drastically reduced that the output is of lower quality? >> I think it's probably a little bit of both. I think they want to collaborate, but one person might sit in Dearborn, Michigan and the other person might sit in Silicon Valley, so there's just a location challenge as well. The other challenge is, some of the data analysts might sit in IT and some of the data scientists might sit in an analytics department somewhere, so it kind of cuts across both location and functional area too. >> So let me ask from the point of view of, you know, we've been doing these shows for a number of years and most people have their first data lakes up and running and their first maybe one or two use cases in production, very sophisticated customers have done more, but what seems to be clear is the highest value coming from those projects isn't to put a BI tool in front of them so much as to do advanced analytics on that data, apply those analytics to inform a decision, whether a person or a machine. >> That's exactly right. >> So, how do you help customers over that hump, and what are some other examples that you can share? >> Yeah, so speaking of transformative, I mean, that's what machine learning is all about. It helps companies transform their businesses. We like to talk about that at Pentaho. One customer kind of industry example that I'll share is a company called IMS. IMS is in the business of providing data and analytics to insurance companies so that the insurance companies can price insurance policies based on usage. So, it's a usage model. So, IMS has a technology platform where they put sensors in a car, and then, using your mobile phone, can track your driving behavior. Then, your insurance premium that month reflects the driving behavior that you had during that month.
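The usage-based pricing Arik describes, a premium that reflects that month's driving, reduces to a base rate adjusted by behavior the sensors observed. All rates and surcharges below are invented for illustration; IMS's actual pricing model is obviously far richer:

```python
def monthly_premium(base, hard_brakes, speeding_events,
                    brake_fee=2.0, speed_fee=3.5, cap=1.5):
    """Usage-based premium: base rate plus per-event surcharges from
    that month's telemetry, capped at a multiple of the base."""
    surcharge = hard_brakes * brake_fee + speeding_events * speed_fee
    return min(base + surcharge, base * cap)

# A month with 4 hard-braking and 2 speeding events on the sensor feed.
print(monthly_premium(100.0, hard_brakes=4, speeding_events=2))  # 100 + 8 + 7 = 115.0
```

The cap keeps a single bad month from producing an unbounded bill, a design choice any real usage-based product would need in some form.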
In terms of transformative, this is completely upending the insurance industry, which has always had a very fixed approach to pricing risk. Now, they understand everything about your behavior. You know, are you turning too fast? Are you braking too fast? And they're taking it further than that too. They're able to now do kind of a retroactive look at an accident. So, after an accident, they can go back and kind of decompose what happened in the accident and determine whether or not it was your fault or was in fact the ice on the street. So, transformative? I mean, this is just changing things in a really big way. >> I want to get your thoughts on this. I'm just looking at some of the research. You know, we always have the good data but there's also other data out there. In your news, 92% of organizations plan to deploy more predictive analytics, however 50% of organizations have difficulty integrating predictive analytics into their information architecture, which is what the research shows. So my question to you is, there's a huge gap between the technology landscapes of front end BI tools and then complex data integration tools. That seems to be the sweet spot where the value's created. So, you have the demand, and then front end BI's kind of sexy and cool. Wow, I could power my business, but the complexity is really hard in the backend. Who's accessing it? What are the data sources? What's the governance? All these things are complicated, so how do you guys reconcile the front end BI tools and the backend complexity integrations? >> Our story from the beginning has always been this one integrated platform, both for complex data integration challenges together with visualizations, and that's very similar to what this announcement is all about for the data science market. We're very much in line with that. >> So, it's the cart before the horse? Is it like the BI tools are really driven by the data? I mean, it makes sense that the data has to be key.
Front end BI could be easy if you have one data set. >> It's funny you say that. I presented at the Gartner conference last week and my topic was, this just in: it's not about analytics. Kind of in jest, but it drew a really big crowd. So, it's about the data, right? It's about solving the data problem before you solve the analytics problem, whether it's a simple visualization or it's a complex fraud machine learning problem. It's about solving the data problem first. To that quote, I think one of the things that they were referencing was the challenging information architectures into which companies are trying to deploy models, and so part of that is when you build a machine learning model, you use R and Python and all these other ones we're familiar with. In order to deploy that into a mainframe environment, someone has to then recode it in C++ or COBOL or something else. That can take a really long time. With our integrated approach, once you've done the feature engineering and the data preparation using our drag and drop environment, what's really interesting is that you're like 90% of the way there in terms of making that model production ready. So, you don't have to go back and change all that code, it's already there because you used it in Pentaho. >> So obviously for those two technology groups I just mentioned, I think you had a good story there, but it creates problems. You've got product gaps, you've got organizational gaps, you have process gaps between the two. Are you guys going to solve that, or are you currently solving that today? There's a lot of little questions in there, but that seems to be the disconnect. You know, I can do this, I can do that, do I do them together? >> I mean, sticking to my story of one integrated approach to being able to do the entire data science workflow, from beginning to end, and that's where we've really excelled.
To the extent that more and more data engineers and data analysts and data scientists can get on this one platform, even if they're using R and Weka and Python. >> You guys want to close those gaps down, that's what you guys are doing, right? >> We want to make the process more collaborative and more efficient. >> So Dave Vellante has a question on CrowdChat for you. Dave Vellante was in the snowstorm in Boston. Dave, good to see you, hope you're doing well shoveling out the driveway. Thanks for coming in digitally. His question is: HDS has been known for mainframes and storage, but Hitachi is an industrial giant. How is Pentaho leveraging Hitachi's IoT chops? >> Great question, thanks for asking. Hitachi acquired Pentaho about two years ago; this is before my time. I've been with Pentaho about ten months. One of the reasons that they acquired Pentaho is because of a platform that they've announced called Lumada, which is their IoT platform, so what Pentaho is, is the analytics engine that drives that IoT platform, Lumada. So, Lumada is about solving more of the hardware sensor side, bringing data from the edge into being able to do the analytics. So, it's an incredibly great partnership between Lumada and Pentaho. >> Makes an internal customer too. >> It's a 90 billion dollar conglomerate, so yeah, the acquisition's been great and we're still very much an independent company going to market on our own, but we now have a much larger channel through Hitachi's reps around the world. >> You've got IoT's use case right there in front of you. >> Exactly. >> But you are leveraging it big time, that's what you're saying? >> Oh yeah, absolutely. We're a very big part of their IoT strategy. It's the analytics. Both of the examples that I shared with you are in fact IoT, not by design but because there's a lot of demand. >> You guys seeing a lot of IoT right now? >> Oh yeah.
We're seeing a lot of companies coming to us who have just hired a director or vice president of IoT to go out and figure out the IoT strategy. A lot of these are manufacturing companies or coming from industries that are inefficient. >> Digitizing the business model. >> So, the other point about Hitachi that I'll make, as it relates to data science, is that as a 90 billion dollar manufacturing and otherwise giant, we have a very deep bench of PhD data scientists that we can go to when there are very complex data science problems to solve at customer sites. So, if a customer's struggling with some of the basics of how to get up and running doing machine learning, we can bring our bench of data scientists at Hitachi to bear in those engagements, and that's a really big differentiator for us. >> Just to be clear, and one last point, you've talked about how you handle the entire life cycle of modeling, from acquiring the data and prepping it all the way through to building a model, deploying it, and updating it, which is a continuous process. I think as we've talked about before, data scientists, or just the DevOps community, has had trouble operationalizing the end of the model life cycle, where you deploy it and update it. Tell us how Pentaho helps with that. >> Yeah, it's a really big problem and it's a very simple solution inside of Pentaho. It's basically a step inside of Pentaho. So, in the case of fraud, let's say for example, a prediction might say fraud, not fraud, fraud, not fraud, whatever it is. We can then bring that kind of full lifecycle back into the data workflow at the beginning. It's a simple drag and drop step inside of Pentaho to say which were right and which were wrong, and feed that back into the next prediction. We could also take it one step further, where there has to be a manual part of this too, where it goes to the customer service center, they investigate and they say yes fraud, no fraud, and that then gets funneled back into the next prediction.
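The closed loop Arik just walked through, verified fraud or not-fraud outcomes flowing back into the next training run, is simple to sketch. The record shape and labels here are invented for illustration, not Pentaho's actual step:

```python
def feed_back(training_set, scored_cases):
    """Append investigated cases, with their human-verified labels,
    to the training set used for the next model retraining run."""
    for case in scored_cases:
        if case["verified_label"] is not None:  # investigated by a human
            training_set.append((case["features"], case["verified_label"]))
    return training_set

scored = [
    {"features": [120.0, 3], "predicted": "fraud", "verified_label": "fraud"},
    {"features": [40.0, 1], "predicted": "fraud", "verified_label": "not fraud"},
    {"features": [55.0, 2], "predicted": "not fraud", "verified_label": None},
]
training = feed_back([], scored)
print(len(training))  # prints 2: only the investigated cases are added
```

Uninvestigated cases stay out of the training set rather than recycling the model's own unverified guesses, which would otherwise reinforce its mistakes.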
So yeah, it's a big challenge and it's something that's relatively easy for us to do just as part of the data science workflow inside of Pentaho. >> Well Arik, thanks for coming on theCUBE. We really appreciate it, good luck with the rest of the week here. >> Yeah, very exciting. Thank you for having me. >> You're watching theCUBE here live in Silicon Valley covering Strata Hadoop, and of course our Big Data SV event; we also have a companion event called Big Data NYC. We program with O'Reilly Strata Hadoop, and of course have been covering Hadoop really since it was founded. This is theCUBE, I'm John Furrier, with George Gilbert. We'll be back with more live coverage today for the next three days here inside theCUBE after this short break.
Wrap - Google Next 2017 - #GoogleNext17 - #theCUBE
>> Narrator: Live from Silicon Valley, it's theCUBE, covering Google Cloud, Next 17. >> Hey, welcome back everyone. We're here live in the Palo Alto studios of SiliconANGLE Media, theCUBE's new 4,400 square foot studio, here in our sports center. I'm here with Stu Miniman, analyst at Wikibon on the team. He was at the event all day today, drove down to Palo Alto to give us the latest in-person updates; and for the past two days, Stu has been at the Analyst Summit, which is Google Cloud's first analyst summit. And Stu, we're going to break down day one, in the books. Certainly, people are starting to head out to the after-meetups, parties, dinners, and festivities. 10,000 people came to the Google Annual Cloud Next Conference. A lot of customer conversations, not a lot of technology announcements, Stu. But we got another day tomorrow. >> John, first of all, congrats on the studio here. I mean, it's really exciting. I remember the first time I met you in Palo Alto, there was the corner in ColoSpace-- >> Cloud Air. >> A couple doors down for fries, at the (mumbles) And look at this space. Gorgeous studio. Excited to be here. Happy to do a couple videos. And I'll be in here all day tomorrow, helping to break down. >> Well, Stu, this first allows us to, one, do a lot more coverage. Obviously, Google Next, you saw, was literally a blockbuster, as Diane Greene said. People were around the block, lines to get in, mass hysteria, chaos. They really couldn't scale the event, which is ironic given Google's scale; they nailed scaling software, but scaling an event, no room for theCUBE. But we're pumping out videos. We did, what? 13 today. We'll do a lot more tomorrow, and get more now. So you're going to be coming in as well. But also, we had on-the-ground coverage, cause we had phone call-ins from Akash Agarwal from SAP. We had an exclusive video with Sam Yen, who was breaking down the SAP strategic announcement with Google Cloud.
And of course, we have a post going on siliconangle.com. A lot of videos up on youtube.com/siliconangle. Great commentary. And really the goal was to continue our coverage, at SiliconANGLE, theCUBE, Wikibon, in the Cloud. Obviously, we've been covering the Cloud since it's really been around. I've been covering Google since it was founded. So we have a lot of history, a lot of inside baseball, certainly here in Palo Alto, where Larry Page lives in the neighborhood, friends at Google Earth. So the utmost respect for Google. But really, I mean, come on. The story, you can't put lipstick on a pig. Amazon is crushing them. And there's just no debate about that. And people trying to put that out there, wrote a post this morning, to actually try to illustrate that point. You really can't compare Google Cloud to AWS, because it's just two different animals, Stu. And my point was, "Okay, you want to compare them? "Let's compare them." And we're well briefed on the Cloud players, and you guys have the studies coming out of Wikibon. So there it is. And my post pretty much sums up the truth, which is, Google's really serious about the enterprise. They're making steps, there's some holes, there's some potential fatal flaws in how they allow customers to park their data. They have some architectural differences. But Stu, it's really a different animal. I mean, it's apples and oranges in the Cloud. I don't think it's worth complaining, because certainly Amazon has the lead. But you have Microsoft, you have Google, you have Oracle, IBM, SAP, they're all kind of in this cluster, what I call "NASCAR Formation", where they're all kind of jockeying around, some getting ahead. And it really is a race to get the table-stakes features done. And really, truly being a serious contender for the enterprise. So you can be serious about the enterprise, and say, "Hey, I'm serious about the enterprise." But being a serious winner and leader, those are two different ball games.
>> And a lot to kind of break down here, John. Because first of all, some of the (mumbles) challenges, absolutely, they scaled that event really big. And kudos to them, 10,000 people, a lot of these things came together last minute. They treated the press and analysts really well. We got to sit up front. They had some good sessions. You just tweeted out, Diane Greene, in the analyst session, and in the Q&A after, absolutely nailed it. I mean, she is an icon in the industry. She's brilliant, really impressive. And she's been pulling together a great team of people that understand the enterprise. But who Google is going after, and how they compete against some of the other guys, is really interesting to parse. Because some people were saying in the keynote, "We heard more about G Suite "than we heard about some of the Cloud features." Some of that is because they're going to do the announcements tomorrow. And you keep hearing all this G Suite stuff, and it makes me think of Microsoft, not Amazon. It makes me think of Office 365. And we've been hearing out of Amazon recently, they're trying to go after some of those business productivity applications. They're trying to go there where Microsoft is embedded. We know everybody wants to go after companies like IBM and Oracle, and their applications. Because Google has some applications, but really, their strength has been on the data. The machine learning and AI stuff was really interesting. Dr. Fei-Fei Li from Stanford, really good piece in the keynote there, when they hired her not that long ago. The community really perked up, and it's really interesting. And everybody seems to think that this could be the secret weapon for Google. I actually asked them, in some of the one-on-ones, "Is this the entry point? "Are most people coming for this piece, "when it's around these data challenges and the analytics, "and coming to Google?" And they're like, "Well, it's part of it. "But no, we have a broad play."
Everything from devices through G Suite. And last year, when they did the show, it was all the Cloud. And this year, it's kind of the full enterprise suite that they're pulling in. So there's some of that sorting out the messaging, and how do you pull all of these pieces together? As you know, when you've got a portfolio, it's like, "Oh well, I got to have a customer for G Suite." And then when the customer's up there talking about G Suite for a while, it's like, "Wait, it's--" >> Wait a minute. Is this a software show? >> "What's going on?" >> Is this a SaaS show? Is this a workplace productivity show? Or is this a Cloud show? Again, this is what my issue is. First of all, the insight is very clear. When you start seeing G Suite, that means that they've got something else that they are either hiding or waiting to announce. But the key thing though, is that they had customers. That was one important thing. I pointed out in my blog post. To me, when I'm looking for its competitive wins, I want to parse out the G Suite, because it's easy just to lay that on; Microsoft does it with Office 365, Oracle does it with their stuff. And it does kind of make the numbers fuzzy a little bit. But ultimately, where's the beef on infrastructure as a service, and platform as a service? >> And John, good customers out there, Disney, Colgate, SAP as a partner, HSBC, eBay, Home Depot, which was a big announcement with Pivotal last year, and Verizon were there. So these are companies, we all know them. Diane Greene was joking, "Disney is going to bring their magic onto our magic. "And make that work." So real enterprise use cases. They seem to have a good push around developers. They just acquired Kaggle, which is working in some of that space. >> Apigee. >> Yeah, Apigee-- >> I think Apigee's an API company, come on. What does that relate to? It has nothing to do with the enterprise. It's an API management solution. Okay, yes.
I guess it fits the stack for Cloud-Native, and for developers. I get that. But this show has to nail the enterprise, Stu. >> And John, you remember back four years ago, when we went to the re:Invent show for the first time, and it was like, they're talking to all the developers, and they haven't gotten to the enterprise. And then they over-pivoted to enterprise. And I listened to the customers that were talking in the keynote today, and I said, "You know, they're talking digital transformation, "but it's not like GE and Nike getting up on stage, "being like, "'We're going to be a software company, "'and we're hiring lots--'" >> John: Moving our data center over. >> They were pulling all of that stuff over, and it's like, "Oh yeah, Google's a good partner. "And we're using them--" >> But to be fair, Stu. Let's be fair, for a second. First of all, let's break down the keynotes. And then we'll get to some of the things about being fair. And I think, one, people should be fair to Diane Greene, because I think that the press and the coverage of it, looking at the media coverage, is weak. And I'll tell you why it's weak. Cause everyone has the same story as, "Oh, Google's finally serious about Cloud. "That's old news. "Diane Greene from day one says "we're serious with the Cloud." That's not the story. The story is, can they be a serious contender? That's number one. On the keynote, one, customer traction, I saw that, the slide up there. Yeah, the G Suite in there, but at least they're talking customers. Number two, the SAP news was strategic for Google. SAP now has Google Cloud platform, I mean, Google Cloud support for HANA, and also the SAP Cloud platform. And three, the Chief Data Scientist from AIG that they pointed to. To me, those were the three highlights of the keynote. Each one, thematically, represents at least a positive direction for Google, big time, which is, one, customer adoption, the customer focus. Two, partnerships with SAP, and they had Disney up there.
And then three, the real game changer, which is, can they change the game with AI and machine learning; TensorFlow has a ton of traction. Intel Xeon chips now are optimized with TensorFlow. This is Google. >> TensorFlow, Kubernetes, it's really interesting. And it's interesting, John, I think if the media listened to Eric Schmidt at the end, he was talking straight to them. He's like, "Look, bullet one. "17 years ago, I told Google that "this is where we need to go. "Bullet two, 30 billion dollars "I'm investing in infrastructure. "And yes, it's real, "cause I had to sign off on all of this money." And we've all been saying for a while, "Is this another beta from Google? "Is it serious? "There's no ad revenue, what is this?" And Diane Greene, in the Q&A afterwards, somebody talked about, "Perpetual beta seems to be Google." And she's like, "Look, I want to differentiate. "We are not the consumer business. "The consumer business might kill something. "They might change something. "We're positioning "this as a Cloud that the enterprise can build on. "We will not deprecate something. "We'll support today. "We'll support the old version. "We will support you going forward." Big push for channel, go-to-market service and support, because they understand that that-- >> Yeah, but that's weak. >> For those of us that used Google for years, understand that-- >> There's no support. >> "Where do I call for Google?" Come on, no. >> Yeah, but they're very weak on that. And we broke that down with Tom Kemp earlier, from Centrify, where Google's play is very weak on the sales and marketing side. Yeah, I get the service piece. But go to Diane Greene for a second, she is an incredible, savvy enterprise executive. She knows Cloud. She moved from server to virtualization. And now she can move virtualization to Cloud. That is her playbook. And I think she's well suited to do that.
And I think anyone who rushes to judgment on her keynote, given the teleprompter failure, I think is a little bit overstepping their bounds on that. I think it's fair to say that she knows what she's doing. But she can only go as fast as they can go. And that is, you can't just hope that you're further along. The reality is, it takes time. Security and data are the key points. On your point you just mentioned, that's interesting. Because now the war goes on. Okay, Kubernetes, the microservices, some of the things going on in the applications side, as trends like Serverless come on, Stu, where you're looking at the containerization trend that's now gone to Kubernetes. This is the battleground. This is the ground that we've been at Dockercon, we've been at Linux, CNCF has got huge traction, the Cloud Native Computing Foundation. This is key. Now, that being said. The marketplace never panned out, Stu. And I wanted to get your analysis on this, cause you cover this. A few years ago, the world was like, "Oh, I want to be like Facebook." We've heard, "the Uber of this, and the Airbnb of that." Here's the thing. Name one company that is the Facebook of their industry. It's not happening. There is no other Facebook, and there is no other Google. So "run like Google" is just a good idea in principle, horizontally scalable, having all the software. But no one is like Google. No one is like Facebook, in the enterprise. So I think that Google's got to downclock their messaging. I won't say dumb down, maybe I'll just say, slow it down a little bit for the enterprise, because they care about different things. They care more about SLAs than pricing. They care more about data sovereignty than the most epic architecture for data. What's your analysis? >> John, some really good points there. So there's a lot of technology, where it's like, "This is really cool." And Google is the biggest of it. Remember that software-defined networking we spent years talking about?
Well, the first big company we heard about was Google, and they got up on stage: "We're the largest SDN deployer in the world on that." And it's like, "Great. "So if you're the enterprise, "don't deploy SDN, go to somebody else "that can deliver it for you. "If that's Google, that's great." Dockercon, the first year they had it, in 2014, Google got up there, talked about how they were using containers, and how they spin up and spin down two billion containers in a week. Now, nobody else needs to spin up two billion containers a week, and spin them down. But they learned from that. They built Kubernetes-- >> Well, I think that's a good leadership position. But it's a leadership position to show that you've got the mojo, which again, this is again, what I like about Google's strategy is, they're going to play the technology card. I think that's a good card to play. But there are some just table stakes they've got to nail. One is the certifications, the security, the data. But also, the sales motions. Going into the enterprise takes time. And our advice to Diane Greene was, "Don't screw up the gold Google culture. "Keep that technology leadership. "And buy somebody, "buy a company that's got a full-blown sales force." >> But John, one of the critiques of Google has always been, everything they create, they create like it's for Google, and it's too Googley. I talked to a couple of friends that have known AWS for a while, and when they're trying to do Google, they're like, "Boy, this is a lot tougher. "It's not as easy as what we're doing." Google says that they want to do a lot of simplicity. You touched on pricing, it's like, "Oh, we're going to make pricing "so much easier than what Amazon's doing." Amazon Reserved Instances is something that I hear a lot of negative feedback in the community on, and Google's like, "It's much simpler." But when I've talked to some people that have been using it, it's like, "Well, generally it should be cheaper, "and it should be easier.
"But it's not as predictable. "And therefore, it's not speaking to what "the CFO needs to have. "I can't be getting a rebate sometime down the road. "Based on some advanced math, "I need to know what I'm going to be getting, "and how I'm going to be using it." >> And that's a good point, Stu. And this comes down to the consumability of the Cloud. I think what Amazon has done well, and this came out of many interviews today, but it was highlighted by Val Bercovici, who pointed out that Amazon has made their service consumable by the enterprise. I think that's important. Google needs to start thinking about how enterprises want to consume Cloud, and hit those points. The other thing that Val and I teased out was kind of some new ground, and he coined the term, or used the term, maybe he coined it, I'm not sure: empathy. Enterprise empathy. Google has developer empathy, they understand the developer community. They're rock solid on open source. Obviously, their mojo's phenomenal on technology, AI, et cetera, TensorFlow, all that stuff's great. Empathy for the enterprise, not there. And I think that's something that they're going to have to work on. And again, that's just evolution. You mentioned Amazon, our first event, developer, developer, developer. Me and Pat Gelsinger once called it the developer Cloud. Now they're truly the enterprise Cloud. It took three years for Amazon to do that. So you just can't jump to a trajectory. There's a huge amount of diseconomies of scale, Stu, to try and just be an enterprise player overnight, because, "We're Google." That's just not going to fly. And whether it's sales motions, pricing and support, security, this is hard. >> And sorting out that go-to-market is going to take years. You see a lot of the big SIs there. PwC, everywhere at the show. Accenture, big push at the show. We saw that a year or two ago, at the Amazon show. I talked to some friends in the channel, and they're like, "Yeah, Google's still got work to do.
"They're not there." Look, Amazon has work to do on the go-to-market, and Google is still a couple-- >> I mean, Amazon's no spring chicken here. They're quietly, slowly, ramping up. But they're not in a good position with their sales force; it needs to be where they want it to be. Let's talk about technology now. So tomorrow we're expecting to see a bunch of stuff. And one area that I'm super excited about with Google, is if they can have their identity solidified in the mind of the enterprise, make their product consumable, change or adjust or buy a sales force that could go out and actually sell to the enterprise, that's going to be key. But you're going to hear some cool trends that I like. And if you look at the TensorFlow, and the relationship, Intel, we're going to see Intel on stage tomorrow, coming out during one of the keynotes. And you're going to start to see the Xeon chip come out. And now you're starting to see the silicon piece. And this has been a data center nuisance, Stu. As we talked about with James Hamilton at Amazon, where having the hardware optimized for the software really is the key. And what Intel's doing with Xeon, and we talked to some other people today about it, is that the Cloud is like an operating system, it's a global computer, if you want to look at it that way. It's a mainframe, the software mainframe, as it's been called. You want a diversity of chipsets, from a two-core Atom to a 72-core Xeon. And have them being used in certain cases, whether it's programmable silicon, or whether it's GPUs, having these things in use case scenarios, where the chips can accelerate the software evolution, to me is going to be the key, state-of-the-art innovation. I think if Intel continues to get that right, companies like Google are going to crush it. Now, Amazon, they do their own. So this is going to be another interesting dynamic.
>> Yeah, it was actually one of the differentiating points Google's making, it's like, "Hey, you can get the Intel Skylake chip "on Google Cloud, "probably six months before you're going to be able to "just call up your favorite OEM of choice, "and get that in there." And it's an interesting move. Because we've been covering for years, John, Google does a ton of servers. And they don't just do Intel, they've been heavily involved in the openPOWER movement, they're looking at alternatives, they're looking at low power, they're looking at it from their device standpoint. They understand how to develop to all these pieces. They actually gave them to the influencers, the press, the analysts; just like at Amazon, where we all walked home with an Echo Dot, everybody's walking home with a Google Home. >> John: Did you get one? >> I did get one, disclaimer. Yeah, I got one. I'll be playing with it at home. I figured I could have Alexa and Google talking to each other. >> Is it an evaluation unit? You have to give it back, or do you get to keep it? >> No, I'm pretty sure they just let us keep that. >> John: Tainted. >> But what I'm interested to see, John, is, we talk about Serverless; I saw a ton of companies that were playing with Alexa at re:Invent, and they've been creating tons of skills. Lambda currently has the leadership out there. Google leverages Serverless in a lot of their architecture, it's what drives a lot of their analytics on the inside. Coming into the show, Google Cloud Functions is alpha. So we expect them to move that forward, but we will see when the announcements come tomorrow. But you would think they'd try to sustain that leadership there. I actually got a statement from one of the guys that works on Serverless, and Google believes that for functions, that whole Serverless, to really go where it needs to be, it needs to be open. Google isn't open sourcing anything this week, as far as I know.
But they want to be able to move forward-- >> And they're doing great at open source. And I think one of the things is not to rush to judgment on Google, and no one should, by the way. I mean, certainly, we put out our analysis, and we stick by that, because we know the enterprise pretty well, very well actually. So the thing that I like is that there are new use cases coming out. And we had someone who came on theCUBE here, Tarun Thakur, who's with Datos, datos.io. They're reimagining data backup and recovery in the Cloud. And when you factor in IoT, this is a paradigm shift. So I think we're going to see use cases, and this is a Google opportunity, where they can actually move the goal post a bit on the market, by enabling these new use cases, whether it's something that might seem pedestrian, like backup and recovery; reimagining that is huge. That's going to make an impact on the data domains of the world, and what not, that (mumbles). These new use cases are going to evolve. And so I'm excited by that. But the key thing that came out of this, Stu, and this is where I want to get your reaction, is Multicloud. Clearly the messaging in the industry, over the course of events that we've been covering, and highlighted today on Google Next is, Multicloud is the world we are living in. Now, you can argue that we're all in Amazon's world, but as we start developing, you're starting to see the emergence of Cloud service providers. Cloud service providers are going to have some tiering, certainly the big ones, and then you're going to have secondary, partner-like service providers. And Google putting G Suite in the mix, and Office 365 from Microsoft, and Oracle putting their apps in their Cloud stuff, highlights that the SaaS market is going to be very relevant. If that's the case, then why aren't we putting Salesforce in there, Adobe? They all got Clouds too.
So if you believe that there's going to be specialism around Clouds, that opens up the notion that there'll be a series of Multicloud architectures. So, Stu-- >> Stu: Yeah so, I mean, John, first of all-- >> BS? Real? I mean what's going on? >> Cloud is this big broad term. From Wikibon's research standpoint, SaaS, today, is two-thirds of the public Cloud market. We spend a lot of time talking-- >> In revenue? >> In revenue. Revenue standpoint. So, absolutely, Salesforce, Oracle, Infor, Microsoft, all up there, big dollars. If we look at the much smaller part of the world, that is infrastructure as a service, that's where we're spending a lot of time-- >> And platform as a service, which Gartner kind of bundles in, that's how Gartner looks at it. >> It's interesting. This year, we're saying PaaS as a category goes away. It's either SaaS plus, I'm sorry, it's SaaS minus, or infrastructure plus. So look at what Salesforce did with Heroku. Look at what companies like ServiceNow are doing. Yes, there are solutions-- >> Why is PaaS going away? What's the thesis? What's the premise of that for Wikibon research? >> If we look at PaaS, the idea was it was tied to languages, things like portability. There are other tools and solutions that are going to be able to help there. Look at, Docker came out of a PaaS company, dotCloud. There's a really good article from one of the Docker guys talking about the history of this, and you and I are going to be at Dockercon. John, from what I hear, we're going to be spending a lot of time talking about Kubernetes, at Dockercon. OpenStack Summit is going to be talking a lot about-- >> By the way, Kubernetes originated at Google. Another cool thing from Google. >> All right, so the PaaS as a market, even if you talk to the Cloud Foundry people, the OpenShift people. The term we heard a year ago was "PaaS is passé," the nice pithy line.
So it really feeds into, because, just some of these categorizations are what we, as industry watchers, have put in there. When you talk to Google, it's like, "Well, why are they talking about G Suite, "and Google Cloud, and even some of their pieces?" They're like, "Well, this is our bundle "that we put together." When you talk to Microsoft, and talk about Cloud, it's like, "Oh, well." They're including Skype in that. They're including Office 365. I'm like, "Well, that's our productivity. "That's a part of our overall solutions." Amazon, even when you talk to Amazon, it's not like there are two separate companies. There's not AWS and Amazon, it's one company-- >> Are we living in a world of alternative facts, Stu? I mean, Larry Ellison coined the term "Fake Cloud", talking about Salesforce. I'm not going to say Google's a fake Cloud, cause certainly it's not. But when you start blending in these numbers, it's kind of shifting the narrative to having alternative facts, certainly skewing the revenue numbers. To your point, if PaaS goes away, it's because the SaaS minus thins out that lower part of the stack. Cause if you have microservices and orchestration, it kind of thins that out. So one, is that the case? And then I saw your tweet with Sam Ramji, he formerly ran Cloud Foundry, he's now at Google, knows his stuff, ex-Microsoft guy, very strong dude. What's his take on this? Did you get a chance to chat with Sam at all? >> Yeah, I mean, it was interesting, because Sam, right, coming from Cloud Foundry, said one of the things Cloud Foundry was trying to do was to really standardize across the clouds. And of course, a little bias, since he works at Google now. But he's like, "We couldn't do that with Google, "cause Google had really cool features." And of course, when you put an abstraction layer on, can I actually do all the stuff? And he's like, "We couldn't do that." Sure, if you talked to Amazon, they'll be like, "Come on.
"A thousand features we announced last year, "look at all the things we have. "It's not like you can just take all of our pieces, "and use it there." Yes, at the VM, or container, or application microservices layer, we can sit on a lot of different Clouds, public or private. But as we said today, the Cloud is not a utility. John, you've been in this discussion for years. So we've talked about, "Oh, I'm just going "to have a Cloud broker, "and go out in a service." It's like, this is not, I'm not buying from Domino's and Pizza Hut, and a pepperoni pizza's a pepperoni pizza. >> Well, Multicloud, and moving workloads across Clouds, is a different challenge. Certainly, I might have some stuff here, maybe put some data there and hedge my bets on leveraging other services. But this brings up the total cost of ownership problem. If you look at the trajectory, say OpenStack, just as a random example. OpenStack, at one point, had a great promise. Now it's kind of niched down into infrastructure as a service. I know you're going to be covering that summit in Boston. And it's going to be interesting to see how that is. But the word in the community is that OpenStack is struggling because of the deployment challenges involved with it. So to me, Google has an opportunity to avoid that OpenStack kind of concept. Because, talking about Sam Ramji, open source is the wildcard in all of this. So if you look at open source, and you believe that that PaaS layer's thinning down, to infrastructure and SaaS, then you've got to look at the open source community, and that's going to be a key area, that we're certainly watching, and we've identified, and we've mentioned it before. But here's my point. If you look at the total cost of ownership. If I'm a customer, Stu, I'm like, "Okay, if I'm just going to move to the Cloud, "I need to rely and lean on my partner, "my vendor, my supplier, "Amazon, or Google, or Microsoft, whoever, "to provide really excellent manageability.
"Really excellent security. "Because if I don't, I have to build it myself." So it's becoming the shark fin, the tip of the iceberg, where you don't see the hidden cost, because I would much rather have more confidence in manageability that I can control. But I don't want to have to spend resources building manageability software, if the stuff doesn't work. So there's the issue about Multicloud that I'm watching. Your thoughts? Or is that too nuanced? >> No, no. First of all, one of the things is that if I look at what I was doing on premises before versus public Cloud, yes, there are some hidden costs, but in general I think we understand them a little bit better in public Cloud. And public Cloud gives us a chance to do a do-over for things like security, and most of us understand that security is good in public Cloud. Now, security overall, there's lots of work to do, and challenges; security isn't the same across all of them. We've talked to plenty of companies that are helping to provide security across Clouds. But this Multicloud discussion is still sorting itself out. Portability is not simple, but it's where we're going. Today, most companies, if I'm not really small, have some on-prem pieces. And they're leveraging at least one Cloud. They're usually using many SaaS providers. And there's this whole giant ecosystem, John, around the Cloud management platforms. Because managing across lots of environments is definitely a challenge. There are so many companies that are trying to solve this. And there's just dozens and dozens of these companies, attacking everything from licensing, to data management, to everything else. So there's a lot of challenges there, especially the larger you get as a company, the more things you need to worry about. >> So Stu, just to wrap up our segment. Great day. Wanted to just get some color on the day. And highlighting some parody from the web is always great. 
Just got a tweet from fake Andy Jassy, which we know really isn't Andy Jassy. But Cloud Opinion was very active on the hashtag, that Twitter handle Cloud Opinion. But he had a Medium post, and he said, "Eric Schmidt was boring. "Diane Greene was horrible. "Unfortunately, day one keynote was a missed opportunity "that left several gaps, "failed to portray Google's vision for Google Cloud. "They could've done the following, A, "explain the vision for the Cloud, "where do they see Google Cloud going. "Identify customer use cases that show samples "and customer adoption." They kind of did that. So discount that. My favorite line is this one, "Differentiate from other Cloud providers. "'We're Google damn it,' isn't working so well. "Neither were the indirect shots at S3 downtime; "that didn't work either. "Where is the customer's journey going? "And what's the most compelling thing for customers?" This phrase, "We're Google damn it," kind of speaks to the arrogance of Google. And we've seen this before, and I always say, Google doesn't have a bad arrogance. I like the Google mojo. I think the technology, they run hard. But they can sometimes, like, "Customer support, self-service." You can't really get someone on the phone. It's hard to get replies from Google. >> "Check out the YouTube video. "We own that too, don't you know that?" >> So this is a perception of Google. This could fly in the face, and that arrogance might blow up in the enterprise, cause the enterprises aren't sophisticated enough to recognize the mojo from Google. And they're like, "Hey, I want support. "I want SLAs. "I want security. "I want data flexibility." What's your thoughts? >> So Cloud Opinion wrote, I thought, a really thoughtful piece leading up to it, that I didn't think was satire. Some of what he's putting in there is definitely satire-- >> John: Some of it's kind of true though. >> From the keynote. 
So I did not get a sense, in the meetings I've been in, or watching the keynote, that they were arrogant. They're growing. They're learning. They're working with the community. They're reaching out. They're doing all the things we think they need to do. They're listening really well. So, yes, I think the keynote was a missed opportunity overall. >> John: But we've got to point out that was a teleprompter fail. >> That was a piece of it. But even so, we felt that with a little bit of polish, some of the interactions would've been a little bit smoother. I thought Eric Schmidt's piece was really good at the end. As I said before, the AI discussion was enlightening, and really solid. So I don't give it a glowing rating, but I'm not ready to trash it. And tomorrow is when they're going to have the announcements. And overall, there's good buzz going at the show. There's lots going on. >> Give 'em a letter. Letter grade. >> For the keynote? Or the show in general? >> So far, your experience as an analyst, cause, again, to give them credit, I agree with you, it's their first analyst conference. They are listening. And the slideshow, you see what they're doing. They're being humble. They didn't take any real direct shots at their competitors. They were really humble. >> And that is something where I think they could've focused on something that differentiated them a little bit. Something we had to pry out of them in some of the one-on-ones is like, "Come on, what are you doing?" And they're like, "We're winning 50, 60% of our competitive deals." And I'm like, "Explain to us why. "Because we're not hearing it. "You're not articulating it as well." It's not like we expect them to say, "Oh wait, they told us we're arrogant. "Maybe we should be super humble now." It's kind of-- >> I don't think they're thinking that way. 
I think my impression of Google, knowing the company's history, and the people involved there, and Diane Greene in particular, as you know from the VMware days. She's kind of humble, but she's not. She's tough. And she's good. And she's smart. >> And she's bringing in really good people. And by the way, John, I want to give them kudos, they really supported International Women's Day. I loved that Fei-Fei got up, and she talked about one of her compatriots, another badass woman up there, that got one of the big moments of the keynote there. >> John: Did they have a woman in tech panel? >> Not at this event. Because Diane was there, Fei-Fei was there. They had some women just participating in it. I know they had some other events going on throughout the show. >> I agree, and I think it's awesome. I think one of the things that I like about Google, and again, I'll reiterate, is that it's apples and oranges relative to the other Cloud guys. But remember, just because Amazon's lead is so far out in front, you still have this jockeying for position between the other players. And they're all taking the same pattern. Again, this is the same thing we talked about in our other analysis, and certainly at re:Invent we talked about the same thing. Microsoft, Oracle, IBM, and now Google, are differentiating with their apps. And I think that's smart. I don't think that's a bad move at all. It does telegraph a little bit that maybe they could've added more to the show; we'll see tomorrow. But I don't think that's a bad thing. Again, it does make the numbers a little messy, in terms of what's what. But I think it's totally cool for a company to differentiate on their offering. >> Yeah, definitely. And John, as you said, Google is playing their game. They're not trying to play Amazon's game. They're not, Oracle's thing was what? 
You kind of get a little bit of a lead, and kind of just make sure how you attack and stay ahead of what they're doing, going to the boating analogy there. But Google knows where they're going, moving themselves forward. They've made some really good progress. The amount of people, the amount of news they have. Are they moving fast enough to really close a little bit on Amazon's lead? That's something I want to come out of the show with. Where are customers going? >> And it's a turbulent time too. As our own Peter Burris at Wikibon would say, it's a turbulent time. And it's going to really put everyone on notice. There's a lot to cover, if you're an analyst. I mean, you have compute, network, storage, services. I mean, there's a slew of stuff that's being rolled out, either table stakes for existing enterprises, plus new stuff. I mean, I didn't hear a lot of IoT today. Did you hear much IoT? Did IoT come up at the briefing? >> Come on. I'm sure there's some service coming out from Google, that'll help us be able to process all this stuff much faster. They'll just replace this with-- >> So you're in the analyst meeting. I know you're under NDA, but is there IoT coming tomorrow? >> IoT was a term that I heard this week, yes. >> So all right, that's a good confirmation. Stu cannot confirm or deny that IoT will be there tomorrow. Okay, well, that's going to end day one of coverage, here in our studio. As you know, we got a new studio. We have folks on the ground. You're going to start to see a new CUBE formula, where we have in-studio coverage, and out in the field, like our normal CUBE, our "game day", as we say. Getting all the signal, extracting it from the noise out there, for you. Again, in-studio allows us to get more content. We bring our friends in. We want to get the content. We're going to get the summaries, and share that with you. I'm John Furrier, with Stu Miniman, day one coverage. 
We'll see you tomorrow for another full day of special coverage, sponsored by Intel, two days of coverage. I want to thank Intel for supporting our editorial mission. We love the enterprise, we love Cloud, we love big data, love Smart Cities, autonomous vehicles, and the changing landscape in tech. We'll be back tomorrow, thanks for watching.
Jean Francois Puget, IBM | IBM Machine Learning Launch 2017
>> Announcer: Live from New York, it's theCUBE, covering the IBM machine learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's a distinguished engineer for machine learning and optimization at IBM Analytics, CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys launch one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. Algorithms, when you look at newspapers or blogs, social media, it's all about algorithms. Our view is that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. So before, you need to get data, to transform it, to make it usable for machine learning. And then, you run algorithms. These produce models, and then you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then, you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operational system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks on theCUBE years ago who said, "You know what, algorithms are plentiful." I think he made the statement, I remember my friend Avi Mehta, "Algorithms are free. "It's what you do with them that matters." >> Exactly. I believe open source has won for machine learning algorithms. 
Now the future is with open source, clearly. But it solves only a part of the problem you're facing if you want to action machine learning. So, exactly what you said. What you do with the results of an algorithm is key. And open source people don't care much about it, for good reasons. They are focusing on producing the best algorithms. We are focusing on creating value for our customers. It's different. >> In terms of, you mentioned open source a couple times, in terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source, how do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't even know if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source packages. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorite, well, let's just start with fraud. Fraud detection has become much, much better over the past certainly 10 years, but still not perfect. I don't know if perfection is achievable, but there are a lot of false positives. How will machine learning affect that? Can we expect as consumers even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, for this credit card transaction coming in, it has a high probability to be fraud. Or this one has a lower probability. 
But then it's up to the designer of the overall application to make decisions, so what we recommend is to use machine learning predictions as one input, but not the only one. For instance, if your machine learning model tells you this is a fraud with a high probability, say 90%, and this is a customer you know very well, it's a 10-year customer you know very well, then you can be confident that it's a fraud. Then the next transaction tells you this is a 70% probability, but it's a customer of one week. In a week, we don't know the customer, so the confidence we can get in machine learning should be low, and there you will not reject the transaction immediately. You don't approve it automatically; maybe you send a one-time passcode, or you route it into another verification system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. Our approach at IBM, you don't see IBM speak much about artificial intelligence in general, because we don't believe we're here to replace humans. We're here to assist humans, so we say augmented intelligence, or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career with a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we'd have Terminator in the next decade, and this and that. And the same happened in the '60s, or just after the '60s. And then there was an AI winter, and we have a risk here of another AI winter, because some people are making claims that are not substantiated, I believe. 
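The decision logic Puget sketches here — blend the model's fraud probability with what the business knows about the customer before choosing an action — can be illustrated with a small, self-contained example. The thresholds, the tenure rule, and the action names are all invented for illustration; they are not IBM's actual rules:

```python
def route_transaction(fraud_probability, customer_tenure_days):
    """Combine a model's fraud score with a simple business rule.

    Illustrative only; real systems weigh many more signals.
    - A high score on a long-tenure customer we know well: trust the
      model's confidence and reject.
    - A moderately high score on a brand-new customer: the model's
      confidence deserves less trust, so step up verification
      (for example, a one-time passcode) instead of rejecting outright.
    """
    if fraud_probability >= 0.9 and customer_tenure_days >= 365:
        return "reject"
    if fraud_probability >= 0.7:
        return "step_up_verification"
    return "approve"
```

The point of the sketch is exactly what the interview argues: the model's output is one input to a decision, not the decision itself.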
I don't think the technology's here that we can replace human decision-making altogether any time soon, but we can help. We can certainly make people more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, so we're automating a number of things, and maybe we won't need to have people do quality checks, and just have an automated vision system detect defects. Sure, so we're automating more and more, but this is not new, it has been going on for centuries. >> Well, the list evolved. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning, but it is interesting. You know, each time there is a capability that a machine can automate, we basically redefine intelligence to exclude it, so you know. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb a stair faster than us? Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I'm now doing the analytics, the machine learning, on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend, so we are not using the mainframe's base compute system. We recommend using zIIPs, so additional cores, to not overload it, so it's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb it, and "you don't want to disturb" takes on a lot of different meanings. One that you just said, you don't want to slow down your operational processing, because you're going to hurt your business. But you also want to be careful. 
Say we have a payment system where there is a machine learning model predicting fraud probability as part of the system. You don't want a young, bright data scientist to decide that he had a great idea, a great model, and push his model into production without asking anyone. So you want to control that. That's why we insist we are providing governance, and that includes a lot of things, like keeping track of how models were created and from which data sets, so lineage. We also want to have access control, and not allow just anyone to deploy a new model because we make it easy to deploy. So we want to have role-based access, and only someone with the right authority, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers, they don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, first solution's going to be on z/OS, what's the roadmap look like and what are some of those challenges of rolling this out to other private cloud solutions? >> We are going to ship IBM Machine Learning for z this quarter. It starts with Spark ML as the base open source. This is interesting, but it's not all there is for machine learning. So that's how we start. We're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe in the press release it said down the road things like TensorFlow are coming, H2O. >> Anaconda will ship next quarter, so we will leverage this when it's out. 
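The role-based deployment control Puget describes — anyone can build a model, but only authorized roles can promote one to production, with lineage recorded for audit — might look something like the following minimal sketch. The role names, the registry shape, and the function itself are invented for illustration and are not IBM's API:

```python
# Roles permitted to promote a model to production (illustrative names).
ROLES_ALLOWED_TO_DEPLOY = {"mlops_admin", "release_manager"}

def deploy_model(model_id, user_role, trained_from, registry):
    """Promote a model to production only for authorized roles,
    recording lineage-style metadata (which data set it came from).

    Illustrative sketch: a real governance layer would also version
    models, log approvals, and integrate with enterprise identity
    management rather than a hard-coded role set.
    """
    if user_role not in ROLES_ALLOWED_TO_DEPLOY:
        raise PermissionError(f"role {user_role!r} may not deploy models")
    registry[model_id] = {
        "status": "production",
        "deployed_by": user_role,
        "trained_from": trained_from,  # lineage: source data set
    }
    return registry[model_id]
```

The design point is that the deployment path, not the training path, is where the gate lives: data scientists stay free to experiment, and control is applied only at promotion time.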
Then indeed, we have a roadmap to include the major open source packages, and the major ones are those from Anaconda, mostly. Key deep learning packages, so TensorFlow, and probably one or two additional ones, we're still discussing. One that I'm very keen on, it's called XGBoost, in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, I mean all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago also, or maybe a bit more, but not so long ago, and now, if you have structured data, it is the best choice today. It's really fast-moving, so we will support the major deep learning packages and the major classical machine learning packages, like the ones from Anaconda, or XGBoost. The other thing: we start with z. We announced in the analyst session that we will have a Power version, and a private cloud, meaning x86, version as well. I can't tell you when because it's not firm, but it will come. 
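XGBoost is a gradient boosting library: it fits many small trees, each one to the residual errors of the ensemble so far. A stripped-down, pure-Python sketch of that idea, using one-dimensional decision stumps instead of real trees, is below. This is a conceptual illustration only, not XGBoost's actual algorithm, which adds regularization, second-order gradients, sparsity handling, and much more:

```python
def fit_stump(xs, ys):
    """Find the threshold split of 1-D data minimizing squared error,
    predicting the mean of each side."""
    best = None
    for t in sorted(set(xs))[:-1]:  # exclude max so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left_mean, right_mean)

def boost(xs, ys, rounds=30, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the current
    residuals, and predictions accumulate with a learning rate."""
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        pred = [p + lr * (lm if x <= t else rm) for x, p in zip(xs, pred)]
    return stumps, pred
```

Each round explains a little more of what the previous rounds missed, which is why boosted trees do so well on the structured-data problems Puget mentions.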
We share code, and then, we shape on different platform. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more, some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning, you basically have to face, one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. But training is a good fit for cloud because usually you need a large computing system but only for limited time, so elasticity's great. But then deployment, if you want to score transaction in a CICS transaction, it has to run beside CICS, not cloud. If you want to score data on an IoT gateway, you want to score other gateway, not in a data center. I would say that may not be what people think first, but what will drive really the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches, they are switching to gear to fit measurement system. You want to score your health data on the watch, not in the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. That's a value of machine learning for Z is if you want to score transactions happening on Z, you need to be running on Z. So it's clear, mainframe people, they don't want to hear about public cloud, so they will be the last one moving. They have their reasons, but they like mainframe because it ties really, really secure and private. >> Dave: Public cloud's a dirty word. >> Yes, yes, for Z users. At least that's what I was told, and I could check with many people. But we know that in general the move is for public cloud, so we want to help people, depending on their journey, of the cloud. 
>> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, be right back. (electronic keyboard music)