Ron Bodkin, Teradata - DataWorks Summit 2017


 

>> Announcer: Live from San Jose in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2017. Brought to you by Hortonworks.
>> Welcome back to theCUBE. We are live at the DataWorks Summit on day two. We have had a great day and a half learning a lot about the next generation of big data, machine learning, artificial intelligence. I'm Lisa Martin, and my co-host is George Gilbert. We are next joined by a CUBE alumni, Ron Bodkin, the VP and General Manager of Artificial Intelligence for Teradata. Welcome back to theCUBE!
>> Well thank you Lisa, it's nice to be here.
>> Yeah, so talk to us about what you're doing right now. Your keynote is tomorrow.
>> Ron: Yeah.
>> What are you doing, what is Teradata doing in helping customers to be able to leverage artificial intelligence?
>> Sure, yeah, so as you may know, I have been involved in this conference and the big data space for a long time as the founding CEO of Think Big Analytics. We were involved in really helping customers in the beginning of big data in the enterprise. And so, we are seeing a very similar trend in the space of artificial intelligence, right? The rapid advances in recent years in deep learning have opened up a lot of opportunity to really create value from all the data the customers have in their data ecosystems, right? So Teradata has a big role to play in having high quality product, Teradata database, analytic ecosystem products such as Hadoop, such as QueryGrid for connecting these systems together, right? So what we're seeing is our customers are very excited by artificial intelligence, but what we're really focused on is how do they get to the value, right? What can they do that's really going to get results, right? And we bring this perspective of having this strong solutions approach inside of Teradata, and so we have Think Big Analytics consulting for data science, we now have been building up experts in deep learning in that organization, working with customers, right? We've brought product functionality, so we're innovating around how do we keep pushing the Teradata product family forward with functionality around streaming with listeners. Functionality like the ability to take GPUs and start to think about how can we add that and make that deploy efficiently inside our customers' data centers. How can you take advantage of innovation in open source, with projects like TensorFlow and Keras becoming important for our customers. So what we're seeing is a lot of customers are excited about use cases for artificial intelligence. And tomorrow in the keynote I'm going to touch on a few of them, ranging from applications like preventative maintenance and anti-fraud in banking to e-commerce recommendations, and we're seeing those are some of the examples of use cases where customers are saying hey, there's a lot of value in combining traditional machine learning, wide learning, with deep learning using neural nets to generalize.
>> Help us understand if there's an arc where there's the mix of what's repeatable and what's packageable, or what's custom, how that changes over time, or whether it's just by solution.
>> Yeah, it's a great question. Right, I mean I think there's a lot of infrastructure that any of these systems need to rest on. So having data infrastructure, having quality data that you can rely on is foundational, and so you need to get that installed and working well as a beginning point.
Obviously having repeatable products that manage data with high SLAs and supporting not just production use, but also how do you let data scientists analyze data in a lab and make that work well. So there's that foundational data layer. Then there's the whole integration of the data science into applications, which is critical: analytics ops, agile ways of making it possible to take the data and build repeatable processes, and those are very horizontal, right? There's some variation, but those work the same in a lot of use cases. At this stage, I'd say, in deep learning, just like in machine learning generally, you still have a lot of horizontal infrastructure. You've got Spark, you've got TensorFlow; those support use cases across many industries. But then you get to the next level, you get specific problems, and there's a lot of nuance. What modeling techniques are going to work, what data sets matter? Okay, you've got time series data and a problem like fraud. What techniques are going to make that work well? And recommendations, you may have a long tail of items to think about recommending. How do you generalize across the long tail where you can't learn? People who use some relatively small thing or go to an obscure website, or buy an obscure product, there's not enough data to say are they likely to buy something else or do something else, but how do you categorize them so you get statistical power to make useful recommendations, right? Those are things that are very specific, where there's a lot of repeatability within a specific solution area.
>> This is, when you talk about the data assets that might be specific to a customer, and then I guess some third party or syndicated sources. If you have an outcome in mind, but not every customer has the same inventory of data, how do you square that circle?
>> That's a great question. And I really think that's a lot of the opportunity in the enterprise of applying analytics, so this whole summit, DataWorks, is about hey, the power of your data. What you can get by collecting your data in a well-managed ecosystem and creating value. So, there's always a nuance. It's like what's happening in your customers, what's your business process, what's special about how you interact, what's the core of your business? So I guess my view is that anybody that wants to be a winner in this new digital era and have processes that take advantage of artificial intelligence is going to have to use data as a competitive advantage and build on their unique data. We see a lot of times enterprises struggle with this. There's a tendency to say hey, can we just buy a packaged, off-the-shelf SaaS solution and do that? And for context, for things that are the same for everybody in an industry, that's a great choice. But if you're doing that for your core differentiation of your business, you're in deep trouble in this digital era.
>> And that's a great point, sorry George, really quickly. In this day and age, every company is a technology company. You mentioned a use case in banking, fraud detection, which is huge. There's tremendous value that can be gleaned from artificial intelligence, and there's also tremendous risk to them. I'm curious, maybe just kind of a generalization. Where are your customers on this journey? Are you going out to customers that have already embraced Hadoop and have a significant amount of data that they say, all right, we've got a lot of data here, we need to understand the context.
Where are customers in that maturity evolution?
>> Sure, so I'd say that we're really fast approaching the slope of enlightenment for Hadoop, which is to say the enthusiasm of three years ago, when people thought Hadoop was going to do everything, has kind of waned, and there's now more of an appreciation, like there's a lot of value in having a data warehouse for high value curated data for large-scale use. There's a lot of value in having a data lake of fairly raw data that can be used for exploration in the data science arena. What's still emerging is what is the best architecture for streaming and how do you drive realtime decisions, and that's still very much up in the air. So I'd say that most of our customers are somewhere on that journey. I think that a lot of them have backed off from their initial ambitions; they bought a little too much of the hype of all that Hadoop might do and they're realizing what it is good for, and how they really need to build a complementary ecosystem. The other thing I think is exciting though is I see the conversation is moving from the technology to the use cases. People are a lot more excited about how can we drive value with analytics, and let's work backwards from the analytics value to the data that's going to support it.
>> Absolutely.
>> So building on that, we talk about sort of what's core, and if you can't have something completely repeatable that's going to be core to your sustainable advantage, but if everyone is learning from data, how does a customer achieve a competitive advantage or even sustain a competitive advantage? Is it orchestrating learning that feeds, that informs processes all across the business, or is it just sort of a perpetual Red Queen effect?
>> Well, that's a great question. I mean, I think there's a few things, right? There's operational excellence in every discipline, so having good data scientists, having the right data, collecting data, thinking about how do you get network effects, those are all elements. So I would say there's a table-stakes aspect that if you're not doing this, you're in trouble, but then if you are, it's like how do you optimize and lift your game and get better at it? So that's an important factor, and you see companies that say how do we acquire data? Like one of the things that you see digital disruptors, like a Tesla, doing is changing the game by saying we're changing the way we work with our customers to get access to the data. Think of the difference: every time you buy a Tesla you sign over the rights for them to collect and use all your data, while the traditional auto OEMs are struggling to get access to a lot of the data because they have intermediaries that control the relationship and aren't willing to share. And a similar thing in other industries, you see it in consumer packaged goods. You see a lot of manufacturers there saying how do we get partnerships, how do we get more accurate data? The old models of going out to the Nielsens of the world and saying give us aggregates, and we'll pay you a lot to give us a summary report, that's not working. How do we learn directly in a digital world about our consumers so we can be more relevant? So one of the things is definitely that control of data and access to data, as well as we see a lot of companies saying what are the acquisitions we can make? What are startups and capabilities that we can plug in and complement to get data, to get analytic capability that we can then tailor for our needs?
>> It's funny that you mention Tesla having more cars on the road, collecting more data than pretty much anyone else at this point. But then there's, like, Stanford's sort of luminary for AI, Fei-Fei Li. She signed on I think with Toyota, because she said they sell 10 million cars a year, I'm going to be swimming in data compared to anyone else, possible exception of GM or maybe some Chinese manufacturer. So where does, how can you get around scale when using data at scale to inform your models? How would someone like a Tesla be able to get an end run around that?
>> So that's the battle: the disruptor comes in, they're not at scale, but they maybe change the game in some way. Like having different terms that give them access to different kinds of data, more complete data. So that's sort of part of the answer, is to disrupt an industry you need a strategy for what's different, right, like in Tesla's case an electric vehicle. And they've been investing in autonomous vehicles with AI; of course everybody in the industry is seeing that and is racing. I mean, Google really started that whole wave going a long time ago as another potential disruptor coming in with their own unique data asset. So, I think it's all about the combination of capabilities that you need. Disruptors often bring a commitment to a different business process, and that's a big challenge: a lot of times the hardest things are the business processes that are entrenched in existing organizations, and disruptors can say we're rethinking the way this gets done. I mean, the example of that in ride sharing, the Ubers and Lyfts of the world, is where they are re-conceiving what it means to consume automobile services. Maybe you don't want to own a car at all if you're a millennial, maybe you just want to have access to a car when you need to go somewhere. That's a good example of a disruptive business model change.
>> What are some things that are on the intermediate-term horizon that might affect how you go about trying to create a sustainable advantage? And here I mean things like where deep learning might help data scientists with feature engineering, so there's less need for it and you can make data scientists less of a scarce resource. Or where there are new types of training for models where you need less data? Those sorts of things might disrupt the practice of achieving an advantage with current AI technology.
>> You know, that's a great question. So near-term, the ability to be more efficient in data science is a big deal. There's no surprise that there's a big talent gap, a big shortage of qualified data scientists in the enterprise, and one of the things that's exciting is that deep learning lets you get more information out of the data, so it learns more so that you have to do less feature engineering. It's not like a magic box where you just pour raw data into deep learning and out come the answers, so you still need qualified data scientists, but it's a force multiplier. There's less work to do in feature engineering, and therefore you get better results. So that's a factor. You're starting to see things like hyperparameter search, where people will create neural networks that search for the best machine learning model, and again get another level of leverage. Now, today doing that is very expensive. The amount of hardware to do that, very few organizations are going to spend millions of dollars to sort of automate the discovery of models, but things are moving so fast.
I mean, even just in the last six weeks we've had Nvidia and Google both announce significant breakthroughs in hardware. And I just had a colleague forward me a paper on recent research that says hey, this technique could produce a hundred times faster results in deep learning convergence. So you've got rapid advances in investment in the hardware and the software. Historically software improvements have outstripped hardware improvements throughout the history of computing, so it's quite reasonable to expect you'll have 10 thousand times the price performance for deep learning in five years. So things that today might cost a hundred million dollars and no one would do could cost 10 thousand dollars in five years, and suddenly it's a no-brainer to apply a technique like that to automate something instead of hiring more scarce data scientists that are hard to find, and make the data scientists more productive so they're spending more time thinking about what's going on and less time trying out different variations of how do I configure this thing, does this work, does this, right?
>> Oh gosh, Ron, we could keep chatting away. Thank you so much for stopping by theCUBE again; we wish you the best of luck in your keynote tomorrow. I think people are going to be very inspired by your passion, your energy, and also the tremendous opportunity that is really sitting right in front of us.
>> Thank you, Lisa, it's a very exciting time to be in the data industry, and the emergence of AI in the enterprise, I couldn't be more excited by it.
>> Oh, excellent, well your excitement is palpable. We want to thank you for watching. We are live on theCUBE at the DataWorks Summit day 2, #dws17. For my cohost George Gilbert, I'm Lisa Martin, stick around. We'll be right back. (upbeat electronic melody)

Published Date: Jun 14, 2017
