Chris Lynskey, Oracle - On the Ground - #theCUBE


 

>> Announcer: theCUBE presents, On the Ground. (upbeat music)

>> Hi everyone, welcome to this special On the Ground, Cube coverage here at Oracle headquarters. I'm John Furrier, the host of theCUBE, here with Chris Lynskey, who's the Vice President of Product Management for Oracle Big Data. Welcome to On the Ground, good to see you.

>> Thanks John, nice to meet you.

>> So let's talk about big data, and the concepts going on now for analytics. What is going on in your mind around big data, and some of the ideas that customers are kicking around? Because the number one thing we hear is, I've got to store the data. Solved, check, database, system of record. But now other databases are popping up. Different types of databases: you've got graph databases, you've got unstructured databases. Do I run Oracle for all those? When do I use Oracle? When do I not use Oracle? So the first question is, what are some of the obstacles facing these companies? Is it integration? Is it the choice? What's going on?

>> There's a lot. There's a lot of interest in the market around big data. But the companies that are actually using it in a productized fashion, to build competitive insight, are fewer than you would think, because of some of these obstacles. So we look at it in a few different ways, and we try to tackle the obstacles at Oracle in each of these categories. One of the first big questions to solve is what you raised: how do I manage the data? I've got a lot of gravity in my data warehouse and in my databases, but now I've got all this new content coming in. It might be social media. It might be log data. Things you're not sure of the value of, so it may not make sense to store them in that enterprise data warehouse. That's really where customers are looking at alternate technologies like big data, like Hadoop, to give you both that cost savings and that specialized access, whether you're doing, like you said, spatial queries or graph queries. Oracle can give you the right engine for the right job, but what's also important in that data management layer is doing it in a way that breeds simplicity of ownership. If the cost of ownership is too expensive, no one's going to do it. So we also have an initiative called Big Data SQL, that lets you use that common Oracle database as your front end, but then query back to Hadoop, query back to a spatial or graph engine. You can leave that data there, where it makes the most sense.

>> I mean, SQL on Hadoop, for instance, has proven that SQL is the language most people query in. So that's out there, that's done. But it doesn't mean you run relational databases all the time; it's just how people interface into other databases.

>> Chris: Yeah.

>> Is that a pretext to what's really happening? Is interfacing to other data sets really more important than actually having whole new systems? Because that seems to be ...

>> It's a bit of both. The way I look at it is, some companies look at Hadoop as just another data source. I've got some log data, some social data, let me put it in a place that's cost-effective to store. And there, using your database as a front end makes sense. Other customers look at Hadoop and big data more as a data platform, where they want to use that cluster, that compute environment, to do more than just query things and build a chart. And that's where you see some new technologies coming out. In Oracle we call it our data factory. That's around, how can I use all of that compute power to actually do data integration? How can I keep up with that one-hour ETL window I'm given a night to deal with all these new sources? So we see people adopting Hadoop for ...
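Chris's "data factory" framing, using the cluster's own compute for data integration rather than squeezing everything into a nightly ETL window, maps naturally onto a Spark job. The sketch below is a generic illustration of that pattern, not code from an Oracle product; the paths, field names, and application name are assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical "data factory" job: use the Hadoop cluster's own compute to
# turn raw logs landed on HDFS into a cleaned, partitioned data set, instead
# of squeezing the work into a fixed nightly ETL window.
spark = SparkSession.builder.appName("log_data_factory").getOrCreate()

raw = spark.read.json("hdfs:///landing/web_logs/")  # assumed landing path

cleaned = (
    raw
    .withColumn("event_ts", F.to_timestamp("timestamp"))  # assumed source field
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("user_id").isNotNull())                  # drop unusable rows
    .dropDuplicates(["user_id", "event_ts", "url"])
)

# Columnar, partitioned output so downstream queries only scan what they need.
(cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("hdfs:///lake/web_logs_clean/"))
```

Because the transformation runs where the data already lives, keeping up with new sources becomes a question of cluster capacity rather than how wide the batch window is.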
>> That's a tough window, one hour is a tough window. If you're Wall Street, backing up.

>> Yeah, some of it's tough.

>> Talk about Data Lab. What is this concept that you have been kicking around called the Data Lab?

>> Exactly.

>> What does that mean?

>> So I think that's the third pillar. We talked about data management, giving you the right engine. We talked about data factory, giving you that integration capability. But why go through all that effort, if not to start driving innovation? And that's what we think about as the Data Lab. It's a place where you can experiment with advanced analytics. It's a place where you can experiment with data mashups and new data combinations. And you do it in a cost-effective way, and a way that breeds this notion of agility. You mentioned the phrase system of record before. That's a very good description for the warehouse. You're not going to change your revenue definition, or your customer dimension, in the warehouse. That's what everyone uses. But Hadoop, people look at as a system of innovation. It sits alongside the warehouse. You can put a lot of that same data in there. Often you'll put data that never made it into the warehouse. So you get that big data variety, and then you can use that to come up with new ideas. So that's really the essence of the lab: bringing in more data sets, trying more combinations of data, and then also seeing if you can move beyond just descriptive and diagnostic analytics, into predictive.

>> So let me just get this right. Factory is all the ingestion; Data Lab is like your, I'd say, sandbox, my word. So the system of record is the most important data. That's a customer name, a key variable, that's in the company's business model. So that's where all the hardcore data is. Social media data might be, hey, I'm a piece of geo data, and it's at a retail store, says I'm going to buy something, or has local presence. Has my name, which is in the system of record. So that data is in a different database. It has to go over there and get to the system of record. That's hard. That's actually a hard problem.

>> Chris: It is.

>> But that's a realistic thing that people want: to take these gestural data pieces, small data that means something to the system of record, or some engagement data, and cross-connect it to the system of record. Do you guys solve that problem? This is what people want to look for, right?

>> We do. What's interesting is, that's an age-old problem. We had it with data warehousing. We have it even more now with all the big data sources. And I think the opportunity here is to decide who should solve that problem. Is it a scarce ETL developer that you have in IT? They have limited cycles.

>> That's true.

>> Do I have a data scientist? People actually use data scientists to do this sort of data integration work. It's hard to come up with a new predictive model if the data sets don't match up. And it's unfortunate, because that's the PhD guy. And that's menial labor to a large degree.

>> Hard to find PhDs, too.
>> It is. I like to call them unicorns. You hear about them, you never really see them. And you definitely don't want the scientist doing that menial labor. The joke we tell is that the data scientist has been turned into a data janitor, because of all these tasks that get put on their shoulders. So we think at Oracle that's an opportunity. With this combination of data management, data factory, and the Data Lab on top, you can actually push that work out to your business analyst teams. They can collaborate with IT. They can collaborate with your data scientists if you have them, but the spirit of the Lab is not ...

>> So you're making the analysts and the business folks like data scientists.

>> Chris: Exactly.

>> As functional as data scientists, without having them being ...

>> One of the phrases in the industry is citizen data scientist, and I manage a product called Oracle Big Data Discovery, and that is really our goal. Can we build these very intuitive UIs that let these analysts produce more output, like a data scientist would?

>> So what's the architecture to make that happen? Because I think that's right on the money. I think that's a great solution. The example I used is just a small piece of data, but that's a database problem. So by abstracting out to another level with software, you can let people wire their own solutions together. I get that. How do you guys do that from an architecture standpoint? What do you say to customers? How do I do this? What's the playbook?

>> It's a good question, because at its core, there's no reason to go about solving this problem unless it works at big data scale, right? If you can't analyze petabytes, terabytes of content, you would use a regular BI solution. There's no reason to move over to big data. So a key aspect of the architecture is scale. But also, if you're going to support these analysts, they're not happy if they click on the screen and then wait five minutes for something to come back. So interactivity and performance are critical for this user base too. Because of that, in products like BDD, and really across a lot of our different initiatives, Apache Spark has become a key piece of our architecture. And that's something you might not expect from Oracle, that we're moving into open source, adopting a lot of those technologies, but we really do see the value of Spark.

>> So I asked Neil Mendelson just today where he sees the market going. So I want to ask you a slightly different question, the same question on a different tack. What's the next big thing? Because we are on the front end of this really pioneering analytics mindset.

>> Chris: Yep.

>> Horizontally scalable data sets. Software value propositions, applied to data as currency, if you will. Soon data will be on the balance sheets. Some say, certainly the analysts at Wikibon are saying, that some day it should be an asset class.

>> Chris: Data capital is a phrase we use.

>> Data capital, love that. And so that is a trend that could be right around the corner. But that's where it's going. What's the next big thing to get us there?

>> I think the first hurdle was just making sense of big data. It took organizations a couple of years just to get their heads around that, and to build that architecture so it will scale and people will adopt the system. I think the opportunity now, at least as we see it in our analytic portfolio, is this: you've got these users on the system. You've got these Hadoop clusters in place. What can you do with that power? And we think the big opportunity, especially as we create these data scientists, these citizen data scientists, is machine learning. How can we embed, especially, the Spark machine learning libraries into our products more natively, such that you don't have to have the PhD at the outset? You can use that compute power, and you can use the Spark open source libraries, to help bootstrap that process.
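To make the machine learning point concrete, here is a minimal PySpark MLlib pipeline of the kind those open source Spark libraries enable. It is a generic sketch, not anything from an Oracle product, and the table and column names (lab.customer_activity, churned, and so on) are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, StringIndexer
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("churn_lab").getOrCreate()

# Hypothetical lab data set: customer activity joined with a churn label.
df = spark.table("lab.customer_activity")

# Assemble numeric behaviour columns into a single feature vector.
features = VectorAssembler(
    inputCols=["sessions_30d", "support_tickets", "spend_30d"],  # assumed columns
    outputCol="features",
)
label = StringIndexer(inputCol="churned", outputCol="label")

# Fit a simple classifier on the lab data.
model = Pipeline(stages=[features, label,
                         LogisticRegression(maxIter=20)]).fit(df)

# Score the same population and keep predictions alongside the raw data.
scored = model.transform(df).select("customer_id", "probability", "prediction")
scored.write.mode("overwrite").saveAsTable("lab.churn_scores")
```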
>> So do you guys solve what I call the data swamp problem? Let me explain in more color. Most people are dumping everything in what they call a data lake: just store all the data, we'll get to it later. Mostly it's Hadoop, a bunch of batch data, because they don't know what to do with it yet. So it just sits there. And it gets dirty, and it turns into a swamp. That's the joke, the data swamp. Ironically, we're looking at the lake here at Oracle headquarters.

>> Chris: Pristine, pristine.

>> Pristine, the water's flying up through the thing, it's beautiful. This is a big problem, because data that's idle, that's not being used, not being intelligently acted upon, can turn into a swamp, and it's only valuable when needed. Meaning, if something's happening in real time, you go to the data lake and pull out a piece of data, to your earlier reference, and use it in real time, that's important. So you never know the potential energy of that data, and its value. It could be perfectly useless one minute, extremely valuable the next. Is your value proposition, with the big data appliance and the analyst tools, to connect to those lakes and bring them back? Is that the whole thing, you guys solve the data lake problem?

>> There's two pieces. One is giving you the infrastructure, and for that we have our big data cloud service, our big data appliance. Because lots of people think big data is just commodity hardware; as you move into analytics and do more in memory, you're going to want that extra capacity. So that's one piece, making sure you've got the horsepower. But then you need those tools on top. And that's where our Big Data Discovery product focuses. And to your point, what we've done is actually integrate the things that those analysts need when they're in that discovery moment. The first thing they need, like you said: I never knew I needed this data set before, it just came to me. So we give you almost a shopping experience for data. You can go in, type in keywords. I want to look for social media log data. And we actually search into Hadoop and index all that content. So it's just like you were on our website.

>> So you're kind of keeping the lake moving and clean, because you're indexing it, so you can serve data at any given time.

>> That's the first piece. The second piece, though, is, again, in your discovery process you have to recognize this is the first time people will be working with this data. And that's where a lot of these data scientists shine, because they know all the techniques: how do I interrogate it? What's important? What's not? And that's what we build into our product now. So the analyst can just look at a very visual screen, and it helps them figure out where to focus. Is it worth me spending time?
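The "how do I interrogate it" step described above, the first pass someone makes over an unfamiliar data set, can be approximated in a few lines of PySpark. This is a hand-rolled sketch of that first look rather than Big Data Discovery itself, and the path and field contents are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("first_look").getOrCreate()

# Hypothetical new arrival in the lake: a social media extract nobody has profiled yet.
df = spark.read.parquet("hdfs:///lake/social_mentions/")

df.printSchema()              # what fields are actually in here?
print("rows:", df.count())

# Null count per column: a quick signal of which fields deserve an
# analyst's attention and which are mostly junk.
null_counts = df.agg(*[
    F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df.columns
])
null_counts.show(truncate=False)

# Eyeball a handful of raw records before deciding whether the data set
# answers the "is it worth me spending time?" question.
df.show(5, truncate=False)
```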
>> It's like almost this bot craze that's going on. You guys are abstracting the scientist's knowledge away into software, and providing almost an interface.

>> That's the hope. If you can get a data scientist, trust me, keep them. They're very valuable.

>> Catch that unicorn.

>> Yes.

>> No, it's true though. There aren't enough PhDs, or data scientists, out there. Soon there will be new curriculum out there, but still. The idea is to scale up, and make the normal person, the citizen, the data scientist.

>> And also, it's funny, if you look at the advanced analytics tools and the data science tools out there, they're very dated. A lot of them were built 15, 20 years ago for that data miner statistician. There's now this new breed of data scientists, and they want more compelling interfaces. They expect more.

>> Chris, final question. The top three conversations that you have with customers, where they're most challenged. If you had to look at the patterns, applying all the big data techniques in your brain: the three top problems that customers are trying to solve, that you guys help with.

>> Excellent. So the first one I would say, by far, and I wish it wasn't the case, is: help me justify building out my big data cluster. That's the first one. Lots of companies want to do more with big data, but they're struggling ...

>> Just the ROI, or cost, or both?

>> The ROI, the cost, really, why should I make that investment? How do I justify it? And I really do think that cloud is going to change that picture dramatically. When I can shift to looking at the CapEx versus OpEx ...

>> So you're saying the cloud lowers the bar, in terms of getting value generated, or is it ...

>> It does two things. It lowers the financial entry point, and how much you have to justify up front. And it lowers the IT skill set needed to manage those clusters in the data center. So, two very big problems.

>> Great, that's awesome. Second one?

>> Now that I've solved that one, the second is: okay, well, what do I do next? How do I find things? Where should I be looking? And that is where this Data Lab concept is meant to come into play. Some customers will have a perfect use case in mind. That's how they justified the project. They can go and execute that. But for a lot of them, again, it's this notion of a data lake: I need to pursue a range of experiments. Where do I start? And tools like Big Data Discovery help a lot there.

>> So the Data Lab is just: play with the data, and get a feel for it.

>> Yep. And do it in a way that breeds that experimentation. Not just to visualize the data, but to change it. Reshape it. Build new models, build new classifications. The last thing I'd say is: okay, did I get my ROI, do I have a cluster? Yes. Did I figure out something that looks interesting? Yes. Now I have an idea. What do I do next? It's how do I connect my insights from big data back to the tools that we use every day.

>> So this is where the value of the data capital thing you're talking about comes in. The Lab is essentially formulating the key connections for data pipes to connect in.

>> Yep.

>> Is that kind of the best way to think about it?

>> Roughly, yes. You come up with new ideas, new data products ...

>> So you've operationalized it by the third step.

>> Yes. And then, how do you do that? In some cases it's, oh, I just push the data over to my data warehouse, which may make sense. But Oracle also has, I think I mentioned it before, Big Data SQL as a product, which will let you keep that data in Hadoop, keep everything else in your data warehouse, and productize it; it's that easy. So you don't have to worry about moving data. It helps a lot.

>> Well, that highlights one of the things we always hear all the time, which is skills.

>> Chris: Yep.

>> And people know SQL.

>> Chris: They do. Everyone does.
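That last exchange, everyone knows SQL, is the crux of the Big Data SQL idea: leave the bulk data in Hadoop, keep the curated dimensions in the warehouse, and join them with one familiar query. Big Data SQL itself is configured inside the Oracle database, so as a generic stand-in for the same pattern, the sketch below uses Spark SQL to join a Hive-registered log table on Hadoop with a customer table read from a relational warehouse over JDBC. The connection URL, credentials, and table and column names are assumptions, and it presumes the Oracle JDBC driver is on the Spark classpath.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("sql_across_stores")
         .enableHiveSupport()   # so spark.table() can see Hive tables on the cluster
         .getOrCreate())

# Curated dimension stays in the relational warehouse; read it over JDBC.
customers = (spark.read.format("jdbc")
             .option("url", "jdbc:oracle:thin:@//dw-host:1521/dwsvc")  # assumed
             .option("dbtable", "sales.customers")                      # assumed
             .option("user", "analyst")
             .option("password", "change_me")
             .load())
customers.createOrReplaceTempView("customers")

# Bulk clickstream stays in Hadoop as a Hive table; expose it to SQL as well.
spark.table("logs.web_logs_clean").createOrReplaceTempView("web_logs")

# One query spans both stores, in the SQL that analysts already know.
top_segments = spark.sql("""
    SELECT c.segment, COUNT(*) AS page_views
    FROM web_logs w
    JOIN customers c ON w.user_id = c.customer_id
    WHERE w.event_date >= date_sub(current_date(), 30)
    GROUP BY c.segment
    ORDER BY page_views DESC
""")
top_segments.show()
```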
>> Everyone does. Chris, thanks so much for spending the time here On the Ground. Really appreciate chatting with you. This is theCUBE, exclusive coverage on the ground at Oracle headquarters. I'm John Furrier, thanks for watching.

Published: Sep 7, 2016
