
Analytics and the Future: Big Data Deep Dive Episode 6


 

>> No. Yeah. Wait.
>> Hi, everyone, and welcome to the Big Data Deep Dive with the Cube on EMC TV. I'm Richard Schlessinger, and I'm here with tech industry entrepreneur and Wikibon analyst Dave Vellante and SiliconANGLE CEO and editor in chief John Furrier. For this last segment in our show, we're talking about the future of big data, and there aren't two better guys to talk about that than you, and I'm glad that you guys are here. Let me tee up this conversation a little bit with a video that we did. Because the results of big data leveraging are only as good as the data itself, there has to be trust that the data is true and accurate and as unbiased as possible. So EMC TV addressed that issue, and we're just trying to keep the dialogue going with this spot.
>> We live in a world that is in a constant state of transformation: political, natural. Transformation that has many faces, many consequences. A world overflowing with information, with the potential to improve the lives of millions, with the prospects of nations, with generations in the balance. We are awakening to the power of big data. We trust, and together transform our future.
>> So, gentlemen: trust. Without that, where are we, and how big of an issue is that in the world of big data?
>> Well, you know the old saying, garbage in, garbage out. In the old days, the single version of the truth was what you were after with data warehousing. And people say that we're further away from a single version of the truth now, with all this data. But the reality is, with big data and these new algorithms, you can algorithmically weed out the false positives, get rid of the bad data, and mathematically get to the good data a lot faster than you could before, without a lot of processes around it. The machines can do it for you.
>> So, John, while we were watching that video, you murmured something about how this is the biggest issue. This is cutting-edge stuff. This is what I mean.
>> Trust, trust issues, and the trust equation. Right now it is still unknown. It's evolving fast. You see it with social networks: things go viral on the internet, and we live in a system now, with mobility and cloud, where things are scaling infinitely these days. And so good data scales big and bad data scales big. So whether it's a rumor that you hear and it goes viral, or the data itself, trust is the most important issue, and sometimes big data can be creepy. So this is a really, really important area. People are watching it, and trust is the most important thing.
>> But, you know, you have to earn trust, and we're still sort of at the beginning of this thing. So what has to happen to make sure that you don't get the garbage in, so you don't get the garbage out?
>> It's iterative, and we're seeing a lot of pilot projects. And then those pilot projects get reworked, and then they spawn into new projects. And so it's an evolution. And as I've said many, many times, it's very early. We've talked about it: we're just barely scratching the surface here.
>> It's evolving, too, and the nature of the data needs to be questioned as well. So what kind of data? For instance, if you don't authorize your data to be viewed, there are all kinds of technical issues around that.
>> That's one side of it. But the other side of it: I mean, there are bad people out there who would try to influence, you know, whatever conclusions were being drawn by big data programs.
>> Especially when you think about big data sources. So companies start with their internal data, and they know that pretty well. They know where the warts are. They know how to manipulate it. It's when they start bringing in outside data that this gets a lot fuzzier.
>> Yeah, it's a problem.
And security: I talked to a guy not long ago who thought that big data could be used to protect big data, that you could use big data techniques to detect anomalies in data that's coming into the system, which is poetic if nothing else. Data guys have told me that that's totally happening. It's a good solution. I want to move on, because we really want to talk about how this stuff is going to be used, assuming that these trust issues can be solved. And, you know, the best minds in the world are working on this issue, trying to figure out how to best leverage the data we all produce, which has been measured at five exabytes every two days. You know, somebody made an analogy: if a byte was a paper clip and you stretched out five exabytes' worth of paper clips, they would go to the moon or whatever. Anyway, it's a lot of bytes. Actually, I think that's to the moon and back way too many times, one hundred thousand times. I lost track of my paper. But anyway, the best minds are trying to figure out how to maximize the value of that data. And they're doing that not far from where we sit, at MIT, in a place called CSAIL, which was just recently set up. CSAIL stands for the Computer Science and Artificial Intelligence Lab. So we went there not long ago. It's just down the Mass. Pike. It was an easy trip, and this is what we found. It's fascinating.
>> Everybody's obviously talking about big data all the time, and you hear it get used to mean all different types of things. So one of the things we're trying to do in the bigdata@CSAIL program is to understand what the different types of big data are that exist in the world, and how we help people understand what different problems fall under the overall umbrella of big data. CSAIL is the largest interdepartmental laboratory at MIT, so there are about one hundred principal investigators.
So that's faculty and sort of senior research scientists, and about nine hundred students who are involved.
>> Basically, with big data, almost anything to do with it has to be at a much larger scale than we're used to. And the way it changes that equation is that you have to have the hardware and software to do the things you're used to doing. You have to make them accommodate a larger size, a much larger size.
>> A lot of times when people talk about big data, they mean not so much the volume of the data, but that the data, for example, is too complex for their existing data processing system to be able to deal with. So I've got information from a social network, from Twitter. I've got information from a person's mobile phone. Maybe I've got information about retail records, transactions. A whole, very diverse set of things that need to be combined together. What this query says is: if you added this predicate to your query, you would remove the dots that you selected. That's part of what we're trying to do here. The goal of bigdata@CSAIL, and of our big data effort in general at MIT, is to build a set of software tools that allow people to take all these different data sets, combine them together, ask questions, and run algorithms on top of them that allow them to extract insight.
>> I'm working with data provided by NASA, but the purpose of my work right now is to take data sets within databases and, instead of querying them for tabular results, you query them and get visualizations. So instead of looking at large sets of numbers and text and whatnot, you get a picture. And the motivation behind that is that humans are really good at interpreting pictures. They're not so good at interpreting huge tables. With big data, that's a really big issue.
So this will help scientists to visualize their data sets more quickly so they can start exploring and just looking at it faster, because with big data it's a challenge to be able to visualize and explore the data.
>> I'm here just to proclaim what you already know, which is that the hour of big data has arrived in Massachusetts.
>> And it's a very, very exciting time. So Governor Patrick was here just a few weeks ago to announce the Mass Big Data Initiative. And really, I think what he recognizes, and it's partly what we recognize here, is that there's an expertise in the state of Massachusetts in areas that are related to big data, partly because of companies like EMC, as well as a number of other companies in this sort of database analytics space. EMC is a partner in our bigdata@CSAIL initiative, and bigdata@CSAIL is an industry-focused initiative that brings companies together to work with MIT to think about big data problems, to help understand what big data means for the companies, and also to allow the companies to give feedback to us about what the most important problems are for them to be working on, and potentially to expose our students and give these companies access to our students.
>> I think the future will tell us, and that's hard to say right now, because we haven't done a lot of thinking and interpreting in big data. We haven't reached our potential yet, and there are just so many things that we can't see right now.
>> So one of the things that people who are involved in big data tell us is that they have trouble finding the skill sets, the data science capability and capacity. And so, seeing videos like this one at MIT, there is a new breed of students coming out. They're growing up in this big data world, and that's critical to keep the big data pipeline flowing.
And John, you and I have spent a lot of time on the East Coast looking at some of the big data companies. It's almost a renaissance for Massachusetts in Cambridge, and very exciting to see. Obviously, there's a lot going on on the West Coast as well.
>> Yeah, I mean, I'll say I'm impressed with MIT, and the area around MIT in Cambridge is exploding with young new guns coming out of there, the new rock stars, if you will. But in California, where we're headquartered, in Palo Alto, we have a chance to get up close to Google and Facebook, and to Jeff Hammerbacher, who we'll show in a video in a second, from an interview I did with him on Hadoop. But he was the first data guy at Facebook to build the data platform, which now has completely changed Facebook and made it what it is. He's also the co-founder of Cloudera, the leader in Hadoop, which we've talked about, and he's the poster child, in my opinion, of a data scientist. He's a math geek, but he understands the world's problems. It's not just a tech thing. It's a bigger picture. I think that's key. I mean, he knows that you have to apply this stuff, and there's the passion that he has. This video is from Jeff Hammerbacher, co-founder of Cloudera. Watch this video, and the thing to walk away with is that big data is for everyone, and it's about having the passion.
>> Wait. Wait.
>> Jeff Hammerbacher, data scientist and Cloudera co-founder, @hackingdata on Twitter, welcome to the Cube.
>> Thank you.
>> So you're known in the industry, obviously. Everyone knows you on Twitter. You're on Quora heavily; I follow you there. At Facebook, you built the data platform. You were one of the main guys there hacking the data at Facebook. Look what happened, right? I mean, the tsunami that Facebook has is amazing. You're a co-founder of Cloudera. You saw the vision. Rommedahl always quotes on the Cube: we've seen the future, no one knows it yet. That was a year and a half ago. Now everyone knows it. So how do you feel about that?
As the co-founder of Cloudera, with forty million in funding, that's validation again, more validation. How do you feel?
>> Yeah, sure, it's exciting. I think, you know, as data volumes have grown, and as the complexity of data that is collected and analyzed has increased, novel software architectures have emerged. And I think what I'm most excited about is the fact that that software is open source and we're playing a key role in driving where that software is going. And, you know, I think what I'm most excited about on top of that is the commodification of that software. You know, I'm tired of talking about the container in which you put your data. I think a lot of the creativity is happening in the data collection, integration, and preparation stage. So I think, you know, there was a tremendous focus over the past several decades on the modeling aspect of data. We really increased the sophistication of our understanding of, you know, classification and regression and optimization, and all of the hard-core modeling that gets done. And now we're seeing: okay, we've got these great tools to use at the end of the pipe. So now, how do we get more data pushed through those modeling algorithms? So there's a lot of innovative work.
>> So were you thinking at the time about how you'd make money at this, or did you just say, well, let's just go solve the problem and good things will happen?
>> It was a lot more the latter. You know, I didn't leave Facebook to start a company. I just left Facebook because I was ready to do something new. And I knew this was a huge movement, and I felt that, you know, it was very nascent and unfinished as software infrastructure. So when the opportunity at Cloudera came along, I really jumped on it. And I've been absolutely blown away by the commercial success we've had. So I certainly didn't set out with a master plan about how to extract value from this.
My master plan has always been to really drive Hadoop into the background of enterprise infrastructure. I really want it to be as obvious of a choice as Linux.
>> And you see, we've talked a lot at this conference and others about, you know, Hadoop moving from the fringe to the mainstream commercial enterprises. And all those guys are looking at it. J.P. Morgan Chase today: we're building competitive advantage, we're saving money. Those guys do have a master plan to make money. Does that change the dynamic of what you do on a day-to-day basis, or is that really exciting to you as an entrepreneur?
>> Oh, yeah, for sure, it's exciting. And what we're trying to do is facilitate their master plan, right? Like, we want to identify the commonalities in everyone's master plan and then commoditize it, so they can avoid the undifferentiated heavy lifting that Jeff Bezos points out. You know, no one should be required to invest tremendous amounts of money in their container anymore, right? They should really be identifying novel data sources, new algorithms to manipulate that data, and the smartest people for using that data. And that's where they should be building their competitive advantage. And we really feel that, you know, we know where the market's going, and we're very confident in our product strategy. And I think over the next few years, you know, you guys are going to be pretty excited about the stuff we're building, because I know that I'm personally very excited. And yes, we're very excited about the competition, because, number one, more people building open source software has never made me angry.
>> Yeah. So, you know, that's kind of the marketplace. So, you know, we're talking about data science and building data science teams. So first tell us your general feeling today about data science, and what you're doing at Cloudera around data science, your team, and your goals. And what is a data scientist?
I mean, this is not, you know, a DBA for Hadoop, you know. Sure. So what's going on?
>> Yeah. So, you know, to kind of reflect on the genesis of the term: when we were building out the data team at Facebook, we kind of had two classes of analysts. We had data analysts, who were more traditional business intelligence: you know, building canned reports, performing data retrieval queries, doing, you know, lightweight analytics. And then we had research scientists, who often had PhDs in things like sociology or economics or psychology. And they were doing much more of the deep-dive, longitudinal, complex modeling exercises. And I really wanted to combine those two things. I didn't want to have those two folks be separate, in the same way that we combined engineering and operations on our data infrastructure group. So I literally just took data analyst and research scientist and put them together and called it data scientist. So that's kind of the origin of the title, and then how that's translating to what we do at Cloudera: I've recently hired two folks into a burgeoning data science group at Cloudera. So the way we see the market evolving is that, you know, the infrastructure is going to be commoditized.
>> Yes. The mindset to really be a data scientist, you know, what is it we should be thinking about? There's no real manual. Most people point to math skills, economics, the kinds of disciplines you mentioned. How should someone prepare themselves? How does someone who wants to hire a data scientist think about it? Those kinds of things.
>> Well, I tend to, you know, I played a lot of sports growing up, and there's this phrase of being a gym rat, which is someone who's always in the gym just practicing whatever sport it is that they love. And I find that most data scientists are sort of data rats. They're always going out after any data.
So, you know, there's a genuine curiosity about seeing what's happening in data that you really can't teach. But in terms of the skills that are required, I didn't really find any one background to be perfect. So I actually put together a course at the University of California, Berkeley, and taught it this spring, called Introduction to Data Science, and I'm teaching it again this coming spring, and they're actually going to put it into the core curriculum in the fall of next year for computer science.
>> Right. Jeff Hammerbacher, thanks so much for that insight. Great, epic talk here on the Cube. Another epic conversation to share with the world, live. Congratulations on the funding, another forty million. It's great validation. And congratulations for essentially being part of data science and founding that whole movement at Facebook. And now, with Amr Awadallah and the team at Cloudera, you continue to do a great job. So congratulations, and with all the competition keeping you fast: capitalism, right?
>> Right. Thank you.
>> But it's okay. It's great, isn't it, that with all these great minds working in this industry, we're so early in this that they still can't really define what a data scientist is?
>> Well, talk about an industry in its infancy. That's what's so exciting. Everyone has a different definition of what it is, and what that means is that it's for everyone. I think data science represents the new everybody. It could be a housewife, it could be a homemaker, to an eighth grader. It doesn't matter. If you see an insight and you see something that could be solved, data is out there, and I think that's the future. And Jeff Hammerbacher talked about spending all this time and technology on undifferentiated heavy lifting. And I'm excited that we are moving beyond that into essentially the human part of big data.
And it's going to have a huge impact, as we talked about before, on the productivity of organizations and potentially the productivity of lives. I mean, look at what we've talked about this afternoon. We've talked about predicting volcanoes. We've talked about, you know, the medical issues. We've talked about pretty much every aspect of life, and I guess that's really the message of this industry now: the folks who are managing big data are looking to change pretty much every aspect of life. This is the biggest inflection point in the history of technology that I've ever seen, in the sense that it truly affects everything: the data that machines generate, the data that humans generate, the data that forests generate. Everything is generating data. So this is a time where we can actually instrument it. So this is why there's this massive disruption in this area. And disruption, we should say for the uninitiated, is a good thing in this business. Wealth creation, entrepreneurship, companies being founded. It's a great opportunity. Well, I appreciate your time. Unfortunately, I think that's going to wrap it up for our Big Data Deep Dive. John and Dave, the Cube guys, have been great. I really appreciate you showing up here and, you know, just lending your insights and expertise and all that. And I want to thank you, the audience, for joining us. So you should stay tuned for the ongoing conversation on the Cube and to EMC TV, to be informed, inspired, and hopefully engaged. I'm Richard Schlessinger. Thank you very much for joining us.

Published Date : Feb 19 2013

