Andy Palmer, TAMR | MIT CDOIQ 2019
>> From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media.
>> Welcome back to MIT, everybody. You're watching theCUBE, the leader in live tech coverage. We're here at day two of the MIT Chief Data Officer and Information Quality Conference. I'm Dave Vellante with Paul Gillin. Andy Palmer's here. He's the co-founder and CEO of Tamr. Good to see you again.
>> It's great to see you.
>> So I didn't ask this of Mike; I could kind of infer it from some of his answers. But why did you guys start Tamr?
>> Well, it really started with an academic project that Mike was doing over at MIT, and I was over at Novartis at the time as the chief data officer over there. And what we really found was that there were a lot of companies really suffering from data mastering as the primary bottleneck in their company. They'd used great new tech like the Vertica system that we'd built and, you know, automated a lot of their warehousing and such. But the real bottleneck was getting lots of data integrated and mastered really, really quickly.
>> Yeah, he took us through the sort of problems with, obviously, the DW in terms of scaling, master data management and the scaling problems. Was that really the problem that you were trying to solve?
>> Yeah, it really was. And when we started, I mean, it was like seven years ago, eight years ago now that we started the company, and maybe almost 10 when we started working on the academic project. At that time, people weren't really thinking or worried about that. They were still kind of digesting big data, as it was called. But I think what Mike and I kind of felt was going on was that people were gonna get over big data and the volume of data, and were going to start worrying about the variety of the data and how to make the data cleaner and more organized. And I think we called that one pretty much right.
Maybe we were a little bit early, but I think now variety is the big problem.
>> Well, the other thing about big data: big data's oftentimes associated with Hadoop, which was batch, and then you sort of saw the shift to real time, and Spark was gonna fix all that. And so what are you seeing in terms of the trends in how data is being used to drive almost near-real-time business decisions?
>> You know, Mike and I came out really specifically back in 2007 and declared that we thought Hadoop and HDFS were going to be far less impactful than other people thought.
>> '07?
>> Yeah, yeah. And Mike actually was really aggressive in saying it was gonna be a disaster. And I think we've finally seen that actually play out a bit, now that the bloom is off the rose, so to speak. And so there are these fundamental things that big companies struggle with in terms of their data: cleaning it up and organizing it and making it the way they want. Anybody that's worked at one of these big companies can tell you that the data they get from most of their internal systems sucks, plain and simple. And so cleaning up that data, turning it into something that's an asset rather than a liability, is really what Tamr's all about. It's kind of our mission. We're out there to do this, and it sort of pales in comparison: you think about the amount of money that some of these companies have spent on systems like SAP, and you're like, yeah, but all the data inside of those systems is so bad and so ugly and unuseful. We're gonna fix that problem.
>> So your special sauce is machine learning. Where are you applying machine learning most effectively?
>> We apply machine learning to probably the least sexy problem on the planet. There are a lot of companies out there that use machine learning and AI to do predictive algorithms and all kinds of cool stuff.
All we do with machine learning is actually use it to clean up data and organize data, to get it ready for people to use AI. I started in the AI industry back in the late 1980s and, you know, really, I learned from this guy, Marvin Minsky. And Marvin taught me two things. The first was garbage in, garbage out: there's no algorithm that's worth anything unless you've got great data. And the second one is it's always about the human and the machine working together. And I've really been working on those same two principles most of my career, and Tamr really brings both of those together. Our goal is to prepare data so that it can be used analytically inside of these companies, so that it's actually high quality and useful. And the way we do that involves bringing together the machine, mostly these advanced machine learning algorithms, with humans, subject matter experts inside of these companies who actually know all the ins and outs and all the intricacies of the data inside of their company.
>> So, as you say, garbage in, garbage out. If you don't have good training data, of course you're not going to get a good ML model. How much upfront work is required? GE, I know, was one of your customers. How much time is required to put together an ML model that can deal with 20,000,000 records like that?
>> Well, you know, the amazing thing that's happened for us in the last five years especially is that now we've built enough models from scratch inside of these large Global 2000 companies that very rarely do we go into a place where we don't already have a model that's pre-built that they can use as a starting point. And I think that's the same thing that's happening in modeling in general. If you look at great companies like DataRobot, and even in the Python community, MLlib, the accessibility of these modeling tools and the models themselves are actually, they're commoditized.
And so for most of our models and most of the projects we work on, we've already got a model that's a starting point. We don't really have to start from scratch.
>> You mentioned getting into AI in the eighties. Is the notion of AI the same as it was in the eighties, and now we've just got the tooling, the horsepower, the data to take advantage of it? Or has the concept changed?
>> The math is all the same, like, you know, absolutely, full stop. There's really no new math. The two things I think that have changed: first, there's a lot more data that's available now. And, you know, neural nets are a great example, right, one of Marvin's things. When you look at Google Translate and how aggressively they used neural nets, it was the quantity of data that was available that actually made neural nets work. The second thing that's changed is the cheap availability of compute: now the largest supercomputer in the world is available to rent by the minute. And so we've got all this data, you've got all this really cheap compute. And then the third thing is what you alluded to earlier: the accessibility of all the math. Now it's becoming so simple and easy to apply these math techniques. It's almost to the point where the average data scientist, not the advanced one, the average data scientist, can practice AI techniques that 20 years ago required five PhDs.
>> It's not surprising that Google, with its neural net technology and all the search data that it has, has been so successful. Does it surprise you that Amazon, with Alexa, was able to compete so effectively?
>> Oh, I think that I would never underestimate Amazon and their ability to, you know, build great tech. They've done some amazing work.
One of my favorites, actually one of Mike's and my favorite examples: in the last three years, they took their Redshift system, you know, that competed with Vertica, and they reimplemented it as a compiled system, and it really runs incredibly fast. I mean, that feat of engineering was truly exceptional.
>> Interesting to hear you say that, because wasn't Redshift originally ParAccel?
>> Yeah, that's right.
>> Larry Ellison craps all over Redshift because it's just open source software that they took and repackaged. But you're saying they did some major engineering too.
>> Oh my gosh, yeah. Mike and I both, you know, we always compared ParAccel to Vertica, and we always knew we were better in a whole bunch of ways. But this latest rewrite that they've done, this compiled version, it's really good.
>> So as a guy who's been doing AI for 30 years now, and is really seeing it come into its own: a lot of AI projects right now seem to be sort of low-hanging fruit, small-scale stuff. Where do you see AI in five years? What kinds of projects are companies gonna be undertaking, and what kinds of new applications are gonna come out of this?
>> I think we're at the very beginning of this cycle, and actually there's a lot more potential than has been realized. So I think we are in the pick-the-low-hanging-fruit kind of phase. But some of the potential applications of AI are so much more impactful, especially as we modernize core infrastructure in the enterprise. The enterprise is sort of living with this huge legacy burden. And we're always encouraging our customers at Tamr to think of all their existing legacy systems as just data-generating machines, and the faster they can get that data into a state where they can start doing state-of-the-art AI work on top of it, the better.
And so really, you know, you've gotta put the legacy burden aside and kind of draw this line in the sand, so that as you really build your muscles on the AI side, you can take advantage of that with all the data that they're generating every single day.
>> Think about these data repositories: the enterprise data warehouse, and you guys built better data warehouses with MPP technology; the master data management stuff; the top-down enterprise data models; Hadoop and big data. None of them really lived up to their promise, you know? It's kind of somewhat unfair to the MPP guys, because you said, hey, we're just gonna run faster, and you did. You didn't say you were gonna change the world and all that stuff, whereas EDW did. Do you feel like this next wave is actually gonna live up to the promise?
>> I think the next phase is very logical. Like, you know, I know you're talking to Chris Lynch here in a minute, and you know what they're doing at AtScale. AtScale and Tamr, these companies are all in the same general area, which is kind of related to: how do you take all this data and actually prepare it and turn it into something that's consumable really quickly and easily for all of these new data consumers in the enterprise? So that's the next logical phase in this process. Now, will this phase be the one that finally meets the high expectations that were set 20, 30 years ago with enterprise data warehousing? I don't know, but we're certainly getting closer.
>> I kind of hope not, 'cause then we'll have less to do. Any other cool stuff that you see out there, technology that just...
>> I'm huge, I'm fanatical right now about health care. I think that the opportunity for health care to be transformed with technology, you know, almost makes everything else look like chump change.
>> What aspect of health care?
>> Well, I think that the most obvious thing is that now, with the consumer sort of in the driver's seat in health care, technology companies that come in and provide consumer-driven solutions that meet the needs of patients, regardless of how dysfunctional the health care system is, that's killer stuff. We had a great company here in Boston called PillPack that was a great example of that, where they just built something better for consumers, and it was so popular and so broadly adopted that eventually Amazon bought it for $1,000,000,000. But those kinds of things in health care, PillPack is just the beginning. There's lots and lots of those kinds of opportunities.
>> Well, that's right. Health care's ripe for disruption, and it hasn't been hit with the digital disruption, and neither has financial services, really. Certainly defense has not yet either. They're high-risk industries, so it takes longer.
>> Absolutely.
>> Well, Andy, thanks so much for making the time. I know you've gotta run.
>> Yeah, yeah. Thank you.
>> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. We'll be right back.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Mike | PERSON | 0.99+ |
Andy | PERSON | 0.99+ |
Andy Palmer | PERSON | 0.99+ |
Mark Marvin | PERSON | 0.99+ |
2007 | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Paul Dillon | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
$1,000,000,000 | QUANTITY | 0.99+ |
Chris Lynch | PERSON | 0.99+ |
Marvin Minsky | PERSON | 0.99+ |
Larry Ellison | PERSON | 0.99+ |
First | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
30 years | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Cambridge, Massachusetts | LOCATION | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
second thing | QUANTITY | 0.99+ |
third thing | QUANTITY | 0.99+ |
20,000,000 records | QUANTITY | 0.99+ |
two same principles | QUANTITY | 0.99+ |
seven years ago | DATE | 0.99+ |
eight years ago | DATE | 0.99+ |
Mike Mike | PERSON | 0.98+ |
three years | QUANTITY | 0.98+ |
late 19 eighties | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
five years | QUANTITY | 0.98+ |
2030 years ago | DATE | 0.98+ |
2nd 1 | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
two things | QUANTITY | 0.97+ |
five PhDs | QUANTITY | 0.97+ |
Day two | QUANTITY | 0.97+ |
Veronica | PERSON | 0.97+ |
M I. T. | PERSON | 0.96+ |
Marvin | PERSON | 0.96+ |
20 years ago | DATE | 0.96+ |
Python | TITLE | 0.96+ |
eighties | DATE | 0.94+ |
2019 | DATE | 0.94+ |
2000 companies | QUANTITY | 0.94+ |
Red Shift | TITLE | 0.94+ |
Duke | ORGANIZATION | 0.93+ |
Alexa | TITLE | 0.91+ |
last five years | DATE | 0.9+ |
M I t | EVENT | 0.88+ |
almost 10 | QUANTITY | 0.87+ |
TAMR | PERSON | 0.86+ |
Andi | PERSON | 0.8+ |
M. I. T. | ORGANIZATION | 0.79+ |
Tamer | ORGANIZATION | 0.78+ |
Information Quality Symposium | EVENT | 0.78+ |
Quality Conference Day Volonte | EVENT | 0.77+ |
Tamer | PERSON | 0.77+ |
Google translate | TITLE | 0.75+ |
single day | QUANTITY | 0.71+ |
H | PERSON | 0.71+ |
Chief | PERSON | 0.66+ |
Hadoop | PERSON | 0.64+ |
MIT | ORGANIZATION | 0.63+ |
Cube | ORGANIZATION | 0.61+ |
more | QUANTITY | 0.6+ |
M. I. T. | PERSON | 0.57+ |
Pill pack | COMMERCIAL_ITEM | 0.56+ |
Pill Pack | ORGANIZATION | 0.53+ |
D f s | ORGANIZATION | 0.48+ |
Park | TITLE | 0.44+ |
CDOIQ | EVENT | 0.32+ |
Cube | PERSON | 0.27+ |