Andy Palmer, TAMR | MIT CDOIQ 2019

>> from Cambridge, Massachusetts. It's the Cube covering M. I. T. Chief Data officer and Information Quality Symposium 2019 Brought to you by Silicon Angle Media >> Welcome back to M I. T. Everybody watching the Cube. The leader in live tech coverage we hear a Day two of the M I t chief data officer information Quality Conference Day Volonte with Paul Dillon. Andy Palmer's here. He's the co founder and CEO of Tamer. Good to see again. It's great to see it actually coming out. So I didn't ask this to Mike. I could kind of infirm from someone's dances. But why did you guys start >> Tamer? >> Well, it really started with an academic project that Mike was doing over at M. I. T. And I was over in of artists at the time. Is the chief get officer over there? And what we really found was that there were a lot of companies really suffering from data mastering as the primary bottleneck in their company did used great new tech like the vertical system that we've built and, you know, automated a lot of their warehousing and such. But the real bottleneck was getting lots of data integrated and mastered really, really >> quickly. Yeah, He took us through the sort of problems with obviously the d. W. In terms of scaling master data management and the scanning problems was Was that really the problem that you were trying to solve? >> Yeah, it really was. And when we started, I mean, it was like, seven years ago, eight years ago, now that we started the company and maybe almost 10 when we started working on the academic project, and at that time, people weren't really thinking are worried about that. They were still kind of digesting big data. A zit was called, but I think what Mike and I kind of felt was going on was that people were gonna get over the big data, Um, and the volume of data. And we're going to start worrying about the variety of the data and how to make the data cleaner and more organized. And, uh, I think I think way called that one pretty much right. Maybe >> we're a little >> bit early, but but I think now variety is the big problem >> with the other thing about your big day. Big data's oftentimes associated with Duke, which was a batch and then you sort of saw the shifter real time and spark was gonna fix all that. And so what are you seeing in terms of the trends in terms of how data is being used to drive almost near real time business decisions. >> You know, Mike and I came out really specifically back in 2007 and declared that we thought, uh, Hadoop and H D f s was going to be far less impactful than other people. >> 07 >> Yeah, Yeah. And Mike Mike actually was really aggressive and saying it was gonna be a disaster. And I think we've finally seen that actually play out of it now that the bloom is off the rose, so to speak. And so they're They're these fundamental things that big companies struggle with in terms of their data and, you know, cleaning it up and organizing it and making it, Iike want. Anybody that's worked at one of these big companies can tell you that the data that they get from most of their internal system sucks plain and simple, and so cleaning up that data, turning it into something it's an asset rather than liability is really what what tamers all about? And it's kind of our mission. We're out there to do this and it sort of pails and compare. Do you think about the amount of money that some of these companies have spent on systems like ASAP on you're like, Yeah, but all the data inside of the systems so bad and so, uh, ugly and unuseful like we're gonna fix that problem. >> So you're you're you're special sauce and machine learning. Where are you applying machine learning most most effectively when >> we apply machine learning to probably the least sexy problem on the planet. There are a lot of companies out there that use machine learning and a I t o do predictive algorithms and all kinds of cool stuff. All we do with machine learning is actually use it to clean up data and organize data. Get it ready for people to use a I I I started in the eye industry back in the late 19 eighties on, you know, really, I learned from the sky. Marvin Minsky and Mark Marvin taught me two things. First was garbage in garbage out. There's no algorithm that's worth anything unless you've got great data, and the 2nd 1 is it's always about the human in the machine working together. And I've really been working on those two same principles most of my career, and Tamer really brings both of those together. Our goal is to prepare data so that it can be used analytically inside of these companies, that it's actually high quality and useful. And the way we do that involves bringing together the machine, mostly these advanced machine learning algorithms with humans, subject matter experts inside of these companies that actually know all the ins and outs and all the intricacies of the data inside of their company. >> So say garbage in garbage out. If you don't have good training data course you're not going good ML model. How much how much upfront work is required. G. I know it was one of your customers and how much time is required to put together on ML model that can deal with 20,000,000 records like that? >> Well, you know, the amazing thing that this happened for us in the last five years, especially is that now we've got we've built enough models from scratch inside of these large global 2000 companies that very rarely do we go into a place where there we don't already have a model that's pre built. That they can use is a starting point. And I think that's the same thing that's happening in modeling in general. If you look a great companies like data robot Andi and even in in the Python community ml live that the accessibility of these modeling tools and the models themselves are actually so they're commoditized. And so most of our models and most of the projects we work on, we've already got a model. That's a starting point. We don't really have to start from scratch. >> You mentioned gonna ta I in the eighties Is that is the notion of a I Is it same as it was in the eighties and now we've just got the tooling, the horsepower, the data to take advantage of it is the concept changed? The >> math is all the same, like, you know, absolutely full stop, like there's really no new math. The two things I think that have changed our first. There's a lot more data that's available now, and, you know, uh, neural nets are a great example, right? in Marvin's things that, you know when you look at Google translate and how aggressively they used neural nets, it was the quantity of data that was available that actually made neural nets work. The second thing that that's that's changed is the cheap availability of Compute that Now the largest supercomputer in the world is available to rent by the minute. And so we've got all this data. You've got all this really cheap compute. And then third thing is what you alluded to earlier. The accessibility of all the math that now it's becoming so simple and easy to apply these math techniques, and they're becoming you know, it's It's almost to the point where the average data scientists not the advance With the average data, scientists can do a practice. Aye, aye. Techniques that 20 years ago required five PhDs. >> It's not surprising that Google, with its new neural net technology, all the search data that it has has been so successful. It's a surprise you that that Amazon with Alexa was able to compete so effectively. >> Oh, I think that I would never underestimate Amazon and their ability to, you know, build great tact. They've done some amazing work. One of my favorite Mike and I actually, one of our favorite examples in the last, uh, three years, they took their red shift system, you know, that competed with with Veronica and they they re implemented it and, you know, as a compiled system and it really runs incredibly fast. I mean, that that feat of engineering, what was truly exceptional >> to hear you say that Because it wasn't Red Shift originally Park. So yeah, that's right, Larry Ellison craps all over Red Shift because it's just open source offer that they just took and repackage. But you're saying they did some major engineering to Oh >> my gosh, yeah, It's like Mike and I both way Never. You know, we always compared par, excelled over tika, and, you know, we always knew we were better in a whole bunch of ways. But this this latest rewrite that they've done this compiled version like it's really good. >> So as a guy has been doing a eye for 30 years now, and it's really seeing it come into its own, a lot of a I project seems right now are sort of low hanging fruit is it's small scale stuff where you see a I in five years what kind of projects are going our bar company's gonna be undertaking and what kind of new applications are gonna come out of this? But >> I think we're at the very beginning of this cycle, and actually there's a lot more potential than has been realized. So I think we are in the pick the low hanging fruit kind of a thing. But some of the potential applications of A I are so much more impactful, especially as we modernize core infrastructure in the enterprise. So the enterprise is sort of living with this huge legacy burden. And we always air encouraging a tamer our customers to think of all their existing legacy systems is just dated generating machines and the faster they can get that data into a state where they can start doing state of the art A. I work on top of it, the better. And so really, you know, you gotta put the legacy burden aside and kind of draw this line in the sand so that as you really get, build their muscles on the A. I side that you can take advantage of that with all the data that they're generating every single day. >> Everything about these data repose. He's Enterprise Data Warehouse. You guys built better with MPP technology. Better data warehouses, the master data management stuff, the top down, you know, Enterprise data models, Dupin in big data, none of them really lived up to their promise, you know? Yeah, it's kind of somewhat unfair toe toe like the MPP guys because you said, Hey, we're just gonna run faster. And you did. But you didn't say you're gonna change the world and all that stuff, right? Where's e d? W? Did Do you feel like this next wave is actually gonna live up to the promise? >> I think the next phase is it's very logical. Like, you know, I know you're talking to Chris Lynch here in a minute, and you know what? They're doing it at scale and at scale and tamer. These companies are all in the same general area. That's kind of related to how do you take all this data and actually prepare it and turn it into something that's consumable really quickly and easily for all of these new data consumers in the enterprise and like so that that's the next logical phase in this process. Now, will this phase be the one that finally sort of meets the high expectations that were set 2030 years ago with enterprise data warehousing? I don't know, but we're certainly getting closer >> to I kind of hoped knockers, and we'll have less to do any other cool stuff that you see out there. That was a technology just >> I'm huge. I'm fanatical right now about health care. I think that the opportunity for health care to be transformed with technology is, you know, almost makes everything else look like chump change. What aspect of health care? Well, I think that the most obvious thing is that now, with the consumer sort of in the driver seat in healthcare, that technology companies that come in and provide consumer driven solutions that meet the needs of patients, regardless of how dysfunctional the health care system is, that's killer stuff. We had a great company here in Boston called Pill Pack was a great example of that where they just build something better for consumers, and it was so popular and so, you know, broadly adopted again again. Eventually, Amazon bought it for $1,000,000,000. But those kinds of things and health care Pill pack is just the beginning. There's lots and lots of those kinds of opportunities. >> Well, it's right. Healthcare's ripe for disruption on, and it hasn't been hit with the digital destruction. And neither is financialservices. Really? Certainly, defenses has not yet another. They're high risk industry, so Absolutely takes longer. Well, Andy, thanks so much for making the time. You know, You gotta run. Yeah. Yeah. Thank you. All right, keep it right. Everybody move back with our next guest right after this short break. You're watching the Cube from M I T c B O Q. Right back.

Published Date : Aug 1 2019

SUMMARY :

you by Silicon Angle Media But why did you guys start like the vertical system that we've built and, you know, the problem that you were trying to solve? now that we started the company and maybe almost 10 when we started working on the academic And so what are you seeing in terms of the trends in terms of how data that we thought, uh, Hadoop and H D f s was going to be far big companies struggle with in terms of their data and, you know, cleaning it up and organizing Where are you applying machine the eye industry back in the late 19 eighties on, you know, If you don't have good training data course And so most of our models and most of the projects we work on, we've already got a model. math is all the same, like, you know, absolutely full stop, like there's really no new math. It's a surprise you that that Amazon implemented it and, you know, as a compiled system and to hear you say that Because it wasn't Red Shift originally Park. we always compared par, excelled over tika, and, you know, we always knew we were better in a whole bunch of ways. And so really, you know, you gotta put the legacy of them really lived up to their promise, you know? That's kind of related to how do you take all this data and actually to I kind of hoped knockers, and we'll have less to do any other cool stuff that you see out health care to be transformed with technology is, you know, Well, Andy, thanks so much for making the time.

ENTITIES

Entity	Category	Confidence
Mike	PERSON	0.99+
Andy	PERSON	0.99+
Andy Palmer	PERSON	0.99+
Mark Marvin	PERSON	0.99+
2007	DATE	0.99+
Amazon	ORGANIZATION	0.99+
Paul Dillon	PERSON	0.99+
Boston	LOCATION	0.99+
$1,000,000,000	QUANTITY	0.99+
Chris Lynch	PERSON	0.99+
Marvin Minsky	PERSON	0.99+
Larry Ellison	PERSON	0.99+
First	QUANTITY	0.99+
both	QUANTITY	0.99+
30 years	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
second thing	QUANTITY	0.99+
third thing	QUANTITY	0.99+
20,000,000 records	QUANTITY	0.99+
two same principles	QUANTITY	0.99+
seven years ago	DATE	0.99+
eight years ago	DATE	0.99+
Mike Mike	PERSON	0.98+
three years	QUANTITY	0.98+
late 19 eighties	DATE	0.98+
first	QUANTITY	0.98+
five years	QUANTITY	0.98+
2030 years ago	DATE	0.98+
2nd 1	QUANTITY	0.98+
one	QUANTITY	0.98+
One	QUANTITY	0.98+
two things	QUANTITY	0.97+
five PhDs	QUANTITY	0.97+
Day two	QUANTITY	0.97+
Veronica	PERSON	0.97+
M I. T.	PERSON	0.96+
Marvin	PERSON	0.96+
20 years ago	DATE	0.96+
Python	TITLE	0.96+
eighties	DATE	0.94+
2019	DATE	0.94+
2000 companies	QUANTITY	0.94+
Red Shift	TITLE	0.94+
Duke	ORGANIZATION	0.93+
Alexa	TITLE	0.91+
last five years	DATE	0.9+
M I t	EVENT	0.88+
almost 10	QUANTITY	0.87+
TAMR	PERSON	0.86+
Andi	PERSON	0.8+
M. I. T.	ORGANIZATION	0.79+
Tamer	ORGANIZATION	0.78+
Information Quality Symposium	EVENT	0.78+
Quality Conference Day Volonte	EVENT	0.77+
Tamer	PERSON	0.77+
Google translate	TITLE	0.75+
single day	QUANTITY	0.71+
H	PERSON	0.71+
Chief	PERSON	0.66+
Hadoop	PERSON	0.64+
MIT	ORGANIZATION	0.63+
Cube	ORGANIZATION	0.61+
more	QUANTITY	0.6+
M. I. T.	PERSON	0.57+
Pill pack	COMMERCIAL_ITEM	0.56+
Pill Pack	ORGANIZATION	0.53+
D f s	ORGANIZATION	0.48+
Park	TITLE	0.44+
CDOIQ	EVENT	0.32+
Cube	PERSON	0.27+

Michael Stonebraker, TAMR | MIT CDOIQ 2019

>> from Cambridge, Massachusetts. It's the Cube covering M I T. Chief data officer and information quality Symposium 2019. Brought to you by Silicon Angle Media. >> Welcome back to Cambridge, Massachusetts. Everybody, You're watching the Cube, the leader in live tech coverage, and we're covering the M I t CDO conference M I t. CDO. My name is David Monty in here with my co host, Paul Galen. Mike Stone breakers here. The legend is founder CTO of Of Tamer, as well as many other companies. Inventor Michael. Thanks for coming back in the Cube. Good to see again. Nice to be here. So this is kind of ah, repeat pattern for all of us. We kind of gather here in August that the CDO conference You're always the highlight of the show. You gave a talk this week on the top 10. Big data mistakes. You and I are one of the few. You were the few people who still use the term big data. I happen to like it. Sad that it's out of vogue already, but people associated with the doo doop it's kind of waning, but regardless, so welcome. How'd the talk go? What were you talking about. >> So I talked to a lot of people who were doing analytics. We're doing operation Offer operational day of data at scale, and they always make most of them make a collection of bad mistakes. And so the talk waas a litany of the blunders that I've seen people make, and so the audience could relate to the blunders about most. Most of the enterprise is represented. Make a bunch of the blunders. So I think no. One blunder is not planning on moving most everything to the cloud. >> So that's interesting, because a lot of people would would would love to debate that, but and I would imagine you probably could have done this 10 years ago in a lot of the blunders would be the same, but that's one that wouldn't have been there. But so I tend to agree. I was one of the two hands that went up this morning, and vocalist talk when he asked, Is the cloud cheaper for us? It is anyway. But so what? Why should everybody move everything? The cloud aren't there laws of physics, laws of economics, laws of the land that suggest maybe you >> shouldn't? Well, I guess 22 things and then a comment. First thing is James Hamilton, who's no techies. Techie works for Amazon. We know James. So he claims that he could stand up a server for 25% of your cost. I have no reason to disbelieve him. That number has been pretty constant for a few years, so his cost is 1/4 of your cost. Sooner or later, prices are gonna reflect costs as there's a race to the bottom of cloud servers. So >> So can I just stop you there for a second? Because you're some other date on that. All you have to do is look at a W S is operating margin and you'll see how profitable they are. They have software like economics. Now we're deploying servers. So sorry to interrupt, but so carry. So >> anyway, sooner or later, they're gonna have their gonna be wildly cheaper than you are. The second, then yet is from Dave DeWitt, whose database wizard. And here's the current technology that that Microsoft Azure is using. As of 18 months ago, it's shipping containers and parking lots, chilled water in power in Internet, Ian otherwise sealed roof and walls optional. So if you're doing raised flooring in Cambridge versus I'm doing shipping containers in the Columbia River Valley, who's gonna be a lot cheaper? And so you know the economies of scale? I mean, that, uh, big, big cloud guys are building data centers as fast as they can, using the cheapest technology around. You put up the data center every 10 years on dhe. You do it on raised flooring in Cambridge. So sooner or later, the cloud guys are gonna be a lot cheaper. And the only thing that isn't gonna the only thing that will change that equation is For example, my lab is up the street with Frank Gehry building, and we have we have an I t i t department who runs servers in Cambridge. Uh, and they claim they're cheaper than the cloud. And they don't pay rent for square footage and they don't pay for electricity. So yeah, if if think externalities, If there are no externalities, the cloud is assuredly going to be cheaper. And then the other thing is that most everybody tonight that I talk thio including me, has very skewed resource demands. So in the cloud finding three servers, except for the last day of the month on the last day of the month. I need 20 servers. I just do it. If I'm doing on Prem, I've got a provision for peak load. And so again, I'm just way more expensive. So I think sooner or later these combinations of effects was going to send everybody to the cloud for most everything, >> and my point about the operating margins is difference in price and cost. I think James Hamilton's right on it. If he If you look at the actual cost of deploying, it's even lower than the price with the market allows them to their growing at 40 plus percent a year and a 35 $40,000,000,000 run rate company sooner, Sooner or >> later, it's gonna be a race to the lot of you >> and the only guys are gonna win. You have guys have the best cost structure. A >> couple other highlights from your talk. >> Sure, I think 2nd 2nd thing like Thio Thio, no stress is that machine learning is going to be a game is going to be a game changer for essentially everybody. And not only is it going to be autonomous vehicles. It's gonna be automatic. Check out. It's going to be drone delivery of most everything. Uh, and so you can, either. And it's gonna affect essentially everybody gonna concert of, say, categorically. Any job that is easy to understand is going to get automated. And I think that's it's gonna be majorly impactful to most everybody. So if you're in Enterprise, you have two choices. You can be a disrupt or or you could be a disruptive. And so you can either be a taxi company or you can be you over, and it's gonna be a I machine learning that's going going to be determined which side of that equation you're on. So I was a big blunder that I see people not taking ml incredibly seriously. >> Do you see that? In fact, everyone I talked who seems to be bought in that this is we've got to get on the bandwagon. Yeah, >> I'm just pointing out the obvious. Yeah, yeah, I think, But one that's not quite so obvious you're is a lot of a lot of people I talked to say, uh, I'm on top of data science. I've hired a group of of 10 data scientists, and they're doing great. And when I talked, one vignette that's kind of fun is I talked to a data scientist from iRobot, which is the guys that have the vacuum cleaner that runs around your living room. So, uh, she said, I spend 90% of my time locating the data. I want to analyze getting my hands on it and cleaning it, leaving the 10% to do data science job for which I was hired. Of the 10% I spend 90% fixing the data cleaning errors in my data so that my models work. So she spends 99% of her time on what you call data preparation 1% of her time doing the job for which he was hired. So data science is not about data science. It's about data integration, data cleaning, data, discovery. >> But your new latest venture, >> so tamer does that sort of stuff. And so that's But that's the rial data science problem. And a lot of people don't realize that yet, And, uh, you know they will. I >> want to ask you because you've been involved in this by my count and starting up at least a dozen companies. Um, 99 Okay, It's a lot. >> It's not overstated. You estimated high fall. How do you How >> do you >> decide what challenge to move on? Because they're really not. You're not solving the same problems. You're You're moving on to new problems. How do you decide? What's the next thing that interests you? Enough to actually start a company. Okay, >> that's really easy. You know, I'm on the faculty of M i t. My job is to think of news new ship and investigate it, and I come up. No, I'm paid to come up with new ideas, some of which have commercial value, some of which don't and the ones that have commercial value, like, commercialized on. So it's whatever I'm doing at the time on. And that's why all the things I've commercialized, you're different >> s so going back to tamer data integration platform is a lot of companies out there claim to do it day to get integration right now. What did you see? What? That was the deficit in the market that you could address. >> Okay, great question. So there's the traditional data. Integration is extract transforming load systems and so called Master Data management systems brought to you by IBM in from Attica. Talent that class of folks. So a dirty little secret is that that technology does not scale Okay, in the following sense that it's all well, e t l doesn't scale for a different reason with an m d l e t l doesn't scale because e t. L is based on the premise that somebody really smart comes up with a global data model For all the data sources you want put together. You then send a human out to interview each business unit to figure out exactly what data they've got and then how to transform it into the global data model. How to load it into your data warehouse. That's very human intensive. And it doesn't scale because it's so human intensive. So I've never talked to a data warehouse operator who who says I integrate the average I talk to says they they integrate less than 10 data sources. Some people 20. If you twist my arm hard, I'll give you 50. So a Here. Here's a real world problem, which is Toyota Motor Europe. I want you right now. They have a distributor in Spain, another distributor in France. They have a country by country distributor, sometimes canton by Canton. Distribute distribution. So if you buy a Toyota and Spain and move to France, Toyota develops amnesia. The French French guys know nothing about you. So they've got 250 separate customer databases with 40,000,000 total records in 50 languages. And they're in the process of integrating that. It was single customer database so that they can Duke custom. They could do the customer service we expect when you cross cross and you boundary. I've never seen an e t l system capable of dealing with that kind of scale. E t l dozen scale to this level of problem. >> So how do you solve that problem? >> I'll tell you that they're a tamer customer. I'll tell you all about it. Let me first tell you why MGM doesn't scare. >> Okay. Great. >> So e t l says I now have all your data in one place in the same format, but now you've got following problems. You've got a d duplicated because if if I if I bought it, I bought a Toyota in Spain, I bought another Toyota in France. I'm both databases. So if you want to avoid double counting customers, you got a dupe. Uh, you know, got Duke 30,000,000 records. And so MGM says Okay, you write some rules. It's a rule based technology. So you write a rule. That's so, for example, my favorite example of a rule. I don't know if you guys like to downhill downhill skiing, All right? I love downhill skiing. So ski areas, Aaron, all kinds of public databases assemble those all together. Now you gotta figure out which ones are the same the same ski area, and they're called different names in different addresses and so forth. However, a vertical drop from bottom to the top is the same. Chances are they're the same ski area. So that's a rule that says how to how to put how to put data together in clusters. And so I now have a cluster for mount sanity, and I have a problem which is, uh, one address says something rather another address as something else. Which one is right or both? Right, so now you want. Now you have a gold. Let's call the golden Record problem to basically decide which, which, which data elements among a variety that maybe all associated with the same entity are in fact correct. So again, MDM, that's a rule's a rule based system. So it's a rule based technology and rule systems don't scale the best example I can give you for why Rules systems don't scale. His tamer has another customer. General Electric probably heard of them, and G wanted to do spend analytics, and so they had 20,000,000 spend transactions. Frank the year before last and spend transaction is I paid $12 to take a cab from here here to the airport, and I charged it to cost center X Y Z 20,000,000 of those so G has a pre built classification system for spend, so they have parts and underneath parts or computers underneath computers and memory and so forth. So pre existing preexisting class classifications for spend they want to simply classified 20,000,000 spent transactions into this pre existing hierarchy. So the traditional technology is, well, let's write some rules. So G wrote 500 rules, which is about the most any single human I can get there, their arms around so that classified 2,000,000 of the 20,000,000 transactions. You've now got 18 to go and another 500 rules is not going to give you 2,000,000 more. It's gonna give you love diminishing returns, right? So you have to write a huge number of rules and no one can possibly understand. So the technology simply doesn't scale, right? So in the case of G, uh, they had tamer health. Um, solve this. Solved this classification problem. Tamer used their 2,000,000 rule based, uh, tag records as training data. They used an ML model, then work off the training data classifies remaining 18,000,000. So the answer is machine learning. If you don't use machine learning, you're absolutely toast. So the answer to MDM the answer to MGM doesn't scale. You've got to use them. L The answer to each yell doesn't scale. You gotta You're putting together disparate records can. The answer is ml So you've got to replace humans by machine learning. And so that's that seems, at least in this conference, that seems to be resonating, which is people are understanding that at scale tradition, traditional data integration, technology's just don't work >> well and you got you got a great shot out on yesterday from the former G S K Mark Grams, a leader Mark Ramsay. Exactly. Guys. And how they solve their problem. He basically laid it out. BTW didn't work and GM didn't work, All right. I mean, kick it, kick the can top down data modelling, didn't work, kicked the candid governance That's not going to solve the problem. And But Tamer did, along with some other tooling. Obviously, of course, >> the Well, the other thing is No. One technology. There's no silver bullet here. It's going to be a bunch of technologies working together, right? Mark Ramsay is a great example. He used his stream sets and a bunch of other a bunch of other startup technology operating together and that traditional guys >> Okay, we're good >> question. I want to show we have time. >> So with traditional vendors by and large or 10 years behind the times, And if you want cutting edge stuff, you've got to go to start ups. >> I want to jump. It's a different topic, but I know that you in the past were critic of know of the no sequel movement, and no sequel isn't going away. It seems to be a uh uh, it seems to be actually gaining steam right now. What what are the flaws in no sequel? It has your opinion changed >> all? No. So so no sequel originally meant no sequel. Don't use it then. Then the marketing message changed to not only sequel, So sequel is fine, but no sequel does others. >> Now it's all sequel, right? >> And my point of view is now. No sequel means not yet sequel because high level language, high level data languages, air good. Mongo is inventing one Cassandra's inventing one. Those unless you squint, look like sequel. And so I think the answer is no sequel. Guys are drifting towards sequel. Meanwhile, Jason is That's a great idea. If you've got your regular data sequel, guys were saying, Sure, let's have Jason is the data type, and I think the only place where this a fair amount of argument is schema later versus schema first, and I pretty much think schema later is a bad idea because schema later really means you're creating a data swamp exactly on. So if you >> have to fix it and then you get a feel of >> salary, so you're storing employees and salaries. So, Paul salaries recorded as dollars per month. Uh, Dave, salary is in euros per week with a lunch allowance minds. So if you if you don't, If you don't deal with irregularities up front on data that you care about, you're gonna create a mess. >> No scheme on right. Was convenient of larger store, a lot of data cheaply. But then what? Hard to get value out of it created. >> So So I think the I'm not opposed to scheme later. As long as you realize that you were kicking the can down the road and you're just you're just going to give your successor a big mess. >> Yeah, right. Michael, we gotta jump. But thank you so much. Sure appreciate it. All right. Keep it right there, everybody. We'll be back with our next guest right into the short break. You watching the cue from M i t cdo Ike, you right back

Published Date : Aug 1 2019

SUMMARY :

Brought to you by We kind of gather here in August that the CDO conference You're always the highlight of the so the audience could relate to the blunders about most. physics, laws of economics, laws of the land that suggest maybe you So he claims that So can I just stop you there for a second? And so you know the and my point about the operating margins is difference in price and cost. You have guys have the best cost structure. And so you can either be a taxi company got to get on the bandwagon. leaving the 10% to do data science job for which I was hired. But that's the rial data science problem. want to ask you because you've been involved in this by my count and starting up at least a dozen companies. How do you How You're You're moving on to new problems. No, I'm paid to come up with new ideas, s so going back to tamer data integration platform is a lot of companies out there claim to do and so called Master Data management systems brought to you by IBM I'll tell you that they're a tamer customer. So the answer to MDM the I mean, kick it, kick the can top down data modelling, It's going to be a bunch of technologies working together, I want to show we have time. and large or 10 years behind the times, And if you want cutting edge It's a different topic, but I know that you in the past were critic of know of the no sequel movement, No. So so no sequel originally meant no So if you So if you if Hard to get value out of it created. So So I think the I'm not opposed to scheme later. But thank you so much.

ENTITIES

Entity	Category	Confidence
Michael	PERSON	0.99+
James	PERSON	0.99+
Mark Ramsay	PERSON	0.99+
James Hamilton	PERSON	0.99+
Paul Galen	PERSON	0.99+
Dave DeWitt	PERSON	0.99+
Toyota	ORGANIZATION	0.99+
David Monty	PERSON	0.99+
General Electric	ORGANIZATION	0.99+
2,000,000	QUANTITY	0.99+
France	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
20,000,000	QUANTITY	0.99+
10%	QUANTITY	0.99+
Michael Stonebraker	PERSON	0.99+
Cambridge	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
50	QUANTITY	0.99+
$12	QUANTITY	0.99+
Spain	LOCATION	0.99+
18,000,000	QUANTITY	0.99+
25%	QUANTITY	0.99+
20 servers	QUANTITY	0.99+
90%	QUANTITY	0.99+
Columbia River Valley	LOCATION	0.99+
99%	QUANTITY	0.99+
18	QUANTITY	0.99+
Aaron	PERSON	0.99+
Dave	PERSON	0.99+
August	DATE	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
three servers	QUANTITY	0.99+
35 $40,000,000,000	QUANTITY	0.99+
50 languages	QUANTITY	0.99+
500 rules	QUANTITY	0.99+
22 things	QUANTITY	0.99+
10 data scientists	QUANTITY	0.99+
Mike Stone	PERSON	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
MGM	ORGANIZATION	0.99+
less than 10 data sources	QUANTITY	0.99+
Ian	PERSON	0.99+
Paul	PERSON	0.99+
1%	QUANTITY	0.99+
both	QUANTITY	0.99+
Toyota Motor Europe	ORGANIZATION	0.99+
Of Tamer	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
one	QUANTITY	0.99+
single	QUANTITY	0.99+
Attica	ORGANIZATION	0.99+
10 years ago	DATE	0.99+
yesterday	DATE	0.99+
iRobot	ORGANIZATION	0.99+
Mark Grams	PERSON	0.99+
TAMR	PERSON	0.99+
10 years	QUANTITY	0.99+
20	QUANTITY	0.98+
1/4	QUANTITY	0.98+
250 separate customer databases	QUANTITY	0.98+
Cassandra	PERSON	0.98+
First thing	QUANTITY	0.98+
30,000,000 records	QUANTITY	0.98+
both databases	QUANTITY	0.98+
18 months ago	DATE	0.98+
first	QUANTITY	0.98+
M I t CDO	EVENT	0.98+
One blunder	QUANTITY	0.98+
Tamer	PERSON	0.98+
one place	QUANTITY	0.98+
second	QUANTITY	0.97+
two choices	QUANTITY	0.97+
tonight	DATE	0.97+
each business unit	QUANTITY	0.97+
Thio Thio	PERSON	0.97+
two hands	QUANTITY	0.96+
this week	DATE	0.96+
Frank	PERSON	0.95+
Duke	ORGANIZATION	0.95+

Colin Mahony, Vertica | MIT CDOIQ 2019

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, you're watching The Cube, the leader in tech coverage. My name is Dave Vellante here with my cohost Paul Gillin. This is day one of our two day coverage of the MIT CDOIQ conferences. CDO, Chief Data Officer, IQ, information quality. Colin Mahoney is here, he's a good friend and long time CUBE alum. I haven't seen you in awhile, >> I know >> But thank you so much for taking some time, you're like a special guest here >> Thank you, yeah it's great to be here, thank you. >> Yeah, so, this is not, you know, something that you would normally attend. I caught up with you, invited you in. This conference has started as, like back office governance, information quality, kind of wonky stuff, hidden. And then when the big data meme took off, kind of around the time we met. The Chief Data Officer role emerged, the whole Hadoop thing exploded, and then this conference kind of got bigger and bigger and bigger. Still intimate, but very high level, very senior. It's kind of come full circle as we've been saying, you know, information quality still matters. You have been in this data business forever, so I wanted to invite you in just to get your perspectives, we'll talk about what's new with what's going on in your company, but let's go back a little bit. When we first met and even before, you saw it coming, you kind of invested your whole career into data. So, take us back 10 years, I mean it was so different, remember it was Batch, it was Hadoop, but it was cool. There was a lot of cool >> It's still cool. (laughs) projects going on, and it's still cool. But, take a look back. >> Yeah, so it's changed a lot, look, I got into it a while ago, I've always loved data, I had no idea, the explosion and the three V's of data that we've seen over the last decade. But, data's really important, and it's just going to get more and more important. But as I look back I think what's really changed, and even if you just go back a decade I mean, there's an insatiable appetite for data. And that is not slowing down, it hasn't slowed down at all, and I think everybody wants that perfect solution that they can ask any question and get an immediate answers to. We went through the Hadoop boom, I'd argue that we're going through the Hadoop bust, but what people actually want is still the same. You know, they want real answers, accurate answers, they want them quickly, and they want it against all their information and all their data. And I think that Hadoop evolved a lot as well, you know, it started as one thing 10 years ago, with MapReduce and I think in the end what it's really been about is disrupting the storage market. But if you really look at what's disrupting storage right now, public clouds, S3, right? That's the new data league. So there's always a lot of hype cycles, everybody talks about you know, now it's Cloud, everything, for maybe the last 10 years it was a lot of Hadoop, but at the end of the day I think what people want to do with data is still very much the same. And a lot of companies are still struggling with it, hence the role for Chief Data Officers to really figure out how do I monetize data on the one hand and how to I protect that asset on the other hand. >> Well so, and the cool this is, so this conference is not a tech conference, really. And we love tech, we love talking about this, this is why I love having you on. We kind of have a little Vertica thread that I've created here, so Colin essentially, is the current CEO of Vertica, I know that's not your title, you're GM and Senior Vice President, but you're running Vertica. So, Michael Stonebreaker's coming on tomorrow, >> Yeah, excellent. >> Chris Lynch is coming on tomorrow, >> Oh, great, yeah. >> we've got Andy Palmer >> Awesome, yeah. >> coming up as well. >> Pretty cool. (laughs) >> So we have this connection, why is that important? It's because, you know, Vertica is a very cool company and is all about data, and it was all about disrupting, sort of the traditional relational database. It's kind of doing more with data, and if you go back to the roots of Vertica, it was like how do you do things faster? How do you really take advantage of data to really drive new business? And that's kind of what it's all about. And the tech behind it is really cool, we did your conference for many, many years. >> It's coming back by the way. >> Is it? >> Yeah, this March, so March 30th. >> Oh, wow, mark that down. >> At Boston, at the new Encore Hotel. >> Well we better have theCUBE there, bro. (laughs) >> Yeah, that's great. And yeah, you've done that conference >> Yep. >> haven't you before? So very cool customers, kind of leading edge, so I want to get to some of that, but let's talk the disruption for a minute. So you guys started with the whole architecture, MPP and so forth. And you talked about Cloud, Cloud really disrupted Hadoop. What are some of the other technology disruptions that you're seeing in the market space? >> I think, I mean, you know, it's hard not to talk about AI machine learning, and what one means versus the other, who knows right? But I think one thing that is definitely happening is people are leveraging the volumes of data and they're trying to use all the processing power and storage power that we have to do things that humans either are too expensive to do or simply can't do at the same speed and scale. And so, I think we're going through a renaissance where a lot more is being automated, certainly on the Vertica roadmap, and our path has always been initially to get the data in and then we want the platform to do a lot more for our customers, lots more analytics, lots more machine-learning in the platform. So that's definitely been a lot of the buzz around, but what's really funny is when you talk to a lot of customers they're still struggling with just some basic stuff. Forget about the predictive thing, first you've got to get to what happened in the past. Let's give accurate reporting on what's actually happening. The other big thing I think as a disruption is, I think IOT, for all the hype that it's getting it's very real. And every device is kicking off lots of information, the feedback loop of AB testing or quality testing for predictive maintenance, it's happening almost instantly. And so you're getting massive amounts of new data coming in, it's all this machine sensor type data, you got to figure out what it means really quick, and then you actually have to do something and act on it within seconds. And that's a whole new area for so many people. It's not their traditional enterprise data network warehouse and you know, back to you comment on Stonebreaker, he got a lot of this right from the beginning, you know, and I think he looked at the architectures, he took a lot of the best in class designs, we didn't necessarily invent everything, but we put a lot of that together. And then I think the other you've got to do is constantly re-invent your platform. We came out with our Eon Mode to run cloud native, we just got rated the best cloud data warehouse from a net promoter score rating perspective, so, but we got to keep going you know, we got to keep re-inventing ourselves, but leverage everything that we've done in the past as well. >> So one of the things that you said, which is kind of relevant for here, Paul, is you're still seeing a real data quality issue that customers are wrestling with, and that's a big theme here, isn't it? >> Absolutely, and the, what goes around comes around, as Dave said earlier, we're still talking about information quality 13 years after this conference began. Have the tools to improve quality improved all that much? >> I think the tools have improved, I think that's another area where machine learning, if you look at Tamr, and I know you're going to have Andy here tomorrow, they're leveraging a lot of the augmented things you can do with the processing to make it better. But I think one thing that makes the problem worse now, is it's gotten really easy to pour data in. It's gotten really easy to store data without having to have the right structure, the right quality, you know, 10 years ago, 20 years ago, everything was perfect before it got into the platform. Right, everything was, there was quality, everything was there. What's been happening over the last decade is you're pumping data into these systems, nobody knows if it's redundant data, nobody knows if the quality's any good, and the amount of data is massive. >> And it's cheap to store >> Very cheap to store. >> So people keep pumping it in. >> But I think that creates a lot of issues when it comes to data quality. So, I do think the technology's gotten better, I think there's a lot of companies that are doing a great job with it, but I think the challenge has definitely upped. >> So, go ahead. >> I'm sorry. You mentioned earlier that we're seeing the death of Hadoop, but I'd like you to elaborate on that becuase (Dave laughs) Hadoop actually came up this morning in the keynote, it's part of what GlaxoSmithKline did. Came up in a conversation I had with the CEO of Experian last week, I mean, it's still out there, why do you think it's in decline? >> I think, I mean first of all if you look at the Hadoop vendors that are out there, they've all been struggling. I mean some of them are shutting down, two of them have merged and they've got killed lately. I think there are some very successful implementations of Hadoop. I think Hadoop as a storage environment is wonderful, I think you can process a lot of data on Hadoop, but the problem with Hadoop is it became the panacea that was going to solve all things data. It was going to be the database, it was going to be the data warehouse, it was going to do everything. >> That's usually the kiss of death, isn't it? >> It's the kiss of death. And it, you know, the killer app on Hadoop, ironically, became SQL. I mean, SQL's the killer app on Hadoop. If you want to SQL engine, you don't need Hadoop. But what we did was, in the beginning Mike sort of made fun of it, Stonebreaker, and joked a lot about he's heard of MapReduce, it's called Group By, (Dave laughs) and that created a lot of tension between the early Vertica and Hadoop. I think, in the end, we embraced it. We sit next to Hadoop, we sit on top of Hadoop, we sit behind it, we sit in front of it, it's there. But I think what the reality check of the industry has been, certainly by the business folks in these companies is it has not fulfilled all the promises, it has not fulfilled a fraction on the promises that they bet on, and so they need to figure those things out. So I don't think it's going to go away completely, but I think its best success has been disrupting the storage market, and I think there's some much larger disruptions of technologies that frankly are better than HTFS to do that. >> And the Cloud was a gamechanger >> And a lot of them are in the cloud. >> Which is ironic, 'cause you know, cloud era, (Colin laughs) they didn't really have a cloud strategy, neither did Hortonworks, neither did MapR and, it just so happened Amazon had one, Google had one, and Microsoft has one, so, it's just convenient to-- >> Well, how is that affecting your business? We've seen this massive migration to the cloud (mumbles) >> It's actually been great for us, so one of the things about Vertica is we run everywhere, and we made a decision a while ago, we had our own data warehouse as a service offering. It might have been ahead of its time, never really took off, what we did instead is we pivoted and we say "you know what? "We're going to invest in that experience "so it's a SaaS-like experience, "but we're going to let our customers "have full control over the cloud. "And if they want to go to Amazon they can, "if they want to go to Google they can, "if they want to go to Azure they can." And we really invested in that and that experience. We're up on the Amazon marketplace, we have lots of customers running up on Amazon Cloud as well as Google and Azure now, and then about two years ago we went down and did this endeavor to completely re-architect our product so that we could separate compute and storage so that our customers could actually take advantage of the cloud economics as well. That's been huge for us, >> So you scale independent-- >> Scale independently, cloud native, add compute, take away compute, and for our existing customers, they're loving the hybrid aspect, they love that they can still run on Premise, they love that they can run up on a public cloud, they love that they can run in both places. So we will continue to invest a lot in that. And it is really, really important, and frankly, I think cloud has helped Vertica a lot, because being able to provision hardware quickly, being able to tie in to these public clouds, into our customers' accounts, give them control, has been great and we're going to continue on that path. >> Because Vertica's an ISV, I mean you're a software company. >> We're a software company. >> I know you were a part of HP for a while, and HP wanted to mash that in and run it on it's hardware, but software runs great in the cloud. And then to you it's another hardware platform. >> It's another hardware platform, exactly. >> So give us the update on Micro Focus, Micro Focus acquired Vertica as part of the HPE software business, how many years ago now? Two years ago? >> Less than two years ago. >> Okay, so how's that going, >> It's going great. >> Give us the update there. >> Yeah, so first of all it is great, HPE and HP were wonderful to Vertica, but it's great being part of a software company. Micro Focus is a software company. And more than just a software company it's a company that has a lot of experience bridging the old and the new. Leveraging all of the investments that you've made but also thinking about cloud and all these other things that are coming down the pike. I think for Vertica it's been really great because, as you've seen Vertica has gotten its identity back again. And that's something that Micro Focus is very good at. You can look at what Micro Focus did with SUSE, the Linux company, which actually you know, now just recently spun out of Micro Focus but, letting organizations like Vertica that have this culture, have this product, have this passion, really focus on our market and our customers and doing the right thing by them has been just really great for us and operating as a software company. The other nice thing is that we do integrate with a lot of other products, some of which came from the HPE side, some of which came from Micro Focus, security products is an example. The other really nice thing is we've been doing this insource thing at Micro Focus where we open up our source code to some of the other teams in Micro Focus and they've been contributing now in amazing ways to the product. In ways that we would just never be able to scale, but with 4,000 engineers strong in Micro Focus, we've got a much larger development organization that can actually contribute to the things that Vertica needs to do. And as we go into the cloud and as we do a lot more operational aspects, the experience that these teams have has been incredible, and security's another great example there. So overall it's been great, we've had four different owners of Vertica, our job is to continue what we do on the innovation side in the culture, but so far Micro Focus has been terrific. >> Well, I'd like to say, you're kind of getting that mojo back, because you guys as an independent company were doing your own thing, and then you did for a while inside of HP, >> We did. >> And that obviously changed, 'cause they wanted more integration, but, and Micro Focus, they know what they're doing, they know how to do acquisitions, they've been very successful. >> It's a very well run company, operationally. >> The SUSE piece was really interesting, spinning that out, because now RHEL is part of IBM, so now you've got SUSE as the lone independent. >> Yeah. >> Yeah. >> But I want to ask you, go back to a technology question, is NoSQL the next Hadoop? Are these databases, it seems to be that the hot fad now is NoSQL, it can do anything. Is the promise overblown? >> I think, I mean NoSQL has been out almost as long as Hadoop, and I, we always say not only SQL, right? Mike's said this from day one, best tool for the job. Nothing is going to do every job well, so I think that there are, whether it's key value stores or other types of NoSQL engines, document DB's, now you have some of these DB's that are running on different chips, >> Graph, yeah. >> there's always, yeah, graph DBs, there's always going to be specialty things. I think one of the things about our analytic platform is we can do, time series is a great example. Vertica's a great time series database. We can compete with specialized time series databases. But we also offer a lot of, the other things that you can do with Vertica that you wouldn't be able to do on a database like that. So, I always think there's going to be specialty products, I also think some of these can do a lot more workloads than you might think, but I don't see as much around the NoSQL movement as say I did a few years ago. >> But so, and you mentioned the cloud before as kind of, your position on it I think is a tailwind, not to put words in your mouth, >> Yeah, yeah, it's a great tailwind. >> You're in the Amazon marketplace, I mean they have products that are competitive, right? >> They do, they do. >> But, so how are you differentiating there? >> I think the way we differentiate, whether it's Redshift from Amazon, or BigQuery from Google, or even what Azure DB does is, first of all, Vertica, I think from, feature functionality and performance standpoint is ahead. Number one, I think the second thing, and we hear this from a lot of customers, especially at the C-level is they don't want to be locked into these full stacks of the clouds. Having the ability to take a product and run it across multiple clouds is a big thing, because the stack lock-in now, the full stack lock-in of these clouds is scary. It's really easy to develop in their ecosystems but you get very locked into them, and I think a lot of people are concerned about that. So that works really well for Vertica, but I think at the end of the day it's just, it's the robustness of the product, we continue to innovate, when you look at separating compute and storage, believe it or not, a lot of these cloud-native databases don't do that. And so we can actually leverage a lot of the cloud hardware better than the native cloud databases do themselves. So, like I said, we have to keep going, those guys aren't going to stop, and we actually have great relationships with those companies, we work really well with the clouds, they seem to care just as much about their cloud ecosystem as their own database products, and so I think that's going to continue as well. >> Well, Colin, congratulations on all the success >> Yeah, thank you, yeah. >> It's awesome to see you again and really appreciate you coming to >> Oh thank you, it's great, I appreciate the invite, >> MIT. >> it's great to be here. >> All right, keep it right there everybody, Paul and I will be back with our next guest from MIT, you're watching theCUBE. (electronic jingle)

Published Date : Jul 31 2019

SUMMARY :

brought to you by SiliconANGLE Media. I haven't seen you in awhile, kind of around the time we met. It's still cool. but at the end of the day I think is the current CEO of Vertica, (laughs) and if you go back to the roots of Vertica, at the new Encore Hotel. Well we better have theCUBE there, bro. And yeah, you've done that conference but let's talk the disruption for a minute. but we got to keep going you know, Have the tools to improve quality the right quality, you know, But I think that creates a lot of issues but I'd like you to elaborate on that becuase I think you can process a lot of data on Hadoop, and so they need to figure those things out. so one of the things about Vertica is we run everywhere, and frankly, I think cloud has helped Vertica a lot, I mean you're a software company. And then to you it's another hardware platform. the Linux company, which actually you know, and Micro Focus, they know what they're doing, so now you've got SUSE as the lone independent. is NoSQL the next Hadoop? Nothing is going to do every job well, the other things that you can do with Vertica and so I think that's going to continue as well. Paul and I will be back with our next guest from MIT,

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Andy Palmer	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Colin Mahoney	PERSON	0.99+
Paul	PERSON	0.99+
Colin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Vertica	ORGANIZATION	0.99+
Chris Lynch	PERSON	0.99+
HPE	ORGANIZATION	0.99+
Michael Stonebreaker	PERSON	0.99+
HP	ORGANIZATION	0.99+
Micro Focus	ORGANIZATION	0.99+
Hadoop	TITLE	0.99+
Colin Mahony	PERSON	0.99+
last week	DATE	0.99+
Andy	PERSON	0.99+
March 30th	DATE	0.99+
NoSQL	TITLE	0.99+
Mike	PERSON	0.99+
Experian	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
SQL	TITLE	0.99+
two day	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
4,000 engineers	QUANTITY	0.99+
Two years ago	DATE	0.99+
SUSE	TITLE	0.99+
Azure DB	TITLE	0.98+
second thing	QUANTITY	0.98+
20 years ago	DATE	0.98+
10 years ago	DATE	0.98+
one	QUANTITY	0.98+
Vertica	TITLE	0.98+
Hortonworks	ORGANIZATION	0.97+
MapReduce	ORGANIZATION	0.97+
one thing	QUANTITY	0.97+

Mark Ramsey, Ramsey International LLC | MIT CDOIQ 2019

>> From Cambridge, Massachusetts. It's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019. Brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts, everybody. We're here at MIT, sweltering Cambridge, Massachusetts. You're watching theCUBE, the leader in live tech coverage, my name is Dave Vellante. I'm here with my co-host, Paul Gillin. Special coverage of the MITCDOIQ. The Chief Data Officer event, this is the 13th year of the event, we started seven years ago covering it, Mark Ramsey is here. He's the Chief Data and Analytics Officer Advisor at Ramsey International, LLC and former Chief Data Officer of GlaxoSmithKline. Big pharma, Mark, thanks for coming onto theCUBE. >> Thanks for having me. >> You're very welcome, fresh off the keynote. Fascinating keynote this evening, or this morning. Lot of interest here, tons of questions. And we have some as well, but let's start with your history in data. I sat down after 10 years, but I could have I could have stretched it to 20. I'll sit down with the young guns. But there was some folks in there with 30 plus year careers. How about you, what does your data journey look like? >> Well, my data journey, of course I was able to stand up for the whole time because I was in the front, but I actually started about 32, a little over 32 years ago and I was involved with building. What I always tell folks is that Data and Analytics has been a long journey, and the name has changed over the years, but we've been really trying to tackle the same problems of using data as a strategic asset. So when I started I was with an insurance and financial services company, building one of the first data warehouse environments in the insurance industry, and that was in the 87, 88 range, and then once I was able to deliver that, I ended up transitioning into being in consulting for IBM and basically spent 18 years with IBM in consulting and services. When I joined, the name had evolved from Data Warehousing to Business Intelligence and then over the years it was Master Data Management, Customer 360. Analytics and Optimization, Big Data. And then in 2013, I joined Samsung Mobile as their first Chief Data Officer. So, moving out of consulting, I really wanted to own the end-to-end delivery of advanced solutions in the Data Analytics space and so that made the transition to Samsung quite interesting, very much into consumer electronics, mobile phones, tablets and things of that nature, and then in 2015 I joined GSK as their first Chief Data Officer to deliver a Data Analytics solution. >> So you have long data history and Paul, Mark took us through. And you're right, Mark-o, it's a lot of the same narrative, same wine, new bottle but the technology's obviously changed. The opportunities are greater today. But you took us through Enterprise Data Warehouse which was ETL and then MAP and then Master Data Management which is kind of this mapping and abstraction layer, then an Enterprise Data Model, top-down. And then that all failed, so we turned to Governance which has been very very difficult and then you came up with another solution that we're going to dig into, but is it the same wine, new bottle from the industry? >> I think it has been over the last 20, 30 years, which is why I kind of did the experiment at the beginning of how long folks have been in the industry. I think that certainly, the technology has advanced, moving to reduction in the amount of schema that's required to move data so you can kind of move away from the map and move type of an approach of a data warehouse but it is tackling the same type of problems and like I said in the session it's a little bit like Einstein's phrase of doing the same thing over and over again and expecting a different answer is certainly the definition of insanity and what I really proposed at the session was let's come at this from a very different perspective. Let's actually use Data Analytics on the data to make it available for these purposes, and I do think I think it's a different wine now and so I think it's just now a matter of if folks can really take off and head that direction. >> What struck me about, you were ticking off some of the issues that have failed like Data Warehouses, I was surprised to hear you say Data Governance really hasn't worked because there's a lot of talk around that right now, but all of those are top-down initiatives, and what you did at GSK was really invert that model and go from the bottom up. What were some of the barriers that you had to face organizationally to get the cooperation of all these people in this different approach? >> Yeah, I think it's still key. It's not a complete bottoms up because then you do end up really just doing data for the sake of data, which is also something that's been tried and does not work. I think it has to be a balance and that's really striking that right balance of really tackling the data at full perspective but also making sure that you have very definitive use cases to deliver value for the organization and then striking the balance of how you do that and I think of the things that becomes a struggle is you're talking about very large breadth and any time you're covering multiple functions within a business it's getting the support of those different business functions and I think part of that is really around executive support and what that means, I did mention it in the session, that executive support to me is really stepping up and saying that the data across the organization is the organization's data. It isn't owned by a particular person or a particular scientist, and I think in a lot of organization, that gatekeeper mentality really does put barriers up to really tackling the full breadth of the data. >> So I had a question around digital initiatives. Everywhere you go, every C-level Executive is trying to get digital right, and a lot of this is top-down, a lot of it is big ideas and it's kind of the North Star. Do you think that that's the wrong approach? That maybe there should be a more tactical line of business alignment with that threaded leader as opposed to this big picture. We're going to change and transform our company, what are your thoughts? >> I think one of the struggles is just I'm not sure that organizations really have a good appreciation of what they mean when they talk about digital transformation. I think there's in most of the industries it is an initiative that's getting a lot of press within the organizations and folks want to go through digital transformation but in some cases that means having a more interactive experience with consumers and it's maybe through sensors or different ways to capture data but if they haven't solved the data problem it just becomes another source of data that we're going to mismanage and so I do think there's a risk that we're going to see the same outcome from digital that we have when folks have tried other approaches to integrate information, and if you don't solve the basic blocking and tackling having data that has higher velocity and more granularity, if you're not able to solve that because you haven't tackled the bigger problem, I'm not sure it's going to have the impact that folks really expect. >> You mentioned that at GSK you collected 15 petabytes of data of which only one petabyte was structured. So you had to make sense of all that unstructured data. What did you learn about that process? About how to unlock value from unstructured data as a result of that? >> Yeah, and I think this is something. I think it's extremely important in the unstructured data to apply advanced analytics against the data to go through a process of making sense of that information and a lot of folks talk about or have talked about historically around text mining of trying to extract an entity out of unstructured data and using that for the value. There's a few steps before you even get to that point, and first of all it's classifying the information to understand which documents do you care about and which documents do you not care about and I always use the story that in this vast amount of documents there's going to be, somebody has probably uploaded the cafeteria menu from 10 years ago. That has no scientific value, whereas a protocol document for a clinical trial has significant value, you don't want to look through manually a billion documents to separate those, so you have to apply the technology even in that first step of classification, and then there's a number of steps that ultimately lead you to understanding the relationship of the knowledge that's in the documents. >> Side question on that, so you had discussed okay, if it's a menu, get rid of it but there's certain restrictions where you got to keep data for decades. It struck me, what about work in process? Especially in the pharmaceutical industry. I mean, post Federal Rules of Civil Procedure was everybody looking for a smoking gun. So, how are organizations dealing with what to keep and what to get rid of? >> Yeah, and I think certainly the thinking has been to remove the excess and it's to your point, how do you draw the line as to what is excess, right, so you don't want to just keep every document because then if an organization is involved in any type of litigation and there's disclosure requirements, you don't want to have to have thousands of documents. At the same time, there are requirements and so it's like a lot of things. It's figuring out how do you abide by the requirements, but that is not an easy thing to do, and it really is another driver, certainly document retention has been a big thing over a number of years but I think people have not applied advanced analytics to the level that they can to really help support that. >> Another Einstein bro-mahd, you know. Keep everything you must but no more. So, you put forth a proposal where you basically had this sort of three approaches, well, combined three approaches. The crawlers to go, the spiders to go out and do the discovery and I presume that's where the classification is done? >> That's really the identification of all of the source information >> Okay, so find out what you got, okay. >> so that's kind of the start. Find out what you have. >> Step two is the data repository. Putting that in, I thought it was when I heard you I said okay it must be a logical data repository, but you said you basically told the CIO we're copying all the data and putting it into essentially one place. >> A physical location, yes. >> Okay, and then so I got another question about that and then use bots in the pipeline to move the data and then you sort of drew the diagram of the back end to all the databases. Unstructured, structured, and then all the fun stuff up front, visualization. >> Which people love to focus on the fun stuff, right? Especially, you can't tell how many articles are on you got to apply deep learning and machine learning and that's where the answers are, we have to have the data and that's the piece that people are missing. >> So, my question there is you had this tactical mindset, it seems like you picked a good workload, the clinical trials and you had at least conceptually a good chance of success. Is that a fair statement? >> Well, the clinical trials was one aspect. Again, we tackled the entire data landscape. So it was all of the data across all of R&D. It wasn't limited to just, that's that top down and bottom up, so the bottom up is tackle everything in the landscape. The top down is what's important to the organization for decision making. >> So, that's actually the entire R&D application portfolio. >> Both internal and external. >> So my follow up question there is so that largely was kind of an inside the four walls of GSK, workload or not necessarily. My question was what about, you hear about these emerging Edge applications, and that's got to be a nightmare for what you described. In other words, putting all the data into one physical place, so it must be like a snake swallowing a basketball. Thoughts on that? >> I think some of it really does depend on you're always going to have these, IOT is another example where it's a large amount of streaming information, and so I'm not proposing that all data in every format in every location needs to be centralized and homogenized, I think you have to add some intelligence on top of that but certainly from an edge perspective or an IOT perspective or sensors. The data that you want to then make decisions around, so you're probably going to have a filter level that will impact those things coming in, then you filter it down to where you're going to really want to make decisions on that and then that comes together with the other-- >> So it's a prioritization exercise, and that presumably can be automated. >> Right, but I think we always have these cases where we can say well what about this case, and you know I guess what I'm saying is I've not seen organizations tackle their own data landscape challenges and really do it in an aggressive way to get value out of the data that's within their four walls. It's always like I mentioned in the keynote. It's always let's do a very small proof of concept, let's take a very narrow chunk. And what ultimately ends up happening is that becomes the only solution they build and then they go to another area and they build another solution and that's why we end up with 15 or 25-- (all talk over each other) >> The conventional wisdom is you start small. >> And fail. >> And you go on from there, you fail and that's now how you get big things done. >> Well that's not how you support analytic algorithms like machine learning and deep learning. You can't feed those just fragmented data of one aspect of your business and expect it to learn intelligent things to then make recommendations, you've got to have a much broader perspective. >> I want to ask you about one statistic you shared. You found 26 thousand relational database schemas for capturing experimental data and you standardized those into one. How? >> Yeah, I mean we took advantage of the Tamr technology that Michael Stonebraker created here at MIT a number of years ago which is really, again, it's applying advanced analytics to the data and using the content of the data and the characteristics of the data to go from dispersed schemas into a unified schema. So if you look across 26 thousand schemas using machine learning, you then can understand what's the consolidated view that gives you one perspective across all of those different schemas, 'cause ultimately when you give people flexibility they love to take advantage of it but it doesn't mean that they're actually doing things in an extremely different way, 'cause ultimately they're capturing the same kind of data. They're just calling things different names and they might be using different formats but in that particular case we use Tamr very heavily, and that again is back to my example of using advanced analytics on the data to make it available to do the fun stuff. The visualization and the advanced analytics. >> So Mark, the last question is you well know that the CDO role emerged in these highly regulated industries and I guess in the case of pharma quasi-regulated industries but now it seems to be permeating all industries. We have Goka-lan from McDonald's and virtually every industry is at least thinking about this role or has some kind of de facto CDO, so if you were slotted in to a CDO role, let's make it generic. I know it depends on the industry but where do you start as a CDO for an organization large company that doesn't have a CDO. Even a mid-sized organization, where do you start? >> Yeah, I mean my approach is that a true CDO is maximizing the strategic value of data within the organization. It isn't a regulatory requirement. I know a lot of the banks started there 'cause they needed someone to be responsible for data quality and data privacy but for me the most critical thing is understanding the strategic objectives of the organization and how will data be used differently in the future to drive decisions and actions and the effectiveness of the business. In some cases, there was a lot of discussion around monetizing the value of data. People immediately took that to can we sell our data and make money as a different revenue stream, I'm not a proponent of that. It's internally monetizing your data. How do you triple the size of the business by using data as a strategic advantage and how do you change the executives so what is good enough today is not good enough tomorrow because they are really focused on using data as their decision making tool, and that to me is the difference that a CDO needs to make is really using data to drive those strategic decision points. >> And that nuance you mentioned I think is really important. Inderpal Bhandari, who is the Chief Data Officer of IBM often says how can you monetize the data and you're right, I don't think he means selling data, it's how does data contribute, if I could rephrase what you said, contribute to the value of the organization, that can be cutting costs, that can be driving new revenue streams, that could be saving lives if you're a hospital, improving productivity. >> Yeah, and I think what I've shared typically shared with executives when I've been in the CDO role is that they need to change their behavior, right? If a CDO comes in to an organization and a year later, the executives are still making decisions on the same data PowerPoints with spinning logos and they said ooh, we've got to have 'em. If they're still making decisions that way then the CDO has not been successful. The executives have to change what their level of expectation is in order to make a decision. >> Change agents, top down, bottom up, last question. >> Going back to GSK, now that they've completed this massive data consolidation project how are things different for that business? >> Yeah, I mean you look how Barron joined as the President of R&D about a year and a half ago and his primary focus is using data and analytics and machine learning to drive the decision making in the discovery of a new medicine and the environment that has been created is a key component to that strategic initiative and so they are actually completely changing the way they're selecting new targets for new medicines based on data and analytics. >> Mark, thanks so much for coming on theCUBE. >> Thanks for having me. >> Great keynote this morning, you're welcome. All right, keep it right there everybody. We'll be back with our next guest. This is theCUBE, Dave Vellante with Paul Gillin. Be right back from MIT. (upbeat music)

Published Date : Jul 31 2019

SUMMARY :

Brought to you by SiliconANGLE Media. Special coverage of the MITCDOIQ. I could have stretched it to 20. and so that made the transition to Samsung and then you came up with another solution on the data to make it available some of the issues that have failed striking the balance of how you do that and it's kind of the North Star. the bigger problem, I'm not sure it's going to You mentioned that at GSK you against the data to go through a process of Especially in the pharmaceutical industry. as to what is excess, right, so you and do the discovery and I presume Okay, so find out what you so that's kind of the start. all the data and putting it into essentially one place. and then you sort of drew the diagram of and that's the piece that people are missing. So, my question there is you had this Well, the clinical trials was one aspect. My question was what about, you hear about these and homogenized, I think you have to exercise, and that presumably can be automated. and then they go to another area and that's now how you get big things done. Well that's not how you support analytic and you standardized those into one. on the data to make it available to do the fun stuff. and I guess in the case of pharma the difference that a CDO needs to make is of the organization, that can be Yeah, and I think what I've shared and the environment that has been created This is theCUBE, Dave Vellante with Paul Gillin.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Paul Gillin	PERSON	0.99+
Mark	PERSON	0.99+
Mark Ramsey	PERSON	0.99+
15 petabytes	QUANTITY	0.99+
Samsung	ORGANIZATION	0.99+
Inderpal Bhandari	PERSON	0.99+
Michael Stonebraker	PERSON	0.99+
2013	DATE	0.99+
Paul	PERSON	0.99+
GlaxoSmithKline	ORGANIZATION	0.99+
Barron	PERSON	0.99+
Ramsey International, LLC	ORGANIZATION	0.99+
26 thousand schemas	QUANTITY	0.99+
GSK	ORGANIZATION	0.99+
18 years	QUANTITY	0.99+
2015	DATE	0.99+
thousands	QUANTITY	0.99+
Einstein	PERSON	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
tomorrow	DATE	0.99+
Samsung Mobile	ORGANIZATION	0.99+
26 thousand	QUANTITY	0.99+
Ramsey International LLC	ORGANIZATION	0.99+
30 plus year	QUANTITY	0.99+
a year later	DATE	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Federal Rules of Civil Procedure	TITLE	0.99+
20	QUANTITY	0.99+
25	QUANTITY	0.99+
Both	QUANTITY	0.99+
first step	QUANTITY	0.99+
one petabyte	QUANTITY	0.98+
today	DATE	0.98+
15	QUANTITY	0.98+
one	QUANTITY	0.98+
three approaches	QUANTITY	0.98+
13th year	QUANTITY	0.98+
one aspect	QUANTITY	0.97+
MIT	ORGANIZATION	0.97+
seven years ago	DATE	0.97+
McDonald's	ORGANIZATION	0.96+
MIT Chief Data Officer and	EVENT	0.95+
R&D	ORGANIZATION	0.95+
10 years ago	DATE	0.95+
this morning	DATE	0.94+
this evening	DATE	0.93+
one place	QUANTITY	0.93+
one perspective	QUANTITY	0.92+
about a year and a half ago	DATE	0.91+
over 32 years ago	DATE	0.9+
a lot of talk	QUANTITY	0.9+
a billion documents	QUANTITY	0.9+
CDO	TITLE	0.89+
decades	QUANTITY	0.88+
one statistic	QUANTITY	0.87+
2019	DATE	0.85+
first data	QUANTITY	0.84+
of years ago	DATE	0.83+
Step two	QUANTITY	0.8+
Tamr	OTHER	0.77+
Information Quality Symposium 2019	EVENT	0.77+
PowerPoints	TITLE	0.76+
documents	QUANTITY	0.75+
theCUBE	ORGANIZATION	0.75+
one physical	QUANTITY	0.73+
10 years	QUANTITY	0.72+
87, 88 range	QUANTITY	0.71+
President	PERSON	0.7+
Chief Data Officer	PERSON	0.7+
Enterprise Data Warehouse	ORGANIZATION	0.66+
Goka-lan	ORGANIZATION	0.66+
first Chief Data	QUANTITY	0.63+
first Chief Data Officer	QUANTITY	0.63+
Edge	TITLE	0.63+
tons	QUANTITY	0.62+

Keeping People Safe With IOT | Armored Things

(pulsating electronic music) >> Welcome everybody, this is theCube, I'm Paul Gillin. Physical security and cybersecurity have traditionally been sort of isolated worlds, they didn't talk to each other. But in the age of the Internet of Things we now have unprecedented opportunities to unite these two traditionally separate areas. Armored Things is a startup out of Boston and is doing some very interesting work in using intelligent devices to make decisions and to intuit patterns in crowd behavior which has applications in cybersecurity, crowd management, traffic management, a lot of different potential uses of this technology. With me are Julie Johnson the co-founder and President of Armored Things, and Chris Lord, the Chief Technology Officer, Welcome. >> Thank you. >> Why don't you describe in a nutshell, let's start out, what you do Julie. >> Great, Armored things is building software to do next generation incident response. We're using the IOT devices and their data to power decisions across large environments used for safety. So for example the data that we're collecting can be used to get better situational awareness within seconds and drive incident response in seconds instead tens of minutes, which is the state of the art today. >> And so it's sounds like, is security the primary target area or are there others? >> That's right, we sit at the intersection of physical and cybersecurity. This information can also be used to drive additional value over time but right now we're really focused on achieving that mission, using these devices, this technology to improve both the physical and cyber realms for Internet of Things. >> Chris why don't you give us an example of how your technology might be applied? >> Sure, so a very common one is, you know active shooter. People are very concerned about active shooter, and so how can you leverage all the data that you have across different devices, different systems that you have out there, in order to understand what happened, and get people the right information at the right time. A more commonplace example might be something like a protest formation. So if you look at a university campus where you might have a controversial group meeting on campus and you need to get early warning when there's a protest forming on the other side. Our technology will allow you to see that before it's gotten to a critical proportion or before it's marching down the street. >> So why don't you take a deeper dive and talk about what, how are you federating these devices? How are you using these multiple devices together? >> Well that's exactly what we are. So we're a data analytics layer across all the silos of data that you already have in your environment. So as you look around you might have motion sensors in your environment, you might have access control systems in your environment, you have wireless infrastructure in your environment, all these things are used for specific purposes now but nothings really trying to correlate and connect the data across all of them. So Armored Things builds a layer across all of them, brings that data together to give you better understanding of what's going on in your environment, people and your physical space. >> Julie talk about how the company came about, what are the origins? >> Sure, so I started working with Charles Curran our CEO about two years ago at Qualcomm. We were really focused on understanding the security portion of the IOT layer and how to manage these things in enterprise. So if you're familiar with IOT in the household there's been a lot of proliferation around turning your lights on, understanding who's at your front door, but in enterprise it's been much slower to adopt. Fundamentally we believe that part of that was because management took a lot of time. Every time you provisioned a device it took a number of minutes and because there was an intrinsic lack of security on each of the devices. So we went around and started talking to different potential customer groups about what it would look like to bring more IOT into their environments. And we really got pulled into universities, and large sporting and entertainment venues, who we're still working with as our primary customers today. Because they saw a desperate need for IOT, not only to save time on managing these devices, and to make sure that they're secure in their environments, but also to use them for physical security. So now that we've spent, you know $15 million in selling IP video cameras, or a few million dollars in selling access control systems, how do we actually elevate their use from what they were initially intended for. That spend has a secondary use when it comes to physical security. That ability to, you know quickly get cameras on the scene of an incident. That ability to harness data coming off of motion sensors or environmental sensors. How do we use all of that information to drive an awareness of our environments day-to-day and then use it in critical emergencies for a better response. >> I understand you're working with some sports teams right now. Can you describe a scenario in which you might be able to help them manage crowds more effectively? >> So there was a great example we heard about two weeks ago from a top team, who's recently hosted some World Series events. They had a unfortunate incident where they were watching, they were hosting a watch party for the World Series in their venue during an away game, and they handed about 40,000 paper tickets out. They got a great turnout, 20,000 people came to the venue. But in the seventh inning of the game the other 20,000 people decided that they also wanted to be in the venue in order to celebrate. That was a pretty unanticipated event. Usually in the fifth or sixth inning you start to consolidate your entrances, you start to consolidate your security personnel and send them to other parts of the venue, and the net result of that was they ended up closing the doors, not allowing additional entrance in, and tweeting that there wouldn't be additional people allowed to enter. There were a lot of security issues with letting 20,000 people in, in the seventh inning, not of the least is you don't know where they're coming from, and you don't really know what their intent is in coming so late to that venue. But there's patterns in the data that we could've seen sooner. So hypothetically, understanding that a normal game day has a couple hundred people entering in the fifth, sixth, seventh innings. Seeing a significant uptick in that number of people coming into your environment should immediately say, what's unique, you know what's different about this situation? Now how do I tie in my resources, my security personnel, my responders, and just maybe notify people who are in charge of making these types of decisions, so that we're not closing the gate and tweeting out to our fans that there's no more entries. >> And getting back to the technical nuances of this situation, how might your technology detect this crowd assembling before it was even visually apparent? >> Good question, so there's many, many different things. So part of what we do is rely on diversity of data from different sources. So that might be mobile devices. That might be from wireless. That might be from cameras that you have there and doing occupancy counts on those cameras. It might be from other, you know motion sensors you have in your environment. All this data gets aggregated so that we can come up with a good understanding of population and flow within your environment. So we would have early indications and bring that awareness to people that have to respond, people who might be sitting in a network operations center, and looking at other cameras but not seeing the information. So we can bring the information right there, notify them that there's a problem forming before it's gotten to critical proportions. >> Fantastic. >> One more thought on that is there's kind of a unique advantage in data to go beyond what humans can perceive. When we're looking at these knocks, you know they have thousands of video cameras potentially united in one central screen. It takes not only having the right camera up but also noticing a degree of difference that might be quite minute, to actually see it as an anomaly in real-time. So you can imagine, you know a university campus where people are walking through the campus at a certain pace every single day. One day everyone's walking just 30% faster, not running just walking, why? You know is there a suspicious package? Is there someone gathered there that you know is attracting people that they don't necessarily want to be associated with, or end up in a vulnerable position? How can we see that in the data faster than someone in the control room might notice it and alert people to respond. >> And with machine learning, of course now we have the means to do that. Chris, talk about the, it strikes me that there must be a lot of complexity involved. You've got a great diversity of devices out there you have to connect to. Every institution would have a different fabric. How are you technically pulling this all together? >> Well the nice thing about a lot of these technologies is there is standardization across many of these different types of devices, and there are, you know there are tiers of players right. And so we do have to be selective about who we integrate with. We are integrated with the top-tier players in all these categories, and we'll prioritize other integrations over time based on our customers and our market so. >> And Julie, what are your plans for deployment? What's your timeframe? >> We're looking to rollout our first generation of product in the next nine to twelve months. That really drives home at that situational awareness piece. So before we even get to building through incident response at scale, the ability to give people very specific cues during a critical emergency. How do we start with getting more information to the people who are there? So getting occupancy, flow, the dynamics of movement around a campus or a large venue. How do we start equipping the police personnel, and security personnel to make better decisions and drive value from there. >> I understand there's no shortage of demand for your solution. >> We do have some top-tier universities, and pro-sporting and entertainment venues who we're working with to build the right solution not just the solution that we think is needed, but the solution that they're telling us, "Hey we would really like to use something like this." >> I also understand you've pulled together a team, kind of a dream team, talk about some of the people that you've brought on board for this operation which few people have even heard of. >> Yeah so I think the first of those you're seeing here, so Chris joined us as co-founder and CTO and has been really an asset to this team given his background in cybersecurity from Carbon Black and before that. And you know if you want to add more to that please feel free to. >> No thanks. >> We've also brought in, I would call it two pillars of our strategy. One one the physical security side and one on the machine learning data analytics side, and those two women are Elizabeth Carter. Who came to us from Apple, where she led crisis management for the Americas. She previously worked at Chertoff Group where she sat at the intersection of physical and cybersecurity, and before that actually worked for the city of New York, where she understood weapons of mass destruction, different types of biological and chemical weapons response planning. So she's kind of the pillar of our physical security response understanding and driving product. The other woman, her name is Clare Bernard and she recently joined us from another Boston startup called Tamr where she was running product and engineering for them. Clare's background is actually in particle physics. She was BU and John's Hopkins, and happened to work with the team that discovered the God particle while she was getting her PhD. So we' think she's as smart as you can find, and is going to help us think about these data challenges, the analytics piece at a scale that, you know we think has the potential to really improve physical security and cybersecurity. I would be remiss if I didn't mention the rest of our team. Our CEO Charles comes from a background in the venture capital community and is just incredibly knowledgeable about the process of building a company from the ground up, and has many skills when it comes to recruiting as well. Really helped drive some of these hires forward and the rest of the team is the next generation of rising stars, people from Oracle, HP Vertica, other Carbon Black individuals. People who just have experience from across the board that's going to help us build the right solution. >> And you know at a time when diversity has been a major issue for tech companies, I understand your team is unusually well represented. >> I think our executive team is about 60% women, which we're very proud of. I think our team in general might actually be, >> About that too, yup. >> About 60% women, which we're also very proud of. And I'd like to say that that's organic. That we've worked with some great advisors and potential customers, and I do think that from my perspective, it's been helpful to have younger women coming in who see a path forward for senior women in executive roles in their company. I think that's something that can't be underestimated. >> Where do you stand in funding right now? >> We just closed our first institutional capital about a week and a half ago. We're still finishing the close of that round but we have a Boston based partner who's very focused on machine learning and analytics, and also has been a well recognized investor in the cyber security realm. So we're very fortunate to have this investor as our partner, and excited to keep working with them. >> Chris, as someone whose background is in cybersecurity how do you see the security landscape changing now with the IOT coming on and the possibility of really transforming the way organizations look at their physical and cybersecurity operations? >> Good question, so over time they're converging, and they're converging I think more rapidly than we expected, so now I'm going to step back a little bit and say that there's a lot of parallels. Cybersecurity I think is probably about five years ahead of physical security in terms of maturity of technology and approaches to problems. And then so what we're seeing right now, and we're part of the force behind that, is taking the learnings from cyber security and applying them to physical security right. So when we talk about situational awareness, when we talk about the data analytics that supports that, and when we talk about incident response and orchestration automation. All of those are core to taking cybersecurity and applying it to physical security. In terms of convergence, we're seeing many cases, and this is going back a number of years, where people are using cyber events to create physical problems right. Stuxnet is a classic example, but you can do the same thing by taking over something and instilling panic in a stadium, and causing you know, all sorts of grief, cyber driving physical. You can also see cases where people who are running cybersecurity operation centers want access to physical knowledge of their environment in order to do their job better. Whether it is a malicious insider that they suspect, whether it's an infection that occurs on a particular machine, being able to pull up the cameras, know who was there at the time, bringing all that information together, is again necessary in order to understand their perception of situational awareness. So two converging towards one, we're going to be building towards that goal from our perspective. >> Now the flip side of federating IOT devices is that the bad guys can do the same thing. So you potentially have a much broader attack surface. That has to be factoring into your thinking. What is the embedded security in your platform? >> So, we're not going to address fully that right now, but so we take advantage of best in breed security principles in our design both for security and for privacy. But in terms of the dependency we have on a lot of IOT devices and IOT systems, part of what helps us is diversity of data across those, and diversity of devices right. And so while you might have compromises in specific cases, the fact that you are dealing with so many, and so many different categories at the same time, allows you to maintain and fulfill your mission, and deliver what you're trying to do regardless of some of those individual compromises. We're also in a unique vantage point where we can actually see the operational integrity of what's going on. So when you look across all those different categories and you look at the data that we're collecting, whether it's malicious or not, we're able to identify a failure, and bring that to the attention of the people who are dependent on those systems. So we could be an early morning to cyber events, malicious or not. >> Julie, entrepreneurs love to dream. I'm sure you are thinking big, beyond the immediate cybersecurity applications. Where could Armored Things eventually go? >> That's a great question. The dream is that we become not only the dominant solution for physical and cyber security for schools and large venues. But we bring our solution into K, 12 where some of this is desperately needed. That's kind of the mission orientation of our team. How do we start to drive value in a way that we can get to every school in the country sooner. In the longer term though, I think there's a lot of opportunities with IOT and we're still kind of at the tip of the iceberg here. We're going to see all sorts of new devices come online over the next two, five, 10 years. The growth of these devices is incredible. And the question is how do we continue this challenge of solving the data at scale in a way that continues to drive value, not just for some of the first use cases, which are often around marketing, and understanding an environment in that sense, but also continuing that physical cybersecurity angle. >> Enormous potential and hope you stay based in Boston. We can use more companies like that. Chris Lord and Julie Johnson, thanks very much for joining us today on theCUbe. >> Thanks Paul. >> Thank you. >> Armored Things, keep your eye on them. You're going to be hearing a lot more about this company in the months to come. I'm Paul Gillin, this is theCube.

Published Date : May 21 2018

SUMMARY :

and Chris Lord, the Chief Technology Officer, let's start out, what you do Julie. and their data to power decisions this technology to improve both the physical and so how can you leverage all the data and connect the data across all of them. and how to manage these things in enterprise. Can you describe a scenario in which you might be able not of the least is you don't know and bring that awareness to people that have to respond, and alert people to respond. of course now we have the means to do that. and there are, you know there are tiers of players right. in the next nine to twelve months. for your solution. not just the solution that we think is needed, kind of a dream team, talk about some of the people and has been really an asset to this team and is going to help us think about these data challenges, And you know at a time when diversity I think our executive team is about 60% women, and I do think that from my perspective, in the cyber security realm. and applying it to physical security. is that the bad guys can do the same thing. and bring that to the attention of the people beyond the immediate cybersecurity applications. And the question is how do we continue this challenge Chris Lord and Julie Johnson, in the months to come.

ENTITIES

Entity	Category	Confidence
Chris	PERSON	0.99+
Julie Johnson	PERSON	0.99+
Elizabeth Carter	PERSON	0.99+
Chris Lord	PERSON	0.99+
Julie	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Charles	PERSON	0.99+
Charles Curran	PERSON	0.99+
Boston	LOCATION	0.99+
Clare	PERSON	0.99+
$15 million	QUANTITY	0.99+
New York	LOCATION	0.99+
Paul Gillin	PERSON	0.99+
two	QUANTITY	0.99+
Carbon Black	ORGANIZATION	0.99+
Paul	PERSON	0.99+
sixth	QUANTITY	0.99+
five	QUANTITY	0.99+
20,000 people	QUANTITY	0.99+
World Series	EVENT	0.99+
Qualcomm	ORGANIZATION	0.99+
Chertoff Group	ORGANIZATION	0.99+
tens of minutes	QUANTITY	0.99+
fifth	QUANTITY	0.99+
each	QUANTITY	0.99+
30%	QUANTITY	0.99+
two women	QUANTITY	0.99+
10 years	QUANTITY	0.99+
seventh inning	QUANTITY	0.99+
Clare Bernard	PERSON	0.99+
first	QUANTITY	0.99+
Tamr	ORGANIZATION	0.98+
Armored Things	ORGANIZATION	0.98+
one	QUANTITY	0.98+
today	DATE	0.98+
both	QUANTITY	0.98+
One	QUANTITY	0.98+
Americas	LOCATION	0.98+
HP Vertica	ORGANIZATION	0.98+
thousands of video cameras	QUANTITY	0.98+
first generation	QUANTITY	0.98+
sixth inning	QUANTITY	0.97+
about 40,000 paper tickets	QUANTITY	0.97+
John's Hopkins	ORGANIZATION	0.96+
about five years	QUANTITY	0.96+
seventh innings	QUANTITY	0.95+
theCube	ORGANIZATION	0.94+
one central screen	QUANTITY	0.93+
About 60%	QUANTITY	0.93+
twelve months	QUANTITY	0.92+
about a week and a half ago	DATE	0.92+
two pillars	QUANTITY	0.92+
Apple	ORGANIZATION	0.91+
about 60% women	QUANTITY	0.9+
two years ago	DATE	0.89+
One day	QUANTITY	0.87+
first use cases	QUANTITY	0.85+
couple hundred people	QUANTITY	0.83+
about two weeks ago	DATE	0.8+
few million dollars	QUANTITY	0.79+
first institutional capital	QUANTITY	0.79+
One more	QUANTITY	0.76+
separate areas	QUANTITY	0.75+
every single day	QUANTITY	0.73+
theCUbe	ORGANIZATION	0.7+
IOT	ORGANIZATION	0.65+
President	PERSON	0.64+
seconds	QUANTITY	0.63+
K	LOCATION	0.5+
nine	DATE	0.49+
12	OTHER	0.49+
Stuxnet	PERSON	0.41+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for Tamr: