Scott Howser, Hadapt - MIT Information Quality 2013 - #MIT #CDOIQ #theCUBE


 

>> Wait.

>> Okay, we're back. We are in Cambridge, Massachusetts. This is Dave Vellante. I'm here with Jeff Kelly. We're with Wikibon. This is theCUBE, SiliconANGLE's production. We're here at the MIT Information Quality Symposium, in the heart of database design and development. We've had some great guests on. Scott Howser is here. He's the head of marketing at Hadapt, a company that we introduced to our community quite some time ago, really bringing multiple channels into the Hadoop ecosystem and helping make sense out of all this data, bringing insights to this data. Scott, welcome back to theCUBE.

>> Thanks for having me. It's good to be here.

>> So, this notion of data quality. The reason why we asked you to be on here today is because, first of all, you're a practitioner. You've been in the data warehousing world for a long, long time, so you've struggled with this issue. People here today are really from the world of, "Hey, we've been doing big data for a long time. This whole big data theme is nothing new to us." Sure, but there's a lot that's new. So take us back to your days as a data practitioner in data warehousing and business intelligence. What were some of the data quality issues that you faced, and how did you deal with them?

>> So I think a couple of points to raise in that area: one of the things that we'd like to do is try and triangulate on a user to engage them. And every channel we wanted to go and bring into the fold created a unique dimension of "how do we validate that this is the same person?" Because each channel that you engage with has potentially different requirements for user accreditation or a guarantee of a single user view. I think the Holy Grail used to be, in a lot of ways, single sign-on, or a way to triangulate across disparate systems on one common identity or person, to make that world simple.
I don't think that's a reality, in the sense that when you look at a product provider or solution provider and a customer that's external, right, those two worlds are very disparate, and there are a lot of channels, and potentially even third-party means, that I might want to engage this individual by. And every time I want to bring another one of those channels online, it further complicates validating who that person is.

>> Okay, so when you were doing your data warehouse thing, again as an IT practitioner, you tried to expand the channels, but every time you did that, it complexified the data sources. So how did you deal with that problem? Just create another database and stovepipe everything?

>> Well, unfortunately, that absolutely creates this notion of islands of information throughout the enterprise. Because, as you mentioned, we define a schema, effectively, and we place data elements into that schema: how you identify, how you engage, and how you rate that person's behaviors or engagement, et cetera. And I think what you'd see is, as you'd bring on new sources, the time to actually merge those things together wasn't on the order of days or weeks. It was months and years. And so, with every new channel that became interesting, you further complicated the problem. Effectively, what you do is create these pools of information, take extracts, and try to do something to munge the data and put it in a place where you give access to an analyst to say, "Okay, here's another sample set of data to try and figure out if these things align." And you try to create, effectively, a new schema that includes all the additional data that we just added.

>> So it's interesting, because again, one of the themes that we've been hearing a lot at this conference, and hear a lot at many conferences, is that it's not the technology. It's the people and process around the technology.
Certainly any practitioner would agree with that. But at the same time, the technology historically has been problematic. Data warehouse technology in particular has been challenging, so you've had to keep databases relatively small and disparate, and you had to build business processes around those databases.

>> That's right.

>> So you've not only got deficient technology, if you will, no offense to our data warehousing friends, but you've got process creep that's occurred.

>> That's fair. And I think what is happening is, it's one of the things that's led to the revolution that's occurring in the market right now, whether it's the Hadoop ecosystem or all the tangential technologies around it. Because what's bound up some technology issues in the past has been the schema. As important as that is, because it gives people a very easy way to interact with the data, it also creates significant challenges when you want to bring on these unique sources of information. As you look at things that have happened over the last decade, the engagement process for a consumer, a prospect or a customer has changed pretty dramatically, and they don't all have the same stringent requirements about providing information to become engaged that way. So I think where the schema has value, obviously, in the enterprise, it also has a lot of historical challenges that it brings along with it.

>> So this Hadoop movement is very disruptive to the traditional market spaces. Many folks say it isn't. The traditional guys say it isn't, but it clearly is, particularly as you go omnichannel. I threw that word out earlier; omnichannel was a discussion that we had at Hadoop Summit, myself, John Furrier, Abhi Mehta and [unclear]. And this is something that you guys are doing: bringing in data to allow your customers to go omnichannel. As you do that, you start, again, to
increase the complexity of the corpus of data. At the same time, a lot of times in Hadoop you hear about schema-light or schema-less. So how do you reconcile the omnichannel, the schema-less or schema-light, and the data quality problems?

>> Yes. Speaking about Hadapt in particular, one of the things that we do is give customers the ability to effectively dump all that data into one common repository, that is, HDFS in Hadoop, and leverage some of those open source tools and even their own inventions, if you will, with MR code, Pig, whatever, and allow them to effectively normalize data through iterations in Hadoop, and then push that into tables that we can now give access to through the SQL interface. So I think, for us, you're absolutely right: the more channels you can give access to, this concept of omnichannel, where irrespective of what way we engage with a customer, or what way they touch us, being able to provide those dimensions of data in one common repository gives the marketer, if you will, incredible flexibility and insights that were previously undiscoverable.

>> Assuming the data quality is [unclear].

>> Right.

>> So that was going to be my question. What are the data quality implications of using something like HDFS? You're essentially schema-less. You're just dumping data in, essentially, a raw format. So now you've got to reconcile all these different types of data from different sources and build out that kind of single view of a customer, of a product, whatever your [unclear] is.

>> Right.

>> So how do you go about doing that in that kind of scenario?
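
Editor's sketch: the flow Howser describes above (land raw data from every channel in one place, normalize it iteratively, then expose it as tables behind a SQL interface) can be pictured in miniature. The Python below is only an illustration under loud assumptions: the channel names, field names, and sample records are hypothetical, and an in-memory sqlite3 database stands in for the SQL layer. It is not Hadapt's implementation and does not use HDFS, MapReduce, or Pig.

```python
import json
import sqlite3

# Hypothetical raw events from three channels, each with its own field names.
RAW_EVENTS = [
    '{"channel": "email", "addr": "Pat@Example.com", "action": "open"}',
    '{"channel": "web", "user_email": "pat@example.com ", "page": "/pricing"}',
    '{"channel": "store", "loyalty_email": "sam@example.com", "sku": "A-100"}',
]

# Assumed names of the fields that may carry an identifying email address.
EMAIL_FIELDS = ("addr", "user_email", "loyalty_email")


def normalize(raw_line):
    """Turn one raw JSON line into a (customer_key, channel, payload) row."""
    event = json.loads(raw_line)
    email = next((event[f] for f in EMAIL_FIELDS if f in event), None)
    key = email.strip().lower() if email else None
    return key, event.get("channel", "unknown"), raw_line


def build_table(raw_lines):
    """Load normalized rows into an in-memory SQL table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (customer_key TEXT, channel TEXT, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)", map(normalize, raw_lines)
    )
    return conn


def channels_per_customer(conn):
    """Return {customer_key: list of channels}, ordered by key then channel."""
    rows = conn.execute(
        "SELECT customer_key, channel FROM events "
        "WHERE customer_key IS NOT NULL ORDER BY customer_key, channel"
    )
    view = {}
    for key, channel in rows:
        view.setdefault(key, []).append(channel)
    return view


if __name__ == "__main__":
    print(channels_per_customer(build_table(RAW_EVENTS)))
```

The raw payload is kept alongside the derived key, so a later iteration can re-derive the key with better rules without reloading the source data.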
>> So I think the repository in Hadoop, or HDFS itself, gives you that one common ground to work in, because you've got no implications of schema or any other preconceived notions about how you're going to massage the data, if you will. It's about applying logic and looking for those universal IDs. There are a bunch of tools around that are focused on this, but it's about applying those tools in a way that doesn't handicap them from the start by predisposing them to some structure. You want them to decipher or call that out, whether it's through Pig, homegrown scripts, or tools that might be upstairs here, and then effectively normalize the data and move it into some structure where you can interact with it in a meaningful way.

>> So that's really kind of the old way: trying to bring snippets of the data from different sources into yet another database where you've got to apply structure, and that takes time, months and years in some cases. And so Hadoop really allows you to speed up that process significantly by basically eliminating that part of the equation.

>> Yeah, and there are a bunch of dimensions we could talk about, things like pricing exercises, right? The quality of triangulating on what that pricing should be per product, per geography, per engagement, et cetera. I think you see that a lot of those types of workloads have transitioned from mainframe-type environments, legacy environments, to the Hadoop ecosystem. And we've seen cases where people talk about going from eight-month exercises to a week. I think that's where the value of this ecosystem and the commodity scalability really provides you with flexibility that was just previously unachievable.

>> So could you provide some examples, either from your own career or from some customers you're seeing, in terms of the data quality implications of the type of work they're doing?
So one of our kind of [unclear] is that the data quality measures required for any given use case vary, depending on the type of use case. Depending on the speed at which you need the analysis done, the type or level of data quality is going to vary. Are you seeing that? And if so, can you give some examples of the different ways data quality is going to manifest itself in a big data workload?

>> Sure. So I think that's absolutely fair. Obviously there's going to be some trade-off between accuracy and performance, right? And so you have to create some sort of confidence coefficient, if you will, so that within some degree of probability, this is good enough. There's got to be some sort of balance between accuracy and time. One of the things I've seen a lot of customers being interested in is a market emerging around providing tools for authenticity of engagement. As an example, I may be a large brand, and I have very open channels that I engage somebody with. It might be email, might be some web portal, et cetera. And there's a lot of phishing that goes on out there, right? People phishing for brands, misrepresenting themselves, et cetera. And there's a lot of desire to try and triangulate on data quality: who is effectively positioning themselves as me who's really not me, and being able to take a cybersecurity spin and start to lock those things down and alleviate those nefarious activities. So we've seen a lot of people using our tool to effectively understand and pinpoint those activities based upon behaviors, based upon outliers, looking at examples of where engagements are coming from that aren't authentic, if that makes sense.

>> It's somewhat nebulous, but right.
So using analytics, essentially, to determine the authenticity of a person, of an identity, of an engagement, rather than looking at the data itself; using pattern detection to determine that.

>> But it's also taking... there's a bunch of raw data that exists out there that you need to put together. Again, back to this notion of a landing zone, if you will, or data lake, or whatever you want to call it: putting all of this data into one repository where I can now start to do analytics against it without any sort of predetermined schema, and start to understand: are these people who are purporting to be firm XYZ really from firm XYZ? And if they're not, where are these things originating, and how can we start to put filters or things in place to alleviate those sorts of activities?

>> And that could apply, it sounds like, certainly to private industry. But it also sounds like something government would be very interested in, given what's in the news about different foreign countries potentially being the source of attacks on U.S. corporations or parts of our infrastructure, and trying to determine where that's coming from and who these people are. And of course, it gets complicated because people are trying to cover up their tracks, right?

>> Certainly. But I think the most important thing in this context is that it's not necessarily about being able to look at it after the fact. It's being able to look at a set of conditions that occur before these things happen, identify those conditions, and put controls in place to prevent the action from taking place.
I think that's where, when you look at what is happening from an acceleration of these models and from an acceleration of the quality of the data gathering, being able to put effective controls in place beforehand is changing the loss prevention side of the business, in this one example. But you're absolutely right. From what I see and from what our customers are doing, it's multidimensional. Cybersecurity is one example. Pricing could be another example. Engagements from a funnel analysis or conversion ratio could be yet another example. So I think you're right in that it is ubiquitous.

>> So when you think about the historical role... well, "historical": we had Stuart on earlier, and he was saying the first known chief data officer we could find was in 2003. So I guess that gives us a decade of history. But if you look back at the whole notion of data quality, we've been talking about that for many, many decades. So if you think about the traditional role of an organization trying to achieve data quality, a single version of the truth, information quality, information value, and you inject it with this disruption of Hadoop, to me, anyway, that whole notion of data quality is changing. Because in certain use cases, inference is just fine, and false positives are great. Who cares?

>> That's right.

>> You're analyzing Twitter data in some cases. In others, like healthcare and financial services, it's critical. So how do you see the notion of data quality evolving and adapting to this new world?

>> Well, one of the things you mentioned, this single version of the truth, was something that, when I was on the other side of the table...

>> They were beating you over the head: "We can do this, we can do this."

>> And it's something that sounds great on paper.
But when you look at the practical implications of trying to do it in a very finite or stringently controlled way, it's not practical for the business.

>> Because you're saying that the portions of your data that you can give a single version of the truth on are so small, because of the elapsed time?

>> That's right. I think there's that dimension. But there's also this element of time, right? The time that it takes to define something that rigid, and the structure: it's months. And by that time, a lot of the innovations that the business is trying to accomplish...

>> The [unclear] have changed. The initiatives have changed. Yeah, you lost the sale. "Hey, but we got the data. Look here."

>> Yeah, I think you're right. And I think that's what's evolving. I think there's this idea that, you know what, let's fail fast and let's do a lot of iterations. The flexibility being provided in that ecosystem today gives people an opportunity to iterate and fail fast. And you're right that you set some sort of confidence level, in that, for this particular application, we're happy with a certain percent confidence. Go. [unclear] a little bit...

>> But it's good enough. So having said that, what can we learn from the traditional data quality and chief data officer practitioners, those who've been very dogmatic, particularly in certain industries? What can we learn from them and take into this new world?

>> From my point of view, what my experience has always been is that those individuals have an unparalleled command of the business and an appreciation for the end goal that the business is trying to accomplish. It's about taking that instinct, that knowledge, and applying it to the emergence of what's happening in the technology world, and bringing those two things together.
I think it's not so much that there's a practical application in the sense of, "Okay, here are the technology options that we have for these desired use cases," whether, again, it's the pricing engagement, the cybersecurity or whatever. It's more, "How could we accelerate what the business is trying to accomplish by applying this technology that's out there to the business problem?" In a lot of ways, in the past it's always been, "Here's this really neat technology. How can I make it fit somewhere?" And now I think those folks bring a lot of relevance to the technology, to say, "Hey, here's a problem we're trying to solve. Legacy methodologies haven't been effective, haven't been timely, haven't been scalable, whatever. How can we apply what's happening in the market today to these problems?"

>> You guys, Hadapt in particular, are to me, anyway, a good signal of the maturity model, and of the maturity of Hadoop. It's starting to grow up pretty rapidly; you see Hadoop 2.0. So where are we at? What do you see as the progression, and where are we going?

>> So, I mentioned it on theCUBE the last time, and I said I believe that Hadoop is the operating system of big data, and I believe that there's a huge transition taking place. There were some interesting responses to that on Twitter and all the other channels, but I stand behind that. I think that's really what's happening. What people are engaging us to do is really start to transition away from the legacy methodologies, and they're looking at these as not just lower-cost alternatives but also more flexible ones. And we talked at Summit about the notion of that revenue curve, right? Cost takeout is great on one side of the coin, or one side of the fence here.
But I think equally, and even more importantly, there's the change in the revenue curve and the insights that people are finding because of these unique channels, the omnichannel you describe. Being able to look at all these dimensions of data in one unified place is really changing the way that they can go to market, engage consumers, and provide access to the analyst.

>> Yeah. I mean, ultimately, that's the most [unclear]. We had Stuart Madnick on, who's written textbooks on operating systems. We probably used them. I know I did. Maybe they were gone by the time you got there; you're young. But the point being, Hadoop as an operating system, the notion of a platform, is really changing dramatically. So I think you're right on that. Okay, so what's next for you guys? We talked about customer traction and proof points. You're working on that, I know. You guys have got great tech, an amazing team. What's next for you?

>> So I think it's continuing to look at the market and being flexible with the market as the use cases develop. Obviously, as a startup, we're focused on a couple of key areas where we see a lot of early adoption and a lot of pain around the problem that we can solve. But I think it's really about continuing to develop those use cases and expanding in the market to become more of a holistic provider of analytic solutions on top of Hadoop.

>> How's Cambridge working out for you? I mean, the founders moved the company up from New Haven and chose the East Coast, chose Cambridge. We're obviously really happy about that, as East Coast people. You don't live there full time, but you might as well. So how's that working out? The talent pool, the vibrancy of the community, the young people that you're able to tap?

>> So I see there's a bunch of dimensions around that one.
It's hot. It's really, really hot.

>> And humid, yes.

>> But it's actually been fantastic. And if you look not just at the talent inside the team, but around the team: if you look at our board, right, Jit Saxena and Chris Lynch have been very successful in the database community over decades of experience, and getting folks like that onto the board... Felda Hardymon has been in this space as well for a long time. Having folks like that as advisors, providing guidance to the team, is absolutely incredible. Hack/Reduce is a great facility where we do things like hackathons and meetups and get the community together. So I think there's been a lot of positive inertia around the company just from being here in Cambridge. From a development resource or recruiting point of view, it's also been great, because you've got some really exceptional database companies in this area, and history will show you there's been a lot of success here, not only in incubating technology but in building real database companies. We're a startup on the block that people are very interested in, and I think we show a lot of the dynamics that are changing in the market and the way the market's moving. So our ability to recruit talent is exceptional, right? We've got a lot of great people to pick from. We've had a lot of people join from other, previously very successful database companies. The team's growing significantly in the engineering space right now. I just can't say enough good things about the community, Hack/Reduce, and all the resources that we get access to because we're here in Cambridge.

>> Hack/Reduce is cool. So you guys are obviously leveraging that; you do how-tos to bring people in. Hack/Reduce is essentially... it's not an incubator. It's really more of an idea cloud, a resource cloud, started by Fred Lalonde and Chris Lynch.
Essentially, people come in and share ideas. You guys, I know, have hosted a number of how-tos, and it's basically open. We've done some stuff there. It's very cool.

>> Yeah, it's also a great place for us to recruit, right? We meet a lot of talented people there, and with the university participation as well, we get a lot of talent coming in to participate in these activities. And we do things that aren't just Hadapt-related: we've had people teach Hadoop sessions and just evangelize what's happening in the ecosystem around us. Like I said, it's been a great resource pool to engage with, and I think it's been as beneficial to the community as it has been to us. So we're very grateful for that.

>> All right, Scott, as always, awesome to see you. I knew you were going to have some good practitioner perspectives on data quality. Really appreciate you stopping by.

>> My pleasure. Thanks for having me.

>> Good to see you. Take care. Keep it right there, everybody; we'll be right back with our next guest. This is Dave Vellante with Jeff Kelly. This is theCUBE. We're live here at the MIT Information Quality Symposium. We'll be right back.
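
Editor's sketch: earlier in the conversation, Howser talks about settling on a "confidence coefficient" so that, within some degree of probability, a match is good enough for the use case at hand. One simple way to picture that trade-off is a weighted agreement score between two records from different channels, compared against a threshold that each application chooses for itself. Everything specific below (the field names, the weights, the thresholds, and the sample records) is a hypothetical assumption made for illustration; it is not a method described in the interview.

```python
# Hypothetical field weights, in percentage points; they sum to 100.
FIELD_WEIGHTS = {"email": 60, "phone": 30, "postcode": 10}


def _clean(value):
    """Lowercase and trim string values; treat anything else as missing."""
    return value.strip().lower() if isinstance(value, str) else None


def match_confidence(a, b, weights=FIELD_WEIGHTS):
    """Sum the weights of fields present in both records that agree."""
    score = 0
    for field, weight in weights.items():
        left, right = _clean(a.get(field)), _clean(b.get(field))
        if left and right and left == right:
            score += weight
    return score


def same_person(a, b, threshold=60):
    """Decide a match; lower thresholds trade accuracy for coverage."""
    return match_confidence(a, b) >= threshold


# Hypothetical records for one individual seen through three channels.
WEB = {"email": "pat@example.com", "phone": "555-0100"}
STORE = {"email": "PAT@example.com ", "postcode": "02139"}
CALL = {"phone": "555-0100", "postcode": "02139"}

if __name__ == "__main__":
    print(match_confidence(WEB, STORE))          # only the email agrees
    print(same_person(WEB, CALL))                # strict threshold
    print(same_person(WEB, CALL, threshold=30))  # looser threshold
```

A use case that tolerates false positives, such as the Twitter analysis mentioned in the interview, could run with the looser threshold, while the healthcare or financial services cases would presumably keep it strict.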

Published Date : Jul 17 2013


ENTITIES

Entity | Category | Confidence
Jeff Kelly | PERSON | 0.99+
Scott | PERSON | 0.99+
Omni Channel | ORGANIZATION | 0.99+
Chris Lynch | PERSON | 0.99+
Scott Howser | PERSON | 0.99+
Dave Volante | PERSON | 0.99+
Cambridge | LOCATION | 0.99+
five | QUANTITY | 0.99+
eight month | QUANTITY | 0.99+
today | DATE | 0.99+
Angelique Solutions | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
John Ferrier | PERSON | 0.99+
first | QUANTITY | 0.99+
Fred Lan | PERSON | 0.99+
Scott Hauser | PERSON | 0.99+
Sohag | ORGANIZATION | 0.99+
New Haven | LOCATION | 0.99+
Twitter | ORGANIZATION | 0.99+
Cambridge, Massachusetts | LOCATION | 0.99+
two thousand | QUANTITY | 0.99+
two things | QUANTITY | 0.99+
Stewart | PERSON | 0.99+
eighty | QUANTITY | 0.99+
one | QUANTITY | 0.99+
one example | QUANTITY | 0.98+
each channel | QUANTITY | 0.98+
one side | QUANTITY | 0.98+
single | QUANTITY | 0.98+
One | QUANTITY | 0.98+
2013 | DATE | 0.97+
Hughes | PERSON | 0.97+
a week | QUANTITY | 0.96+
two | QUANTITY | 0.96+
one repository | QUANTITY | 0.96+
#CDOIQ | ORGANIZATION | 0.96+
East Coast | LOCATION | 0.96+
two worlds | QUANTITY | 0.95+
a decade | QUANTITY | 0.94+
one common repository | QUANTITY | 0.93+
Hack Reduce | ORGANIZATION | 0.92+
#MIT | ORGANIZATION | 0.91+
one common repository | QUANTITY | 0.91+
Wicked Bond | ORGANIZATION | 0.91+
Cube | ORGANIZATION | 0.91+
one common | QUANTITY | 0.89+
MIT Information Quality | EVENT | 0.89+
Mighty Information Quality Symposium | EVENT | 0.88+
Khun | PERSON | 0.87+
MIT Information Quality | ORGANIZATION | 0.86+
single version | QUANTITY | 0.86+
a day | QUANTITY | 0.85+
twos | QUANTITY | 0.85+
Teo | PERSON | 0.85+
Sample | PERSON | 0.82+
Duke Duke | ORGANIZATION | 0.81+
one side of | QUANTITY | 0.8+
single sign | QUANTITY | 0.8+
Duke | ORGANIZATION | 0.76+
Jet Saxena | PERSON | 0.75+
Hobby | ORGANIZATION | 0.75+
last decade | DATE | 0.74+
Data Lake | LOCATION | 0.72+
themes | QUANTITY | 0.7+
Adapt Company | ORGANIZATION | 0.65+
Cube Silicon Angles | ORGANIZATION | 0.62+
Hindu | OTHER | 0.61+
Duke | LOCATION | 0.6+
Hadapt | ORGANIZATION | 0.58+
Hardiman | PERSON | 0.57+
three | QUANTITY | 0.52+
Symposium | ORGANIZATION | 0.51+
points | QUANTITY | 0.5+
#theCUBE | ORGANIZATION | 0.49+
Stewart Madness | PERSON | 0.49+
U. S. | ORGANIZATION | 0.48+
couple | QUANTITY | 0.47+