Martin Lidl, Chris Murphy & Itamar Ankorion - BigData SV - #BigDataSV - #theCUBE
>> Announcer: Live from San Jose, California, it's the CUBE, covering Big Data Silicon Valley 2017.

>> Good afternoon everyone. This is George Gilbert. We're at Silicon Valley Big Data in conjunction with Strata and Hadoop World. We've been here every year for six years, and I'm pleased to bring with us today a really interesting panel, with our friends from Attunity, Itamar Ankorion. We were just discussing that he has an Israeli name, but some of us could be forgiven for thinking it Italian or Turkish. Itamar is CMO of Attunity. We have Chris Murphy, who is from a very large insurance company that we can't name right now, and then Martin Lidl from Deloitte. We're going to be talking about their experience building a data lake, a high-value data lake, and some of the technology choices they made, including how Attunity fits in that. Maybe kicking that off, Chris, perhaps you can tell us what the big objectives were for the data lake, in terms of what outcomes you were seeking.

>> Okay, I'd start off by saying there wasn't any single objective. It was very much about putting in a key enterprise component that would facilitate many, many things. When I look at it now and I look back, with wisdom hopefully, I see it as trying to put in data as a service within the company. Very much we built it as an operational data lake first and foremost, because we wanted to generate value for the company. I very much convey to people that this was something that was worth investing in on an ongoing basis, and then on the back of that, of course, once you've actually pulled all the data together and started to curate it and make it available, then you can start doing the research work as well. We were trying to get the best of both worlds from that perspective.

>> Let me follow up with that just really quickly. It sounds like if you're doing data as a service, it's where central IT as a function created a platform on which others would build applications, and you had to make that platform mature to a certain level, not just the software but the data itself. Then at that point, did you show prototype applications to different departments and business units, or how did the uptake, you know, how organically did that move?

>> Not so much, it was very much a fast-delivering, agile set of projects working together, so we actually had what we used to call the holy trinity of the projects we were doing: putting in a new customer portal that would be getting all of its data from the data lake, putting in a new CRM system getting all of its data from the data lake and talking to the customer portal, and then of course behind that, the data lake itself feeding all the data to these systems. We weren't developing in parallel to those projects, but of course those were not small projects. Those were sizable beasts, but side by side with that, we were still able to use the data lake to do some proof of concept work around analytics. Interestingly, one of the first things we used the data lake for, on the analytics side, was actually meeting a government regulatory requirement, where they needed us to get an amount of data together for them very quickly. When I say quickly, I mean within two weeks. We went to our typical suppliers and said, "How long will this take?" About three months, they thought.
In terms of actually using the data lake, we pulled the data together in about two days, and most of the delays were due to the lack of strict requirements, where we were just figuring out exactly what people wanted. That really helped demonstrate the benefit of having a data lake in place.

>> So Martin, tell us how Deloitte, you know, with its sort of deep bench of professional services skills, could help make that journey easier for Chris and for others.

>> There were actually a number of areas where we engaged. We were engaged all the way from the very beginning, working on the business case creation, and it really came to life when we brought our technology people in to work out a roadmap of how to deal with it. As Chris said, there were many moving parts, therefore many teams within Deloitte were engaged with different areas of specialization: from a development perspective on the one hand, to the Salesforce CRM in the background, and then obviously my team of sort of data ninjas that came in and built the data lake. We also partnered with other third parties on the testing side, so that we covered, really, the full life cycle there.

>> If I were to follow up with that, it sounds like because there were other systems being built out in parallel that depended on this, you probably had fewer degrees of freedom in terms of what the data had to look like when you were done.

>> I think that's true, to a degree, but when you look at the delivery model that we deployed, it was very much agile delivery, and during the elaboration phase we were working together very closely across these three teams, right? So there was a certain amount of, well, not freedom in terms of what to deliver in the end, but agreement as to what good would look like at the end of a sprint or for a release, so there were no surprises as such. Still, through the flexible architecture that we had built and the flexible model we were delivering with, we could also respond to changes very quickly, so if the product owner made priority calls and changed items in the backlog, we could respond to this quite quickly.

>> So Itamar, maybe you can help us understand how Attunity added value that other products couldn't really deliver, and how it made the overall pipeline more performant.

>> Okay, absolutely. The project that this Fortune 100 company was putting together was an operational data lake. It was very important for them to get data from a lot of different data sources, so they could merge it together for analytic purposes, and also to get data in real time so they could support real-time analytics using information that is very fresh. That data, in many financial services and insurance companies, comes from the mainframe, so multiple systems on the mainframe as well as other systems, and they needed an efficient way to get the data ingested into their data lake. That's where Attunity came in, as part of the overall data lake architecture, to support an incremental, continuous, universal data ingestion process. Attunity Replicate lends itself to loading the data directly into the data lake, into Hadoop in this case, or, if they opt to, going through mechanisms like Kafka and others, so it provided a lot of flexibility architecturally to capture data as it changes in their many different databases and feed that into the data lake, so it can be used for different types of analytics.
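To make the ingestion pattern Itamar describes a bit more concrete, here is a minimal sketch of the downstream half of such a pipeline: consuming CDC-style change events from a Kafka topic and landing them as date-partitioned JSON files in a lake-style layout. The topic name, event shape, paths, and the kafka-python client are illustrative assumptions, not details from the project; in the architecture described above, Attunity Replicate handles the capture side against the source databases.

```python
# Illustrative sketch only: a tiny CDC landing-zone consumer.
# Assumes change events arrive on Kafka as JSON, e.g.:
#   {"op": "update", "table": "policy", "key": "P-1001",
#    "after": {"status": "active"}, "ts": "2017-03-14T10:22:05Z"}
import json
import pathlib
from datetime import datetime, timezone

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "cdc.policy_admin",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    group_id="datalake-landing",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

LANDING_ROOT = pathlib.Path("/data/landing")  # stand-in for an HDFS landing path

for message in consumer:
    event = message.value
    # Partition the landing zone by source table and load date, a common lake
    # layout that lets downstream Hive/Spark jobs merge inserts, updates and
    # deletes later on.
    load_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    out_dir = LANDING_ROOT / event["table"] / f"load_date={load_date}"
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(out_dir / "changes.jsonl", "a", encoding="utf-8") as out:
        out.write(json.dumps(event) + "\n")  # one change record per line
```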
>> So just to drill down on that one level, because many of us would assume that, you know, the replication log that Attunity models itself after would be similar to the event log that Kafka models itself after. Is it that if you use Kafka you have to modify the source systems, and therefore it puts more load on them, whereas with Attunity you are sort of piggybacking on what's already happening, and so you don't add to the load on those systems?

>> Okay, great question. Let me clarify.

>> Okay. First of all, Kafka is a great technology that we're seeing more and more customers adopt as part of their overall big data management architectures. It's basically a publish-subscribe infrastructure that allows you to scale up the messaging and storage of data as events, as messages, so you can easily move it around and also process it in a more real-time, streaming fashion. Attunity complements Kafka and is actually very well integrated with it, as well as with other streaming types of ingestion and data processing technologies. What Attunity brings to the picture here is primarily the key technology of CDC, change data capture, which is the ability to capture the data as it changes in many different databases, do that in a manner that has very little impact, if any, on the source system and the environment, and deliver it in real time. So what Attunity does, in a sense, is turn the databases into live feeds that can then stream either directly into platforms such as Hive and HDFS, or into Kafka for further processing and integration. So again, it's very complementary in that sense.

>> Okay. So maybe give us, Chris, a little more color on the before and after state, you know, before these multiple projects happened, and then the data lake as sort of a data foundation for these other systems that you're integrating. What business outcomes changed, and how did they change?

>> Oof, that's a tough question. I've been asked many flavors of that question before, and the analogy I always come back to is that it's like we were moving from candle power to electricity. There's no single use case that shows this is why you need a data lake. It was many, many things they wanted to do. In the before picture, it was always just very challenging: like many companies, we've outsourced the mainframe support, operation, and running of our systems to third parties, and we were constrained by that. You know, we were in that crazy situation where we couldn't get to our own data. By implementing the data lake, we've broken down that barrier. We now have things back in our control. I mentioned before that POC we did with the regulatory reporting: again, three months ... two days. It was night and day in terms of what we were now able to do.

>> Many banks are beginning to say that their old business model was get the customer's checking account and then, you know, upsell and cross-sell all these other related products or services. Is something happening like that with insurance, where if you break down the data silos, it's easier to sell other services?

>> There will be, is probably the best way to put it. We're not there yet, and you know, it's a road, right? It's a long journey and we're doing it in stages, so I think we've done what, three different releases on the data lake to date? That's very much on the plan.
We want to do things like nudges, to demonstrate to customers how there are products that could be a very good fit for them, because once you understand your customer, you understand what their gaps are, what their needs and wants are. Again, it's very much in the roadmap, just not at that part of the map yet.

>> So help us maybe understand some of the near-term steps you want to take on that roadmap towards that nirvana.

>> So, those ...

>> And what role Attunity as a vendor might play, and Deloitte, you know, as a professional services organization, to help get you there.

>> So Attunity was obviously all about getting the data there as efficiently as possible. Unfortunately, like many things in your first iteration, our data lake is still running on a batch basis, but we'd like to evolve that as time goes by. In terms of actually making use of the lake, one of the key things we were doing was implementing a client matching solution. We didn't actually have an MDM system in place for managing our customers. We had 12 different policy admin systems in place. Customers could be coming to us being enrolled, they could be a beneficiary, they could be the policy holder, they could be a power of attorney, and we could talk to someone on the phone and not really understand who they were. You get them into the data lake, you start to build up that 360 view of who people are, and then you start to understand what you can do for this person. That was very much the journey we're going on.

>> And Martin, have you worked with ... Are you organized by industry line, and is there a sort of capability maturity level where, you know, you can say, okay, you have to master these skills, and at that skill level then you can do these richer business offerings?

>> Yeah, absolutely. First of all, yes, we are organized by industry groups, and we have sort of a common model across industries that describes what you just said. When we talk about an insight-driven organization, this is really where you are sort of moving to on the maturity curve, as you become more mature in using your analytical capabilities and turning data from just data into information, into a real asset you can actually monetize, right? Where we went with Chris' organization, and actually with many other life insurers, is sort of the first step on this journey, right? What Chris described, around being able for the first time to see a customer-centric view and see what a customer has in terms of products, and therefore what they don't have, and where there are opportunities for cross-selling, is sort of a first step into becoming more proactive, right? There's actually a lot more that can follow on after that, but yeah, we've got maturity models that we assess against, and we sort of gradually move organizations to the right place for them, because it's not going to be right for every organization to be an insight-driven organization, to make this huge investment to get there, but most companies will benefit from being nudged in that direction.

>> Okay, and on that note we're going to have to leave it here. I will say that I think there's a session at 2:30 today with Deloitte and the unnamed insurance team, talking in greater depth about the case study with Attunity. On that, we'll be taking a short break. We'll be back at Big Data Silicon Valley. This is George Gilbert, and we'll see you in a few short minutes.
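As a footnote to the client matching discussion above: Chris describes building a 360-degree view by matching customers across a dozen policy admin systems. The sketch below is a deliberately naive illustration of that idea; the field names, match key, and source systems are invented for the example, and a real MDM or matching solution would use far richer rules (fuzzy name matching, addresses, survivorship logic).

```python
# Illustrative sketch only: naive client matching across policy admin extracts.
from collections import defaultdict

def match_key(record):
    """Build a crude match key from a normalized name and date of birth."""
    name = " ".join(record["name"].lower().split())
    return (name, record["dob"])

def build_customer_360(*feeds):
    """Merge records from multiple source systems into one view per person."""
    customers = defaultdict(lambda: {"roles": set(), "policies": set(), "sources": set()})
    for feed in feeds:
        for rec in feed:
            view = customers[match_key(rec)]
            view["roles"].add(rec["role"])         # policy holder, beneficiary, ...
            view["policies"].add(rec["policy_id"])
            view["sources"].add(rec["source"])
    return customers

# Two hypothetical source-system extracts that refer to the same person.
life_admin = [
    {"source": "life_admin", "name": "Jane  Doe", "dob": "1970-01-01",
     "role": "policy holder", "policy_id": "L-100"},
]
annuity_admin = [
    {"source": "annuity_admin", "name": "jane doe", "dob": "1970-01-01",
     "role": "beneficiary", "policy_id": "A-207"},
]

for key, view in build_customer_360(life_admin, annuity_admin).items():
    print(key, sorted(view["roles"]), sorted(view["policies"]), sorted(view["sources"]))
```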