
Search Results for Customer 360:

Applying Smart Data Fabrics Across Industries


 

(upbeat music) >> Today more than ever before, organizations are striving to gain a competitive advantage, deliver more value to customers, reduce risk, and respond more quickly to the needs of the business. Now, to achieve these goals, organizations need easy access to a single view of accurate, consistent and, very importantly, trusted data. If it's not trusted, nobody's going to use it, and all in near real time. However, the growing volumes and complexities of data make this difficult to achieve in practice. Not to mention the organizational challenges that have evolved as data becomes increasingly important to winning in the marketplace. Specifically, as data grows, so does the prevalence of data silos, making integrating and leveraging data from internal and external sources a real challenge. Now, in this final segment, we'll hear from Joe Lichtenberg, who's the global head of product and industry marketing, and he's going to discuss how smart data fabrics can be applied to different industries. And by way of these use cases, we'll probe Joe's vast knowledge base and ask him to highlight how InterSystems, which touts a next-gen approach to Customer 360, leverages a smart data fabric to provide organizations of varying sizes and sectors in financial services, supply chain, logistics and healthcare with a better, faster and easier way to deliver value to the business. Joe, welcome, great to have you here. >> Thank you, it's great to be here. That was some intro. I could not have said it better myself, so thank you for that. >> Thank you. Well, we're happy to have you on this show now. I understand- >> It's great to be here. >> You've made a career helping large businesses with technology solutions, small businesses too, and then scaling those solutions to meet whatever needs they had. And of course, you're a vocal advocate, as is your company, of data fabrics. We talked to Scott earlier about data fabrics, how it relates to data mesh, big discussions in the industry. So tell us more about your perspective. >> Sure, so first I would say that I have been in this industry for a very long time, so I've been, like you I'm sure, working for decades with customers and with technology, really to solve these same kinds of challenges. So for decades, companies have been working with lots and lots of data and trying to get business value from it to solve all sorts of different challenges. And I will tell you that I've seen many different approaches and different technologies over the years. So, early on, point-to-point connections with custom coding, and I've worked with integration platforms 20 years ago with the advent of web services and service-oriented architectures and exposing endpoints with WSDL and getting access to disparate data from across the organization. And more recently, obviously, with data warehouses and data lakes, and now moving workloads to the cloud with cloud-based data marts and data warehouses. Lots of approaches that I've seen over the years, but yet still challenges remain in terms of getting access to a single trusted real-time view of data. And so, recently, we ran a survey of more than 500 different business users across different industries, and 86% told us that they still lack confidence in using their data to make decisions. That's a huge number, right? And if you think about all of the work and all of the technology and approaches over the years, that is a surprising number, and drilling into why that is, there were three main reasons. One is latency. 
So the amount of time that it takes to access the data and process the data and make it fit for purpose means that by the time the business has access to the data and the information that they need, the opportunity has passed. >> Elapsed time, not speed of light, right? But that too, maybe. >> But it takes a long time if you think about these processes, and you have to take the data and copy it and run ETL processes and prepare it. So that's one. The second is just the amount of data that's disparate in data silos. So still struggling with data that is dispersed across different systems in different formats. And the third is data democratization. So the business really wants to have access to the data so that they can drill into the data and ask ad hoc questions and the next question and drill into the information and see where it leads them, rather than having sort of pre-structured data and pre-structured queries and having to go back to IT and put the request back on the queue again and waiting. >> So it takes too long, the data's too hard to get to 'cause it's in silos, and the data lacks context because it's technical people that are serving up the data to the business people. >> Exactly. >> And there's a mismatch. >> Exactly right. So they call that data democratization, or giving the business access to the data and the tools that they need to get the answers that they need in the moment. >> So the skeptic in me, 'cause you're right, I have seen this story before and the problems seem like they keep coming up, year after year, decade after decade. But I'm an optimist, and so. >> As am I. >> And so I sometimes say, okay, same wine, new bottle, but it feels like it's different this time around with data fabrics. You guys talk about smart data fabrics, so from your perspective, what's different? >> Yeah, it's very exciting and it's a fundamentally different approach. So if you think about all of these prior approaches, and by the way, all of these prior approaches have added value, right? It's not like they were bad, but there are still limitations, and the business still isn't getting access to all the data that they need in the moment, right? So data warehouses are terrific if you know the questions that you want answered and you take the data and you structure the data in advance. And so now you're serving the business with sort of pre-planned answers to pre-planned queries, right? The data fabric, what we call a smart data fabric, is fundamentally different. It's a fundamentally different approach in that, rather than sort of in batch mode taking the data and making it fit for purpose, with all the complexity and delays associated with it, with a data fabric we're accessing the data on demand as it's needed, as it's requested, either by the business or by applications or by the data scientists, directly from the source systems. >> So you're not necessarily copying it, you're not FTPing it, for instance. I've got it, you take it, you're basically using the same source. >> You're pulling the data on demand as it's being requested by the consumers. And then all of the data management processes that need to be applied for integration and transformation to get the data into a consistent format, and business rules and analytic queries. 
And as Jess showed, with machine learning and predictive and prescriptive analytics, all sorts of powerful capabilities are built into the fabric, so that as you're pulling the data on demand, right, all of these processes are being applied, and the net result is you're addressing these limitations around latency and silos that we've seen in the past. >> Okay, so you've talked about how you have a lot of customers, InterSystems does, in different industries, supply chain, financial services, manufacturing. We heard from Jess on healthcare. What are you seeing in terms of applications of smart data fabrics in the real world? >> Yeah, so we see it in every industry. So InterSystems, as you know, has been around now for 43 years, and we have tens of thousands of customers in every industry. And this architectural pattern now is providing value for really critical use cases in every industry. So I'm happy to talk to you about some that we're seeing. I could actually spend like three hours here, but I'm very passionate about working with customers and there's all sorts of exciting- >> What are some of your favorites? >> So, obviously supply chain right now is going through a very challenging time. So the combination of what's happening with the pandemic and disruptions, and now I understand eggs are difficult to come by, I just heard on NPR. >> Yeah, and it's in part a data problem, in big part a data problem, is that fair? >> Yeah, and so, in supply chain, first there's supply chain visibility. So organizations want a real-time or near real-time expansive view of what's happening across the entire supply chain, from supply all the way through distribution, right? So that's only part of the issue, but that's a huge sort of real-time data silos problem. So if you think about your extended supply chain, it's complicated enough with all the systems and silos inside your firewall, before all of your suppliers, even just thinking about your tier one suppliers, let alone tier two and tier three. And then building on top of real-time visibility is what the industry calls a control tower, what we call the ultimate control tower. And so it's built-in analytics to be able to sense disruptions and exceptions as they occur and predict the likelihood of these disruptions occurring. And then having data-driven and analytics-driven guidance in terms of the best way to deal with these disruptions. So for example, an order is missing line items or a cargo ship is stuck off port somewhere. What do you do about it? Do you reroute a different cargo ship, right? Do you take an order that's en route to a different client and reroute that? What's the cost associated? What's the impact associated with it? So that's a huge issue right now around control towers for supply chain. So that's one. >> Can I ask you a question about that? Because you and I have both seen a lot, but we've never seen, at least I haven't, the economy completely shut down like it was in March of 2020, and now we're seeing this sort of slingshot effect, almost like when you're driving on the highway, sometimes you don't know why, but all of a sudden you slow down and then you speed up, you think it's okay, then you slow down again. Do you feel like you guys can help get a handle on that problem, because it goes on both sides. Sometimes you can't get the product, sometimes there's too much product as well, and that's not good for business. >> Yeah, absolutely. You want to smooth out the peaks and valleys. >> Yeah. 
>> And that's a big business goal, business challenge for supply chain executives, right? So you want to make sure that you can respond to demand, but you don't want to overstock, because there's cost associated with that as well. So how do you optimize the supply chain? And it's very much a data silo and a real-time challenge. So it's a perfect fit for this new architectural pattern. >> All right, what else? >> So if we look at financial services, we have many, many customers in financial services, and that's another industry where they have many different sources of data that all have information that organizations can use to really move the needle, if they could just get to that single source of truth in real time. So we sort of bucket many different implementations and use cases that we do around what we call Business 360 and Customer 360. So Business 360, there's all sorts of ways to add business value in terms of having a real-time operational view across all of the different geos and parts of the business, especially in these very large global financial services institutions like capital markets and investment firms and so forth. So around Business 360, having a real-time view of risk, operational performance, regulatory compliance, things like that. Customer 360, there's a whole set of use cases around hyper-personalization of customers and real-time next best action, looking to see how you can sell more, increase share of wallet, cross-sell, upsell to customers. We also do a lot in terms of predicting customer churn. So if you have all the historical data, what's the likelihood of customers churning, to be able to proactively intercede, right? It's much more cost effective to keep assets under management and keep clients rather than going and getting new clients to come to the firm. A very interesting use case from one of our customers in Latin America, so Banco do Brasil, the largest bank in all of Latin America, and they have a very innovative CTO who's always looking for new ways to move the needle for the bank. And so one of their ideas, and we're working with them to do this, is how can they generate net new revenue streams by bringing in new business to the bank? And so they identified a large percentage of the population in Latin America that does no banking. So they have no banking history, not only with Banco do Brasil, but with any bank. So there's a fair amount of risk associated with offering services to this segment of the population that's not associated with any banks or financial institutions. >> There is no historical data on them, there's no- >> So it's a data challenge. And so, they're bringing in data from a variety of different sources, social media, open source data that they find online and so forth. And with us, running risk models to identify which are the citizens where there's acceptable risk to offer their services. >> It's going to be a huge market of unbanked people in Latin America. >> Wow, that's interesting. >> Yeah, yeah, totally. >> And if you can lower the risk, you could tap that market and be first- >> And they are, yeah. >> Yeah. >> So very exciting. 
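To make the churn and risk-scoring use cases a bit more concrete, here is a minimal Python sketch of the kind of model described above: train on historical client attributes with a churn label, then score current clients so relationship managers can proactively intercede. The file name, feature columns and model choice are assumptions for illustration, not InterSystems' or Banco do Brasil's actual implementation.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical extract of historical client data with a 'churned' label.
clients = pd.read_csv("client_history.csv")  # assumed file and schema
features = ["assets_under_management", "trades_last_90d",
            "service_tickets_last_year", "tenure_years"]  # assumed columns

X_train, X_test, y_train, y_test = train_test_split(
    clients[features], clients["churned"], test_size=0.2, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# Score current clients and surface the ones most likely to leave so a
# relationship manager can proactively intercede.
clients["churn_risk"] = model.predict_proba(clients[features])[:, 1]
print(clients.sort_values("churn_risk", ascending=False)
             .head(20)[["client_id", "churn_risk"]])
```

The same shape of model, trained on alternative data such as social media or open-source signals, could stand in for the risk scoring of unbanked customers mentioned above.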
Manufacturing, we know Industry 4.0, which is about taking the OT data, so the data from the MES systems and the streaming data, real-time streaming data from the machine controllers, and integrating it with the IT data, so your data warehouses and your ERP systems and so forth, to have not only a real-time view of manufacturing from supply and source all the way through demand, but also predictive maintenance and things like that. So that's very big right now in manufacturing. >> Kind of cool to hear these use cases beyond healthcare, which is obviously your wheelhouse. Scott defined this term of smart data fabrics, different than data fabrics, I guess. So when we think about these use cases, what's the value-add of so-called smart data fabrics? >> Yeah, it's a great question. So we did not define the term data fabric or enterprise data fabric. The analysts now are all over it. They're all saying it's the future of data management. It's a fundamentally different approach, this architectural approach of being able to access the data on demand. The canonical definition of a data fabric is to access the data where it lies and apply a set of data management processes, but it does not include analytics, interestingly. And so we firmly believe that most of these use cases gain value from having analytics built directly into the fabric. So whether that's business rules, or predictive analytics to predict the likelihood of a customer churning or a machine on the shop floor failing, or prescriptive analytics. So if there's a problem in the supply chain, what's the guidance for the supply chain managers to take the best action, right? Prescriptive analytics based on data. So rather than taking the data in the data fabric and moving it to another environment to run those analytics, where you have complexity and latency, having all of those analytics capabilities built directly into the fabric, which is why we call it a smart data fabric, brings a lot of value to our customers. >> So it simplifies the whole data lifecycle, data pipelining, the hyper-specialized roles that you have to have, you can really just focus on one platform, is that right? >> Exactly, basically, yeah. And it's a simplicity of architecture and faster speed to production. So a big differentiator for our technology, for InterSystems IRIS, is most if not all of the capabilities that are needed are built into one engine, right? So you don't need to stitch together 10 or 15 or 20 different data management services for a relational database and a non-relational database and a caching layer and a data warehouse and security and so forth. And so you can do that. There are many ways to build this data fabric architecture, right? InterSystems is not the only way. >> Right? >> But if you can speed and simplify the implementation of the fabric by having most of what you need in one engine, one product, that gets you to where you need to go much, much faster. >> Joe, how can people learn more about smart data fabrics and some of the use cases that you've presented here? >> Yeah, come to our website, intersystems.com. If you go to intersystems.com/smartdatafabric, that'll take you there. >> I know that you have like probably dozens more examples, but it would be cool- >> I do. >> If people reach out to you, how can they get in touch? >> Oh, I would love that. So feel free to reach out to me on LinkedIn. It's Joe Lichtenberg, I think it's linkedin.com/joeLichtenberg, and I'd love to connect. >> Awesome. Joe, thanks so much for your time. Really appreciate it. 
>> It was great to be here. Thank you, Dave. >> All right, I hope you've enjoyed our program today. You know, we heard from Scott, who helped us understand this notion of data fabrics and smart data fabrics and how they can address the data challenges faced by the vast majority of organizations today. Jess Jody's demo was awesome. It was really a highlight of the program, where she showed the smart data fabric in action, and Joe Lichtenberg, who we just heard from, dug in to some of the prominent use cases and proof points. We hope this content was educational and inspires you to action. Now, don't forget, all these videos are available on demand to watch, rewatch and share. Go to theCUBE.net, check out siliconangle.com for all the news and analysis, and we'll summarize the highlights of this program, and go to intersystems.com because there are a ton of resources there. In particular, there's a knowledge hub where you'll find some excellent educational content and online learning courses. There's a resource library with analyst reports, technical documentation, videos, some great freebies. So check it out. This is Dave Vellante. On behalf of theCUBE and our supporter, InterSystems, thanks for watching and we'll see you next time. (upbeat music)
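For readers who want to picture the on-demand pattern Joe and Scott describe, here is a minimal Python sketch of a fabric-style Customer 360 request: data is pulled from two stand-in source systems at request time, normalized in flight, and run through a business rule before the consumer sees it. The connector functions and the rule are invented for illustration and are not the InterSystems IRIS API.

```python
# Stand-in connectors; in a real fabric these would query the source
# systems (CRM, core banking, mainframe, ...) at request time rather
# than working from batch copies.
def fetch_crm(customer_id):
    return {"id": customer_id, "name": "  jane  doe ", "segment": "retail"}

def fetch_core_banking(customer_id):
    return {"id": customer_id, "balance_cents": 1_250_000, "status": "ACTIVE"}

def normalize(record):
    # Transformation applied in flight, as the data is pulled.
    record["name"] = " ".join(record["name"].split()).title()
    return record

def apply_business_rules(view):
    # Example rule evaluated inside the fabric instead of downstream.
    view["eligible_for_offer"] = (
        view["status"] == "ACTIVE" and view["balance_cents"] > 1_000_000)
    return view

def customer_360(customer_id):
    """Assemble a single, consistent view on demand; nothing is staged in batch."""
    view = {**normalize(fetch_crm(customer_id)), **fetch_core_banking(customer_id)}
    return apply_business_rules(view)

print(customer_360("C-1001"))
```

The point of the sketch is the ordering: integration, transformation and rules run as part of serving the request, which is what distinguishes the smart data fabric from a batch warehouse pipeline.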

Published Date : Feb 15 2023



FINANCIAL Fight Fraud


 

(upbeat music) >> Hi, I'm Joe Rodriguez, Managing Director of Financial Services at Cloudera. Welcome to the Fight Fraud with Data session. At Cloudera we believe that fighting fraud begins with data. So financial services is Cloudera's largest industry vertical. We have approximately 425 global financial services customers, which include 82 of the hundred largest global banks, of which 27 are globally systemic banks; four of the five top stock exchanges; eight of the top 10 wealth management firms; and all four of the top credit card networks. So as you can see, most financial services institutions utilize Cloudera for data analytics and machine learning. We also have over 20 central banks and a dozen or so financial regulators. So it's an incredible footprint, which gives Cloudera lots of insight into the many innovations that our customers are coming up with. Criminals can steal thousands of dollars before a fraudulent transaction is detected, so the cost to purchase your account data is well worth the price to fraudsters. According to Experian, credit and debit card account information sells on the dark web for a mere $5 with the CVV number, and up to $110 if it comes with all the bank information, including your name, social security number, date of birth, complete account numbers, and other personal data. Our customers have several key data and analytics challenges when it comes to fighting financial crime. The volume of data that they need to deal with is huge and growing exponentially, and all this data needs to be evaluated in real time. There are new sources of streaming data that need to be integrated with existing legacy data sources; this includes biometrics data, enhanced authentication, video surveillance and call center data. There is an analytics arms race between the banks and the criminals, and the criminal networks never stop innovating. They also have to deal with disjointed security and governance. Security and governance policies are often set per data source or application, requiring redundant work across workloads. And they have to deal with siloed environments. The specialized nature of platforms and people results in disparate data sources and data management processes. This duplicates efforts and divides the business risk and crime teams, limiting collaboration opportunities between them. CDP enhances financial crime solutions to be holistic by eliminating data gaps between siloed solutions, with an enterprise data approach, advanced data analytics and machine learning. By deploying an enterprise-wide data platform, you reduce siloed divisions between business risk and crime teams and enable better collaboration, and through industrialized machine learning, you tighten up the loop between detection and new fraud patterns. Cloudera provides the data platform on which best-of-breed applications can run and leverage integrated machine learning. Cloudera enhances rather than replaces your existing fraud modeling applications. So Oracle, SAS, Actimize, to name a few, integrate with an enterprise data hub to scale the data, increase speed and flexibility and improve the efficacy of your entire fraud system. It also centralizes the fraud workload on data that can be used for other use cases in applications like enhanced KYC and Customer 360, for example. I just wanted to highlight a couple of our partners in financial crime prevention, Simudyne and Quantexa. 
So Simudyne provides fraud simulation, using agent-based modeling and machine learning techniques to generate synthetic transaction data. This data simulates potential fraud scenarios in a cost-effective, GDPR-compliant virtual environment to significantly improve financial crime detection systems. Simudyne identifies future fraud typologies from millions of simulations that can be used to dynamically train new machine learning algorithms for enhanced identification. And Quantexa connects the dots within your data, using dynamic entity resolution and advanced network analytics to create context around your customers. This enables you to see the bigger picture and automatically assess potential criminal behavior. Now let's go over some of our customers and how they're using Cloudera. First, we'll talk about United Overseas Bank, or UOB. UOB is a leading full-service bank in Asia with a network of more than 500 offices in 19 countries and territories in Asia Pacific, Western Europe and North America. UOB built a modern data platform on Cloudera that gives it the flexibility and speed to develop new AI and machine learning solutions and to create a data-driven enterprise. UOB set up its big data analytics center in 2017. It was Singapore's first centralized big data unit within a bank, to deepen the bank's data analytics capabilities and to use data insights to enhance the bank's performance. Essential to this work was implementing a platform that could cost-efficiently bring together data from dozens of separate systems and incorporate a range of unstructured data, including voice and text. Using Cloudera CDP and machine learning, UOB gained a richer understanding of its customer preferences to help make their banking experience simpler, safer, and more reliable. Working with Cloudera, UOB has a big data platform that gives business staff and data scientists faster access to relevant and quality data for self-service analytics, machine learning and emerging artificial intelligence solutions. With new self-service analytics and machine learning-driven insights, UOB has realized improvements in digital banking, asset management, compliance, AML, and more. Advanced AML detection capabilities help analysts detect suspicious transactions based on hidden relationships among shell companies and high-risk individuals. With Cloudera and machine learning technologies, UOB was able to enhance AML detection and reduce the time to identify new links from months to three weeks. Next, let's speak about MasterCard. MasterCard's principal business is to process payments between banks and merchants and the credit-issuing banks and credit unions of the purchasers who use MasterCard-brand debit and credit cards to make purchases. MasterCard chose Cloudera Enterprise for fraud detection and to optimize their data warehouse infrastructure, delivering deep insights and best practices in big data security and compliance. Next, let's speak about Bank Rakyat Indonesia, or BRI. BRI is one of the largest and oldest banks in Indonesia and engages in the provision of general banking services. It's headquartered in Jakarta, Indonesia. BRI is well known for its focus on microfinancing initiatives and serves over 75 million customers through its more than 11,000 offices and rural service outposts. BRI required better insight to understand customer activity and identify fraudulent transactions. 
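The hidden-relationship analysis described for UOB, and the network analytics that Quantexa provides, can be pictured with a small graph sketch like the one below. The networkx library, the toy entities and the three-hop cutoff are assumptions for illustration, not the actual tooling used by the bank or the vendors.

```python
import networkx as nx

# Toy relationship graph: edges link accounts, companies and directors,
# as they might be derived from transactions and registry data.
G = nx.Graph()
G.add_edges_from([
    ("acct_001", "ShellCo A"), ("ShellCo A", "Director X"),
    ("Director X", "ShellCo B"), ("ShellCo B", "acct_047"),
    ("acct_047", "acct_913"),
])

flagged = "acct_001"  # e.g. an account already under investigation

# Entities within a few hops of the flagged account become candidates
# for analyst review, surfacing otherwise hidden indirect links.
hops = nx.single_source_shortest_path_length(G, flagged, cutoff=3)
for entity, distance in sorted(hops.items(), key=lambda kv: kv[1]):
    if entity != flagged:
        print(f"{entity}: {distance} hop(s) from {flagged}")
```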
BRI needed a solid foundation that allowed it to leverage the power of advanced analytics, artificial intelligence, and machine learning to gain a better understanding of customers and the market. BRI used the Cloudera Enterprise data platform to build an agile, reliable, predictive augmented intelligence solution to enhance its credit scoring system. And to address the rising concern around data security from regulators and customers, BRI developed a real-time fraud detection service powered by Cloudera and Kafka. BRI's data scientists developed a machine learning model for fraud detection by creating a behavioral scoring model based on customer savings, loan transactions, deposits, payroll and other financial real-time data. This led to improvements in its fraud detection and credit scoring capabilities, as well as the development of a new digital microfinancing product. With the enablement of real-time fraud detection, BRI was able to reduce the rate of fraud by 40%. It improved relationship manager productivity two-and-a-half-fold. And it improved the credit scoring system, cutting micro-financing loan processing times from two weeks, to two days, to now two minutes. So fraud prevention is a good area to start a data focus on if you haven't already. It offers a quick return on investment, and it's a focused area that's not too entrenched across the company. To learn more about fraud prevention, go to www.cloudera.com, and you should schedule a meeting with Cloudera to learn even more. And with that, thank you for listening and thank you for your time. (upbeat music)
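The real-time piece of a pipeline like BRI's might look roughly like the following sketch: a consumer reads transactions from a Kafka topic and applies a pre-trained behavioral scoring model to each one. The topic name, feature fields, threshold and use of the kafka-python client are assumptions for illustration; the actual implementation is not described at this level of detail.

```python
import json
import joblib
from kafka import KafkaConsumer  # kafka-python, assumed available

# Behavioral scoring model trained offline on savings, loan, deposit and
# payroll features; the path and feature list are hypothetical.
model = joblib.load("behavior_score_model.pkl")
FEATURES = ["amount", "hour_of_day", "txn_count_24h", "avg_balance_30d"]

consumer = KafkaConsumer(
    "transactions",                      # assumed topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    txn = message.value
    score = model.predict_proba([[txn[f] for f in FEATURES]])[0][1]
    if score > 0.9:                      # illustrative threshold
        print(f"possible fraud on account {txn['account_id']}, score {score:.2f}")
        # A production pipeline would open a case or block the transaction here.
```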

Published Date : Aug 5 2021



Tendü Yogurtçu, Syncsort | BigData NYC 2017


 

>> Announcer: Live from midtown Manhattan, it's theCUBE, covering BigData New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Hello everyone, welcome back to theCUBE's special BigData NYC coverage here in Manhattan in New York City, we're in Hell's Kitchen. I'm John Furrier, with my cohost Jim Kobielus, who's a Wikibon analyst for big data. In conjunction with Strata Data going on right around the corner, this is our annual event where we break down the big data, the AI, the cloud, all the goodness of what's going on in big data. Our next guest is Tendu Yogurtcu, who's the Chief Technology Officer at Syncsort. Great to see you again, CUBE alumni, been on multiple times. Always great to have you on, get the perspective, a CTO perspective and the Syncsort update, so good to see you. >> Good seeing you, John and Jim. It's a pleasure being here too. Again, the pulse of big data is in New York, and it's a great week with a lot happening. >> I always borrow the quote from Pat Gelsinger, who's the CEO of VMware, he said on theCUBE in, I think, 2011, before he joined VMware as CEO, when he was at EMC. He said if you're not out in front of that next wave, you're driftwood. And the key to being successful is to ride the waves, and the big waves are coming in now with AI, certainly big data has been a rising tide of its own, but now the aperture of the scale of data's larger. Syncsort has been riding the wave with us, we've been having you guys on multiple times. And it was important to the mainframe in the early days, but now Syncsort just keeps on adding more and more capabilities, and you're riding the wave, the big wave, the big data wave. What's the update now with you guys, where are you guys now in the context of today's emerging data landscape? >> Absolutely. As organizations progress with their modern data architectures and build the next-generation analytics platforms, leveraging machine learning, leveraging cloud elasticity, we have observed that data quality and data governance have become more critical than ever. For a couple of years we have been seeing this trend, I would like to create a data lake, data as a service, and enable bigger insights from the data, and this year, really, every enterprise is trying to have that trusted data set created, because data lakes are turning into data swamps, as Dave Vellante often refers to them (John laughs), and the collection of these diverse data sets, whether it's mainframe, whether it's messaging queues, whether it's relational data warehouse environments, is challenging the customers. And we can take one simple use case like Customer 360, which we have been talking about for decades now, right? Yet still it's a complex problem. Everybody is trying to get that trusted single view of their customers so that they can serve the customer needs in a better way, offer better solutions and products to customers, get better insights about the customer behavior, whether leveraging deep learning, machine learning, et cetera. However, in order to do that, the data has to be in a clean, trusted, valid format, and every business is going global. You have data sets coming from Asia, from Europe, from Latin America, and many different places, in different formats, and it's becoming a challenge. We acquired Trillium Software in December 2016, and our vision was really to bring that world-leading, enterprise-grade data quality into the big data environments. So last week we announced our Trillium Quality for Big Data product. 
This product brings unmatched capabilities of data validation, cleansing, enrichment and matching, fuzzy matching, to the data lake. We are also leveraging our Intelligent eXecution engine that we developed for our data integration product, the MX8. So we are enabling organizations to take this data quality offering, whether it's in Hadoop, MapReduce or Apache Spark, whichever compute framework it's going to be in the future. So we are very excited about that now. >> Congratulations, you mentioned the data lake being a swamp, that Dave Vellante referred to. It's interesting, because how does it become a swamp if it's a silo, right? We've seen data silos being the antithesis of governance, it challenges, certainly, IoT. Then you've got the complication of geopolitical borders, you mentioned that earlier. So you still got to integrate the data, you need data quality, which has been around for a while but now it's more complex. What specifically about the cleansing and the quality of the data is more important now in the landscape? Is it those factors, are those the drivers of the challenges today, and what's the opportunity for customers, how do they figure this out? >> Complexity is because of many different factors. Some of it is from being global. Every business is trying to have a global presence, and the data is originating from web, from mobile, from many different data sets, and if we just take a simple address, these address formats are different in every single country. With Trillium Quality for Big Data, we support postal data from over 150 countries, and data enrichment with this data. So it becomes really complex, because you have to deal with different types of data from different countries, and the matching also becomes very difficult, whether it's John Furrier, J Furrier, John Currier, you have to be- >> All my handles on Twitter, you know what that's about. (Tendu laughs) >> All of the handles you have. Every business is trying to have better targeting in terms of offering products and understanding the single and one and only John Furrier as a customer. That creates complexity in any data management and data processing challenge. The variety of data and the speed at which data is being populated are higher than we have ever observed. >> Hold on Jim, I want to get Jim involved in this one conversation, 'cause I want to just make sure those guys can get settled in on, and adjust your microphone there. Jim, she's bringing up a good point, I want you to weigh in just to kind of add to the conversation and take it in the direction of where the automation's happening. If you look at what Tendu's saying, as to complexity, there's going to be an opportunity in software. Machine learning, root-level cleanliness can be automated, because Facebook and others have shown that you can apply machine learning and techniques to the volume of data. No human can get at all the nuances. How is that impacting the data platforms and some of the tooling out there, in your opinion? >> Yeah well, much of the issue, one of the core issues, is where do you place the data matching and data cleansing logic or execution in this distributed infrastructure. At the source, in the cloud, at the consumer level in terms of rolling up the disparate versions of data into a common view. So by acquiring a very strong, well-established, reputable brand in data cleansing, Trillium, as Syncsort has done, you've done a great service to your portfolio, to your customers. 
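Trillium's engine itself is proprietary, but the flavor of large-scale standardization and fuzzy matching can be sketched in open-source PySpark: normalize a name field, then pair records whose names fall within a small edit distance. The sample data, blocking key and threshold are assumptions for illustration, not how Trillium Quality for Big Data works internally.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("fuzzy-match-sketch").getOrCreate()

# Hypothetical customer extract landed in the data lake from several sources.
customers = spark.createDataFrame(
    [(1, " John  Furrier ", "US"), (2, "Jon Furrier", "US"), (3, "Jane Doe", "UK")],
    ["id", "name", "country"],
)

# Basic standardization: trim, lower-case, collapse repeated whitespace.
clean = customers.withColumn(
    "name_std",
    F.regexp_replace(F.lower(F.trim(F.col("name"))), r"\s+", " "))

# Candidate duplicate pairs within edit distance 2, blocked by country
# so the self-join stays tractable at scale.
a, b = clean.alias("a"), clean.alias("b")
pairs = (
    a.join(b, (F.col("a.country") == F.col("b.country"))
              & (F.col("a.id") < F.col("b.id")))
     .withColumn("distance", F.levenshtein(F.col("a.name_std"), F.col("b.name_std")))
     .filter(F.col("distance") <= 2)
     .select(F.col("a.id").alias("id_a"), F.col("b.id").alias("id_b"),
             F.col("a.name_std").alias("name_a"),
             F.col("b.name_std").alias("name_b"), F.col("distance"))
)
pairs.show(truncate=False)
```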
You know, Trillium is well known for offering lots of options in terms of where to configure the logic, where to deploy it within distributed hybrid architectures. Give us a sense, going forward, of the range of options you're going to be providing for customers on where to place the cleansing and matching logic. How are you going to support, at Syncsort, flexible workflows in terms of curation of the data and so forth, because the curation cycle for data is critically important, the stewardship. So how do you plan to address all of that going forward in your product portfolio, Tendu? >> Thank you for asking the question, Jim, because that's exactly the challenge that we hear from our customers, especially from larger enterprises in financial services, banking and insurance. So our plan is, our next upcoming release at the end of the year is actually targeting very flexible deployment. Flexible deployment in the sense that, when you understand the data and create the business rules and set what kind of matching and enrichment you'll be performing on the data sets, you can actually have those business rules executed at the source of the data or in the data lake, or switch between the source and the enterprise data lake that you are creating. That flexibility is what we are targeting, that's one area. On the data curation side, we see these percentages, 80% of data stewards' time is spent on data prep, data curation and data cleansing, and it is actually really a very high percentage. From our customers we see this still being a challenge. One area that we started investing in is using machine learning to understand the data, and using the data discovery capabilities we currently have to make recommendations on what those business rules can be, or what kind of data validation and cleansing and matching might be required. So that's an area that we will be investing in. >> Are you contemplating, in terms of incorporating in your product portfolio, using machine learning to drive a sort of, the term I like to use is recommendation engine, that presents recommendations to the data stewards, human beings, about different data schemas or different ways of matching the data, different ways of, the optimal way of reconciling different versions of customer data. So is there going to be like a recommendation engine of that sort- >> It's going to be- >> In line with your- >> That's what our plan currently is, recommendations, so the users can opt to apply or not, or to modify them, because sometimes when you go too far with automation you still need some human intervention in making these decisions, because you might be operating on a sample of data versus the full data set, and you may actually have to infuse some human understanding and insight as well. So our plan is to make it as a recommendation in the first phase at least, that's what we are planning. And when we look at the portfolio of the products, our CEO Josh was actually also in theCUBE today, as part of Splunk .conf. 
We have acquisitions happening, we have organic innovation that's happening, and we really try to stay focused in terms of how do we create more value from your data, and how do we increase the business serviceability, whether it's with our Ironstream product, we made an announcement this week, Ironstream transaction tracing, to create more visibility into application performance and more visibility for IT operations, for example, when you make a payment with your mobile, you might be having a problem and you want to be able to trace it back to the back end, which is usually a legacy mainframe environment, or whether you are populating the data lake and you want to keep the data in sync and fresh with the data source and apply the changes as CDC, or whether you are taking that data from a raw data set to more consumable data by creating the trusted, high-quality data set. We are very much focused on creating more value and bigger insights out of the data sets. >> And Josh'll be on tomorrow, so folks watching, we're going to get the business perspective. I have some pointed questions I'm going to ask him, but I'll take one of the questions I was going to ask him, and I want to get your response from a technical perspective as CTO. As Syncsort continues your journey, you keep on adding more and more things, it's been quite impressive, you guys have done a great job- >> Tendu: Thank you. >> We enjoy covering the success there, watching you guys really evolve. What is the value proposition for Syncsort today, technically? If you go in and talk to a customer, a prospective new customer, why Syncsort, what's the enabling value that you're providing under the hood, technically, for customers?
>> When you mentioned Ironstream, that reminded me that one of the core things that we're seeing in Wikibon in terms of, IT operations is increasingly being automated through AI, some call it AI ops and whatnot, we're going deeper on the research there. Ironstream, by bringing mainframe and transactional data, like the use case you brought in was IT operations data, into a data lake alongside machine data that you might source from the internet of things and so forth. Seem to me that that's a great enabler potentially for Syncsort if it wished to play your solutions or position them into IT operations as an enabler, leveraging your machine learning investments to build more automated anomaly detection and remediation into your capabilities. What are your thoughts? Is that where you're going or do you see it as an opportunity, AI for IT ops, for Syncsort going forward? >> Absolutely. We target use cases around IT operations and application performance. We integrate with Splunk ITSI, and we also provide this data available in the big data analytics platforms. So those are really application performance and IT operations are the main uses cases we target, and as part of the advanced analytics platform, for example, we can correlate that data set with other machine data that's originating in other platforms in the enterprise. Nobody's looking at what's happening on mainframe or what's happening in my Hadoop cluster or what's happening on my VMware environment, right. They want to correlate the data that's closed platform, and that's one of the biggest values we bring, whether it's on the machine data, or on the application data. >> Yeah, that's quite a differentiator for you. >> Tendu, thanks for coming on theCUBE, great to see you. Congratulations on your success. Thanks for sharing. >> Thank you. >> Okay, CUBE coverage here in BigData NYC, exclusive coverage of our event, BigData NYC, in conjunction with Strata Hadoop right around the corner. This is our annual event for SiliconANGLE, and theCUBE and Wikibon. I'm John Furrier, with Jim Kobielus, who's our analyst at Wikibon on big data. Peter Burris has been on theCUBE, he's here as well. Big three days of wall-to-wall coverage on what's happening in the data world. This is theCUBE, thanks for watching, be right back with more after this short break.

Published Date : Sep 27 2017

