Search Results for Standard Chartered Bank:

Santhosh Mahendiran, Standard Chartered Bank | BigData NYC 2017


 

>> Announcer: Live, from Midtown Manhattan, it's theCUBE, covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat techno music) >> Okay welcome back, we're live here in New York City. It's theCUBE's presentation of Big Data NYC, our fifth year doing this event in conjunction with Strata Data, formerly Strata Hadoop, formerly Strata Conference, formerly Hadoop World, we've been there from the beginning. Eight years covering Hadoop's ecosystem now Big Data. This is theCUBE, I'm John Furrier. Our next guest is Santhosh Mahendiran, who is the global head of technology analytics at Standard Chartered Bank. A practitioner in the field, here getting the data, checking out the scene, giving a presentation on your journey with Data at a bank, which is big financial obviously an adopter. Welcome to theCUBE. >> Thank you very much. >> So we always want to know what the practitioners are doing because at the end of the day there's a lot of vendors selling stuff here, so you got, everyone's got their story. End of the day you got to implement. >> That's right. >> And one of the themes is the data democratization which sounds warm and fuzzy, collaborating with data, this is all good stuff and you feel good and you move into the future, but at the end of the day it's got to have business value. >> That's right. >> And as you look at that, how do you look at the business value? Cause you want to be in the bleeding edge, you want to provide value and get that edge operationally. >> That's right. >> Where's the value in data democratization? How did you guys roll this out? Share your story. >> Okay, so let me start with the journey first before I come to the value part of it, right? So, data democratization is an outcome, but the journey has been something we started three years back. So what did we do, right? So we had some guiding principles to start our journey. 
The first was to say that we believed in the three S's, which is speed, scale, and it should be really, really flexible and super fast. So one of the challenges that we had was our historical data warehouses was entirely becoming redundant. And why was it? Because it was RDBMS centric, and it was extremely disparate. So we weren't able to scale up to meet the demands of managing huge chunks of data. So, the first step that we did was to re-pivot it to say that okay, let's embrace Hadoop. And what you mean by embracing is just not putting in the data lake, but we said that all our data will land into the data lake. And this journey started in 2015, so we have close to 80% of the Bank's data in the lake and it is end of day data right now and this data flows in on daily basis, and we have consumers who feed off that data. Now coming to your question about-- >> So the data lake's working? >> The data lake is working, up and running. >> People like it, you just got a good spot, batch 'em all you throw everything in the lake. >> So it is not real time, it is end of day. There is some data that is real-time, but the data lake is not entirely real-time, that I have to tell you. But one part is that the data lake is working. Second part to your question is how do I actually monetize it? Are you getting some value out of it? But I think that's where tools like Paxata has actually enabled us to accelerate this journey. So we call it data democratization. So the best part it's not about having the data. We want the business users to actually use the data. Typically, data has always been either delayed or denied in most of the cases to end-users and we have end-users waiting for the data but they don't get access to the data. It was done because primarily the size of the data was too huge and it wasn't flexible enough to be shared with. So how did tools like Paxata and the data lake help us? 
So what we did with data democratization is basically to say that "hey we'll get end-users to access the data first in a fast manner, in a self-service manner, and something that gives operational assurance to the data, so you don't hold the data and then say that you're going to get a subset of data to play with. We'll give you the entire set of data and we'll give you the right tools which you can play with. Most importantly, from an IT perspective, we'll be able to govern it. So that's the key about democratization. It's not about just giving them a tool, giving them all data and then say "go figure it out." It's about ensuring that "okay, you've got the tools, you've got the data, but we'll also govern it," so that you obviously have control over what they're doing. >> So now you govern it, they don't have to get involved in the governance, they just have access? >> No they don't need to. Yeah, they have access. So governance works both ways. We establish the boundaries. Look at it as a referee, and then say that "okay, there are guidelines that you don't," and within the datasets that key people have access to, you can further set rules. Now, coming back to specific use cases, I can talk about two specific cases which actually helped us to move the needle. The first is on stress testing, so being a financial institution, we typically have to report various numbers to our regulators, etc. The turnaround time was extremely huge. These kind of stress testing typically involve taking huge amount-- >> What were some of the turnaround times? >> Normally it was two to three weeks, some cases a month-- >> Wow. >> So we were able to narrow it down to days, but what we essentially did was as with any stress testing or reporting, it involved taking huge amounts of data, crunching them and then running some models and then showing the output, basically a number of transformations involved. 
Earlier, you first couldn't access the entire dataset, so that we solved-- >> So check, that was a good step one-- >> That was step one. >> But was there automation involved in that, the Paxata piece? >> Yeah, I wouldn't say it was fully automated end-to-end, but there was definitely automation given the fact that now you got Paxata to work off the data rather than someone extracting the data and then going off and figuring what needs to be done. The ability to work off the entire dataset was a big plus. So stress testing, bringing down the cycle time. The second use case I can talk about is again anti-money laundering, in our financial crime compliance space. We had processes that took time to report, given the clunkiness in the various handoffs that we needed to do. But again, empowering the users, giving the tool to them and then saying "hey, this"-- >> How about know your user, because for anti-money laundering you have to know your user base, that's all set there too? >> Yeah. So the good part is know the user, know your customer, KYC, all that part is set, but the key part is making sure the end-users are able to access the data much earlier in the life cycle and are able to play with it. In the case of anti-money laundering, again what was first a question of three weeks to four weeks was shortened down to a question of days by giving tools like Paxata again in a structured manner and with which we're able to govern. >> You control this, so you knew what you were doing, but you let their tools do the job? >> Correct, so look at it this way. Typically, the data journey has always been IT-led. It has never been business-led. If you look at the generations of what happens is, you source the data which is IT-led, then you model the data which is IT-led, then you prepare then massage the data which is again IT-led and then you have tools on top of it which is again IT-led so the end-users get it only after the fourth stage. 
Now look at the generations within. All these life cycles apart from the fact that you source the data which is typically an IT issue, the rest need to be done by the actual business users and that's what we did. That's the progression of the generations in which we're now in the third generation as I call it where our role is just to source the data and then say, "yeah we'll govern it in the matter and then preparation-- >> It's really an operating system and we were talking with Aaron, Alation's co-founder, we used the analogy of a car, how this show was like a car show engine show, what's in the engine and the technology and then it evolved every year, now it's like we're talking about the cars, now we're talking about driver experience-- >> That's right. >> At the end of the day, you just want to drive. You don't really care what's under the hood, you do but you don't, but there's those people who do care what's under the hood, so you can have best of both worlds. You've got the engines, you set up the infrastructure, but ultimately, you in the business side, you just want to drive, that's what you're getting at? >> That's right. The time-to-market and speed to empower the users to play around with the data rather than IT trying to churn the data and confine access to data, that's a thing of the past. So we want more users to have faster access to data but at the same time govern it in a seamless manner. The word governance is still important because it's not about just giving the data. >> And seamless is key. >> Seamless is key. >> Cause if you have democratization of data, you're implying that it is community-oriented, means that it's available, with access privileges all transparently or abstracted away from the users. >> Absolutely. >> So here's the question I want to ask you. There's been talk, I've been saying it for years going back to 2012 that an abstraction layer, a data layer will evolve and that'll be the real key. 
And then here in this show, I heard things like intelligent information fabric that is business, consumer-friendly. Okay, it's a mouthful, but intelligent information fabric in essence talks about an abstraction layer-- >> That's right. >> That doesn't really compromise anything but gives some enablement, creates some enabling value-- >> That's right. >> For software, how do you see that? >> As the word suggests, the earlier model was trying to build something for the end-users, but not which was end-user friendly, meaning to say, let me just give you a simple example. You had a data model that existed. Historically the way that we have approached using data is to say "hey, I've got a model and then let's fit that data into this model," without actually saying that "does this model actually serve the purpose?" You abstracted the model to a higher level. The whole point about intelligent data is about saying that, I'll give you a very simple analogy. Take zip code. Zipcode in US is very different from zipcode in India, it's very different from zipcode in Singapore. So if I had the ability for my data to come in, to say that "I know it's a zipcode, but this zipcode belongs to US, this zipcode belongs to Singapore, and this zipcode belongs to India," and more importantly, if I can further rev it up a notch, if I say that "this belongs to India, and this zipcode is valid." Look at where I'm going with intelligent sense. So that's what's up. If you look at the earlier model, you have to say that "yeah, this is a placeholder for zipcode." Now that makes sense, but what are you doing with it? >> Being a relational database model, it's just a field in a schema, you're taking it and abstracting it and creating value out of it. >> Precisely. So what I'm actually doing is accelerating the adoption, I'm making it more simpler for users to understand what the data is. So I don't need to as a user figure out "I got a zipcode, now is it a Singapore, India or what zipcode." 
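The zip code analogy above can be sketched in code. What follows is only a minimal illustration of the idea, not anything described in the interview or part of Paxata: the names `POSTAL_PATTERNS`, `candidate_countries`, and `annotate` are hypothetical, and the patterns check format only (five digits or ZIP+4 for the US, six digits with no leading zero for an Indian PIN, six digits for Singapore). Real validation would need authoritative postal data.

```python
import re

# Illustrative format rules only (assumption): these check shape, not existence.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),   # ZIP or ZIP+4
    "India": re.compile(r"[1-9]\d{5}"),    # 6-digit PIN, no leading zero
    "Singapore": re.compile(r"\d{6}"),     # 6-digit postal code
}


def candidate_countries(code):
    """Return every country whose postal-code format the value matches."""
    value = code.strip()
    return [
        country
        for country, pattern in POSTAL_PATTERNS.items()
        if pattern.fullmatch(value)
    ]


def annotate(code, country_hint=None):
    """Attach meaning to a raw field: which country, and is the format valid?"""
    candidates = candidate_countries(code)
    if country_hint is not None:
        # A hint (for example, the customer's country column) settles the country;
        # the pattern then only answers whether the format is valid for it.
        return {
            "value": code,
            "country": country_hint,
            "valid_format": country_hint in candidates,
        }
    # Without a hint, a 6-digit value can be ambiguous (India or Singapore).
    country = candidates[0] if len(candidates) == 1 else None
    return {"value": code, "country": country, "valid_format": bool(candidates)}


if __name__ == "__main__":
    for raw in ["10001", "560001", "018956", "ABC12"]:
        print(raw, candidate_countries(raw))
```

Format alone cannot always decide the country: a six-digit value without a leading zero fits both the Indian and the Singapore pattern, which is why the sketch accepts an optional `country_hint` taken from other columns in the same record.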
>> So all this automation, Paxata's got a good system, we'll come back to the Paxata question in a second, I do want to drill down on that. But the big thing that I've been seeing at the show, and again Dave Vellante, my partner, co-CEO of SiliconANGLE, we always talk about this all the time. He's more less bullish on Hadoop than I am. Although I love Hadoop, I think it's great but it's not the end-all, be-all. It's a great use case. We were critical early on and the thing we were critical on it was it was too much time being spent on the engine and how things are built, not on the business value. So there's like a lull period in the business where it was just too costly-- >> That's right. >> Total cost of ownership was a huge, huge problem. >> That's right. >> So now today, how did you deal with that and are you measuring the TCO or total cost of ownership cause at the end of the day, time to value, which is can you be up and running in 90 days with value and can you continue to do that, and then what's the overall cost to get there. Thoughts? >> So look I think TCO always underpins any technology investment. If someone said I'm doing a technology investment without thinking about TCO, I don't think he's a good technology leader, so TCO is obviously a driving factor. But TCO has multiple components. One is the TCO of the solution. The other aspect is TCO of what value I'm going to get out of this system. So talking from an implementation perspective, what I look at as TCO is my whole ecosystem which is my hardware, software, so you spoke about Hadoop, you spoke about RDBMS, is Hadoop cheaper, etc? I don't want to get into that debate of cheaper or not but what I know is the ecosystem is becoming much, much cheaper than before. And when I talk about ecosystem, I'm talking about RDBMS tools, I'm talking about Hadoop, I'm talking about BI tools, I'm talking about governance, I'm talking about this whole framework becoming cheaper. 
And it is also underpinned by the fact that hardware is also becoming cheaper. So the reality is all components in the whole ecosystem are becoming cheaper and given the fact that software is also becoming more open-sourced and people are open to using open-source software, I think the whole question of TCO becomes a much more pertinent question. Now coming to your point, do you measure it regularly? I think the honest answer is I don't think we are doing a good job of measuring it that well, but we do have that as one of the criteria for us to actually measure the success of our project. The way that we do is our implementation cost, at the time of writing out our PEDs, we call it PEDs, which is the Project Execution Document, we talk about cost. We say that "what's the implementation cost?" What are the business cases that are going to be an outcome of this? I'll give you an example of our anti-money laundering. I told you we reduced our cycle time from a few weeks to a few days, and that in turn means the number of people involved in this whole process, you're reducing the overheads and the operational folks involved in it. That itself tells you how much we're able to save. So definitely, TCO is there and to say that-- >> And you are mindful of, it's what you look at, it's key. TCO is on your radar 100% you evaluate that into your deals? >> Yes, we do. >> So Paxata, what's so great about Paxata? Obviously you've had success with them. You're a customer, what's the deal. Was it the tech, was it the automation, the team? What was the key thing that got you engaged with them or specifically why Paxata? >> Look, I think the key to partnership there cannot be one ingredient that makes a partnership successful, I think there are multiple ingredients that make a partnership successful. We were one of the earliest adopters of Paxata. 
Given that we're a bank and we have multiple different systems and we have a lot of manual processing involved, we saw Paxata as a good fit to govern these processes and ensure at the same time, users don't lose their experience. The good thing about Paxata that we like was obviously the simplicity and the look and feel of the tool. That's number one. Simplicity was a big point. The second one is about scale. The scale, the fact that it can take in millions of rows, it's not about just working off a sample of data. It can work on the entire dataset. That's very key for us. The third is to leverage our ecosystem, so it's not about saying "okay you give me this data, let me go figure out what to do and then," so Paxata works off the data lake. The fact that it can leverage the lake that we built, the fact that it's a simple and self-preparation tool which doesn't require a lot of time to bootstrap, so end-use people like you-- >> So it makes it usable. >> It's extremely user-friendly and usable in a very short period of time. >> And that helped with the journey? >> That really helped with the journey. >> Santhosh, thanks so much for sharing. Santhosh Mahendiran, who is the Global Tech Lead at the Analytics of the Bank at Standard Chartered Bank. Again, financial services, always a great early adopter, and you get success under your belt, congratulations. Data democratization is huge and again, it's an ecosystem, you got all that anti-money laundering to figure out, you got to get those reports out, lot of heavy lifting? >> That's right. >> So thanks so much for sharing your story. >> Thank you very much. >> We'll give you more coverage after this short break, I'm John Furrier, stay tuned. More live coverage in New York City, it's theCUBE.

Published Date : Sep 29 2017


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Alonte | PERSON | 0.99+
Standard Chartered Bank | ORGANIZATION | 0.99+
three weeks | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
New York City | LOCATION | 0.99+
2012 | DATE | 0.99+
2015 | DATE | 0.99+
Santosh Mahendiran | PERSON | 0.99+
two | QUANTITY | 0.99+
Aaron | PERSON | 0.99+
US | LOCATION | 0.99+
Santhosh Mahendiran | PERSON | 0.99+
Singapore | LOCATION | 0.99+
Santosh | PERSON | 0.99+
four weeks | QUANTITY | 0.99+
TCO | ORGANIZATION | 0.99+
100% | QUANTITY | 0.99+
90 days | QUANTITY | 0.99+
India | LOCATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
fifth year | QUANTITY | 0.99+
today | DATE | 0.99+
Midtown Manhattan | LOCATION | 0.99+
Paxata | ORGANIZATION | 0.99+
one ingredient | QUANTITY | 0.99+
third | QUANTITY | 0.99+
theCUBE | ORGANIZATION | 0.99+
one part | QUANTITY | 0.99+
millions | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Eight years | QUANTITY | 0.99+
Silicon Angle | ORGANIZATION | 0.99+
Second part | QUANTITY | 0.98+
third generation | QUANTITY | 0.98+
fourth stage | QUANTITY | 0.98+
two specific cases | QUANTITY | 0.98+
both ways | QUANTITY | 0.98+
one | QUANTITY | 0.98+
BigData | ORGANIZATION | 0.98+
NYC | LOCATION | 0.98+
both worlds | QUANTITY | 0.98+
first step | QUANTITY | 0.97+
three years back | DATE | 0.97+
second one | QUANTITY | 0.97+
One | QUANTITY | 0.97+
2017 | DATE | 0.96+
Hadoop | TITLE | 0.96+
Strata Data | ORGANIZATION | 0.96+
Strata Hadoop | ORGANIZATION | 0.94+
step one | QUANTITY | 0.94+
first question | QUANTITY | 0.93+
a month | QUANTITY | 0.92+
Elation | ORGANIZATION | 0.9+
Data | EVENT | 0.89+
2017 | EVENT | 0.89+
80% | QUANTITY | 0.88+
Paxata | TITLE | 0.88+
Big Data | EVENT | 0.84+
theCube | ORGANIZATION | 0.83+

Prakash Nanduri, Paxata | Corinium Chief Analytics Officer Spring 2018


 

(techno music) >> Announcer: From the Corinium Chief Analytics Officer Conference Spring San Francisco. It's theCUBE. >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're in downtown San Francisco at the Parc 55 Hotel at the Corinium Chief Analytics Officer Spring 2018 event, about 100 people, pretty intimate affair. A lot of practitioners here talking about the challenges of Big Data and the challenges of Analytics. We're really excited to have a very special Cube guest. I think he was the first guy to launch his company on theCUBE. It was Big Data New York City 2013. I remember it distinctly. It's Prakash Nanduri, the co-founder and CEO of Paxata. Great to see you. >> Great seeing you. Thank you for having me back. >> Absolutely. You know we got so much mileage out of that clip. We put it on all of our promotional materials. You going to launch your company? Launch your company on theCUBE. >> You know it seems just like yesterday but it's been a long ride and it's been a fantastic ride. >> So give us just a quick general update on the company, where you guys are now, how things are going. >> Things are going fantastic. We continue to grow. If you recall, when we launched, we launched the whole notion of democratization of information in the enterprise with self service data prep. We have gone on to now deliver real value to some of the largest brands in the world. We're very proud that 2017 was the year when massive adoption of Paxata's adaptive information platform took place across multiple industries, financial services, retail, CPG, high tech, in the IoT space. So, we just keep growing and it's the usual challenges of managing growth and managing, you know, the change in the company as you grow from being a small start-up to now being a real company. >> Right, right. There's good problems and bad problems. Those are the good problems. >> Yes, yes. 
>> So, you know, we do so many shows and there's two big themes over and over and over like digital transformation which gets way over used and then innovation and how do you find a culture of innovation. In doing literally thousands of these interviews, to me it seems pretty simple. It is about democratization. If you give more people the data, more people the tools to work with the data, and more people the power to do something once they find something in the data, and open that up to a broader set of people, they're going to find innovations, simply the fact of doing it. But the reality is those three simple steps aren't necessarily very easy to execute. >> You're spot on, you're spot on. I like to say that when we talk about digital transformation the real focus should be on the D. And it really centers around data and it centers around the whole notion of democratization, right? The challenge always in large enterprises is democratization without governance becomes chaos. And we always need to focus on democratization. We need to focus on data because as we all know data is the new oil, all of that, and governance becomes a critical piece too. But as you recall, when we launched Paxata, the entire vision from day one has been while the entire focus around digitization covers many things right? It covers people processes. It covers applications. It's a very large topic, the whole digital transformation of enterprise. But the core foundation to digital transformation is data, democratization, governance, but the key issue is the companies that are going to succeed are the companies that turn data into information that's relevant for every digital transformation effort. >> Right, right. >> Because if you do not turn raw data into information, you're just dealing with raw data which is not useful >> Jeff: Right >> And it will not be democratized. 
>> Jeff: Right >> Because the business will only consume the information that is contextual to their need, the information that's complete and the information that is clean. >> Right, right. >> So that's really what we're driving towards. >> And that's interesting 'cause the data, there's so many more sources of data, right? There's data that you control. There's structured data, unstructured data. You know, I used to joke, just the first question when you'd ask people "Where's your data?", half the time they couldn't even, they couldn't even get beyond that step. And that's before you start talking about cleaning it and making it ready and making it available. Before you even start to get into governance and rights and access so it's a really complicated puzzle to solve on the backend. >> I think it starts with first focusing on what are the business outcomes we are driving with digital transformation. When you double-click on digital transformation and then you start focusing on data and information, there's a few things that come to fore. First of all, how do I leverage information to improve productivity in my company? There's multiple areas, whether it is marketing or supply chain or whatever. The second notion is how do I ensure that I can actually transform the culture in my company and attract the brightest and the best by giving them the environment where democratization of information is actually reality, where people feel like they're empowered to access data and turn it into information and then be able to do really interesting things. Because people are not interested in being subservient to somebody who gives them the data. They want to be saying "Give it to me. "I'm smart enough. "I know analytics. "I think analytically and I want to drive my career forward." So the second thing is the cultural aspect to it. 
And the last thing, which is really important is every company, regardless of whether you're making toothpicks or turbines, you are looking to monetize data. So it's about productivity. It's about cultural change and attracting of talent. And it's about monetization. And when it comes to monetization of data, you cannot be satisfied with only covering enterprise data which is sitting in my enterprise systems. You have to be able to focus on, oh, how can I leverage the IoT data that's being generated from my products or widgets. How can I leverage social and mobile? How can I consume that? How can I bring all of this together and get the most complete insight that I need for my decision-making process? >> Right. So, I'm just curious, how do you see it in your customers? So this is the chief analytics officer, we go to chief data officer, I mean, there's all these chief something officers that want to get involved in data and marketing is much more involved with it. Forget about manufacturing. So when you see successful cultural change, what drives that? Who are the people that are successful and what is the secret to driving the cultural change that we are going to be data-driven, we are going to give you the tools, we are going to make the investment to turn data which historically was even arguably a liability 'cause you had to buy a bunch of servers to stick it on, into that now being an asset that drives actionable outcomes? >> You know, recently I was having this exact discussion with the CEO of one of the largest financial institutions in the world. This gentleman is running a very large financial services firm, is dealing with all the potential disruption where they're seeing completely new type of fintech products coming in, the whole notion of blockchain et cetera coming in. Everything is changing. Everything looks very dramatic. And what we started talking about is the first thing as the CEO that we always focus on is do we have the right people? 
And do we have the people that are motivated and driven to basically go and disrupt and change? For those people, you need to be able to give them the right kind of tools, the right kind of environment to empower them. This doesn't start with lip service. It doesn't start about us saying "We're going to be on a digital transformation journey" but at the same time, your data is completely in silos. It's locked up. There are 15,000 checks and balances before I can even access a simple piece of data and third, even when I get access to it, it's too little, too late or it's garbage in, garbage out. And that's not the culture. So first, it needs to be CEO-driven, top down. We are going to go through digital transformation which means we are going to go through a democratization effort which means we are going to look at data and information as an asset and that means we are not only going to be able to harness these assets, but we're also going to monetize these assets. How are we going to do it? It depends very much on the business you're in, the vertical industry you play in, and your strengths and weaknesses. So each company has to look at it from their perspective. There's no one size fits all for everyone. >> Jeff: Right. >> There are some companies that have fantastic cultures of empowerment and openness but they may not have the right innovation or the right kind of product innovation skills in place. So it's about looking at data across the board. First from your culture and your empowerment, second about democratization of information which is where a company like Paxata comes in, and third, along with democratization, you have to focus on governance because we are for-profit companies. We have a fiduciary responsibility to our customers and our regulators and therefore we cannot have democratization without governance. >> Right, right >> And that's really what our biggest differentiation is. 
>> And then what about just in terms of the political play inside the company. You know, on one hand, used to be if you held the information, you had the power. And now that's changed really 'cause there's so much information. It's really, if you are the conduit of information to help people make better decisions, that's actually a better position to be. But I'm sure there's got to be some conflicts going through digital transformation where I, you know, I was the keeper of the kingdom and now you want to open that up. Conversely, it must just be transformational for the people on the front lines that finally get the data that they've been looking for to run the analysis that they want to rather than waiting for the weekly reports to come down from on high. >> You bet. You know what I like to say is that if you've been in a company for 10, 15 years and if you felt like a particular aspect, purely selfishly, you felt a particular aspect was job security, that is exactly what's going to likely make you lose your job today. What you thought 10 years ago was your job security, that's exactly what's going to make you lose your job today. So if you do not disrupt yourself, somebody else will. So it's either transform yourself or not. Now this whole notion of politics and you know, struggle within the company, it's been there for as long as, humans generally go towards entropy. So, if you have three humans, you have all sort of issues. >> Jeff: Right, right. >> The issue starts frankly with leadership. It starts with the CEO coming down and not only putting an edict down on how things will be done but actually walking the walk with talking the talk. If, as a CEO, you're not transparent, if you're not trusting your people, if you're not sharing information which could be confidential, but you mention that it's confidential but you have to keep this confidential. If you trust your people, you give them the ability to, I think it's a culture change thing. 
And the second thing is incentivisation. You have to be able to focus on giving people the ability to say "by sharing my data, I actually become a hero." >> Right, right. >> By giving them the actual credit for actually delivering the data to achieve an outcome. And that takes a lot of work. But if you do not actually drive the cultural change, you will not drive the digital transformation and you will not drive the democratization of information. >> And have you seen people try to do it without making the commitment? Have you seen 'em pay the lip service, spend a few bucks, start a project but then ultimately they, they hamstring themselves 'cause they're not actually behind it? >> Look, I mean, there's many instances where companies start on digital transformation or they start jumping into cool terms like AI or machine-learning, and there's a small group of people who are kind of the elites that go in and do this. And they're given all the kind of attention et cetera. Two things happen. Because these people who are quote, unquote, the elite team, either they are smart but they're not able to scale across the organization or many times, they're so good, they leave. So that transformation doesn't really get democratized. So it is really important from day one to start a culture where you're not going to have a small group of exclusive data scientists. You can have those people but you need to have a broader democratization focus. So what I have seen is many of the siloed, small, tight, mini science projects end up failing. They fail because number one, either the business outcome is not clearly identified early on or two, it's not scalable across the enterprise. >> Jeff: Right. >> And a majority of these exercises fail because the whole information foundation that is taking raw data turning it into clean, complete, potential consumable information, to feed across the organization, not just for one siloed group, not just one data science team.
But how do you do that across the company? That's what you need to think from day one. When you do these siloed things, these departmental things, a lot of times they can fail. Now, it's important to say "I will start with a couple of test cases" >> Jeff: Right, right. >> "But I'm going to expand it across from the beginning to think through that." >> So I'm just curious, your perspective, is there some departments that are the ripest for being that leading edge of the digital transformation in terms of, they've got the data, they've got the right attitude, they're just a short step away. Where have you seen the great place to succeed when you're starting on kind of a smaller PLC, I don't know if you'd say PLC, project or department level? >> So, it's funny but you will hear this, it's not rocket science. Always they say, follow the money. So, in a business, there are three incentives, making more money, saving money, or staying out of jail. (laughs) >> Those are good. I don't know if I'd put them in that order but >> Exactly, and you know what? Depending on who you are, you may have a different order but staying out of jail is pretty high on my list. >> Jeff: I'm with you on that one. >> So, what are the ambiants? Risk and compliance. Right? >> Jeff: Right, right. >> That's one of those things where you absolutely have to deliver. You absolutely have to do it. It's significantly high cost. It's very data and analytic centric and if you find a smart way to do it, you can dramatically reduce your cost. You can significantly increase your quality and you can significantly increase the volume of your insights and your reporting, thereby achieving all the risk and compliance requirements but doing it in a smarter way and a less expensive way. >> Right. >> That's where incentives have really been high. Second, in making money, it always comes down to sales and marketing and customer success. Those are the three things, sales, marketing, and customer success.
So most of our customers who have been widely successful, are the ones who have basically been able to go and say "You know what? It used to take us eight months to be able to even figure out a customer list for a particular region. Now it takes us two days because of Paxata and because of the data prep capabilities and the governance aspects." That's the power that you can deliver today. And when you see one person who's a line of business person who says "Oh my God. What used to take me eight months, now it's done in half a day". Or "What used to take me 22 days to create a report, is now done in 45 minutes." All of a sudden, you will not have a small kind of trickle down, you will have a tsunami of democratization with governance. That's what we've seen in our customers. >> Right, right. I love it. And this is just so classic too. I always like to joke, you know, back in the day, you would run your business based on reports from old data. Now we want to run your business with stuff you can actually take action on now. >> Exactly. I mean, this is public, Shameek Kundu, the chief data officer of Standard Chartered Bank and Michael Gorriz who's the global CIO of Standard Chartered Bank, they have embraced the notion that information democratization in the bank is a foundational element to the digital transformation of Standard Chartered. They are very forward thinking and they're looking at how do I democratize information for all our 87,500 employees while we maintain governance? And another major thing that they are looking at is they know that the data that they need to manipulate and turn into information is not sitting only on premise. >> Right, right. >> It's sitting across a multi-cloud world and that's why they've embraced the Paxata information platform to be their information fabric for a multi-cloud hybrid world. And this is where we see successes and we're seeing more and more of this, because it starts with the people.
It starts with the line of business outcomes and then it starts with looking at it from scale. >> Alright, Prakash, well always great to catch up and enjoy really watching the success of the company grow since you launched it many moons ago in New York City. >> Yes. Fantastic. Always a pleasure to come back here. Thank you so much. >> Alright. Thank you. He's Prakash, I'm Jeff Frick. You're watching theCUBE from downtown San Francisco. Thanks for watching. (techno music)

Published Date : May 17 2018


Prakash Nanduri, Paxata | BigData NYC 2017


 

>> Announcer: Live from midtown Manhattan, it's theCUBE covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat techno music) >> Hey, welcome back, everyone. Here live in New York City, this is theCUBE from SiliconANGLE Media Special. Exclusive coverage of the Big Data World at NYC. We call it Big Data NYC in conjunction also with Strata Hadoop, Strata Data, Hadoop World all going on kind of around the corner from our event here on 37th Street in Manhattan. I'm John Furrier, the co-host of theCUBE with Peter Burris, Head of Research at SiliconANGLE Media, and General Manager of WikiBon Research. And our next guest is one of our famous CUBE alumni, Prakash Nanduri, co-founder and CEO of Paxata, who launched his company here on theCUBE at our first inaugural Big Data NYC event in 2013. Great to see you. >> Great to see you, John. >> John: Great to have you back. You've been on every year since, and it's been the lucky charm. You guys have been doing great. It's not broke, don't fix it, right? And so theCUBE is working with you guys. We love having you on. It's been a pleasure, you as an entrepreneur, launching your company. Really, the entrepreneurial mojo. It's really what it's all about. Getting access to the market, you guys got in there, and you got a position. Give us the update on Paxata. What's happening? >> Awesome, John and Peter. Great to be here again. Every time I come here to New York for Strata I always look forward to our conversations. And every year we have something exciting and new to share with you. So, if you recall in 2013, it was a tiny little show, and it was a tiny little company, and we came in with big plans. And in 2013, I said, "You know, John, we're going to completely disrupt the way business consumers and business analysts turn raw data into information and they do self-service data preparation." That's what we brought to the market in 2013.
Ever since, we have gone on to do something really exciting and new for our customers every year. In '14, we came in with the first Apache Spark-based platform that allowed business analysts to do data preparation at scale interactively. Every year since, last year we did enterprise grade and we talked about how Paxata is going to be delivering our self-service data preparation solution in a highly-scalable enterprise grade deployment world. This year, what's super exciting is, in addition to the recent announcements we made on Paxata running natively on the Microsoft Azure HDI Spark system, we are truly now the only information platform that allows business consumers to turn data into information in a multi-cloud hybrid world for our enterprise customers. In the last few years, I came and I talked to you and I told you about work we're doing and what great things are happening. But this year, in addition to the super-exciting announcements with Microsoft and other exciting announcements that you'll be hearing, you are going to hear directly from one of our key anchor customers, Standard Chartered Bank. 150-year-old institution operating in over 46 countries. One of the most storied banks in the world with 87,500 employees. >> John: That's not a startup. >> That's not a startup. (John laughs) >> They probably have a high bar, high bar. They got a lot of data. >> They have lots of data. And they have chosen Paxata as their information fabric. We announced our strategic partnership with them recently and you know that they are going to be speaking on theCUBE this week. And what started as a little experiment, just like our experiment in 2013, has actually mushroomed now into Michael Gorriz, and Shameek Kundu, and the entire leadership of Standard Chartered choosing Paxata as the platform that will democratize information in the bank across their 87,500 employees. We are going in a very exciting way, a very fast way, and now delivering real value to the bank.
And you can hear all about it on our website-- >> Well, he's coming on theCUBE so we'll drill down on that, but banks are changing. You talk about a transformation. What is a teller? An Internet of Things device. The watch potentially could be a terminal. So, the Internet of Things of people changes the game. Are the ATMs going to go away and become like broadcast points? >> Prakash: And you're absolutely right. And really what it is about is, it doesn't matter if you're a Standard Chartered Bank or if you're a pharma company or if you're the leading healthcare company, what it is is that every one of our customers is really becoming an information-inspired business. And what we are driving our customers to is moving from a world where they're data-driven. I think being data-driven is fine. But what you need to be is information-inspired. And what does that mean? It means that you need to be able to consume data, regardless of format, regardless of source, regardless of where it's coming from, and turn it into information that actually allows you to get insights and decisions. And that's what Paxata does for you. So, this whole notion of being information-inspired, I don't care if you're a bank, if you're a car company, or if you're a healthcare company today, you need to have-- >> Prakash, for the folks watching that might not know our history as you launched on theCUBE in 2013 and have been successful every year since. You guys are really deploying the classic entrepreneurial success formula, be fast, walk the talk, listen to customers, add value. Take a minute quickly just to talk about what you guys do. Just for the folks that don't know you. >> Absolutely, let's just actually give it in the real example of you know, a customer like Standard Chartered. Standard Chartered operates in multiple countries. They have significant number of lines of businesses.
And whether it's in risk and compliance, whether it is in their marketing department, whether it's in their corporate banking business, what they have to do is, a simple example could be I want to create a customer list to be able to go and run a marketing campaign. And the customer list in a particular region is not something easy for a bank like Standard Chartered to come up with. They need to be able to pull from multiple sources. They need to be able to clean the data. They need to be able to shape the data to get that list. And if you look at what is really important, the people who understand the data are actually not the folks in IT but the folks in business. So, they need to have a tool and a platform that allows them to pull data from multiple sources to be able to massage it, to be able to clean it-- >> John: So, you sell to the business person? >> We sell to the business consumer. The business analyst is our consumer. And the person who supports them is the chief data officer and the person who runs the Paxata platform on their data lake infrastructure. >> So, IT sets the data lake and you guys just let the business guys go to town on the data. >> Prakash: Bingo. >> Okay, what's the problem that you solve? If you can summarize the problem that you solve for the customers, what is it? >> We take data and turn it into information that is clean, that's complete, that's consumable and that's contextual. The hardest problem in every analytical exercise is actually taking data and cleaning it up and getting it ready for analytics. That's what we do. >> It's the prep work. >> It's the prep work. >> As companies gain experience with Big Data, John, what they need to start doing increasingly is move more of the prep work or have more of the prep work flow closer to the analyst. And the reason's actually pretty simple. It's because of that context.
Because the analyst knows more about what they're looking for and is a better evaluator of whether or not they get what they need. Otherwise, you end up in this strange cycle time problem between people in back end that are trying to generate the data that they think they want. And so, by making the whole concept of data preparation simpler, more straight forward, you're able to have the people who actually consume the data and need it do a better job of articulating what they need, how they need it and making it presentable to the work that they're performing. >> Exactly, Peter. What does that say about how roles are starting to merge together? Cause you've got to be at the vanguard of seeing how some of these mature organizations are working. What do you think? Are we seeing roles start to become more aligned? >> Yes, I do think. So, first and foremost, I think what's happening is there is no such thing as having just one group that's doing data science and another group consuming. I think what you're going to be going into is the world of data and information is all-consuming and that's everybody's role. Everybody has a role in that. And everybody's going to consume. So, if you look at a business analyst that was spending 80% of their time living in Excel or working with self-service BI tools like our partners Tableau and Power BI from Microsoft, others. What you find is these people today are living in a world where either they have to live in coding scripting world hell or they have to rely on IT to get them the real data. So, the role of a business analyst or a subject matter expert, first and foremost, the fact that they work with data and they need information that's a given. There is no business role today where you can't deal with data. >> But it also makes them real valuable, because there aren't a lot of people who are good at dealing with data.
And they're very, very reliant on these people to turn that data into something that is regarded as consumable elsewhere. So, you're trying to make them much more productive. >> Exactly. So, four years ago, when we launched on theCUBE, the whole premise was that in order to be able to really drive towards a world where you can make information and data-driven decisions, you need to ensure that the business analyst community, or what I like to call the business consumer needs to have the power of being able to, A, get access to data, B, make sense of the data, and then turn that data into something that's valuable for her or for him. >> Peter: And others. >> And others, and others. Absolutely. And that's what Paxata is doing. In a collaborative, in a 21st Century world where I don't work in a silo, I work collaboratively. And then the tool, and the platform that helps me do that is actually a 21st Century platform. >> So, John, at the beginning of the session you and Jim were talking about what is going to be one of the themes here at the show. And we observed that it used to be that people were talking about setting up the hardware, setting up the clusters, getting Hadoop to work, and Jim talked about going up the stack. Well, this is one of the indicators that, in fact, people were starting to go up the stack because they're starting to worry more about the data, what it can do, the value of how it's going to be used, and how we distribute more of that work so that we get more people using data that's actually good and useful to the business. >> John: And drives value. >> And drives value. >> Absolutely. And if I may, just put a chronological aspect to this. When we launched the company we said the business analyst needs to be in charge of the data and turning the data into something useful. Then right at that time, the world of creating data lakes came in thanks to our partners like Cloudera and Hortonworks, and others, and MapR and others.
In the recent past, the world of moving from on premise data lakes to hybrid, multicloud data lakes is becoming reality. Our partners at Microsoft, at AWS, and others are having customers come in and build cloud-based data lakes. So, today what you're seeing is on one hand this complete democratization within the business, like at Standard Chartered, where all these business analysts are getting access to data. And on the other hand, from the data infrastructure moving into a hybrid multicloud world. And what you need is a 21st Century information management platform that serves the need of the business and to make that data relevant and information and ready for their consumption. While at the same time we should not forget that enterprises need governance. They need lineage. They need scale. They need to be able to move things around depending on what their business needs are. And that's what Paxata is driving. That's why we're so excited about our partnership with Microsoft, with AWS, with our customer partnerships such as Standard Chartered Bank, rolling this out in an enterprise-- >> This is a democratization that you were referring to with your customers. We see this-- >> Everywhere. >> When you free the data up, good things happen but you don't want to have IT be the constraint, you want to let them enable-- >> Peter: And IT doesn't want to be the constraint. >> They don't. >> This is one of the biggest problems that they have on a daily basis. >> They're happy to let it go free as long as it's in their mind DevOps-like related, this is cool for them. >> Well, they're happy to let it go with policy and security in place. >> Our customers, our most strategic customers, the folks who are running the data lakes, the folks who are managing the data lakes, they are the first ones that say that we want business to be able to access this data, and to be able to go and make use out of this data in the right way for the bank.
And not have us be the impediment, not have us be the roadblock. While at the same time we still need governance. We still need security. We still need all those things that are important for a bank or a large enterprise. That's what Paxata is delivering to the customers. >> John: So, what's next? >> Peter: Oh, I'm sorry. >> So, really quickly. An interesting observation. People talk about data being the new fuel of business. That really doesn't work because, as Bill Schmarzo says, it's not the new fuel of business, it's the new sunlight of business. And the reason why is because fuel can only be used once. >> Prakash: That's right. >> The whole point of data is that it can be used a lot, in a lot of different ways, and a lot of different contexts. And so, in many respects what we're really trying to facilitate or if someone who runs a data lake when someone in the business asks them, "Well, how do you create value for the business?" The more people, the more users, the more context that they're serving out of that common data, the more valuable the resource that they're administering. So, they want to see more utilization, more contexts, more data being moved out. But again, governance, security have to be in place. >> You bet, you bet. And using that analogy of data, and I've heard this term about data being the new oil, etc. Well, if data is the oil, information is really the refined fuel or sunlight as we like to call it. >> Peter: Yeah. >> John: Well, you're riffing on semantics, but the point is it's not a one trick pony. Data is part of the development, I wrote a blog post in 1997, I mean 2007 that said data's the new development kit. And it was kind of riffing on this notion of the old days >> Prakash: You bet. >> Here's your development kit, SDK, or whatever was how people did things back then. Enter the cloud, >> Prakash: That's right. >> And boom, there it is. The data now is in the process of the refinery the developers wanted.
The developers want the data libraries. Whatever that means. That's where I see it. And that is the democratization where data is available to be integrated in to apps, into feeds, into ... >> Exactly, and so it brings me to our point about what was the exciting, new product innovation announcement we made today about Intelligent Ingest. You want to be able to access data in the enterprise regardless of where it is, regardless of the cloud where it's sitting, regardless of whether it's on-premise, in the cloud. You don't need to as a business worry about whether that is a JSON file or whether that's an XML file or that's a relational file. That's irrelevant. What you want is, do I have the access to the right data? Can I take that data, can I turn it into something valuable and then can I make a decision out of it? I need to do that fast. At the same time, I need to have the governance and security, all of that. That's at the end of the day the objective that our customers are driving towards. >> Prakash, thanks so much for coming on and being a great member of our community. >> Fantastic. >> You're part of our smart network of great people out there and entrepreneurial journey continues. >> Yes. >> Final question. Just observation. As you pinch yourself and you go down the journey, you guys are walking the talk, adding new products. We're global landscape. You're seeing a lot of new stuff happening. Customers are trying to stay focused. A lot of distractions whether security or data or app development. What's your state of the industry? How do you view the current market, from your perspective and also how the customer might see it from their impact? >> Well, the first thing is that I think in the last four years we have seen significant maturity both on the providers off software technology and solutions, and also amongst the customers. 
I do think that going forward what is really going to make a difference is one really driving towards business outcomes by leveraging data. We've talked about a lot of this over the last few years. What real business outcomes are you delivering? What we are super excited is when we see our customers each one of them actually subscribes to Paxata, we're a SaaS company, they subscribe to Paxata not because they're doing the science experiment but because they're trying to deliver real business value. What is that? Whether that is a risk and compliance solution which is going to drive towards real cost savings. Or whether that's a top line benefit because they know what their customer 360 is and how they can go and serve their customers better or how they can improve supply chains or how they can optimize their entire efficiency in the company. I think if you take it from that lens, what is going to be important right now is there's lots of new technologies coming in, and what's important is how is it going to drive towards those top three business drivers that I have today for the next 18 months? >> John: So, that's foundational. >> That's foundational. Those are the building blocks-- >> That's what is happening. Don't jump... If you're a customer, it's great to look at new technologies, etc. There's always innovation projects-- >> R&D, POCs, whatever. Kick the tires. >> But now, if you are really going to talk the talk about saying I'm going to be, call your word, data-driven, information-driven, whatever it is. If you're going to talk the talk, then you better walk the walk by delivering the real kind of tools and capabilities that your business consumers can adopt. And they better adopt that fast. If they're not up and running in 24 hours, something is wrong. >> Peter: Let me ask one question before you close, John.
So, your argument, which I agree with, suggests that one of the big changes in the next 18 months, three years as this whole thing matures and gets more consistent in its application of the value that it generates, we're going to see an explosion in the number of users of these types of tools. >> Prakash: Yes, yes. >> Correct? >> Prakash: Absolutely. >> 2X, 3X, 5X? What do you think? >> I think we're just at the cusp. I think it is going to grow up at least 10X and beyond. >> Peter: In the next two years? >> In the next, I would give that next three to five years. >> Peter: Three to five years? >> Yes. And we're on the journey. We're just at the tip of the high curve taking off. That's what I feel. >> Yeah, and there's going to be a lot more consolidation. You're going to start to see people who are winning. It's becoming clear as the fog lifts. It's a cloud game, a scale game. It's democratization, community-driven. It's open source software. Just solve problems, outcomes. I think outcome is going to be much faster. I think outcomes as a service will be a model that we'll probably be talking about in the future. You know, real time outcomes. Not eight month projects or year projects. >> Certainly, we started writing research about outcome-based management. >> Right. >> Wikibon Research... Prakash, one more thing? >> I also just want to say that in addition to this business outcome thing, I think in the last five years I've seen a lot of shift in our customer's world where the initial excitement about analytics, predictive, AI, machine-learning to get to outcomes. They've all come into a reality that none of that is possible if you're not able to handle, first get a grip on your data, and then be able to turn that data into something meaningful that can be analyzed. So, that is also a major shift. That's why you're seeing the growth we're seeing-- >> John: Cause it's really hard. >> Prakash: It's really hard. >> I mean, it's a cultural mindset. You have the personnel.
It's an operational model. I mean this is not like, throw some pixie dust on it and it magically happens. >> That's why I say, before you go into any kind of BI, analytics, AI initiative, stop, think about your information management strategy. Think about how you're going to democratize information. Think about how you're going to get governance. Think about how you're going to enable your business to turn data into information. >> Remember, you can't do AI with IA? You can't do AI without information architecture. >> There you go. That's a great point. >> And I think this all points to why Wikibon's research have all the analysts got it right with true private cloud because people got to take care of their business here to have a foundation for the future. And you can't just jump to the future. There's too much just to come and use a scale, too many cracks in the foundation. You got to do your, take your medicine now. And do the homework and lay down a solid foundation. >> You bet. >> All right, Prakash. Great to have you on theCUBE. Again, congratulations. And again, it's great for us. I totally have a great vibe when I see you. Thinking about how you launched on theCUBE in 2013, and how far you continue to climb. Congratulations. >> Thank you so much, John. Thanks, Peter. That was fantastic. >> All right, live coverage continuing day one of three days. It's going to be a great week here in New York City. Weather's perfect and all the players are in town for Big Data NYC. I'm John Furrier with Peter Burris. Be back with more after this short break. (upbeat techno music).

Published Date : Sep 27 2017

ENTITIES

Entity | Category | Confidence
Peter Burris | PERSON | 0.99+
John | PERSON | 0.99+
Jim | PERSON | 0.99+
Microsoft | ORGANIZATION | 0.99+
2013 | DATE | 0.99+
Peter | PERSON | 0.99+
Prakash | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Prakash Nanduri | PERSON | 0.99+
Bill Schmarzo | PERSON | 0.99+
1997 | DATE | 0.99+
New York | LOCATION | 0.99+
Three | QUANTITY | 0.99+
80% | QUANTITY | 0.99+
Michael Gorriz | PERSON | 0.99+
Standard Chartered Bank | ORGANIZATION | 0.99+
New York City | LOCATION | 0.99+
2007 | DATE | 0.99+
Hortonworks | ORGANIZATION | 0.99+
87,500 employees | QUANTITY | 0.99+
Paxata | ORGANIZATION | 0.99+
NYC | LOCATION | 0.99+
last year | DATE | 0.99+
37th Street | LOCATION | 0.99+
SAS | ORGANIZATION | 0.99+
WikiBon Research | ORGANIZATION | 0.99+
five years | QUANTITY | 0.99+
Excel | TITLE | 0.99+
24 hours | QUANTITY | 0.99+
One | QUANTITY | 0.99+
this year | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
This year | DATE | 0.99+
21st Century | DATE | 0.99+
one | QUANTITY | 0.99+
eight month | QUANTITY | 0.99+
one question | QUANTITY | 0.99+
four years years ago | DATE | 0.99+
3X | QUANTITY | 0.99+
5X | QUANTITY | 0.99+
first | QUANTITY | 0.99+
three years | QUANTITY | 0.99+

Nenshad Bardoliwalla, Paxata - #BigDataNYC 2016 - #theCUBE


 

>> Voiceover: Live from New York, it's The Cube, covering Big Data New York City 2016. Brought to you by headline sponsors, Cisco, IBM, Nvidia, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and George Gilbert. >> Welcome back to New York City, everybody. Nenshad Bardoliwalla is here, he's the co-founder and chief product officer at Paxata, a company that, three years ago, I want to say three years ago, came out of stealth on The Cube. >> October 27, 2013. >> Right, and we were at the Warwick Hotel across the street from the Hilton. Yeah, Prakash came on The Cube and came out of stealth. Welcome back. >> Thank you very much. >> Great to see you guys. Taking the world by storm. >> Great to be here, and of course, Prakash sends his apologies. He couldn't be here so he sent his stunt double. (Dave and George laugh) >> Great, so give us the update. What's the latest? >> So there are a lot of great things going on in our space. The thing that we announced here at the show is what we're calling Paxata Connect, OK? We are moving just in the same way that we created the self-service data preparation category, and now there are 50 companies that claim they do self-service data prep. We are moving the industry to the next phase of what we are calling our business information platform. Paxata Connect is one of the first major milestones in getting to that vision of the business information platform. What Paxata Connect allows our customers to do is, number one, to have visual, completely declarative, point-and-click browsing access to a variety of different data sources in the enterprise. For example, we support, we are the only company that we know of that supports connecting to multiple, simultaneous, different Hadoop distributions in one system. So a Paxata customer can connect to MapR, they can connect to Hortonworks, they can connect to Cloudera, and they can federate across all of them, which is a very powerful aspect of the system. 
>> And part of this involves, when you say declarative, it means you don't have to write a program to retrieve the data. >> Exactly right. Exactly right. >> Is this going into HDFS, into Hive, or? >> Yes, it is. In fact, so Hadoop is one part of, this multi-source Hadoop capability is one part of Paxata Connect. The second is, as we've moved into this information platform world, our customers are telling us they want read-write access to more than just Hadoop. Hadoop is obviously a very important part, but we're actually supporting NoSQL data sources like Cloudant, MongoDB, we're supporting read and write, we're supporting, for the first time, relational databases, we already supported read, but now we actually support write to relational databases. So Paxata is really becoming kind of this fabric, a business-centric information fabric, that allows people to move data from anywhere to any destination, and transform it, profile it, explore it along the way. >> Excellent. Let's get into some of the use cases. >> Yeah, tell us where the banks are. The sense at the conference is that everyone sort of got their data lakes to some extent up and running. Now where are they pushing to go next? >> Sure, that's an excellent question. So we have really focused on the enterprise segment, as you know. So the customers that are working with Paxata from an industry perspective, banking is, of course, a very important one, we were really proud to share the stage yesterday with both Citi and Standard Chartered Bank, two of our flagship banking customers. But Paxata is also heavily used in the United States government, in the intelligence community, I won't say any more about that. It's used heavily in retail and consumer products, it's used heavily in the high-tech space, it's used heavily by data service providers, that is, companies whose entire business is based on data. 
But to answer your question specifically, what's happening in the data lake world is that a lot of folks, the early adopters, have jumped onto the data lake bandwagon. So they're pouring terabytes and petabytes of data into the data lake. And then the next question the business asks is, OK, now what? Where's the data, right? One of the simplest use cases, but actually one that's very pervasive for our customers, is they say, "Look, we don't even know, our business people, they don't even know what's in Hadoop right now." And by the way, I will also say that the data lake is not just Hadoop, but Amazon S3 is also serving as a data lake. The capabilities inside Microsoft's cloud are also serving as a data lake. Even the notion of a data lake is becoming this sort of polymorphic distributed thing. So what they do is, they want to be able to get what we like to say is first eyes on data. We let people with Paxata, especially with the release of Connect, just point and click their way and actually explore the data in all of the native systems before they even bring it into something like Paxata. So they can actually sneak preview thousands of database tables or thousands of compressed data sets inside of Amazon S3, or thousands of data sets inside of Hadoop, and now the business people for the first time can point and click and actually see what is in the data lake in the first place. So step number one is, we have taken the approach so far in the industry of, there have been a lot of IT-driven use cases that have motivated people to go to the data lake approach. But now, we obviously want to show, all of our companies want to show business value, so tools and platforms like Paxata that sit on top of the data lake, that can federate across multiple data lakes and provide business-centric access to that information is the first significant use case pattern we're seeing. 
>> Just a clarification, could there be two roles, where one, a slightly more technical business user, exposes summarizing views, so that the ultimate end user doesn't have to see the thousands of tables? >> Absolutely, that's a great question. So when you look at self-service, if somebody wants to roll out a self-service strategy, there are multiple roles in an organization that actually need to intersect with self-service. There is a pattern in organizations where people say, "We want our people to get access to all the data." Of course it's governed, they have to have the right passwords and SSO and all that, but there are companies who say, yes, the users really need to be able to see all of the data across these different tables. But there's a different role, who also uses Paxata extensively, who are the curators, right? These are the people who say, look, I'm going to provision the raw data, provide the views, provide even some normalization or transformation, and then land that data back into another layer, as people call the data relay, they go from layer zero to layer one to layer two, they're different directory structures, but the point is, there's a natural processing frame that they're going through with their data, and then from the curated data that's created by the data stewards, then the analysts can go pick it up. >> One of the other big challenges that our research is showing, that chief data officers express, is that they get this data in the data lake. So they've got the data sources, you're providing access to it, the other piece is they want to trust that data. There's obviously a governance piece, but then there's a data quality piece, maybe you could talk about that? >> Absolutely. So use case number one is about access. The second reason that people are not so -- So, why are people doing data prep in the first place? They are trying to make information-driven decisions that actually help move their business forward. 
So if you look at researchers from firms like Forrester, they'll say there are two reasons that slow down the latency of going from raw data to decision. Number one is access to data. That's the use case we just talked about. Number two is the trustworthiness of data. Our approach is very different on that. Once people actually can find the data that they're looking for, the big paradigm shift in the self-service world is that, instead of trying to process data based on transforming the metadata attributes, like I'm going to draw on a work flow diagram, bring in this table, aggregate with this operator, then split it this way, filter it, which is the classic ETL paradigm. The, I don't want to say profound, but maybe the very obvious thing we did was to say, "What if people could actually look at the data in the first place --" >> And sort of program it by example? >> We can tell, that's right. Because our eyes can tell us, our brains help us to say, we can immediately look at a data set, right? You look at an age column, let's say. There are values in the age column of 150 years. Maybe 20 years from now there may be someone who, on Earth, lives to 150 years. But pretty much -- >> Highly unlikely. >> The customers at the banks you work with are not 150 years old, right? So just being able to look at the data, to get to the point that you're asking, quality is about data being fit for a specific purpose. In order for data to be fit for a specific purpose, the person who needs the data needs to make the decision about what is quality data. Both of you may have access to the same transactional data, raw data, that the IT team has landed in the Hadoop cluster. But now you pull it up for one use case, you pull it up for another use case, and because your needs are different, what constitutes quality to you and where you want to make the investment is going to be very different. 
So by putting the power of that capability into the hands of the person who actually knows what they want, that is how we are actually able to change the paradigm and really compress the latency from "Here's my raw data" to "Here's the decision I want to make on that data." >> Let me ask, it sounds like, having put all of the self-service capabilities together, you've democratized access to this data. Now, what happens in terms of governance, or more importantly, just trust, when the pipeline, you know, has to go beyond where you're working on it, to some of the analytics or some of the basic ingest? To say, "I know this data came from here and it's going there." >> That's right, how do we verify the fidelity of these data sources? It's a fantastic question. So, in my career, having worked in BI for a couple of decades, I know I look much younger but it actually has been a couple of decades. Remember, the camera adds about 15 pounds, for those of you watching at home. (Dave and George laugh) >> George: But you've lost already. >> Thank you very much. >> So you've lost net 30. (Nenshad laughs) >> Or maybe I'm back to where I'm supposed to be. What I've seen are the two models of governance in the enterprise when it comes to analytics and information management, right? There's model one, which is, we're going to build an enterprise data warehouse, we're going to know all the possible questions people are going to ask in advance, we're going to preprogram the ETL routines, we're going to put something like a MicroStrategy or BusinessObjects, an enterprise-reporting factory tool. Then you spend 10 million dollars on that project, the users come in and for the first time they use the system, and they say, "Oh, I kind of want to change this, this way. I want to add this calculation." It takes them about five minutes to determine that they can't do it for whatever reason, and what is the first feature they look for in the product in order to move forward? 
Download to Excel, right? So you invested 15 million dollars to build a download-to-Excel capability which they already had before. So if you lock things down too much, the point is, the end users will go around you. They've been doing it for 30 years and they'll keep doing it. Then we have model two. Model two is, Excel spreadsheet. Excel Hell, or spreadmarts. There are lots of words for these things. You have a version of the data, you have a version of the data, I have a version of the data. We all started from the same transactional data, yet you're the head of sales, so suddenly your forecast looks really rosy. You're the head of finance, you really don't like what the forecast looks like. And I'm the product guy, so why am I even looking at the forecast in the first place, but somehow I got access to the data, right? These are the two polarities of the enterprise that we've worked with for the last 30 years. We wanted to find sort of a middle path, which is to say, let's give people the freedom and flexibility to be able to do the transformations they need to. If they want to add a column, let them add a column. If they want to change a calculation, let them change a calculation. But, every single step in the process must be recorded. It must be versioned, it must be auditable. It must be governed in that way. So why the large banks and the intelligence community and the large enterprise customers are attracted to Paxata is because they have the ability to have perfect retraceability for every decision that they make. I can actually sit next to you and say, "This is why the data looks like this. This is how this value, which started at one million, became 1.5 million." That covers the Paxata part. But then the answer to the question you asked is, how do you even extend that to a broader ecosystem? 
I think that's really about some of the metadata interchange initiatives that a lot of the vendors in the Hadoop space, but also in the traditional enterprise space, have had for the last many years. If you look at something like Apache Atlas or Cloudera Navigator, they are systems designed to collect, aggregate, and connect these different metadata steps so you can see in an end-to-end flow, this is the raw data that got ingested into Hadoop. These are the transformations that the end user did in Paxata in order to make it ready for analytics. This is how it's getting consumed in something like Zoomdata, and you actually have the entire life cycle of data now actually manifested as a software asset. >> So, in other words, those are not just managing within the perimeter of Hadoop. They are managers of managers. >> That's right, that's right. Because the data is coming from anywhere, and it's going to anywhere. And then you can add another dimension of complexity which is, it's not just one Hadoop cluster. It's 10 Hadoop clusters. And those 10 Hadoop clusters, three of them are in Amazon. Four of them are in Microsoft. Three of them are in Google Cloud Platform. How do you know what people are doing with data then? >> How is this all presented to the user? What does the user see? >> Great question. The trick to all of this, of self-service, first you have to know very clearly, who is the person you are trying to serve? What are their technical skills and capabilities, and how can you get them productive as fast as possible? When we created this category, our key notion was that we were going to go after analysts. Now, that is a very generic term, right? Because we are all, in some sense, analysts in our day-to-day lives. But in Paxata, a business analyst, in an enterprise organizational context, is somebody that has the ability to use Microsoft Excel, they have to have that skill or they won't be successful with today's Paxata. 
They have to know what a VLOOKUP is, because a VLOOKUP is a way to actually pull data from a second data source into one. We would all know that as a join or a lookup. And the third thing is, they have to know what a pivot table is and know how a pivot table works. Because the key insight we had is that, of the hundreds of millions of analysts, people who use Excel on a day-to-day basis, a lot of their work is data prep. But Excel, being an amazing generic tool, is actually quite bad for doing data prep. So the person we target, when I go to a customer and they say, "Are we a good candidate to use Paxata?" and we're talking to the actual person who's going to use the software, I say, "Do you know what a VLOOKUP is, yes or no? Do you know what a pivot table is, yes or no?" If they have that skill, when they come into Paxata, we designed Paxata to be very attractive to those people. So it's completely point-and-click. It's completely visual. It's completely interactive. There's no scripting inside that whole process, because do you think the average Microsoft Excel analyst wants to script, or they want to use a proprietary wrangling language? I'm sorry, but analysts don't want to wrangle. Data scientists, the 1% of the 1%, maybe they like to wrangle, but you don't have that with the broader analyst community, and that is a much larger market opportunity that we have targeted. >> Well, very large, I mean, a lot of people are familiar with those concepts in Excel, and if they're not, they're relatively easy to learn. >> Nenshad: That's right. >> Excellent. All right, Nenshad, we have to leave it there. Thanks very much for coming on The Cube, appreciate it. >> Thank you very much for having me. >> Congratulations for all the success. >> Thank you. >> All right, keep it right there, everybody. We'll be back with our next guest. This is The Cube, we're live from New York City at Big Data NYC. We'll be right back. (electronic music)
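
The comparison drawn in the interview, that a VLOOKUP pulls data from a second source into the first and is what database people would call a join or a lookup, can be sketched in a few lines of Python. This is only a minimal illustration and says nothing about how Paxata works internally; the sample records, the field names, and the `vlookup_join` helper are all hypothetical.

```python
# Hypothetical sample data; names and values are illustrative only.
accounts = [
    {"account_id": "A1", "customer_id": "C1", "balance": 1200},
    {"account_id": "A2", "customer_id": "C2", "balance": 350},
    {"account_id": "A3", "customer_id": "C9", "balance": 75},  # no matching customer
]
customers = [
    {"customer_id": "C1", "region": "EMEA"},
    {"customer_id": "C2", "region": "APAC"},
]


def vlookup_join(left_rows, right_rows, key, column, default=None):
    """Copy `column` from right_rows onto each left row, matching on `key`.

    Behaves like an exact-match VLOOKUP (or a left join): every left row is
    kept, and rows without a match receive `default`.
    """
    # Index the lookup table once; as with VLOOKUP, the first match for a key wins.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], row)

    joined = []
    for row in left_rows:
        match = index.get(row[key])
        value = match[column] if match is not None else default
        # Build a new dict so the input rows are left unmodified.
        joined.append({**row, column: value})
    return joined


result = vlookup_join(accounts, customers, key="customer_id", column="region")
for row in result:
    print(row)
```

Every account row is kept, and the account whose customer is missing from the lookup table simply gets an empty region, which is the same behavior an analyst would expect from a lookup column in a spreadsheet.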

Published Date : Sep 30 2016

ENTITIES

Entity | Category | Confidence
Citi | ORGANIZATION | 0.99+
October 27, 2013 | DATE | 0.99+
George | PERSON | 0.99+
George Gilbert | PERSON | 0.99+
Nenshad | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
Prakash | PERSON | 0.99+
Dave | PERSON | 0.99+
New York City | LOCATION | 0.99+
Nvidia | ORGANIZATION | 0.99+
Cisco | ORGANIZATION | 0.99+
Earth | LOCATION | 0.99+
15 million dollars | QUANTITY | 0.99+
two | QUANTITY | 0.99+
30 years | QUANTITY | 0.99+
Forrester | ORGANIZATION | 0.99+
Excel | TITLE | 0.99+
thousands | QUANTITY | 0.99+
50 companies | QUANTITY | 0.99+
10 million dollars | QUANTITY | 0.99+
Standard Chartered Bank | ORGANIZATION | 0.99+
New York City | LOCATION | 0.99+
Nenshad Bardoliwalla | PERSON | 0.99+
two reasons | QUANTITY | 0.99+
one million | QUANTITY | 0.99+
Microsoft | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
first | QUANTITY | 0.99+
two roles | QUANTITY | 0.99+
two polarities | QUANTITY | 0.99+
1.5 million | QUANTITY | 0.99+
Hortonworks | ORGANIZATION | 0.99+
150 years | QUANTITY | 0.99+
Hadoop | TITLE | 0.99+
Paxata | ORGANIZATION | 0.99+
second reason | QUANTITY | 0.99+
One | QUANTITY | 0.99+
two models | QUANTITY | 0.99+
second | QUANTITY | 0.99+
one | QUANTITY | 0.99+
yesterday | DATE | 0.99+
Both | QUANTITY | 0.99+
three years ago | DATE | 0.99+
first time | QUANTITY | 0.98+
first time | QUANTITY | 0.98+
New York | LOCATION | 0.98+
both | QUANTITY | 0.98+
1% | QUANTITY | 0.97+
third thing | QUANTITY | 0.97+
one system | QUANTITY | 0.97+
about five minutes | QUANTITY | 0.97+
Paxata | PERSON | 0.97+
first feature | QUANTITY | 0.97+
Data | LOCATION | 0.96+
one part | QUANTITY | 0.96+
United States government | ORGANIZATION | 0.95+
thousands of tables | QUANTITY | 0.94+
20 years | QUANTITY | 0.94+
Model two | QUANTITY | 0.94+
10 Hadoop clusters | QUANTITY | 0.94+
terabytes | QUANTITY | 0.93+