Mick Bass, 47Lining - Data Platforms 2017 - #DataPlatforms2017


 

>> Live, from The Wigwam in Phoenix, Arizona, it's theCUBE, covering Data Platforms 2017. Brought to you by Qubole.

>> Hey, welcome back everybody. Jeff Frick here with theCUBE. Welcome back to Data Platforms 2017, at the historic Wigwam Resort, just outside of Phoenix, Arizona. I'm here all day with George Gilbert from Wikibon, and we're excited to be joined by our next guest. He's Mick Bass, the CEO of 47Lining. Mick, welcome.

>> Thanks for having me, yes.

>> Absolutely. So, what is 47Lining, for people that aren't familiar?

>> Well, you know, every cloud has a silver lining, and if you look at the periodic table, 47 is the atomic number for silver. So, we are a consulting services company that helps customers build out data platforms, and ongoing data processes and data machines, in Amazon Web Services. And one of the primary use cases we help customers with is to establish data lakes in Amazon Web Services, to help them answer some of their most valuable business questions.

>> So, there's always this question of own versus buy, right, with the cloud, and Amazon specifically.

>> Mm-hmm, mm-hmm.

>> And with a data lake, the perception is that it's huge, this giant cost. But clearly there are benefits that come with putting your data lake in AWS versus having it on-prem. What are some of the things you take customers through, in terms of the scenario planning and the value planning?

>> Well, just a couple of the really important aspects. One is this notion of elastic, on-demand pricing. In a cloud-based data lake, you can start out with a very small infrastructure footprint that's focused on maybe just one or two business use cases. You can pay only for the data that you need to get your data lake bootstrapped and demonstrate the business benefit from one of those use cases, and then it's very easy to scale that up in a pay-as-you-go kind of way. The second really important benefit that customers experience in a platform built on AWS is the breadth of the tools and capabilities that they can bring to bear for their predictive analytics, descriptive analytics, and streaming kinds of data problems. You need Spark, you can have it. You need Hive, you can have it. You need a high-performance, close-to-the-metal data warehouse on a clustered database, you can have it. So analysts are really empowered through this approach, because they can choose the right tool for the right job and reduce the time to business benefit, based on what their business owners are asking them for.
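To make the elastic, pay-per-query pattern Mick describes concrete, here is a minimal sketch using Amazon Athena through boto3: the data sits in S3, and each business question is an individual query you pay for, with no cluster to stand up first. The region, database, table, and bucket names below are hypothetical.

```python
# A minimal sketch of pay-per-query analytics on an S3 data lake via Athena.
# Assumptions: the "demo_lake" database, "orders" table, and results bucket
# are invented for illustration; AWS credentials are configured externally.
import time

import boto3

athena = boto3.client("athena", region_name="us-east-1")

def run_query(sql: str, database: str, output: str) -> list:
    """Submit a query, poll until it finishes, and return the raw rows."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    if state != "SUCCEEDED":
        raise RuntimeError(f"query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

# One focused business question, one query -- nothing running when it's done.
rows = run_query(
    "SELECT channel, SUM(revenue) FROM orders GROUP BY channel",
    database="demo_lake",
    output="s3://demo-lake-athena-results/",
)
```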
>> You touched on something really interesting. When a customer is on-prem, and let's say is evaluating Cloudera, MapR, or Hortonworks, there's a finite set of services or software components within that distro. Once they're on the cloud, there's a thousand times more. As you were saying, you could have one of 27 different data warehouse products, you could have many different SQL products, some of which are really delivered as services.

>> Mm-hmm.

>> How does the consideration of the customer's choice change when they go to the cloud?

>> Well, I think what they find is that it's much more tenable to take an agile, iterative approach, where they keep the outgoing cost of the data lake build in alignment with the business benefits that come from it. So if you recognize the need for a particular kind of analytics approach, but you're not going to need it until two or three quarters down the road, it's easy to get started with simple use cases and then add those incremental services as the need manifests. One of the things that I mention in my talk, and that I always encourage our customers to keep in mind, is that a data lake is more than just a technology construct. It's not just an analysis set of machinery; it's really a business construct. Your data lake has a profit and loss statement, and the way that you interact with your business owners to identify the specific value sources that you're going to make pop for your company can be made to align with the cost footprint as you build your data lake out.

>> So I'm curious, when you're taking customers through the journey to start thinking about the data lake and AWS, are there any specific application spaces, or vertical spaces, where you have pretty high confidence that you can secure an early, and relatively easy, win to help them move down the road?

>> Absolutely. For many of our customers, a very common business need is to enhance the set of information that they have available for a 360-degree view of the customer. In many cases this information and data is available in different parts of the enterprise, but it might be siloed. A data lake approach in AWS really helps you pull it together in an agile fashion, based on the particular quarter-by-quarter objectives or capabilities that you're trying to respond to. Another very common example is predictive analytics for things like fraud detection or mechanical failure. In eCommerce kinds of situations, being able to pull together semi-structured information that might be coming from web servers or logs, like what cookies are associated with a particular user, makes it very easy to build a fraud-oriented predictive analytic. And the third area that is very common is internet of things use cases. Many enterprises are augmenting their existing data warehouse with sensor-oriented time series data, and there's really no place in the enterprise for that data currently to land.

>> So, when you say they are augmenting the data warehouse, are they putting it in the data warehouse, or are they putting it in a sort of adjunct time series database, from which they can curate aggregates and things like that to put in the data warehouse?

>> It's very much the latter. The time series data itself may come from multiple different vendors, and the input formats in which that information lands can be pretty diverse. So it's not really a good fit for a typical data warehouse ingest or intake process.
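The adjunct landing pattern Mick describes for sensor data can be sketched as a normalization step: each vendor's format is mapped onto one common schema before it lands in the lake, where aggregates can later be curated for the warehouse. Both vendor formats below are invented for illustration.

```python
# A minimal sketch of normalizing diverse vendor time-series formats into one
# landing schema. The vendor payload shapes are hypothetical.
from datetime import datetime, timezone

def normalize(vendor: str, raw: dict) -> dict:
    """Map one vendor-specific sensor reading onto a common landing schema."""
    if vendor == "vendor_a":
        # e.g. {"dev": "a-17", "ts": 1495700000, "temp_c": 21.4}
        return {
            "device_id": raw["dev"],
            "ts_utc": datetime.fromtimestamp(raw["ts"], tz=timezone.utc).isoformat(),
            "metric": "temperature_c",
            "value": raw["temp_c"],
        }
    if vendor == "vendor_b":
        # e.g. {"id": "b-03", "time": "2017-05-25T09:00:00+00:00",
        #       "kind": "vibration", "v": 0.8}
        return {
            "device_id": raw["id"],
            "ts_utc": raw["time"],
            "metric": raw["kind"],
            "value": raw["v"],
        }
    raise ValueError(f"unknown vendor: {vendor}")

# Records from different vendors now share one shape and can land together,
# e.g. partitioned by date and device in S3.
print(normalize("vendor_a", {"dev": "a-17", "ts": 1495700000, "temp_c": 21.4}))
print(normalize("vendor_b", {"id": "b-03", "time": "2017-05-25T09:00:00+00:00",
                             "kind": "vibration", "v": 0.8}))
```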
>> So, if you were to look at maturity models for the different use cases, where would we be with, you know, IoT, customer 360, fraud, things like that?

>> I think, you know, many customers have pretty rich fraud analytics capabilities, but one of the pain points we hear is that it's difficult for them to access the most recent technologies. In some cases the order management systems that those analytics run against are quite old. We just finished some work with a customer where the order management system is literally running on a mainframe, even today. But those systems have the ability to accept a steer from a sidecar decision-support predictive analytic system. And one of the things that's really cool about the cloud is that you can build a custom API just for that fraud analytics use case, so that you can inject exactly the right information in a way that makes it super cheap and easy for the ops team that's running that mainframe to consume the fraud-improvement decision signal that you're offering.

>> Interesting. And so, this may be diving into the weeds a little bit, but if you've got an order management system that's decades old, and you're going to plug in something that has to meet some stringent performance requirements, how do you test that? It's not just the end-to-end performance once; you need to know that at the 99th percentile someone doesn't get locked out for five minutes while he's trying to finish his shopping cart.

>> Exactly. And I think this is what's important about the concept of building data machines in the cloud. This is not a once-and-done kind of process. You're not building an analytic that produces a printout that an executive is going to look at (laughing) and make a decision. (laughing) You're really creating a process that runs at consumer scale, and you're going to apply all of the same kinds of percentile performance metrics that you would apply to any kind of large-scale consumer delivery system.

>> Do you custom-build a fraud prevention application for each customer? Or is there a template, plus some additional capabilities that you learn by running through their training data?

>> Well, largely there are business-by-business distinctions in the approach that these customers take to fraud detection. There's also business-by-business distinction in their current state. But what we find is that there are commonalities in the kinds of patterns and approaches that you tend to apply. So, we may have extra data about you based on your behavior on the web and your behavior on a mobile app. The particulars of that data might be different for Enterprise A versus Enterprise B, but this pattern of joining up mobile data, plus web data, plus maybe phone-in call center data, putting those all together to increase the signal that can be made available to a fraud prevention algorithm, that's very common across all enterprises. And so one of the roles that we play is to set up the platform so that it's really easy to mobilize each of these data sources. In many cases it's the customer's data scientist who's saying, I think I know how to do a better job for my business, I just need to be unleashed to be able to access this data. And if I'm blocked, the answer I get back is, oh, you could have that, like, second quarter of 2019. Instead, you want to say, oh, we can onboard that data in an agile fashion, and pay an incremental little bit of money, because you've identified a specific benefit that could be made available by having that data.

>> Alright Mick, well, thanks for stopping by. I'm going to send Andy Jassy a note that we found the silver lining to the cloud. (laughing) So I'm excited for that, if nothing else; that made the trip well worthwhile. Thanks for taking a few minutes.

>> You bet, thanks so much, guys.

>> Alright, Mick Bass, George Gilbert, Jeff Frick. You're watching theCUBE, from Data Platforms 2017. We'll be right back after this short break. Thanks for watching. (computer techno beat)
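The narrow, purpose-built decision API Mick describes might look something like the following sketch, here using Flask. The endpoint, the order fields, and the heuristic scoring are all hypothetical; a real service would call a trained fraud model rather than hand-written rules.

```python
# A hedged sketch of a purpose-built fraud-decision API: the ops team POSTs a
# candidate order and gets back a single, easy-to-consume signal. All names
# and logic are invented for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

def fraud_score(order: dict) -> float:
    """Stand-in for a real model: a couple of toy heuristic signals."""
    score = 0.0
    if order.get("amount", 0) > 1000:
        score += 0.4
    if order.get("ship_country") != order.get("card_country"):
        score += 0.4
    if order.get("account_age_days", 365) < 7:
        score += 0.2
    return min(score, 1.0)

@app.route("/v1/fraud-decision", methods=["POST"])
def decide():
    order = request.get_json(force=True)
    score = fraud_score(order)
    # Keep the contract tiny so an old order-management system can consume it.
    return jsonify({
        "order_id": order.get("order_id"),
        "score": round(score, 2),
        "decision": "review" if score >= 0.6 else "approve",
    })

if __name__ == "__main__":
    app.run(port=8080)
```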

Published Date : May 26 2017


Saket Saurabh, Nexla - Data Platforms 2017 - #DataPlatforms2017


 

(upbeat music)

>> [Announcer] Live from the Wigwam in Phoenix, Arizona, it's theCUBE, covering Data Platforms 2017. Brought to you by Qubole.

>> Hey, welcome back everybody, Jeff Frick here with theCUBE. We are coming down to the end of a great day here at the historic Wigwam at Data Platforms 2017, with a lot of great big data practitioners talking about the new way to do things, really coining the term data ops, or maybe not coining it but really leveraging it, as a new way to think about data and using data in your business, to be a data-driven, software-defined, automated solution and company. So we're excited to have Saket Saurabh, he is the, and I'm sorry I butchered that, Saurabh.

>> Saurabh, yeah.

>> Saurabh, thank you, sorry. He is the co-founder and CEO of Nexla. Welcome.

>> Thank you.

>> So what is Nexla? Tell us about Nexla for those that aren't familiar with the company.

>> Thank you so much. Yeah, so Nexla is a data operations platform. The way we look at data is that data is increasingly moving between companies, and one of the things driving that is the growth in machine learning. So imagine you are an e-commerce company, or a healthcare provider. You need to get data from your different partners, you know, suppliers and point-of-sale systems and brands and all that. And when companies are getting this data from all these different places, it's so hard to manage. So, just like cloud computing made it easy to manage thousands of servers, we think of data ops as something that makes it easy to manage those thousands of data sources coming from so many partners.

>> So you've jumped straight past the "it's a cool buzz term and way to think about things" into the actual platform. So how does that platform fit within the cloud and on-prem? Is it part of the infrastructure, does it sit next to the infrastructure, is it a conduit? How does that work?

>> Yeah, we think of it this way: if machine learning or advanced analytics is the application, then data operations is sort of the underlying infrastructure for it. It's not really the hardware or the storage, but a layer on top. The job of data operations is to get the data from where it is to where you need it to be, in the right form and shape, so now you can act on it.

>> And do you find yourself replacing legacy stuff, or is this brand new demand, because of all the variety and the many types of data sets coming in that people want to leverage?

>> Yeah, to be honest, some of this has always been there, in the sense that the day you connected a database to a network, data started to move around. But if you think of the scale that has happened in the last six or seven years, none of those existing systems were ever designed for that. So when we talk about data growing at a Moore's Law rate, when we talk about everybody getting into machine learning, when we talk about thousands of data sets across the many different partners you work with, and when the reports you get from your partners are no longer sufficient, because you need the underlying data and you cannot feed a report into an algorithm, when you look at all of these things, we feel like it is a new thing in some ways.
>> Right. Well, I want to unpack that a little bit, because you made an interesting comment, and before we turned on the cameras you just repeated it: you can't run an algorithm on a report. And in a world where we've got all these shared data sets, it's funny too, right, because you used to run a sample; now you want, as you said, the raw data. Not only all of the data, but the raw data, so that you can do with it what you wish. Very different paradigm.

>> Yeah.

>> It sounds like there's a lot more to it: you're not just parsing what's in the report, you have to give the data structure that can be combined with other data sources. And that sounds like a rather challenging task, because the structure, all the metadata, the context that gives the data meaning relevant to other data sets, where does that come from?

>> Yeah, so this is how technology companies have started to evolve. You want to focus on your core business, and therefore you will use a provider that processes your payments, a provider that gives you search, a provider that supplies the data, for example, for your e-commerce system. So there are different types of vendors you're working with, which means there are different types of data involved. When I look at, for example, a brand today, you could be, say, a Nike, and your products are being sold on so many websites. If you want to really analyze your business well, you want data from every single one of those places, where your data team can access it. So yes, it is that raw data, it is that metadata, and it is the data coming from all the systems, so you can look at everything together and say: when I ran this ad, this is how people reacted to it, this was the marketing lift from it, these are the purchases that happened across these different channels, and this is how my top line or bottom line was affected. And to analyze everything together, you need all the data in one place.

>> I'm curious what you find on the change in the business relationship. Because I'm sure there were agreements structured in another time which weren't quite as detailed, where the expectations in terms of what was exchanged weren't quite this deep. Are you seeing people have to change their relationships to get this data? Is it already out there and they're getting it, or is this really changing the way that people partner in data exchange, in, say, the example you just used between Nike and Foot Locker, to pick a name?

>> Yeah, so I think companies that have worked together have always had reports come in, so you would get a daily report of how much you sold. Now, just a high-level report of how much you sold is not sufficient anymore. You want to understand where it was bought, in which city, under what weather conditions, by what kind of user, and all that stuff. So companies are looking at this and saying: hey, you were giving me a report before, now I also need some of the underlying data. The report is for a business executive to look at and see how the business is doing, and the underlying data is really for an algorithm to understand, and maybe identify things that a report might not.
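Saurabh's point that you can't feed a report into an algorithm can be illustrated in a few lines of pandas: the daily report is a lossy aggregate of the raw rows, and it is the raw rows, with city, channel, and weather context intact, that a model can learn from. The data and column names below are invented.

```python
# A small illustration that a report is a lossy aggregate: you can derive the
# report from the raw rows, but not the reverse.
import pandas as pd

raw = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "city":     ["Phoenix", "Phoenix", "Denver", "Denver"],
    "channel":  ["web", "mobile", "web", "store"],
    "weather":  ["hot", "hot", "rain", "rain"],
    "amount":   [40.0, 15.0, 60.0, 25.0],
})

# The "daily sales report": one number, fine for an executive.
print(f"daily sales: {raw['amount'].sum()}")  # 140.0

# What an algorithm wants: per-row features with full context preserved.
features = pd.get_dummies(raw[["city", "channel", "weather"]])
features["amount"] = raw["amount"]
print(features)
```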
>> Wouldn't there already have been, at least in the example of sell-through, structured data exchanged between partners, like vendor-managed inventory? You know, where a downstream retailer might make their sell-through data accessible to suppliers who actually take ownership of the inventory and are responsible for stocking it at optimal levels.

>> Yeah, I think Walmart was the innovator in that, with the POS link system back in the day, for retail. But the point is that this need for data to go from one company to its partners and back and forth exists across every sector. You need that in e-commerce. You need that in fintech, where companies that have to manage your portfolio need to connect with the different banks and brokerages you work with to get the data. We see that in healthcare, across different providers and pharmaceutical companies. We see that in automotive: if every car generates data, an insurance company needs to be able to understand it and look at it.

>> It's a huge problem you're addressing, because this is the friction between inter-company applications. And we went through this with the B2B marketplaces, 15-plus years ago. But the reason we built those marketplace hubs was so that we could standardize the information exchange. If it's just Walgreens talking to Pfizer, and then doing another one-off deal with, I don't know, Lilly (I don't know if they both still exist), it won't work for connecting all of pharmacy with all of pharma. How do you ensure standards between downstream and upstream?

>> Yeah, so you're right, this has happened. When we do a wire transfer from one person to another, some data goes from one bank to another bank, and it still takes hours, for a very tiny amount of data. That has all exploded; we are talking about zettabytes of data now every year, so the challenge is significantly bigger. Now, coming to standards: what we have found is that two companies sitting together and defining a standard almost never works, because applications change, systems change; change is the only constant. So the way we've approached it at our company is that we sit on top of the data and learn the structure as we observe data flowing through. We have tons of data flowing through, and we're constantly learning the structure and identifying how the structure maps to the destination. So again, we apply machine learning to see how the structure is changing and how the data volume is changing. Say you are getting data from somewhere every hour, and then it doesn't show up for two hours. Traditionally, systems will go down and you may not find out for five days that the data wasn't there. So we look at the data structure, the amount of data, the time when it arrives, everything, to instantly learn and be able to inform the downstream systems of what they should be expecting, and whether there is a change that somebody needs to be alerted about. A lot of innovation goes into doing this at scale, without having to predefine something in a tight box that cannot be changed, because that is extremely hard to control.

>> All right, Saket, that's a great explanation. We're going to have to leave it there; we're out of time. Thank you for taking a few minutes out of your day to stop by.

>> Thank you.

>> All right. Jeff Frick with George Gilbert. We are at Data Platforms 2017, Phoenix, Arizona. Thanks for watching. (electronic music)
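The monitoring behavior Saurabh describes, learning each feed's normal arrival cadence and structure by observation and alerting on silence or drift, can be sketched roughly as follows. The thresholds, feed names, and drift rule are simplifications for illustration; Nexla's actual implementation is not public, so this is only a reading of the idea.

```python
# A hedged sketch of feed monitoring: flag a feed that goes quiet past its
# expected window, or whose observed field set drifts. State is in memory;
# a production system would persist it and learn thresholds statistically.
import time
from collections import defaultdict

class FeedMonitor:
    def __init__(self, max_gap_seconds: float = 2 * 3600):
        self.max_gap = max_gap_seconds
        self.last_seen = {}                    # feed -> last arrival time
        self.known_fields = defaultdict(set)   # feed -> fields observed so far

    def observe(self, feed: str, record: dict) -> list:
        """Register one incoming record; return any alerts it triggers."""
        alerts = []
        new_fields = set(record) - self.known_fields[feed]
        if self.known_fields[feed] and new_fields:
            alerts.append(f"{feed}: schema drift, new fields {sorted(new_fields)}")
        self.known_fields[feed] |= set(record)
        self.last_seen[feed] = time.time()
        return alerts

    def check_silence(self) -> list:
        """Call periodically: flag feeds that have missed their window."""
        now = time.time()
        return [
            f"{feed}: no data for {(now - t) / 3600:.1f} hours"
            for feed, t in self.last_seen.items()
            if now - t > self.max_gap
        ]

# Usage: run every incoming record through observe(), and run check_silence()
# on a schedule, so a quiet feed is caught in hours rather than five days.
monitor = FeedMonitor()
print(monitor.observe("pos_eu", {"sku": "A1", "qty": 2}))           # []
print(monitor.observe("pos_eu", {"sku": "A2", "weather": "hot"}))   # drift alert
```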

Published Date : May 25 2017
