Adi Krishnan & Ryan Waite | AWS Summit 2014

>>Hey, welcome back everyone. We're live here in San Francisco for the Amazon Web Services Summit. This is the smaller event compared to re:Invent, the big conference in Vegas, which we were broadcasting live. I'm John Furrier, the founder of SiliconANGLE. This is theCUBE, our flagship program where we go out to the events and extract the signal from the noise. An Amazon show would not be complete without talking to the Amazon guys directly about what's going on under the hood. Our next guests are Adi Krishnan and Ryan Waite, who run the Kinesis teams. Guys, welcome to theCUBE. Dave Vellante is not here, unfortunately. He has another commitment. But we were going gaga over Kinesis. We love Redshift, and we love what's going on with the data. Glacier is a really low-cost option to store stuff, but when you start adding on Redshift and Kinesis, you're adding new features that really point to where the market's going, which is: I need to deal with real-time stuff.
>>I need to deal with a lot of data. I need to manage it effectively at low latency across any use case. So how the hell did you come up with Kinesis? Give us the insight into how it all came together. We love the real time. We love how it closes the loop, if you will, for developers. Take us through how it came about. What are some of the stats now, post re:Invent? Share with us. Well, the genesis for Kinesis was trying to solve our metering problem. The metering problem inside of AWS is how we keep track of how our customers are using our products. Every time a customer does a read out of DynamoDB, or reads a file out of S3, or does some sort of transaction with any of our products, that generates a metering record. It's tens of millions of records per second and tens of terabytes per hour.
>>So it's a big workload.
And what we were trying to do is understand how to transition from batch-oriented processing, where we were using large Hadoop clusters to process all that data, to continuous processing, where we could read all of that data in real time and make decisions on that data in real time. So you basically created an aspirin for yourself, for a little pain point internally, right? Yeah. It's kind of an example of us building a product to solve some of our own problems first and then making that available to the public. Okay. So when you guys do your Amazon thing, and I've gotten to know the culture there a little bit, you guys kind of break stuff, to quote Zuckerberg. You guys kind of invented that philosophy: ship stuff quickly, iterate fast. So you solved your own problem, and then was there an aha moment, like, hell, damn, this is good, we can bring it out to the market? What were customers asking for at the same time? Was it a known use case? Did you bring it to the market? What happened next?
>>We spent a lot of time talking to a lot of customers. I mean, that was kind of the logical thing to do. We had customers from all different sorts of industry verticals, financial services, consumer online services, manufacturing, come up to us and say, we have this canonical workflow. This workflow is about getting data off all of these producers, the sources of data. They didn't have a way to aggregate that data and then drive it through a variety of different processing systems to ultimately light up different data stores. These data stores could be native AWS stores like S3 or DynamoDB, or they could be higher-level data warehousing services like Redshift. But the key thing was: how do we deal with all this massive amount of data that's being produced in real time, ingest it reliably, scale elastically, and enable continuous processing of the data?
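[Editor's sketch, not part of the interview.] The move from batch to continuous processing described above can be illustrated with a few lines of plain Python. Everything here is an assumption made for illustration: the record layout (customer, service, units) and the function names are hypothetical, and nothing below reflects how AWS metering or Kinesis is actually implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical metering record: (customer_id, service, units_used).
Record = Tuple[str, str, int]


def batch_totals(records: Iterable[Record]) -> Dict[str, int]:
    """Batch style: consume the whole data set, then report usage per customer."""
    totals: Dict[str, int] = defaultdict(int)
    for customer, _service, units in records:
        totals[customer] += units
    return dict(totals)


def continuous_totals(records: Iterable[Record]) -> Iterator[Tuple[str, int]]:
    """Continuous style: emit an updated running total as each record arrives."""
    totals: Dict[str, int] = defaultdict(int)
    for customer, _service, units in records:
        totals[customer] += units
        yield customer, totals[customer]


# Made-up sample records, for illustration only.
sample = [
    ("cust-a", "dynamodb-read", 3),
    ("cust-b", "s3-get", 1),
    ("cust-a", "s3-get", 2),
]

print(batch_totals(sample))
for update in continuous_totals(sample):
    print(update)
```

The batch function answers only after it has seen everything, while the generator version gives a usable answer after every record, which is the property the speakers are after when they talk about making decisions in real time.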
>>Yeah, we always love the word elasticity. It's a term that you guys have built your business around, being elastic. If you need some new resources, you have a lot of flexibility, and that's a key part of being agile. But while we're here in theCUBE, I want you guys to define Kinesis for the folks out there. What the hell is it? Define it for the record. Then I have some specific questions I want to ask. So, Kinesis is a new service for processing huge amounts of streaming data in real time. It scales elastically, so as your data volume increases or decreases, the service grows with you. And so a Node.js error log or iPhone data would be an example of streaming? Yeah, exactly. You can imagine that you were tailing a whole bunch of logs coming off of servers.
>>You could also be watching event streams coming out of little Internet of Things type devices. One of the customers we're talking about here is Supercell, who's capturing in-game data from their game, Clash of Clans. As you're playing Clash of Clans, you're tapping on the screen. All of that data is captured in Kinesis and then processed by Supercell. And this is validated. I mean, obviously you mentioned some of the use cases: Internet of Things, which could be a sensor network, wearable computers, mobile phones, or event data coming off machines. So you've got machine data, you've got human data, you've got application data. That's kind of the data sets we're seeing with Kinesis, right? A diverse set. We're also seeing traction with trends like Spark out of Berkeley. You're seeing in-memory. Is this in your wheelhouse?
>>How does that all relate? Because you guys have purpose-built SSDs now in your new EC2 instances and all this new modern gear we heard about in the announcements. How does all the in-memory stuff affect the Kinesis service? It's a great question.
You can imagine Kinesis as being a great service for capturing all of that data that's being generated by, you know, hundreds of thousands or millions of sources. It gets sent to Kinesis, where we replicate it across three different availability zones. That data is then made available for applications to process. Those applications that are processing that data could be Hadoop clusters, they could be your own Kinesis applications, and they could be a Spark cluster. So writing Spark applications that process that data in real time is a great use case, and the in-memory capabilities in Spark are probably ideal for being able to process data that's stored in Kinesis.
>>Okay. So let's talk about connecting the dots. Kinesis works in conjunction with what other services? What are you seeing being adopted most right now? I mentioned Redshift, I'm just throwing that in there. It's obviously a data warehousing tool, and we're seeing a lot of business intelligence. Basically people are playing with data, and there are a lot of different needs for the data. So how does it connect through the stack? I think the number one use case we see is customers capturing all of this data and then archiving all of it right away to S3. It used to be difficult to capture everything, right? And even if you did, you probably could keep it for a little while and then you had to get rid of it. But with the prices for S3 being so low, and Kinesis making it so easy to capture tiny writes, these little tiny tails of log data coming out of your servers or little bits of data coming off of mobile devices, you can capture all of that, aggregate it, and put it in S3.
>>That's the number one use case we see. As customers become more sophisticated with using Kinesis, they then begin to run real-time dashboards on top of Kinesis data.
So you could push all the data into DynamoDB, or you could push all that data into even something like Redshift and run analytics on top of that. The final case is people doing real-time decision making based on Kinesis. Once you've got all this data coming in, putting it into DynamoDB or Redshift or EMR, you then process it and start making automated decisions that take advantage of it. So essentially you're taking steps through the life cycle, kind of like man learning to walk erect at some point, right? They start small, they store the data. Usually it's a developer problem, just inefficiencies. Log file management is a disaster.
>>We know it's a pain in the butt for developers. So step one is solve that pain, triage that. The next step is, okay, I'm dashboarding, I'm starting to learn about the data. And then three is more advanced, like real-time decision making: now that I've got the data coming in in real time, I'm going to act on it. So I want to bring this up. This is more of a theoretical, kind of orthogonal conversation. What you guys are basically doing, and at SiliconANGLE we like to point out what's weird in the market and why it's important, is the data thing. There's something to do with data. It really points to a new developer framework. And I want to get you guys' comments on this. No one's really come out yet and said, here's a development kit or development environment for data.
>>You see companies like Factual doing some amazing stuff. I don't know if you know those guys. I just met with New Relic. They launched kind of this data layer off the application. So seeing what you guys are doing, you can imagine that now the developer framework is, hey, I have to deal with data as a resource constraint. We haven't seen it yet. So I want to get your thoughts. Do you see things heading in that direction? How will data be presented to developers?
Is it going to be abstracted away? Will there be development environments? Does it matter? Or is it just about organizing the data? What's your vision around that?
>>That's a really good question, because we've got customers that come up to us and say, I want to marry real-time data with batch processing, or, I have data that is right now lots of little data, and I want to go ahead and aggregate it to make sense of it over a longer period of time. There's a lot of theory around how data should be modeled and how it should be represented. But the way we are approaching this evolution is really by learning from our customers. Customers come up and say, we need the ability to capture data quickly, but then what I want to do is apply my existing Hadoop stack and tools to my data, because that's what I understand. As a response to that customer demand, we built the EMR connector, so now customers can use, say, Hive queries or Cascading scripts and apply them to the real-time data that Kinesis is ingesting. Another response to customers was that some customers really liked the stream processing construct of Storm. So our step there was to say, okay, we shipped the Kinesis Storm spout, so now customers can bring their framework of choice and marry that with Kinesis. So I think the short answer there right now is that
>>you know, it's crazy. It's really early, right? I would also add, just as with Hadoop, where there are so many different ways to process data, in the real-time space there are going to be so many different ways that people process that data. There's never going to be a single tool that you use for processing real-time data. It's a lot of tools, and it adapts to the way that people think about data. So this also brings us back to the DevOps culture, which you guys essentially founded at Amazon in the early days, and I've got to give you credit for that. You guys deserve it.
DevOps was really about building from the ground up in the cloud, post dot-com bubble. Really, that's the thing about Amazon: you've lived in your own world, right? To survive with less and help other developers.
>>But that brings up a good point, right? So okay, data's early and it's going to be advancing slowly. Can there be a single architecture for dealing with data, or is it going to be specialized systems? You're seeing Oracle make some moves with purpose-built engineered systems. You're seeing integrated stacks. What's the take on the data equation? I'm not just talking about Hadoop, but also the data coming off the Internet of Things. What is the reference architecture right now? I think what we're going to see is a set of patterns that we can adopt, and people will be using those patterns for doing particular types of processing. One of the other teams that I run is the fraud detection team, and we use a set of machine learning algorithms to continuously monitor usage of the cloud to identify patterns of behavior that are indicative of fraud.
>>That kind of pattern of use is very different from doing clickstream analysis, and the kind of pattern that we use for doing that would naturally be different. I think we're going to see a canonical set of patterns. I don't know if we're going to see a very particular set of technologies. Yeah. So that brings us back to the DevOps thing. I want to get your take on this, because DevOps is really about efficiencies. Software guys don't want to be hardware guys at the end of the day. That's how it all started. I don't want to provision the network. I don't want to rack and stack servers. I just want to push code, and you guys have created really easy ways to make that completely transparent. But now you're talking about composite application development.
You're saying, hey, I'm going to have an EMR over here for my Hadoop cluster, and then deal with, say, fraud detection on stream data. That's going to be a different system than Hadoop, or it could be a relational database.
>>Now I need to basically build a composite app. That's what we're talking about here: composite construction of resources. Is that kind of the new DevOps 2.0, maybe? We're trying to tease out what's next after DevOps. I mean, DevOps really means there's no operations. How does a developer deal with these kinds of complex environments, like fraud detection here, maybe an application there, a container for this database? Is it going to be fully composite? Well, I don't know if we've run the full circuit with the DevOps development model. It's a great model. It's worked really well for a number of startups. However, making it easy to plug different components together is, I think, just a great idea. As Adi mentioned just a moment ago, we have the ability to take data in Kinesis and pump it right into Elastic MapReduce.
>>It's great, and it makes it easy for people to use their existing applications with a new system like Kinesis. That kind of composing of applications has worked well for a long time, and I think you're just going to see us continuing to do more and more of that kind of work. So I'm going to ask both of you guys a question. Give me an example of when something broke internally. I'm not trying to go negative here, but part of your culture is to move fast and iterate. So with these important projects like Kinesis, give me an example of a helpful way in which you stumbled. What did you learn? What were the key pain points in the evolution of getting it out the door, and what key things did you learn from a success, a speed bump, or a failure along the way?
>>Well, I think one of the first things we learned came right after we shipped. We were still in limited preview, and we were trying it out with our customers, getting feedback and learning what they wanted to change in the product. One of the first things that we learned was that the amount of time it took to put data into Kinesis and receive a return code was too high for a lot of our customers. It was probably around a hundred milliseconds from the time that you put the data in to the time that we had replicated that data across multiple availability zones and returned success to the client. That was a moment for us to really think about what it meant to enable people to be pushing tons of data into Kinesis. And we went back... A hundred milliseconds?
>>That's low, that's not bad. But right away we went back and redoubled our efforts, and we came back in at around, you know, somewhere between 30 and 40 milliseconds, depending on your network connectivity. Hey, in the old days, that was the spinning disk state of the art. 10, 20 meg hard drives. That's right. Loading those Lotus files, you know, those Windows files. So you guys improved performance. That's one example. What's the biggest surprise that you guys have seen from a customer use case, something that was kind of like, wow, this is really something that we didn't see happening on a larger scale, that caught you by surprise?
>>Is it a use case, maybe a corner use case, like, well, I never figured that? You know, I would say the one thing that actually surprised us was how common it is for people to have multiple applications reading out of the same stream. Again, the basic use case for so many customers is, I'm going to take all this data and I'm just going to throw it into S3. And we kind of envisioned that there might be a couple of different applications reading data off that stream.
We have a couple of customers that actually have as many as three applications reading that stream of events coming out of Kinesis. Each one of them is reading from a different position in the stream. They're able to read from different locations and process that data differently.
>>But the idea that Kinesis is so different from traditional queuing systems and yet provides real-time functionality, and that multiple applications can read from it, that was a bit of a surprise. What's the number one use case right now? Who's adopting Kinesis? For the folks watching out there, we've got the Kinesis brain trust right here from Amazon. What are the killer no-brainer scenarios that you're seeing on the uptake side right now, that people who haven't really kicked the tires on Kinesis should be aware of? What should they be looking at? I think the number one use case is log ingestion. I'm tailing logs that are coming off of web servers and application servers, data that's just being produced continuously. You grab all that data and very easily put it into something like S3. The beauty of that model is I now have all the log data, I got it off of all of my hosts as quickly as possible, and I can go do log analysis later if there's a problem. That is the slam dunk use case for using Kinesis.
>>There are other scenarios that are beginning to emerge as well. I don't know, Adi, if you want to talk about that. One that's really interesting, and lots of customers are doing it already, is emitting data from all sorts of devices. These devices are not just your smartphones and tablets, which are practically full computing machines, but also seemingly low-power, seemingly dumb devices. And the design remains the same.
There are millions of these out there, and having the ability to capture the data they produce in real time is, you know... Just to highlight that, one of the things I'm hearing on theCUBE interviews, from all the customers we talk to, is that the number one thing is, I've just got to store the data. I don't know what I want to do with it yet. Now, that's a practice that's a hangover from the BI and data warehousing business of just storing for compliance reasons, which is basically like Glacier as far as I'm concerned.
>>Traditional business intelligence systems are like their version of Glacier: ship it out somewhere and give me those reports. Five weeks later they come back. But that's different now. You see people store that data and realize, I need to touch it faster, I just don't know yet when. That's why I'm teasing out this whole development 2.0 model, because I'm seeing more and more people want the data hanging around, but not fully parked out in Glacier or some sort of compliance storage. So, you know, I think I kind of understand where you're going. I'm going to use a model of how we used to do BI analytics in our own internal data warehouse. I also run the data warehouse for AWS. The classic BI model there is: somebody asks a question, we go off and do some analysis, and if it's a question that we're going to ask repeatedly, we build, you know, a special fact table or a dimensional view or something to be able to grind through that particular view and do it very quickly.
>>Kinesis offers a different kind of data processing model, which is, I'm collecting all of the data, and it makes it easy to capture everything, but now I can start doing things like, oh, there are certain pieces of data that I want to respond to quickly. Just like we would create dimensional views that would give us access to particular sets of data at a very quick pace.
We can now also respond very quickly when those events are generated. Well, you guys are the young guns in the industry now. I'm a little bit older, with the gray hair showing. We actually used the term data processing back in the day: the DP department or the MIS department, if you remember those days. MIS was management information systems. Are we going back to those terms? I mean, look at what's happening.
>>Is it the software mainframe in the cloud? I mean, these are some of the words you're using: data processing, data pipeline. Well, MIS, that's my word, but I mean, we're back to that old-school stuff, but different. Well, I think those kinds of very generic terms make a lot of sense for what we're doing, especially as we move into these brand new spaces, like, wow, what do I do with real-time data? Real-time data processing is kind of the third type of big data processing. Data warehousing was the first type: I know what my data looks like, and I've created indices, like a precomputation of the data. Hadoop clusters and the MapReduce model were kind of the second wave of big data processing, and real-time processing, I think, will be the third wave. I think our process... Well, I'm getting the hook here, but I've got to just say, you guys are doing an amazing job.
>>We're big fans of Amazon. I always say that it's very rare in the history of the world. We look at innovations like the printing press, the Wright brothers discovering flight, and things like Amazon with cloud. You guys have done something that's pretty amazing. But what I find fascinating is that it's very rare to see a company that's commoditizing and disrupting and innovating at the same time. It's really a unique value proposition, and the competition is responding: IBM, Google. So you guys have a lot of targets painted on your back by a lot of big players.
So, one, congratulations on your success, which means that, you know, you're not going to go into the open field and fight the British, to use the American Revolution analogy. You've got to continue to compete. So what's your view of that?
>>I mean, I'm sure you don't talk about competition. You've probably been told not to talk about it. But you've got to know that all the guns are on you right now. The big guys are putting up the sea wall against your wave of innovation. How do you guys deal with that? It's not like we ignore our competitors, but we obsess about our customers, right? It's just constantly looking for what people are trying to do and how we can help them. It can seem like a very simple strategy, but the strategy is to build what people want, and we get a lot of great feedback on how we can make our products better. And it certainly will force you to up your game when you have the competition sighting on you. You get more focused on the customer, which is cool.
>>But you guys are kind of aware that the game's on? I mean, at Amazon, has anyone given a little pep talk: hey, game is on, guys, let's rock and roll? You guys are aware, right? I think we're totally aware. I think we're actually sometimes a little surprised at how long it's taken our competitors to get into this industry with us. Again, as Andy talked about earlier today, we've had eight years in the cloud computing market. It's been a great eight years, and we have a lot of work to do, a lot of stuff ahead of us. You're going to be almost ready for middle school. Final question for you guys, and I'll give you the final word here. Share with the folks out there: why is this show so important at this point in time in this market? With this environment of thousands of people here learning about Amazon, what should they know about why this is such an important event?
I think our summits are a great opportunity for us to share with customers how to use our AWS services. You can learn firsthand not only from our hands-on labs, but also from our partners, who are providing information about how they use AWS resources. It's a great opportunity to meet a lot of people who are taking advantage of the cloud computing wave and see how to use the cloud most effectively.
>>It's a great time to be in the cloud right now, with all the amazing services coming out. There's no better time for people coming together, and that's probably as good a reason as any. You guys are doing a great job disrupting and changing the future: modern enterprise, modern business, modern applications. We're excited to watch it. If you guys keep focusing on your customer and that customer base, and you keep up the pace, that's the question: can you finish the race? That's what I always tell Dave a lot. I know Dave's watching. Shout out to Dave Vellante, who's on the mobile app right now as he's traveling. Guys, thanks for coming inside theCUBE. Kinesis, great stuff. Closing the loop in real time. Amazon's really building it out. Thanks for coming on. We'll be right back with our next guest after this short break. Thank you.
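
[Editor's sketch, not part of the interview.] The point made earlier in the conversation, that several applications can read the same stream from different positions, is the main way a stream differs from a traditional queue, where reading a message removes it. The toy class below illustrates that idea only. The class name, method names, and reader names are hypothetical, and it is an in-memory stand-in, not the Kinesis API.

```python
from typing import Dict, List


class InMemoryStream:
    """Toy append-only stream; every named reader keeps its own position."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._positions: Dict[str, int] = {}

    def put(self, record: str) -> int:
        """Append a record and return its position in the stream."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, reader: str, limit: int = 10) -> List[str]:
        """Return up to `limit` records after this reader's last position.

        Reading advances only this reader's position; records are never
        removed, so other readers still see them.
        """
        start = self._positions.get(reader, 0)
        batch = self._records[start:start + limit]
        self._positions[reader] = start + len(batch)
        return batch


stream = InMemoryStream()
for line in ["GET /a", "GET /b", "POST /c"]:
    stream.put(line)

# Two hypothetical applications consume the same data at their own pace.
archived = stream.read("archiver")          # reads everything in one go
first = stream.read("dashboard", limit=1)   # reads one record at a time
rest = stream.read("dashboard")
print(archived, first, rest)
```

Because each reader owns its position, an archiving job and a dashboard can process the same events independently, which matches the multi-application pattern the guests said surprised them.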

Published Date : Mar 26 2014
