A Day in the Life of an IT Admin | HPE Ezmeral Day 2021

>>Hi, everyone. Welcome to Ezmeral Day. My name is Yasser Joharji. I'm the director of systems engineering for Ezmeral at HPE. Today we're joined by my colleague, Don Wake, a technical marketing engineer, who will talk to us about a day in the life of an IT administrator through the lens of the Ezmeral Container Platform. We'll be answering your questions in real time, so if you have any questions, please feel free to put them in the chat, and we should have some time at the end for some live Q&A. Don, want to go ahead and kick us off? >>All right. Thanks a lot, Yasser. Yeah, my name is Don Wake. I'm the tech marketing guy, and welcome to Ezmeral Day, a day in the life of an IT admin, and happy St. Patrick's Day at the same time. I hope you're wearing green. Virtual pinch if you're not wearing green; you'll have to look that up if you don't know what I'm talking about. So we're just going to go through some quick things, talk about modern business IT needs to kind of set the stage, and go right into a demo. So what is the need here that we're trying to fulfill with the Ezmeral Container Platform? It's all rooted in analytics. Modern businesses are driven by data. They are also application-centric, and the separation of applications and data, or the relationship between the two, has never been more important. Applications are very data hungry these days. They consume data in all new ways. The applications themselves are virtualized, containerized, and distributed everywhere, and optimizing every decision and every application has become a huge problem to tackle for every enterprise. 
So we look at, for example, data science as one big use case here, and it's really a team sport. Today I'm wearing the hat of perhaps the operations team, maybe a software engineer, a guy working on continuous integration, continuous development, integration with source control, and I'm supporting these data scientists and data analysts. And I also have some resource control. I can decide whether or not the data science team gets a particular cluster of compute and storage so that they can do their work. So this is the solution that I've been given as an IT admin, and that is the Ezmeral Container Platform. >>And just walking through this real quick, at the top, I'm trying, wherever possible, not to get involved in these guys' lives. So the data engineers, scientists, app developers, DevOps guys, they all have particular needs, and they can access their resources and spin up clusters, or just do work with a Jupyter notebook, or run Spark or Kafka or any of the popular analytics platforms, just by getting endpoints that we can provide to them, web URLs, and they're self-service. But in the backend, I can then, as the IT guy, make sure the Kubernetes clusters are up and running. I can assign particular access to particular roles. I can make sure the data's well protected, and I can connect them. I can import clusters from public clouds. I can put my clusters on premises if I want to. >>And I can do all this through this centralized control plane. So today I'm just going to show you how I'm supporting some data scientists. One of our very own guys is actually doing a demo right now as well, called A Day in the Life of a Data Scientist. He's on the opposite side, not caring about all the stuff I'm doing in the backend, and he's training models and registering the models and working with data inside his Jupyter notebook, running inferences, running Postman scripts. 
And so I'm in the background here, making sure that he's got access to his cluster, his storage is protected, his training models are up, he's got service endpoints, connecting him to his source control and making sure he's got access to all that stuff. So he's got a taxi ride prediction model that he's working on, and he has a Jupyter notebook and models. So why don't we get hands on, and I'll just jump right over to the Ezmeral Container Platform. >>So this is a web UI. This is the interface into the container platform, our centralized control plane. I'm using my Active Directory credentials to log in here. >>And when I log in, I've also been assigned a particular role with regard to how much of the resources I can access. Now, in my case, I'm a site admin. You can see right up here in the upper right-hand corner, I'm a site admin, and I have access to lots and lots of resources. And the one I'm going to be focusing on today is a Kubernetes cluster. So I have a cluster I can go into here, and let's say we have a new data scientist come on board. I can give him his own resources so he can do whatever he wants, use some GPUs, and not affect other clusters. So we have all these other clusters already created here. You can see that this is a very busy production system. They've got some dev clusters over here. >>I see here we have a production cluster. If someone needs to produce something for data scientists to use, it has to be well protected and not be treated like a development resource. So under this production cluster, I decided to create a new Kubernetes cluster. And literally I just push a button, Create Kubernetes Cluster. Once I've done that, I'll just show you some of the screens, and this is a live environment. 
So I couldn't actually do it, since all my hosts are used up right now, but I would be able to go in here and give it a name, select some hosts to use as the primary master controller and some workers, and answer a few more questions. And once that's done, I have created a whole other Kubernetes cluster that I could also create tenants from. >>So tenants are really Kubernetes namespaces. In addition to taking hosts and making Kubernetes clusters, I can also go to existing clusters and carve out a namespace from them. So I look at some of the clusters that were already created, and let's see, this here is an example of a tenant that I could have created from that production cluster. To do that here in the namespace, I just hit Create, and similar to how you create a cluster, you can now carve out from a given cluster, we'll say the production cluster, and give it a name and a description. I can even tell it I want this specific one to be an AI/ML project, which really is our ML Ops license. So at the end of the day, I can say, okay, I'm going to create an ML Ops tenant from that cluster that I created. >>And so I've already created it here for this demo, and I'm going to go into that Kubernetes namespace now, which we also call a tenant. I mean, it's multitenancy; the name essentially means we're carving out resources so that somebody can be isolated from another environment. And at this point I could also give access to this tenant, and only this tenant, to my data scientist. So the first thing I typically do is go in here, and you can actually assign users right here. Right now it's just me. But if I wanted to, for example, give this to Terry, I could go in here and find another user and assign him from this list, as long as he's got the proper credentials here. 
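In stock Kubernetes terms, the tenant setup described here, a namespace carved from a cluster with one user scoped to it, can be sketched roughly as below. This is an illustration under assumptions, not what the container platform itself generates: the tenant name `mlops-demo`, the user name `terry`, and the choice of the built-in `edit` ClusterRole are all hypothetical, and the platform's Active Directory integration is not modeled.

```python
import json


def tenant_manifests(tenant, user):
    """Return a Namespace plus a RoleBinding that scopes `user` to it.

    Both names are hypothetical placeholders, not values from the demo.
    """
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": tenant},
    }
    # A RoleBinding is namespaced, so the access it grants applies only
    # inside this tenant ("this tenant and only this tenant").
    role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{user}-edit", "namespace": tenant},
        "subjects": [
            {
                "kind": "User",
                "name": user,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        # "edit" is one of Kubernetes' default user-facing ClusterRoles;
        # choosing it here is an assumption made for illustration.
        "roleRef": {
            "kind": "ClusterRole",
            "name": "edit",
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return [namespace, role_binding]


if __name__ == "__main__":
    for manifest in tenant_manifests("mlops-demo", "terry"):
        print(json.dumps(manifest, indent=2))
```

Each printed manifest could be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML.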
So you can see here, all these other users have Active Directory credentials, and when we created the cluster itself, we also made sure it integrated with our Active Directory, so that only authorized users can get in there. >>Let's say the first thing I want to do is make sure that when I do Jupyter notebook work, or when Terry does, I'm going to connect him straight up to the GitHub repository. So he gives me a link to GitHub and says, hey man, this is all of my cluster work that I've been doing. I've got my source control there, my scripts, my Python notebooks, my Jupyter notebooks. So I create a configuration. I say, okay, here's a Git repo, here's the link to it. I can use a token, here's his username, and I can now put in that token. So this is actually a private repo, using a token, the standard Git interface. And then the cool thing after that: you can go in here and actually copy the authorization secret. >>And this gets into the Kubernetes world. If you want to make sure you have secure integration with things like your source control or perhaps your Active Directory, that's all maintained in secrets. So you can take that secret, and when I then create his notebook, I can put that secret right in here in this launch YAML. And I say, hey, connect this Jupyter notebook up with this secret so he can log in. And when I've launched this Jupyter notebook cluster, this is actually now within my Kubernetes tenant. It is now really a pod. And if I want to, I can go right into a terminal for that Kubernetes tenant and say kubectl get pods; this is standard, CNCF-certified Kubernetes. And when I do this, it'll tell me all of the active pods and, within those pods, the containers that I'm running. >>So I'm running quite a few pods and containers here in this artificial intelligence and machine learning tenant. 
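For readers less familiar with Kubernetes secrets, the idea of storing a Git token as a secret and pointing a notebook pod at it can be sketched as follows. This is a minimal illustration using generic Kubernetes objects. The secret name, key names, and environment variable names are assumptions, and the demo does not show the format the platform's own launch YAML expects, so it may well differ.

```python
import base64
import json


def git_token_secret(name, namespace, username, token):
    """Build a generic Opaque Secret holding Git credentials.

    Kubernetes expects values under `data` to be base64-encoded.
    """
    def encode(value):
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"username": encode(username), "token": encode(token)},
    }


def env_from_secret(secret_name):
    """Container `env` entries that read the credentials from the Secret.

    The variable names GIT_USERNAME and GIT_TOKEN are hypothetical.
    """
    return [
        {
            "name": "GIT_USERNAME",
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "username"}},
        },
        {
            "name": "GIT_TOKEN",
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "token"}},
        },
    ]


if __name__ == "__main__":
    # Placeholder values only; never commit a real token.
    secret = git_token_secret("terry-git", "mlops-demo", "terry", "not-a-real-token")
    print(json.dumps(secret, indent=2))
    print(json.dumps(env_from_secret("terry-git"), indent=2))
```

The point of the indirection is that the pod spec only names the secret; the token itself never appears in the notebook's launch configuration.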
So that's kind of cool. Also, if I wanted to, I could download the config for kubectl, and then I can do something like this on my own system, where I'm more comfortable perhaps: kubectl get pods. So this is running on my laptop, and I just had to do a kubectl refresh and give the IP address and authorization information in order to connect from my laptop to that endpoint. So from a CI/CD perspective, an IT admin guy usually wants to use tools right on his desktop. So here I am back in my web browser. I'm also here on the dashboard of this Kubernetes tenant, and I can see how it's doing. >>It looks like it's kind of busy here. I can focus specifically on a pod if I want to. I happen to know this pod is my Jupyter notebook pod. So why don't I show how I could enable my data scientist by just giving him the URL, or what we call a notebook service endpoint or notebook endpoint. Just by clicking on this URL, or copying it, it's a link, and then emailing it to him, I can say, okay, here's your Jupyter notebook. Just log in with your credentials. I've already logged in. And so then he's got his Jupyter notebook here, and you can see that he's connected to his GitHub repo directly. He's got all of the files that he needs to run his data science project within here, and this is really in the data scientist's realm. >>He can see that he has access to centralized storage, and he can copy the files from his GitHub repo to that centralized storage. And these commands are kind of cool. They're little Jupyter magic commands, and we've got some of our own that show that attachment to the cluster. But you can see here, if you run these commands, they're actually looking at the shared project repository managed by the container platform. 
So just to show you that again, I'll go back to the container platform. And in fact, the data scientist could do the same thing, going from the Jupyter notebook back to the platform. So here's this project repository. This is another big point. Now, putting on my storage admin hat, I've got this shared storage volume that is managed for me by the Ezmeral Data Fabric. >>In here, you can see that the data scientist, from his Git repo, is able to copy his code directly through the Jupyter notebook. He was able to run his Jupyter notebook and create this XGBoost model. So this file can then be registered in this AI/ML tenant. He can go in here and register his model. So this is really where the data scientist can self-service: kick off his notebooks, even get a deployment endpoint so that he can then run inference against his model. So here again is another URL that you could take and put into something like a Postman REST URL and get answers. But let's say he's been doing all this work and I want to make sure that his data's protected. How about creating a mirror? >>So if I want to create a mirror of that data, I go back to this other one, and this is the data fabric embedded in a very special cluster called the Picasso cluster. It's a version of the Ezmeral Data Fabric that allows you to launch what was formerly called MapR as a Kubernetes cluster. And when you create this special cluster, every other cluster that you create automatically gets things like that tenant storage I showed you, to create a shared workspace, and it's automatically managed by this data fabric. You're even given an endpoint to go into the data fabric and then use all of the awesome features of the Ezmeral Data Fabric. So here I can just log in, and now I'm at the data fabric web UI to do some data protection and mirroring. >>So let's go over here. 
Let's say I want to create a mirror of that tenant. So I forgot to note what the name of my tenant was. I'm going to go back to my tenant and find the name of the volume that I'm playing with here. So in my AI/ML tenant, I'm going to go to my source control, my project repository that I want to protect. And I see that the Ezmeral Data Fabric has created tenant 30 as a volume. So I'll go back to my data fabric here, and I'm going to look for tenant 30. And if I want to, I can go into tenant 30. >>Okay. Down here, I can look at the usage. I've used very little of the allocated storage, but you know what, let's go ahead and create a volume to mirror that one. It's a very simple web UI that says Create Volume. I go in here and I say I want to do a tenant 30 mirror. And I say make it a mirror volume. I want to use my Picasso cluster. I want to use tenant 30. So now that's actually looking up in the data fabric database, and there's tenant 30, okay, so it knows exactly which one I want to use. I can go in here and I can say ext HCP, tenant 30 mirror; I can give it whatever name I want, and this path here. >>And that's a whole other demo: this could be in Tokyo. This could be mirrored to all kinds of places all over the world, because this is truly a global namespace, which is a huge differentiator for us. In this case, I'm creating a local mirror, and I can go down here and add auditing and encryption. I can do access control. I can change permissions. So full-service interactivity here. And of course this is using the web UI, but there are also REST API interfaces as well. So that is pretty much the brunt of what I wanted to show you in the demo. So we got hands on, and I'm just going to throw this up real quick and then come back to Yasser. 
See if he's got any questions he has received from anybody watching, if you have any new questions. >>Yeah, we've got a few questions. We can take some time to hopefully answer a few. So it does look like you can integrate or incorporate your existing GitHub to be able to extract shared code or repositories. Correct? >>Yeah. So we have that built in, and it can be either GitHub or Bitbucket; it's a pretty standard interface. So just like you can go into any given GitHub and do a clone of a repo and pull it into your local environment, we integrated that directly into the GUI so that you can say to your AI/ML tenant, to your Jupyter notebook, here's my GitHub repo. When you open up my notebook, just connect me straight up. So it saves you some steps there, because Jupyter notebook is designed to be integrated with GitHub. So we have GitHub integrated in as well, or Bitbucket. >>Right. Another question around the file system: has the MapR file system that was carried over been modified in any way to run on top of Kubernetes? >>So yeah, I would say that the MapR file system, the data fabric, what I showed here is the Kubernetes version of it. It gives you a lot of the same features, but if you need to run it on bare metal, maybe because you have performance concerns, you can also deploy it as a separate bare-metal instance of the data fabric. This is just one way that you can use it, integrated directly into Kubernetes. It depends really on the needs of the user, and the data fabric has a lot of different capabilities, but it has a lot of the core file system capabilities where you can do snapshots and mirrors, and it's of course striped across multiple disks and nodes. And MapR Data Fabric has been around for years. 
And it's designed for integration with these analytic-type workloads. >>Great. You showed us how you can manage Kubernetes clusters through the Ezmeral Container Platform UI. But the question is, can you control who accesses which tenant, or I guess namespace, that you created? And also, can you restrict or inject resource limitations for each individual namespace through the UI? >>Oh yeah. That's a great question. Yes to both of those. So as a site admin, I had lots of authority to create clusters, to go into any cluster I wanted, but typically, for the data scientist example I used, I would create a user for him. And there's a couple of ways you can create users, and it's all role-based access control. So I could create a local user and have the container platform authenticate him, or I can integrate directly with Active Directory or LDAP, even including which groups he has access to. And then in the user interface for the site admin, I could say he gets access to this tenant and only this tenant. Another thing you asked about is limitations. So when you create the tenant, to prevent that noisy neighbor problem, you can go in and create quotas. >>So I didn't show the process of actually creating a tenant, but integral to that flow is, okay, I've defined which cluster I want to use, I've defined how much memory I want to use, so there's a quota right there. You could say, hey, how many CPUs am I taking from this pool? And that's one of the cool things about the platform: it abstracts all that away. You don't have to really know exactly which host. You can create the cluster and select specific hosts, but once you've created the cluster, it's now just a big pool of resources. 
So you can say Bob, over here, is only going to get 50 of the hundred CPUs available, and he's only going to get X gigabytes of memory, and he's only going to get this much storage that he can consume. So you can then safely hand off something and know they're not going to take all the resources, especially the GPUs, which can be expensive, and you want to make sure that one person doesn't hog all the resources. So absolutely, quotas are built in there. >>Fantastic. Well, I think we are out of time. We have a list of other questions, and we will absolutely reach out and get all your questions answered, for those of you who asked questions in the chat. Don, thank you very much. Thanks, everyone else, for joining. Don, will this recording be made available for those who couldn't make it today? >>I believe so. Honestly, I'm not sure what the process is, but yeah, it's being recorded, so they must have done that for a reason. >>Fantastic. Well, Don, thank you very much for your time, and thank you, everyone else, for joining. Thank you.
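The per-tenant quotas described in the Q&A (Bob getting 50 of the 100 available CPUs, plus capped memory and storage) correspond to what stock Kubernetes expresses as a ResourceQuota on a namespace. The sketch below is only an illustration of that idea: the demo does not show how the platform stores quotas, the memory, storage, and GPU figures are placeholder values, and the `nvidia.com/gpu` resource name assumes an NVIDIA device plugin, which the source does not mention.

```python
import json


def tenant_quota(namespace, cpus, memory_gi, storage_gi, gpus=0):
    """Build a ResourceQuota capping total requests inside one namespace."""
    hard = {
        "requests.cpu": str(cpus),
        "requests.memory": f"{memory_gi}Gi",
        # Sum of storage requested by all PersistentVolumeClaims.
        "requests.storage": f"{storage_gi}Gi",
    }
    if gpus:
        # Extended resources are quota'd with the "requests." prefix.
        # The NVIDIA resource name is an assumption, not from the demo.
        hard["requests.nvidia.com/gpu"] = str(gpus)
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


if __name__ == "__main__":
    # 50 CPUs comes from the example in the talk; the rest are placeholders.
    print(json.dumps(tenant_quota("bob", 50, 128, 500, gpus=1), indent=2))
```

With a quota like this in place, pods that would push the namespace past its limits are rejected at admission time, which is what keeps one tenant from becoming a noisy neighbor.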

Published Date : Mar 17 2021

ENTITIES

Entity	Category	Confidence
Yasir	PERSON	0.99+
Terry	PERSON	0.99+
Don wake	PERSON	0.99+
Tokyo	LOCATION	0.99+
50	QUANTITY	0.99+
Yasmin Joffey	PERSON	0.99+
First	QUANTITY	0.99+
two applications	QUANTITY	0.99+
Don	PERSON	0.99+
Today	DATE	0.99+
today	DATE	0.99+
St. Patrick's day	EVENT	0.98+
10	QUANTITY	0.98+
both	QUANTITY	0.98+
30 K.	QUANTITY	0.98+
one	QUANTITY	0.98+
Kubernetes	TITLE	0.98+
HPE	ORGANIZATION	0.97+
one person	QUANTITY	0.97+
first thing	QUANTITY	0.97+
Yasser	PERSON	0.97+
Kafka	TITLE	0.97+
Python	TITLE	0.96+
ASML	ORGANIZATION	0.96+
CNCF	ORGANIZATION	0.96+
one way	QUANTITY	0.95+
Jupiter	LOCATION	0.94+
ESMO	ORGANIZATION	0.94+
GitHub	ORGANIZATION	0.94+
ASML	EVENT	0.93+
Bob	PERSON	0.93+
Matt BARR	PERSON	0.92+
this year	DATE	0.91+
Jupiter	ORGANIZATION	0.9+
each individual	QUANTITY	0.86+
30	OTHER	0.85+
a hundred CPU	QUANTITY	0.82+
ASML	TITLE	0.82+
2021	DATE	0.8+
coop	ORGANIZATION	0.78+
a day	QUANTITY	0.78+
Kubernetes	ORGANIZATION	0.75+
couple	QUANTITY	0.75+
A Day in the Life	TITLE	0.73+
an IT	TITLE	0.7+
30 mirror	QUANTITY	0.69+
case	QUANTITY	0.64+
CTL	COMMERCIAL_ITEM	0.57+
few more questions	QUANTITY	0.57+
coop CTL	ORGANIZATION	0.55+
years	QUANTITY	0.55+
Quentin	PERSON	0.51+
30	QUANTITY	0.49+
Ezmeral Day	PERSON	0.48+
lots	QUANTITY	0.43+
Jupiter	COMMERCIAL_ITEM	0.42+
10	TITLE	0.41+
Picasso	ORGANIZATION	0.38+

Jack Norris - Hadoop on the Hudson - theCUBE

>>Live from New York City, it's theCUBE. Here's your host, Jeff Frick. >>Hi, Jeff Frick here with theCUBE. We're on the ground at the USS Intrepid at the Hadoop on the Hudson party put on by MapR. I think it's the party of the night tonight here in big data week in New York City, with Strata, Hadoop World, Big Data NYC. So Jack, a great venue. >>Yeah, it's excellent here. >>The place is filled. I'm just struck by the technology. There's a Gemini capsule over there, about 50 years old. It's about the size of a Volkswagen; I thought it would be much bigger. And to think that those guys went up into space with probably less technology than is on your four-year-old flip phone. Amazing. >>Yeah. Not much data at all. >>No. If you look at it, you just kind of get that bounce on the gravity thing, which I never quite understood. So you guys had some big news today. Why don't you give us a rundown on some of the announcements? >>We had two big announcements. One was incorporating MapR-DB in our community edition that came out. We also reported results from our customers, where the majority of customers reported less than a 12-month payback, 65% reported five X or greater return, and 40% ten X or greater. And that included a subset of those customers that had experience with other distributions. So it's kind of a testament that when you get serious about Hadoop, you get serious with MapR. >>And when they're getting those returns on investment, we're always trying to explore where the big ROI is, because it's really in value that's released for the customer. It's not necessarily because it's a cheaper way to do it, right? >>So 63% was cost reduction that was driving it, about 41% were top-line revenue projects, and about 23% were related to risk reduction and risk mitigation. And if you add those up, it's greater than a hundred percent because many customers are doing multiple applications. >>Great. 
So you've been coming to Hadoop World for longer than you would admit to me before we came on camera, and the baseball playoffs are going on right now, and we like to talk in sports analogies. So what inning are we in, in this adoption of big data and Hadoop specifically? >>Early, early innings. But what we've seen is the bases are loaded and we're up. >>And it seems we're way past the POC stage now. Now we're really getting in there. >>And the customer announcement we did kind of shows how people are hitting it out of the park with Hadoop. And a lot of that is by impacting the operations, impacting the business as it happens. That's coupling analytics with this higher-arrival-rate data from a variety of sources, and making adjustments so that you can impact revenue as business is happening. You can mitigate risk as it's happening. It's not just a reporting, looking-back function. >>Right, right. It's being able to react in real time, which is defined by in time to do something about it. >>Right. Exactly. >>All right. Well, thanks for hosting a great party. Jack Norris. Here we are on the ground at the USS Intrepid at Hadoop on the Hudson. If you take a nice picture, tweet that in; I think they've got some prizes. Hadoop Hudson is the hashtag. Jeff Frick on the ground. You're watching theCUBE. Thanks. Big ship.

Published Date : Oct 22 2014

ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
40%	QUANTITY	0.99+
Jack Norris	PERSON	0.99+
Matt BARR	PERSON	0.99+
65%	QUANTITY	0.99+
63%	QUANTITY	0.99+
One	QUANTITY	0.99+
10 X	QUANTITY	0.99+
New York city	LOCATION	0.99+
NYC	LOCATION	0.99+
today	DATE	0.99+
greater than a hundred percent	QUANTITY	0.99+
about 23%	QUANTITY	0.99+
Volkswagen	ORGANIZATION	0.98+
two big announcements	QUANTITY	0.98+
Jack	PERSON	0.98+
about 41%	QUANTITY	0.98+
five X	QUANTITY	0.98+
about 50 years old	QUANTITY	0.94+
Mapbox	ORGANIZATION	0.93+
Hadoop	TITLE	0.93+
tonight	DATE	0.91+
less than a 12 month	QUANTITY	0.91+
Hudson	LOCATION	0.87+
Hadoop	LOCATION	0.86+
four year old	QUANTITY	0.83+
Hadoop on	LOCATION	0.78+
USS Intrepid	ORGANIZATION	0.76+
map RDB	TITLE	0.68+
Hadoop Hudson	TITLE	0.68+
Gemini	COMMERCIAL_ITEM	0.53+
some	QUANTITY	0.5+
Hadoop on the	TITLE	0.5+

Steve Wooledge - HP Discover Las Vegas 2014 - theCUBE - #HPDiscover

>>Live from Las Vegas, Nevada, it's theCUBE at HP Discover 2014, brought to you by HP. >>Welcome back, everyone, live here in Las Vegas for HP Discover 2014. This is theCUBE. We go out where the action is. We're on the ground here at HP Discover, getting all the signals, sharing them with you, extracting the signal from the noise. I'm John Furrier, founder of SiliconANGLE. I'm joined by Steve Wooledge, VP of product marketing at MapR Technologies. Great to see you. Welcome to theCUBE. >>Thank you. >>I know you've got a plane to catch, but I really wanted to squeeze you in because you guys are a leader in the big data space. You guys are in the top three, the three big whales: MapR, Hortonworks, Cloudera. You're part of the original big data industry. When we did theCUBE, when we first started, the industry had like 30, 34 employees total, combined, with one company, Cloudera, and then MapR announced, and then Hortonworks. You guys have been part of that holy trinity of early pioneers. Give us the update. You guys are doing very, very well. We talked to you guys at Hadoop Summit last week, saw Jack Norris at the party. Give us the update on what's going on with the momentum and the traction, and then I want to talk about some of the things with the product. >>Yeah. So we've seen a tremendous uptick in sales at MapR. We tripled revenue; we announced that publicly about a month ago. So we went up 300% in sales over Q3, I'm sorry, Q1 of 2013. And I think it's really the maturity of the market. As people move more towards production, they appreciate the enterprise features we built into the MapR distribution for Hadoop. 
So the stats I would share are that 80% of our customers triple the size of their cluster within the first 12 months, and 50% of them double the size of the cluster, because they had that first production success use case and they find other applications and start rolling out more and more. So it's been great for us. >>You know, I always joke with Jack Norris, who's the VP of marketing over there, and John Schroeder, the CEO, about MapR's humbleness. You don't have the fanfare of all the hype. The press loves Cloudera. Now, sure, they have done some pretty amazing things. They've had a liquidity event, so essentially kind of an IPO, if you will, that huge financing from Intel, and they're doing great, with a big sales force. Hortonworks has got their open source play. You guys have got your heads down as well. So talk about that. How many employees do you guys have, and what's going on with the product? How many products do you guys actually have? >>Well, we have one product. We have the MapR distribution for Hadoop, and it's got all the open source packages directly within it, but where we really innovate is in the core. That's where we spent our time early on, really innovating that data platform to give everything within the Hadoop ecosystem more reliability, better availability, performance, security, scale. >>So it's open source contributions to the core, and you guys put stuff on top of that. Is that how it works? >>Yeah. And there are even some projects we lead, like Apache Mahout and Apache Drill, which is coming into beta shortly. On other projects, we commit and contribute back. So in the distribution, we're distributing all those projects, but where we really innovate is at that data platform level. >>So HP is a big data leader, obviously. They bought Autonomy. They have HP Vertica. You guys are here. Hey, what are you doing here? 
Obviously we covered on theCUBE the announcement with HP Vertica. Are you here for that reason? Is there other biz dev, other activity going on, other integration opportunities? >>Yeah, a few things. So obviously the HP Vertica news was big. We went into general availability with that solution the first week of May. So what we have is the HP Vertica database integrated directly on top of our data platform. So it's this hybrid solution where you have a full SQL database directly within your Hadoop distribution. So we had a couple of sessions on that. We had a nice panel discussion with our friends from Cloudera and Hortonworks. So, a really good discussion with HP about just the ecosystem and how it's evolving. The other thing we're doing with HP now is, you know, we've got reference architectures on their hardware lines. So people can deploy MapR on HP hardware. But then also we're talking with the Autonomy group about enterprise search and looking at a similar type of integration, where you could have the search integrated directly into your Hadoop distro. And we've got some joint accounts where we're piloting that now. >>So you guys are integrating with HP pretty significantly. That deal is working well? >>Absolutely. >>What's the coolest thing that you've seen within HP that you can share? I ask because in the big data landscape, everyone's, you know, hunkering down, working on their features, but outside in the real world, big data is not top of mind for the CIO 24/7. It's probably an item that they're addressing. What have you seen, and what have you been most impressed with at HP here? >>Yeah. You know, this is my first HP event like this. I think the strategy they have is really good. I think in certain areas, like the cloud in particular, with Helion, I think they made a lot of early investments there and placed some bets. And I think that's going to pay off well for them.
And that marries pretty nicely with our strategy as well, in that, you know, we have on-premise deployments, but we're also an OEM, if you will, within Amazon Web Services. So we have a lot of agility in the cloud. And I think as those products and the partnership with HP evolve, we'll be playing a lot more with them in the cloud as well. >>Let me ask you a question. I want you to share with the folks out there, in your own words, what is it about MapR that they may not understand or might not know about? Do a little humble brag and share some insight into MapR for folks that don't know you guys as a company, and for the folks that may have a misperception of what you guys do. Share with them what MapR is all about. >>Yeah. I mean, for me, I was in this space with Aster Data, and I've been around the whole Hadoop and MapReduce area since 2008, and I'm pretty familiar with everybody in the space. I really looked at MapR as the best technology, hands down. You look at the Forrester Wave, and they rank us as having the best technology today, as well as product roadmap. I think the misperception is people think, oh, it's proprietary and closed. It's actually the opposite of that. We have an unbiased open source approach where we'll ship and support in our distribution the entire Apache Spark stack. We're not selective over which projects within Apache Spark we support. For things like SQL on Hadoop, we support Impala as well as Hive and other SQL-on-Hadoop technologies, including the ability to integrate HP Vertica directly in the system. And it's because of the openness of our platform. I'd say it's actually more open because of the standards we've integrated into the data platform to support a lot of third-party tools directly within it. So there is no lock-in. The storage formats are all the same. The code that runs on top of the distribution from the projects is exactly the same.
So you can build a project in Hive or some other system, and you can port it between any of the distributions. So there isn't a lock-in. >>At the end of the day, what the customers want is ease of integration. They want reliability. >>That's right. >>And so what are you guys working on next? What's the big product marketing roadmap that you can share with us? >>Yeah, I think for us, the innovations we did in the data platform allow us to support not only more applications, but more types of operational systems. So integrating things like fraud detection and recommendation engines directly with the analytical systems to really speed up the accuracy in targeting and detecting risk and things like that. So I think, you know, Hadoop has sort of been this batch analytic type of platform, but the ability to converge operations and analytics in one system is really going to be enabled by technology like MapR. >>How many employees do you guys have now? >>I'm not sure what our CFO would let me say there, but I can say we're over 200 at this point. >>And over 500 customers, a number we got at Hadoop Summit. Congratulations. We covered your relationship with HP during our Big Data SV. That was exciting. Good to see John Schroeder. Very impressive team. I'm impressed with MapR. I always have been. You guys have definitely stuck to your knitting, saying what you're going to do, and again, leading the big data space. And again, not proprietary is a very key word, and that's really cool. So thanks for coming on. Really appreciate it, Steve. We'll be right back. This is theCUBE, live in Las Vegas, extracting the signal from the noise, with MapR here at HP Discover 2014. We'll be right back after this short break.

Published Date : Jun 12 2014

ENTITIES

Entity	Category	Confidence
John Schroeder	PERSON	0.99+
Steve Woolwich	PERSON	0.99+
Steve	PERSON	0.99+
Jack Norris	PERSON	0.99+
HP	ORGANIZATION	0.99+
John Frodo	PERSON	0.99+
three	QUANTITY	0.99+
80%	QUANTITY	0.99+
Steve Wooledge	PERSON	0.99+
50%	QUANTITY	0.99+
John furrier	PERSON	0.99+
Las Vegas	LOCATION	0.99+
Matt BARR	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
Stephanie	PERSON	0.99+
30	QUANTITY	0.99+
300%	QUANTITY	0.99+
first	QUANTITY	0.99+
last week	DATE	0.99+
Aster	ORGANIZATION	0.99+
2008	DATE	0.98+
Q1	DATE	0.98+
Las Vegas, Nevada	LOCATION	0.98+
one product	QUANTITY	0.98+
34 employees	QUANTITY	0.98+
one system	QUANTITY	0.98+
evolvable	ORGANIZATION	0.98+
over five	QUANTITY	0.97+
SQL	TITLE	0.97+
three big whales	QUANTITY	0.97+
MapReduce	ORGANIZATION	0.96+
SiliconANGLE	ORGANIZATION	0.96+
first 12 months	QUANTITY	0.95+
Apache Mahal	ORGANIZATION	0.95+
map map	ORGANIZATION	0.95+
over 200	QUANTITY	0.95+
24	OTHER	0.94+
today	DATE	0.94+
Intel	ORGANIZATION	0.92+
Matt	PERSON	0.92+
Salesforce	ORGANIZATION	0.91+
2014	DATE	0.9+
Impala	TITLE	0.9+
Hadoop	ORGANIZATION	0.89+
HP Vertica	ORGANIZATION	0.89+
map bar	ORGANIZATION	0.89+
Hadoop	TITLE	0.86+
one company	QUANTITY	0.85+
dupe summit	EVENT	0.84+
about a month ago	DATE	0.83+
Bucher	PERSON	0.81+
Discover 2014	EVENT	0.78+
first week of may	DATE	0.77+
Apache drill	ORGANIZATION	0.74+
#HPDiscover	ORGANIZATION	0.73+
Mapbox	TITLE	0.73+
2013	DATE	0.72+
SQL on	TITLE	0.7+
art technologies	ORGANIZATION	0.63+
Apache	ORGANIZATION	0.61+

Jack Norris - Hadoop Summit 2014 - theCUBE - #HadoopSummit

>>theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks, "We do Hadoop," and headline sponsor WANdisco, "We make Hadoop invincible." >>Okay, welcome back, everyone. We're live here in Silicon Valley, in San Jose. This is Hadoop Summit. This is SiliconANGLE and Wikibon's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, joined by my cohost, Jeff Kelly, top big data analyst in the community. Our next guest is Jack Norris, CMO of MapR. Security, enterprise: that's the buzz of the show, and it was the buzz of OpenStack Summit, another open source show. And here this year, you're just seeing move after move around a couple of critical issues. Enterprise-grade Hadoop: Hortonworks announced a big acquisition, went all in, as they said, and now Cloudera follows suit with their news today. Are you sitting back saying they're catching up to you guys? I mean, how do you look at that? Because you guys have the security stuff nailed down. So how do you feel about that now? >>I think, if you look at the Hadoop market, it's definitely moving from a test, experimental phase into a production phase. We've got tremendous customers across verticals that are doing some really interesting production use cases. And we recognized very early on that really meeting the needs of customers required some architectural innovation. So we're combining the open source ecosystem packages with some innovations underneath to really deliver high availability, data protection, and disaster recovery features. Security is part of that. But if you can't protect the data, if you can't have multitenancy and separate workflows across the cluster, then it doesn't matter how secure it is. You know, you need those. >>I've got to ask you a direct question since we're here at Hadoop Summit, because we get this question all the time.
People say SiliconANGLE and Wikibon are so successful, but I just don't understand your business model. Well, we have free content, and we have some underwriters. So you guys have been very successful, yet people are looking at MapR as kind of the quiet leader. You're doing your business, you're making money. Jeff shared some numbers with us that in the Hadoop community, about 20% are paying subscriptions. That's unlike your business model. So explain to the folks out there the business model, and specifically the traction, because you have >>Customers. Yeah. Oh no, we've got over 500 paying customers. We've got at least one $1 million customer in seven different verticals. So we've got breadth and depth, and our business model is simple. We're an enterprise software company that's looking at how to provide the best of open source as well as innovations underneath. >>The most open distribution of Hadoop, but you add that value separately to that, right? So it's not so much that you're proprietary at all, right? Can you clarify that? >>Right. So if you look at this exciting ecosystem, Hadoop is fairly early in its life cycle. In a commoditization phase, like Linux or relational databases with MySQL, open source kind of equates to the whole technology. Here, at the beginning of this life cycle, the early stages, there are some architectural innovations that are really required. If you look at Hadoop, it's an append-only file system relying on Linux, and that really limits the types of operations, the types of use cases, that you can do. What MapR has done is provide some deep architectural innovations, provide a complete read-write file system, and integrate data protection with snapshots and mirroring, et cetera. So there's a whole host of capabilities that make it easy to integrate, enterprise secure, and scale much better. Do you think,
Do you think, >>I feel like you were maybe a little early to the market in the sense that we heard Merv Adrian and his keynote this morning. Talk about, you know, it's about 10 years when you start to get these questions about security and governance and we're about nine years into Hadoop. Do you feel like maybe you guys were a little early and now you're at a tipping point, whereas these more, as more and more deployments get ready to go to production, this is going to be an area that's going to become increasingly important. >>I think, I think our timing has been spectacular because we, we kind of came out at a time when there was some customers that were really serious about Hadoop. We were able to work closely with them and prove our technology. And now as the market is just ramping, we're here with all of those features that they need. And what's a, what's an issue. Is that an incremental improvement to provide those kind of key features is not really possible if the underlying architecture isn't there and it's hard to provide, you know, online real-time capabilities in a underlying platform that's append only. So the, the HDFS layer written in Java, relying on the Linux file system is kind of the, the weak underbelly, if you will, of, of the ecosystem. There's a lot of, a lot of important developments happening yarn on top of it, a lot of really kind of exciting things. So we're actively participating in including Apache drill and on top of a complete read-write file system and integrated Hindu database. It just makes it all come to life. >>Yeah. I mean, those things on top are critical, but you know, it's, it's the underlying infrastructure that, you know, we asked, we keep on community about that. And what's the, what are the things that are really holding you back from Paducah and production and the, and the biggest challenge is they cited worth high availability, backup, and recovery and maintaining performance at scale. 
Those are the top three, and that's kind of where MapR has been focused, you know, since day one. >>So if you look at a major retailer: 2,000 nodes on MapR, 50 unique applications running on a single cluster, 10,000 jobs a day running on top of that. If you look at the Rubicon Project, they recently went public: a hundred billion ad auctions a day on top of that platform. Beats Music, that just got acquired for $3 billion: basically it's the underlying MapR engine that allowed them to scale and personalize that music service. So there are a lot of proof points in terms of how quickly we scale, the enterprise-grade features that we provide, and kind of the blending of deep predictive analytics in a batch environment with online capabilities. >>So I've got to ask you about your go-to-market. Obviously Cloudera and Hortonworks have different business models. Just talk about that. But Cloudera got the massive funding, so you get this question all the time: how do you counter that army and the arms race? >>I think someone just wrote an article in Forbes, and he says cash is not a strategy. And I think that was an excellent article. And he goes in and says, you know, in this fast-growing market, an amount of money doesn't necessarily translate to architectural innovations or speeding the development of that. This is a fairly fragmented ecosystem in terms of the stack that runs on top of it. There's no single application or single vendor that kind of drives value. So an acquisition strategy is >>So is your field sales force direct, indirect, or a mix of both? How do you handle that? Because Cloudera has got feet on the street, and every squirrel will find a nut. If they're parking sales reps and SEs in all the enterprise accounts, you know, the squirrel's going to find a nut once in a while. And they're going to actually try to engage the clients.
So, you know, I guess it is a strategy if they're deploying sales and marketing, right? >>The beauty about that is, in fact, we're all in this together in terms of sharing an API and driving an ecosystem. It's not a fragmented market. You can start with one distribution and move to another without recompiling or without doing any sort of changes. So it's a fairly open community. If this were a vendor lock-in, you know, then spending money on brand, et cetera, would be important. Our focus is on the sales execution. Direct sales, yes, we have direct sales. We also have partners, and it depends on the geography as to what that percentage is. >>And we had John Schroeder on with HP at Big Data NYC. What's the update on the HP relationship? >>Oh, excellent. In fact, we just launched our application gallery, the App Gallery, to make it very easy for administrators and developers and analysts to get access and understand what's available in the ecosystem. That's available directly on our website. And one of the featured applications there today is an integration of the MapR Sandbox and HP Vertica. So you can get early access, try it, and get the best of kind of enterprise-grade SQL. >>So it's the first Hadoop app store, basically. >>Yeah, if you want to call it that. >>Right. >>It's available. We launched with close to 30, with, you know, a whole wave kind of following that. >>So talk a little bit about, you know, speaking of Vertica, SQL on Hadoop. You know, there's a lot of talk about that, and some confusion about the different methods for applying SQL on Hadoop. MapR takes an open approach. I know you'll support things like Impala, from a competitor, Cloudera. Talk about that approach from MapR's perspective. >>So I guess our perspective is kind of unbiased open source.
We don't try to pick and choose and dictate what's the right open source based on either our participation or some community involvement. And the reality is, with multiple applications being run on the platform, there are different use cases where different options make sense. So whether it's a Hive solution, or Drill, which is available, or HP Vertica, people have the choice. And it's part of a broad range of capabilities that you want to be able to run on the platform for your workflows, whether it's SQL access or MapReduce or a Spark framework, Shark, et cetera. >>So, yeah, I mean, there are so many different options. There's Spark, you can run HP Vertica, you've got Impala, you've got Hive and the Stinger initiative. Is that whole SQL-on-Hadoop ecosystem still working itself out? Are we going to have this many options a year or two years from now? Or are they complementary, and potentially, you know, each has its role? >>I think the major difference is how each deals with the new data formats. Can it deal with self-describing data sources? Can it leverage JSON files? Does it require centralized metadata? And those are some of the perspectives and advantages that, say, Apache Drill has to expand the data sets that are possible and enable data exploration without dependency on an IT administrator to define that metadata. >>So another area, maybe not always as exciting: taking workloads from existing systems and moving them to Hadoop is one of the ways that a lot of people get started with Hadoop, whether it's transformation workloads or something in that vein. So I know you've announced a partnership with Syncsort, and one of the things that they focus on is really making it as easy as possible to move those.
Talk a little bit about that partnership, why that makes sense for you and your customers. >>I think it's a great proof point, because we announced that partnership around mainframe offload, and we had comScore and Experian in that press release. And if you look at a workload on a mainframe going to Hadoop, that seems like it's really an oxymoron. But by having the capabilities that MapR has and making that a system of record, with that full high availability and that data protection, we're actually an option to offload from mainframe, offload from SAN processing, and provide a really cost-effective, scalable alternative. And we've got customers that had tried to offload from the mainframe multiple times in the past unsuccessfully and have done it successfully with MapR. >>So talk a little bit more about the broader partnership strategy. I mean, we're here at Hadoop Summit. Of course, Hortonworks talks a lot about their partnerships and their reseller arrangements. Cloudera seems to take a little bit more of a direct approach. What's MapR's approach to partnering, as that relates to resell arrangements and things like that? >>I think the App Gallery is probably a great proof point there. The strategy is an ecosystem approach. It's having a collection of tools and applications and management facilities, as well as applications on top. So it's a very open strategy. We focus on making sure that we have open APIs at that application layer, so that it's very easy to get data in and out. And part of that architecture is presenting a standard file system format, allowing non-Java applications to run directly on our platform, and supporting standard database connections, ODBC and JDBC, to provide database functionality. In addition to this deep predictive analytics, really it's about supporting the broadest set of applications on top of a single platform.
What we're seeing in this modern architecture is that data gravity matters. And the more processing you can do on a single platform, the better off you are: the more agile, the more competitive, right? >>So you're partnering with people like SAS, for example, to bring some of the analytic capabilities into the platform. Can you tell us a little bit about any >>Companies like SAS and Revolution Analytics and Skytree, and I mean, just a whole host of companies on the analytics side, as well as on the tools and visualization, et cetera. Yeah. >>Well, I mean, I bring up SAS because I think they get the whole data gravity situation. They've got to go to where the data is and not have the data come to them. So, you know, I give them credit for acknowledging that big data truism, that it's >>all about going to the data, not bringing the data >>to the compute. Jack, talk about the success you've had with customers. You had some pretty impressive numbers, talking about 500 customers. Merv Adrian of Gartner was on with us earlier, essentially reiterating, not mentioning MapR, he was just saying that what you guys are doing is right where the puck is going. And for some other vendors, the puck is not even in the same rink. So I've got to give you props on that. So I want you to talk about the success you've had, specifically where you're winning and where you're successful, and what you guys have struggled with and need to improve on. >>Yeah, there's a whole class of applications that I think Hadoop is enabling, which is about operations and analytics. It's taking this high arrival rate, machine-generated data and doing analytics as it happens and then impacting the business. So whether it's fraud detection or recommendation engines or, you know, supply chain applications using sensor data, it's happening very, very quickly.
So a system that can tolerate and accept streaming data sources, that has real-time operations, that is 24 by 7 and highly available, is what really moves the needle. And those are the examples I used with, you know, the ad auctions at the Rubicon Project and, you know, cable TV. >>What about the outcome? What are the primary outcomes your clients want with your product? Is it stability? Is it the platform enabling development? Is there an outcome that's consistent across all your wins? >>Well, big picture, some of them are focused on revenue: how do we optimize revenue? Either it's a new data source, or it's a new application, or it's an existing application where we're exploding the dataset. Some of it's reducing costs, so they want to do things like a mainframe offload or data warehouse offload. And then there are some that are focused on risk mitigation. And if there's anything that they have in common, it's that, as they move from test and look at production, it's the key capabilities that they have in enterprise systems today that they want to make sure are in Hadoop. So it's not anything new. It's just like, hey, we've got SLAs, and I've got data protection policies, and I've got a disaster recovery procedure, and why can't I expect the same level of capabilities in Hadoop that I have today in those other systems? >>A final question. Where are you guys heading this year? What are your key objectives? Obviously you've got this flurry of announcements and good success. What's the state of the company? How many employees are you guys at? Give us a quick update on the numbers. >>So, you know, we just reported this incredible momentum, where we've tripled core growth year over year. We've added a tremendous number of customers. We're over 500 now. So we're basically sticking to our knitting, focusing on the customers, elevating the proof points here.
Some of the most significant customers we have, in the telco and financial services and healthcare and retail areas, you know, view this as a strategic weapon, view this as a huge competitive advantage, and it's helping them impact their business. That's really spurring our success. You know, we're growing at an incredible clip here, and it's a great time to have made those calls and those investments early on and to be reaping the benefits. >>Now, I've always said, since the first Hadoop Summit, when Hortonworks came out of Yahoo and this whole community kind of burst open: you had Hadoop World, now O'Reilly runs that, and it's a whole different vibe of its own. This one has the developer vibe. So I've got to ask you, and we've always been a big fan. I mean, everyone has enough of a beachhead to be successful. It's not about MapR versus Hortonworks or Cloudera. And this is why I always kind of smile when everyone goes, oh, Cloudera or Hortonworks. I mean, they're two different animals at this point. They do different things. You guys are over here. Everyone has their, quote, swim lanes or beachhead, and there's not a lot of super competition. Do you think it's going to be this way for a while? What's your forecast? At what point do you see more competition? 10 years out? I mean, Merv was talking about a 10-year horizon for innovation. >>I think that the more people learn and understand about Hadoop, the more they'll appreciate this set of capabilities that matter in production and post-production, and it'll migrate earlier. And as we, you know, focus on more developer tools like our sandbox, so people can easily get experience and understand what MapR is, I think we'll start to see a lot more understanding and momentum. >>Awesome. Jack Norris here, inside theCUBE, CMO of MapR, a very successful enterprise-grade Hadoop player, a leader in the space. Thanks for coming on. We really appreciate it.
We'll be right back after this short break, here live in Silicon Valley at Hadoop Summit 2014. We'll be right back.

Published Date : Jun 4 2014

ENTITIES

Entity	Category	Confidence
Jeff Kelly	PERSON	0.99+
Jack Norris	PERSON	0.99+
John Schroeder	PERSON	0.99+
HP	ORGANIZATION	0.99+
Jeff	PERSON	0.99+
$3 billion	QUANTITY	0.99+
December, 2014	DATE	0.99+
Jason	PERSON	0.99+
Matt BARR	PERSON	0.99+
10,000 jobs	QUANTITY	0.99+
Today	DATE	0.99+
10 year	QUANTITY	0.99+
Syncsort	ORGANIZATION	0.99+
Dan	PERSON	0.99+
Silicon valley	LOCATION	0.99+
John barrier	PERSON	0.99+
Java	TITLE	0.99+
Yahoo	ORGANIZATION	0.99+
10 years	QUANTITY	0.99+
24	QUANTITY	0.99+
Hadoop	TITLE	0.99+
Cloudera	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
this year	DATE	0.99+
Jack	PERSON	0.99+
fifth	QUANTITY	0.99+
Linux	TITLE	0.99+
Skytree	ORGANIZATION	0.99+
each	QUANTITY	0.99+
both	QUANTITY	0.99+
today	DATE	0.98+
one	QUANTITY	0.98+
Merv	PERSON	0.98+
about 10 years	QUANTITY	0.98+
San Jose	LOCATION	0.98+
Hadoop	EVENT	0.98+
about 20%	QUANTITY	0.97+
seven	QUANTITY	0.97+
over 500	QUANTITY	0.97+
a year	QUANTITY	0.97+
about 500 customers	QUANTITY	0.97+
SQL	TITLE	0.97+
seven different verticals	QUANTITY	0.97+
two years	QUANTITY	0.97+
single platform	QUANTITY	0.96+
2014	DATE	0.96+
Apache	ORGANIZATION	0.96+
Hadoop	LOCATION	0.95+
SiliconANGLE	ORGANIZATION	0.94+
comScore	ORGANIZATION	0.94+
single vendor	QUANTITY	0.94+
day one	QUANTITY	0.94+
Salesforce	ORGANIZATION	0.93+
about nine years	QUANTITY	0.93+
Hadoop Summit 2014	EVENT	0.93+
Merv	ORGANIZATION	0.93+
two different animals	QUANTITY	0.92+
single application	QUANTITY	0.92+
top three	QUANTITY	0.89+
SAS	ORGANIZATION	0.89+
Riley	PERSON	0.88+
First	QUANTITY	0.87+
Forbes	TITLE	0.87+
single cluster	QUANTITY	0.87+
Mapbox	ORGANIZATION	0.87+
map R	ORGANIZATION	0.86+
map	ORGANIZATION	0.86+

Jack Norris - Hadoop Summit 2013 - theCUBE - #HadoopSummit

>>Ash it's, you know, what will that mean to my investment? And the announcement with Fusion-io is that, you know, we're 25 times faster on read-intensive HBase applications with the combination. So as organizations are deploying Hadoop, and they're looking at technology changes coming down the pike, they can rest assured that they'll be able to take advantage of those in a much more aggressive fashion with MapR than with other distributions. >>Jack, I've got to ask you: we were talking last night at the Hadoop Summit, kind of the kickoff party, and, you know, everyone was there. All the top execs were there and all the developers. You know, we were in theCUBE, and I think either Dave or myself coined the term "the big three of big data" for you guys: Cloudera, MapR, and Hortonworks, really the key players early on. And Charles from Cloudera was just recently on, and he's like, oh no, this enterprise-grade stuff has been kicked around; it's been there from the beginning. You guys have been there from the beginning, and MapR has never, ever waffled on its messaging. You've always been very clear: hey, we're going to take open source Hadoop and turn it into an enterprise-grade product. Right. So that's clear. So what's your take on this? Because now enterprise grade is kind of there, I guess, the buzz around getting the, like, the folks that have crossed the chasm implemented. So can you comment on that: one, enterprise grade, the reality of it, certainly from your perspective, you haven't been any but others. And then for those folks that are now rolling it out for the first time, what can you share with them around what it means to be enterprise grade? >>So enterprise grade is more about the customer experience than a marketing claim.
And, you know, by enterprise grade, what we're talking about are some of the capabilities and features that they've grown to expect in their other enterprise applications. So, you know, the ability to meet full SLAs, full HA, recovery from multiple failures, rolling upgrades, data protection with consistent snapshots, business continuity with mirroring, the ability to share a cluster across multiple groups and have, you know, volumes. I mean, there's a host of features that fall under the umbrella of enterprise grade. And when you move from no support for any of those features to support for a few of them, I don't think that's moving to HA; it's more like moving to low availability. And there are just a lot of differences between what we mean by enterprise grade with those features and what we view as kind of an incomplete story. >>What do you mean by low availability? Well, I mean, it's tongue in cheek. It's a good term. It's really saying, you know, just available sometimes, is that what you mean? That it's not true availability? I mean, availability is 99.9%, right? >>Right. So if you've got an HA solution that can't recover from multiple failures, that's downtime. If you've got an HBase application that's running online and you have data that goes down and it takes 10 to 30 minutes to have the region servers recover it from another place in the distribution, that's downtime. If you have snapshots that aren't consistent across the cluster, that doesn't provide data protection; there's no point-in-time recovery for a cluster. So, you know, there are a lot of details underneath that, but what it amounts to is: do you have interruptions? Do you have downtime? Do you have the potential for losing data? And our answer is you need a series of features that are hardened and proven to deliver that. >>What about recoverability?
You mentioned that you guys have done a lot of work in that area with snapshotting; that's kind of being kicked around. Are folks addressing it? What's your competition doing in those areas of recoverability? You just mentioned availability; okay, got that. Recoverability, security, compliance, and usability: those are the areas that seem to be the hot focus areas. What's going on in the industry? How would you give them the grade, the letter grade, if you will, candidly, compared to what you guys offer? >>Well, first of all, let's take recoverability. You know, one of the tenets is you have point-in-time recovery, the ability to restore to a previous point that's consistent across the cluster. And right now there's no point-in-time recovery for HDFS, for the files, and there's no point-in-time recovery for HBase tables. So there's snapshot support being talked about in the open source community, but it's being referred to in the JIRAs as fuzzy snapshots and really compared to CopyTable. >>So, Jack, I want to turn the conversation to kind of the topic we've talked about before, the open versus proprietary debate. We've talked about that before here on theCUBE. So just kind of reiterate for us your take. I mean, we hear, perhaps because of the show we're at, a lot of talk about the open source nature of Hadoop, and some of the purists, as you might call them, are saying it's got to be open, a hundred percent Apache compatible, et cetera. And then there are others that are taking a different approach. Explain your approach and why you think that's the key way to really spur adoption of Hadoop and make it >>We're a part of the community. We've got, you know, commitment going on. We've, you know, pioneered and pushed Apache Drill, but we have done innovations as well.
And I think that those innovations are really required to support and extend the whole ecosystem. So Canonical distributes our M3 distribution. We've got, you know, all our packages available on GitHub and open source. So it's not a binary debate. And I think the point being that there are companies that have jumped ahead, and now the peloton is, you know, pedaling faster and will catch up, will streamline. I think the difference is we rearchitected. So we're basically in a race car and, you know, are racing ahead with enterprise-grade features that are required. And there's a lot of work that still needs to be accomplished before that full rearchitecture is in place. >>Well, I mean, I think for me, the proof is really in the pudding when it comes to customers that are doing real things, running real production-grade, mission-critical applications. And to me that shows the success or relative success of a given approach. So I know you guys are working with companies like ancestry.com, Live Nation and Quicken Loans. Could you walk us through a couple of those scenarios? Let's take ancestry.com. Obviously they've got a huge amount of data based on the kind of genealogical information. What do you guys do with them? >>Yeah, so they've got the world's largest family genealogy services available on the web. So there's a massive amount of data that they make accessible, with, you know, the ability for analysis. And then they've rolled out new features and new applications, one of which is to ship a kit out, have people spit in a tube and return it, and they do DNA matching and reveal additional details. So some really fabulous leading-edge things are being done with the use of Hadoop. >>Interesting.
So talk about when you went to work with them: what were some of their key requirements? Was it more around the enterprise-grade security and uptime kind of equation, or was it more around some of the analytics? What's kind of the killer use case for them? >>You know, it's hard with a specific company, or even to generalize across companies, because there are really three main areas: ease of use and administration; dependability, which includes the full HA; and then performance. And in some cases, it's just one of those that kind of drives it and is used to justify; in other cases, it's kind of a collection. The ease of use is being able to use a cluster not only as Hadoop, but to access it and treat it like enterprise storage. So it's a completely POSIX-compliant file system underneath that allows the mounting and access and updates and using it in dynamic read-write. So what that means from an application level is it's faster, it's much easier to administer, and it's much easier and more reliable for developers to utilize. >>I've got to ask you the marketing question, because I see, you know, MapR, you guys have done a good job of marketing. Certainly we want to thank you guys for supporting theCUBE in the past; you guys have been great supporters of our mission. But now the ecosystem's evolving, with a lot more competition. Cloudera mentioned those eight companies they're tracking in, quote, Hadoop, and certainly by Jeff's and my and SiliconANGLE's look, there are a lot more, because Hadoop washing has been going on; the term Hadoop washing meaning jumping in and doing Hadoop, slapping that onto an existing solution. It's been happening full bore for a year, at least. What's next for you guys to break above the noise? Obviously the communities are very active; projects are coming online.
You guys have your mission in the enterprise. What's the strategy for you guys going forward? Is it more of the same, and is there anything new you can share? >>Yeah, I think as far as breaking above the noise, it will be our customers, their success and their use cases that really put the spotlight on what the differences are in terms of, you know, using a big data platform. And I think what companies will start to realize is, I'd draw an analogy to the supply chain: the big revolution in supply chain was focusing on inventory at each stage in the supply chain. And how do you reduce that inventory level, and how do you speed the flow of goods and the agility of a company for competitive advantage? And I think we're going to view data the same way. So instead of raw data that they're copying and moving across different silos, if companies are able to process data in place and send small result sets, they're going to be faster, more agile and more competitive. >>And that puts the spotlight on what data platform is out there that can support a broad set of applications and can have the broadest set of functionality. So, you know, what we're delivering is an enterprise-grade, mission-critical platform that supports MapReduce and does that with high performance, provides NFS and POSIX access so you can use it like a file system, and integrates, you know, enterprise-grade NoSQL applications. So now you can do, you know, high-speed, consistent-performance, real-time operations in addition to batch, streaming, integrated search, et cetera. So it's really exciting to provide that platform and have organizations transform what they're doing. >>How's the feedback with Ted Dunning? I've seen a lot of buzz on the Twittersphere; he's getting positive feedback here. He's a tech athlete. He's a guru, he's an expert. He's got his hands in all the pies. He's a scientist type. What's he up to?
What's his role within MapR? And he's obviously playing in the open-source community. What's he up to these days? >>Chief application architect. He's on the leading edge of Mahout, so machine learning, so, you know, sharing insights there. He was speaking at the Storm meetup two nights ago and sharing how you can integrate long-running batch predictive analytics with real-time streaming, and how the use of snapshots really makes that easy and possible. He travels the world and is helping organizations understand how they can take some very complex, long-running processes and really simplify and shorten those. >>I had a chance to meet him in New York City at the last Hadoop World, at a party. Great guy, fantastic geek, and certainly is doing great work, so shout out to Ted. Congratulations, keep up that support. How's everyone else doing? How are John and Srivas doing? How's the team at MapR? You're pedaling as fast as you can? >>Growing really quickly. No, we're just shifting gears. We're beyond pedaling. >>Engine? >>Yeah. Give us an update on the company in terms of the growth and kind of where you guys are moving. >>Yeah, we're expanding worldwide. You know, just in the last few months we've opened up offices in London and Munich and Paris, and we're expanding in Asia, Japan and Korea. So our sales and services and engineering, and basically the whole company, continues to expand rapidly. Some really great, interesting partnerships, and a lot of growth, not only as we add customers; it's nice to see customers that continue to really grow their use of MapR within their organization, both in terms of the amount of data that they're analyzing and the number of applications that they're bringing to bear on the platform.
>>Talk about that a little bit, because I think, you know, one of the trends we do see is when a company brings in a big data platform, they might start experimenting with it, build an application, maybe in the marketing department, and then the sales guys see it and they say, well, maybe we can do something with that. Is that typically the kind of experience you're seeing, and how do you support companies that want to start expanding beyond those initial use cases to support other departments, potentially even other physical locations around the world? How do you kind of, >>That's been the beauty of it: if you have a platform that can support those new applications, so, you know, mission-critical workloads are not an issue, and if you support volumes so that you can logically separate, it makes it much easier, which we have. So one of our customers, Zions Bank, they brought in MapR to do fraud detection. And pretty soon, with the fact that they were able to collect all of that data, they had other departments coming to them and saying, hey, we'd like to use that to do analysis on, because we're not getting that data from our existing system. >>Yeah. They come in and you're sitting on a goldmine; there are use cases. And you also mentioned, as you're expanding internationally, what's your take on the international market for big data and Hadoop specifically? Is the US kind of leaps and bounds ahead of the rest of the world in terms of adoption of the technology? What are you seeing out there in terms of where the rest of the, >>I wouldn't say leaps and bounds, and I think internationally, they're able to maybe skip some of the experimental steps. So we're seeing deployment across financial services and telecom, and it's fairly broad. Recruit Technologies there.
The largest provider of recruiting services, indeed.com is one of their subsidiaries; they're doing a lot with Hadoop, and MapR specifically. So it's been expanding rapidly. >>Fantastic. I also, you know, when you think about Europe, what's going on with Google and some of the privacy concerns, even here, I should say: are there different regulatory environments you've got to navigate when you're talking about data and how you use data when you're starting to expand to other locales? >>Yeah. Typically by vertical there are different requirements: HIPAA in healthcare, and Basel II in financial services. And so with all of those, it's basically the same theme: when you're bringing Hadoop into an organization and into a data center, the same sorts of concerns and requirements and privacy that you're applying in other areas will be applied to Hadoop. >>I'm now kind of turning back to the technology. You mentioned Apache Drill. I'd love to get an update on kind of where that stands. You know, put that into context for people. We hear a lot about the SQL-on-Hadoop question here; where does Drill fit into that equation? >>Well, you know, there are a lot of different approaches to provide SQL access. A lot of that is driven by how do you leverage some of the talent in the organization that, you know, speaks SQL. So there are developments with respect to Hive; you know, there are other projects out there. Apache Drill is an open source project getting a lot of community involvement. And the design center there is pretty interesting. It started from the beginning as an open source project, with two main differences. One was in looking at supporting SQL, it's: let's do full ANSI SQL. So it's full ANSI SQL:2003, not SQL-like, and that'll support the greatest number of applications and, you know, avoid a lot of support issues.
And the second design center is: let's support a broad set of data sources. So nested sources like JSON, schema discovery, and basically fitting it into an enterprise environment, which sometimes is kind of messy and can get messy as acquisitions happen, et cetera. So it's complementary; it's about, you know, enabling interactive, low-latency queries. >>Jack, I want to give you the final word. We are out of time. Thanks for coming on theCUBE. Really appreciate it. Great to see you again, CUBE alumni. But final word, and we'll end the segment here on theCUBE, is your quick thoughts on what's happening here at Hadoop World. What is this show about? Share with the audience what's the vibe, the summary, a quick soundbite on Hadoop. >>I think I'll go back to how we started. It's not if you use Hadoop, it's how you use Hadoop, and, you know, look at not only the first application, but what it's going to look like in multiple applications, and pay attention to what enterprise grade means. >>Okay. We've got more coverage coming. Jack Norris with MapR, I'll say one of the big three, the original big three, still on the list in our mind and the market's mind, with a unique approach to Hadoop and enterprise grade. This is theCUBE. I'm John Furrier with Jeff Kelly. We'll be right back after this short break.

Published Date : Jun 27 2013

SUMMARY :

IO is that, you know, we're 25 times faster on read-intensive HBase applications. All the top execs were there and all the developers, you know, So, you know, the ability to meet full SLAs, full HA It's really saying, you know, just available when So, you know, there's a lot of details compared to what you guys offer? You know, one of the tenets is you have a point of Hadoop and some of the purists, as you might call them are saying, it's gotta be open a hundred percent that peloton is, you know, pedaling faster and we'll catch up. So I know you guys are working with companies like ancestry.com, Live Nation and Quicken that they make accessible and, you know, ability for, So talk about when you went to work with them, what were some of their key requirements? It's kind of, you know, it's hard with a specific company or even, I got to ask you about the marketing question cause I see, you know, MapR, you guys have done a good job of marketing. And how do you reduce that inventory level and how do you speed the, you know, what we're delivering is a mission grade, you know, enterprise grade mission, How's the feedback on with Ted Dunning? so, you know, sharing insights there, he was speaking at the Storm meetup How's John and Srivas doing how's the team at MapR we're pedaling as best as you can No, we're just shifting gears. and basically across the whole company continues to expand rapidly. Well, that a little bit, because I think, you know, one of the, one of the trends we do see is when a company brings in big data, That's been the beauty of that is if you have a platform that can support those And you also mentioned kind of, they're able to maybe skip some of the experimental steps.
and it basically, it's the same theme of when you're bringing Hadoop into We hear a lot about the SQL and Hadoop question support the greatest number of applications and, you know, avoid a lot of support and, Great to see you again, you know, look at not only the first application, but what it's going to look like in multiple This is theCUBE. I'm John Furrier with Jeff Kelly.

ENTITIES

Entity	Category	Confidence
Ted	PERSON	0.99+
London	LOCATION	0.99+
Claudia	PERSON	0.99+
Jeff Kelly	PERSON	0.99+
Asia	LOCATION	0.99+
Ted Dunning	PERSON	0.99+
Jack Norris	PERSON	0.99+
Dave	PERSON	0.99+
John	PERSON	0.99+
Jack	PERSON	0.99+
10	QUANTITY	0.99+
Paris	LOCATION	0.99+
Korea	LOCATION	0.99+
Matt BARR	PERSON	0.99+
Munich	LOCATION	0.99+
New York	LOCATION	0.99+
99.9%	QUANTITY	0.99+
Jennifer	PERSON	0.99+
Treevis	PERSON	0.99+
25 times	QUANTITY	0.99+
Japan	LOCATION	0.99+
Google	ORGANIZATION	0.99+
both	QUANTITY	0.99+
one	QUANTITY	0.99+
Jeff	PERSON	0.99+
eight companies	QUANTITY	0.99+
first time	QUANTITY	0.99+
mid-June	DATE	0.99+
Charles	PERSON	0.98+
Europe	LOCATION	0.98+
30 minutes	QUANTITY	0.98+
One	QUANTITY	0.98+
first application	QUANTITY	0.98+
Ash	PERSON	0.98+
two nights ago	DATE	0.98+
Hortonworks	ORGANIZATION	0.98+
each stage	QUANTITY	0.97+
SQL	TITLE	0.97+
SiliconANGLE	ORGANIZATION	0.97+
Natalie	PERSON	0.97+
ancestry.com	ORGANIZATION	0.96+
Hadoop	TITLE	0.96+
Patrick	PERSON	0.96+
last night	DATE	0.95+
Jason	PERSON	0.95+
2003	DATE	0.95+
Hadoop	EVENT	0.94+
Apache	ORGANIZATION	0.94+
Hadoop	PERSON	0.93+
indeed.com	ORGANIZATION	0.93+
hundred percent	QUANTITY	0.92+
HBase	TITLE	0.92+
Hadoop Summit 2013	EVENT	0.92+
Quicken loans	ORGANIZATION	0.92+
two main differences	QUANTITY	0.89+
HIPAA	TITLE	0.89+
#HadoopSummit	EVENT	0.89+
S SLA	TITLE	0.89+
Hadoop	ORGANIZATION	0.88+
Cloudera	ORGANIZATION	0.85+
map R	TITLE	0.85+
a year	QUANTITY	0.83+
Zions bank	ORGANIZATION	0.83+
Peloton	LOCATION	0.78+
NFS	TITLE	0.78+
MapReduce	TITLE	0.77+
Cloudera map R	ORGANIZATION	0.75+
live	ORGANIZATION	0.74+
second design center	QUANTITY	0.73+
Hindu	ORGANIZATION	0.7+
theCUBE	ORGANIZATION	0.7+
three main areas	QUANTITY	0.68+
one enterprise grade	QUANTITY	0.65+

Jack Norris | Strata-Hadoop World 2012

>>Okay. We're back here, live in New York City for big data week. This is SiliconANGLE.tv's exclusive coverage of Hadoop World, Strata plus Hadoop World, a big event in big data week. And we just wrote a blog post on siliconangle.com calling this the South by Southwest for data geeks, and it's my prediction that this is going to turn into quite the geek fest. Obviously the crowd here is enormous, it's packed, and it's an amazing event. And we're excited. This is siliconangle.com. I'm the founder, John Furrier. I'm joined by cohost >>Dave Vellante of Wikibon.org, where people go for free research and peers collaborate to solve problems. And we're here with Jack Norris, who's the vice president of marketing at MapR, a company that we've been tracking for quite some time. Jack, welcome back to theCUBE. >>Thank you, Dave. >>I've got to hand it to you. You know, we met quite a while ago now. It was well over a year ago, and we were pushing at you guys and saying, well, you know, open source, and you said, look, we're solving problems for customers. We've got the right model, we think; you know, this is our strategy. We're sticking to it. Watch what happens. And like I said, I have to hand it to you. You guys really have some great traction in the market and you're doing what you said. And so congratulations on that. I know you've got a lot more work to do, but >>Yeah, and actually the topic of openness is one that's pretty interesting. And, you know, if you look at the different options out there, all of them are combining open source with some proprietary. Now in the case of some distributions, it's very small, like an ODBC driver with a proprietary driver. But I think it represents that any solution combining to make it more open is important. So what we've done is make innovations, but when we've made those innovations, we've opened up and provided APIs.
It's like NFS for standard access, like REST, like ODBC drivers, et cetera. >>So it's a spectrum. I mean, actually we were at Oracle OpenWorld a few weeks ago and you listen to Larry Ellison talk about the Oracle public cloud; he makes actually a very strong case that it's open. You can move data, it's all Java. So it's all about standards. And, yeah, it's from an opposite side, but it was really all about the business value. That's what the bottom line is. So we had your CEO, John Schroeder, on yesterday. John and I both were very impressed with essentially what he described as your philosophy of: we don't announce a product when we have it; we have customers when we announce that product. And, you know, that's impressive. >>He was also giving some good feedback to startup entrepreneurs out there, and obviously there's a lot of action going on with the startup community. And he basically said the same thing: get customers. And that's it, that's all. Use your tech, but don't be so locked into the tech; get the customers, understand the needs, and then deliver that. So you guys have done great. And I want to talk about the show here, because you guys have a big booth and a big presence here at the show. What are you guys learning? How's the positioning, how's the new news hitting? Give us a quick update. >>So, a lot of news. It first started on Tuesday, where we announced the M7 edition. And, yeah, I brought a demo here for you all, because the big thing about M7 is what we don't have. So we're not demoing region servers, we're not demoing compactions, we're not demoing a lot of manual administrative tasks. So what that really means is that we took this stack. And if you look at HBase, HBase today has about half of Hadoop users adopting it.
So it has a lot of momentum in the market, and, you know, it's used for everything from real-time analytics to kind of lightweight OLTP processing. But it's an infrastructure that sits on top of a JVM that stores its data in the Hadoop Distributed File System, which sits on a JVM that stores its data in a Linux file system that writes to disk. >>And so a lot of the complexity is that stack. And so as an administrator, you have to worry about how data gets persisted, you know, basically written across that. And you've got region servers to keep up. When you're doing writes, you have things called compactions, which increase response time. So it's a complex environment, and we've spent quite a bit of time in collapsing that infrastructure. And with the M7 edition, you've got files and tables together in the same layer, writing directly to disk. So there are no region servers, there are no compactions to deal with, there's no pre-splitting of tables and trying to do manual merges. It just makes it much, much simpler. >>Let's talk about some of your customers in terms of the profile of these guys. I'm assuming, and correct me if I'm wrong, that you're not selling to the tire kickers. You're selling to the guys who actually have some experience with Hadoop and have run into some of the limitations, and you come in and say, hey, we can solve some of those problems. Is that right? Can you talk about that a little bit? >>Fair characterization. I think part of it is when you're in the evaluation process and when you first hear about Hadoop, it's kind of like the Gartner hype curve, right? And, you know, this stuff, it does everything. And of course you've got data protection, because you've got things replicated across the cluster. And of course you've got scalability, because you can just add nodes and so forth.
Well, once you start using it, you realize that yes, I've got data replicated across the cluster, but if I accidentally delete something or if I've got some corruption, that's replicated across the cluster too. So things like snapshots are really important, so you can return to, you know, what it was five minutes before; performance, where you can get the most out of your hardware; ease of administration, where I can cut this up into logical volumes and have policies at that whole level instead of at an individual file. >>So there's a bunch of features that really resonate with users after they've had some experience. And those tend to be our, you know, kind of key customers. There's another phase too, which is when you're testing Hadoop, you're looking at what's possible with this platform, what type of analytics can I do. When you go into production, now all of a sudden you're looking at: how does this fit in with my SLAs? How does this fit in with my data protection policies? You know, how do I integrate with my different data sources? And can I leverage existing code? You know, we had one customer, a large systems integrator for the federal government. They have a million lines of code that they were told to rewrite to run with other distributions, that they could use just out of the box with MapR. >>So let's talk about some of those customers. Can you name some names and get >>Sure. So actually, we had a keynote today, and we had this beautiful customer video they had to cut because of time. It's running in our booth and it's streaming on our website. And I think we've got, actually, some of the bumper here we kind of inserted. But I want to shout out to those, because they ended up on the cutting room floor. >>Running it here. Yeah.
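The replication-versus-snapshot point above can be sketched with a toy model. This is a minimal illustration in Python under stated assumptions, not how MapR or HDFS implements either feature: `ToyCluster`, its methods, the snapshot name, and the path `/logs/day1` are all hypothetical names made up for the sketch. It only shows the logic Jack describes: a delete is faithfully applied to every replica, while a point-in-time snapshot lets you go back to the earlier state.

```python
import copy


class ToyCluster:
    """Toy model (hypothetical): every write and delete hits all replicas."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]
        self.snapshots = {}

    def write(self, path, data):
        # Replication: the same write is applied to every replica.
        for replica in self.replicas:
            replica[path] = data

    def delete(self, path):
        # An accidental delete is replicated just as faithfully as a write.
        for replica in self.replicas:
            replica.pop(path, None)

    def snapshot(self, name):
        # Point-in-time image, kept separately from the live replicas.
        self.snapshots[name] = copy.deepcopy(self.replicas[0])

    def restore(self, name):
        # Roll every replica back to the saved image.
        image = self.snapshots[name]
        self.replicas = [copy.deepcopy(image) for _ in self.replicas]

    def read(self, path):
        return self.replicas[0].get(path)


cluster = ToyCluster()
cluster.write("/logs/day1", "clickstream")
cluster.snapshot("before-cleanup")

cluster.delete("/logs/day1")  # the mistake
lost_everywhere = all("/logs/day1" not in r for r in cluster.replicas)

cluster.restore("before-cleanup")
recovered = cluster.read("/logs/day1")
```

After the delete, no replica holds the file, so three copies offered no protection against the mistake; only the snapshot brings it back. A real system would also need the snapshot to be consistent across nodes, which this single-process toy does not model.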
So one was Rubicon Project, and they're an interesting company. They're a real-time advertising platform, an ad auction network. They recently passed Google in terms of number one ad reach, as mentioned by comScore, and got a lot of press on that. I particularly liked the headline that mentioned those three companies, because it was measured by comScore, and comScore's a customer too, a MapR customer, and Google's a key partner. >>And yesterday we announced a world record for the Hadoop TeraSort running on Google. So M7 for Rubicon allows them to address and replace different point solutions that were running alongside of Hadoop. And, you know, it potentially simplifies their architecture, because now they have more things done with a single platform: it increases performance, simplifies administration. Another customer is ancestry.com, who, you know, maybe you've seen their ads or heard some of their radio spots. They do a tremendous amount of data processing to help with family services and genealogy and figure out, you know, family backgrounds. One of the things they do is DNA testing. So for an internet service to do that advanced technology is pretty impressive. And, you know, it's $99, I believe, and they'll send you a DNA kit; you spit in the tube, you send it back, and then they process that and match and give you insights into your family background. So for them, simplifying HBase meant additional performance, so they could do matches faster, and really simplified administration. So, you know, in Melinda Graham's words, it's simpler because they're just not there, those components. >>Jack, I want to ask you about enterprise-grade Hadoop, and then Ted Dunning, because he was mentioned by Tim SDS on his keynote speech. So you have some rock stars in the company.
I was in his management team. We've had your CEO on, we've interviewed M.C. Srivas at Google I/O, and we were on a panel together, so I know your team; it's a solid team. So let's talk about Ted in a minute, but I want to ask you about the enterprise-grade Hadoop conversation. What does that mean now? Obviously you guys were very successful at first. Again, we were skeptics at first, but now your traction and your performance have proven there is a market for that kind of platform. What does that mean at this event today, as this is evolving, as the Hadoop ecosystem is not just Hadoop anymore? It's other things. Yeah, >>There are three dimensions to enterprise grade. The first is ease of use, and ease of use from an administrator standpoint: how easily does it integrate into an existing environment? How easily does it fit into my IT policies? Do you run in a lights-out data center? Does the Hadoop distribution fit into that? So that's one whole dimension. A key to that is complete NFS support, so it functions like standard storage. A second dimension is dependability, reliability. So it's not just, do you have a checkbox HA feature; it's, do you have automated stateful failover? Do you have self-healing? Can you handle multiple failures and automated recovery? In a lights-out data center, can you actually go there once a week and then just replace drives? A great example of that is one of our customers, who had a test cluster with MapR. It was a POC; they went on and did other things. They had a power failure, they came back a week later, and the cluster was up and running, and they hadn't done any manual tasks there.
And they were just blown away. The recovery process for the other distributions: a long laundry list of >>So I've got to ask you this, the third >>One, what's the third one? The third one is performance, and performance is raw speed. It's also, how do you leverage the infrastructure? Can you take advantage of the network infrastructure, multiple NICs? Can you take advantage of heterogeneous hardware? Can you mix and match for different workloads? And it's really about sharing a cluster for different use cases and different users. And there are a lot of features there. It's not just raw >>The existing IT infrastructure policies, that whole question of what happens when something goes wrong. Can you automate that? And then >>And it's easy, dependable, fast: the same thing as making HBase easy, dependable, fast with M7. >>So the talk of the show right now, he had the keynote this morning, is that MapR marketing has dropped the big data term and is going with datacosm. Is that true? So, Joe Hellerstein just had a tweet. Joe, the famous Cal Berkeley computer science professor, is now CEO of a startup, Trifacta, and he had a couple of epic tweets this week. So shout-out to Joe Hellerstein. But Joe Hellerstein's tweet says MapR marketing has decided to drop the term big data and go with datacosm, with a shout-out to George Gilder. So it's kind of middle-intellectual humor. So what's your response to that? Is it true? What's happening? You're the VP of marketing. >>Well, if you look at the big data term, I think there's a lot of big data washing going on, where architectures that have been out there for 30 years are suddenly all about big data. So I think there's the need for a more descriptive term.
The purpose of datacosm was not to try to coin something or to change a big data label. It was just to get people to take a step back and think, and to realize that we are in a massive paradigm shift. And with a shout-out to George Gilder, acknowledging that he recognized what the impact of making compute available meant, and he recognized with Telecosm what bandwidth would mean. And if you look at the combination, we've got all this compute efficiency and bandwidth; now datacosm is basically taking those resources and unleashing them and changing the way we do things. >>And I think one of the ways to look at that is the new things that will be possible. There's been a lot of focus on SQL interfaces on top of Hadoop, which are important. But I think some of the more interesting use cases are taking this machine-generated data that's being produced very, very rapidly and having automated operational analytics that can respond very quickly to change how you do business: how you're communicating with customers, how you're responding to different risk factors in the environment, for fraud, et cetera, or just improving your response time to cost events. What was earlier called >>Actionable insight. He said that by assigning intent, you're able to respond. It's interesting that you talk about George Gilder, because we like to riff and get into the abstract concepts, but he also was very big in supply-side economics. And so if you look at the business value conversation, one of the things we pointed out yesterday and in this morning's opening review was the top conversations: insight and analytics as a killer app. Right now, the app market has not developed.
And that's why we like companies like Continuuity, and what you guys are doing under the hood is being worked on right now at many levels, performance and those three things. Analytics is a no-brainer, insight, but the other one is business value. So when you look at that kind of datacosm, I can see where you're going with that. >>And that's kind of what people want, because it's not so much like, I'm Republican because he's Republican. George Gilder bought The American Spectator; everyone knows that. So obviously he's a Republican, but politics aside, the business side of what big data is implementing is massive. Now, I guess that's a Republican concept, but not really. I mean, business is all parties. So relative to datacosm: no one talks about e-business anymore. We were talking to IBM at the IBM conference and they were saying, hey, that was a great marketing campaign, but no one says, hey, are you an e-business today? So we think that big data is going to have the same effect, which is, hey, do you have big data? No, it's just assumed. Yeah. So that's what you're basically trying to establish: that it's not just about big. >>Yeah. Let me give you one small example from a business value standpoint. Ted Dunning, you mentioned Ted earlier, our chief application architect and one of the coauthors of the Mahout book, which deals with machine learning, dealt with one of our large financial services companies. One of the techniques on Hadoop is clustering, k-nearest neighbors, different algorithms. And they looked at a particular process and they sped up that process by 30,000 times. So there's a blog post on our website; you can find additional information on that.
And I >>There's one >>Point on this, one point. But I think, to your point about business value and what datacosm really means: that's an incredible speedup in terms of performance, and it changes how companies can react in real time. It changes how they can do pattern recognition. And Google did a really interesting paper called The Unreasonable Effectiveness of Data. In there they say simple algorithms on massive amounts of data beat a complex model every time. And so I think what we'll see is a movement away from data sampling and trying to do an 80/20, toward looking at all your data and identifying where the exceptions are: ones we want to increase because they're revenue exceptions, or ones we want to address because it's a cost or a fraud. >>Well, I would give a shout-out to the guys at Digital Reasoning. Tim Estes plugged Ted; he idolized him in terms of his work. Obviously his work is awesome. But he also brought up this concept of the understanding gap, and he showed an interesting chart in his keynote, which was the data explosion: it's straight up, a massive amount of data, 64% unstructured by his calculation. Then he showed a flat line called attention. So as data's been exploding over time, attention, meaning user attention, is flat, with some uptick maybe. Users and humans can't expand their minds fast enough, so machine learning technologies have to bridge that gap. That's analytics, that's insight. >>Yeah. There's a big conversation now going on about more data versus better models, people trying to squint through some of the comments that Google made and say, all right, does that mean we just throw out >>The models, and data trumps algorithms? >>Data trumps algorithms. But the question I have is, do you think, and are your customers talking about, okay, well, now they have more data.
Can I actually develop better algorithms that are simpler? And is it a virtuous cycle? >>Yeah, I think there's a lot of debate here, a lot of information, but I think one of the interesting things is, given the compute cycles, that compute efficiency that we have, and given the bandwidth, you can take a model and then iterate very quickly on it and arrive at insight. In the past, it was just that amount of data and that amount of time to process; it could take you 40 days to get to the point that you can reach now in hours. Right. >>Right. So, I mean, the great example is fraud detection, right? We used to sample, and six months later, hey, your credit card might've been hacked. And now you get a phone call, or you can't use your credit card, or whatever it is. But there are still a lot of use cases, weather is an example, where modeling and better modeling would be very helpful. Excellent. So, datacosm: are you planning other marketing initiatives around that? Or is this sort of tongue-in-cheek fun, throw it out there, a little red meat, chum in the waters? >>You know, what really motivated us was, theCUBE's here talking for the whole day; what could we possibly do to help give them a topic of conversation? >>Okay. Datacosm. Now of course, we found that on our proprietary HBase tools. Jack Norris, thanks for coming in. We appreciate your support. You guys have been great. We've been following you and will continue to follow. You've been a great supporter of theCUBE; I want to thank you personally while we're here. MapR has been a generous underwriter, supportive of our great independent editorial. We want to recognize you guys; thanks for your support. And we continue to look forward to watching you guys grow and kick ass. So thanks for all your support.
And we'll be right back with our next guest after this short break. >>Thank you. >>10 years ago, the video news business believed the internet was a fad. The science is settled. We all know the internet is here to stay. Bubbles and busts come and go, but the industry deserves a news team that goes the distance. Coming up on SiliconANGLE are some interesting new metrics for measuring the worth of a customer on the web. Every morning, we're on the air to bring you the most up-to-date information on the tech industry, with scrutiny on releases of the day and news of industry-wide trends. We're here daily with breaking analysis from the best minds in the business. Join me, Kristin Feledy, daily at the news desk on SiliconANGLE TV, your reference point for tech innovation.

Published Date : Oct 25 2012

ENTITIES

Entity	Category	Confidence
Joe Hellerstein	PERSON	0.99+
George Gilder	PERSON	0.99+
Ted Dunning	PERSON	0.99+
Kristin Filetti	PERSON	0.99+
Joel Hellison	PERSON	0.99+
John Schroeder	PERSON	0.99+
Joe	PERSON	0.99+
Jack	PERSON	0.99+
Larry Ellison	PERSON	0.99+
Jack Norris	PERSON	0.99+
John	PERSON	0.99+
40 days	QUANTITY	0.99+
Melinda Graham	PERSON	0.99+
64%	QUANTITY	0.99+
$99	QUANTITY	0.99+
comScore	ORGANIZATION	0.99+
Tim	PERSON	0.99+
Dave	PERSON	0.99+
Tuesday	DATE	0.99+
Matt BARR	PERSON	0.99+
Hellerstein	PERSON	0.99+
Google	ORGANIZATION	0.99+
George Gilder	PERSON	0.99+
Ted	PERSON	0.99+
John ferry	PERSON	0.99+
30 years	QUANTITY	0.99+
30,000 times	QUANTITY	0.99+
today	DATE	0.99+
IBM	ORGANIZATION	0.99+
a week later	DATE	0.99+
yesterday	DATE	0.99+
two	QUANTITY	0.99+
three companies	QUANTITY	0.99+
Dana	PERSON	0.99+
Tim SDS	PERSON	0.99+
one point	QUANTITY	0.99+
Java	TITLE	0.99+
first	QUANTITY	0.99+
six months later	DATE	0.99+
one	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
one customer	QUANTITY	0.99+
Linux	TITLE	0.98+
once a week	QUANTITY	0.98+
18 months	QUANTITY	0.98+
Rubicon	ORGANIZATION	0.98+
HBase	TITLE	0.98+
Kozum	PERSON	0.98+
Gartner	ORGANIZATION	0.98+
this morning	DATE	0.97+
Telekom	ORGANIZATION	0.97+
this week	DATE	0.97+
10 years ago	DATE	0.97+
second dimension	QUANTITY	0.97+
both	QUANTITY	0.97+
Kozum	ORGANIZATION	0.95+
third one	QUANTITY	0.95+
One	QUANTITY	0.94+
three things	QUANTITY	0.94+
a year ago	DATE	0.94+
Hadoop	TITLE	0.93+
siliconangle.com	OTHER	0.93+
Knicks	ORGANIZATION	0.93+
Regents	ORGANIZATION	0.92+

Jack Norris | Hadoop Summit 2012

>>Okay. We're back live in Silicon Valley, in San Jose, California, for siliconangle.tv's continuous coverage of Hadoop World 2012. This is ground zero for the alpha geeks in big data, the tech elite. We call them tech athletes, and we're excited to cover it on the ground and extract the signal from the noise. This is theCUBE, our flagship telecast. I'm joined by my co-host Jeff Kelly from wikibon.org, the best analyst in the business. Jeff, welcome back for another segment. End of the day, day one, loving every minute. Okay. We're here with our guest, Jack Norris, the CMO of MapR. Jack, welcome back to theCUBE. You've been on a few times. So you guys have some news. Yes. So let's get right to the news. You guys are a player in the business, so share your news with the folks. Excellent, jump right in. >>So, two big announcements today. We announced that Amazon is integrating MapR as part of their Elastic MapReduce service, and both editions, our free edition, M3, as well as M5, are available directly with Amazon, in the cloud. >>So what's the value proposition? Why would a customer say, all right, I want to do this in the cloud, MapR in the Amazon cloud, rather than doing it on premise? >>Okay. There are a lot of value propositions all balled up into one here. First of all, the cloud allows them to spin up very quickly. Within a couple of minutes, you can get hundreds of nodes available. And depending on where you're processing the data, if you've got a lot of data in the cloud already, it makes a lot of sense to do the Hadoop processing directly there. So that's one area. A second is you might have an on-premise cloud deployment and need to have disaster recovery. MapR provides point-in-time snapshots as well as wide-area replication, so you can use mirroring, and having Amazon available as a target is a huge advantage.
And then there's also a third application area, where you can do processing of the data in the cloud and then synchronize those results to an on-premise cluster. So basically, process where the data is and combine the results into a cluster on premise. So you >>Don't have to move the raw data >>On premise. Actually, it's all about, let's do the processing on the data. Well, you know, the whole >>The value proposition of big data in general is, let's move data as little as possible. Yep. So you bring the computation to the data, if you can. So what's your take on this event? This is the fourth Hadoop Summit, and Hortonworks has now fully taken over the show. Talk about what you see out here in terms of the other vendors at play, and just the attendees, the vibe you're seeing. >>It's a lot of excitement. I think a big difference from last year, which seemed to be very developer focused, is that we're seeing a lot of presentations by customers. A lot of information was shared by our customers today. It was fun to see comScore share their success. Boeing gap map is, it was great for us. >>Fantastic. We look at Amazon. Amazon, first of all, is the gold standard for public cloud, right? They've knocked it out of the park. Everyone knows Amazon. But they've been criticized on the big data front because of the cycle times involved. For some developers, I mean, web services spinning up and down, no problem. And we're seeing businesses like Netflix run on Amazon. So Amazon is no stranger to running at scale for cloud, but Hadoop has kind of been a kludgy thing for Amazon. So talk about why Amazon and you guys are a good fit for the market. The market reach is great, so you guys now have a huge addressable market. Are you guys helping solve some of that complexity on the MapReduce side?
What's >>What's the core? I guess the first response would be, I think every customer should have that type of kludge, if they could have the success that Amazon has in Hadoop. They have a huge number of Hadoop deployments and have been very, very successful. I think >>I mean, you know what I mean; it's kludgy everywhere right now. That's the problem. But Amazon has huge scale, and Hadoop's not a natural fit there. >>It's not a natural fit >>For the data component, and HBase, for example. >>So where Amazon has made it very frictionless is the ability to spin up Hadoop to do the analysis. The gap that was missing is some of the HA capabilities, the data protection features, the disaster recovery, and with MapR, now it gives options to those customers. If they want those kinds of enterprise-grade features, now they have an option within EMR: they can select M5 and get moving. If they want performance and NFS, they've got the M3 option. >>Well, congratulations. I think it's a great deal for you guys and for Amazon customers. My question for you is, as you guys explore the enterprise-ready equation, which has been a big topic this week, what does that mean to you guys? Because it means different things to different people. It depends on how high up to OLTP you go, right? I mean, how far from batch to real-time transactional levels you go. Low batch, no problem. But as you start to get nearer to real time, it's going to be a little bit different, a bit of a gray area: security, HDFS. >>Yeah. So Hadoop represents a strategic platform, right? Deploying that in an organization, moving from an experimental, lab-based environment to a production environment, creates a different set of feature requirements. How available is it? How easy is it to integrate, right?
How do I protect that information, and how do I share it? So when we say enterprise grade, we mean you can have SLAs; you can put the data there and be confident that the data will remain there; you can have point-in-time recovery for an application error or user mistake; and you can have disaster recovery features in place. And then the integration is about not recreating the wheel to get access to the information. Hadoop is very powerful, but it requires interacting through the HDFS API. If you can leverage it, as you can through MapR, with NFS, standard file-based access, and standard ODBC access, you open it up. >>So I can use a standard file browser and applications to see and manipulate the data, which really opens up the use cases. And then finally, what we announced in 2.0 was multitenancy features. As you share that information, all of a sudden the SLAs of different groups come into play: well, these guys need it immediately, and if you've got some low-grade batch jobs, they're going to impact that. So you want the ability to protect, to isolate, to secure information, and basically have virtual clusters within a cluster. Those features are important to cloud, but they're also important on premise. >>So, great for the hybrid cloud environments out there. I mean, the multitenancy, cracking the code on that, exactly, is huge. Right now most enterprises like private cloud, because it's basically an extension of their data center, and you're seeing a lot more activity in the hybrid cloud as a gateway to the public cloud. So >>And frankly, people are kind of struggling in an experimental way with Apache Hadoop and the other distributions: the policies are either at the individual file level or the whole cluster. And it almost forced the creation of separate physical clusters, which kind of goes against the whole Hadoop concept.
So the ability to manage at a logical layer, to have separate volumes where you can apply policies that apply to all the content underneath, really makes it much easier for administrators to deal with these multiple use cases. >>Amazon has always been one of those cases for the enterprise, and this has been talked about for years: put the credit card down, go play on Amazon, but then bring it back into the IT group for certification. And so I think this is a nice product for you guys to bring that comfort. You know, we're very >>Excited, the enterprise saying, hey, >>Come play in Amazon. It's bulletproof, enterprise ready. So congratulations. >>I wonder, can we talk use cases? What are you seeing in terms of evolving use cases as Hadoop continues to become more enterprise grade, depending on your definition? How is that impacting what you're seeing, even if it's just the mindset? People think, okay, now it's enterprise grade; well, depending on who you talk to, it's been that way for a bit. But what kind of use cases are you seeing develop now that it's starting to gain acceptance? It's like, okay, we can trust our data is going to be there, et cetera. >>So there's a huge range of use cases that differ by industry and by the kind of dataset that's being used: everything from really a deep store where you can do analytics on it, so you're selecting the content, to something that's very analytic, machine learning intensive, where you're doing sophisticated clustering algorithms, et cetera. Where we've seen an expansion of use cases is around real-time streaming, where you get streaming datasets that are entering into the cloud.
And some of the more mission-critical data, moving beyond just maybe clickstream data, or things where if you happen to drop a few, it's not a big deal, versus the trust-the-business type of content. >>Talk a little bit about the streaming aspects, because of course, when we think of Hadoop, we think of a batch system. Streaming data into Hadoop is something we haven't heard a lot about. So how do you guys approach that? >>So, one of the artifacts of HDFS, which is a distributed file system that stores in the underlying Linux file system, is that it's append only. So as an administrator, you decide, how frequently do I close the file? Am I going to do that on an hourly basis, or every eight hours? Because you have to close the file for other applications to see the data that's been written. Right? So one of the innovations that we pursued was to rewrite that and create this dynamic read-write layer, so you can continue to write data, and any application sees the latest data that's written. You can mount the cluster as if it's storage and just continue to write data there. That really opens up what's possible. Companies like Informatica: their Ultra Messaging product integrates directly with MapR and provides >>So what kind of advantage does that provide to the end user? Translate that into real business value. Why is that important? >>Well, one example is comScore. comScore handles 30 billion objects a day as they go out and try to measure the use of the web, and being able to continually write and stream that information, to scale and handle that in real time, and to do analytics and turn around data faster, has tremendous business value to them.
If they're stuck in a batch environment where the load times lengthen to the point where all of a sudden they can't keep up, they're actually reporting on old news. And I think the analogy is, forecasting rain a day after it's wet isn't exactly valuable. >>Yeah. So, obviously a great deal, enterprise ready for Amazon; big story, big coup for the company. What's next for you? I want to ask that and make sure you get your agenda for the next year out there, but then I want you to take a step back a year, maybe a year and a half. Look back at how much has changed in this landscape. Share your perspective, because the market has gone through an evolution where there's been a market opportunity, and then everyone goes, oh my God, it's bigger than we actually thought. I mean, Jeff Kelly's groundbreaking report about the $50 billion market is now being talked about as too low. So big data has absolutely opened up to be huge, and it's changed some of the tactics around strategies: your strategy, Hortonworks' strategy, even Cloudera's. And it's still evolving. So what's changed for the folks out there from a year and a half ago, a year ago, to today? And then look out over the next 12 months: what's on your agenda? >>Well, if you look back, I think we've been fairly consistent. I'm not going to take credit for the vision of our CEO and CTO, but they recognized early on that Hadoop was a strategic platform, and that being a strategic platform that applied to the broadest number of use cases and organizations required some areas of innovation; in particular, how it scaled, how it was managed, and how you stored and protected the information needed a rearchitecture. And I think that architecture matters when you're going through a paradigm shift; having the right one in place creates this ability to speed innovation.
And I think if there's anything that's changed, it's that the speed of innovation has even increased in the Hadoop community. I think it's created a focus on these enterprise-grade features, on how we store this valuable information and continue to explore. >>And I think one of the observations I'll make on that note is that it really focuses everyone to just mind your own business and get the products out. You know what I'm saying? We've seen the product focus be the number one conversation. >>What we've seen is customers start and they expand rapidly. Some of that is due to data growth, but a lot of it is due to more and more applications being delivered, and the value being extracted from the Hadoop platform, and success breeds success. Well, >>Congratulations on all your success: a great win with Amazon Web Services, making that a little bit easier, more robust, with more features for them and more revenue for you. And I want to personally thank you for your support of theCUBE. We've expanded with a new studio B for extra interviews, and we want to expand the conversation. Thanks to your generous support, we can bring the independent coverage out to the market. Great community; thanks for helping us out, and we appreciate it. So thank you. Okay. Jack Norris with MapR. We'll be right back to wrap up day one; Jeff and I will give our analysis right after the short break.
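
The point made in this interview about append-only storage, where a file has to be closed before other applications can see what was written, can be illustrated with a toy model. The sketch below is only an in-memory analogy: the class names `CloseToPublishFile` and `ReadWriteFile` are hypothetical and invented for illustration, and nothing here calls HDFS, MapR, or any real file system API.

```python
class CloseToPublishFile:
    """Toy model (hypothetical): readers see records only after close()."""

    def __init__(self):
        self._pending = []   # written but not yet visible to readers
        self._visible = []   # published on close()

    def write(self, record):
        self._pending.append(record)

    def close(self):
        # Closing the file is what publishes the buffered records.
        self._visible.extend(self._pending)
        self._pending.clear()

    def read(self):
        return list(self._visible)


class ReadWriteFile:
    """Toy model (hypothetical): every write is visible to readers at once."""

    def __init__(self):
        self._records = []

    def write(self, record):
        self._records.append(record)

    def read(self):
        return list(self._records)


if __name__ == "__main__":
    batch = CloseToPublishFile()
    batch.write("click-1")
    print("before close:", batch.read())
    batch.close()
    print("after close:", batch.read())

    stream = ReadWriteFile()
    stream.write("click-1")
    print("read-write layer:", stream.read())
```

In the first model, how often the administrator closes the file sets how stale the readers' view can get, which is the hourly versus every-eight-hours trade-off described in the interview; the second model has no such delay.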

Published Date : Jun 14 2012

ENTITIES

Entity	Category	Confidence
Jeff Kelly	PERSON	0.99+
Jeff	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Jack Norris	PERSON	0.99+
Jack Dorsey	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
$50 billion	QUANTITY	0.99+
Silicon valley	LOCATION	0.99+
30 billion	QUANTITY	0.99+
today	DATE	0.99+
Informatica	ORGANIZATION	0.99+
a year ago	DATE	0.99+
next year	DATE	0.99+
comScore	ORGANIZATION	0.99+
a year and a half ago	DATE	0.99+
Kelly	PERSON	0.99+
last year	DATE	0.99+
Amazons	ORGANIZATION	0.99+
Linux	TITLE	0.99+
Matt BARR	PERSON	0.99+
San Jose, California	LOCATION	0.99+
one example	QUANTITY	0.98+
one area	QUANTITY	0.97+
third application	QUANTITY	0.97+
Matt	PERSON	0.97+
one	QUANTITY	0.97+
Hadoop	TITLE	0.97+
this week	DATE	0.96+
2012	DATE	0.95+
hundreds of nodes	QUANTITY	0.94+
Hortonworks	ORGANIZATION	0.94+
Jack	PERSON	0.93+
both edition	QUANTITY	0.93+
a day	QUANTITY	0.93+
two big announcements	QUANTITY	0.92+
second	QUANTITY	0.9+
next 12 months	DATE	0.88+
day one	QUANTITY	0.86+
two dot	QUANTITY	0.85+
M three	OTHER	0.85+
M three	TITLE	0.84+
MapReduce	ORGANIZATION	0.82+
Hadoop Summit 2012	EVENT	0.79+
first response	QUANTITY	0.79+
every eight hours	QUANTITY	0.78+
SLA	TITLE	0.77+
June	DATE	0.77+
first comment	QUANTITY	0.77+
Lastic MapReduce	TITLE	0.69+
M five	OTHER	0.69+
Boeing	ORGANIZATION	0.68+
M five	TITLE	0.67+
siliconangle.tv	OTHER	0.67+
ground zero	QUANTITY	0.67+
Wiki bond.org	ORGANIZATION	0.62+
Apache	ORGANIZATION	0.61+
4th of	EVENT	0.6+

Jack Norris - Strata Conference 2012 - theCUBE

>>Hi everybody. We're back. This is Dave Vellante from Wikibon.org. We're live at Strata in Santa Clara, California. This is SiliconANGLE TV's continuous coverage of the Strata conference. O'Reilly Media is a great partner of ours, and thanks to them for allowing us to be here. We've been going all week, and it's day three for us. I'm here with Jeff Kelly, Wikibon's lead big data analyst. And we're here with Jack Norris, who's the VP of marketing at MapR. Jack, welcome to theCUBE. Thank you, Dave. Thanks very much for coming on. And you know, we've been going all week. You guys are a great sponsor of ours. Thank you for the support. We really appreciate it. How's the show going for you? >>Great. A lot of attention, a lot of focus, a lot of discussion about Hadoop and big data. >>Yeah. So you guys are getting a lot of traffic. I mean, I hear there's 2,500 people here, up from 1,400 last year. So that's >>Yeah, we've had like five, six people deep in the booth. So I think there's a lot of interest. >>That's interesting. You know, when we were here last year, when you looked at the infrastructure and the competitive landscape, there wasn't a lot going on, and in just a very short time that's completely changed. And you guys have had your hand in that. So that's good. Competition is a good thing, right? And obviously customers want choice. So we want to talk about that a little bit. We want to talk about MapR, the kind of problems you're solving. So why don't we start there? What is MapR all about? You've got your own distribution of enterprise Hadoop. You make Hadoop enterprise ready. Let's start there. >>Okay.
Yeah, I mean, we invested heavily in creating an alternative distribution, one that took the best of the open source community with the best of the MapR innovations. And really it's about making Hadoop more applicable: broader use cases, more mission-critical support, you know, being able to sit in and work in a lights-out data center environment. >>Okay. So what was the problem that you set out to solve? Why do we need another distribution of Hadoop? Let me ask it that way. Get nice and close to. >>So there are some just big issues with Hadoop. >>What are those issues? Let's talk about that. >>There's some ease of use issues. There's some dependability issues. There's some performance issues. So, you know, let's take those in order. Right now, if you look at some of the distributions, Apache Hadoop is great technology, but it requires a programmer, right? Access to the data is through the Hadoop API. You can't really see the data. So there's a lot of focus on, you know, what do I do once the data's in there: opening that up, providing full file-based access, right? So I can look at it and treat it like enterprise storage, see the data, use my standard tools, standard commands, you know, drag and drop from a file browser. You can do that with MapR. You can't do that with other distributions. >>You're talking about mounting HDFS as NFS, for example? >>Correct. And then just the underlying storage services. The fact that it's append-only instead of full random read-write, you know, causes some issues. So, you know, that's some of the ease of use features. There's a whole lot we could discuss there. Big picture for reliability and dependability is that there's a single point of failure, multiple single points of failure, within Hadoop. So you risk data loss. So people have looked at Hadoop traditionally as batch oriented, a scratchpad, right? We were out to solve that, right?
We want to make sure that you can use it for mission-critical data, that you don't have a risk of data loss, that you've got full high availability. You've got the full data protection in terms of snapshots and mirroring that you would expect with enterprise products. >>Let's go back to when you guys were, you know, thinking about doing this. I'm not even sure you were at the company at the time, but your DNA was there and you're familiar with it. So you guys saw this big data movement. You saw this Hadoop boom and you said, okay, this is cool. It's going to be big. And it's going to take a long time for the community to fix all these problems. We can fix them now. Let's go do that. Is that the general discussion? >>Yeah. You know, I think what's different about this is that this is the first open source project that's created a market. If you look at the other open source, you know, Linux, MySQL, et cetera, it was really late in the life cycle of a product. Everyone knew what the features were. It was about, you know, giving an alternative choice, a better Unix. Here, the focus is on innovation, and our founders, you know, have deep enterprise background. Our CTO was at Google in charge of Bigtable, understands MapReduce at scale, and spent time as chief software architect at Spinnaker, which was kind of the fastest clustered NAS on the planet. So they recognized that the underlying layers of Hadoop needed some rearchitecture and some deep investment, and to do that effectively and quickly required a whole lot of focus. And we thought that was the best way to go to market. >>Talk about the early validation from customers. Obviously you guys didn't just do this in a vacuum, I presume. So you went out and talked to some customers. Yeah. >>What sorts of conversations with customers while we were in stealth mode? We were probably the loudest stealth >>As you were nodding.
And I mean, what were they telling you at the time? Yeah, please go do this? >>What we addressed weren't secrets. There've been JIRAs open for four or five years on these issues. >>Yeah. But at the same time, Jack, you've got this purist community out there that says, I don't want to rip out HDFS. You know, I want it to be pure. What do you say to those guys? Do you just say, okay, thank you, we understand you're not a prospect? >>And I think that, you know, Hadoop has a huge amount of momentum. And I think a lot of that momentum is that there isn't any risk to adopting Hadoop, right? It's not like the fractured NoSQL market, where there's 122 different entrants and which one's going to win? Hadoop's got the ecosystem. So when you say pure, it's about the APIs. It's about making sure that if I create a MapReduce job, it's going to run on Apache, it's going to run on MapR, it's going to run on the other distributions. That's where I think the heat and the focus is. Now, to do that, you also have to have innovation occurring up and down the stack that provides choice and alternatives. >>So when I'm talking about purists, I agree with you on the whole lock-in thing, which is the elephant in the room here. People will worry about lock-in. >>Pun intended. >>No, no, but good one, good catch. But you're basically saying, hey, we're no more locked in than Cloudera, right? I mean, they've got their own >>Actually, I think we're less, because it's so easy to get data in and out with our NFS. There's probably less so. >>And I'm going to come back to that. But for instance, when I say purists, I mean some users and ISVs, some guys we've had on here. We had Abhi Mehta from Tresata on the other day, for instance. He's one who said, I just don't have time to mess with that stuff and figure out all that API integration.
I mean, there are people out there that just don't want to go that route. Okay. But you're saying, I'm inferring, there's plenty who do, right? >>And by the API route, I want to make sure I understand what you're saying. You >>Talked about, hey, it's all about the API integration. It's not >>It's about the APIs being consistent, a hundred percent compatible, right? So if I, you know, write a program that's going after HDFS and the HDFS API, I want to make sure that that'll run on other distributions, right? >>And that's your promise. Yeah. Okay. All right. So now where I was going with this was, again, there are some purists who say, oh, I just don't want to mess with all that. Now let's talk about what it means to mess with all that. So comScore was a big, high-profile case study for you guys. They were a Cloudera customer. My understanding is that they basically migrated from Cloudera to MapR in a couple of days. And the impetus was, well, let's talk about that. Why'd they do that? >>Performance, data protection, ease of use. >>License fee issues? There were some license issues there as well, right? Your maintenance pricing was more attractive. Is that true? Or >>It was more mainly about price performance and reliability. And, you know, they tested our stuff and it worked real well in a test environment, so they put it in the production environment. They didn't actually tell all their users. They had one guy debug the software for half a day because he thought something was wrong: it finished so quickly. >>So it took them a couple of days to migrate and then boom. >>Boom. And they handle about 30 billion objects a day. So, you know, they use that really high performance support for streaming data flows. They're doing forecasts and insights into web behavior, and the earlier they can do that, the better off they are.
>>Great. >>So talk about the implications of your approach in terms of the customer base. I'm imagining that your customers are perhaps more advanced than a lot of your typical Hadoop users who are just getting started tinkering with Hadoop. Is it fair to say, you know, your customers know what they want, they want performance and they want it now, and they're a little more advanced than perhaps some of the typical early adopters? >>We've got people who go to our website and download the free version, and some of them are just starting off and getting used to Hadoop. But we did specifically target those very experienced Hadoop users that, you know, were kind of stubbing their toes on the issues. And so they're very receptive to the message that we've made it faster, we've made it more reliable, and we've added a lot of ease of use to Hadoop. >>So let me interrupt and go back to what I was saying before. I found this comment online from Mike Brown, comScore CTO, I presume. He said MapR's Direct Access NFS feature exposes Hadoop Distributed File System data as NFS files, which can then be easily mounted, modified, or overwritten. So that's a data access simplification. He also said, we could capitalize on the purchase of MapR with an annual maintenance charge versus a yearly cost per node. NFS allowed our enterprise systems to easily access the data in the cluster. So does that make sense to you, that annual maintenance charge versus yearly cost per node? I didn't get that. >>Oh, I think he's talking about how some organizations prefer to do a perpetual license versus a subscription model. >>Oh, okay.
So the traditional way of licensing software. >>And that you have to do it basically reinforces the fact that we've really invested and have kind of a product orientation, rather than just services on top of some open source. >>Okay. So you go in, you license it, and then, yeah, perpetual license. >>Then you can also start with the free edition, which does all the performance and NFS support. Kick the tires >>Before you buy it. Sorry, Jeff. Sorry to interrupt. >>No, no problem at all. So another topic of a lot of interest is security, making Hadoop enterprise ready. One of the pillars there is security, making sure of access controls, for instance. Let's talk about how you guys approach that and maybe how you differentiate from some of the other vendors out there. >>Full Kerberos support. We link in to enterprise standards for access, LDAP, et cetera. We leverage the Linux PAM security, and we also provide volume control. So, you know, right now in Apache Hadoop and other distributions, you put policies at the file level or on the entire cluster. And we see many organizations having separate physical clusters because of that limitation, right? And we provide volumes. So you can define a volume, and in that volume control access, administrative privileges, and data protection class, and, you know, in a sense kind of segregate that content. And that provides a lot of control and a lot more, you know, security and protection and separation of data. >>Is that scenario, the comScore scenario, common, where somebody's moving off an existing distribution onto MapR? Or are you seeing more demand from new customers that are saying, hey, what's this big data thing, I really want to get into it? How's it shake out there? >>Right now, there's this huge pent-up demand for these features. And we're seeing a lot of people that have run on other distributions switch to MapR. >>A little bit of everything.
Can you talk a little bit about your channel, your go-to-market strategy, maybe even some of your ecosystem and partnerships, in the little time we have? >>Sure. So EMC is a big partner. The EMC Greenplum MR Edition is basically MapR. You can start with any of our editions and upgrade to Greenplum with just a license key. That gives us worldwide service and support. It's been a great partnership. >>We hear about a lot of proofs of concept out there. >>Yeah. And then it just hit the news today about EMC's distribution, the MR distribution, being available with Cisco's UCS gear. So now that's further expanded the footprint that we have. >>Okay. So there's the EMC relationship. Anything else that you can share with us? >>We have other announcements coming out and >>Then you want to pre-announce on theCUBE? >>Oops. Did I let that slip? >>It's live, so be careful. And so, in terms of your channel strategy, are you guys mostly selling direct, indirect, or a combination? >>It's kind of an indirect model through these large partners, with a direct assist. >>Yeah. Okay. So you guys come in and help evangelize. Yep. Excellent. All right. Do you have anything else before we've got to roll here? >>Yeah, I did wonder if you could talk a little bit about this. You mentioned EMC Greenplum, and there's a lot of talk about the data warehouse market, the MPP data warehouses versus Hadoop. Based on that relationship, I'm assuming that MapR thinks they're certainly complementary. Can you just touch on that? You know, as opposed to some who think, well, Hadoop is going to be the platform where we go. >>Well, I mean, if you look at the typical organization, they're just really trying to get their arms around a lot of this machine-generated content, this, you know, unstructured data that's just growing like wildfire. So there's a lot of Hadoop-specific use cases that are being rolled out.
There are also kind of data lakes, data oceans, whatever you want to call them, large pools where that information is then being extracted and loaded into data warehouses for further analysis. And I think the big pivot there is that if it's well understood what the issue is, you define the schema, and then there's a whole host of data warehouse applications out there that can be deployed. But there are many things where you don't really understand that yet, and having Hadoop, where you don't need to define a schema a priori, is a big value. >>Jack, I'm sorry. We have to go, we're running a couple of minutes behind. Thank you very much for coming on theCUBE. Great story. Good luck with everything. It sounds like things are really going well, the market's heating up, and you're in the right place at the right time. So thank you again. Thank you to Jeff. And we'll be right back, everybody, to the Strata conference, live in Santa Clara, California, right after this word from our.

Published Date : Apr 27 2012

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Jeff	PERSON	0.99+
Jack Norris	PERSON	0.99+
five	QUANTITY	0.99+
Dave Volante	PERSON	0.99+
Jack	PERSON	0.99+
EMC	ORGANIZATION	0.99+
last year	DATE	0.99+
Matt BARR	PERSON	0.99+
four	QUANTITY	0.99+
UCS	ORGANIZATION	0.99+
2,500 people	QUANTITY	0.99+
Santa Clara, California	LOCATION	0.99+
Greg	PERSON	0.99+
Google	ORGANIZATION	0.99+
Mike Brown	PERSON	0.99+
half a day	QUANTITY	0.99+
Spinnaker	ORGANIZATION	0.99+
Hadoop	TITLE	0.99+
comScore	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
Riley	ORGANIZATION	0.98+
EMC Greenplum	ORGANIZATION	0.98+
Abby Mehta	PERSON	0.98+
Linux	TITLE	0.97+
strata conference	EVENT	0.97+
SQL	TITLE	0.97+
One	QUANTITY	0.97+
one guys	QUANTITY	0.97+
today	DATE	0.97+
Raleigh	ORGANIZATION	0.97+
122 different entrance	QUANTITY	0.97+
six people	QUANTITY	0.97+
Skipio	PERSON	0.96+
Jeff Kelly	PERSON	0.95+
single point	QUANTITY	0.95+
about 30 billion objects a day	QUANTITY	0.94+
Strata Conference 2012	EVENT	0.93+
ECS	ORGANIZATION	0.93+
hundred percent	QUANTITY	0.91+
Triceda	ORGANIZATION	0.9+
Apache	TITLE	0.9+
firs	QUANTITY	0.9+
Paducah	LOCATION	0.89+
Greenplum	ORGANIZATION	0.89+
single points	QUANTITY	0.88+
day three	QUANTITY	0.88+
NFS	TITLE	0.87+
Wiki bond.org	OTHER	0.87+
1400	QUANTITY	0.85+
Unix	TITLE	0.85+
Wiki bonds	ORGANIZATION	0.84+
Silicon angle	ORGANIZATION	0.83+
Mapbox	ORGANIZATION	0.78+
Apache	ORGANIZATION	0.76+
MapReduce	ORGANIZATION	0.75+
Kerberos	ORGANIZATION	0.75+
first open	QUANTITY	0.74+
Pam	TITLE	0.73+
Matt bar	ORGANIZATION	0.73+
Nazanin	ORGANIZATION	0.61+
Cloudera	TITLE	0.59+
moon	LOCATION	0.58+
Cisco	ORGANIZATION	0.54+
one	QUANTITY	0.53+
days	QUANTITY	0.52+
MapReduce	TITLE	0.47+

Search Results for Matt BARR: