Tendu Yogurtcu, Syncsort - #BigDataSV 2016 - #theCUBE
>> From San Jose, in the heart of Silicon Valley, it's theCUBE, covering Big Data SV 2016. Now, your hosts, John Furrier and George Gilbert.

>> Okay, welcome back, everyone. We are here, live in Silicon Valley, for theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, with my co-host George Gilbert, big data analyst at Wikibon.com. Our next guest is Tendu Yogurtcu, General Manager of Big Data at Syncsort. Welcome back to theCUBE. Syncsort's been a longtime guest, one of those companies we love to cover, because your value proposition is right in the center of all the action around mainframes. Dave and I always love to talk about the mainframe; we remember those days, and mainframes still power a lot of the big enterprises. So I've got to ask you: what's your take on the show here? One of the themes that came up last night on CrowdChat was: why is enterprise data warehousing failing? So there's some conversation there, but you're seeing a transformation. What do you guys see?

>> Thank you for having me; it's great to be here. Yes, we are seeing the transformation to the next-generation data warehouse and the evolution of the data warehouse architecture, and mainframes are a big part of this architecture, because seventy percent of the world's data is still on mainframes. That is a large amount of data. So when we talk about big data architecture, about making big data and enterprise data useful for the business, and about having advanced analytics, not just gaining operational efficiencies with the new architecture but also offering new products and new services to the customers of those organizations, this data is intact, and making it part of the next-generation data warehouse architecture is a big part of these initiatives. We play a very strong, core role in bridging the gap between mainframes and
the big data platforms, because we have product offerings spanning platforms, and we are very focused on accessing and integrating data, in a secure way, from mainframes to the big data platforms.

>> One of the things the mainframe highlights is a dynamic in the marketplace. Many firms, whether or not they're your customers, have mainframes, so they already have a ton of data; they're data-full, as we say in theCUBE. They have a ton of data, but they spend a lot of time, as you mentioned, cleaning that data. How do you guys specifically solve that? It's a big hurdle they want to put behind them; they want to clean fast and get on to other things.

>> Yes. We see a few different trends and challenges. First of all, with their big data initiatives, everybody is really trying to gain operational efficiency and business agility, to make use of data they weren't able to make use of before, and to enrich that data with some of the new data sources they might be adding to the data pipeline, or they are trying to provide new products and services to their customers. So when we talk about mainframe data, it's really about how you access this mainframe data in a secure way, and how you make data preparation very easy for the data scientists. Data scientists are still spending close to eighty percent of their time on data preparation. And if you think about it, when we talk about compute frameworks like Spark, MapReduce, and Flink, those technology-stack choices should not be relevant to the data scientist. They should just be worried about how to create their data pipeline and what new insights they're trying to get from the data. The simplification we bring to that data cleansing and data preparation is one part: we bring a simple way to access and integrate all of the enterprise data, not just the legacy mainframe and relational data sources but also the
emerging data sources: streaming data sources, messaging frameworks, new data sources. We also do this in a cross-platform, secure way. For example, with some of the new features we announced, where we were already simply the best at accessing all of the mainframe data and making it available on Hadoop and Spark, we now also make Spark and Hadoop understand this data in its original format. You do not have to change the original record format, which is very important for highly regulated industries like financial services, banking, insurance, and health care, because you want to be able to do the data sanitization and data cleansing and yet bring that mainframe data over in its original format for audit and compliance reasons.

>> Okay, so this is the product where, as you were telling us earlier, you can move the data from the mainframe and do processing at a scale and cost that's not possible, or at least not easy, on the mainframe; do it on a distributed platform like Hadoop while it preserves its original encoding; and send it back. But then there's also this new way of creating a data fabric that we were talking about earlier. It used to be point-to-point, from the transactional systems to the data warehouse, and now we've basically got this richer fabric, with your tools sitting on technologies like Spark and Kafka. Tell us what that world looks like and how it's different.

>> We see a greater interest in the concept of a data bus. Some organizations call it data as a service, some organizations call it Hadoop as a service, but ultimately it's an easy way of publishing data and making data available to both the internal and external clients of the organization. Kafka is in the center of this, and a lot of our partners, including Hadoop vendors like Cloudera, MapR, and Hortonworks, as well as Databricks and Confluent, are really focused on
creating that data bus. We play a very strong role there, because the phase-one project for these organizations is: how do I create this enterprise data lake, or enterprise data hub? That is usually the phase-one project, because for advanced analytics or predictive analytics, when you make a change in your mortgage application, you want to be able to see that change on your mobile phone in under five minutes. Likewise, when you make a change in your healthcare coverage or telecom services, you want to see it on your phone in under five minutes. These things really require easy access to that enterprise data hub. We have a tool called DataFunnel that simplifies this to one click and significantly reduces the time to create the enterprise data hub. Our customers are using it to access, I would not say migrate, data from database tables, DB2 for example: thousands of tables populated, with the metadata mapped automatically, whether that metadata is going to land as Hive tables or Parquet files or whatever the format is going to be on the distributed platform. This really shortens the time to create the enterprise data hub.

>> That sounds really interesting. What I'm hearing is that the first step was to create this data lake: put data in there, get our feet wet, and learn new analysis patterns. But if I'm hearing you correctly, you're saying that radiating out of that is a new sort of data backbone, much lower latency, that gets data out of the analytic systems and back into the operational systems, or into new systems, at a speed we didn't have before, so that we can do an analysis and make decisions very quickly.

>> Yes, that's true. Basically, operational intelligence and analytics are converging, and in that convergence what we are seeing is: I'm analyzing security data, I'm analyzing telemetry data that's streamed, and I want to
be able to react as fast as possible. Some of the interest in the emerging compute platforms is really driven by these use cases. Many of our customers are basically saying: today, operating in under five minutes is enough for me; however, I want to be prepared, I want to future-proof my applications, because in a year I might have to respond in under a minute, or even in sub-seconds.

>> When they talk about being future-proofed, and you mentioned those two time brackets on either end, are your customers saying they're looking at speeds that current technologies don't support? In other words, are they evaluating things that are essentially research projects right now, very experimental, or do they see a set of technologies they can pick and choose from to serve those different latency needs?

>> We published a Hadoop survey earlier this year, in January. According to the results, seventy percent of the respondents were evaluating Spark, and this is very consistent with our customer base as well. The promise of Spark is driven by multiple use cases and multiple workloads, including predictive analytics, streaming analytics, and batch analytics, all able to run on the same platform, and all of the Hadoop vendors are supporting it. Our customer base is heavy enterprise customers, already in production on Hadoop, so running Spark on top of their Hadoop cluster is one way they are future-proofing their applications. This is where we also bring value, because we really abstract and insulate the user. While we are liberating all of the data from the enterprise, whether it's in the relational legacy data warehouse, on the mainframe side, or coming from new web clients, we are also helping them insulate their applications, because they don't really need to worry about which next compute framework is going to be the
fastest, most reliable, and lowest latency. They need to focus on the application layer; they need to focus on creating that data pipeline.

>> I want to ask you about the state of Syncsort. You guys have had great success with the mainframe, and with this concept of the data funnel you can bring stuff in very fast. There's new management, new ownership. What's the update on the market dynamics, now that ingestion means rethinking data sources? What's the plan for Syncsort going forward? Share with the folks out there.

>> Sure. Our new investor, Clearlake Capital, is very supportive of both organic and inorganic growth, so acquisitions are one of the areas for us. We plan to make one or two acquisitions this year, and companies with products in near-adjacent markets are a real value-add for us. That's one area, in addition to organic growth. In terms of organic growth, we have been very successful with a lot of organizations, in insurance, financial services, banking, and healthcare, many of the verticals; very successful at helping our customers create the enterprise data hub, access and integrate all of the data, and now carry them to the next-generation frameworks. Those are the areas where we have been partnering with them. The next step for us is really having streaming data sources as well as batch data sources flow through a single data pipeline, and this includes bringing telemetry data and security data into the advanced analytics as well.

>> Okay, so it sounds like you're providing a platform that can handle today's needs, which are mostly batch, but also the emerging ones, which are streaming, so you've got that future-proofing customers are looking for. Once they've got those types of data coming together, including stuff from the mainframe that they might want to enrich from public sources, what new things do you see them doing?

>> Predictive analytics and machine learning are a big part of this, because
ultimately there are different phases, right? The operational-efficiency phase was the low-hanging fruit for many organizations: I want to understand what I can do faster, serve my clients faster, and create that operational efficiency in a cost-effective, scalable way. Second was: what are the new go-to-market opportunities with transformative applications? What can I do by recognizing how my telco customers are interacting with SaaS services, and how, in under a couple of minutes, can I react to their responses or self-service? That's the second one. And then the next phase is: how do I use this historical data, in addition to the streaming data I'm rapidly collecting, to actually predict and prevent things? This is already happening in banking, for example, where a lot of predictive analysis goes into fraud detection. So advanced analytics using AI, advanced analytics using machine learning, will be a very critical component of this moving forward.

>> This is really interesting, because now you're honing in on a specific industry use case, and on something every vendor is trying to solve: fraud detection and fraud prevention. How repeatable is it across your customers? Is this something they have to build from scratch because there are no templates, or do templates get them fifty percent of the way there, seventy percent of the way there?

>> Actually, there's an opportunity here, because if you look at the healthcare, telco, financial services, or insurance verticals, there are repeating patterns, and one of them is fraud. Fraud, and some of the new use cases like customer churn analytics. These patterns, and the compliance requirements in these verticals, actually create an opportunity to come up with applications, for new companies, for new startups.

>> Tendu, final question. Share with the folks out there your view of the show right now. This is ten years of Hadoop and seven years of this event. Big
Data NYC, we had a great event there in New York City. Now, Silicon Valley: what's the vibe here?

>> This is one of the best events. I really enjoy Strata San Jose, and I'm looking forward to two days of keynotes and to hearing from and networking with colleagues. This is really where the heartbeat happens, because with Hadoop World and Strata combined, we actually started seeing more business use cases and more discussion around how to enable the business users, which means the technology stack is maturing and the focus is really on the business, on creating more insights and value for the businesses.

>> Tendu Yogurtcu, welcome back to theCUBE; thanks for coming by, we really appreciate it. Go check out our Dublin event on the fourteenth of April; Hadoop Summit will be in Europe for that one. And of course, go to SiliconANGLE TV and check out our Women in Tech series; every week, on Wednesday, we feature women in tech. Thanks for joining us, and thanks for sharing the insights from Syncsort. TheCUBE will be right back with more live coverage from Silicon Valley after this short break.
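One way to picture the point above about bringing mainframe data across in its original record format: keep the raw bytes untouched for audit and compliance, and decode a working copy for analytics. The sketch below is purely illustrative (the function is hypothetical, not a Syncsort API, and it assumes the records use EBCDIC code page 037):

```python
def prepare_record(raw: bytes) -> dict:
    """Keep the original mainframe bytes for audit/compliance,
    and decode a working copy (EBCDIC code page 037) for analytics."""
    return {
        "original": raw,                 # untouched, audit-ready bytes
        "decoded": raw.decode("cp037"),  # analyst-friendly text copy
    }

# "HELLO" encoded in EBCDIC (cp037)
record = prepare_record("HELLO".encode("cp037"))
print(record["decoded"])         # HELLO
print(record["original"].hex())  # c8c5d3d3d6
```

Because the original bytes are carried through unchanged, a regulated shop can hand auditors exactly what came off the mainframe while analysts work on the decoded copy.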
Jack Norris - Hadoop Summit 2013 - theCUBE - #HadoopSummit
>> ...what will that mean to my investment? And the announcement with Fusion-io is that we're 25 times faster on read-intensive HBase applications with that combination. So as organizations are deploying Hadoop and looking at technology changes coming down the pike, they can rest assured that they'll be able to take advantage of those in a much more aggressive fashion with MapR than with other distributions.

>> Jack, I've got to ask you. We were talking last night at the Hadoop Summit kickoff party; everyone was there, all the top execs and all the developers. I think either Dave or myself coined the term "the big three of big data": you guys, Cloudera, MapR, and Hortonworks, really the key players from early on. Charles from Cloudera was just recently on, and he's like, oh no, this enterprise-grade stuff has been kicked around; it's been there from the beginning. You guys have been there from the beginning, and MapR has never, ever waffled on your messaging. You've always been very clear: hey, we're going to take open source Hadoop and turn it into an enterprise-grade product. So that's clear. What's your take on this? Because now enterprise grade is kind of there, and the buzz is around getting the folks that have crossed the chasm implemented. So can you comment on that: one, enterprise grade, the reality of it, certainly from your perspective; and then, for those folks now rolling it out for the first time, what can you share with them about what it means to be enterprise grade?

>> So enterprise grade is more about the customer experience than a marketing claim.
And by enterprise grade, what we're talking about are the capabilities and features customers have grown to expect in their other enterprise applications: the ability to meet full SLAs, full HA with recovery from multiple failures, rolling upgrades, data protection with consistent snapshots, business continuity with mirroring, and the ability to share a cluster across multiple groups and have volumes. There's a host of features that fall under the umbrella of enterprise grade. And when you move from no support for any of those features to support for a few of them, I don't think that gets you to HA; it's more like moving to low availability. There are just a lot of differences between what we mean when we say enterprise grade with those features and what we view as an incomplete story.

>> What do you mean by low availability? I mean, it's tongue in cheek, it's a good term, but it's really saying: just available sometimes? Is that what you mean, not true availability? I mean, availability is 99.9%, right?

>> Right. So if you've got an HA solution that can't recover from multiple failures, that's downtime. If you've got an HBase application that's running online and a node goes down, and it takes 10 to 30 minutes for the region servers to recover it from another place in the distribution, that's downtime. If you have snapshots that aren't consistent across the cluster, that doesn't provide data protection; there's no point-in-time recovery for the cluster. So there's a lot of detail underneath that, but what it amounts to is: do you have interruptions? Do you have downtime? Do you have the potential for losing data? And our answer is, you need a series of features that are hardened and proven to deliver on that.

>> What about recoverability?
You mentioned that you guys have done a lot of work in that area with snapshotting, and that's kind of being kicked around. Are folks addressing it? What's your competition doing in those areas of recoverability? You just mentioned availability; okay, got that. Recoverability, security, compliance, and usability: those seem to be the hot focus areas. What's going on in the industry, and candidly, what letter grade would you give it compared to what you guys offer?

>> Well, first of all, take recoverability. One of the tenets is that you have point-in-time recovery, the ability to restore to a previous point that's consistent across the cluster. Right now there's no point-in-time recovery for HDFS, for the files, and there's no point-in-time recovery for HBase tables. There is snapshot support being talked about in the open source community, but those snapshots are being referred to in the JIRAs as fuzzy snapshots, and they're really comparable to copy table.

>> So, Jack, I want to turn the conversation to a topic we've talked about before: the open-versus-proprietary debate. We've heard about that; we've talked about it before here on theCUBE. So reiterate your take for us. Perhaps because of the show we're at, there's a lot of talk about the open source nature of Hadoop, and some of the purists, as you might call them, are saying it's got to be a hundred percent open, Apache compatible, et cetera, while others are taking a different approach. Explain your approach, and why you think that's the key way to really spur adoption of Hadoop.

>> We're a part of the community. We've got commits going on; we've pioneered and pushed Apache Drill; but we have done innovations as well.
And I think those innovations are really required to support and extend the whole ecosystem. Canonical distributes our M3 distribution, and all our packages are available on GitHub as open source. So it's not a binary debate. The point is that some companies have jumped ahead, and now the peloton is pedaling faster and will catch up, will streamline. I think the difference is that we rearchitected, so we're basically in a race car, racing ahead with the enterprise-grade features that are required, and there's a lot of work that still needs to be accomplished before that full rearchitecture is in place elsewhere.

>> Well, I think for me the proof is really in the pudding when it comes to customers doing real things, running real production-grade, mission-critical applications. To me, that shows the relative success of a given approach. I know you guys are working with companies like Ancestry.com, Live Nation, and Quicken Loans. Could you walk us through a couple of those scenarios? Let's take Ancestry.com; obviously they've got a huge amount of genealogical data. What do you do with them?

>> Yeah, they've got the world's largest family-genealogy service available on the web, so there's a massive amount of data that they make accessible and available for analysis. And they've rolled out new features and new applications, one of which is to ship a kit out, have people spit in a tube and return it, and then do DNA matching to reveal additional details. So some really fabulous leading-edge things are being done there with the use of Hadoop.

>> Interesting.
So talk about when you went to work with them. What were some of their key requirements? Was it more around the enterprise-grade security and uptime side of the equation, or more around the analytics? What's the killer use case for them?

>> It's hard to say for a specific company, or even to generalize across companies, because there are really three main areas: ease of use and administration; dependability, which includes the full HA; and performance. In some cases just one of those drives it and is used to justify it; in other cases it's a collection. The ease of use is being able to use a cluster not only as Hadoop, but to access it and treat it like enterprise storage. There's a complete POSIX-compliant file system underneath, which allows mounting, access, and updates, using it with dynamic read-write. What that means at the application level is that it's faster, it's much easier to administer, and it's much easier and more reliable for developers to utilize.

>> I've got to ask you the marketing question, because MapR, you guys have done a good job of marketing. Certainly we want to thank you for supporting theCUBE in the past; you've been great supporters of our mission. But now the ecosystem's evolving and there's a lot more competition. Charles mentioned the eight companies they're tracking in, quote, Hadoop, and certainly Jeff and I, and SiliconANGLE, see a lot more, because Hadoop-washing has been going on: jumping in and slapping the Hadoop name onto an existing solution. It's been happening full bore for a year at least. What's next for you guys to break above the noise? Obviously the communities are very active and projects are coming online.
You guys have your mission in the enterprise. What's the strategy going forward? More of the same, or anything new you can share?

>> Yeah, I think as far as breaking above the noise, it will be our customers, their success and their use cases, that really put the spotlight on what the differences are in using a big data platform. I'd draw an analogy to supply chain: the big revolution in supply chain was focusing on inventory at each stage, on how you reduce that inventory level and how you speed the flow of goods and the agility of a company for competitive advantage. I think we're going to view data the same way. So instead of raw data that companies are copying and moving across different silos, if they're able to process data in place and send small result sets, they're going to be faster, more agile, and more competitive. And that puts the spotlight on which data platform out there can support a broad set of applications and has the broadest set of functionality. What we're delivering is an enterprise-grade, mission-critical platform that supports MapReduce with high performance, provides NFS and POSIX access so you can use it like a file system, and integrates enterprise-grade NoSQL applications. So now you can do high-speed, consistent-performance, real-time operations in addition to batch, streaming, integrated search, et cetera. It's really exciting to provide that platform and have organizations transform what they're doing.

>> How's the feedback on Ted Dunning? I've seen a lot of buzz on the Twittersphere; he's getting positive feedback here. He's a tech athlete, a guru, an expert; he's got his hands in all the pies, a scientist type. What's his role within MapR? He's obviously playing in the open-source community. What's he up to these days?

>> Chief application architect. He's on the leading edge of Mahout, so machine learning, sharing insights there. He was speaking at the Storm meetup two nights ago, sharing how you can integrate long-running batch predictive analytics with real-time streaming, and how the use of snapshots makes that easy and possible. He travels the world, helping organizations understand how they can take some very complex, long-running processes and really simplify and shorten them.

>> I had a chance to meet him at a party at the last Hadoop World in New York City. Great guy, fantastic geek, certainly doing great work; shout out to Ted, congratulations, keep up that support. How's everyone else doing? How are John and Srivas doing? How's the team at MapR? You're pedaling as fast as you can, growing?

>> Really quickly. No, we're not just pedaling, we're shifting gears.

>> A bigger engine?

>> Yeah.

>> Give us an update on the company in terms of growth and where you guys are moving.

>> We're expanding worldwide. Just in the last few months we've opened offices in London, Munich, and Paris, and we're expanding in Asia, in Japan and Korea. Our sales, services, and engineering, basically the whole company, continue to expand rapidly. There are some really great, interesting partnerships and a lot of growth, not only as we add customers, but it's nice to see customers continuing to grow their use of MapR within their organizations, both in terms of the amount of data they're analyzing and the number of applications they're bringing to bear on the platform.

>> Well, let's talk about that a little bit, because one of the trends we do see is that when a company brings in a big data platform, they might start experimenting with it and build an application, and then maybe the marketing department or the sales guys see it and say, well, maybe we can do something with that. Is that typically the experience you're seeing, and how do you support companies that want to expand beyond those initial use cases to support other departments, potentially even other physical locations around the world?

>> That's been the beauty of it: you have a platform that can support those new applications. If mission-critical workloads are not an issue, and if you support volumes so you can logically separate, which we do, it makes it much easier. One of our customers, Zions Bank, brought in MapR to do fraud detection, and pretty soon, because they were able to collect all of that data, they had other departments coming to them and saying, hey, we'd like to use that for analysis, because we're not getting that data from our existing systems.

>> Yeah, they come in and you're sitting on a goldmine; there are the use cases. You also mentioned that you're expanding internationally. What's your take on the international market for big data, and Hadoop specifically? Is the U.S. leaps and bounds ahead of the rest of the world in terms of adoption of the technology? What are you seeing out there?

>> I wouldn't say leaps and bounds, and I think internationally they're able to skip some of the experimental steps. We're seeing deployments across financial services and telecom, and it's fairly broad. Take Recruit Technologies in Japan.
The largest provider of recruiting services, indeed.com is one of their subsidiaries, and they're doing a lot with Hadoop and MapR specifically. So it's been expanding rapidly. Fantastic. >>Also, when you think about Europe, what's going on with Google and some of the privacy concerns, even here. Are there different regulatory environments you've got to navigate when you're talking about data and how you use data, when you're starting to expand to other locales? >>Yeah, there are typically different requirements by vertical: HIPAA in healthcare, Basel II in financial services, and so on. It's basically the same theme: when you're bringing Hadoop into an organization and into a data center, the same sorts of privacy concerns and requirements that you're applying in other areas will be applied to Hadoop. >>Now, turning back to the technology, you mentioned Apache Drill. I'd love to get an update on where that stands, and put it into context for people. We hear a lot about the SQL-on-Hadoop question here; where does Drill fit into that equation? >>Well, there are a lot of different approaches to providing SQL access, and a lot of that is driven by how you leverage the talent in an organization that speaks SQL. There are developments with respect to Hive, and there are other projects out there. Apache Drill is an open-source project getting a lot of community involvement, and the design center there is pretty interesting. It started from the beginning as an open-source project, with two main differences. One was, in looking at supporting SQL, let's do full ANSI SQL, full SQL:2003, not a SQL-like dialect. That supports the greatest number of applications and avoids a lot of support issues.
And the second design center is: let's support a broad set of data sources, so nested sources like JSON, schema discovery, and basically fitting into an enterprise environment, which sometimes is kind of messy and can get messier as acquisitions happen, et cetera. So it's complementary; it's about enabling interactive, low-latency queries. >>Jack, I want to give you the final word. We are out of time. Thanks for coming on theCUBE; great to see you again, CUBE alumni. The final word, and we'll end the segment here on theCUBE, is your quick thoughts on what's happening here at Hadoop Summit. What is this show about? Share with the audience: what's the vibe, the summary, the quick soundbite on Hadoop? >>I think I'll go back to how we started: it's not if you use Hadoop, it's how you use Hadoop. Look at not only the first application but what it's going to look like across multiple applications, and pay attention to what enterprise grade means. >>Okay, we've got more coverage coming. Jack Norris with MapR, I'll say one of the original big three, still on the list in our minds and the market's mind, with a unique approach to Hadoop. This is theCUBE. I'm John Furrier with Jeff Kelly; we'll be right back after this short break.
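To make the "nested sources like JSON" and schema-discovery point above concrete: Apache Drill can run ANSI SQL directly over raw JSON files with no declared schema, including un-nesting arrays with its FLATTEN function. Here is a rough Python analogue of what such a query does; the data and field names are invented for illustration, and this is a sketch of the concept, not Drill itself:

```python
import json

# Raw nested JSON, as it might sit in a file with no declared schema.
raw = '''
[{"name": "acme", "orders": [{"sku": "a1", "qty": 2}, {"sku": "b2", "qty": 1}]},
 {"name": "globex", "orders": [{"sku": "a1", "qty": 5}]}]
'''

def flatten_orders(records):
    """Un-nest each customer's orders array into one flat row per order,
    roughly what Drill's FLATTEN does inside a SQL query."""
    rows = []
    for rec in records:
        for order in rec["orders"]:
            rows.append({"name": rec["name"], **order})
    return rows

# Roughly equivalent to a Drill query over the file, in the spirit of:
#   SELECT name, o.sku, o.qty FROM customers, FLATTEN(orders) o
rows = flatten_orders(json.loads(raw))
for r in rows:
    print(r)
```

The schema here is discovered from the data at read time rather than declared up front, which is what lets Drill sit on top of the "messy" enterprise sources mentioned above without an ETL step first.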