Christian Rodatus, Datameer | CUBEConversation, July 2018
(upbeat music) >> Hi, I'm Peter Burris and welcome to another CUBE Conversation from our wonderful studios in Palo Alto, California. Great conversation today, we got Christian Rodatus, who is the CEO of Datameer, here to talk about some of the trends within the overall analytic space. One of the most important things happening in technology today. Christian, welcome back to theCube! >> Good morning, Peter, thanks for having me today. >> It's great to have you here. Hey, let's start with, kind of some of the preliminaries. What's happening at Datameer? >> Well we've been around for nine years now, which is a lot of time in a very agile technology space. And I actually just came back from an Investiere offsite that was arranged by one of our biggest investors. And everything is centering around the cloud, right? We were trotting along within the Hadoop ecosystem, the big data ecosystem over the past couple years and since, 12, 15 months, the transition in the analytics market and how it's transforming from on premise to the cloud in a hybrid way as well has been stunning, right? And we're faced with a challenge in innovating in those spaces and making our product relevant for on premise deployment, for cloud deployments, and various different cloud platforms, and in a hybrid fashion as well. And we've been traditionally working with customers that have been laggards in terms of cloud adoption because we do a lot of business in financial services, and insurance, healthcare, telecommunications, but even in those industries over the past year, it has been stunning how they are accelerating cloud adoption, how they move analytic workloads to the cloud. >> Well, actually, they all sound like sometimes leaders in the analytics world, even if they're laggards in the cloud. And there's something of a relationship there.
People didn't want to do a lot of their analytics because they were doing analytics in some of the most strategic, sensitive data, and they felt pressured to not give that off to a company that they felt perhaps, or an industry that's a little bit less ready from an infrastructure standpoint. But our research shows pretty strongly that we're seeing a push to adoption, precisely because so much of that ecosystem got wrapped up in the infrastructure and never got to the possible value of analytics. So is that helping to force this along, do you think, the idea of-- >> Absolutely, right, if you look at the key drivers, and there was some other analyst research that I read this week. Why are people being motivated to move analytic workloads into the cloud? It's really less cost, it's really business agility. How do they become independent from IT and procure services across the organization in a very simple, easy, and fast fashion? And then there's a lot of fears associated with it. It's data governance, it's security, it's data privacy, is what these industries that we predominantly work in are concerned with, right, and we provide a solution framework that actually helps them to transition those on premise analytic workloads into the cloud and still get the enterprise grade features that they're used to from an on premise solution deployment. >> Yeah, so in other words, a lot of businesses confuse failure to deal with big data infrastructure as failure to do big data. >> That's correct. >> I want to build on something you've just said, specifically the governance issue, because I think you're absolutely right. There's an enormous lack of understanding about what really constitutes data governance. It used to be, oh, data governance is what the data administrator does when they do modeling, and who gets to change the model, and who owns the model, and who gets to, all that other stuff.
We're talking about something fundamentally different as we embed more deeply some of these analytics directly into high value business activities that are being utilized or performed by high cost business executives. >> Absolutely. >> How does data governance play out, and I'm going to ask you specifically, what are you guys doing that makes data governance more accessible, more manageable, within Datameer customers? >> So I think there's two key features to a solution that are important. So number one, we have very much a self-service aspect to it, so we're pushing abilities to model and create views on the big data assets that are persisting in the data lakes, towards a business user, right? But we do this in a very governed way, right? We can provide full data lineage, we can audit every single step, how the data's being sourced, how it's being manipulated on the way, and provide an audit trail, which is very important for many of the customers that we work with. And we really bring this into the hands of the business users without much IT interference. They don't have to wait on models to be built and so on and so forth, and this is really what helps them build rapid analytic applications that provide a lot of value and benefits for their business processes. >> So you talked about how you're using governance, or the ability to provide a manageable governance regime, to open up the aperture on the utilization of some of these high value analytics frameworks by broader numbers of individuals within the organization. That seems to me to be a pretty significant challenge for a lot of businesses. It's not enough to just have an ivory tower group of data scientists be good at crafting data, understanding data, and then advising people what actions to take based on that data. It seems it has to be more broadly diffused within the organization, what do you think?
>> So this is clearly the trend, and as these analytics services move to the cloud, you will see this even more so, right? You will have created data assets and you provide access control for certain user groups that can see and work with this data, but then you need to provide a solution framework that enables these customers to consume this in a very seamless and an easy way. This is basically what we are doing. We're going to push it down to the end user and give them the ability to work on complex analytical problems using our framework in a governed way, in a fast way, in a very iterative analytic workflow. A lot of our customers say they have analytic, or they pursue analytic problems that are of investigative nature, and you cannot do this if you rely on IT to build new models to delay the process-- >> Or if you only rely on IT. >> And only rely on IT, right? They want to do this on their own and create their own views, depending on their analytic workflow in a very rapid, rapid way. And so we support this in a highly governed way that can do this in a very fast and rapid fashion, and as it moves to the cloud, it provides even more opportunities to do so. >> So as CEO of Datameer, you're spending a lot of time with customers. Are there some patterns that you're seeing customers, in addition to buying Datameer, but are there some patterns in addition to what you just described that the successful companies are utilizing to facilitate this diffusion? Are they training people more? >> Yep. >> Are they embedding this more deeply into other types of applications or workflows? What are some of those patterns of success that you're seeing amongst your customers? >> So that's a very interesting question, right, because a lot of big data initiatives within companies fail for the lack of adoption. So they build these big data lakes and ramp up cloud services, and they never really see adoption.
And so the successful customers we work with, they have a couple of things they do differently than others. They have a centralized, serious type of organization, usually, that facilitates and promotes and educates people on number one, the data assets being available through the organization, about the tool sets that are being used, and amongst one of them, obviously, is Datameer within our customers, and they facilitate constant education and experience sharing across the organization for the use of big data assets throughout the organization. And these companies, they see adoption, right? And it spreads throughout the organization. It has increasing momentum and adoption across various business departments for many high value use cases. >> So we've done a lot of research. I myself have spent a lot of time on questions of technology adoption, questions within the large enterprises. And you actually described it fails to adopt, and from adoption standpoint, it's called they abandon. >> Absolutely true. >> One of the things that often catalyzes whether or not someone continues to adopt, or a group determines to abandon, is a lack of understanding of what the returns are, what kind of returns these changes of behavior are initiating or instantiating. I've always been curious why a lot of these software tools don't do a good job of actually utilizing data about utilization, from a big data standpoint, to improve the adoption of big data. Are you seeing any effort made by companies to use Datameer to help businesses better adopt Datameer? >> Well, I haven't seen that yet. I see this more with our OEM customers. So we've got OEM customers that analyze the cloud consumption with their customers and provide analytics on users across the organization.
I see these things, and from our standpoint, we facilitate this process by providing use case discovery workshops, so we have a services organization that helps our customers to see the light, literally, right, to understand what's the nature of the data assets available. How can they leverage for specific use case, high value use case, implementations, experience sharing, what other customers are doing, what kind of high value application are they going after in a specific industry, and things like this. We do lunch and learns with our customers. We just recently did one with a big healthcare provider and the interest is definitely there. You get 200 people in a room for a lunch and learn meeting, and everyone's interested in how they can make their life easier and make better business decisions based on the data assets that are available throughout the organization. >> That's amazing, when a lunch and learn meeting goes from 20 people to 200 people, it really becomes much more focused on learn. One of the questions I have related to this is that you've got a lot of experience in the analytics space, more than big data, and how the overall analytics space has evolved over the years. We have some research, pretty strong to suggest that it's time to start thinking about big data not as a thing unto itself, but as part of an aggregate approach to how enterprises should think about analytics. What do you think? How do you think an enterprise should start to refashion its understanding of the role that big data plays in a broader understanding of analytics? >> Back in the earlier days, my career comes from the EDW world, and then all the large enterprises had EDWs and they tried to build a centralized repository of data assets-- >> Highly modeled.
>> Highly modeled, a lot of work to set up, structured, highly modeled, extremely complex to modify and service new application requests from business users, and then came the Hadoop data lake based, big data approach there. It said dump the data in, and this is where we were a part, where we became very successful in providing a tool framework that allows customers to build virtual views into these data assets in a very rapid fashion, driven by the business user community. But to some extent, these data lakes have also had issues in servicing the bread and butter BI user community throughout the organization, and the EDW never really went away, right, so now we have EDWs, we have data lakes that service different analytic application requirements throughout the organization. >> And new reporting systems. >> And even reporting systems. And now the third wave is coming by moving workloads into the cloud, and if you look into the cloud, the wealth of available solutions to a customer becomes even more complex, as cloud vendors themselves build out tons of different solutions to service different analytical needs. The marketplaces offer hundreds of solutions of third party vendors, and the customers try to figure out how all these things can be stitched together and provide the right services for the right business user communities throughout the organization. So what we see moving forward will be a hybrid approach that will retain some of the on premise EDW and data lake services, and those will be combined with multi-cloud services. So there will not always be a single cloud service, and we're already seeing this today.
One of our customers is Sprint Pinsight, the advertising business of the Sprint telecommunications company. So they have a massive Hadoop on premise data lake, and then they do all the preprocessing of the ATS data from their network, with Datameer on premise, and we condense down the data assets from a daily volume of 70 terabytes to eight, and this gets exposed to a secret cloud-based data warehouse service for BI consumption throughout the organization. So you see these hybrid, very agile services emerging throughout our customer base, and I believe this will be the future-- >> Yeah, one of the things we like about the concept, or the approach of virtual view, is precisely that. It focuses in on the value that the data's creating, and not the underlying implementation, so that you have greater flexibility about whether you treat it as a big data approach, or EDW approach, or whether you put it here, or whether you put it there. But by focusing on the outcome that gets delivered, it allows a lot of flexibility in the implementation you employ. >> Absolutely, I agree. >> Phenomenal, Christian Rodatus, CEO of Datameer, thanks again for being on theCUBE! >> Thanks so much. I appreciate it, thanks, Peter. >> We'll be back.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Sprint | ORGANIZATION | 0.99+ |
Peter Burris | PERSON | 0.99+ |
20 people | QUANTITY | 0.99+ |
Datameer | ORGANIZATION | 0.99+ |
Peter | PERSON | 0.99+ |
Christian Rodatus | PERSON | 0.99+ |
July 2018 | DATE | 0.99+ |
70 terabytes | QUANTITY | 0.99+ |
200 people | QUANTITY | 0.99+ |
nine years | QUANTITY | 0.99+ |
eight | QUANTITY | 0.99+ |
Christian | PERSON | 0.99+ |
Palo Alto, California | LOCATION | 0.99+ |
One | QUANTITY | 0.99+ |
EDW | ORGANIZATION | 0.99+ |
two key features | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
peter | PERSON | 0.98+ |
Sprint Pinsight | ORGANIZATION | 0.98+ |
this week | DATE | 0.97+ |
OEN | ORGANIZATION | 0.97+ |
one | QUANTITY | 0.96+ |
single cloud | QUANTITY | 0.93+ |
EDWs | ORGANIZATION | 0.91+ |
hundreds of solutions | QUANTITY | 0.88+ |
CEO | PERSON | 0.86+ |
12, 15 months | QUANTITY | 0.82+ |
theCube | ORGANIZATION | 0.82+ |
third wave | EVENT | 0.8+ |
single step | QUANTITY | 0.74+ |
past couple years | DATE | 0.7+ |
Hadoop | LOCATION | 0.66+ |
Investiere | ORGANIZATION | 0.65+ |
past year | DATE | 0.55+ |
Christian Rodatus, Datameer & Pooja Palan, Datameer | AWS re:Invent
>> Announcer: Live from Las Vegas, it's theCUBE. Covering AWS re:Invent 2017. Presented by AWS, Intel, and our ecosystem of partners. >> Well we are back live here at the Sands Expo Center. We're of course in Las Vegas live at re:Invent. AWS putting on quite a show here. Day one of three days of coverage you'll be seeing right here on theCUBE. I'm John Walls along with Justin Warren. And we're now joined by a couple folks from Datameer. Christian Rodatus who's the CEO of that company, and Pooja Palan who's the Senior Product Manager. And Christian and Pooja thanks for being with us. Good to have you here on theCUBE. >> Thanks for having us. >> So you were cube-ing just recently up in New York, Christian. >> Yeah absolutely we were seeing your guys in New York and we had actually, we've done some work with a couple of customers probably two weeks ago in Palo Alto I believe. >> I don't know how we can afford you. I mean I'm gonna have to look into our budget. >> Christian: Happy to be here again. >> Okay no it is great, thanks for taking the time here. I know this is a busy week for you all. First off let's talk about Datameer in general just to let the audience at home know in case they're not familiar with what you're doing from a core competency standpoint. And let's talk about what you're doing here. >> Absolutely, I mean Datameer was founded eight years ago and Datameer was at the onset of the big data wave that started in the 2009 and 2010 time frame. And Datameer was actually the first commercial platform that provided a tool set to enable our customers to consume enterprise scale Hadoop solutions for their enterprise analytics. So we do everything from ingesting the data into the data lake or we're preparing the data for consumption by analytics tools throughout the enterprise. And we just recently also launched our own visualization capabilities for sophisticated analysis against very large data sets.
We also are capable of integrating machine learning solutions and preparing data for machine learning throughout the organization. And probably the biggest push is into the cloud. And we've been in the cloud for a couple of years now, but we see increased momentum from our customers in the market place for about 15 months now I would say. >> So before we dive a little deeper here I'm just kind of curious about your work in general. It's kind of chicken and the egg right? You're trying to come up with new products to meet customer demand. So are you producing to give them what you think they need or are you producing on what they're telling you that they need? How does that work as far as trying to keep up with-- >> You know I can kick this off. So it's actually interesting that you ask this because the customers that did interviews with you guys two weeks ago were part of our customer advisory council. So we get direct feedback from leading customers that do really sophisticated things with Datameer. They are at the forefront of developing really mind blowing analytical applications for high value use cases throughout their organizations. And they help us understand where these trends go. And to give you an example. So I was recently in a meeting with a Chief Data Officer of a large global bank in London. And they have kicked off 32 Hadoop projects throughout the organization. And what he told me is just these projects will lead to an expansion of the physical footprint of the data centers in the UK by 30%. So in (mumbles) we are not in the data center business, we don't want this, we need other people to take care of this. And they've launched a massive initiative with Amazon to bring a big chunk of their enterprise analytics into AWS. >> It sounds like you're actually really ahead of the curve in many ways 'cause of the explosion in machine learning and AI, that data analytics side of things.
Yeah we had big data for a little while, but it's really hitting now where people are starting to really show some of the amazing things that you can do with data and analysis. So what are you seeing from these customers? What are some of the things that they're saying, actually this thing here, this is what we really love about Datameer, and this is something that we can do here that we wouldn't be able to do in any other way. >> Shall I take that? So when it comes to heart of the matter, there's like you know three things that Datameer hits on really well. So in terms of our user personas, we look at all of our users, our analysts, and data engineers. So what we provide them with that ease of use, being able to take data from anywhere, and be able to use any multiple analytic capabilities within one tool without having to jump around in all different UI's. So it's like ease of use single interface. The second one that they really like about us is being able to not have to, whatever being able to not have to switch between interfaces to be able to get something done. So if they want to ingest data from different sources, it's one place to go to. If they want to access their data, all of it is in the single file browser. They want to munch their data, prepare data, analyze data, it's all within the same interface. And they don't have to use 10 different tools to be able to do that. It's a very seamless workflow. And the same token, the third thing which comes up is that collaboration. It enables collaboration across different user groups within the same organization. Which means that we are totally enabling the data democratization which all of the self service tools are trying to promote here. Making the IT's job easier. And that's what Datameer enables. So it's kind of like a win-win situation between our users and the IT. 
And the third thing that I want to talk about, which is the IT, making their lives easier, but at the same time not letting them go off, leaving the leash alone. Enabling governance, and that's a key challenge, which is where Datameer comes in the picture to be able to provide enterprise ready governance to be able to deploy it across the board in the organization. >> Yeah, that's something that AWS has certainly led in, is that democratization of access to things so that you can as individual developers, or individual users go and make use of some of these cloud resources. And seeing here at the show, and we've been talking about that today, about this is becoming a much more enterprise type issue. So being able to do that, have that self service, but also have some of those enterprise level controls. We're starting to see a lot of focus on that from enterprises who want to use cloud, but they really want to make sure that they do it properly, and they do it securely. So what are some of the things that Datameer is doing that helps customers keep that kind of enterprise level control, but without getting in the way of people being able to just use the cloud services to do what they want to do? So could you give us some examples of that maybe? >> I'll let Pooja comment on the specifics on how we deploy in AWS and other cloud solutions for that matter. But what you see with on premise data lakes, customers are struggling with it. So the stack has become outrageously complicated. So they try to stitch all these various solutions together. The open source community I believe now supports 27 different technology platforms. And then there's dozens upon dozens of commercial tools that play into that. And what they want, they actually just want this thing to work. They want to deploy what they're used to from enterprise IT. Scalability, security, seamlessness across the platforms, appropriate service level agreements with the end user communities and so on and so forth.
So they really struggle to make this happen on premise. The cloud addresses a lot of these issues and takes a lot of the burden away, and it becomes way more flexible, scalable, and adjustable to whatever they need. And when it comes to the specific deployments and how we do this, and we give them enterprise grade solutions that make sense for them, Pooja maybe you can comment on that. >> Sure absolutely, and more specific to cloud I would love to talk about this. So in the recent times one of our very first financial services customers went on cloud, and that pretty much brings us over here being even more excited about it. And trust me, even before elasticity, their number one requirement is security. And as part of security, it's not just like, one two three Amazon takes care of it, it's sorted, we have security as part of Datameer, it's been deployed before it's sorted. It's not enough. So when it comes to security it's security at multiple levels, it's security about data in motion, it's security about data at rest. So encryption across the board. And then specifically right now while we're at the Amazon conference, we're talking about enabling key management services, being able to have server-side encryption that Amazon enables. Being able to support that, and then besides that, there's a lot of other custom requirements specifically around how do you, because it's more of hybrid architecture. They do have applications on-prem, they do have like a deployed cloud infrastructure to do compute in the cloud as it may be needed for any kind of burst workloads. So as part of that, when data moves between, within their LAN to the cloud, within that VPC, that itself, that connectivity has to be secured and they want to make sure that all of those user passwords, all of that authentication is also kind of secure.
So we've enabled a bunch of capabilities around that, specifically for customers who are like super keen on having security, taking care of rule number one, even before they go. >> So financial services, I mean you mentioned that and both of you are talking about it. That's a pretty big target market for you right? I mean you've really made it a point of emphasis. Are there concerns, or I get it (mumbles) so we understand how treasured that data can be. But do you provide anything different for them? I mean is the data point is a point as opposed to another business. You just protect the same way? Or do you have unique processes and procedures and treatments in place that give them maybe whatever that additional of oomph of comfort is that they need? >> So that's a good question. So in principle we service a couple of industries that are very demanding. So it's financial services, it's telecommunication and media, it's government agencies, insurance companies. And when you look at the complexities of the stack that I've described. It's very challenging to make security, scalability in these things really happen. You can not inherit security protocols throughout the stack. So you stack a data prep piece together with a BI accelerator with an ingest tool. These things don't make sense. So the big advantage of Datameer is it's an end to end tool. We do everything from ingest, data preparation to enterprise scale analytics, and provide this out of the box in a seamless fashion to our customers. >> It is fascinating how the whole ecosystem has sort of changed in what feels like only a couple of years and how much customers are taking some of these things and putting them together to create some amazing new products and new ways of doing things. So can you give us a bit of an idea of, you were saying earlier that cloud was sort of, it was about two years ago, three years ago. What was it that finally tipped you over and said you know what we gotta do this. 
We're hearing a lot of talk about people wanting hybrid solutions, wanting to be able to do bursting. What was it really that drove you from the customer perspective to say you know what we have to do this, and we have to go into AWS? >> Did you just catch the entire question? Just repeat the last one. What drove it to the cloud? >> Justin: Yeah, what drove you to the cloud? >> John: What puts you over the top? >> I mean, so this is a very interesting question because Datameer was always innovating ahead of the curve. And this is probably a big piece to the story. And if you look back. I think the first cloud solutions with Microsoft Azure. So first I think we did our own cloud solution, and we moved to Microsoft Azure and this was already maybe two and a half years ago, or even longer. So we were ahead of the curve. Then I would say it was even too early. You saw some adoption, so we have a couple of great customers like JCPenney is already operating in the cloud for us, big retail company, they're actually in AWS. National Instruments works in Microsoft Azure. So there's some good adoption, but now you see this accelerating. And it's related to the complexity of the stack, to the multiple points of failure of on premise solutions to the fact that people want, really they want elasticity. They want flexibility in rolling this out. The primary, interestingly enough, the primary motivator is actually not cost. It's really a breathable solution that allows them to spin up clusters, to manage certain workloads that come for a compliance report every quarter. They need another 50 nodes, spin them up, run them for a week or two and spin them down again. So it's really the customers are buying elasticity, they're buying elasticity from a technology perspective. They're buying elasticity from a commercial perspective. But they want enterprise grade. >> Yeah we certainly hear customers like that flexibility.
>> And I think we are now at a tipping point where customers see that they can actually do this in a highly secure and governed way. So especially our demanding customers. And that it really makes sense from a commercial and elasticity perspective. >> So you were saying that's what they're buying, but they're buying what you're selling. So congratulations on that. Obviously it's working. So good luck, continued success down the road, and thanks for the time here today, we appreciate it. >> Absolutely, thanks for having us. >> John: Always good to have you on theCUBE. >> It's cocktail time, thanks for having us. >> It is five o' clock somewhere, here right now. Back with more live coverage from re:Invent. We'll be back here from Las Vegas live in just a bit. (electronic music)
Christian Rodatus, Datameer | BigData NYC 2017
>> Announcer: Live from Midtown Manhattan, it's theCUBE covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Coverage from theCUBE in New York City for Big Data NYC; the hashtag is BigDataNYC. This is our fifth year doing our own event in conjunction with Strata Hadoop, now called Strata Data, used to be Hadoop World, our eighth year covering the industry. We've been there from the beginning in 2010, the beginning of this revolution. I'm John Furrier, the co-host, with Jim Kobielus, our lead analyst at Wikibon. Our next guest is Christian Rodatus, who is the CEO of Datameer. Datameer, obviously, one of the startups now evolving in, I think, its eighth year or so, roughly seven or eight years old. Great customer base, been successful blocking and tackling, just doing good business. Your shirt says show me the data. Welcome to theCUBE, Christian, appreciate it. >> So well established, I barely think of you as a startup anymore. >> It's kind of true, and actually a couple of months ago, after I took on the job, I met Mike Olson, and Datameer and Cloudera were sort of founded the same year, I believe late 2009, early 2010. Then, he told me there were two open source projects with MapReduce and Hadoop, basically, and Datameer was founded to actually enable customers to do something with it, as an entry platform to help get data in, curate the data, and do something with it. And if you walk the show floor, it's a completely different landscape now. >> We've had you guys on before, the founder, Stefan, has been on. Interesting migration, we've seen you guys grow from a customer base standpoint. You've come on as the CEO to kind of take it to the next level. Give us an update on what's going on at Datameer. Obviously, the shirt says "Show me the data." Show me the money kind of play there, I get that. That's where the money is, the data is where the action is. 
Real solutions, not pie in the sky, we're now in our eighth year of this market, so there's not a lot of tolerance for hype even though there's a lot of AI washing going on. What's going on with you guys? >> I would say, interestingly enough, I met with a customer, a prospective customer, this morning, and this was a very typical organization. So, this is a customer that was an insurance company, and they're just about to spin up their first Hadoop cluster to actually work on customer management applications. And they are overwhelmed with what the market offers now. There are 27 open source projects, there are dozens and dozens of other different tools that try best-of-breed approaches at certain layers of the stack for specific applications, and they don't really know how to stitch this all together. And if I reflect on a customer meeting at a Canadian bank recently that has very successfully deployed applications on the data lake, like fraud management and compliance applications and things like this, they still struggle to basically replicate the same performance and the service level agreements that they were used to from their old EDW that they still have in production. And so, everybody's now going out there and trying to figure out how to get value out of the data lake for the business users, right? There are a lot of approaches that these companies are trying. There's SQL-on-Hadoop that supposedly doesn't perform properly. 
There are other solutions like OLAP on Hadoop that try to emulate what they've been used to from the EDWs, and we believe these are the wrong approaches, so we want to stay true to the stack and be native to the stack and offer a platform that really operates end-to-end, from ingesting the data into the data lake to curation, preparation of the data, and ultimately, building the data pipelines for the business users, and this is certainly something-- >> Here's more of a play for the business users now, not the data scientists and statistical modelers. I thought the data scientists were your core market. Is that not true? >> So, our primary user base at Datameer, until last week, was the data engineers in the companies, or basically the people that built the data lake, that curated the data and built these data pipelines for the business user community no matter what tool they were using. >> Jim, I want to get your thoughts on this for Christian's interest. Last year, so these guys can fix your microphone. I think you guys fix the microphone for us, his earpiece there, but I want to get a question to Chris, and I ask to redirect through you. Gartner, another analyst firm. >> Jim: I've heard of 'em. >> Not a big fan personally, but you know. >> Jim: They're still in business? >> The magic quadrant, they use that tool. Anyway, they had a good intro stat. Last year, they predicted that through 2017, 60% of big data projects will fail. So, the question for both you guys is did that actually happen? I don't think it did, I'm not hearing that 60% have failed, but we are seeing the struggle around analytics and scaling analytics in a way that's like a dev ops mentality. So, thoughts on this 60% of data projects failing. >> I don't know whether it's 60%, there was another statistic that said only 14% of Hadoop deployments are in production or something, >> They said 60, six zero. >> Or whatever. 
>> Define failure, I mean, you've built a data lake, and maybe you're not using it immediately for any particular application. Does that mean you've failed, or does it simply mean you haven't found the killer application yet for it? I don't know, your thoughts. >> I agree with you, it's probably not a failure to that extent. It's more like, so they dump the data into it, right, they build the infrastructure, now it's about the next step, data lake 2.0, to figure out how do I get value out of the data, how do I go after the right applications, how do I build a platform and tools that basically promote the use of that data throughout the business community in a meaningful way. >> Okay, so what's going on with you guys from a product standpoint? You guys have some announcements. Let's get to some of the latest and greatest. >> Absolutely. I think we were very strong in data curation, data preparation and the entire data governance around it, and as a user interface, we are using this spreadsheet-like interface called a workbook. It really looks like Excel, but it's not. It operates at a completely different scale. It's basically an Excel spreadsheet on steroids. Our customers build data pipelines, so these are the data engineers that we discussed before, but we also have a relatively small power user community in our client base that uses that spreadsheet for deep data exploration. Now, we are lifting this to the next level, and we put a visualization layer on top of it that runs natively in the stack, and what you get is basically a visual experience not only in the data curation process but also in deep data exploration, and this is combined with two platform technologies that we use. It's based on highly scalable distributed search in the backend engine of our product, number one. We have also adopted a columnar data store, Parquet, for our file system now. 
In this combination, the data exploration capabilities we bring to the market will allow power analysts to really dig deep into the data, so there are literally no limits in terms of the breadth and the depth of the data. It could be billions of rows, it could be thousands of different attributes and columns that you are looking at, and you will get sub-second response times as we create indices on demand as we run this through the analytic process. >> With these fast queries and visualization, do you also have the ability to do semantic data virtualization roll-ups across multi-cloud or multi-cluster? >> Yeah, absolutely. Also, there's a second trend that we discussed right before we started the live transmission here. Things are also moving into the cloud, so what we are seeing right now is the EDW's not going away, the on-prem data lakes prevail, right, and now they are thinking about moving certain workload types into the cloud, and we understand ourselves as a platform play that builds a data fabric that really ties all these data assets together, and it enables business. >> On the trends, we weren't on camera, we'll bring it up here, the impact of cloud to the data world. You've seen this movie before, you have extensive experience in this space going back to the origination, you'd say Teradata. When it was the classic, old-school data warehouse. And then, great purpose, great growth, massive value creation. Enter the Hadoop kind of disruption. Hadoop evolved from batch to do ranking stuff, and then tried to, it was a hammer that turned into a lawnmower, right? Then they started going down the path, and really, it wasn't workable for what people were looking at, but everyone was still trying to be the Teradata of whatever. Fast forward, so things have evolved and things are starting to shake out, same picture of data warehouse-like stuff, now you got cloud. It seems to be changing the nature of what it will become in the future. 
What's your perspective on that evolution? What's different about now and what's the same about now, from the old days? What are the similarities to the old school, and what's different that people are missing? >> I think it's a lot related to cloud, just in general. It is extremely important to have fast adoption throughout the organization, to get performance and service-level agreements with our customers. This is where we clearly can help, and we give them a user experience that is meaningful and that resembles what they were used to from the old EDW world, right? That's number one. Number two, and this comes back to the question of the 60% that fail, or why it is failing or working. I think there are a lot of really interesting projects out there, and our customers are betting big time on the data lake projects, whether on premise or in the cloud. And we work with HSBC, for instance, in the United Kingdom. They've got 32 data lake projects throughout the organization, and I spoke to one of these-- >> Not 32 data lakes, 32 projects that involve tapping into the data lake. >> 32 projects that involve various data lakes. >> Okay. (chuckling) >> And I spoke to one of the chief data officers there, and they said their data center infrastructure, just by having kick-started these projects, will explode. And they're not in the business of operating all the hardware and things like this, and so, a major bank like them, they made an announcement recently, a public announcement, you can read about it, that they started moving the data assets into the cloud. This is clearly happening at a rapid pace, and it will change the paradigm in terms of breathability and being able to satisfy peak workload requirements as they come up, when you run a compliance report at quarter end or something like this, so this will certainly help with adoption and creating business value for our customers. >> We talk about real-time all the time, and there are so many examples of how data science has changed the game. 
I mean, I was talking about, from a cyber perspective, how data science helped capture Bin Laden, to how I can get increased sales, to better user experience on devices. Having real-time access to data, and you put in some quick data science around things, really helps things at the edge. What's your view on real-time? Obviously, that's super important, you got to kind of get your house in order in terms of base data hygiene and foundational work, building blocks. At the end of the day, real-time seems to be super hot right now. >> Real-time is a relative term, right, so there are certainly applications like IoT applications, or machine data that you analyze, that require real-time access. I would call it right-time, so what's the increment of data load that is required for certain applications? We are certainly not a real-time application yet. We can possibly load data through Kafka and stream data through Kafka, but in general, we are still a batch-oriented platform. We can do-- >> Which, by the way, is not going away any time soon. It's like super important. >> No, it's not going away at all, right. It can do mini batches at relatively frequent increments, which is usually enough for what our customers demand from our platform today, but we're certainly looking at more streaming types of capability as we move this forward. >> What do the customer architectures look like? Because you brought up the good point, we talk about this all the time, batch versus real-time. They're not mutually exclusive, obviously, good architectures would argue that you decouple them, obviously you'll have good software elements all through the life cycle of data. >> Through the stack. >> And have the stack, and the stack's only going to get more robust. Your customers, what's the main value that you guys provide them, the problem that you're solving today and the benefits to them? >> Absolutely, so our true value is that there are no breakages in the stack. 
We are end-to-end, and we can basically satisfy all requirements, from ingesting the data, to blending and integrating the data, preparing the data, building the data pipelines, and analyzing the data. And all this we do in a highly secure and governed environment. So if you stitch it together, as a customer, the customer this morning asked me, "Whom do you compete with?" I keep getting this question all the time, and we really compete with two things. We compete with build-your-own, which customers still opt to do nowadays, while our things are really point and click and highly automated, and we compete with a combination of different products. You need to have at least three to four different products to be able to do what we do, but then you get security breaks, you get a lack of data lineage and data governance through the process, and this is the biggest value that we can bring to the table. And secondly, now with visual exploration, we offer a capability that literally nobody has in the marketplace, where we give power users the capability to explore, with blazing fast response times, billions of rows of data in a very free-form type of exploration process. >> Are there more power users now than there were when you started as a company? It seems like tools like Datameer have brought people into the sort of power user camp, just simply by virtue of having access to your tool. What are your thoughts there? >> Absolutely, it's definitely growing, and you also see different companies exploiting their capability in different ways. You might find insurance or financial services customers that are building a very sophisticated capability in that area, and you might see 1,000 to 2,000 users that do deep data exploration, and other companies are starting out with a couple of dozen and then evolving it as they go. >> Christian, I got to ask you as the new CEO of Datameer, obviously going to the next level, you guys have been successful. 
We were commenting yesterday on theCUBE about, we've been covering this for eight years in depth in terms of CUBE coverage, we've seen the waves of hype come and go, but now there's not a lot of tolerance for hype. You guys are one of the companies, I will say, that stick to your knitting, you didn't overplay your hand. You've certainly ridden the hype like everyone else did, but your solution is very specific on value, and so, you didn't overplay your hand, the company didn't really overplay their hand, in my opinion. But now, really the hand is value. >> Absolutely. >> As the new CEO, you got to kind of put a little shiny new toy on there, and you know, keep the car lookin' shiny and everything looking good with cutting edge stuff, at the same time scaling up what's been working. The question is what are you doubling down on, and what are you investing in to keep that innovation going? >> There are really three things, and you're very much right, so this has become a mature company. We've grown with our customer base, our enterprise features and capabilities are second to none in the marketplace, this is what our customers achieve, and now, the three investment areas that we are putting together and where we are doubling down. Number one is really visual exploration, as I outlined before. Number two, hybrid cloud architectures. We don't believe the customers move their entire stack right into the cloud. There are a few that are going to do this and that are looking into these things, but we believe in the idea that they will still have their EDW, their on-premise data lake, and some workload capabilities in the cloud which will be growing, so this is investment area number two. Number three is the entire concept of data curation for machine learning. This is something where we released a plug-in earlier in the year for TensorFlow where we can basically build data pipelines for machine learning applications. This is still very small. 
We see some interest from customers, but it's growing interest. >> It's a directionally correct kind of vector, you're looking and say, it's a good sign, let's kick the tires on that and play around. >> Absolutely. >> 'Cause machine learning's got to learn, too. You got to learn from somewhere. >> And quite frankly, deep learning, machine learning tools for the rest of us, there aren't really all that many for the rest of us power users, they're going to have to come along and get really super visual in terms of enabling visual modular development and tuning of these models. What are your thoughts there in terms of going forward about a visualization layer to make machine learning and deep learning developers more productive? >> That is an area where we will not engage in a way. We will stick with our platform play where we focus on building the data pipelines into those tools. >> Jim: Gotcha. >> In the last area where we invest is ecosystem integration, so we think with our visual explorer backend that is built on search and on a Parquet file format is, or columnar store, is really a key differentiator in feeding or building data pipelines into the incumbent BRE ecosystems and accelerating those as well. We've currently prototypes running where we can basically give the same performance and depth of analytic capability to some of the existing BI tools that are out there. >> What are some the ecosystem partners do you guys have? I know partnering is a big part of what you guys have done. Can you name a few? >> I mean, the biggest one-- >> Everybody, Switzerland. >> No, not really. We are focused on staying true to our stack and how we can provide value to our customers, so we work actively and very important on our cloud strategy with Microsoft and Amazon AWS in evolving our cloud strategy. We've started working with various BI vendors throughout that you know about, right, and we definitely have a play also with some of the big SIs and IBM is a more popular one. 
>> So, BI guys mostly on the tool visualization side. You said you were a pipeline. >> On the tool and visualization side, right. We have very effective integration for our data pipelines into the BI tools. Today we support TDE for Tableau; we have a native integration. >> Why compete there, just be a service provider. >> Absolutely, and we have more and better technology coming up to even accelerate those tools as well in our big data stuff. >> You're focused, you're scaling, final word I'll give to you for the segment. Share with the folks that are a Datameer customer or have not yet become a customer, what's the outlook, what does the new Datameer look like under your leadership? What should they expect? >> Yeah, absolutely, so I think they can expect utmost predictability in the way we roll out the vision and how we build our product in the next couple of releases. The next five, six months are critical for us. We have launched Visual Explorer here at the conference. We're going to launch our native cloud solution probably in the middle of November to the customer base. So, these are the big milestones that will help us for our next fiscal year and provide really great value to our customers, and that's what they can expect: predictability, a very solid product, all the enterprise-grade features they need and require for what they do. And if you look at it, we are really an enterprise play, and the customer base that we have is very demanding and challenging, and we want to keep up and deliver a capability that is relevant for them and helps them create value from the data lakes. >> Christian Rodatus, technology enthusiast, passionate, now CEO of Datameer. Great to have you on theCUBE, thanks for sharing. >> Thanks so much. >> And we'll be following your progress. Datameer here inside theCUBE live coverage, hashtag BigDataNYC, our fifth year doing our own event here in conjunction with Strata Data, formerly Strata Hadoop, Hadoop World, eight years covering this space. 
I'm John Furrier with Jim Kobielus here inside theCUBE. More after this short break. >> Christian: Thank you. (upbeat electronic music)