
Thomas LaRock, SolarWinds | Microsoft Ignite 2018


 

(music) >> Live from Orlando, Florida, it's theCUBE. Covering Microsoft Ignite. Brought to you by Cohesity and theCUBE's ecosystem partners. >> Welcome back, everyone, to theCUBE's live coverage of Microsoft Ignite. Happy hour has started. The crowd is roaring. I'm your host Rebecca Knight, along with my cohost, Stu Miniman. We are joined by Thomas LaRock. >> He is the Head Geek at SolarWinds. Thanks so much for coming on the show. >> Thanks for having me. >> Great title: Head Geek. >> Yes. >> So, tell our viewers a little bit about what - tell us about SolarWinds and also about what you do. >> SolarWinds is a company that offers about forty different products to help with your enterprise infrastructure monitoring. Really unify management of your systems. Been in the business for about twenty years and I've been with them for about eight now. Head Geek is really, uh, you can equate it to being a technical evangelist. >> Okay. So you're out there trying to win the hearts and minds, trying to tell everyone what you do. >> Yes, I need you all to love me. (laughing) And love my products. >> So, Thomas, and for those who don't already follow you on Twitter, you're a SQL rockstar. >> Yes, yes. >> [Stu] I need to say, "thank you," because you helped connect me with a lot of the community here, especially on the data side of the house. You and I have known each other for a bunch of years. You're a Microsoft MVP. So maybe give us a little bit of the community aspect: what it means to be a Microsoft MVP for those who don't know. You're an evangelist in this space and you've been on this show many times. >> I usually don't talk about myself a lot, but sure. (Rebecca laughing) Let's go for it. I've been a Microsoft data platform MVP for about 10 years now. And it was interesting when you reached out, looking to get connected. I was kind of stunned by how many people I actually knew or knew how to get in touch with for you.
I helped you line up, I guess, a handful of people to be on the show because you were telling me you hadn't been here at Microsoft Ignite and I just thought, "well, I know people," and they should know Stu, and we should get them connected so that you guys can have some good conversations. But, yeah, it's been a wild ride for me those ten years where Microsoft awards people the MVP designation. It's kind of being an evangelist for Microsoft and some of the great stuff that they've been doing over the past ten years. >> It's a phenomenal program. Most people in the technology industry know the Microsoft MVP program. I was a VMware vExpert for a number of years. Many of the things were patterned off of that. John Troyer is a friend of mine. He said that was one of the things he looked at. Citrix has programs like this. Many of the vendors here have evangelists or advocates showing that technology out here. Alright. So talk a little bit about community. Talk about the database space. Data and databases have been going through such, you know, an explosion of what's going on out there, right? SQL's still around. It's not all Cosmos and, you know, microservices-based, cloud-native architecture. >> So the SQL Server box product is still around, but what I think is more amazing to me has been the evolution of... Let's take, for example, one of the announcements today, the big data cluster. So, it's essentially a container that's going to run SQL Server, Spark and Hadoop, all in one. Basically, a pod that will get deployed by Kubernetes. When you wrap all that together, what you start to realize is the pattern that Microsoft has been doing for the past few years, which is, essentially, going to where the people are. What I mean is: in the open-source world, you have people and developers that have embraced things like DevOps much faster than what the Windows developers have been doing.
So instead of taking your time trying to drag all these people where you want them to be, they've just started building all the cool stuff where all the cool kids already are, and everybody's just going to gravitate. Data has gravity, right? So, you're building these things, and people are going to follow it. Now, it's not that they're expecting to sell a billion dollars worth of licenses. No. They just need to be a part of the conversation. So if you're a company that's using those technologies, now all of a sudden, it's like, this is an option. Are you interested in it? Microsoft is the company that's best poised to bring enterprises to the cloud. Amazon has a huge share. We all know that, but Microsoft's already the platform of choice for these enterprises. Microsoft is going to be the one to help them get to the cloud. >> [Stu] Thomas, explain what you mean by that, because the strength I look at for Microsoft is: look, they've got your applications. Business productivity: that's where they are. Apologize for cutting you off there. Is that what you mean? The applications are changing and you trusted Microsoft and the application, and therefore, that's a vendor of choice. >> Absolutely. If it's already your vendor of choice then, I don't want to say, "lock-in," but if it's already your preference and if they can help get to the cloud, or in the hybrid situation, or just lift and shift and just get there, then that's the one you're going to want to do it with. Everything they're building and all the services they're providing... At the end of the day, they and Amazon, they're the new electric company. They want data. That's the electricity. They don't care how you get it, but between... even VMware. Between Amazon, VMware and Microsoft, they're going to be the ones to help... They're going to be your infrastructure companies. Microsoft-managed desktop now. We'll manage your laptop for you. >> With everything that they're doing, essentially, like, I don't even need my own IT department.
Microsoft's going to be the largest MSP in history, right? That's where they're headed. They're going to manage everything for you. The data part of it, of course for me, I just love talking about data. But the data part of it... Data is essential to everything we do. It's all about the data. They're doing their best to manage it and secure it. Security is a huge thing. There were some security announcements today as well, which were awesome. The advanced threat detection, the protection that they have. I'm always amazed when I walk through the offering they have for SQL injection protection. I try and ask people, "Who's right now monitoring for SQL injection?" And they're like, "We're not doing that." For fifteen dollars a month, you could do this for your servers. They're like, "That's amazing what they're offering." Why wouldn't you want that as a service? Why wouldn't you sign up tomorrow for this stuff? So, I get excited about it. I think all this stuff they're building is great. The announcements today were great. I think they have more coming out over the next couple of days. Or at least in the sessions, we'll start seeing a lot of hands-on stuff. I'm excited for it. >> So when you were talking about Microsoft being the automatic vendor of choice. Why wouldn't you? You treated it as a no-brainer. What does Microsoft need to do to make sure customers feel that way too? >> I think Microsoft is going to do that... how I would do it. A couple of ways. One, at the end of the day, Microsoft wants what we all want, what I want, which is happy customers. So they're going to do whatever it takes so their customers are happy. So one way you do that is you get a lot of valuable feedback from customers. So, one thing Microsoft has done in the past is they've increased the amount of telemetry they're collecting from their products. So they know the usage. They know what the customers want. They know what the customers need.
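The SQL injection monitoring Thomas describes above can be illustrated with a toy pattern check. To be clear, this is a sketch of the general idea only; Azure's managed threat detection is far more sophisticated than regex matching, and the patterns and function names below are invented for illustration.

```python
import re

# A few classic injection fingerprints. Illustrative only -- a real
# service such as Azure SQL Advanced Threat Protection does much more
# than simple pattern matching.
SUSPICIOUS_PATTERNS = [
    r"(?i)\bunion\b.*\bselect\b",   # UNION-based data probing
    r"(?i)\bor\b\s+1\s*=\s*1",      # tautology injected into a WHERE clause
    r"(?i);\s*drop\s+table",        # stacked statement dropping a table
]

def looks_like_injection(query: str) -> bool:
    """Return True if the query text matches any known injection fingerprint."""
    return any(re.search(pattern, query) for pattern in SUSPICIOUS_PATTERNS)
```

A monitoring hook would run every incoming query text through a check like this and raise an alert on a match, which is the kind of always-on watching Thomas is arguing most shops never get around to building themselves.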
But they also collect the simple voice of the customer. You're simply asking the customer, "What do you want?" And you're doing everything you can to keep them happy. And you're finding out where the struggles are. You're helping them solve those problems. How do you not earn trust as a result of all that, right? I think that's the avenue they've been taking for, at least, ten years. Well, let's say, eight years. That's the avenue and the approach they've been taking. I'd say it's been somewhat successful. >> Thomas, as our team was preparing for this show, we understood that Microsoft has a lot of strengths, but if I look at the AI space, Microsoft is not the clear leader today. Um, we think that some of the connections that Microsoft has, everything that you said, down to the desktop. Heck, even in the consumer space, they're down to the Xbox. There's a lot of reasons why Microsoft... You can say, "Here's a path of how Microsoft could become, you know, number one, number two in the AI space over time." But, we're listening to things like the Open Data Initiative that they announced today, which, obviously, Microsoft's working with a lot of partners out there, but it's a big ecosystem. Data plays everywhere. I mean, Google obviously has a strong play in data. We've talked plenty about Amazon. What does Microsoft need to do to take the strength that they have in data, move forward in AI, and become an even stronger player in the marketplace? >> So, AI, itself, is kind of that broad term. I mean, AI is a simple if-then statement. It doesn't really have to do anything, right? So let's talk about machine learning, predictive analytics, or even deep learning. That's really the area that we're talking about. What does Microsoft have to do? Well, they have to offer the services. But they don't have to offer, say, new things. They just have to offer things that already exist. For example, the idea of, um, incorporating Jupyter notebooks into Azure Data Studio.
So if that could be achieved, you know, now you're bringing the workspaces people are using into the Microsoft platform a little bit, making it a little bit easier. So instead of these people in these enterprises... They already trust Microsoft. They already have the tools. But I've got to go use these other things. Well, eventually, those other things come into the Microsoft tools, and now you don't have to use that other stuff either. I would talk about the ability to publish these models as a service. I've done the Academy program. I've earned a few certifications on some of this stuff. I was amazed at how easy it was: with a few clicks, you know, published as a service, as an API. It's sitting there. I sent in my data and I got back a result, a prediction. I was like, that was really easy. So I know they're not the leaders, but they're making it easy, especially for somebody like me who can start at zero and get to where I need to be. They made it incredibly easy and in some cases, it was intuitive. I'm like, oh, I know what to do next with this widget I'm building. I think it will take time for them to kind of get all that stuff in place. I don't know how long. But does Microsoft have to be the leader in AI? They have the Cognitive Toolkit. They have all that stuff with Cortana. They have the data. I think the customers are coming along. I think they get there just by attrition. I'm not sure there's something they're going to build where everybody just says, "There it is." Except there's the Quantum stuff. And last year's announcement of Quantum, I thought, was one of the most stunning things. It just hit me. I had no idea they were working on it. So, who knows? A year from now there could be something similar to that type of announcement, where we're like, now I get it, now I've got to go have this thing. I don't think we all need, you know, a hotdog-not-hotdog app, which seems to be the bulk of the examples out there.
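The publish-as-an-API flow Thomas describes, where a trained model sits behind a REST endpoint and you "send in data and get back a prediction", looks roughly like the sketch below. The endpoint URL, feature names, and request schema here are all hypothetical; a real scoring URL and payload shape come from your own workspace.

```python
import json

# Hypothetical scoring endpoint -- a published model is exposed behind a
# REST URL something like this. This URL is a placeholder, not a real one.
ENDPOINT = "https://example.azureml.net/score"

def build_score_request(features: dict) -> str:
    """Serialize one row of input features into a JSON request body."""
    return json.dumps({"data": [features]})

# Invented feature names for illustration.
body = build_score_request({"age": 42, "tenure_months": 18})

# A real call would then be something like:
#   requests.post(ENDPOINT, data=body, headers={"Authorization": "Bearer ..."})
# and the response would carry the prediction back as JSON.
```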
Some of the image classification stuff that you have out there is fabulous. There are a lot of use cases for it. Um, I'm not sure how they get there. But, I do think eventually, over time, with the platform that they offer, they do get there just through attrition. >> One of the things you brought up earlier in this conversation was the Open Data Initiative, and Stu, we had expressed a bit of skepticism that it's still going to take three to five years for, really, customers to see the value of this. But once... The announcement was made today, so now we're going to go forward with this Initiative. What do you see as the future? >> Yeah, I was trying to, even, figure it out. So it sounds like the three companies are sharing data with each other. They pledged to be open. So if you buy one of their products, that data can seamlessly go into that other product, is what it sounded like. And they were open, if I heard it right, they were open to partnering with other companies as well. >> Correct. >> Yes. Yes. >> Other vendors or customers, even, that could tie into these APIs, doing everything that they're doing. Open data models. >> Speaking as a data guy, that means if I trust one, I have to trust them all. (Stu laughing) >> Right? So I don't know. I have trust issues. (Rebecca laughing) >> Clearly. >> I'm a DBA, by heart, so I have trust issues. I need to know a little more about it, but on the surface, just the words, "open data," sound great. I just don't know the practical, uh, practicality of it. It sounds like it's a way for people, or these companies, to partner with each other to get more of your data into their platform and their infrastructure. >> Yeah. I think next time we have Thomas on, we're going to spend some time talking about the dark side of data. >> Yes, indeed. >> We can talk dark data. Oh, sure. (Rebecca laughing) >> Well, Thomas, it was so much fun having you on this show and I should just plug your book. You are the author of "DBA Survivor." >> I am.
Yes. It was a little book. So being a DBA, uh, I had some challenges in my role and I decided, as my friend Kevin Kline put it to me, he goes, "You should write the book you wish had been written for you and handed to you on day zero of being a DBA." And I said, "Oh." It took me, I think, like, three weeks. It was just so easy to write all of that. >> It just flowed. (laughing) >> It was just stuff I had to say. But, yeah, thank you. >> Excellent. I'm Rebecca Knight for Stu Miniman. We will have more from theCUBE's live coverage of Microsoft Ignite coming up in just a little bit. (music playing)

Published Date : Sep 24 2018


Sudhir Hasbe, Google Cloud | Google Cloud Next 2018


 

>> Live from San Francisco, it's theCUBE covering Google Cloud Next 2018, brought to you by Google Cloud and its ecosystem partners. (techy music) >> Hey, welcome back, everyone, this is theCUBE live in San Francisco, coverage of Google Cloud Next '18, I'm John Furrier with Jeff Frick. Day three of three days of coverage, kind of getting day three going here. Our next guest, Sudhir, as the director of product management, Google Cloud, has the luxury and great job of managing BigTable, BigQuery, I'm sorry, BigQuery, I guess BigTable, BigQuery. (laughs) Welcome back to the table, good to see you. >> Thank you. >> So, you guys had a great demo yesterday, I want to get your thoughts on that, I want to explore some of the machine learning things that you guys announced, but first I want to get your perspective of the show. What's going on with you guys at the show here, what are some of the big announcements, what's happening? >> A lot of different announcements across the board, so I'm responsible for data analytics on Google Cloud. One of our key products is Google BigQuery. A large scale, cloud scale data warehouse; a lot of customers are using it for bringing all their enterprise data into the data warehouse, analyzing it at scale. You can do petabyte-scale queries in seconds, so that's the kind of scale we provide. So, a lot of momentum on that. We announced a lot of things, a lot of enhancements within that. For example, one of the things we announced was a new experience, a new UI for BigQuery. Now you can literally do the query, as I was saying, of petabyte scale or something, any queries that you want, and with one click you can go into Data Studio, which is our BI tool that's available, or you can go into Sheets and then from there quickly go ahead and fire up a connector, connect to BigQuery, get the data in Sheets and do analysis. >> So, ease of use is a focus. >> Ease of use is a major focus for us.
As we are growing we want to make sure everybody in the organization can get access to their data, analyze it. That was one. One of the things which is pretty unique to BigQuery is the real time collection of information, so you can... There are customers that are actually collecting real time data from click-stream, for example, on their websites or other places, and moving it directly into BigQuery and analyzing it. Example, in-game analytics: if you're playing games, in-game you're actually going to collect those events and do real time analysis; you're going to literally put it into BigQuery at scale and do that. So, a lot of customers using BigQuery at different levels. We also announced Clustering, which allows you to reduce the cost, improve efficiency, and make queries almost 2x faster for us. So, a lot of announcements other than the machine learning. >> Well, the one thing I saw in the demo I thought was, I mean, it was machine learning, so that's a hot topic here, obviously. >> Yes. >> Is you don't have to move the data, and this is something that we've been covering, go back to Hadoop, back when we first started doing theCUBE, you know, data pipeline, all the complexities involved in moving the data, and at the scale and size of the data all this wrangling was going on just to get some machine learning in. >> Yep. >> So, talk about that new feature where you guys are doing it inside BigQuery. I think that's important, take a minute to explain that. >> Yeah, so when we were talking to our customers, one of the biggest challenges they were facing with machine learning in general, or a couple of them, were: one, every time you want to do machine learning you have to take data from your core data warehouse, like in BigQuery you have petabyte-scale data sets, terabyte data sets.
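As an aside on the Clustering announcement Sudhir just mentioned: clustering co-locates rows that share values in the cluster columns, so queries filtering on those columns scan fewer blocks, which is where the cost and speed win comes from. The DDL shape below follows BigQuery's documented `PARTITION BY` / `CLUSTER BY` syntax, but the dataset, table, and column names are invented for illustration.

```python
def clustered_table_ddl(table: str, columns: dict,
                        partition_col: str, cluster_cols: list) -> str:
    """Compose BigQuery DDL for a date-partitioned, clustered table."""
    col_defs = ", ".join(f"{name} {typ}" for name, typ in columns.items())
    return (
        f"CREATE TABLE {table} ({col_defs}) "
        f"PARTITION BY DATE({partition_col}) "
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )

# Hypothetical click-stream events table, clustered on customer_id so
# per-customer queries touch only the relevant blocks.
ddl = clustered_table_ddl(
    "analytics.events",
    {"event_ts": "TIMESTAMP", "customer_id": "STRING", "payload": "STRING"},
    partition_col="event_ts",
    cluster_cols=["customer_id"],
)
```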
Now, if you want to do machine learning on any portion of it, you take it out of BigQuery, move it into some machine learning engine, ML Engine, AutoML, anything, then you realize, "Oh, I missed some of the data that I needed." You go back, then again take the data, move it, and you go back and forth; too much time. There are analyses, I think, that different organizations have done: 80% of their time, data scientists say, they're spending on the moving of data-- >> Right. >> Wrangling data and all of that, so that is one big problem. The second big challenge we were hearing was the skillset gap; there are just not that many PhD data scientists in the industry, so how do we solve that problem? So, what we said is, first problem, how do we solve it? Why do people have to move data to the machine learning engines? Why can't I take the machine learning capability and move it inside where the data is, so bring the machine learning closer to the data rather than the data closer to machine learning. So, that's what BigQuery ML is: it's an ability to run regression-like models inside the data warehouse itself, in BigQuery, so that you can do that. The second thing we said was the interface can't be complex. Our audiences already know SQL, they're already analyzing data; these folks, the business analysts that are using BigQuery, are the experts on the data. So, what we said is use your standard SQL, write two lines of code, create model, the type of model you want to run, give us the data, and we will just run the machine learning model on the backend and you can do predictions pretty easily. So, that's what we are doing with that. >> That's awesome. >> So, Sudhir, I love to hear that you were driven by that, by your customers, because one of the things we talk about all the time is democratization. >> Yeah. >> If you want innovation you've got to democratize access to the data, and then you've got to democratize access to the tools to actually do stuff with the data-- >> Yes.
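The two-statement workflow Sudhir describes, a CREATE MODEL followed by a prediction query, can be written out as plain GoogleSQL. The statement shape follows BigQuery ML's documented syntax for a linear regression, but the dataset, table, and column names below are invented for illustration.

```python
# CREATE MODEL trains directly over the data warehouse tables; no data
# leaves BigQuery, which is the whole point Sudhir is making above.
create_model = """
CREATE MODEL `mydataset.spend_model`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['monthly_spend'])
AS
SELECT tenure_months, support_tickets, monthly_spend
FROM `mydataset.customers`
"""

# ML.PREDICT then scores new rows with ordinary SQL -- the "second line
# of code" in the two-statement interface.
predict = """
SELECT *
FROM ML.PREDICT(MODEL `mydataset.spend_model`,
                TABLE `mydataset.new_customers`)
"""
```

Because both statements are just SQL, the business analysts Sudhir mentions can run them from the same console as any other query, with model selection and tuning handled on the backend.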
>> That goes way beyond just the hardcore data scientist in the organization-- >> Yeah, exactly. >> And that's really what you're trying to enable the customers to be able to do. >> Absolutely, if you look at it, if you just go on LinkedIn and search for data analyst versus data scientist, there are 100x more analysts in the industry, and our thing was how do we empower these analysts that understand the data, that are familiar with SQL, to go ahead and do data science. Now, we realize they're not going to be expert machine learning folks who understand all the intricacies of how gradient descent works, all that; that's not their skillset, so our thing was reduce the complexity, make it very simple for them to use. The framework: just use SQL, and we take care of the internal hyperparameter tuning, the complexity of it, model selection. We try to do that internally within the technology, and they just get a simple interface for that. So, it's really empowering the SQL analyst within an organization to do machine learning with very little to no knowledge of machine learning. >> Right. >> Talk about the history of BigQuery, where did it come from? I mean, Google has this DNA of they do it internally for themselves-- >> Yes. >> Which is a tough customer-- >> Yes. >> For Cloud Spanner we had the product manager on, Deepti, she was, like, amazing, like okay, baked internally, did that have the same-- >> Yes. >> BigQuery, take a minute to talk about that, because you're now making it consumable for enterprise customers. >> Yeah. >> It's not just, "Here's BigQuery." >> No. >> Talk about the origination, how it started, why, and how you guys use it internally. >> So, BigQuery internally is called Dremel. There's a paper on Dremel available. I think in 2012 or something we published it. Dremel has been used internally for analytics across Google.
So, if you think about Spanner being used for transaction management in the company across all areas, BigQuery, or Dremel internally, is what we use for all large scale data analytics within Google. So, the whole company runs on it, analyzes data with it, so our thing was how do we take this capability that we are driving, and imagine, when you have seven products that have more than a billion active users each, the amount of data that gets generated, the insights we are giving in Maps and all the different places, a lot of those things are first analyzed in Dremel internally, and we're making it available. So, our thing was how do we take that capability that's there internally and make it available to all enterprises. >> Right. >> As Sundar was saying yesterday, our goal is to empower all our customers to go ahead and do more. >> Right. >> And so, this is a way of taking a piece of technology that's powered Google for a while and also making it available to enterprises. >> It's hardened and tested. >> Yeah, absolutely. >> It's not like it's vaporware. >> Yeah, it's not. (laughs) >> No, I mean, this is what I think is important about the show this year. If you look at it, you guys have done a really good job of taking the big guns of Google, the big stuff, and not trying to just say, "We're Google and you can be like Google." You've taken it and you've kind of made it consumable. >> Yes. >> This has been a big focus; explain the mindset behind the product management. >> Absolutely. Actually, one of the key things Google is good at doing is taking what's used internally, but also the research part of it. Actually, Corinna Cortes, who is head of our AI research side, does a lot of research in SQL-based machine learning, so again, the-- >> Yeah. >> BigQuery ML is nothing new; like, we internally have a research team that has been developing it for a few years.
We have been using it internally for running all these models and all, and so what we were able to do is bring product management from our side, like, hey, this is really a problem we are facing, moving data, skillset gap, and then we were like, the research team was already enabling it, and then we had an engineering team which is pretty strong. We were like, okay, let's bring all three together and go ahead and make sure we provide real value to our customers with all of this we're doing, so that's how it came to light. >> So, I just want to get your take. Early days, like when there was the early Google search appliance, I'll just pick that up, and that was ancient, ages ago, but one of the digs was, right, it didn't work as well in the enterprise, per se, because you just didn't have the same amount of data when you applied that type of technique to a Google flow of data and a Google flow of queries. So, how's that evolved over time, because you guys, like you said, seven applications with a billion-- >> Yep. >> Users. Most enterprises don't have that, so how do they get the same type of performance if they don't have the same kind of throughput to build the models and to get that data? How's that kind of evolved? >> So, this is why, I think, when we think about scale we think about scaling up and scaling down, right? We have customers who are using BigQuery with a few terabytes of data. Not every customer has petabyte scale, but what we're also noticing is these same customers, when they see value in data, they collect more. I will give you a real example: Zulily, one of our customers, I used to be there before, so when they started doing real time data collection for doing real time analytics they were collecting like 50 million events a day. Within 18 months they started collecting five billion a day, a 100x improvement, and the reason is they started seeing value.
They could take this real time data, analyze it, make some real time experiences possible on their website and all; with all of that they were able to go out and get real value for their customers, drive growth, so when customers see that kind of value they collect more data. So, what I would say is yes, a lot of customers start small, but they all have an aspiration to have lots of data, leverage that to create operational efficiency as well as growth, and so as they start doing that I think they will need infrastructure that can scale down and up all the way, and I think that's what we're focusing on, providing that. >> You guys look at the possibility, and I've seen some examples where customers are just, like, they're shell-shocked, and you're almost too good, right? I mean, it's like, "We've been doing Dremel on a large scale? I bought this data warehouse like 10 years ago," like, what are you talking about? (laughs) I mean, there's a reality of we've been buying IT, enterprises have been buying IT, and in comes Google, the gunslinger, saying, "Hey, man, you can do all this stuff." There's a little bit of a shell-shock factor for some IT people. Some engineering organizations get it right away. How are you guys dealing with this as you make it consumable? >> Yeah. >> There's probably a lot of education. As a product manager do you see, is that something that you think about, is that something you guys talk about? >> Yes, we do, so I actually see a difference in what customers need, enterprise customers versus cloud native companies. As you said, cloud native companies are starting new, starting fresh, so it's a very different set of requirements. Enterprise customers are thinking about scale, thinking about security and how do you do that. So, BigQuery is a highly secure data warehouse. The other thing BigQuery has is it's a completely serverless platform, so we take care of the security. We encrypt all the data at rest and when it's moving.
The key thing is when we share what is possible and how easy it is to manage and how fast people can start analyzing, you can bring the data. Like, you can actually get started with BigQuery in minutes; you just bring your data in and start analyzing it. You don't have to worry about how many machines do I need, how do I provision it, how many servers do I need. >> Yeah. >> So, enterprises, when they look at-- >> Cloud native ready. >> Yeah. >> All right, so take a minute to explain BigTable versus, I mean, BigTable versus BigQuery. >> Yes. >> What's the difference between the two, one's a data warehouse and the other one is a system for managing data? What's the difference between Big-- >> So, it's a NoSQL system, so I will... The simple example, I will give you a real example of how customers use it, right. BigQuery is great for large scale analytics: people who want to take, like, petabyte-scale data or terabyte-scale data and analyze historical patterns, all of that, and do complex analysis. You want to do machine learning model creation, you can do that. What BigTable is great at is once you have pre-aggregated data that you want to go ahead and serve really fast. If you have a website, I don't expect you to run a website and back it with BigQuery; it's not built for that. Whereas BigTable is exactly for that scenario, so for example, you have millions of people coming to the website, they want to see some key metrics that have been pre-created, ready to go; you go to BigTable and that can actually do high performance, high throughput. Last statement on that, like almost 10,000-- >> Yeah.
>> New kinds of data, different data types. >> Absolutely, yes. >> What else do you have in the bag of goodies in there that you're working on? >> The one big thing that we also announced this week was a GIS capability within BigQuery. GIS is geographical information, like everything today is location-based, latitude, longitude. Our customers were telling us it's really difficult to analyze it, right, like I want to know... An example would be we are here, I want to know how many food restaurants are in a two-mile radius of here, which ones are those, how many, should we create the next one here or not. Those kinds of analyses are really difficult, so we partnered with Earth Engine, the Earth Engine team within Google with Maps, and then what we're launching is the ability to do geospatial analysis within BigQuery. Additionally along with that we also have a visualization tool that we launched this week, so folks who haven't seen that should go check that out. One great example I will give you is Geotab, their CEO is here, Neil. He was showing a demo in one of the sessions and he was talking about how he was able to transform his business. I'll give you an example, Geotab is basically into vehicle tracking, so they have these sensors that track different things with vehicles, and they store everything in BigQuery, collect all of that and all, and his thing was with BigQuery ML and a GIS capability, what he's now able to do is create models that can predict what intersections in a city when it's snowing are going to be dangerous, and for smart cities he can now recommend to cities where and how to invest in these kind of scenarios. Completely transforming his business because his business is not smart cities, his business was vehicle tracking and all, he's like, but with these capabilities they're transforming what they were doing and solving--
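The "restaurants within a two-mile radius" question above reduces to a great-circle distance filter; BigQuery GIS exposes this through geography functions such as ST_DWITHIN. As a rough illustration, the same haversine math can be done in plain Python — the venue names and coordinates below are invented:

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def within_radius(origin, places, miles):
    """Filter (name, lat, lon) records to those within `miles` of origin."""
    return [name for name, lat, lon in places
            if haversine_miles(origin[0], origin[1], lat, lon) <= miles]

# Hypothetical venues; "here" is roughly downtown San Francisco.
here = (37.7842, -122.4016)
places = [
    ("close_cafe", 37.7850, -122.4000),  # a few blocks away
    ("far_diner", 37.8716, -122.2727),   # across the bay, ~9 miles
]
print(within_radius(here, places, 2.0))  # ['close_cafe']
```

In BigQuery itself the equivalent predicate would be a single `ST_DWITHIN(point_a, point_b, distance_meters)` clause over a geography column, which is what makes the analysis the speaker describes a one-line filter rather than custom math.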
I wonder if you could just dig at a little bit to, you know, the fact that you've got this, these seven billion activities or apps that you can leverage, you know, specific functionality or goals or objectives or priorities in those groups, and now apply those, pull that data, pull that knowledge, pull those use cases into a completely different application on the enterprise. I mean, is that an active process-- >> I don't think that's how people. >> Do people query? >> No, no. >> But how does that happen? >> No, we don't-- >> As a customer. >> As a customer completely different, right? Our focus in Google Cloud is primarily enabling enterprises to collect their data, process their data, innovate on their data. We don't bring in, like, the Google side of it at all, like that's their completely different area that way, so we basically, enterprises, all their data stays within their environment. They basically, we don't touch it, we don't get to access it at all, and they can know it. >> Yeah, yeah, no, I didn't mean that, I meant, you know, like say Maps for instance, it's interesting to see how Maps has evolved over all these years. Every time you open it, oh, and it's directions-- >> Yep. >> Oh, now it's better directions, oh, now it's got gas stations, oh, now it's where the... And it triggered because you said the restaurants that are close by, so it's kind of adding value to the core app on that side, and as you just said, now geolocation can be used on the enterprise side-- >> Yeah, yes. >> And lots of different things, so that-- >> Exactly. >> That's where I meant that kind of connection-- >> Exactly right, so-- >> In terms of the value of what can I do with geolocation. >> Absolutely, exactly, so like, that's exactly what we did. With Earth Engine we had a lot of learnings on geospatial analysis and our thing was how do you make it easy for our enterprise customers to do that. 
We've partnered with them closely and we said, "Okay, here are the core pieces of things we can add in BigQuery that will allow you to do better geospatial analysis, visualize it." One of the big challenges is lat longs, I don't think they're that friendly with analysts, like oh, numbers and all that. So, we actually built a UI visualization tool that allows you to just fire a query and see visually on a map where things are, what all the points look like and all. >> Awesome. >> So, just simplifying what analysts can do with all these. >> Sudhir, thanks for coming on, really appreciate it and congratulations on your success. Got a lot of great, big products there, hardened internally, now-- >> Yes. >> Making consumable, it's clear here at Google Cloud you guys have recognized that making it consumable-- >> Yep. >> Pre-existing, proven technologies, so I want to give you guys props for that, congratulations. >> Thank you, thanks a lot. >> Thanks for coming on the show. >> Thanks for coming on. >> Thank you. >> It's theCUBE coverage here, Google Cloud coverage, Google Next 2018. I'm John Furrier with Jeff Frick, stay with us, we've got all day with more coverage for day three. Stay with us after this short break. (techy music)

Published Date : Jul 26 2018


Rob Bearden, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks Summit here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Rob Bearden. He is the CEO of Hortonworks. So thanks so much for coming on theCUBE again, Rob. >> Thank you for having us. >> So you just got off of the keynote on the main stage. The big theme is really about modern data architecture. So we're going to have this modern data architecture. What is it all about? How do you think about it? What's your approach? And how do you walk customers through this process? >> Well, there's a lot of moving parts in enabling a modern data architecture. One of the first steps is what we're trying to do is unlock the siloed transactional applications, and to get that data into a central architecture so you can get real time insights around the inclusive dataset. But what we're really trying to accomplish then within that modern data architecture is to bring all types of data whether it be real time streaming data, whether it be sensor data, IoT data, whether it be data that's coming from a connected core across the network, and to be able to bring all that data together in real time, and give the enterprise the ability to be able to take best in class action so that you get a very prescriptive outcome of what you want. 
So if we bring that data under management from point of origination and out on the edge, and then have the platforms that move that through its entire lifecycle, and that's our HDF platform, it gives the customer the ability to, after they capture it at the edge, move it, and then have the ability to process it as an event happens, a condition changes, various conditions come together, have the ability to process and take the exact action that you want to see performed against that, and then bring it to rest, and that's where our HDP platform comes into play where then all that data can be aggregated so you can have a holistic insight, and have real time interactions on that data. But then it becomes about deploying those datasets and workloads on the tier that's most economically and architecturally pragmatic. So if that's on-prem, we make sure that we are architected for that on-prem deployment or private cloud or even across multiple public clouds simultaneously, and give the enterprise the ability to support each of those native environments. And so we think hybrid cloud architecture is really where the vast majority of our customers today and in the future are going to want to be able to run and deploy their applications and workloads. And that's where our DataPlane Service Offering gives them the ability to have that hybrid architecture and the architectural latitude to move workloads and datasets across each tier, transparently to whatever storage file format they use or wherever that application is, and we provide all the tooling to mask the complexity of doing that, and then we ensure that it has one common security framework, one common governance through its entire lifecycle, and one management platform to handle that entire data lifecycle.
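In miniature, the flow Rob describes — capture at the point of origination, act on each event as it happens, then land the rest of the data at rest — is a pipeline of per-event transforms. The sketch below is only a toy illustration of that shape; the sensor readings and the alert threshold are invented, and this is not HDF/NiFi code:

```python
def enrich(events):
    """Tag each raw reading at the point of origination (the edge role)."""
    for e in events:
        yield {**e, "severity": "alert" if e["temp_c"] > 90 else "ok"}

def route(events):
    """In-motion routing: act on alerts immediately, land the rest at rest."""
    alerts, at_rest = [], []
    for e in enrich(events):
        (alerts if e["severity"] == "alert" else at_rest).append(e)
    return alerts, at_rest

readings = [
    {"sensor": "engine-1", "temp_c": 95},
    {"sensor": "engine-2", "temp_c": 70},
]
alerts, at_rest = route(readings)
print([a["sensor"] for a in alerts])  # ['engine-1']
```

The essential property, as in the airplane example above, is that the decision is made while the data is still in motion, before anything is aggregated for historical analysis.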
And that's the modern data architecture: the ability to bring all data under management, all types of data under management, and manage that in real time through its lifecycle until it comes to rest, and deploy that across whatever architecture tier is most appropriate financially and from a performance standpoint, on cloud or on-prem. >> Rob, this morning at the keynote here in day one at DataWorks San Jose, you presented this whole architecture that you described in the context of what you call hybrid clouds to enable connected communities and with HDP, Hortonworks Data Platform 3.0 is one of the prime announcements, you brought containerization into the story. Could you connect those dots, containerization, connected communities, and HDP 3.0? >> Well, HDP 3.0 is really the foundation for enabling that hybrid architecture natively, and what it's done is separate the storage from the compute, and so now we have the ability to deploy those workloads via a container strategy across whichever tier makes the most sense, and to move those applications and datasets around, and to be able to leverage each tier in the deployment architectures that are most pragmatic. And then what that lets us do is be able to bring all of the different data types, whether it be customer data, supply chain data, product data. So imagine as an industrial piece of equipment, an airplane, is flying from Atlanta, Georgia to London, and you want to be able to make sure you really understand how well each component is performing, so that if that plane is going to need service when it gets there, it doesn't miss the turnaround and leave 300 passengers stranded or delayed, right? Now with our Connected platform, we have the ability to take every piece of data from every component that's generated and see that in real time, and let the airlines make that real time. >> Delineate essentially.
>> And ensure that we know every person that touched it and looked at that data through its entire lifecycle, from the ground crew to the pilots to the operations team to the service folks on the ground, to the reservation agents, and we can prove that if somehow that data has been breached, that we know exactly at what point it was breached and who did or didn't get to see it, and can prevent that because of the security models that we put in place. >> And that relates to compliance and mandates such as the General Data Protection Regulation, GDPR, in the EU. At DataWorks Berlin a few months ago, you laid out, Hortonworks laid out, announced a new product called the Data Steward Studio to enable GDPR compliance. Can you give our listeners now who may not have been following the Berlin event a bit of an update on Data Steward Studio, how it relates to the whole data lineage, or set of requirements that you're describing, and then going forward what is Hortonworks's roadmap for supporting the full governance lifecycle for the Connected community, from data lineage through like model governance and so forth. Can you just connect a few dots that will be helpful? >> Absolutely. What's important certainly, driven by GDPR, is the requirement to be able to prove that you understand who's touched that data and who has not had access to it, and that you ensure that you're in compliance with the GDPR regulations which are significant, but essentially what they say is you have to protect the personal data and attributes of that data of the individual. And so what's very important is that you've got to be able to have the systems that not just secure the data, but understand who has the accessibility at any point in time that you've ever maintained that individual's data.
And so it's not just about when you've had a transaction with that individual, but it's the rest of the history that you've kept or the multiple datasets that you may try to correlate to try to expand the relationship with that customer, and you need to make sure that you can ensure not only that you've secured their data, but that you're protecting and governing who has access to it and when. And as importantly that you can prove in the event of a breach that you had control of that, and who did or did not access it, because if you can't prove, in the event of a breach, that it was secure and that no one who wasn't supposed to access it did so, you can be opened up for hundreds of thousands of dollars or even multiple millions of dollars of fines just because you can't prove that it was not accessed, and that's what the variety of our platforms, you mentioned Data Studio, is part of. DataPlane is one of the capabilities that gives us the ability. The core engine that does that is Atlas, and that's the open source governance platform that we developed through the community that really drives all the capabilities for governance that move through each of our products, HDP, HDF, and then of course DataPlane and Data Studio take advantage of that and how it moves and replicates data and manages that process for us.
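Proving who did or did not touch an individual's data, as described here, is at bottom an append-only audit trail keyed by data subject — the kind of lineage record Atlas maintains at platform scale. The sketch below is a minimal, invented illustration of the idea, not the Atlas API; all user and subject names are made up:

```python
from datetime import datetime, timezone

class AuditLog:
    """Append-only record of every access to personal data, so that in a
    breach inquiry we can enumerate exactly who read which subject's
    records, and prove who never did."""
    def __init__(self):
        self._entries = []

    def record_access(self, user, subject_id, action):
        self._entries.append({
            "user": user,
            "subject": subject_id,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def accessors_of(self, subject_id):
        """Everyone who has ever touched this subject's data."""
        return sorted({e["user"] for e in self._entries
                       if e["subject"] == subject_id})

log = AuditLog()
log.record_access("ops_alice", "cust-42", "read")
log.record_access("svc_billing", "cust-42", "read")
log.record_access("ops_alice", "cust-99", "read")
print(log.accessors_of("cust-42"))  # ['ops_alice', 'svc_billing']
```

The empty result for a subject nobody touched is the point: being able to show a complete, tamper-resistant list is what lets you prove a negative to a regulator.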
And it takes through a design process, it builds product, that then you sell product to a customer, and then you service that customer, and then you learn from that transaction different ways to automate or improve efficiencies in their supply chain. But it's very procedural, very linear. And in the new world of connected data models, you want to bring transparency and real time understanding and connectivity between the enterprise, the customer, the product, and the supply chain, and that you can take real time best in practice action. So for example you understand how well your product is performing. Is your customer using it correctly? Are they frustrated with that? Are they using it in the patterns and the frequency that they should be if they are going to expand their use and buy more, and if they're not, how do we engage in that cycle? How do we understand if they're going through a re-review and another buying of something similar that may not be with you for a different reason. And when we have real time visibility to our customer's interaction, understand our product's performance through its entire lifecycle, then we can bring real time efficiency with linking those together with our supply chain into the various relationships we have with our customers. To do that, it requires the modern data architecture, bringing data under management from the point it originates, whether it's from the product or the customer interacting with the company, or the customer interacting potentially with our ecosystem partners, mutual partners, and then letting the best in practice supply chain techniques, make sure that we're bringing the highest level of service and support to that entire lifecycle. 
And when we bring data under management, manage it through its lifecycle and have the historical view at rest, and leverage that across every tier, that's when we get this high velocity, deep transparency, and connectivity between each of the constituents in the value chain, and that's what our platforms give them the ability to do. >> Not only your platform, you guys have been in business now for I think seven years or so, and you shifted, in the minds of many and including your own strategy, from being the premier data at rest company in terms of a Hadoop platform to being one of the premier data in motion companies. Is that really where you're going? To be more of a completely streaming-focused solution provider in a multi-cloud environment? And I hear a lot of Kafka in your story now that it's like, oh yeah, that's right, Hortonworks is big on Kafka. Can you give us just a quick sense of how you're making that shift towards low latency real time streaming, big data, or small data for that matter, with embedded analytics and machine learning? >> So, we have evolved from certainly being the leader in global data platforms, with all the work that we do collaboratively in and through the community, to make Hadoop an enterprise-viable data platform that has the ability to run mission critical workloads and apps at scale, ensuring that it has all the enterprise facilities from security and governance and management. But you're right, we have expanded our footprint aggressively. And we saw the opportunity to actually create more value for our customers by giving them the ability to not wait until they bring data under management to gain an insight, because in that case, they happen to be reactive, post-event, post-transaction. We want to give them the ability to shift their business model to being interactive, pre-event, pre-conditioned.
The way to do that, we learned, was to be able to bring the data under management from the point of origination, and that's what we use MiNiFi and NiFi for, and then HDF, to move it through its lifecycle, and to your point, we have the intellect, we have the insight, and then we have the ability to process the best in class outcome based on what we know the variables are we're trying to solve for as that's happening. >> And there's the word, the phrase ACID, which of course is a transactional data paradigm, I hear that all over your story now in streaming. So, what you're saying is it's a completely enterprise-grade streaming environment from end to end for the new era of edge computing. Would that be a fair way of-- >> It's very much so. And our model and strategy has always been to bring the best in class engines for what they do well for their particular dataset. A couple of examples of that: one, you brought up Kafka, another is Spark. And they do what they do really well. But what we do is make sure that they fit inside an overall data architecture that then embodies their access to a much broader central dataset that goes from point of origination to point of rest on a whole central architecture, and then benefit from our security, governance, and operations model, being able to manage those engines. So what we're trying to do is eliminate the silos for our customers, and having siloed datasets that just do particular functions. We give them the ability to have an enterprise modern data architecture, we manage the things that bring that forward for the enterprise to have the modern data driven business models by bringing the governance, the security, the operations management, ensuring that those workflows go from beginning to end seamlessly. >> Do you, go ahead. >> So I was just going to ask about the customer concerns. So here you are, you've now given them this ability to make these real time changes, what's sort of next?
What's on their mind now and what do you see as the future of what you want to deliver next? >> First and foremost we got to make sure we get this right, and we really bring this modern data architecture forward, and make sure that we truly have the governance correct, the security models correct. One pane of glass to manage this. And really enable that hybrid data architecture, and let them leverage the cloud tier where it's architecturally and financially pragmatic to do it, and give them the ability to leg into a cloud architecture without risk of either being locked in or misunderstanding where the lines of demarcation of workloads or datasets are, and not getting the economies or efficiencies they should. And we solved that with DataPlane. So we're working very hard with the community, with our ecosystem and strategic partners to make sure that we're enabling the ability to bring each type of data from any source and deploy it across any tier with a common security, governance, and management framework. So then what's next is now that we have this high velocity of data through its entire lifecycle on one common set of platforms, then we can start enabling the modern applications to function. And we can go look back into some of the legacy technologies that are very procedural based and are dependent on a transaction or an event happening before they can run their logic to get an outcome because that grinds the customer in post world activity. We want to make sure that we're bringing that kind of, for example, supply chain functionality, to the modern data architecture, so that we can put real time inventory allocation based on the patterns that our customers go in either how they're using the product, or frustrations they've had, or success they've had. 
And we know through artificial intelligence and machine learning that there's a high probability that they will not only buy, use, or expand their consumption of whatever they have of our product or service, but will probably do these other things as well if we do those things. >> Predict the logic as opposed to procedural, yes, AI. >> And very much so. And so what's next will be bringing the modern applications on top of this that become very predictive and enabling versus very procedural and post-transaction. We're a little ways downstream. That's looking out. >> That's next year's conference. >> That's probably next year's conference. >> Well, Rob, thank you so much for coming on theCUBE, it's always a pleasure to have you. >> Thank you both for having us, and thank you for being here, and enjoy the summit. >> We're excited. >> Thank you. >> We'll do. >> I'm Rebecca Knight for Jim Kobielus. We will have more from DataWorks Summit just after this. (upbeat music)

Published Date : Jun 20 2018


Bruno Aziza & Josh Klahr, AtScale - Big Data SV 17 - #BigDataSV - #theCUBE


 

>> Announcer: Live from San Jose, California, it's The Cube. Covering Big Data, Silicon Valley, 2017. (electronic music) >> Okay, welcome back everyone, live at Silicon Valley for the big The Cube coverage, I'm John Furrier, with me Wikibon analyst George Gilbert, Bruno Aziza, who's the CMO of AtScale, Cube alumni, and Josh Klahr, VP at AtScale, welcome to the Cube. >> Welcome back. >> Thank you. >> Thanks, Brian. >> Bruno, great to see you. You look great, you're smiling as always. Business is good? >> Business is great. >> Give us the update on AtScale, what's up since we last saw you in New York? >> Well, thanks for having us, first of all. And, yeah, business is great, we- I think last time I was here on The Cube we talked about the Hadoop Maturity Survey and at the time we'd just launched the company. And, so now you look about a year out and we've grown about 10x. We have large enterprises across just about any vertical you can think of. You know, financial services, your American Express, healthcare, think about Aetna, Cigna, GSK, retail, Home Depot, Macy's and so forth. And, we've also done a lot of work with our partner ecosystem, so Mork's- OEMs AtScale technology which is a great way for us to get AtScale across the US, but also internationally. And then our customers are getting recognized for the work that they are doing with AtScale. So, last year, for instance, Yellowpages got recognized by Cloudera with their leadership award. And Macy's got a leadership award as well. So, things are going in the right trajectory, and I think we're also benefitting from the fact that the industry is changing, it's maturing on the big data side, but also there's a right definition of what business intelligence means. This idea that you can have analytics on large-scale data without having to change your visualization tools and make that work with the existing stack you have in place. And, I think that's been helping us in growing- >> How did you guys do it?
I mean, you know, we've talked many times and there's some secret sauce there, but, at the time when you guys were first starting it was kind of a crowded field, right? >> Bruno: Yeah. >> And all these BI tools were out there, you had front end BI tools- >> Bruno: Yep. But everyone was still separate from the whole batch back end. So, what did you guys do to break out? >> So, there are two key differentiators with AtScale. The first one is we are the only platform that does not have a visualization tool. And, so people think about this as, that's a bug, that's actually a feature. Because most enterprises already have that investment made in traditional BI tools. And so our ability to talk to MDX and SQL types of BI tools, without any changes, is a big differentiator. And then the other piece of our technology, this idea that you can get the speed, the scale and security on large data sets without having to move the data. It's a big differentiator for our enterprises to get value out of the data they already have in Hadoop as well as non-Hadoop systems, which we cover. >> Josh, you're the VP of products, you have the roadmaps, give us a peek into what's happening with the current product. And, where's the work areas? Where are you guys going? What's the to-do list, what's the check box, and what's the innovation coming around the corner? >> Yeah, I think, to follow up on what Bruno said about how we hit the sweet spot, I think we made a strategic choice, which is we don't want to be in the business of trying to be Tableau or Excel or be a better front end. And there's so much diversity on the back end if you look at the ecosystem right now, whether it's Spark SQL, or Hive, or Presto, or even new cloud based systems, the sweet spot is really how do you fit into those ecosystems and support the right level of BI on top of those applications.
So, what we're looking at from a roadmap perspective is how do we expand and support the back end data platforms that customers are asking about. I think we saw a big white space in BI on Hadoop in particular, and that, I'd say, we've nailed over the past year and a half. But we see customers now that are asking us about Google BigQuery. They're asking us about Athena. I think these serverless data platforms are really, really compelling. They're going to take a while to get adoption, so that's a big investment area for us. And then, in terms of supporting BI front ends, we're kind of doubling down on making sure our Tableau integration is great; Power BI is, I think, getting really big traction. >> Well, two great products, you've got Microsoft and Tableau, leaders in that area. >> The self-service BI revolution, I would say, has won. And the business user wants their tool of choice. Where we come in is with the folks responsible for data platforms on the back end. They want some level of control and consistency, and so they're trying to figure out, where do you draw the line? Where do you provide standards? Where do you provide governance, and where do you let the business loose? >> All right, so, Bruno and Josh, I want you to answer the questions; it'll be a good quiz. So, define next generation BI platforms from a functional standpoint and then under the hood. >> Yeah, there's a few things you can look at. I think if you were at the Gartner BI conference last week you saw that there were 24 vendors in the magic quadrant, and I think in general people are now realizing that this is a space that is extremely crowded, and it's also sitting on technology that was built 20 years ago. Now, when you talk to enterprises like the ones we work with, like the ones I named earlier, you realize that they all have multiple BI tools. So, the visualization war, if you will, has kind of been settled, and almost won by Microsoft and Tableau at this point.
And the average enterprise has 15 different BI tools. So, clearly, if you're trying to innovate on the visualization side, I would say you're going to have a very hard time. So, you're dealing with that level of complexity. And then, from the back end standpoint, you're now having to deal with databases from the past - that's the Teradata of this world - data sources from today - Hadoop - and data sources from the future, like Google BigQuery. And so, I think the CIO answer to what is the next gen BI platform I want is something that enables me to simplify this very complex world. I have lots of BI tools, lots of data; how can I standardize in the middle in order to provide security, provide scale, provide speed to my business users? And, you know, that's really radically going to change the space, I think. If you're trying to sell a full stack that's integrated from the bottom all the way to visualization, I don't think that's what enterprises want anymore. >> Josh, under the hood, what's the next generation- you know, the key leverage for the tech, and, just the enabler. >> Yeah, so, for me the end state for the next generation BI platform is a user can log in, they can point to their data, wherever that data is - it's on-prem, it's in the cloud, it's in a relational database, it's a flat file - and they can design their business model. We spend a lot of time making sure we can support the creation of business models: what are the key metrics, what are the hierarchies, what are the measures. It may sound like I'm talking about OLAP. You know, that's what our history is steeped in. >> Well, faster data is coming, that's- streaming and data is coming together. >> So, I should be able to just point at those data sets and turn around and be able to analyze them immediately. On the back end that means we need to have pretty robust modeling capabilities.
So that you can define those complex metrics, so you can functionally do what are traditional business analytics: period over period comparisons, rolling averages, navigating up and down business hierarchies. The optimizations should be built in. It shouldn't be the responsibility of the designer to figure out, do I need to create indices, do I need to create aggregates, do I need to create summarization? That should all be handled for you automatically. You shouldn't have to think about data movement. And so that's really what we've built, from an AtScale perspective, on the back end. Point to data, we're smart about creating optimal data structures so you get fast performance. And then you should be able to connect whatever BI tool you want. You should be able to connect Excel; we can talk the MDX query language, we can talk SQL, we can talk DAX, whatever language you want to talk. >> So, take the syntax out of the hands of the user. >> Yeah. >> Yeah. >> And getting in the weeds on that stuff. Make it easier for them- >> Exactly. >> And the key word, I think, for the future of BI is open, right? We've been buying tools over the last- >> What do you mean by that, explain. >> Open means that you can choose whatever BI tool you want, and you can choose whatever data you want. And, as a business user, there's no real compromise. But, because you're getting an open platform, it doesn't mean that you have to trade off complexity. I think some of the stuff that Josh was talking about - period analysis, the type of multidimensional analysis that you need, calendar analysis, historical data - that's still going to be needed, but you're going to need to provide this in a world where the business user and the IT organization expect that the tools they buy are going to be open to the rest of the ecosystem, and that's new, I think. >> George, you want to get a question in, edgewise? Come on.
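The calculations Josh names - period-over-period comparisons and rolling averages - are the kind of thing a semantic layer defines once so every connected tool gets the same answer. A minimal sketch, on invented data (a real engine would push these down as SQL against the warehouse rather than compute them client-side):

```python
# Hedged sketch of two classic BI calculations over a hypothetical
# monthly revenue series. Data and function names are invented for
# illustration; they are not AtScale's implementation.

monthly_revenue = [100, 120, 90, 150, 160, 155]

def period_over_period(series):
    """Percent change versus the prior period; None for the first period."""
    return [None] + [
        round((cur - prev) / prev * 100, 1)
        for prev, cur in zip(series, series[1:])
    ]

def rolling_average(series, window=3):
    """Trailing average over `window` periods, once enough history exists."""
    return [
        round(sum(series[i - window + 1 : i + 1]) / window, 1)
        if i >= window - 1 else None
        for i in range(len(series))
    ]

print(period_over_period(monthly_revenue))
print(rolling_average(monthly_revenue))
```

Defining these centrally, rather than in each BI tool's formula language, is what keeps the numbers consistent when an enterprise runs 15 different front ends.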
(group laughs) >> You know, I've been sort of a single-issue candidate, I guess, this week on machine learning and how it's sort of touching all the different sectors. And I'm wondering, how do you see yourselves as part of a broader pipeline of different users adding different types of value to data? >> I think maybe on the machine learning topic there are a few different ways to look at it. The first is we do use machine learning in our own product. I talked about this concept of auto-optimization. One of the things that AtScale does is it looks at end-user query patterns. And we look at those query patterns and try to figure out how can we be smart about anticipating the next thing they're going to ask, so we can pre-index or pre-materialize that data. So, there's machine learning in the context of making AtScale a better product. >> Reusing things that are already done, that's been the whole machine-learning- >> Yes. >> Demos, we saw Google Next with the video editing and the video recognition stuff, that's been- >> Exactly. >> Huge part of it. >> You've got users giving you signals; take that information and be smart with it. I think, in terms of the customer work flow - Comcast, for example, a customer of ours - we are in a data discovery phase. There's a data science group that looks at all of their set top box data, and they're trying to discover programming patterns. Who uses the Yankees' network, for example? And where they use AtScale is what I would call a descriptive element, where they're trying to figure out what are the key measures and trends, and what are the attributes that contribute to them. And then they'll go in and they'll use machine learning tools on top of that same data set to come up with predictive algorithms. >> So, just to be clear there, they're hypothesizing about, like, say, either the pattern of users that might have an affinity for a certain channel or channels, or they're looking for pathways. >> Yes.
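The auto-optimization idea Josh describes - watch what users actually query, then pre-materialize the groupings they ask for most - can be sketched as a frequency count over a query log. Everything here (the log shape, the threshold, the function name) is invented for illustration; AtScale's real optimizer is certainly more sophisticated than a counter.

```python
# Hedged sketch of query-pattern-driven aggregate recommendation.
# Each log entry is the tuple of dimensions a user grouped by.
from collections import Counter

query_log = [
    ("region", "month"),
    ("region", "month"),
    ("region", "month"),
    ("product",),
    ("region", "month"),
]

def recommend_aggregates(log, min_hits=3):
    """Return dimension groupings queried often enough to pre-aggregate."""
    counts = Counter(log)
    return [dims for dims, hits in counts.items() if hits >= min_hits]

print(recommend_aggregates(query_log))  # [('region', 'month')]
```

The "machine learning" here is deliberately modest: the signal is user behavior, and the payoff is that the next `region`-by-`month` query hits a pre-built aggregate instead of scanning raw data.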
And I'd say our role in that right now is a descriptive role. We're supporting the descriptive element of that analytics life cycle. I think over time our customers are going to push us to build in more of our own capabilities when it comes to, okay, I discovered something descriptive, can you come up with a model that helps me predict it the next time around? Honestly, right now people want BI. People want very traditional BI on the next generation data platform. >> Just continuing on that theme, leaving machine learning aside, I guess, as I understand it, when we talk about the old school vendors - Teradata - when they wanted to support data scientists they grafted on some machine learning, like a parallel version of R in the core Teradata engine. They also bought Aster Data, which was, you know, for a different audience. So, I guess my question is, will we see from you, ultimately, a separate product line to support a new class of users? Or are you thinking about new functionality that gets integrated into the core product? >> I think it's more of the latter. So, the way that we view it - and this is really looking at, like I said, what people are asking for today - is, kind of, the basic, traditional BI. What we're building is essentially a business model. So, when someone uses AtScale, they're designing and they're telling us, they're asserting: these are the things I'm interested in measuring, and these are the attributes that I think might contribute to them. And so that puts us in a pretty good position to start using, whether it's Spark on the back end or built-in machine learning algorithms on the Hadoop cluster, our knowledge of that business model to help make predictions on behalf of the customer. >> So, just a follow-up, and this really leaves out the machine learning part, which is, it sounds like, in terms of big data, we first used it for archiving - it supported more data retention than you could do affordably with the data warehouse.
Then we did the ETL offload, now we're doing more and more of the visualization, the ad-hoc stuff. >> That's exactly right. >> So, what - in a couple years' time - what remains in the classic data warehouse, and what's in the Hadoop category? >> Well, so, I think what you're describing is the pure evolution of, you know, any technology, where you start with the infrastructure. You know, we've been in this for over ten years now; you've got Cloudera going IPO and then going into the data science workbench. >> That's not official yet. >> I think we read about this, or at least they filed. But I think the direction is showing - now people are relying on the platform, the Hadoop platform, in order to build applications on top of it. And so, I think, just like Josh is saying, the mainstream application on top of the database - and I think this is true for non-Hadoop systems as well - is always going to be analytics. Of course, data science is something that provides a lot of value, but it typically provides a lot of value to a few people, who will then scale it out to the rest of their organization. I think if you now project out to what this means for the CIO and their environment, I don't think any of these platforms - Teradata or Hadoop, or Google, or Amazon, or any of those - I don't think they 100% replace the others. And I think that's where it becomes interesting, because you're now having to deal with a heterogeneous environment, where the business user is up there using Excel, using their standard applications; they might be using the results of machine learning models, but they're also having to deal with a heterogeneous environment at the data level: Hadoop on-prem, Hadoop in the cloud, non-Hadoop in the cloud and non-Hadoop on-prem. And of course that's a market that I think is very interesting for us, as a simplification platform for that world.
>> I think you guys are really thinking about it in a new way, and I think that's kind of a great, modern approach: let the freedom- and by the way, quick question on the Microsoft tool and Tableau, what percentage share do you think they are of the market? 50? Because you mentioned those are the two top ones. >> Are they? >> Yeah, I mentioned them because, if you look at the magic quadrant, clearly Microsoft Power BI and Tableau have really shot up all the way to the right. >> Because it's easy to use, and it's easy to work with data. >> I think so. Look, from a functionality standpoint, you see Tableau's done a very good job on the visualization side. I think, from a business standpoint and a business model execution - and I can talk from my days at Microsoft - it's a very great distribution model to get thousands and thousands of users to use Power BI. Now, the players that we didn't talk about on the last magic quadrant are the likes of Google Data Studio, or Amazon QuickSight, and I think they will change the ecosystem as well. Which, again, is great news for AtScale. >> More muscle coming in. >> That's right. >> For you guys, just more rising tide floats all boats. >> That's right. >> So, you guys are powering it. >> That's right. >> Modern BI, would it be safe to say? >> That's the idea. The idea is that the visualization is basically commoditized at this point. And what business users want, and what enterprise leaders want, is the ability to provide freedom and openness to their business users and never have to compromise on security, speed, and also the complexity of those models, which is what we're in the business of. >> Get people working, get people productive faster. >> In whatever tool they want. >> All right, Bruno, thanks so much. Thanks for coming on. AtScale. Modern BI here in The Cube. Breaking it down. This is The Cube covering big data SV Strata + Hadoop. Back with more coverage after this short break. (electronic music)

Published Date : Mar 15 2017
