Sudhir Hasbe, Google Cloud | Informatica World 2019
>> Live from Las Vegas, it's theCUBE. Covering Informatica World 2019. Brought to you by Informatica. >> Welcome back, everyone, to theCUBE's live coverage of Informatica World 2019. I'm your host, Rebecca Knight, along with my cohost, John Furrier. We are joined by Sudhir Hasbe. He is the director of product management at Google Cloud. Thank you so much for coming on theCUBE. >> Thank you for inviting me. (laughing) >> So, this morning we saw Thomas Kurian up on the main stage to announce the expanded partnership. Big story in the Wall Street Journal: "Google Cloud and Informatica Team Up to Tame Data." Tell us more about this partnership. >> So if you take a look at the whole journey of data within organizations, a lot of data is still siloed in different systems within different environments. It could be hybrid on-prem, it could be multi-cloud. And customers need this whole end-to-end experience where you can go ahead and take that data, move it to cloud, do data cleansing on it, do data preparation. You want to be able to go ahead and govern the data, know what data you have, like a catalog. Informatica provides all of those capabilities. And if you look at Google Cloud, we have some highly differentiated services like Google BigQuery, which customers love across the globe, to go ahead and use for analytics. We can do large-scale analytics. We have customers from a few terabytes to 100-plus petabytes, storing that amount of data in BigQuery, analyzing it, and getting value out of it. And from there, all the A.I. capabilities that we have built on top of it. This whole journey of taking data from wherever it is, moving it, cleansing it, and then actually getting value out of it with BigQuery and with our A.I. capabilities, that whole end-to-end experience is what customers need. And with this partnership, I think we are bringing all the key components our customers need together for a perfect fit. >> Sudhir, first of all, great to see you. 
Since Google Next, and we just had a great event this year, by the way, congratulations. >> Thanks. >> A lot of great momentum in the enterprise. Explain for a minute: What is the relationship, what is the partnership? Just take a quick minute to describe what it is you're doing with Informatica. >> Yeah, that's great. I think if you take a look at it, we bring two key areas together in this partnership. There's data management: how do you get data into cloud, how do you govern it, manage it, understand it. And then there is analyzing the data, and AI. So the main thing that we're bringing together is these two capabilities. What do I mean by that? The two key components that will be available for our customers are the Intelligent Cloud Services from Informatica, which will be available on GCP, will run on GCP. This will basically make sure that the whole end-to-end capability of that platform, like data pipelines and data cleansing and preparation, everything is now available natively on GCP. That's one thing. What that will also do is, the Informatica team has actually optimized the execution as part of this migration. What that means is, now you'll be able to use products like Dataflow and Dataproc. You'll be able to use some of the AI capabilities in BigQuery to actually go do the data cleansing and preparation and process-- >> So when you say "execute", you mean "running." >> Yeah, just running software. >> Not executing, go to market, but executing software. >> Executing software. If you have a data pipeline, you can literally layer Dataproc underneath to go ahead and run some of the key processes. >> And so the value to the customer is seamless-- >> Seamless integration. >> Okay, so as you guys get more enterprise savvy, and it's clear you guys are doing good work, and obviously Thomas has got the chops there. We've covered that on theCUBE many times. As you go forward, this Cloud formula seems to be taking shape. 
Amazon, Azure, Google, coming in, providing onboarding to cloud and vice versa, so those relationships. The customers are scratching their heads, going, "Okay, where do I fit in that?" So, when you talk to customers, how do you explain that? Because, unlike the old days in computer science and the computer industry, there were known practices. You built a data center, you provisioned some servers, you did some things. It was the general-purpose formula. But every company is different. Their journey's different. Their software legacy make-up's different. They could be born in the cloud with on-prem compliance needs. So, how do customers figure this out? What's the playbook? >> I think the big thing is this: There's a trend in the industry, across the board, to go ahead and be more data-driven, to build a culture that is a data-driven culture. And as customers are looking at it, what they are seeing is, "Hey, traditionally I was doing a lot of stuff. "Managing infrastructure. Let me go build a data center. "Let me buy machines." That is not adding that much value. It was because "I need to go do that." That's why they did that. But the real value is, if I can get the data, I can go analyze it, I can get better decisions from it. If I can use machine learning to differentiate my services, that's where the value is. So, most customers are looking at it and saying, "Hey, I know what I need to do in the industry now, "is basically go ahead and focus more on insights "and less on infrastructure." But in doing this, the most important thing is, data is still, as you mentioned, siloed. It's in different applications, different data centers, still sitting in different places. So, I think what is happening with what we announced today is making it easy to get that data into Google Cloud and then leveraging that to go ahead and get insights. That's where the focus is for us. 
And as you get more of these capabilities in the cloud as native services, from Informatica and Google, customers can now focus more on how to derive value from the data. Putting the data into cloud, cleansing it, and data preparation, all of that becomes easier. >> Okay, so that brings the solution question to the table. With the solutions that you see with Informatica, because again, they have a broad space, a horizontal, on-prem and cloud, and they have a huge customer base with the enterprise, 25 years, and big data is their thing. What use case is the low-hanging fruit right now? Where are people putting their toe in the water? Where are they jumping full in? Where do you see that spectrum of solutions? >> Great question. There are two or three key scenarios that I see across the board in talking to a lot of customers. Even today, I spoke to a lot of customers at this show. And the first main thing I hear is this whole thing of modernization of the data warehousing and analytics infrastructure. A lot of data is still siloed and stuck in these different data systems that are there within organizations. And if you want to go ahead and leverage that data, build on top of the data, democratize it with everybody within the organization, or leverage AI and machine learning on top of it, you need to unwind what you've done and just take that data and put it into cloud. I think modernization of data warehouses and analytics infrastructure is one key play across the IT systems and IT operations. >> Before you go on to the next one, I just want to drill down on that. Because one of the things we're hearing, obviously here and everywhere else, is that if you constrain the data, machine learning and AI applications ultimately fail. >> Yes. >> So, legacy silos. You mentioned that. But also regulatory things. 
I've got to have privacy now, "forget my customer" requests, the GDPR first-year anniversary, new regulatory things around all kinds of data, never mind outside the United States. But the cloud is appealing, just throwing it all in there as one thing. It's an agility and lag issue, because lag is not good for AI. You want real-time data. You need to have it fast. How does a customer do that? Is it best to store it in the cloud first, or on-premise, with mechanisms? What's your take on this? >> I think it's different in different scenarios. I talk to a lot of customers about this. Not all data is restricted from going anywhere. I think there are some data sets where you want to have good governance in place. For example, if you have PII data, if you have important customer information, you want to make sure that you take the right steps to govern it. You want to anonymize it. You want to make sure that the right amount of data, per the policies within the organization, only gets into the right systems. And I think this is where, also, the partnership is helpful, because with Informatica, the tooling that they've provided, as you mentioned, over 25 years, allows customers to understand what these data sets are and what value they're providing. And so, you can do anonymization of data before it lands in cloud. So I think one thing is the tooling around that, which is critical. And the second thing is, if you can identify data sets that are real-time, and they don't have business-critical or PII-critical data, that you're fine, as a business process, having there, then you can derive a lot of value in real time from all those data sets. >> Tell me about Google's big capabilities, because you guys have a lot of internal platform power features. BigQuery is one of them. Is BigQuery the secret weapon? Is that the big power source for managing the data? >> I would just say: Our customers love BigQuery, primarily because of the capabilities it provides. There are different capabilities. 
Let me just list a few. One is: We can do analytics at scale. So as organizations grow, even if data sets are small within an organization, what I have seen is, over a period of time, when you derive a lot of value from data, you will start collecting more data within the organization. And so, you have to think about scale. Whether you are starting with one terabyte or one petabyte or 100 petabytes, it doesn't matter. Analyzing data at scale is what we're really good at, at different types of scale. Second is democratizing data. We have done a good job of making data available through different tooling, existing tooling that customers have invested in and our tooling, to make it available to everybody. AirAsia is a good example. They have been able to go ahead and give the right insights to everybody within the organization, which has helped them save 5 to 10% in their operational costs. So that's one great example of democratizing access to insights. The third big thing is machine learning and AI. We all know there is just a lack of resources in the industry to do advanced analytics with AI and machine learning. So our goal has been to democratize it. Make it easy within an organization. So there are the investments that we have done with BigQuery ML, where you can do machine learning with just simple SQL statements, or AutoML Tables, which basically allows you to just, within the UI, map and say, "That's a table in BigQuery; here's the column that I want to predict," and it will automatically figure out what model to create, and then we can use neural networks to go do that. I think that kind of investment is what customers love about the platform. >> What about the partnership from a particular functional part of the company, marketing? There's the old adage: 50% of my marketing budget is wasted. I just don't know which half. This one could really change that. >> Exactly right. >> So talk a little bit about the impact of it on marketing. 
>> I think the main thing is, if you think about it, the biggest challenge that CMOs have within organizations is: how do you do better marketing analytics and optimize the spend? So, one of the things that we're doing with the partnership is not just breaking the silos, getting the data into BigQuery, all of that, and data governance. Another thing is the master data management capability that Informatica brings to the table. Now you can have all of your data in BigQuery, you leverage the Customer 360 that MDM provides, and now CMOs can actually say, "Hey, I have a complete view of my customer. "I can do better segmentation. I can do better targeting. "I can give them better service." So that is actually going to drive a lot of value for our customers. >> I want to just touch on that once, see if I can get this right. What you just said, I think, might be the question I was just about to ask, which is: What is unique about Google's analytical portfolio with Informatica specifically? Because there are other cloud deals they have. They have Azure and AWS. What's unique about you guys and Informatica? Was it that piece? >> Yeah, I think there are a few things. One is the whole end-to-end experience of basically getting the data, breaking the silos, doing data governance, this tight integration between our product portfolios, where now you can get a great experience within the native GCP environment. That's one. And then on the other side, Cloud for Marketing is a big, big initiative for us. We work with hundreds of thousands of customers across the globe on their marketing spend and optimizing their marketing. And this is one of the areas where we can work together to go ahead and help those CMOs get more value from their marketing investments. >> One of the conversations we're having here on theCUBE, and really that we're having in the technology industry, is about the skills gap. I want to hear what you're doing at Google to tackle this problem. 
I think one of the big things that we're doing is just trying to-- I have this team internally; in planning, I use "radical simplicity." And radical simplicity is: How do we take things that we are doing today and make them extremely simple for the next generation of innovation that we're doing? All the investments in BigQuery ML use SQL for mostly everything. One of the other things that we announced at Next was SQL for Dataflow, SQL pipelines. What that means is, instead of writing Beam or Java code to build Dataflow pipelines, now you can write SQL statements to go ahead and create a whole pipeline. Similarly, machine learning with SQL. This whole aspect of simplifying capabilities so that you can use SQL, and then AutoML, that's one part of it. And the second, of course, is we are working with different partners to go ahead and have a lot of training that is available online, where customers don't have to go take traditional classes, but can just go online. All the assets are available, examples are available. One of the big things in BigQuery is we have 70-plus public data sets, where you can go, with the BigQuery sandbox, without a credit card, and start using it. You can start trying it out. You can use the 70-plus data sets that are already available and start learning the product. So I think that should help drive more-- >> Google's a real tech-culture company, obviously with its roots at Stanford. It's a very academic field, so you do hire a lot of smart people. But there are a lot of people graduating middle school, high school, college. Berkeley just graduated its first, inaugural class in data science and analytics. What are the skills, specifically, that young kids, or people who are retraining, should reboot, hone, or dial up? Are there any things that you see from people that are successful inside Google? 
I mean, sometimes you don't have to have that traditional math background or computer science, although math does help; it's key. But what is your observation? What's your personal view on this? >> I think the biggest thing I've noticed is the passion for data. I fundamentally believe that, in the next three to five years, most organizations will be driven by data and insights. Machine learning and AI are going to become more and more important. So this understanding and having the passion for understanding data, answering questions based on data, is the first thing that you need to have. And then you can learn the technologies and everything else. They will become simpler and easier to use. But the key thing is this passion for data, and having this data-driven decision-making is the biggest thing. So my recommendation to everybody who is going to college today and learning is: Go learn more about how to make better decisions with data. Learn more about tooling around data. Focus on data, and then-- >> It's like an athlete. If you're not at the gym shooting hoops, if you don't love it, if you're not living it, you're probably not going to be any-- (laughing) It's kind of like that. >> Sudhir, thank you so much for coming on theCUBE. It's a pleasure talking to you. >> Thank you. Thanks a lot for having me. >> I'm Rebecca Knight for John Furrier. You are watching theCUBE. (techno music)
Sudhir Hasbe, Google Cloud | Google Cloud Next 2019
>> Live from San Francisco, it's theCUBE. Covering Google Cloud Next 2019. Brought to you by Google Cloud and its ecosystem partners. >> Hey, welcome back, everyone. We're live here in San Francisco, California, for theCUBE's coverage of Google Cloud Next 2019. It's the third day of three days of wall-to-wall coverage. I'm John Furrier with my cohost Stu Miniman; Dave Vellante is out around the floor getting stories, getting scoops. Of course, we're here with Sudhir Hasbe, who's the director of product management at Google Cloud. So great to see you again. You were on last year. Obviously, BigQuery was a big product that we love. We've talked about it many times; we geek out on the databases. But it's not just about the databases. We talked about this yesterday, all morning on our kickoff: there is going to be a database explosion everywhere. There's no one database anymore. It's a lot of databases, so that means data in whatever database format, document, relational, unstructured, whatever you want to call it, is going to be coming into analytical tools. This is really important. It's also complex, and it needs to be made easier. You guys have made a series of announcements. Let's get to the hard news. What's the big news from your group around BigQuery ML and AutoML? Share some of the news. >> Perfect. I think not just databases are growing, but also applications. There's an explosion of different applications. Every organization is using hundreds of them, right from Salesforce to Workday. So many of them. And so having a centralized place where you can bring all the data together, analyze it, and make decisions is critical. So in that realm, to break the data silos, we have announced a few important things at the event. 
One is Cloud Data Fusion, making it easy for customers to bring in data from different sources, on-premises and in cloud, so that you can go ahead and, as you bring the data in, transform it and visually just go ahead and move the data into BigQuery for analysis. The whole idea is to have a drag-and-drop, code-free environment for customers to easily bring data in. So we have a lot of customers just bringing in all the data from their on-premises systems, Oracle, MySQL, whatever, and then moving that into BigQuery as they analyze. So that's one big thing. Super excited about it. A lot of traction, a lot of good feedback from our customers at the event. The second thing is BigQuery, which is our cloud-scale data warehouse. We have customers from a few terabytes to hundreds of terabytes with it. We also have an in-line experience for customers, like a data analyst who wants to analyze data, let's say from Salesforce or Workday or from some other tools like that. If you want to do that, we have made a hundred-plus connectors to all these different SaaS applications available through our partners, like Fivetran and Supermetrics, so you can get data into BigQuery out of the box with just five clicks. >> You'll be able to onboard that data to cloud. But it's important. Connectors, integration points, are critical table stakes. Now you guys are making that table stakes, not an add-on service that's paid for. >> You just basically go in and do five clicks. You can get the data, and you can use one of the partners' connectors for making all the decisions. And also, that's there. And we also announced a migration service to migrate from Teradata and Redshift, those things. So just making it easy to get data into GCP, so that you can unlock the value of the data, is the first thing. 
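To give a flavor of that payoff: once the data lands in BigQuery, analysis is plain Standard SQL. As a minimal sketch, here is a query over one of the public data sets Google hosts (this one is real and queryable in the free sandbox; any migrated table name would be specific to your own project):

```sql
-- Top first names in BigQuery's public USA names data set.
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY total DESC
LIMIT 10;
```

The same SELECT/GROUP BY shape applies to any table you migrate in; only the table reference changes.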
Data migration has been a pain in the butt, and it's critical linchpin that some say it could be the tell sign of how well Google Cloud will do in the Enterprise because it's not an easy solution. It's not just, oh, just move stuff over And the prizes have unique requirements. There's all kinds of governance, all kinds of weird deal things going on. So how are you guys making it easy? I guess that's the question. How you gonna make migrating in good for the enterprise? >> I think the one thing I'll tell you just before I had a customer tell me one pain. You have the best highways, but you're on grams to the highway. Is that a challenge? Can you pick that on? I'm like here are afraid. Analogy. Yeah, it's great. And so last year or so we have been focused on making the migration really easy for customers. We know a lot of customers want to move to cloud. And as they moved to cloud, we want to make sure that it's easy drag, drop, click and go for migration. So we're making that >> holding the on ramps basically get to get the data in the big challenge. What's the big learnings? What's the big accomplishment? >> I think the biggest thing has Bean in past. People have to write a lot ofthe court to go ahead and do these kind of activities. Now it is becoming Click and go, make it really cold free environment for customers. Make it highly reliable. And so that's one area. But that's just the first part of the process, right? What customers want is not just to get data into cloud into the query. They want to go out and get a lot of value out off it. And within that context, what we have done is way made some announcements and, uh, in the in that area. One big thing is the B I engine, because he'd be a engine. It's basically an acceleration on top of the query you get, like subsequently, agency response times for interactive dash boarding, interactive now reporting. So that's their butt in with that. 
What we also announced is Connected Sheets. Connected Sheets is basically going to give you a spreadsheet experience on top of BigQuery data sets. You can analyze up to ten billion rows of data in BigQuery directly, with drag and drop; we can do pivot tables and do visualizations. Customers love spreadsheets in general. >> Yeah, Sudhir, I'm glad you brought it up. We run a lot of our business on Sheets; so many of the pieces are there, and those are the highways we're using for our data. What's the first step out of the gate? What are some of the big use cases that you see with that? >> So I think AirAsia is a good example. AirAsia has a lot of operational users who needed to have access to data, and the first challenge was they really needed sub-second latency so that they could actually have interactive access to the data; BI Engine is helping with that. They use Data Studio on top of the BigQuery data to make it accessible, and BI Engine will work with all the other partner tooling too. But on the other side, they also needed to do spreadsheet-like, really complex analysis of the business so that they can improve operations. Last year we announced they have saved almost five to ten percent on operational costs, and in the airline business, that's pretty massive. So basically they were able to go out and use our Connected Sheets experience. They have been an early alpha customer, using it to go in and analyze the business and optimize it. That's what customers are able to do with Connected Sheets: take massive amounts of data about the business, analyze it, and make better decisions. >> How
So we use sheets, but also this They've had limitations on the han that client. So do we just go to Big Query? How would we work >> that you can use data fusion with you? Clicks move later into Big Query wants you now have it in big query in sheets. You will have an option from data connectors Macquarie. And once you go there, if you're in extended al far, you should get infection. Alfa. And then when you click on that, it will allow you to pick any table in bickering. And once you link the sheets to be query table, it's literally the spreadsheet is a >> run in >> front and got through the whole big query. So when you're doing a favour tables when you're saying Hey, aggregate, by this and all, it actually is internally calling big credit to do those activities. So you remove the barrier off doing something in the in the presentation layer and move that to the engine that actually can do the lot skill. >> Is this shipping? Now you mention it. Extended beta. What's the product? >> It's an extended out far for connected sheets. Okay, so it's like we're working with few customers early on board and >> make sure guys doing lighthouse accounts classic classic Early. >> If customers are already G sweet customer, we would love to get get >> more criteria on the connected sheets of Alfa sending bait after Now What's what's the criteria? >> I think nothing. If customers are ready to go ahead and give us feedback, that's what we care of. Okay, so you want to start with, like, twenty twenty five customers and then expanded over this year and expand it, >> maybe making available to people watching. Let us let us know what the hell what do they go? >> Throw it to me and then I can go with that. Folks, >> sit here. One of the other announcements saw this week I'm curious. 
How it connects into your pieces is a lot of the open source databases and Google offering those service maybe even expand as because we know, as John said in the open there, the proliferation of databases is only gonna increase. >> I think open source way announced lot of partnerships on the databases. Customers need different types of operational databases on. This is a great, great opportunity for us to partner with some of our partners and providing that, and it's not just data basis. We also announced announced Partnership with Confident. I've been working with the confident team for last one place here, working on the relationship, making sure our customers haven't. I believe customers should always have choice. And we have our native service with Cloud pops up. A lot of customers liked after they're familiar with CAFTA. So with our relationship with Khan fluent and what we announced now, customers will get native experience with CAFTA on Jessie P. I'm looking forward to that, making sure our customers are happy and especially in the streaming analytic space where you can get real time streams of data you want to be, Oh, directly analytics on top of it. That is a really high value add for us, So that's great. And so so that's the That's what I'm looking forward to his customers being able to go out and use all of these open source databases as well as messaging systems to go ahead and and do newer scenarios for with us. >> Okay, so you got big Big query. ML was announced in G. A big query also has auto support Auto ml tables. What does that mean? What's going what's going on today? >> So we announced aquarium L at Kew Blast next invader. So we're going Ta be that because PML is basically a sequel interface to creating machine learning models at scale. So if you have all your data and query, you can write two lines ofthe sequel and go ahead and create a model tow with, Let's say, clustering. We announced plastering. Now we announced Matrix factory ization. 
One great example I will give you is Booking.com. Booking.com, one of the largest travel portals in the world, had a challenge where all the hotel rooms have different kinds of criteria, like whether they have a TV and all the different amenities available, and their problem was data quality. There were a lot of challenges with the quality of the data they were getting. They were able to use the clustering algorithm, in SQL, in BigQuery, so that they could say, "Hey, what are the anomalies in these data sets?" and identify, say, hotel rooms that claim a satellite TV but no TV available. Those kinds of data-quality checks they were easily able to do with a data analyst's SQL experience. So that's that. >> That's a great example of automation. Yeah, humans would have had to come in and clean that data manually, or write scripts. >> So that's there. But on the other side, we also have amazing technology in AutoML. We had AutoML Vision and other AutoML technologies available for customers to use. But we realized a lot of problems enterprise customers have are structured-data problems: I have data in BigQuery, and I want to be able to go in and use the same technology, like neural networks, to create models on top of that data. So with AutoML Tables, what we're enabling is that customers can literally go into the AutoML Tables portal and say, "Here is a BigQuery table, and here is the column that I want to predict." Based on that data, a click of a button will automatically create the best model possible. You'll get really high accuracy with it, and then you will be able to go out and do predictions through an API, or you can do bulk predictions and store them back into BigQuery also. So that's the whole thing: making machine learning accessible to everyone in the organization.
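The bulk-prediction flow just described can be sketched in SQL. AutoML Tables itself is driven through its own UI and API, so this uses the analogous BigQuery ML call; the model and table names are hypothetical:

```sql
-- Score a whole table with a trained model and materialize the
-- predictions back into the warehouse in one statement.
-- Model and table names are made up for illustration.
CREATE OR REPLACE TABLE `my_project.ml.predictions` AS
SELECT *
FROM ML.PREDICT(
  MODEL `my_project.ml.churn_model`,
  (SELECT * FROM `my_project.ml.new_customers`));
```

Downstream dashboards or Sheets can then read `predictions` like any other table, which is what "accessible to everyone in the organization" amounts to in practice.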
That's our goal with that; it should be built into the product. >> So we know you've got a lot of great tech, but you also talk to a lot of customers. Wonder if you might have one good example to really highlight the updates. >> I think Booking.com is a good example. Also, Twentieth Century Fox last year shared their experience of how they could do segmentation of customers and target customers based on the past movies they've watched, and now they can go out and predict. We have customers like News UK. They're doing subscription prediction, like which customers are more likely to subscribe to their newspapers and which ones may churn. So those are examples of how machine learning is helping customers to target better and make better decisions. >> So, let's talk about the ecosystem. Because one of the things we were riffing on yesterday, and I was giving a monologue, Dave, about, we had a little argument, but I was saying that in the old way a lot of people are seeing an opportunity to make more margin as a system integrator or global SI, for instance. So if you're in the ecosystem dealing with Google, there's a margin opportunity because you guys lower the cost and increase the capability on the analytics side. Mention streaming analytics. So there's a business model, a moneymaking opportunity, for partners that has to be kind of figured out. What's the equation there? Can you share that? Because there's actually an opportunity, if you don't have to spend a lot of time wrangling the data before analyzing it. >> I think there's a huge opportunity for global system integrators to come in and help our customers. I think the bigger challenge, more than the margin, is that there is a lot of value that customers can get out of their data.
There are a lot of interesting insights and a lot of better decision making they can do, and a lot of customers do need help in ramping up and making sure they can get value out of that. It's a great opportunity for our global SI partners, and I've been meeting a lot of them at the show, to come in and help organizations accelerate the whole process of getting insights from their data, making better decisions, doing more machine learning, leveraging all of that. I think there is a huge opportunity for them to come in and help accelerate. >> What about some of the other low hanging fruit opportunities? I'd say that onboarding, or the data ingestion, is one low hanging fruit? >> Yes, I think the low hanging fruit is migration. As I said earlier, break the data silos, get the data into GCP. Migration is a huge opportunity for customers to get a lot of value. A lot of customers want to move to cloud because they don't want to keep investing more and more in infrastructure, so that they can leverage the benefits of cloud. I think helping customers with migrations is going to be huge. Obviously, we actually announced a migration program a week or so back as well. We will give training credits to our customers, and we will fund some of the initial investment in migration activities with our SI partners and all, so that should help there. So I think that's one area. And the second area, I would say, is once the data is in the platform, getting value out of it with BigQuery ML and AutoML, how do you help accelerate that? I think that would be a huge opportunity. >> So you feel good, Sudhir, about building an ecosystem? You feel good about that?
>> Yeah, we feel very strongly about our technology partners, folks like Looker, like Tableau, like Talend, Confluent, Trifacta for data prep. All of those, that partner ecosystem, is great, and also the SI partner ecosystem for delivery, so that we can provide great service to our customers. >> There were good logos on that slide, I've got to say. Trifacta and all the other ones were pretty good, et cetera. Okay, so what's the top story for you at the show here, besides your own announcements on the data side for your area? And then generally, in your opinion, what's the most important story here at Google Cloud Next? >> I think two things in general. The biggest news, I think, is the open source partnerships that we have announced. I'm looking forward to that. It's a good thing both for those organizations as well as for us. And then generally, you'll see a lot of examples of enterprise customers betting on us, from HSBC to ANZ Bank, who was there with me in the session. They talked about how they're getting value out of our data platform. In general, it's amazing to see a lot more enterprises adopting, coming here, and telling their stories, sharing them with folks. >> Okay, thanks so much for joining us. Sudhir, appreciate it. Good to see you again. Congratulations. Perfect fusion: ingesting on-ramps into the superhighway of BigQuery, the big engine there for large-scale data. We'll be back with more coverage after this short break.
Sudhir Hasbe, Google Cloud | Google Cloud Next 2018
>> Live from San Francisco, it's theCUBE covering Google Cloud Next 2018, brought to you by Google Cloud and its ecosystem partners. (techy music) >> Hey, welcome back, everyone, this is theCUBE Live in San Francisco coverage of Google Cloud Next '18, I'm John Furrier with Jeff Frick. Day three of three days of coverage, kind of getting day three going here. Our next guest, Sudhir, is the director of product management, Google Cloud, and has the luxury and great job of managing BigTable, BigQuery, I'm sorry, BigQuery, I guess BigTable, BigQuery. (laughs) Welcome back to the table, good to see you. >> Thank you. >> So, you guys had a great demo yesterday, I want to get your thoughts on that, I want to explore some of the machine learning things that you guys announced, but first I want to get perspective of the show for you guys. What's going on with you guys at the show here, what are some of the big announcements, what's happening? >> A lot of different announcements across the board, so I'm responsible for data analytics on the Google Cloud. One of our key products is Google BigQuery. Large scale, cloud scale data warehouse, a lot of customers using it for bringing all their enterprise data into the data warehouse, analyzing it at scale, you can do petabyte scale queries in seconds, so that's the kind of scale we provide. So, a lot of momentum on that, we announced a lot of things, a lot of enhancements within that. For example, one of the things we announced was we have a new experience, new UI of BigQuery, now you can literally do the query, as I was saying, of petabyte scale or something, any queries that you want, and with one click you can go into Data Studio, which is our BI tool that's available, or you can go in Sheets and then from there quickly go ahead and fire up a connector, connect to BigQuery, get the data in Sheets and do analysis.
As we are growing we want to make sure everybody in the organization can get access to their data and analyze it. That was one. Another thing, which is pretty unique to BigQuery, is real-time collection of information. There are customers that are actually collecting real-time data from click-streams, for example, on their websites or other places, and moving it directly into BigQuery and analyzing it. An example is in-game analytics: while you're actually playing games, we're going to collect those events and do real-time analysis; you're going to literally put it into BigQuery at scale and do that. So, a lot of customers are using BigQuery at different levels. We also announced clustering, which allows you to reduce cost, improve efficiency, and make queries almost 2x faster. So, a lot of announcements other than the machine learning. >> Well, the one thing I saw in the demo I thought was, I mean, it was machine learning, so that's a hot topic here, obviously. >> Yes. >> Is you don't have to move the data, and this is something that we've been covering, go back to the Hadoop days, back when we first started doing theCUBE, you know, data pipelines, all the complexities involved in moving the data, and at the scale and size of the data all this wrangling was going on just to get some machine learning in. >> Yep. >> So, talk about that new feature where you guys are doing it inside BigQuery. I think that's important, take a minute to explain that. >> Yeah, so when we were talking to our customers, a couple of the biggest challenges they were facing with machine learning in general were: one, every time you want to do machine learning you have to take data from your core data warehouse, and in BigQuery you have petabyte scale data sets, terabyte scale data sets.
Now, if you want to do machine learning on any portion of it, you take it out of BigQuery, move it into some machine learning engine, ML Engine, AutoML, anything, then you realize, "Oh, I missed some of the data that I needed." You go back, again take the data, move it, and you have to go back and forth too much. There are analyses that different organizations have done; 80% of their time, data scientists say, is spent on the moving of data-- >> Right. >> Wrangling data and all of that, so that is one big problem. The second big challenge we were hearing was the skillset gap; there are just not that many PhD data scientists in the industry, so how do we solve that problem? So, what we said is, first problem, how do we solve it? Why do people have to move data to the machine learning engines? Why can't we take the machine learning capability and move it inside where the data is, so bring the machine learning closer to the data rather than the data closer to machine learning? So, that's what BigQuery ML is, an ability to run regression-like models inside the data warehouse itself, in BigQuery, so that you can do that. Second, we said the interface can't be complex. Our audiences already know SQL, they're already analyzing data; these folks, the business analysts using BigQuery, are the experts on the data. So, what we said is use your standard SQL, write two lines of code: create model, the type of the model you want to run, give us the data, and we will just run the machine learning model on the backend, and you can do predictions pretty easily. So, that's what we are doing with that. >> That's awesome. >> So, Sudhir, I love to hear that you were driven by that, by your customers, because one of the things we talk about all the time is democratization. >> Yeah. >> If you want innovation you've got to democratize access to the data, and then you've got to democratize access to the tools to actually do stuff with the data-- >> Yes.
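As a concrete illustration of the create-model-then-predict flow described above, the strings below show a hedged example: the tables and columns are invented, while the CREATE MODEL and ML.PREDICT shapes follow BigQuery ML's documented syntax.

```python
# Hedged sketch of the "two lines of SQL" interface discussed above.
# Table and column names are invented for illustration.

train_sql = """
CREATE OR REPLACE MODEL `mydataset.revenue_model`
OPTIONS(model_type='linear_reg', input_label_cols=['revenue']) AS
SELECT ad_spend, region, revenue FROM `mydataset.sales_history`
""".strip()

predict_sql = """
SELECT *
FROM ML.PREDICT(MODEL `mydataset.revenue_model`,
  (SELECT ad_spend, region FROM `mydataset.next_quarter`))
""".strip()

# The analyst never leaves SQL: training and prediction are both queries.
print(train_sql.splitlines()[0])
```

Both statements would run as ordinary BigQuery queries, which is the point of the design: no data export, no separate training service from the analyst's perspective.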
>> That goes way beyond just the hardcore data scientist in the organization-- >> Yeah, exactly. >> And that's really what you're trying to enable the customers to be able to do. >> Absolutely, if you look at it, if you just go on LinkedIn and search for data analyst versus data scientist, there are 100x more analysts in the industry, and our thing was how do we empower these analysts that understand the data, that are familiar with SQL, to go ahead and do data science. Now, we realize they're not going to be expert machine learning folks who understand all the intricacies of how gradient descent works, all that; that's not their skillset. So our thing was reduce the complexity, make it very simple for them to use: just use SQL, and we take care of the internal hyperparameter tuning, the complexity of it, model selection. We try to do that internally within the technology, and they just get a simple interface for that. So, it's really empowering the SQL analysts within an organization to do machine learning with very little to no knowledge of machine learning. >> Right. >> Talk about the history of BigQuery, where did it come from? I mean, Google has this DNA of they do it internally for themselves-- >> Yes. >> Which is a tough customer-- >> Yes. >> On Cloud Spanner we had the product manager on, Deepti, she was, like, amazing, like okay, baked internally; did that have the same-- >> Yes. >> BigQuery, take a minute to talk about that, because you're now making it consumable for enterprise customers. >> Yeah. >> It's not just, "Here's BigQuery." >> No. >> Talk about the origination, how it started, why, and how you guys use it internally. >> So, BigQuery internally is called Dremel. There's a paper on Dremel available. I think in 2012 or something we published it. Dremel has been used internally for analytics across Google.
So, if you think about Spanner being used for transaction management in the company across all areas, BigQuery, or Dremel internally, is what we use for all large scale data analytics within Google. So, the whole company runs on it, analyzes data with it, so our thing was how do we take this capability that we are driving, and imagine, when you have seven products that each have more than a billion active users, the amount of data that gets generated, the insights we are giving in Maps and all the different places, a lot of those things are first analyzed in Dremel internally, and we're making it available. So, our thing was how do we take that capability that's there internally and make it available to all enterprises. >> Right. >> As Sundar was saying yesterday, our goal is to empower all our customers to go ahead and do more. >> Right. >> And so, this is a way of taking a piece of technology that's powered Google for a while and also making it available to enterprises. >> It's tested, hardened and tested. >> Yeah, absolutely. >> It's not like it's vaporware. >> Yeah, it's not. (laughs) >> No, I mean, this is what I think is important about the show this year. If you look at it, you guys have done a really good job of taking the big guns of Google, the big stuff, and not trying to just say, "We're Google and you can be like Google." You've taken it and you've kind of made it consumable. >> Yes. >> This has been a big focus, explain the mindset behind the product management. >> Absolutely. One of the key things Google is good at doing is taking what's used internally, but also the research part of it. Actually, Corinna Cortes, who is head of our AI side, does a lot of research in SQL-based machine learning, so again, the-- >> Yeah. >> BigQuery ML is nothing new; we internally have a research team that has been developing it for a few years.
We have been using it internally for running all these models and all, and so what we were able to do is bring product management from our side, like hey, this is really a problem we are facing, moving data, the skillset gap; the research team was already enabling it; and then we had an engineering team which is pretty strong. We were like, okay, let's bring all three parts of the triad together and go ahead and make sure we provide real value to our customers with all of this, so that's how it came to light. >> So, I just want to get your take, early days, like when there was the early Google Search Appliance, I'll just pick that up, and that was ancient, ancient ago, but one of the digs was, right, it didn't work as well in the enterprise, per se, because you just didn't have the same amount of data when you applied that type of technique to a Google flow of data and a Google flow of queries. So, how's that evolved over time? Because you guys, like you said, seven applications with a billion-- >> Yep. >> Users, most enterprises don't have that, so how do they get the same type of performance if they don't have the same kind of throughput to build the models and to get that data, how's that kind of evolved? >> So, this is why, when we think about scale, we think about scaling up and scaling down, right? We have customers who are using BigQuery with a few terabytes of data. Not every customer has petabyte scale, but what we're also noticing is these same customers, when they see value in data, they collect more. I will give you a real example, Zulily, one of our customers, I used to be there before, so when they started doing real time data collection for doing real time analytics they were collecting like 50 million events a day. Within 18 months they started collecting five billion a day, a 100x improvement, and the reason is they started seeing value.
They could take this real time data, analyze it, make some real time experiences possible on their website and all, and with all of that they were able to go out and deliver real value for their customers, drive growth. So when customers see that kind of value they collect more data. So, what I would say is yes, a lot of customers start small, but they all have an aspiration to have lots of data, leverage that to create operational efficiency as well as growth, and as they start doing that I think they will need infrastructure that can scale down and up all the way, and I think that's what we're focusing on, providing that. >> You guys look at the possibility, and I've seen some examples where customers are just, like, they're shell-shocked, and you're almost too good, right? I mean, it's like, "We've been doing Dremel on a large scale; I bought this data warehouse like 10 years ago," like what are you talking about? (laughs) I mean, there's a reality of, we've been buying IT, enterprises have been buying IT, and in comes Google, the gunslinger, saying, "Hey, man, you can do all this stuff." There's a little bit of a shell-shock factor for some IT people. Some engineering organizations get it right away. How are you guys dealing with this as you make it consumable? >> Yeah. >> There's probably a lot of education. As a product manager do you see, is that something that you think about, is that something you guys talk about? >> Yes, we do, so I actually see a difference in what customers need, enterprise customers versus cloud native companies. As you said, cloud native companies are starting new, starting fresh, so it's a very different set of requirements. Enterprise customers are thinking about scale, thinking about security, and how do you do that. So, BigQuery is a highly secure data warehouse. The other thing BigQuery has is it's a completely serverless platform, so we take care of the security. We encrypt all the data at rest and when it's moving.
The key thing is when we share what is possible, how easy it is to manage, and how fast people can start analyzing once they bring the data in. You can actually get started with BigQuery in minutes: you just bring your data in and start analyzing it. You don't have to worry about how many machines do I need, how do I provision it, how many servers do I need. >> Yeah. >> So, enterprises, when they look at-- >> Cloud native ready. >> Yeah. >> All right, so take a minute to explain BigTable versus, I mean, BigTable versus BigQuery. >> Yes. >> What's the difference between the two, one's a data warehouse and the other one is a system for managing data? What's the difference between Big-- >> So, it's a NoSQL system, so I will... A simple example, I will give you a real example of how customers use it, right. BigQuery is great for large scale analytics, people who want to take, like, petabyte scale data or terabyte scale data and analyze historical patterns, all of that, and do complex analysis. You want to do machine learning model creation, you can do that. What BigTable is great at is really fast serving once you have pre-aggregated data. If you have a website, I don't expect you to run a website and back it with BigQuery, it's not built for that. Whereas BigTable is exactly for that scenario, so for example, you have millions of people coming to the website, they want to see some key metrics that have been pre-created, ready to go; you go to BigTable and that can actually deliver high performance, high throughput. Last statement on that: almost 10,000-- >> Yeah. >> Requests per second per node, and you can just create as many as you want, so you can really create high scale-- >> Auto-scaling, all kinds of stuff there. >> Exactly. >> And that's good for unstructured data as well-- >> Exactly. >> And managing it. >> Absolutely. >> Okay, so structured data, SQL, basically large scale-- >> Yes. >> BigTable for real time-- >> Yes.
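The serving pattern he contrasts with BigQuery, pre-aggregated metrics behind fast point lookups, depends mostly on row-key design. The sketch below composes such a key; the key scheme and table names are hypothetical, and the commented lookup shows roughly what a read with the google-cloud-bigtable client could look like.

```python
# Sketch of the Bigtable serving pattern described above: pre-aggregated
# metrics keyed for single-row lookups. The key scheme is hypothetical.

def metric_row_key(site, metric, day):
    """Compose a row key for a pre-aggregated daily metric."""
    return f"{site}#{metric}#{day}".encode("utf-8")

key = metric_row_key("shop.example.com", "page_views", "2018-07-26")

# With a configured client, one low-latency read would look roughly like:
#   from google.cloud import bigtable
#   table = bigtable.Client(project="my-project").instance("metrics").table("daily")
#   row = table.read_row(key)
print(key)
```

The division of labor follows from this: BigQuery computes the aggregates in batch, and Bigtable serves them back at website latency.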
>> New kinds of data, different data types. >> Absolutely, yes. >> What else do you have in the bag of goodies in there that you're working on? >> The one big thing that we also announced this week was a GIS capability within BigQuery. GIS is geographic information; everything today is location-based, latitude, longitude. Our customers were telling us it's really difficult to analyze, right? Like, I want to know... An example would be: we are here, and I want to know how many food restaurants are in a two-mile radius of here, which ones are those, how many, should we create the next one here or not. Those kinds of analyses are really difficult, so we partnered with the Earth Engine team within Google, with Maps, and what we're launching is the ability to do geospatial analysis within BigQuery. Additionally, along with that we also have a visualization tool that we launched this week, so folks who haven't seen that should go check it out. One great example I will give you is Geotab; their CEO is here, Neil. He was showing a demo in one of the sessions and he was talking about how he was able to transform his business. I'll give you an example: Geotab is basically into vehicle tracking, so they have these sensors that track different things in vehicles, and they store everything in BigQuery, collect all of that and all. And his thing was, with BigQuery ML and the GIS capability, what he's now able to do is create models that can predict which intersections in a city, when it's snowing, are going to be dangerous, and for smart cities he can now recommend to cities where and how to invest in these kinds of scenarios. Completely transforming his business, because his business was not smart cities, his business was vehicle tracking and all. He's like, but with these capabilities they're transforming what they were doing and solving-- >> New discoveries. >> New discoveries, solving new problems, it's amazing.
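The two-mile-radius question above maps directly onto BigQuery GIS functions. The query below is a hedged sketch: the restaurants table and the coordinates are invented, while ST_GEOGPOINT and ST_DWITHIN are documented GIS functions, with ST_DWITHIN taking a distance in meters.

```python
# Hedged sketch of the radius query described above. The table, columns,
# and coordinates are invented; ST_GEOGPOINT/ST_DWITHIN follow BigQuery
# GIS syntax, with the distance expressed in meters.

TWO_MILES_M = 2 * 1609.34  # two miles in meters

gis_sql = f"""
SELECT name
FROM `mydataset.restaurants`
WHERE ST_DWITHIN(location,
                 ST_GEOGPOINT(-121.889, 37.329),  -- hypothetical point
                 {TWO_MILES_M})
""".strip()

print(gis_sql)
```

As with BigQuery ML, the appeal is that the spatial predicate is just another WHERE clause in standard SQL rather than a separate GIS toolchain.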
I wonder if you could just dig in a little bit to, you know, the fact that you've got these seven apps with a billion-plus users that you can leverage, you know, specific functionality or goals or objectives or priorities in those groups, and now apply those, pull that data, pull that knowledge, pull those use cases into a completely different application in the enterprise. I mean, is that an active process-- >> I don't think that's how-- >> Do people query? >> No, no. >> But how does that happen? >> No, we don't-- >> As a customer. >> As a customer it's completely different, right? Our focus in Google Cloud is primarily enabling enterprises to collect their data, process their data, innovate on their data. We don't bring in, like, the Google side of it at all; that's a completely different area that way. So basically, for enterprises, all their data stays within their environment. We don't touch it, we don't get to access it at all, and they can know it. >> Yeah, yeah, no, I didn't mean that, I meant, you know, like say Maps for instance, it's interesting to see how Maps has evolved over all these years. Every time you open it, oh, it's directions-- >> Yep. >> Oh, now it's better directions, oh, now it's got gas stations, oh, now it's where the... And it triggered because you said the restaurants that are close by, so it's kind of adding value to the core app on that side, and as you just said, now geolocation can be used on the enterprise side-- >> Yeah, yes. >> And lots of different things, so that-- >> Exactly. >> That's where I meant that kind of connection-- >> Exactly right, so-- >> In terms of the value of what can I do with geolocation. >> Absolutely, exactly, so like, that's exactly what we did. With Earth Engine we had a lot of learnings on geospatial analysis, and our thing was how do you make it easy for our enterprise customers to do that.
We've partnered with them closely and we said, "Okay, here are the core pieces of things we can add in BigQuery that will allow you to do better geospatial analysis, and visualize it." One of the big challenges is lat-longs; I don't think they're that friendly for analysts, like, oh, numbers and all that. So, we actually built a UI visualization tool that allows you to just fire a query and see visually on a map where things are, what all the points look like and all. >> Awesome. >> So, just simplifying what analysts can do with all of this. >> Sudhir, thanks for coming on, really appreciate it and congratulations on your success. Got a lot of great, big products there, hardened internally, now-- >> Yes. >> Making it consumable. It's clear here at Google Cloud you guys have recognized that making it consumable-- >> Yep. >> Pre-existing, proven technologies, so I want to give you guys props for that, congratulations. >> Thank you, thanks a lot. >> Thanks for coming on the show. >> Thanks for coming on. >> Thank you. >> It's theCUBE coverage here, Google Cloud coverage, Google Next 2018. I'm John Furrier with Jeff Frick, stay with us, we've got all day with more coverage for day three. Stay with us after this short break. (techy music)
SUMMARY :
brought to you by Google Cloud and its ecosystem partners. has the luxury and great job of managing BigTable, What's going on with you guys at the show here, in seconds, so that's the kind of scale we provide. So, a lot of announcements other than the machine learning. Well, the one thing I saw in the demo I thought was, and at the scale and size of the data all this wrangling you guys are doing it inside BigQuery. of them were, one, every time you want to on the backend and you can do predictions pretty easily. So, Sudhir, I love to hear that you were driven by that, enable the customers to be able to do. Absolutely, if you look at it, if you just baked internally, did that have the same-- BigQuery, take a minute to talk about why, and how you guys use it internally. that gets generated, the insights we are giving all our customers to go ahead and do more. and also make it available to enterprises. Yeah, it's not. "We're Google and you can be like Google." the mindset behind the product management. SQL-based machine learning, so again, the-- like hey, this is really a problem we are facing, So, how's that evolved over time, because you guys, I will give you a real example, Zulily, like what are you talking about? As a product manager do you see, is that something that can start analyzing, you can bring the data. All right, so take a minute to explain BigTable so for example, you have millions of people One great example I will give you that you can leverage, you know, specific functionality We don't bring in, like, the Google side of it at all, Every time you open it, oh, and it's directions-- to the core app on that side, and as you just said, on geospatial analysis and our thing was how do you Got a lot of great, big products there, give you guys props for that, congratulations. I'm John Furrier with Jeff Frick, stay with us,
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Paul | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Yusef | PERSON | 0.99+ |
Vodafone | ORGANIZATION | 0.99+ |
Neil | PERSON | 0.99+ |
Verizon | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Webster Bank | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Deutsche | ORGANIZATION | 0.99+ |
Earth Engine | ORGANIZATION | 0.99+ |
Sudhir | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Adolfo Hernandez | PERSON | 0.99+ |
Telco | ORGANIZATION | 0.99+ |
2012 | DATE | 0.99+ |
Andy Jassy | PERSON | 0.99+ |
Corinna Cortes | PERSON | 0.99+ |
Dave Brown | PERSON | 0.99+ |
telco | ORGANIZATION | 0.99+ |
24 weeks | QUANTITY | 0.99+ |
John Furrier | PERSON | 0.99+ |
Amazon Web Services | ORGANIZATION | 0.99+ |
100s | QUANTITY | 0.99+ |
Adolfo | PERSON | 0.99+ |
KDDI | ORGANIZATION | 0.99+ |
thousands | QUANTITY | 0.99+ |
London | LOCATION | 0.99+ |
15 | QUANTITY | 0.99+ |
Io-Tahoe | ORGANIZATION | 0.99+ |
Yusef Khan | PERSON | 0.99+ |
80% | QUANTITY | 0.99+ |
90% | QUANTITY | 0.99+ |
Sudhir Hasbe | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
SK Telecom | ORGANIZATION | 0.99+ |
two lines | QUANTITY | 0.99+ |
hundreds | QUANTITY | 0.99+ |
BigQuery | TITLE | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
four weeks | QUANTITY | 0.99+ |
10s | QUANTITY | 0.99+ |
Brazil | LOCATION | 0.99+ |
three | QUANTITY | 0.99+ |
SQL | TITLE | 0.99+ |
San Francisco | LOCATION | 0.99+ |
Global Telco Business Unit | ORGANIZATION | 0.99+ |
Ram Venkatesh, Hortonworks & Sudhir Hasbe, Google | DataWorks Summit 2018
>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by HortonWorks. >> We are wrapping up Day One of coverage of DataWorks here in San Jose, California on theCUBE. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have two guests for this last segment of the day. We have Sudhir Hasbe, who is the director of product management at Google, and Ram Venkatesh, who is VP of Engineering at Hortonworks. Ram, Sudhir, thanks so much for coming on the show. >> Thank you very much. >> Thank you. >> So, I want to start out by asking you about a joint announcement that was made earlier this morning about using some Hortonworks technology deployed onto Google Cloud. Tell our viewers more. >> Sure, so basically what we announced was support for the Hortonworks Data Platform and Hortonworks DataFlow, HDP and HDF, running on top of the Google Cloud Platform. So this includes deep integration with Google's cloud storage connector layer as well as a certified distribution of HDP to run on the Google Cloud Platform. >> I think the key thing is a lot of our customers have been telling us they like the familiar environment of the Hortonworks distribution that they've been using on-premises, and as they look at moving to cloud, like GCP, Google Cloud, they want that same familiar environment. So, they want the choice to deploy on-premises or on Google Cloud, but they want the familiarity of what they've already been using with Hortonworks products. So this announcement actually helps customers pick and choose whether they want to run the Hortonworks distribution on-premises, do it in cloud, or build a hybrid solution where the data can reside on-premises, can move to cloud, and they can build these common, hybrid architectures. So, that's what this does. >> So, HDP customers can store data in the Google Cloud.
They can execute ephemeral workloads, analytic workloads, machine learning in the Google Cloud. And there's some tie-in between Hortonworks's real-time or low latency or streaming capabilities from HDF in the Google Cloud. So, could you describe, at a full sort of detail level, the degrees of technical integration between your two offerings here? >> You want to take that? >> Sure, I'll handle that. So, essentially, deep in the heart of HDP, there's the HDFS layer that includes a Hadoop-compatible file system, which is a pluggable file system layer. So, what Google has done is they have provided an implementation of this API for the Google Cloud Storage Connector. So this is the GCS Connector. We've taken the connector and we've actually continued to refine it to work with our workloads, and now Hortonworks is actually bundling, packaging, and making this connector available as part of HDP. >> So bilateral data movement between them? Bilateral workload movement? >> No, think of this as being very efficient when our workloads are running on top of GCP. When they need to get at data, they can get at data that is in the Google Cloud Storage buckets in a very, very efficient manner. So, since we have fairly deep expertise on workloads like Apache Hive and Apache Spark, we've actually done work in these workloads to make sure that they can run efficiently, not just on HDFS, but also on the cloud storage connector. This is a critical part of making sure that the architecture is actually optimized for the cloud. So, at scale, as our customers are moving their workloads from on-premise to the cloud, it's not just functional parity, but they also need sort of the operational and the cost efficiency that they're looking for as they move to the cloud. So, to do that, we need to enable this fundamental disaggregated storage pattern. See, on-prem, the big win with Hadoop was we could bring the processing to where the data was.
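The pluggable FileSystem integration Venkatesh describes is what lets jobs address Cloud Storage through `gs://` paths as if they were HDFS paths. Below is a hedged sketch of the client-side wiring: the class names are those published for the open-source GCS connector, but the exact property set and the project ID are assumptions to check against the HDP and Google Cloud documentation.

```python
# Hadoop/Spark properties that register the GCS connector as a Hadoop
# FileSystem implementation, so `gs://bucket/path` behaves like an HDFS path.
GCS_CONNECTOR_CONF = {
    "spark.hadoop.fs.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "spark.hadoop.fs.AbstractFileSystem.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
    "spark.hadoop.fs.gs.project.id": "my-gcp-project",  # placeholder project
}

def to_spark_submit_args(conf):
    """Render the config dict as spark-submit --conf arguments."""
    return [arg for key in sorted(conf) for arg in ("--conf", f"{key}={conf[key]}")]

args = to_spark_submit_args(GCS_CONNECTOR_CONF)
# A job launched with these args could then read, for example:
#   spark.read.parquet("gs://my-bucket/events/")   (bucket name is hypothetical)
```

The point of the design discussed above is that the storage behind those `gs://` paths scales independently of the compute cluster that runs the job.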
In the cloud, we need to make sure that we work well when storage and compute are disaggregated and they're scaled elastically, independent of each other. So this is a fairly fundamental architectural change. We want to make sure that we enable this in a first-class manner. >> I think that's a key point, right. I think what cloud allows you to do is scale the storage and compute independently. And so, with storing data in Google Cloud Storage, you can, like, scale that horizontally and then just leverage that as your storage layer. And the compute can independently scale by itself. And what this is allowing customers of HDP and HDF is to store the data on GCP, on the cloud storage, and then just use the scale, the compute side of it, with HDP and HDF. >> So, if you'll indulge me to name another Hortonworks partner for just a hypothetical. Let's say one of your customers is using IBM Data Science Experience to do TensorFlow modeling and training. Can they then, inside of HDP on GCP, use the compute infrastructure inside of GCP to do the actual modeling, which is more compute intensive, and then the separate decoupled storage infrastructure to do the training, which is more storage intensive? Is that a capability that would be available to your customers with this integration with Google? >> Yeah, so where we are going with this is we are saying, IBM DSX and other solutions that are built on top of HDP, they can transparently take advantage of the fact that they have HDP compute infrastructure to run against. So, you can run your machine learning training jobs, you can run your scoring jobs, and you can have the same unmodified DSX experience whether you're running against an on-premise HDP environment or an in-cloud HDP environment. Further, that's sort of the benefit for partners and partner solutions.
From a customer standpoint, the big value prop here is that customers are used to securing and governing their data on-prem in their particular way with HDP, with Apache Ranger, Atlas, and so forth. So, when they move to the cloud, we want this experience to be seamless from a management standpoint. So, from a data management standpoint, we want all of their learning from a security and governance perspective to apply when they are running in Google Cloud as well. So, we've had this capability on Azure and on AWS, and with this partnership, we are announcing the same type of deep integration with GCP as well. >> So Hortonworks is that one pane of glass across all your product partners for all manner of jobs. Go ahead, Rebecca. >> Well, I just wanted to ask about, we've talked about the reason, the impetus for this. For the customer, it's more familiar, it offers the seamless experience. But can you delve a little bit into the business problems that you're solving for customers here? >> A lot of times, our customers are at various points on their cloud journey. For some of them, it's very simple, they're like, there's a broom coming by, the datacenter is going away in 12 months, and I need to be in the cloud. So, this is where there is a wholesale movement of infrastructure from on-premise to the cloud. Others are exploring individual business use cases. So, for example, one of our large customers, a travel partner, is exploring a new pricing model, and they want to roll out this pricing model in the cloud. They have on-premise infrastructure, and they know they'll have that for a while. They are spinning up new use cases in the cloud, typically for reasons of agility. Typically, many of our customers operate large, multi-tenant clusters on-prem. That's nice, very scalable compute for running large jobs.
But, if you want to run, for example, a new version of Spark, you have to upgrade the entire cluster before you can do that. Whereas in this sort of model, they can bring up a new workload with just the specific versions and dependencies it needs, independent of all of their other infrastructure. So this gives them agility, where they can move as fast as... >> Through the containerization of the Spark jobs or whatever. >> Correct, and so containerization as well as even spinning up an entire new environment. Because, in the cloud, given that you have access to elastic compute resources, they can come and go. So, your workloads are much more independent of the underlying cluster than they are on-premise. And this is where sort of the core business benefits around agility, speed of deployment, things like that come into play. >> And also, if you look at the total cost of ownership, take an example where customers are collecting all this information through the month. And, at month end, you want to do the closing of the books. And so that's a great example where you want ephemeral workloads. This is, like, do it once a month, finish the books, and close the books. That's a great scenario for cloud, where you don't have to create infrastructure on-premises and keep it ready. So that's one example where now, with the new partnership, you can collect all the data on-premises throughout the month if you want, but then move that and leverage cloud to go ahead and scale, do this workload, and finish the books. That's one. The second example I can give is, a lot of customers run their e-commerce platforms on-premises. They can still collect all these events through HDP that may be running on-premises with Kafka, and then, what you can do is, in-cloud, in GCP, you can deploy HDP and HDF, and you can use the HDF from there for real-time stream processing.
So, collect all these clickstream events, use them, make decisions like, hey, which products are selling better? Should we go ahead and give? How many people are looking at that product, or how many people have bought it? That kind of aggregation in real time at scale you can now do in-cloud, and build these hybrid architectures. And enable scenarios where, in the past, to do that kind of stuff you would have to procure hardware, deploy hardware, all of that, which all goes away. In-cloud, you can do that much more flexibly and just use whatever capacity you have. >> Well, you know, ephemeral workloads are at the heart of what many enterprise data scientists do. Real-world experiments, ad-hoc experiments, with certain datasets. You build a TensorFlow model, or maybe a model in Caffe or whatever, and you deploy it out to a cluster. And so the life of a data scientist is often nothing but a stream of new tasks that are all ephemeral in their own right, but are part of an ongoing experimentation program. You know, they're building and testing assets that may or may not be deployed in the production applications. So I can see a clear need for that capability of this announcement in lots of working data science shops in the business world. >> Absolutely. >> And I think, coming down to it, if you really look at the partnership, there are two or three key areas where it's going to have a huge advantage for our customers. One is analytics at scale at a lower cost, reducing the total cost of ownership of running analytics at scale. That's one of the big things. Again, as I said, the hybrid scenarios. Most enterprise customers have huge deployments of infrastructure on-premises, and that's not going to go away. Over a period of time, leveraging cloud is a priority for a lot of customers, but they will be in these hybrid scenarios.
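The clickstream scenario described above would run on HDF in production, for example Kafka feeding a stream processor. The sketch below only simulates the core computation locally, counting product views in tumbling time windows over made-up events; it illustrates the shape of the real-time question ("how many people are looking at that product?"), not the streaming infrastructure itself.

```python
from collections import Counter

def views_per_window(events, window_s=60):
    """Bucket (timestamp, product) click events into tumbling windows of
    window_s seconds and count views per product within each window."""
    windows = {}
    for ts, product in events:
        start = ts - (ts % window_s)          # window start time
        windows.setdefault(start, Counter())[product] += 1
    return windows

# Made-up clickstream events: (seconds since start, product viewed).
clicks = [(3, "shoes"), (12, "shoes"), (45, "hats"), (61, "shoes"), (95, "hats")]
print(views_per_window(clicks))
# Window 0-59s: shoes viewed twice, hats once; window 60-119s: one view each.
```

A streaming engine would emit these per-window counts continuously as events arrive, instead of over a finished list, which is the "real-time at scale" part the partnership supplies.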
And what this partnership allows them to do is have these scenarios that span across cloud and the on-premises infrastructure they are building, and get business value out of all of them. And then, finally, we at Google believe that the world will become more and more real-time over a period of time. We already are seeing a lot of these real-time scenarios with IoT events coming in and people making real-time decisions, and this is only going to grow. And this partnership also provides the whole streaming analytics capability in-cloud, at scale, for customers to build these hybrid plus real-time streaming scenarios. >> Well, it's clear what the Hortonworks partnership gives Google in this competitive space, in the multi-cloud space. It gives you the ability to support hybrid cloud scenarios. You're one of the premier public cloud providers that we all know about, and clearly, now that you've got the Hortonworks partnership, you have the ability to support those kinds of highly hybridized deployments for your customers, many of whom I'm sure have those requirements. >> That's perfect, exactly right. >> Well, a great note to end on. Thank you so much for coming on theCUBE. Sudhir, Ram, thank you so much. >> Thank you, thanks a lot. >> Thank you. >> I'm Rebecca Knight for James Kobielus. We will have more tomorrow from DataWorks. We will see you tomorrow. This is theCUBE signing off. >> From sunny San Jose. >> That's right.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
James Kobielus | PERSON | 0.99+ |
Rebecca Knight | PERSON | 0.99+ |
Rebecca | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
Sudhir | PERSON | 0.99+ |
Ram Venkatesh | PERSON | 0.99+ |
San Jose | LOCATION | 0.99+ |
HortonWorks | ORGANIZATION | 0.99+ |
Sudhir Hasbe | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
two guests | QUANTITY | 0.99+ |
San Jose, California | LOCATION | 0.99+ |
DataWorks | ORGANIZATION | 0.99+ |
tomorrow | DATE | 0.99+ |
Ram | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
one example | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
two offerings | QUANTITY | 0.98+ |
12 months | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
Day One | QUANTITY | 0.98+ |
DataWorks Summit 2018 | EVENT | 0.97+ |
IBM | ORGANIZATION | 0.97+ |
second example | QUANTITY | 0.97+ |
Google Cloud Platform | TITLE | 0.96+ |
Atlas | ORGANIZATION | 0.96+ |
Google Cloud | TITLE | 0.94+ |
Apache Ranger | ORGANIZATION | 0.92+ |
three key areas | QUANTITY | 0.92+ |
Hadoop | TITLE | 0.91+ |
Kafka | TITLE | 0.9+ |
theCUBE | ORGANIZATION | 0.88+ |
earlier this morning | DATE | 0.87+ |
Apache Hive | ORGANIZATION | 0.86+ |
GCP | TITLE | 0.86+ |
one pane | QUANTITY | 0.86+ |
IBM Data Science | ORGANIZATION | 0.84+ |
Azure | TITLE | 0.82+ |
Spark | TITLE | 0.81+ |
first | QUANTITY | 0.79+ |
HDF | ORGANIZATION | 0.74+ |
once in a month | QUANTITY | 0.73+ |
HDP | ORGANIZATION | 0.7+ |
TensorFlow | OTHER | 0.69+ |
Hortonworks DataPlatform | ORGANIZATION | 0.67+ |
Apache Spark | ORGANIZATION | 0.61+ |
GCS | OTHER | 0.57+ |
HDP | TITLE | 0.5+ |
DSX | TITLE | 0.49+ |
Cloud Storage | TITLE | 0.47+ |