Debanjan Saha, Google Cloud | October 2020
(gentle music)
>> From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.
>> With Snowflake's enormously successful IPO, it's clear that data warehousing in the cloud has come of age, and few companies know more about data and analytics than Google. Hi, I'm Paul Gillen. This is a CUBE Conversation. And today we're going to talk about data warehousing and data analytics in the cloud. Google BigQuery, of course, is a popular, fully managed serverless data warehouse that enables rapid SQL queries and interactive analysis of massive data sets. This summer, Google previewed BigQuery Omni, which essentially brings the capabilities of BigQuery to additional platforms, including Amazon Web Services and soon Microsoft Azure. It's all part of Google's multicloud strategy. No one knows more about this strategy than Debanjan Saha, General Manager and Vice President of engineering for data analytics at Google Cloud. And he joins me today. Debanjan, thanks so much for joining me.
>> Paul, nice to meet you and thank you for having me today.
>> So it's clear that data warehousing is now part of many enterprise data strategies. How has the rise of cloud changed the way organizations are using data science, in your view?
>> Well, I mean, you know, the cloud definitely is a big enabler of data warehousing and data science, as you mentioned. I mean, it has enabled things that people couldn't do on-prem. For example, if you think about data science, the key ingredient of data science, before you can start anything, is access to data, and you need a massive amount of data in order to build the right model that you want to use. And this was a big problem on-prem, because people were always thinking about what data to keep and what to discard. That's not an issue in cloud. You can keep as much data as you want, and that has been a big boon for data science.
And it's not only your data. You can also have access to other data, for example, your partners' data, public data sets and many other things that people have access to, right? That's number one. Number two, of course, it's a very compute-intensive operation, and, you know, large enterprises of course can afford to build a large data center and bring in tens of thousands of CPU cores, GPU cores, TPU cores, what have you, but it is difficult, especially for smaller enterprises, to have access to that amount of computing power, which is very, very important for data science. Cloud makes it easy. I mean, you know, it has in many ways democratized the use of data science, and not only the big enterprises: everyone can take advantage of the computing power that the various different cloud vendors make available on their platforms. And third, not to be overlooked, cloud also makes available to customers and users lots of different data science platforms, for example, Google's own TensorFlow, and you have many other platforms, Spark being one example of that, right? Both cloud-native platforms as well as open source platforms, which is very, very useful for people using data science, and managed open source Spark also makes it very, very affordable. And all of these things have contributed to a massive boom in data science in the cloud, from my perspective.
>> Now, of course, we've seen over the last seven months a rush to the cloud triggered by the COVID-19 pandemic. How has that played out in the analytics field? Do you see any long-term changes to the landscape, to the way customers are using analytics, as a result of what's happened these last seven months?
>> You know, I think, as you know, a kind of digitization of business has been happening over a long period of time, right? And people are using AI/ML and analytics in increasing numbers.
What I've seen is that, because of COVID-19, that trend has accelerated, both in terms of people moving to cloud and in terms of their using advanced analytics and AI/ML. And they have to do that, right? Pretty much every business is kind of leaning heavily on their data infrastructure in order to gain insight into what's coming next. A lot of the models that people are used to are no longer valid. Things are changing very, very rapidly, right? So in order to survive and thrive, people have to lean on data, lean on analytics, to figure out what's coming around the corner. And that trend, in my view, is only going to accelerate. It's not going to go the other way round.
>> One of the problems with cloud databases we often hear complaints about is that there are so many of them. Do you see any resolution to that proliferation?
>> Well, you know, I do think one size does not fit all, right? So it is important to have choice. It's important to have specialization. And that's why you see a lot of cloud databases. I don't think the number of cloud databases is going to go down. What I do expect to happen is that people are going to use interoperable data formats. They are going to use open APIs, so that it's very, very portable as people want to move from one database to another. The way I think the convergence is going to come is in two ways. One, you know, a lot of databases, for example, use federation. If you look at BigQuery, for example, you can start with BigQuery, but with BigQuery you can also have access to data in other databases, not only in GCP or Google Cloud but also in AWS, with BigQuery Omni, for example, right? So that provides a layer of federation, which kind of creates convergence with respect to the various different data assets people may have.
I have also seen, for example with Looker, that the creation of enterprise-wide data models and data APIs gives people a platform so that they can build their custom data apps and data solutions on top of the data API. Those, I believe, are going to be the points of convergence. I think data is probably going to be in different databases, because different databases do different things well, but that does not mean people won't have access to all their data through one API or one set of models.
>> Well, since we're on the subject of BigQuery: this summer, you introduced BigQuery Omni, which is essentially a version of BigQuery that can query data in other cloud platforms. What is the strategy there? And what has the customer reaction been so far?
>> Well, I mean, you know, as you probably have seen talking to customers, more than 80% of the customers that we talk to use multiple clouds, and that trend is probably not going to change. I mean, it happens for various different reasons: sometimes because of compliance, sometimes because they want to have different tools and different platforms, sometimes because of M&A. We are big believers in a multicloud strategy, and that's what we are trying to do with BigQuery Omni. We do realize people have choices. Customers will have their data in various different places, and we will take our analytics wherever the data is, so customers won't have to worry about moving data from one place to another. That's what we are trying to do with BigQuery Omni. You're going to see, for example with Anthos, we have created a platform over which you can build various different data stacks and applications, which span multiple clouds. I believe we are going to see more of that. And BigQuery Omni is just the beginning.
>> And how have your customers reacted to that announcement?
>> Oh, they reacted very, very positively.
This is the first time they have a major cloud vendor offering a fully managed serverless data warehouse platform on multiple clouds. And as I mentioned, I mean, we have many customers who have some of their data assets, for example, in GCP, and they really love BigQuery. And they also have, for example, applications running on AWS and Azure. And today the only option they have is to essentially shuttle their data between various different clouds in order to gain insight across the collective pool of data sets that they have. With BigQuery Omni, they don't need to do that. They can keep their data wherever it is. They can still join across that data and get insights irrespective of which cloud their data is in.
>> You recently wrote in Forbes about the shortage of data scientists and the need to make data analytics more accessible to the average business user. What is Google doing in that respect?
>> So, I mean, you know, one of our goals is to make data and insight from data available to everybody in the business, right? That is the way you can democratize the use of analytics and AI/ML. And, you know, one way to do that is to teach everybody R or Python or some specific tools, but that's going to take a long time. So our approach is to make the power of data analytics and AI/ML available to our users, no matter what tools they're comfortable with. So, for example, if you look at BQML, BigQuery ML, we have made it possible for our users who like SQL very much to use the power of ML without having to learn anything else or without having to move their data anywhere else. We have a lot of business users, for example, who prefer spreadsheets, and with Connected Sheets we have made the spreadsheet interface available on top of BigQuery, and they can use the power of BigQuery without having to learn anything else. Better yet, we recently launched BigQuery Q&A.
And what Q&A allows you to do is to use natural language on top of BigQuery data, right? So the goal, I mean, if you can do that, that, I think, is the nirvana, where anyone, for example somebody working in a call center talking to a customer, can use a simple query to figure out what's going on with the bill, for example, right? And we believe that if we can democratize the use of data, insight and analytics, that is not only going to accelerate the digital transformation of businesses, it's also going to grow consumption. And that's good for both the users and the business.
>> Now, you bought Looker last year. What would you say is different about the way Google is coming at the data analytics market from the way other cloud vendors are doing it?
>> So Looker is a great addition to the already strong portfolio of products that we have, but, you know, a lot of people think about Looker as a business intelligence platform. It's actually much more than that. What is unique about Looker is the governed semantic model that Looker can build on top of data assets, which may be in BigQuery, maybe in Cloud SQL, maybe, you know, in other clouds, for example in Redshift or SQL Data Warehouse. And once you have the data model, you can create a data API and essentially an IDE, or integrated development environment, on top of which you can build your custom workflows. You can build your custom dashboards. You can build your custom data applications. And that is, I think, where we are moving. I don't think people want the old dashboards anymore. They want their data experience to be immersive, within the workflow and within the context in which they are using the data. And that's where I see a lot of customers now using the power of Looker and BigQuery and the other platforms that we have, and building these custom data apps.
And again, like BigQuery, Looker is also multi-platform. It supports multiple data warehouses and databases, and that kind of aligns very well with our philosophy of having an open platform that is multicloud as well as hybrid.
>> Certainly, with Anthos and with BigQuery Omni, you've demonstrated your commitment to multicloud, but not all cloud vendors have an interest in being multicloud. Do you see any change in that standoff, and are you really in a position to influence it?
>> Absolutely. I think, more than us, it's the customer who is going to influence that, right? And almost every customer I talk to, they don't want to be in a walled garden. They want to be on an open platform where they have the choice, they have the flexibility, and I believe these customers are going to push, essentially, the adoption of platforms which are open and multicloud. And, you know, I believe over time the successful platforms have to be open platforms. And closed platforms, if you look at history, have never been very successful, right? And, you know, I sincerely think that we are on the right path and we are on the side of customers in this philosophy.
>> Final question. What's your most important priority right now?
>> You know, I wake up every day thinking about how we can make our customers successful. And the best way to make our customers successful is to make sure that they can get business outcomes out of the data that they have. And that's what we are trying to do. We want to accelerate time to value from data, you know, so that people can keep their data in a governed way. They can gain insight by using the tools that we can provide them. A lot of them we have used internally for many years, and those tools are now available to our customers. We also believe we need to democratize the use of analytics and AI/ML. And that's why we are trying to give customers tools where they don't have to learn a lot of new things and new skills in order to use them.
And if we can do that successfully, I think we are going to help our customers get more value out of their data and create businesses which can use that value. I'll give you a couple of quick examples. I mean, for example, if you look at Home Depot, they use our platform to improve the predictability of their inventory by 2x. If you look at, for example, HSBC, they have been able to use our platform to detect financial fraud 10x faster. If you look at, for example, Juan Perez, who's the CIO of UPS, they have used our AI/ML and analytics to do better logistics and route planning. And they have been able to save 10 million gallons of fuel every year, which amounts to 400 million in cost savings. Those are the kind of business outcomes we would like to drive with the power of our platform.
>> Powerful stuff: democratized data, multicloud, data in any cloud. Who can argue with that? Debanjan Saha, General Manager and Vice President of engineering for data analytics at Google Cloud, thanks so much for joining me today.
>> Paul, thank you. Thank you for inviting me.
>> I'm Paul Gillen. This has been a CUBE Conversation.
>> Debanjan: Thank you. (soft music)