Yuanhao Sun, Transwarp | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE (light music) presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hi, I'm Peter Burris, and welcome back to Big Data SV, theCUBE's annual broadcast of what's happening in the big data marketplace, here in San Jose, adjacent to Strata. We've been broadcasting all day, and we're going to be here tomorrow as well, over at the Forager eatery, a place to come meander. So come on over and spend some time with us. Now, we've had a number of great guests. Many of the thought leaders visiting San Jose for the big data marketplace have been on, but I don't think any has traveled as far as our next guest. Yuanhao Sun is the CEO of Transwarp, come all the way from Shanghai. Yuanhao, it's once again great to see you on theCUBE. Thank you very much for being here. >> Good to see you again. >> So Yuanhao, Transwarp as a company has become extremely well known for great technology. There are a lot of reasons why that's the case, but you have some interesting updates on how the technology is being applied. Why don't you tell us what's going on? >> Okay, so recently we announced the first audited TPC-DS benchmark result. Our product, called Inceptor, is a SQL engine on top of Hadoop. We have added quite a lot of features, like distributed transactions and full SQL support, so that it can behave like Oracle or DB2 and other traditional databases, and that is how we can pass the whole test. The engine is also scalable, because it is distributed, so a large benchmark like TPC-DS, which starts from 10 terabytes, the SQL engine can pass without much trouble. >> So I know that there have been other firms that have claimed to pass TPC-DS, but they haven't been audited. What does it mean to say you're audited? I presume that, as a result, you've gone through some extremely stringent and specific tests to demonstrate that you can actually pass the entire suite. >> Yes, actually, there is a third-party auditor. They have audited our test process and its results for the past five months, so it is fully audited. The reason why we can pass the test comes down to two major reasons. Traditional databases are not scalable enough to process such a large dataset, so they could not pass the test. For Hadoop vendors, the SQL engines' features are not rich enough to pass all the tests. There are several steps in the benchmark: among the SQL queries, there are 99 queries whose syntax is not supported by all Hadoop vendors yet. And the benchmark also requires you to update the data after the queries, and then run the queries for multiple concurrent users. That means you have to support distributed transactions; you have to keep the updated data consistent. The Hadoop vendors' SQL engines haven't implemented those distributed transaction capabilities, so that's why they failed to pass the benchmark.
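To make the data-maintenance requirement concrete: the refresh phase of TPC-DS interleaves inserts and deletes with the query streams, which is what forces the engine to support transactional consistency. A minimal sketch of what such a refresh looks like in standard SQL follows; store_sales and date_dim are real TPC-DS tables, but the staging table and date window are illustrative, not the benchmark's actual maintenance functions.

```sql
-- Hypothetical TPC-DS-style data maintenance step: the engine must
-- apply this atomically while concurrent query streams keep running.
BEGIN TRANSACTION;

-- Remove sales that fall inside the refresh window.
DELETE FROM store_sales
WHERE  ss_sold_date_sk IN (SELECT d_date_sk
                           FROM   date_dim
                           WHERE  d_date BETWEEN DATE '2002-09-01'
                                             AND DATE '2002-09-07');

-- Load the replacement rows from a staging table.
INSERT INTO store_sales
SELECT * FROM store_sales_staging;

COMMIT;  -- concurrent readers must never see a half-applied refresh
```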
>> So I had the honor of traveling to Shanghai last year to speak at your user conference, and I was quite impressed with the energy in the room as you announced a large number of new products. You've been very focused on taking what open source has to offer but adding significant value to it. As you said, you've done a lot with the SQL interfaces and various capabilities of SQL on top of Hadoop. Where is Transwarp going with its products today? How is it expanding? How is it being organized? How is it being used? >> We group these products into three categories, covering big data, cloud, and AI and machine learning. For big data, we have upgraded the SQL engine and the stream engine, and we have a set of tools, called Transwarp Studio, to help people streamline big data operations. The second product line is the data cloud; we call it Transwarp Data Cloud. This product is going to be released early in May this year, and we build it on top of Kubernetes. We provide Hadoop as a service, data science as a service, and AI as a service to customers. It allows people to create multiple tenants, and the tenants are isolated by network, storage, and CPU. They are free to create clusters and spin them up or turn them off, and it can scale to hundreds of hosts. I think this is the first implementation of this kind of network isolation and multi-tenancy in Kubernetes, so that it can support HDFS and all the Hadoop components. And because it is elastic, just like cloud computing, but runs on bare metal, people can consolidate the data and the applications in one place. Because all the applications and Hadoop components are containerized, meaning they run from Docker images, we can spin up a cluster very quickly and scale it to a larger cluster. This data cloud product is very interesting for large companies, because they usually have a small IT team, but they have to provide big data and machine learning capabilities to much larger groups of people. So they need a convenient way to manage all these big clusters, they have to isolate the resources, and they even need a billing system. We already have a few big names in China for this product, like China Post, Picture Channel, and Secret of Source Channel, and they are already running this data cloud for their internal customers. >> And China has a few people, so I presume that China Post, for example, is probably a pretty big implementation. >> Yes, their IT team is less than 100 people, but they have to support thousands of users. In the past you would usually deploy one cluster for each application, but today a large organization has lots of applications hoping to leverage big data capability, and a very small IT team cannot operate so many clusters. So they need a convenient way, just like when you put Hadoop on a public cloud: we provide a product that allows you to offer Hadoop as a service in a private cloud, on bare-metal machines. So this is the second product category. And the third is machine learning and artificial intelligence. We provide a data science platform, an interactive machine learning tool that allows people to create machine learning pipelines and models. We have even implemented some automatic modeling capability that does feature engineering automatically or semi-automatically and selects the best model for you, so that everyone can use machine learning, and they can use our tool to quickly create models. We also have pre-built models for different industries, like financial services, banks, securities companies, even IoT. We have different pre-built machine learning models for them; they just need to modify the template, and then they can apply the machine learning models to their applications very quickly. For example, one bank customer used it to deploy a model in one week.
This is very quick for them. Otherwise, in the past, they would hire a company to build that application and develop the models, which usually takes several months. Today it is much faster. So today we have three categories: big data, cloud, and machine learning. >> Peter Burris: Machine learning and AI. >> And so three products. >> And you've got some very, very big implementations. So you were talking about a couple of banks, but we were talking, before we came on, about some of the smart cities. >> Yuanhao Sun: Right. >> Kinds of things that you guys are doing at enormous scale. >> Yes, we have deployed our streaming product in more than 300 cities in China. These clusters are connected together, and we use the streaming capability to monitor the traffic and send information from each city to the central government, to a central repository. Whenever illegal behavior on the road is detected, that information is sent to the police and to the central repository within two seconds. Whenever you are seen by a camera anywhere in China, the alert is sent out within two seconds. >> So the bad behavior is detected, it's identified at a location, the system also knows where the nearest police person is, and it sends a message that says this car has done something bad. >> Yeah, and you should stop that car at the next station or the next crossroad. Today there are tens of thousands of policemen who depend on this system for their daily work. >> Peter Burris: Interesting. >> So, just a question: it sounds like one of your nearest competitors, in terms of, let's take the open source community, at least the APIs, and in their case open source, is Huawei. Have there been customers that tried to do a POC with you and with Huawei, and said, well, it took four months using the pure open source stuff, and it took, say, two weeks with your stack, it being much broader and deeper? Are there any examples like that? >> There are quite a lot. We have more market share; in financial services, we have about 100 bank users. If we take all the banks that already use Hadoop into account, our market share is above 60%. >> George Gilbert: 60. >> Yeah, in financial services. We usually do a POC and run benchmarks on their real workloads, and it usually takes us three days or one week. They find we can speed up their workloads very quickly. Bank of China, for example, migrated their Oracle workload to our platform. They tested our platform and the Huawei platform too. The first thing they found is that they could not move the whole Oracle workload to open source Hadoop, because of the missing features. We were able to support the whole workload with very minor modifications; the modifications took only several hours, and we could finish the whole workload within two hours, where Oracle usually takes more than one day, >> George Gilbert: Wow. >> more than ten hours, to finish the workload. So it is very easy to see the benefits quickly. >> Now, you have a streaming product, also with that same SQL interface. Are you going to see a migration of applications that used to be batch to more near real time or continuous, or will you see a whole new set of applications that weren't done before, because the latency wasn't appropriate? >> For streaming applications, the real-time cases are mostly new applications, but if you are using the Storm API or the Spark streaming API, it is not so easy to develop your applications. And another issue is that once you define a new rule, you have to add those rules dynamically to your cluster. The operators do not have much knowledge of writing Scala code; they only know how to configure things, but they are familiar with SQL, so they just need to add one SQL statement to add a new rule. >> In your system. >> Yeah, in our system. So it is much easier for them to program streaming applications.
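To give a flavor of what "one SQL statement per rule" means in practice, here is a sketch of a continuously evaluated rule in a generic streaming-SQL dialect, loosely modeled on Flink SQL and ksqlDB. The syntax is illustrative only and is not Transwarp's actual StreamSQL; the stream name, window clause, and alert sink are all assumptions.

```sql
-- Hypothetical streaming rule: flag cards with more than 3 transactions
-- from different cities inside a 10-minute sliding window.
CREATE STREAM card_txns (
    card_id   VARCHAR,
    city      VARCHAR,
    amount    DECIMAL(12, 2),
    txn_time  TIMESTAMP
);

INSERT INTO fraud_alerts          -- alert sink consumed by downstream systems
SELECT card_id,
       COUNT(DISTINCT city) AS cities,
       SUM(amount)          AS total
FROM   card_txns
GROUP  BY card_id,
       HOP(txn_time, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE)
HAVING COUNT(DISTINCT city) > 3;  -- the "rule"; operators tweak this clause
```

The point of the design choice is that adding or changing the HAVING condition is a configuration-level change, rather than writing and redeploying Scala code against a low-level streaming API.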
And for those customers who don't have real-time requirements, they may hope to do real-time data warehousing. They collect all the data from their websites and from their sensors. PetroChina, the large oil company, for example, sends all the sensor information directly into our streaming product. In the past, they just loaded it into Oracle and ran the dashboards, so it took hours to see the results. But today the application can be moved onto our streaming product with only a few modifications, because they are all SQL statements, and the application becomes real time: they can see the dashboard results in several seconds. >> So Yuanhao, you're number one in China, and you're moving more aggressively to participate in the US market. Last question: what's the biggest difference between being number one in China, the way that big data is being done in China, versus the way you're encountering big data being done here in the US, for example? Is there a difference? >> I think there are some differences. I think US customers usually request a POC, but in China, they focus more on the results, on what benefit they can gain from your product. So we have to prove it to them; we have to help them migrate applications to see the benefits. I think in the US, they focus more on technology than Chinese customers do. >> Interesting, so it's more about the technology here in the US, more about the outcome in China. Once again, Yuanhao Sun, CEO of Transwarp, thank you very much for being on theCUBE. >> Thank you. >> And I'm Peter Burris, with George Gilbert, my co-host, and we'll be back with more from Big Data SV, in San Jose. Come on over to the Forager and spend some time with us. We'll be back in a second. (light music)
Yuanhao Sun, Transwarp Technology - BigData SV 2017 - #BigDataSV - #theCUBE
>> Announcer: Live from San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017. (upbeat percussion music) >> Okay, welcome back everyone. Live here in Silicon Valley, San Jose, is Big Data SV, Big Data Silicon Valley, in conjunction with Strata Hadoop. This is theCUBE's exclusive coverage. Over the next two days, we've got wall-to-wall interviews with thought leaders and experts breaking down the future of big data, the future of analytics, the future of the cloud. I'm John Furrier with my co-host George Gilbert with Wikibon. Our next guest is Yuanhao Sun, who's the co-founder and CTO of Transwarp Technologies. Welcome to theCUBE. You were on theCUBE previously, 166 days ago, I noticed. But now you've got some news, so let's get the news out of the way. What are you guys announcing here, this week? >> Yes, so we are announcing 5.0, the latest version of Transwarp Data Hub. We would call it a revolutionary product, because the first thing is we embedded Kubernetes in our product, so we allow people to isolate different kinds of workloads using Docker containers, and we also provide a scheduler to better support mixed workloads. And the second is, we are building a set of tools that allow people to build their warehouse and then migrate from an existing, traditional data warehouse to Hadoop. We are also giving people the capability to build a data mart, which allows you to interactively query data. So we built a column store, in memory and on SSD, and we rewrote the whole SQL engine. It is a very tiny SQL engine that allows people to query data very quickly, about five to ten times faster than Spark 2.0. And we also allow people to build cubes on top of Hadoop. Once the cube is built, the SQL performance, like the TPC-H performance, is about 100 times faster than an existing database or Spark 2.0, so it's super fast. And actually we found a pilot customer who replaced Teradata with our software to build a data mart; we have already migrated, say, 100 reports from Teradata to our product. So the progress is very good.
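As a rough illustration of why a pre-built cube can give a 100x speedup: the expensive aggregations are computed once and stored, so later reports read the answer instead of scanning the base tables. The sketch below uses standard SQL's GROUP BY CUBE; Transwarp's actual cube DDL is not shown here, and the table and column names are assumptions.

```sql
-- Materialize the cube once, ahead of time: every combination of
-- (region, product, year) gets a pre-computed aggregate row.
CREATE TABLE sales_cube AS
SELECT region,
       product,
       sale_year,
       SUM(amount)  AS total_amount,
       COUNT(*)     AS txn_count
FROM   sales
GROUP  BY CUBE (region, product, sale_year);

-- Later reports become cheap lookups against the cube instead of
-- full scans of the sales fact table. In GROUP BY CUBE output, a NULL
-- marks a dimension that was aggregated away.
SELECT total_amount
FROM   sales_cube
WHERE  region = 'East' AND product IS NULL AND sale_year = 2016;
```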
And the third is, we are providing a tool for people to build machine learning pipelines. We are leveraging TensorFlow, MXNet, and also Spark, so people can visualize the pipeline and build data mining workflows. This is kind of a data science tool, and it's very easy for people to use. >> John: Okay, so take a minute to explain, 'cause that was great. You've got the performance there; that's the news out of the way. Take a minute to explain Transwarp, your value proposition, and when people engage you as a customer. >> Yuanhao: Yeah, so people choose our product, and the major reason is our compatibility with Oracle, DB2, and Teradata SQL syntax. You know, they have built a lot of applications on those databases, so when they migrate to Hadoop, they don't want to rewrite the whole program. Our SQL compatibility is a big advantage to them. We also support full ANSI SQL and distributed transactions on Hadoop, so a lot of applications can be migrated to our product with few modifications, or without any changes. So this is our first advantage. The second is that we are providing an event-based streaming engine, actually derived from Spark, and we apply this technology to IoT applications. You know, in IoT they need very low latency, but they also need very complicated models on top of streams. That's why we are providing full SQL support and machine learning support on top of streaming events, and we are also using event-driven technology to reduce the latency to five to ten milliseconds. So this is the second reason people choose our product. And today we are announcing 5.0, and I think people will find more reasons to choose our product. >> So you have the SQL compatibility, you have the tooling, and now you have the performance. So kind of the triple threat there. What are customers saying? When you go out and talk with your customers, what's the view of the current landscape? What are they solving right now? What are the key challenges and pain points that customers have today? >> We have customers in more than 12 vertical segments, and in different verticals they have different pain points. Take one example: in financial services, the main pain point is to migrate existing legacy applications to Hadoop. You know, they have accumulated a lot of data, and the performance is very bad using a legacy database, so they need high-performance Hadoop and Spark to speed up the performance of things like reports. In another vertical, like logistics, transportation, and IoT, the pain point is to find a very low-latency streaming engine; at the same time, they need a very complicated programming model to write their applications. In another example, like the public sector, they actually need a very complicated, large-scale search engine, with analytical capability built on top of it, so they can search the results and analyze the results at the same time. >> George: Yuanhao, as always, whenever we get to interview you on theCUBE, you toss out these gems, sort of like, you know, diamonds, like big rocks that under millions of years and incredible pressure have been squeezed down into these incredibly valuable kind of minerals with lots of goodness in them, so I need you to unpack that diamond back into something that we can make sense out of, or I should say, that's more accessible. You've done something that none of the Hadoop distro guys have managed to do, which is to build databases that are not just decision support, but can handle OLTP, can handle operational applications. You've done the streaming; you've done, without even trying any of the other stuff, what even Databricks can't do, which is getting the streaming down to an event at a time. Let's step back from all these amazing things, and tell us: what was the secret sauce that let you build a platform this advanced? >> So actually, we are driven by our customers, and we do see the trends. People are looking for better solutions; you know, there is a lot of pain in setting up a Hadoop cluster to use the Hadoop technology. That's why we found it very meaningful, and also very necessary for us, to build a SQL database on top of Hadoop. Quite a lot of customers on the FSI side asked us to provide ACID distributed transactions on top of Hadoop, because they have to guarantee the consistency of their data. Otherwise they cannot use the technology.
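Concretely, "consistency" here means a batch update to the warehouse either lands completely or not at all. A minimal sketch of the kind of transactional upsert a bank might run against such a warehouse is below, using a standard SQL MERGE; the table names are illustrative, not a customer's actual schema.

```sql
-- Hypothetical nightly refresh of a CRM dimension table. Reports running
-- concurrently must see the table as it was before or after the batch,
-- never in between.
BEGIN TRANSACTION;

MERGE INTO customers AS c
USING customer_updates AS u
    ON c.customer_id = u.customer_id
WHEN MATCHED THEN
    UPDATE SET c.address = u.address,
               c.phone   = u.phone
WHEN NOT MATCHED THEN
    INSERT (customer_id, address, phone)
    VALUES (u.customer_id, u.address, u.phone);

COMMIT;  -- atomicity is what the ACID requirement buys
```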
>> At the risk of interrupting, maybe you can tell us why others have built analytic databases on top of Hadoop, to give the familiar SQL access, and obviously have a desire also to have transactions next to it, so you can inform a transaction decision with the analytics. One of the questions is, how did you combine the two capabilities? I mean, it only took Oracle like 40 years. >> Right, so actually our transaction capability is only for analytics. This OLTP capability is not for short transactional applications; it's for data warehouse kinds of workloads. >> George: Okay, so when you're ingesting. >> Yes, when you're ingesting, when you modify your data in batch, you have to guarantee consistency. So that's the OLTP capability. We are also building another distributed storage system and distributed database that will provide true OLTP capability, meaning you can run concurrent transactions on that database, but we are still developing that software right now. Today our product provides the distributed transaction capability for people to build their warehouse. You know, quite a lot of people believe a data warehouse does not need transaction capability, but we found a lot of people modify the data in their data warehouse. They are loading data continuously into the data warehouse, and tables like the CRM tables, with customer information, can change over time. Every day people need to update or change the data; that's why we have to provide transaction capability in the data warehouse. >> George: Okay, and then tell us also, because the streaming problem is, you know, we're told that roughly two thirds of Spark deployments use streaming as a workload, and the biggest knock on Spark is that it can't process one event at a time; you've got to do a little batch. Tell us some of the use cases that can take advantage of doing one event at a time, and how you solved that problem. >> Yuanhao: Yeah, so the first use case we encountered is anti-fraud, or fraud detection, in FSI. Whenever you swipe your credit card, the bank needs to tell you whether the transaction is fraudulent or not within a few milliseconds. But if you are using Spark Streaming, it will usually take 500 milliseconds, so the latency is too high for that kind of application. That's why we have to provide event-at-a-time, event-driven processing to detect the fraud, so that we can interrupt the transaction within a few milliseconds. So that's one kind of application. Others come from IoT. We have already put our streaming framework into a large manufacturing factory. They have to detect malfunctions of their equipment in a very short time, otherwise it may explode. If you are using Spark Streaming, when you submit your application it will probably take hundreds of milliseconds, and by the time you finish your detection it usually takes a few seconds, which is too long for that kind of application. That's why we need a low-latency streaming engine. But you may say it is okay to use Storm or Flink, right? The problem we found is that they need a very complicated programming model: users are going to solve equations on the streaming events, they need to do FFT transformations, and they are also asking to run linear regressions or neural networks on top of events. That's why we have to provide a SQL interface, and we are also embedding CEP capability into our streaming engine, so that you can use patterns to match the events and send alerts. >> George: So, SQL to get a set of events, and maybe join some in the complex event processing, CEP, to say: does this fit a pattern I'm looking for? >> Yuanhao: Yes.
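For readers unfamiliar with CEP in SQL, the SQL:2016 MATCH_RECOGNIZE clause (implemented, for example, in Flink SQL and Oracle) gives a feel for pattern matching over an event stream. This is a generic sketch, not Transwarp's specific syntax; the stream, columns, and thresholds are invented for illustration.

```sql
-- Alert when a sensor reading ramps up and then spikes: one or more
-- rising readings (UP) followed by a reading above 90 (SPIKE).
SELECT *
FROM sensor_readings
MATCH_RECOGNIZE (
    PARTITION BY sensor_id
    ORDER BY reading_time
    MEASURES FIRST(UP.reading_time) AS ramp_start,
             SPIKE.reading_time     AS spike_time,
             SPIKE.temperature      AS peak_temp
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (UP+ SPIKE)
    DEFINE
        UP    AS UP.temperature > PREV(UP.temperature),
        SPIKE AS SPIKE.temperature > 90
) AS alerts;
```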
>> Okay, and so, then with the lightweight OLTP, that and any other new projects you're looking at, tell us perhaps the new use cases it would be appropriate for. >> Yuanhao: Yeah, so that's our future product, actually. We are going to solve the problem of large-scale OLTP transactions. You know, in China there is such a large population that, in the public sector or in banks, they need to build highly scalable transaction systems that can support very high concurrent transactions at the same time, and that's why we are building this kind of technology. In the past, people just divided transactions across multiple databases, like multiple Oracle instances or multiple MySQL instances. The problem is, if the application is simple, you can very easily divide a transaction over multiple database instances. But if the application is very complicated, especially when an ISV has already written the application against Oracle or a traditional database, it already depends on the transaction system. That's why we have to build the same kind of transaction system, so that we can support their legacy applications, but scale to hundreds of nodes and to millions of transactions per second. >> George: On the transactional stuff? >> Yuanhao: Yes. >> Just correct me if I'm wrong, I know we're running out of time, but I thought Oracle only scales out when you're doing decision support work, not when you're doing OLTP, where it can maybe stretch to ten nodes or something like that. Am I mistaken? >> Yuanhao: They can scale to 16 or even 32 nodes. >> George: For transactional work? >> For transactional work, but that's the theoretical limit. You know, Google F1 and Google Spanner can scale to hundreds of nodes, but the latency is higher than Oracle, because you have to use a distributed protocol to communicate with multiple nodes, so the latency is higher. >> On Google? >> Yes. >> On Google. The latency is higher on Google? >> 'Cause it has to go like all the way to Europe and back. >> Oracle or Google latency, you said? >> Google, because if you are using the two-phase commit protocol, you have to broadcast your request to multiple nodes and then wait for the feedback, so that means you have a much higher latency, but it's necessary to maintain consistency. So in distributed OLTP databases, the latency is usually higher, but the concurrency is much higher, and the scalability is much better.
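As a concrete anchor for the two-phase commit discussion, here is roughly what the protocol's prepare and commit round trips look like when driven through SQL, using MySQL's XA statements as one widely available example; the transaction id, table, and update are illustrative. Each extra round trip to every participant is exactly the added latency described above.

```sql
-- Phase 0: do the work inside an XA transaction branch on each node.
XA START 'txn42';
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
XA END 'txn42';

-- Phase 1 (prepare): every participant must promise it can commit.
-- The coordinator waits for all votes (one network round trip).
XA PREPARE 'txn42';

-- Phase 2 (commit): issued only after all participants voted yes.
-- A second round trip before the client sees success.
XA COMMIT 'txn42';
```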
>> George: So that's a problem; you've stretched beyond what Oracle's done. >> Yuanhao: Yes. Customers can tolerate the higher latency, but they need to scale to millions of transactions per second, and that's why we have to build a distributed database. >> George: Okay, for this reason we're going to have to have you back for like maybe five or ten consecutive segments, you know, maybe starting tomorrow. >> We're going to have to get you back, for sure. Final question for you: what are you excited about, from a technology standpoint, in the landscape? As you look at open source, you're working with Spark, you mentioned Kubernetes, you have microservices, all the cloud. What are you most excited about right now in terms of new technology that's going to help simplify and scale, with low latency, the databases, the software? Because you've got IoT, you've got autonomous vehicles, you have all this data. What are you excited about? >> So actually, this technology we've discussed already solves these problems, but I think the most exciting thing is what we found: there are two trends. The first trend is that it's very exciting to see more computation frameworks coming out, like the AI frameworks TensorFlow, MXNet, and Torch; tons of such machine learning frameworks are coming out, and they are solving different kinds of problems, like facial recognition from video and images, or human-computer interaction using voice and audio. So it's very exciting, I think. And we also found it very exciting to combine these technologies together, and that's why we are using containers. We didn't use YARN, because it cannot support TensorFlow or other frameworks, but if you are using containers and you have a good scheduler, you can schedule any kind of computation framework. So we find it very interesting to have these new frameworks, and we can combine them to solve different kinds of problems. >> John: Thanks so much for coming on theCUBE. It's an operating-system world we're living in now; it's a great time to be a technologist. Certainly the opportunities are out there, and we're breaking it down here inside theCUBE, live in Silicon Valley, with the best tech executives, best thought leaders, and experts here inside theCUBE. I'm John Furrier with George Gilbert. We'll be right back with more after this short break. (upbeat percussive music)