Gian Merlino, Imply.io | AWS Startup Showcase S2 E2
(upbeat music) >> Hello, and welcome to theCUBE's presentation of the AWS Startup Showcase: Data as Code. This is Season 2, Episode 2 of the ongoing series covering exciting startups from the AWS ecosystem, and we're going to talk about the future of enterprise data analytics. I'm your host, John Furrier, and today we're joined by Gian Merlino, CTO and co-founder of Imply.io. Welcome to theCUBE. >> Hey, thanks for having me. >> Building analytics apps with Apache Druid and Imply is the focus of this talk and of your company being showcased today. So thanks for coming on. You guys have been in large-scale streaming data for many, many years, a pioneer going back. This past decade has been the key focus. Druid's unique position in that market has been key, and you guys have been empowering it. Take a minute to explain what you guys are doing over there at Imply. >> Yeah, for sure. So I guess to talk about Imply, I'll talk about Druid first. Imply is an open source based company and Apache Druid is the open source project that the Imply product's built around. So what Druid's all about is it's a database to power analytical applications. And there's a couple things I want to talk about there. The first is, why do we need that? And the second is, why are we good at it? And I'll just give a little flavor of both. So why do we need a database to power analytical apps? It's the same reason we need databases to power transactional apps. I mean, the requirements of these applications are different. Analytical applications are apps where you have tons of data coming in, you have lots of different people wanting to interact with that data, see what's happening both real time and historical. The requirements of that kind of application have sort of given rise to a new kind of database that Druid is one example of. There's others, of course, out there in both the open source and non open source world. And as for what makes Druid really good at it, people often ask, what is Druid's big secret?
How is it so good? Why is it so fast? And I never know what to say to that. I always sort of go to, well, it's just getting all the little details right. It's a lot of pieces that individually need to be engineered. You build up software in layers, you build up a database in layers, just like any other piece of software. And to have really high performance and to do really well at a specific purpose, you kind of have to get each layer right and have each layer have as little overhead as possible. And so it's just a lot of kind of nitty gritty engineering work. >> What's interesting about the trends over the past 10 years in particular, or maybe you can go back 10, 15 years, is that the state of the art for databases was: stream a bunch of data, put it into a pile, index it, interrogate it, get some reports, pretty basic stuff. And then all of a sudden now with cloud you have thousands of databases out there, potentially hundreds of databases living in the wild. So now with Kafka and Kinesis, these kinds of technologies, streaming data's happening in real time, so you don't have time to put it in a pile or index it. You want real time analytics. And so whether it's mobile apps or the Instagrams of the world, this is now what people want in the enterprise. You guys are at the heart of this. Can you talk about that dynamic of getting data quickly at scale? >> So our thinking is that actually both things matter. Realtime data matters but also historical context matters. And the best way to get historical context out of data is to put it in a pile, index it, so to speak, and then the best way to get realtime context on what's happening right now is to be able to operate on these streams. And so one of the things that we do in Druid, I wish I had more time to talk about it, but one of the things that we do in Druid is we kind of integrate this real time processing and this historical processing.
So we actually have a system that we call the historical system that does what you're saying: take all this data, put it in a pile, index it, for all your historical data. And we have a system that we call the realtime system that is pulling data in from things like Kafka and Kinesis, or getting data pushed into it, as the case may be. And this system is responsible for all the data that's recent. Maybe the last hour or two of data will be handled by this system and then the older stuff is handled by the historical system. And our query layer blends these two together seamlessly so a user never needs to think about whether they're querying realtime data or historical data. It's presented as a blended view. >> It's interesting, and you know a lot of people just say, Hey, I don't really have the expertise, and now they're trying to learn it, so their default was to throw it into a data lake. So that brings back that historical. So with the rise of the data lake, you're seeing Databricks and others out there doing very well with data lakes. How do you guys fit into that? 'Cause that makes a lot of sense too, since that looks like historical information. >> So data lakes are great technology. We love that kind of stuff. I would say that with Druid there's actually two very popular patterns. One is, I would say, stream-forward. So stream-focused, where you connect up to something like Kafka and you load data from the stream, and then we will actually take that data, we'll store all the historical data that came from the stream, and sort of blend those two together. And the other pattern that's also very common is the data lake pattern. So you have a data lake and then you're sort of mirroring that data from the data lake into Druid.
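(Editor's illustration.) The blended view described above, where recent data sits in a realtime system, older data sits in a historical system, and the query layer merges both, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Druid's implementation: the names `Event`, `BlendedStore`, `hand_off`, and `handoff_age` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds since some epoch (hypothetical unit)
    value: int


class BlendedStore:
    """Toy model: recent events sit in a realtime buffer, older events
    in a sorted historical list. All names here are illustrative."""

    def __init__(self, handoff_age: int) -> None:
        self.handoff_age = handoff_age
        self.historical: List[Event] = []
        self.realtime: List[Event] = []

    def ingest(self, event: Event) -> None:
        # New events always land in the realtime buffer first.
        self.realtime.append(event)

    def hand_off(self, now: int) -> None:
        # Move events older than handoff_age into the historical list.
        cutoff = now - self.handoff_age
        old = [e for e in self.realtime if e.timestamp < cutoff]
        self.realtime = [e for e in self.realtime if e.timestamp >= cutoff]
        self.historical.extend(old)
        self.historical.sort(key=lambda e: e.timestamp)

    def total(self, start: int, end: int) -> int:
        # The caller never says which store to read: both are scanned
        # for the half-open range [start, end) and the sums are merged.
        def partial(events: Iterable[Event]) -> int:
            return sum(e.value for e in events if start <= e.timestamp < end)

        return partial(self.historical) + partial(self.realtime)
```

The point of the sketch is that `total` gives the same answer before and after `hand_off` runs, so the caller does not need to know where a given event currently lives.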
This is really common when you have a data lake that you want to be able to build an application on top of. You want to say, I have this data in the data lake, I have my table, I want to build an application that has hundreds of people using it, that has really fast response time, that is always online. And so I mirror that data into Druid and then build my app on top of that. >> Gian, take me through the progression of the maturity cycle here. As you look back even a few years, the pioneers and the hardcore streaming data users, doing the data analytics at scale that you guys are doing with Druid, were really a small percentage of the population. And then as the hyperscale became mainstream, it's now in the enterprise. How stable is it? What's the current state of the art relative to the stability and adoption of the techniques that you guys are seeing? >> I think what we're seeing right now at this stage in the game, and this is something that we kind of see at the commercial side of Imply, is that this kind of realization, that you actually can get a lot of value out of data by building interactive apps around it and by allowing people to kind of slice and dice it and play with it, is just kind of getting out there to everybody: that there is a lot of value here and that it is actually very feasible to do with current technology. So I've been working on this problem, just in my own career, for the past decade. 10 years ago, where we were is that even the most high tech of tech companies were like, well, I could sort of see the value. It seems like it might be difficult. And we kind of got from there to the high tech companies realizing that it is valuable and it is very doable. And I think there was a tipping point that I saw a few years ago when Druid and databases like it really started to blow up. And I think now we're seeing that beyond sort of the high tech companies, which is great to see.
>> And a lot of people see the value of the data and they see the application as data as code, which means the application developers really want to have that functionality. Can you share the roadmap for the next 12 months for you guys on what's coming next? What's coming around the corner? >> Yeah, for sure. I mentioned that Druid is an Apache open source community project. We're one member of that community, a very prominent one, but one member, so I'll talk a bit about what we're doing for the Druid project as part of our effort to make Druid better and take it to the next level. And then I'll talk about some of the stuff we're doing on the, I guess, the Druid sort of commercial side. So on the Druid side, stuff that we're doing to make Druid better, take it to the next level, the big thing is something that we really started writing about a few weeks ago, a new multi-stage query engine that we're working on. If you're interested, the full details are on the blog on our website and also on the Apache Druid GitHub, but the short version is we're sort of extending Druid's query engine to support more and varied kinds of queries, with a focus on sort of reporting queries, more complex queries. Druid's core query engine has classically been extremely good at doing rapid fire queries very quickly, so think thousands of queries per second where each query is maybe something that involves a filter and a GROUP BY, like a relatively straightforward query, but we're just doing thousands of them constantly. Where historically folks have not reached for technologies like Druid is really complex, thousand-line SQL queries, complex reporting needs. But people really do need to do both interactive stuff and complex stuff on the same dataset, and so that's why we're building out these capabilities in Druid. And then on the Imply commercial side, the big effort for this year is Polaris, which is our cloud based Druid offering.
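(Editor's illustration.) The "filter and a GROUP BY" query shape mentioned above looks roughly like the following. To keep the example runnable anywhere, it uses Python's built-in `sqlite3` module as a stand-in engine rather than Druid itself, and the `pageviews` table, its columns, and its rows are all made up for the sketch.

```python
import sqlite3

# Hypothetical rows: (day, country, platform, views). Not real data.
ROWS = [
    ("2022-03-01", "US", "ios", 120),
    ("2022-03-01", "US", "android", 80),
    ("2022-03-01", "DE", "ios", 40),
    ("2022-03-02", "US", "ios", 100),
]

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pageviews (day TEXT, country TEXT, platform TEXT, views INTEGER)"
)
conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?, ?)", ROWS)


def views_by_platform(country: str):
    """One filter (WHERE) plus one GROUP BY: the 'straightforward' shape."""
    cursor = conn.execute(
        "SELECT platform, SUM(views) FROM pageviews "
        "WHERE country = ? GROUP BY platform ORDER BY platform",
        (country,),
    )
    return cursor.fetchall()
```

Calling `views_by_platform("US")` filters to one country and sums views per platform. An interactive app would issue many small queries of this shape, one per click, which is the workload contrasted above with thousand-line reporting queries.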
>> Talk about the relationship between Druid and Imply. Share with the folks out there how that works. >> So Druid is, like I mentioned before, Apache Druid, so it's a community based project. It's not a project that is owned by Imply. Some open source projects are sort of owned or sponsored by a particular organization. Druid is not, Druid is an independent project. Imply is the biggest contributor to Druid. So the Imply engineering team is contributing tons of stuff constantly and we're really putting a lot of work in to improve Druid, although it is a community effort. >> You guys are launching a new SaaS service on AWS. Can you tell me about what's happening there, what it's all about? >> Yeah, so we actually launched that a couple weeks ago. It's called Polaris. It's very cool. So historically there's been two ways to get started. You can either get started with Apache Druid, it's open source, you install it yourself, or you can get started with Imply Enterprise, which is our enterprise offering. One of the issues of getting started with Apache Druid is that it is a very complicated distributed database. It's simple enough to run on a single server, but once you want to scale things out, once you get all these things set up, you may want someone to take some of that operational burden off your hands. And on the Imply Enterprise side, it says right there in the name, it's an enterprise product. It's something that may take a little bit of time to get started with. It's not something you can just roll up to with a credit card and sign up for. So Polaris is really about having a cloud product that's sort of designed to be really easy to get started with, really self-service, that kind of stuff.
So kind of providing a really nice getting started experience that does take that maintenance burden and operational burden away from you but is also sort of as easy to get started with as something that's SaaS-based would be. >> So more developer friendly from an onboarding standpoint, classic. >> Exactly. Much more developer friendly is what we're going for with that product. >> So take me through the state of the art of data as code in your mind, 'cause infrastructure as code, DevOps has been awesome, that's cloud scale, we've seen that. Data as Code is a term we coined but it means data's in the developer process. How do you see data being integrated into the workflow for developers in the future? >> Great question. I mean, all kinds of ways. I kind of alluded to this earlier: building analytical applications, building applications based on data and based on letting people do analysis, how valuable it is. And I guess for developers in that context there's kind of two big ways that we sort of see these things getting pushed out. One is developers building apps for other people to use. So think like, I want to build something like Google Analytics, I want to build something that collects my web traffic and then lets the marketing team slice and dice through it and make decisions about how well the marketing's doing. You can build something like that with databases like Druid and products like what we have at Imply. I guess the other way is things that are actually helping developers do their own job. So kind of like use your own product or use it for yourself. And in this world, you kind of have things like... So I'll just talk about one, my favorite use case. I'm really into performance, I've spent the last 10 years of my life working on a high performance database so obviously I'm into this kind of stuff. I love when people use our product to help make their own products faster.
So this concept of performance monitoring and performance management for applications. One thing that I've seen some of our customers and some of our users do that I really love is when you take that performance data of your own app as far as it can possibly go, take it to the next level. I think the basic level of using performance data is I collect performance data from my application deployed out there in the world and I can just use it for monitoring. I can say, okay, my response times are getting high in this region, maybe there's something wrong with that region. One of the very original use cases for Druid was at Netflix, doing performance analysis. Performance analysis is more exciting than monitoring because you're not just understanding that performance is good or bad in whatever region, you're sort of getting very fine grained. You're saying, in this region, on this server rack, for these devices, I'm seeing a degradation or I'm seeing an increase. You can see things like, Apple just rolled out a new version of iOS and on that new version of iOS, my app is performing worse than on the older version. And even though not many devices are on that new version yet, I can kind of see that because I have the ability to get really deep in the data, and then I can start slicing and dicing that more. I can say for those new iOS people, is it all iOS devices? Is it just the iPhone? Is it just the iPad? And that kind of stuff is just one example but it's an example that I really like. >> It's kind of like the data about the data was always good to have for context, you're like data analytics for data analytics to see how it's working at scale. This is interesting because now you're bringing up the classic finding the needle in the haystack of needles, so to speak, where you have so much data out there, like edge cases, edge computing, for instance, where you have devices sending data off. There's so much data coming in, the scale is a big issue.
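(Editor's illustration.) The drill-down described above, first comparing OS versions and then narrowing to one version and splitting by device, amounts to repeated "filter, then group, then aggregate" steps over the same performance data. A minimal Python sketch follows; the sample records, field names, and the `slice_latency` helper are hypothetical and are not taken from any real product or dataset.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# Made-up performance samples for illustration only.
SAMPLES: List[dict] = [
    {"region": "us-east", "os": "iOS 15", "device": "iPhone", "latency_ms": 110},
    {"region": "us-east", "os": "iOS 16", "device": "iPhone", "latency_ms": 300},
    {"region": "us-east", "os": "iOS 16", "device": "iPad", "latency_ms": 110},
    {"region": "eu-west", "os": "iOS 16", "device": "iPhone", "latency_ms": 340},
    {"region": "eu-west", "os": "iOS 15", "device": "iPad", "latency_ms": 100},
]


def slice_latency(
    samples: Sequence[dict], dims: Sequence[str], **filters: str
) -> Dict[Tuple[str, ...], float]:
    """Average latency grouped by `dims`, restricted to rows that
    match every key=value pair in `filters`."""
    groups: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    for sample in samples:
        if all(sample[key] == value for key, value in filters.items()):
            groups[tuple(sample[d] for d in dims)].append(sample["latency_ms"])
    return {key: mean(values) for key, values in groups.items()}
```

A first call such as `slice_latency(SAMPLES, ["os"])` compares OS versions. A follow-up call such as `slice_latency(SAMPLES, ["device"], os="iOS 16")` narrows to the newer version and splits by device, which is the "is it just the iPhone?" question from the conversation.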
This is kind of where you guys seem to be a nice fit: large scale data ingestion, large scale data management, large scale data insights, kind of all rolled into one. Is that kind of-? >> Yeah, for sure. One of the things that we knew we had to do with Druid was we were building it for the sort of internet age and so we knew it had to scale well. So the original use case for Druid, the very first one that we ended up building for, the reason we built it in the first place, is because that original use case had massive scale and we struggled finding something. We were literally trying to do what we see people doing now, which is we were trying to build an app on a massive data set and we were struggling to do it. And so we knew it had to scale to massive data sets. And so a little flavor of kind of how that works is, like I was mentioning earlier, this realtime system and historical system. The realtime system is scalable, it scales out: if you're reading from Kafka, we scale out just like any other Kafka consumer. And then the historical system is all based on what we call segments, which are these files that have a few million rows per file. And a cluster that's really big might have thousands of servers, millions of segments, but it's a design that does scale to these multi-trillion row tables. >> It's interesting, you go back to when you probably started, you had Twitter, Netflix, Facebook, I mean a handful of companies that were at that scale. Now, the trend is you're on this wave where those hyperscalers and/or these unique huge scale app companies are now mainstream enterprise. So as you guys roll out the enterprise version of building analytics applications with Druid and Imply, they've got to get religion on this. And I think it's not hard because it's distributed computing, which they're used to.
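(Editor's illustration.) The segment idea mentioned above, where a large table is stored as many files of bounded size so that work can be spread across servers, can be sketched as follows. The sketch assumes segments are grouped by time chunk and capped at a maximum row count. The function names, the `(timestamp, payload)` row shape, and the tiny sizes are hypothetical choices for this example, not Druid's actual file format or API.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Tuple[int, str]        # (timestamp in seconds, payload); hypothetical
SegmentId = Tuple[int, int]  # (chunk start time, partition number)


def build_segments(
    rows: Sequence[Row], chunk_seconds: int, max_rows: int
) -> Dict[SegmentId, List[Row]]:
    """Group rows into time chunks, then split each chunk into
    segments holding at most max_rows rows."""
    by_chunk: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        chunk_start = row[0] // chunk_seconds * chunk_seconds
        by_chunk[chunk_start].append(row)

    segments: Dict[SegmentId, List[Row]] = {}
    for chunk_start, chunk_rows in by_chunk.items():
        chunk_rows.sort()
        for part, offset in enumerate(range(0, len(chunk_rows), max_rows)):
            segments[(chunk_start, part)] = chunk_rows[offset:offset + max_rows]
    return segments


def segments_for_interval(
    segments: Dict[SegmentId, List[Row]], start: int, end: int, chunk_seconds: int
) -> List[SegmentId]:
    """Ids of segments whose time chunk overlaps the range [start, end)."""
    return sorted(
        seg_id
        for seg_id in segments
        if seg_id[0] < end and seg_id[0] + chunk_seconds > start
    )
```

With this layout a query over a narrow time range only has to touch the segments returned by `segments_for_interval`, and independent segments can be scanned on different servers. That is one plausible reading of why a segment-based design scales, offered here as an assumption rather than a description of Druid internals.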
So how is that enterprise transition going? Because I can imagine people would want it and are just kicking the tires or learning and then trying to put it into action. How are you seeing the adoption of the enterprise piece of it? >> The thing that's driving the interest is for sure doing more and more stuff on the internet, because anything that happens on the internet, whether it's apps or web based, there's more and more happening there, and anything that is connected to the internet, anything that's serving customers on the internet, it's going to generate an absolute mountain of data. And the question is not if you're going to have that much data, you do if you're doing anything on the internet. The only question is what are you going to do with it? So that's I think what drives the interest, is people want to try to get value out of this. And then what drives the actual adoption is, I think, I don't want to necessarily talk about specific folks, but within every industry I would say there's people that are leaders, there's organizations that are leaders, teams that are leaders. What drives a lot of interest is seeing someone in your own industry that has adopted new technology and has gotten a lot of value out of it. So a big part of what we do at Imply is identify those leaders, work with them, and then talk about how it's helped them in their business. And then also I guess the classic enterprise thing, what they're looking for is a sense of stability, a sense of supportability, a sense of robustness, and this is something that comes with maturity. I think that the super high tech companies are comfortable using some open source software that rolled off the presses a few months ago; the big enterprises are looking for something that has corporate backing, they're looking for something that's been around for a while, and I think that Druid and technologies like it are reaching that level of maturity right now.
>> It's interesting that supply chain has come up on the software side. That conversation is happening a lot now. You're hearing about open source being great, but at cloud scale, being able to get the data in there to identify opportunities and also potentially vulnerabilities is a big discussion. Question for you on the cloud native side: how do you see cloud native, cloud scale, with services like serverless, Lambda, edge emerging? It's easier to get into the cloud scale. How do you see the enterprise being hardened out with Druid and Imply? >> I think the cloud stuff is great, we love using it to build all of our own stuff, our product is of course built on other cloud technologies, and I think these technologies build on each other. Like I mentioned earlier, all software is built in layers and cloud architecture is the same thing. What we see ourselves as doing is we're building the next layer of that stack. So we're building the analytics database layer. You saw when people first started doing things in public cloud, with the very first two services that came out, you can get a virtual machine and you can store some data and you can retrieve that data, but there's no real analytics on it, there's just kind of storage and retrieval. And then as time goes on higher and higher levels get built out, delivering more and more value, and then the levels mature as they go up. And so the bottom layers are incredibly mature, the topmost layers are cutting edge, and there's kind of a maturity gradient between those two. And so what we're doing is we're building out one of those layers. >> Awesome, abstraction layers, faster performance, great stuff. Final question for you, Gian, what's your vision for the future? How do you see Imply and Druid going? What's it look like five years from now?
>> I think that for sure it seems like there's two big trends that are happening in the world, and it's going to sound a little bit self serving for me to say it, but I believe in what we're doing here. I'm here 'cause I believe in it: I believe in open source and I believe in cloud stuff. That's why I'm really excited that what we're doing is we're building a great cloud product based on a great open source project. I think that's the kind of company that I would want to buy from. If I wasn't at this company and I was just building something, I would want to buy a great cloud product that's backed by a great open source project. So I think the way I see the industry going, the way I see us going, and what I think would be a great place to end up, just kind of as an engineering world, as an industry, is a lot of these really great open source projects doing things like what Kubernetes is doing with containers, what we're doing with analytics, et cetera. And then really first class, really well done cloud versions of each one of them, and so you can kind of choose: do you want to get down and dirty with the open source or do you want to just kind of have the abstraction of the cloud? >> That's awesome. Cloud scale, cloud flexibility, community, getting down and dirty with open source, the best of both worlds. Great solution. Gian, thanks for coming on and thanks for sharing here in the Showcase. Thanks for coming on theCUBE. >> Thank you too. >> Okay, this is theCUBE Showcase Season 2, Episode 2. I'm John Furrier, your host. Data as Code is the theme of this episode. Thanks for watching. (upbeat music)
Fangjin Yang, Imply.io | CUBE Conversation
(bright upbeat music) >> Welcome, everyone, to this CUBE Conversation featuring Imply. I'm your host, Lisa Martin. Today, we are excited to be joined by FJ Yang, the co-founder and CEO of Imply. FJ, thanks so much for joining us today. >> Lisa, thank you so much for having me. >> Tell me a little bit about yourself and about Imply. >> Yeah, absolutely. So, I started Imply a couple years ago, and before starting the company, I was a technologist. So, I was a software engineer and software developer primarily specializing in distributed systems. And one of the projects I worked on ultimately became kind of the centerpiece behind Imply. Imply, as a company, is a database company. What we do is we provide developers a powerful tool in order to help them build various types of data analytic applications. We're also an open source company, where the company develops a popular open source project called Apache Druid. >> Got it, so database as a service for modern analytics applications. You're also one of the original authors of Apache Druid. Talk to me, gimme a timeline, Druid's 10-year history or so. What's the big picture? What's been the market evolution that you've seen? >> Yeah, absolutely. So, I moved out to Silicon Valley basically to try and work at a startup, 'cause I was enamored with startups and I thought they were the coolest thing ever. So, at one point, I basically joined the smallest startup I could find. It was a startup called Metamarkets, which actually doesn't exist anymore, it was ultimately acquired by Snapchat a couple years ago. But, I was one of the first employees there. And what we were trying to do at the time was build an analytics application, a user-facing application where people could slice and dice various types of data. At the time, the data sets we were working with were like online advertising, digital advertising data sets, which were very large and complex.
And, we really struggled to find a database that could basically power the kind of interactive user experience that we knew we wanted to provide our end customers. So, what ended up happening was we decided to build our own database, and we were a three or five-person shop when we decided to build our own database, and that was Druid. And over time, we saw many other types of companies actually struggle with a similar set of problems, albeit with very different types of use cases and very different types of data sets. And, the Druid community kind of grew and evolved from that. And in my work engaging with the community, what I saw was a market opportunity and a market gap, and that's where Imply formed. >> Let's double click on that. You talked about why you built Druid, the problem you were looking to solve. But, talk to me about the role that Imply has. >> Right. So, Imply is a commercial company. What we do is we build kind of an end-to-end enterprise product around Druid as the core engine. Imply provides deployment, it provides management, it provides security, and it also provides visualization and monitoring pieces around Druid as a core engine. What we aim to do at Imply is really enable developers to build various types of data applications with only the click of a few buttons and by interacting with a simple set of APIs. So, the goal is, if you're a developer, you don't have to think about managing the database yourself, you don't have to think about the operational complexity of the database, but instead, what you do is just work with APIs and build your application. >> So, then what gives Druid its superpower? What makes Druid Druid? >> Yeah, so, Druid, the easiest way to think about it, is it's a really fast calculator, and it's a very fast calculator for a whole lot of data. So, when you have a whole lot of data and you want to crunch numbers very, very quickly, Druid is very good at doing that.
And, people always ask me this question, which is, what makes Druid special? And I always struggle with it, because it's never just one thing, it's actually layers, upon layers, upon layers of engineering. You start with fundamentals of how you maximally optimize the resources of any hardware. So, how do you maximize storage? How do you maximize compute? And then, there's a lot of optimizations around how do you store the data? How do you access that data in a very fast way once it's stored in order to run computations very quickly? So, unfortunately, there's no silver bullet about Druid, but maybe I can summarize in this way. Druid, it's like a search system, and a data warehouse, and a time series database all mixed together. And, that architecture enables it to be very, very quick. And unfortunately, if you don't know what some of the components I'm talking about are, it's hard to describe where the secret sauce is (chuckling). >> Sometimes you want to keep that secret sauce secret. Talk to me about the overall data space. As we see these days, every company is a data company, or if it's not, it needs to be to be successful. Where does Druid fit in the overall data space? Give us that picture of where it fits. >> Yeah, absolutely. So, it's pretty interesting that you see now in the public markets as well as the private markets, some of the hottest unicorns out there are actually data companies. And, I think what people are understanding now for the first time is just how vast and complex the data space is and also how large the market is as well. So for sure, there's many different components and pieces in the data space, and they oftentimes come together to form what's known as a data stack. So, a data stack is basically kind of an architecture that has various systems, and each of these systems is designed to do a certain set of things very, very well.
For example, a company that recently went public is a company called Confluent, which mostly caters towards data transport, so getting data from one place to another. They're built around an open source engine called Apache Kafka. Databricks is another mega unicorn that's going to go public pretty soon. And they're built around an open source project called Spark, which is mainly used for data processing. Where we sit is on the data query side. So, what that means is we're a system in which people can store data and then access that data very, very quickly. And there's other systems that do that, but where our bread and butter is, is if you're building some sort of application, where you have end users that are clicking buttons in order to get access to data, we're a platform that enables the best end user experience. We return queries very, very quickly with a consistent SLA, we immediately visualize data as soon as it's made available, and then we can support many, many, many concurrent end users accessing the system at the same time. >> So, real time. One of the things I think that we learned during the pandemic, one of the many things, is that access to real time data is no longer a nice to have, it is table stakes, and as I said, every company these days is a data company. So with how you describe it, how should people think of Druid versus a data warehouse? >> Yeah. So, that's a great question. And obviously, data warehouses have been around since the 70s. In the B2B space, they're among the largest players that kind of exist in enterprise software. So, it's only natural that when you come up with sort of a new analytics database, people compare it with what they already know, which is a data warehouse. So, a lot of how we think about why we're different than a data warehouse goes back to how I answered the previous question, in that we're focused right now, really, on powering different types of data applications.
Data applications are UIs in which people are really accessing and getting insights from data by clicking buttons versus writing more complex SQL queries. And when you click buttons and you get access to data, what you want in terms of an end user experience, is you want answers to questions to come back almost immediately. So you don't want to click a button and then see a spinning dial that goes on for minutes and minutes before an answer comes back. You basically want results to come back immediately. You want that experience no matter what types of queries that you're issuing or how many people are issuing those queries. If you have thousands, if not tens of thousands of people that are trying to access data at the exact same time, you want to give a consistent user experience like Google, which is one of my favorite products. There're millions of people that use Google, and ask questions and they get their answers back immediately. So we try to provide that same experience, but instead of a generic search engine, what we're doing is we're providing a system that basically answers questions on data and users get a very interactive and fast experience when asking questions. And that's something that I think is very different than what data warehouses are primarily specialized in. Data warehouses are really designed to be systems in which people write very large complex SQL queries that might take minutes or hours sometimes to run. But the experience of using a data warehouse to power an application is not a great one. >> So, I'm just curious, FJ, in the last couple of years, with, as I mentioned before, access to real time data no longer a nice to have, but something business critical for so many industries, did you see any industries in particular in the recent years that were really prime candidates for what Druid can deliver? >> Yeah, that's a great question. 
And you can imagine that the industries that really heavily rely on fast decision making are the ones that are earliest to adopt technologies like this. So, in the security space, and the observability space, as well as working with networking and various forms of backend kind of metrics data, this system has been very popular and it's been popular because people need to triage (indistinct) as they occur, they need to resolve problems, and they also need immediate visibility, as well as very fast queries on data. Another space is online advertising. Online advertising, nowadays is almost entirely programmatic and digital. So, response times are critical in order to make decisions. And that's where Druid was actually born. It was born for advertising before it kind of went everywhere else. We're seeing it more in fraud protection, fraud prevention as well as fraud diagnostics nowadays. We're seeing it in retail as well, which is pretty interesting. And, the goal, of course, is I believe every industry and every vertical needs the capabilities that we provide. So hopefully, we see a whole lot more use cases in the near future. >> Right, it's absolutely horizontal these days. So, 10-year history, you've got a community of thousands, what's the future of Druid? What do you see when you open the crystal ball and look now down the 12 months, 18 months road? >> Yeah. So, I think as a technologist, your goal as a technologist, at least for me, is to try and create technology that has as much applicability as possible and solves problems for as many people as possible. That's always the way I think about it. So, I want to do good engineering and I want to build good systems. And I think the hallmark of a really good system is that you can solve all different types of problems and condense all these different problems, actually into the same set of models and the same set of principles. 
And, a thing that makes me most excited about Druid is the many, many different industries that it's found value in and the many different use cases it's found value in. So, if I were to give a 30,000-foot roadmap, that's what we're trying to do with the next generation of Druid. We're actually doing a pretty major engine upgrade right now, and a pretty major overhaul of the entire system. And the goal of that is to take all the learnings that we've had over the last decade and to create something new that can solve an expanded set of problems that we've heard from the community and from other places as well. >> Excellent. FJ, exciting work that you've done in the last 10 years. Congratulations on that. Looking forward to the roadmap that you talked about. Thanks for sharing what Druid is, the Imply connection, and all the different use cases where it applies. We appreciate your insights. >> Appreciate you having me on the show. Thank you very much. >> My pleasure. For FJ Yang, I'm Lisa Martin. You're watching this CUBE Conversation, the leader in live tech enterprise coverage. (bright upbeat music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Lisa Martin | PERSON | 0.99+ |
thousands | QUANTITY | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Lisa | PERSON | 0.99+ |
Snapchat | ORGANIZATION | 0.99+ |
10-year | QUANTITY | 0.99+ |
18 months | QUANTITY | 0.99+ |
FJ Yang | PERSON | 0.99+ |
three | QUANTITY | 0.99+ |
Imply | ORGANIZATION | 0.99+ |
Confluent | ORGANIZATION | 0.99+ |
12 months | QUANTITY | 0.99+ |
30,000 foot | QUANTITY | 0.99+ |
Druid | TITLE | 0.99+ |
each | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Fangjin Yang | PERSON | 0.99+ |
first time | QUANTITY | 0.98+ |
Today | DATE | 0.98+ |
ORGANIZATION | 0.98+ | |
today | DATE | 0.98+ |
millions of people | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
Imply.io | ORGANIZATION | 0.97+ |
Metamarkets | ORGANIZATION | 0.96+ |
five-person | QUANTITY | 0.96+ |
first employees | QUANTITY | 0.94+ |
tens of thousands of people | QUANTITY | 0.94+ |
pandemic | EVENT | 0.94+ |
last couple of years | DATE | 0.91+ |
FJ | PERSON | 0.91+ |
70s | DATE | 0.89+ |
one thing | QUANTITY | 0.89+ |
Databricks | ORGANIZATION | 0.88+ |
one point | QUANTITY | 0.87+ |
Druid | PERSON | 0.84+ |
couple years ago | DATE | 0.81+ |
last decade | DATE | 0.75+ |
Apache Druid | ORGANIZATION | 0.73+ |
Conversation | EVENT | 0.73+ |
Apache | ORGANIZATION | 0.72+ |
last 10 years | DATE | 0.72+ |
double | QUANTITY | 0.69+ |
Spark | TITLE | 0.66+ |
my favorite products | QUANTITY | 0.62+ |
CUBE Conversation | TITLE | 0.58+ |
minutes | QUANTITY | 0.54+ |
minute | QUANTITY | 0.51+ |
Kafka | TITLE | 0.41+ |
CUBE Conversation | EVENT | 0.31+ |