Tim Yocum, Influx Data | Evolving InfluxDB into the Smart Data Platform

(soft electronic music) >> Okay, we're back with Tim Yocum who is the Director of Engineering at InfluxData. Tim, welcome, good to see you. >> Good to see you, thanks for having me. >> You're really welcome. Listen, we've been covering opensource software on theCUBE for more than a decade and we've kind of watched the innovation from the big data ecosystem, the cloud is being built out on opensource, mobile, social platforms, key databases, and of course, InfluxDB. And InfluxData has been a big consumer and contributor of opensource software. So my question to you is where have you seen the biggest bang for the buck from opensource software? >> So yeah, you know, Influx really, we thrive at the intersection of commercial services and opensource software, so OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service from our core storage engine technologies to web services, templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants. And like you've mentioned, even better, we contribute a lot back to the projects that we use, as well as our own product InfluxDB. >> But I got to ask you, Tim, because one of the challenges that we've seen, in particular, you saw this in the heyday of Hadoop, the innovations come so fast and furious, and as a software company, you got to place bets, you got to commit people, and sometimes those bets can be risky and not pay off. So how have you managed this challenge? >> Oh, it moves fast, yeah. That's a benefit, though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often; we try a lot of things. You know, you look at Kubernetes, for example. That ecosystem is driven by thousands of intelligent developers, engineers, builders. They're adding value every day, so we have to really keep up with that.
And as the stack changes, we try different technologies, we try different methods. And at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >> So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1500 CIOs, IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly along with cloud, but really, Kubernetes is just, you know, still up and to the right consistently, even with the macro headwinds and all of the other stuff that we're sick of talking about. So what do you do with Kubernetes in the platform? >> Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere at AWS, Azure, Google Cloud. It allows us to have a consistent experience across three different cloud providers and we can manage that in code. So our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google, and figure out how to deliver services on those three clouds with all of their differences. >> Just a followup on that, so I presume, it sounds like, there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, wherever. Is that correct? >> Yeah, so we've basically built, more or less, what's now called platform engineering; that's the new, hot phrase.
Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on and they only have to learn one way of deploying their application, managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx cloud. >> And I know I'm taking a little bit of a tangent, but is that, I'll call it a PaaS layer, if I can use that term, are there specific attributes to InfluxDB or is it kind of just generally off-the-shelf PaaS? Is there any purpose-built capability there that is value-add or is it pretty much generic? >> So we really look at things through a build-versus-buy lens. Some things we want to leverage: cloud provider services, for instance, Postgres databases for metadata, perhaps. Get that off of our plate, let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can, as an SRE group, as an ops team, manage with very few people, really, and we can stamp out clusters across multiple regions in no time. >> So sometimes you build, sometimes you buy it. How do you make those decisions and what does that mean for the platform and for customers? >> Yeah, so what we're doing is what everybody else will do. We're looking for trade-offs that make sense. We really want to protect our customers' data, so we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team and of course, for our customers; you don't even see that. But we don't want to try to reinvent the wheel, like I had mentioned with a SQL datasource for metadata, perhaps.
Let's build on top of what these three large cloud providers have already perfected and we can then focus on our platform engineering and we can help our developers then focus on the InfluxData software, the Influx cloud software. >> So take it to the customer level. What does it mean for them, what's the value that they're going to get out of all these innovations that we've been talking about today, and what can they expect in the future? >> So first of all, people who use the OSS product are really going to be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over four billion series keys that people have stored, so there's a proven ability to scale. Now in terms of the opensource software and how we've developed the platform, you're getting a highly available, high-cardinality time-series platform. We manage it and really, as I had mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in realtime. We deploy to our platform every day, repeatedly, all the time. And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change, new features, better ways of doing deployments, safer ways of doing deployments. All of that happens behind the scenes and like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx cloud platform, you really are able to take advantage of new features immediately. We roll things out every day and as those things go into production, you have the ability to use them. And so in the end, we want you to focus on getting actual insights from your data instead of running infrastructure, you know, let us do that for you. >> That makes sense.
Are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? Is it, I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >> Yeah, it really is. It's a little bit of both. Any engineer will say, "Well, it depends." So cloud-native technologies are really the hot thing, IoT, industrial IoT especially. People want to just shove tons of data out there and be able to do queries immediately and they don't want to manage infrastructure. What we've started to see are people that use the cloud service as their datastore backbone and then they use edge computing with our OSS product to ingest data from, say, multiple production lines, and down-sample that data, send the rest of that data off to Influx cloud where the heavy processing takes place. So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows for people to really get out of the business of trying to manage that big data, have us take care of that. And, of course, as we change the platform, end users benefit from that immediately. >> And so obviously you've taken away a lot of the heavy lifting for the infrastructure. Would you say the same things about security, especially as you go out to IoT at the edge? How should we be thinking about the value that you bring from a security perspective? >> We take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure, that the data that we store is kept private. It's, of course, always a concern, you see in the news all the time, companies being compromised. That's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit, whether it's at rest, is always kept secure, is only viewable by you.
You look at things like software bill of materials, if you're running this yourself, you have to go vet all sorts of different pieces of software and we do that, you know, as we use new tools. That's something, that's just part of our jobs to make sure that the platform that we're running has fully vetted software. And you know, with opensource especially, that's a lot of work, and so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to but that is really just part of a day in the life for folks like us that are building platforms. >> And that's key, especially when you start getting into the, you know, that we talk about IoT and the operations technologies, the engineers running that infrastructure. You know, historically, as you know, Tim, they would air gap everything; that's how they kept it safe. But that's not feasible anymore. Everything's-- >> Can't do that. >> connected now, right? And so you've got to have a partner that is, again, take away that heavy lifting to R&D so you can focus on some of the other activities. All right, give us the last word and the key takeaways from your perspective. >> Well, you know, from my perspective, I see it as a two-lane approach, with Influx, with any time-series data. You've got a lot of stuff that you're going to run on-prem. What you had mentioned, air gapping? Sure, there's plenty of need for that. But at the end of the day, people that don't want to run big datacenters, people that want to entrust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives and that's what we're prepared for. >> Tim, really appreciate you coming to the program. Great stuff, good to see you. >> Thanks very much, appreciate it. >> Okay in a moment, I'll be back to wrap up today's session. You're watching theCUBE.
(soft electronic music)
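
The edge pattern Tim describes in this conversation (ingest high-rate readings from production lines locally, down-sample them, and forward the reduced data to the cloud) can be illustrated with a small sketch. This is not InfluxDB code: the `downsample` function, the sample readings, and the choice of a simple windowed mean are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-width time windows.

    Hypothetical helper for illustration only; a real deployment would
    use the database's own down-sampling or task features.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    # One averaged point per window, ordered by window start time.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up per-second sensor readings: (unix seconds, temperature).
raw = [(0, 20.0), (1, 22.0), (2, 21.0), (60, 30.0), (61, 32.0)]
per_minute = downsample(raw, window_seconds=60)
print(per_minute)
```

Only the per-window averages would then need to leave the edge device, which is the bandwidth and storage saving the pattern is after.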

Published Date : Nov 8 2022

SUMMARY :

the Director of Engineering at InfluxData. So my question to you back to the projects that we use, in the heyday of Hadoop, And at the end of the day, we and all of the other stuff and the way we were and out to the edge, wherever. And so that just gets all of that we can manage with for the platform and for customers? and we can then focus on that they're going to get And so in the then, we want you to focus about in the evolution of InfluxDB, and down-sample that data, that you bring from a that the data that you have, and the operations technologies, and the key takeaways that data over to the cloud. you coming to the program. to wrap up today's session.

ENTITIES

Entity	Category	Confidence
Tim Yocum	PERSON	0.99+
Tim	PERSON	0.99+
InfluxData	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
New York City	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
both	QUANTITY	0.99+
two-lane	QUANTITY	0.99+
thousands	QUANTITY	0.99+
tomorrow	DATE	0.98+
today	DATE	0.98+
more than a decade	QUANTITY	0.98+
270 terabytes	QUANTITY	0.98+
InfluxDB	TITLE	0.98+
one	QUANTITY	0.97+
about 1500 CIOs	QUANTITY	0.97+
Influx	ORGANIZATION	0.96+
Azure	ORGANIZATION	0.94+
one way	QUANTITY	0.93+
single server	QUANTITY	0.93+
first	QUANTITY	0.92+
PaaS	TITLE	0.92+
Kubernetes	TITLE	0.91+
Enterprise Technology Research	ORGANIZATION	0.91+
Kubernetes	ORGANIZATION	0.91+
three clouds	QUANTITY	0.9+
ETR	ORGANIZATION	0.89+
tons of data	QUANTITY	0.87+
rsus	ORGANIZATION	0.87+
Hadoop	TITLE	0.85+
over four billion series	QUANTITY	0.85+
three large cloud providers	QUANTITY	0.74+
three different cloud providers	QUANTITY	0.74+
theCUBE	ORGANIZATION	0.66+
SQL	TITLE	0.64+
opensource	ORGANIZATION	0.63+
intelligent developers	QUANTITY	0.57+
POSTGRES	ORGANIZATION	0.52+
earllier	ORGANIZATION	0.5+
Azure	TITLE	0.49+
InfluxDB	OTHER	0.48+
cloud	TITLE	0.4+

Brian Gilmore, Influx Data | Evolving InfluxDB into the Smart Data Platform

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database was, for many use cases, a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken, you know, every hour, every minute, every second, you know, down to the millisecond, and how the world was moving toward realtime or near-realtime data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support realtime data in emerging use cases in IoT and other use cases. And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program, we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands and data, and specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and specific tools. And in this program, you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB.
Now, these innovations, they evolve the idea of time series analysis by dramatically increasing the granularity of time series data by compressing the historical time slices, if you will, from, for example, minutes down to milliseconds. And at the same time, enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis-Georgiou, who is a developer advocate at InfluxData. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum; he's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks, Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB, you know, needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think, for us, it's been about staying ahead of the market. I think, you know, if we think about what our customers are coming to us with now, you know, related to requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, we're sort of staying ahead of the market as well and sort of anticipating those future needs. So, you know, this is really a transparent change for our customers.
I mean, I think we'll be adding new capabilities over time that sort of leverage this new engine, but, you know, initially the customers who are using us are gonna see just great improvements in performance, you know, especially those that are working at the top end of the workload scale, you know, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I mean, I think we had our open source, we had an enterprise product, you know, and sort of shifting that technology, especially the open source code base, to a service basis where we were hosting it through, you know, multiple cloud providers, that was a long journey. I guess, you know, phase one was, you know, we wanted to host enterprise for our customers, so we sort of created a service where we just managed and ran our enterprise product for them. You know, phase two of this cloud effort was to optimize for, like, multi-tenant, multi-cloud, to be able to host it in a truly, like, SaaS manner where we could use, you know, some type of customer activity or consumption as the pricing vector, you know. And that was sort of the birth of the real first InfluxDB Cloud, you know, which has been really successful.
We've seen, I think, like 60,000 people sign up and we've got tons and tons of both enterprises as well as, like, new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis, you know, and having that sort of big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has, you know, pointed us in a really good direction in terms of making sure we're continuously improving that and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is and what does it take to make that shift from, you know, time series, you know, specialist to real-time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than like a shift or a pivot. You know, time series data is always gonna be fundamental and sort of the basis of the solutions that we offer our customers, and then also the ones that they're building on the sort of raw APIs of our platform themselves. You know, the time series market is one that we've worked diligently to lead. I mean, I think when it comes to, like, metrics, especially like sensor data and app and infrastructure metrics, if we're being honest, though, I think our user base is well aware that the way we were architected was much more towards those sort of, like, backwards-looking historical type analytics, which are key for troubleshooting and making sure you don't, you know, run into the same problem twice.
But, you know, we had to ask ourselves, like, what can we do to better handle those queries from a performance and a, you know, time-to-response standpoint on the queries, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can, like, limit that window down to minutes and then seconds. And now with this new engine, we're really starting to talk about a query window that could be, like, returning results in, you know, milliseconds of time since it hit the ingest queue. And that's really getting to the point where as your data is available, you can use it and you can query it, you can visualize it, and you can do all those sort of magical things with it, you know? And I think getting all of that to a place where we're saying, like, yes to the customer on, you know, all of the real-time queries, the multiple-language query support, you know, it was hard, but we're now at a spot where we can start introducing that to, you know, a limited number of customers, strategic customers and strategic availability zones to start. But, you know, everybody over time. >>So you're basically going from what happened, and you can still do that, obviously, to what's happening now, in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? I mean, like in the moment right now, whether you're talking about, like, a millisecond ago or a minute ago, you know, that's pretty much right now, I think, for most people, especially in these use cases where you have other sort of components of latency induced by the underlying data collection, the architecture, the infrastructure, the, you know, the devices and, you know, the sort of highly distributed nature of all of this. So yeah, I mean, getting a customer or a user to be able to use the data as soon as it is available is what we're after here.
>>I always thought, you know, I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, you know, and that's one of the things that really triggered us to know that we were heading in the right direction, is just how many sort of operational customers we have. You know, everything from, like, aerospace and defense. We've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor, you know, and if we can satisfy their sort of demands for, like, real-time historical perspective, that's awesome. I think what we're gonna do here is we're gonna start to, like, edge into the real time that they're used to in terms of, you know, the millisecond response times that they expect of their control systems, certainly not their historians and databases. >>Are these innovations available to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. You know, I think we want to emphasize that that's for now; our goal is to get our latest and greatest and our best to everybody over time, of course. You know, one of the things we had to do here was, like, we doubled down on sort of our commitment to open source and availability. So, like, anybody today can take a look at the libraries on our GitHub and, you know, can inspect them and even can try to, you know, implement or execute some of it themselves in their own infrastructure. You know, we're committed to bringing our sort of latest and greatest to our cloud customers first for a couple of reasons. Number one, you know, there are big workloads and they have high expectations of us.
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, like how the system itself is performing. And so just, you know, being careful, maybe a little cautious in terms of how big we go with this right away, it sort of both limits, you know, the risk of, you know, any issues that can come with new software rollouts (we haven't seen anything so far), and it also does give us the opportunity to have, like, meaningful conversations with a small group of users who are using the products. But once we get through that and they give us two thumbs up on it, it'll be like, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation, you know, using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. You know, this is a new, very sort of popular systems language, you know, it's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to, like, deliver it in a way that is, you know, something we can inspect very closely, but then also rely on the fact that it's going to behave well, even if it does find error conditions. I mean, we've loved working with Go and, you know, a lot of our libraries will continue to be sort of implemented in Go, but, you know, when it came to this particular new engine, you know, for that power, performance, and stability, Rust was critical. On top of that, like, we've also integrated Apache Arrow and Apache Parquet for persistence.
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our Time-Structured Merge Trees, this is a big break from that, you know, Arrow on the sort of in-memory side and then Parquet on the on-disk side. It allows us to present, you know, a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large, you know, historical sort of bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem sort of popping up around Parquet in terms of the machine learning community, you know, and getting that all to work, we had to glue it together with Arrow Flight. That's sort of what we're using as our RPC component. You know, it handles the orchestration and the transportation of the columnar data. Now we're moving to, like, a true columnar database model for this version of the engine, you know, and it removes a lot of overhead for us in terms of having to manage all that serialization, the deserialization, and, you know, to that point again, like, blurring that line between real-time and historical data. It's, you know, highly optimized for both streaming micro-batch and then batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mentioned Rust. It's been around for a long time, but its popularity is, you know, really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody sort of watching just to, like, take a look at what we're offering in terms of early access and beta programs. I mean, if you wanna participate or if you wanna work sort of in terms of early access with the new engine, please reach out to the team.
I'm sure, you know, there's a lot of communications going out and, you know, it'll be highly featured on our website, you know, but reach out to the team. Believe it or not, like, we have a lot more going on than just the new engine. And so there are also other programs, things we're offering to customers in terms of the user interface, data collection and things like that. And, you know, if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to because we can flip a lot of stuff on, especially in cloud, through feature flags. But if there's something new that you wanna try out, we'd just love to hear from you. And then, you know, our goal would be that as we give you access to all of these new cool features that, you know, you would give us continuous feedback on these products and services, not only, like, what you need today, but then what you'll need tomorrow to sort of build the next versions of your business. Because, you know, the whole database ecosystem, as it expands out into, you know, this vertically oriented stack of cloud services and enterprise databases and edge databases, you know, it's gonna be what we all make it together, not just, you know, those of us who are employed by InfluxDB. And then finally, I would just say, please, like, watch Anais's and Tim's sessions. Like, these are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there are no better takes, like, honestly, on the sort of technical details of this than theirs, especially when it comes to, like, the value that these investments will bring to our customers and our communities. So I encourage you to, you know, pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks, Dave.
It was awesome. Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time, a really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't want to miss this.
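
Brian's point about moving to a columnar database model can be made concrete with a toy example. The sketch below uses only plain Python lists and dicts; it does not use Apache Arrow or Parquet, and the `to_columns` helper and sample readings are hypothetical. It only shows why a column-oriented layout suits analytic queries that touch one field across many points.

```python
# Row-oriented layout: one record per data point, all fields together.
rows = [
    {"time": 1, "sensor": "a", "temp": 20.0},
    {"time": 2, "sensor": "b", "temp": 21.0},
    {"time": 3, "sensor": "a", "temp": 19.0},
]


def to_columns(records):
    """Pivot a non-empty list of row dicts into a dict of column lists.

    Hypothetical helper for illustration; columnar formats such as
    Arrow and Parquet add typing, encoding, and compression on top.
    """
    columns = {key: [] for key in records[0]}
    for record in records:
        for key in columns:
            columns[key].append(record[key])
    return columns


columns = to_columns(rows)

# An aggregate over one field now scans a single list instead of
# visiting every field of every row.
avg_temp = sum(columns["temp"]) / len(columns["temp"])
print(columns["temp"], avg_temp)
```

Because each column holds values of a single type side by side, a columnar engine can also compress and scan it more efficiently than mixed-type rows, which is the general motivation behind formats like Arrow (in memory) and Parquet (on disk).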

Published Date : Nov 8 2022

ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Tim Yokum	PERSON	0.99+
Dave	PERSON	0.99+
Dave Valante	PERSON	0.99+
Brian	PERSON	0.99+
Tim	PERSON	0.99+
60,000 people	QUANTITY	0.99+
Influx	ORGANIZATION	0.99+
today	DATE	0.99+
Bryan	PERSON	0.99+
two	QUANTITY	0.99+
twice	QUANTITY	0.99+
both	QUANTITY	0.99+
first	QUANTITY	0.99+
three years ago	DATE	0.99+
Influx DB	TITLE	0.99+
Influx Data	ORGANIZATION	0.99+
tomorrow	DATE	0.98+
Apache	ORGANIZATION	0.98+
Anna East Dos Georgio	PERSON	0.98+
IOT	ORGANIZATION	0.97+
one	QUANTITY	0.97+
In Flux Data	ORGANIZATION	0.96+
Influx	TITLE	0.95+
The Cube	ORGANIZATION	0.95+
tons	QUANTITY	0.95+
Cube	ORGANIZATION	0.94+
Rust	TITLE	0.93+
both enterprises	QUANTITY	0.92+
iot T	TITLE	0.91+
second	QUANTITY	0.89+
Go	TITLE	0.88+
two thumbs	QUANTITY	0.87+
Anna East	PERSON	0.87+
Parque	TITLE	0.85+
a minute ago	DATE	0.84+
Influx State	ORGANIZATION	0.83+
Dos Georgio	ORGANIZATION	0.8+
influx data	ORGANIZATION	0.8+
Apache Arrow	ORGANIZATION	0.76+
GitHub	ORGANIZATION	0.75+
Bryan	LOCATION	0.74+
phase one	QUANTITY	0.71+
past May	DATE	0.69+
Go	ORGANIZATION	0.64+
number two	QUANTITY	0.64+
millisecond ago	DATE	0.61+
InfluxDB	TITLE	0.6+
Time	TITLE	0.55+
industrial	QUANTITY	0.54+
phase two	QUANTITY	0.54+
Parque	COMMERCIAL_ITEM	0.53+
couple	QUANTITY	0.5+
time	TITLE	0.5+
things	QUANTITY	0.49+
TSI	ORGANIZATION	0.4+
Arrow	TITLE	0.38+
PARQUE	OTHER	0.3+

Tim Yocum, Influx Data

(upbeat music) >> Okay, we're back with Tim Yocum, who is the Director of Engineering at Influx Data. Tim, welcome. Good to see you. >> Good to see you. Thanks for having me. >> You're really welcome. Listen, we've been covering open source software on the Cube for more than a decade, and we've kind of watched the innovation from the big data ecosystem, the cloud is being built out on open source, mobile, social platforms, key databases, and of course Influx DB, and Influx Data has been a big consumer and contributor of open source software. So my question to you is where have you seen the biggest bang for the buck from open source software? >> So, yeah, you know, Influx, really, we thrive at the intersection of commercial services and open source software. So OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services, templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants. And like you've mentioned, even better, we contribute a lot back to the projects that we use as well as our own product, Influx DB. >> You know, but I got to ask you, Tim, because one of the challenges that we've seen, in particular, you saw this in the heyday of Hadoop. The innovations come so fast and furious, and as a software company, you got to place bets, you got to, you know, commit people, and sometimes those bets can be risky and not pay off. How have you managed this challenge? >> Oh, it moves fast, yeah. That's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You know, you look at Kubernetes for example. That ecosystem is driven by thousands of intelligent developers, engineers, builders. They're adding value every day. So we have to really keep up with that. 
And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >> So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1500 CIOs, IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has kind of, it's been off the charts and seen the most significant adoption and velocity, particularly, you know, along with cloud. But really Kubernetes is just, you know, still up and to the right consistently, even with, you know the macro headwinds and all of the other stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >> Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere, at AWS, Azure, Google Cloud. It allows us to have a consistent experience across three different cloud providers, and we can manage that in code. So our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google, and figure out how to deliver services on those three clouds with all of their differences. >> Just a follow up on that, is it, now, so I presume it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and up to the edge, you know, wherever. Is that, is that correct? >> Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase. 
You know, Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application, managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >> Yeah, and I know I'm taking a little bit of a tangent, but is that, I'll call it a PaaS layer if I can use that term, are there specific attributes to Influx DB, or is it kind of just generally off the shelf PaaS? You know, is there any purpose built capability there that is value add, or is it pretty much generic? >> So we really look at things through a build versus buy lens. Some things we want to leverage, cloud provider services for instance, Postgres databases for metadata perhaps, get that off of our plate, let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code that we can, as an SRE group, as an ops team, that we can manage with very few people really, and we can stamp out clusters across multiple regions in no time. >> So how, so sometimes you build, sometimes you buy it. How do you make those decisions, and what does that mean for the platform and for customers? >> Yeah, so what we're doing is, it's like everybody else will do. We're looking for trade-offs that make sense. You know, we really want to protect our customers' data. So we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team. And of course for customers, you don't even see that, but we don't want to try to reinvent the wheel. Like I had mentioned with SQL data storage for metadata perhaps. 
Let's build on top of what these three large cloud providers have already perfected, and we can then focus on our platform engineering, and we can have our developers then focus on the Influx Data software, Influx Cloud software. >> So take it to the customer level. What does it mean for them? What's the value that they're going to get out of all these innovations that we've been talking about today? And what can they expect in the future? >> So first of all, people who use the OSS product are really going to be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you. But then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored. So there's a proven ability to scale. Now, in terms of the open source software, and how we've developed the platform, you're getting a highly available, high cardinality time series platform. We manage it, and really as I mentioned earlier, we can keep up with the state of the art. We keep reinventing. We keep deploying things in real time. We deploy to our platform every day repeatedly, all the time. And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change, new features, better ways of doing deployments, safer ways of doing deployments. All of that happens behind the scenes. And we had mentioned earlier Kubernetes, I mean that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day. And as those things go into production, you have the ability to use them. And so in the end, we want you to focus on getting actionable insights from your data instead of running infrastructure. You know, let us do that for you. 
>> And that makes sense, but so is the, are the innovations that we're talking about in the evolution of Influx DB, do you see that as sort of a natural evolution for existing customers? Is it, I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >> Yeah, it really is. It's a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially, people want to just shove tons of data out there and be able to do queries immediately, and they don't want to manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from say multiple production lines and down-sample that data, send the rest of that data off to Influx Cloud where the heavy processing takes place. So really us being in all the different clouds and iterating on that, and being in all sorts of different regions allows for people to really get out of the business of trying to manage that big data, have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >> And so obviously, taking away a lot of the heavy lifting for the infrastructure, would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >> Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure, that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. You know, that's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit, whether it's at rest, is always kept secure, is only viewable by you. 
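The edge pattern described above, where data from several production lines is ingested locally and down-sampled before being sent on to the cloud, can be sketched roughly as follows. This is a minimal illustration in plain Python, not InfluxDB's API: the function name `downsample`, the 60-second window, and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-width time windows.

    points: iterable of (unix_timestamp_seconds, value) pairs.
    Returns a list of (window_start, mean_value) pairs sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw sensor readings collected at the edge.
raw = [(0, 20.0), (10, 22.0), (59, 24.0), (60, 30.0), (90, 32.0)]

# One averaged point per minute is what would be forwarded upstream.
summary = downsample(raw, 60)
print(summary)
```

Averaging is only one possible aggregate; a real deployment might keep min, max, or percentiles per window depending on what the downstream queries need.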
You look at things like software bill of materials. If you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that, you know, as we use new tools. That's something that's just part of our jobs, to make sure that the platform that we're running has fully vetted software. And with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to. But that is really just part of a day in the life for folks like us that are building platforms. >> Yeah, and that's key. I mean, especially when you start getting into the, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure. You know, historically, as you know, Tim, they would air gap everything. That's how they kept it safe. But that's not feasible anymore. Everything's >> Can't do that. >> connected now, right? And so you've got to have a partner that can, again, take away that heavy lifting to R and D so you can focus on some of the other activities. All right. Give us the last word and the key takeaways from your perspective. >> Well, you know, from my perspective, I see it as a two lane approach. With Influx, with any time series data, you know, you've got a lot of stuff that you're going to run on-prem. What you mentioned, air gapping, sure there's plenty of need for that, but at the end of the day, people that don't want to run big data centers, people that want to entrust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >> Tim, really appreciate you coming to the program. Great stuff. Good to see you. >> Thanks very much. Appreciate it. >> Okay, in a moment, I'll be back to wrap up today's session. 
You're watching the Cube. (gentle music)

Published Date : Oct 18 2022

ENTITIES

Entity	Category	Confidence
Tim Yoakum	PERSON	0.99+
Tim	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
Tim Yocum	PERSON	0.99+
Google	ORGANIZATION	0.99+
New York City	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
today	DATE	0.99+
both	QUANTITY	0.99+
two lane	QUANTITY	0.99+
Influx	ORGANIZATION	0.98+
Azure	ORGANIZATION	0.98+
270 terabytes	QUANTITY	0.98+
about 1500 CIOs	QUANTITY	0.97+
tomorrow	DATE	0.97+
more than a decade	QUANTITY	0.97+
over 4 billion	QUANTITY	0.97+
one	QUANTITY	0.97+
tons of data	QUANTITY	0.95+
Influx DB	TITLE	0.95+
Kubernetes	TITLE	0.94+
Enterprise Technology Research	ORGANIZATION	0.93+
first	QUANTITY	0.93+
single server	QUANTITY	0.92+
SQL	TITLE	0.91+
three	QUANTITY	0.91+
Postgres	ORGANIZATION	0.91+
Influx Cloud	TITLE	0.9+
thousands of intelligent developers	QUANTITY	0.9+
ETR	ORGANIZATION	0.9+
Hadoop	TITLE	0.9+
three large cloud providers	QUANTITY	0.81+
three clouds	QUANTITY	0.79+
Influx DB	ORGANIZATION	0.74+
cloud	QUANTITY	0.62+
Google Cloud	ORGANIZATION	0.56+
Cube	PERSON	0.53+
Cube	COMMERCIAL_ITEM	0.52+
Cloud	TITLE	0.45+
Influx	TITLE	0.36+

Evan Kaplan, InfluxData | AWS re:invent 2022

>>Hey everyone. Welcome to Las Vegas. The Cube is here, live at the Venetian Expo Center for AWS re:Invent 2022. Amazing attendance. This is day one of our coverage. Lisa Martin here with Dave Vellante. Dave, it's great to see so many people back. We've been having great conversations already. We have wall to wall coverage for the next three and a half days. When we talk to companies, customers, every company has to be a data company. And one of the things I think we learned in the pandemic is that access to real time data and real time analytics is no longer a nice to have. It is a differentiator and a competitive advantage. >>All about data. I mean, you know, I love the topic and it's got so many dimensions and such texture, can't get enough of data. >>I know we have a great guest joining us. One of our alumni is back, Evan Kaplan, the CEO of Influx Data. Evan, thank you so much for joining us. Welcome back to the Cube. >>Thanks for having me. It's great to be here. >>So here we are, day one. I was telling you before we went live, we're nice and fresh hosts. Talk to us about what's new at Influx since the last time we saw you at re:Invent. >>That's great. So first of all, we should acknowledge what's going on here. This is pretty exciting. I know there was a show last year, but this feels like the first post-Covid show. A lot of energy, a lot of attention despite a difficult economy. In terms of, you know, you guys were commenting in the lead-in on big data. I think, you know, if we were to talk about big data five, six years ago, what would we be talking about? We'd be talking about Hadoop, we'd be talking about Cloudera, we'd be talking about Hortonworks, we'd be talking about big data lakes, data stores. I think what's happened is this interesting dynamic of, let's call it if you will, the secularization of data, in which it breaks into different fields, almost a taxonomy. 
You've got this set of search data, you've got this observability data, you've got graph data, you've got document data, and now you have time series data. And what you're seeing in the market is this incredible capability by developers, and a mostly open source dynamic driving this, this incredible capability of developers to assemble data platforms that aren't unicellular, that aren't just built on Hadoop or Oracle or Postgres or MySQL, but in fact represent different data types. So for us, what we care about is time series. We care about anything that happens in time, where time can be the primary measurement, which if you think about it, is a huge proportion of real data. Because when you think about what drives AI, you think about what happened, what happened, what happened, what happened, what's going to happen. That's the functional thing. But what happened is always defined by a period, a measurement, a time. And so what's new for us is we've developed this new open source engine called IOx. And so it's basically a refresh of the whole database, a columnar database that uses Apache Arrow, Parquet and DataFusion and turns it into a super powerful real time analytics platform. It was already pretty real time before, but it's increasingly so now, and it adds SQL capability and infinite cardinality. And so it handles bigger data sets, but importantly, not just bigger but faster data. So that's primarily what we're talking about at the show. >>So how does that affect where you can play in the marketplace? I mean, how does it affect your total available market? >>Great question. >>Your customer opportunities. >>I think it's really an interesting market in that you've got all of these different approaches to database. Whether you take data warehouses from Snowflake or, arguably, Databricks also. And you take these individual database companies like Mongo, Influx, Neo4j, Elastic, and people like that. 
I think the commonality you see across the volume is many of them, if not all of them, are based on some sort of open source dynamic. So I think that is an intractable trend that will continue on. But in terms of the broader database market, our total available market, lots of these things are coming together in interesting ways. And so the wave that we wanna ride, because it's all big data and it's all increasingly fast data and it's all machine learning and AI, is really around that measurement issue. That instrumentation, the idea that if you're gonna build any sophisticated system, it starts with instrumentation and the journey is defined by instrumentation. So we view ourselves as that instrumentation tooling for understanding complex systems. >>And how... >>I have a quick follow up. Why did you say arguably Databricks? I mean, open source ethos? >>Well, I was saying arguably Databricks because, I mean, it's a great company and it's based on Spark, but there's quite a gap between Spark and what Databricks is today. And in some ways Databricks from the outside looking in looks a lot like Snowflake to me, looks a lot like a really sophisticated data warehouse with a lot of post-processing capabilities. >>And with open source less >>than a >>core database. Yeah. Right, right, right. Yeah, I totally agree. Okay, thank you for that. >>That was not arguably like they're not a good company or >>No, no. They've got great momentum and I'm just curious. Absolutely. You know, so, >>So talk a little bit about IOx and what it is enabling you guys to achieve from a competitive advantage perspective. The key differentiators, give us that scoop. >>So if you think about, so our old storage engine was called TSM, also open sourced, right? 
And IOx is open sourced, and the old storage engine was really built around these time series measurements, particularly metrics, lots of metrics, and handling those at scale and making it super easy for developers to use. But our old data engine only supported either a custom graphical UI that you'd build yourself on top of it or a dashboarding tool like Grafana or Chronograf or things like that. With IOx, two or three interventions were important. One is we now support, we'll support things like Tableau, Microsoft BI, and so you're taking that same data that was available for instrumentation and now you're using it for business intelligence also. So that became super important, and it kind of answers your question about the expanded market. It expands the market. The second thing is, when you're dealing with time series data, you're dealing with this concept of cardinality, which is, and I don't know if you're familiar with it, the idea that it's a multiplication of measurements in a table. And so the more measurements you want over the more series you have, you have this really expanding exponential set that can choke a database off. And the way we've designed IOx is to handle what we call infinite cardinality, where you don't even have to think about that from a design point of view. And then lastly, query performance is just dramatically better. And so it's pretty exciting. >>So the unlimited cardinality, basically you could identify relationships between data and different databases. Is that right? Between >>The same database but different measurements, different tables, yeah. Yeah. Right. Yeah, yeah. So you can handle, so you could say, I wanna look at the way the noise levels perform in this room according to 400 different locations on 25 different days, over seven months of the year. And each one is a measurement. Each one adds to cardinality. 
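To make the "multiplication" behind cardinality concrete, here is a small sketch using the numbers from the noise-level example above. It treats cardinality as the product of the distinct values of each dimension, which is one common way to reason about an upper bound on series count; the dictionary keys and the helper name `series_cardinality` are hypothetical and are not part of any InfluxDB API.

```python
from math import prod


def series_cardinality(tag_values):
    """Upper bound on distinct series: product of distinct values per tag."""
    return prod(len(set(values)) for values in tag_values.values())


# Hypothetical dimensions taken from the spoken example:
# 400 locations, 25 days, 7 months.
tags = {
    "location": range(400),
    "day": range(25),
    "month": range(7),
}

total = series_cardinality(tags)
print(total)  # 400 * 25 * 7 = 70000

# Adding one more dimension multiplies the total rather than adding to it,
# which is why high-cardinality data can overwhelm an index-per-series design.
tags["sensor_model"] = ["a", "b", "c"]
print(series_cardinality(tags))  # 210000
```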
And you can say, I wanna search on Tuesdays in December, what the noise level is at 2:21 PM, and you get a very quick response. That kind of instrumentation is critical to smarter systems. >>How are you able to process that data at a performance level that doesn't bring the database to its knees? What's the secret sauce behind that? >>It's a columnar database. It's built on Parquet and Apache Arrow. But suffice to say, without a much longer conversation, it's an architecture that's really built for pulling that kind of data. If you know the data is time series and you're looking for a time measurement, you already have the ability to optimize pretty dramatically. >>So it's that purpose built aspect of it. >>It's the purpose built aspect. You couldn't take Postgres and do the same thing. >>Right. Because a lot of vendors say, oh yeah, we have time series now. >>Yeah. Right. >>And they do. >>Yeah. But it's not, the founding of the company came because Paul Dix was working on Wall Street building time series databases on HBase, on MySQL, on other platforms, and realized every time we do it, we have to rewrite the code. We build a bunch of application logic to handle all these. We have customers that are adding hundreds of millions to billions of points a second. So you think about all those data points, you're talking about an ingest level that databases just aren't designed for. Right? And so it's not just us, our competitors also build good time series databases. And so the category is really emergent. >>Yeah, sure. Talk about a favorite customer story that you think really articulates the value of what Influx is doing, especially with IOx. >>Yeah, sure. 
And I love this story because, you know, Tesla may not be in favor because of the latest Elon Musk escapades, but we've had about a four year relationship with Tesla where they built their Powerwall technology around recording that, seeing your device, seeing the stuff, seeing the charging on your car. It's all captured in Influx databases that are reporting from Powerwalls and mega power packs all over the world. And they report to a central place at Tesla's headquarters and it reports out to your phone, and so you can see it. And what's really cool about this to me is I've got two Tesla cars and I've got Tesla solar roof tiles. So I watch this data all the time. So it's a great customer story. And actually if you go on our website, you can see I did an hour interview with the engineer that designed the system, because the system is super impressive and I just think it's really cool. Plus, you know, it's all the good green stuff that we really appreciate, supporting sustainability, right? Yeah. >>Right, right. Talk about, from a what's in it for me as a customer perspective, what you guys have done, the change to IOx. What are some of the key features of it and the key values in it for customers like Tesla, like other industry customers as well? >>Well, so it's relatively new. It just arrived in our cloud product. So Tesla's not using it today. We have a first set of customers starting to use it. It's in open source. So it's a very popular project in the open source world. But the key issues are really the stuff that we've kind of covered here, which is a broad SQL environment. So accessing all those SQL developers, the same people who code against Snowflake's data warehouse or Databricks or Postgres can now code against Influx, which opens up the BI market. It's the cardinality, it's the performance. It's really an architecture. It's the next gen. 
We've been doing this for six years. It's the next generation of everything we've seen in how you make time series super performant. And that's only relevant because more and more things are becoming real time as we develop smarter and smarter systems. The journey is pretty clear. You instrument the system, you let it run, you watch for anomalies, you correct those anomalies, you re-instrument the system. You do that 4 billion times, you have a self-driving car. You do that 55 times, you have a better podcast that is handling its audio better, right? So everything is on that journey of getting smarter and smarter. >>So you guys are the big committers to IOx, right? >>Yes. >>And talk about how you support and develop the surrounding developer community, how you get that flywheel effect going. >>First, I mean, it's actually really kind of, let's call it, more art than science. First of all, you come up with an architecture that really resonates for developers. And Paul Dix, our founder, really is a developer's developer. And so he started talking in the community about an architecture that uses Apache Parquet, which is, you know, becoming the standard now for file formats, that uses Apache Arrow for directing queries and things like that, and uses DataFusion, and said what this thing needs is a columnar database that sits behind all of this stuff and integrates it. And he started talking about it two years ago, and then he started publishing IOx commits in GitHub. And slowly, over time, on Hacker News and elsewhere, other people go, oh yeah, this is fundamentally right. 
Not different than original influx, not different than what Elastic hit on, not different than what Confluent with Kafka hit on and their time is you build an audience of people who are committed to understanding this kind of stuff and they become committers and they become the core. Yeah. And you build out from it. And so super. And so we chose to have an MIT open source license. Yeah. It's not some secondary license competitors can use it and, and competitors can use it against us. Yeah. >>One of the things I know that Influx data talks about is the time to awesome, which I love that, but what does that mean? What is the time to Awesome. Yeah. For developer, >>It comes from that original story where, where Paul would have to write six months of application logic and stuff to build a time series based applications. And so Paul's notion was, and this was based on the original Mongo, which was very successful because it was very easy to use relative to most databases. So Paul developed this commitment, this idea that I quickly joined on, which was, hey, it should be relatively quickly for a developer to build something of import to solve a problem, it should be able to happen very quickly. So it's got a schemaless background so you don't have to know the schema beforehand. It does some things that make it really easy to feel powerful as a developer quickly. And if you think about that journey, if you feel powerful with a tool quickly, then you'll go deeper and deeper and deeper and pretty soon you're taking that tool with you wherever you go, it becomes the tool of choice as you go to that next job or you go to that next application. And so that's a fundamental way we think about it. To be honest with you, we haven't always delivered perfectly on that. It's generally in our dna. So we do pretty well, but I always feel like we can do better. >>So if you were to put a bumper sticker on one of your Teslas about influx data, what would it >>Say? 
By the way, I'm not rich. It just happens to be that we have two Teslas and we have for a while, we just committed to that. So ask the question again. Sorry. >>Bumper sticker on Influx Data. What would it say? How would I understand it? >>It'd be time to awesome. It would be that phrase, time to awesome. Right. >>Love that. >>Yeah, I love it. >>Excellent. Time to awesome. Evan, thank you so much for joining Dave and me on the program. >>It's really fun. Great thing. >>Great to have you on, Evan. Well, great to have you back talking about what you guys are doing and helping organizations like Tesla and others really transform their businesses, which is all about business transformation these days. We appreciate your insights. >>That's great. Thank you. >>For our guest and Dave Vellante, I'm Lisa Martin. You're watching The Cube, the leader in emerging and enterprise tech coverage. We'll be right back with our next guest.

Published Date : Nov 29 2022

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Evan Kaplan	PERSON	0.99+
six months	QUANTITY	0.99+
Evan	PERSON	0.99+
Tesla	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
Paul	PERSON	0.99+
55 times	QUANTITY	0.99+
two	QUANTITY	0.99+
2:21 PM	DATE	0.99+
Las Vegas	LOCATION	0.99+
Dave Ante	PERSON	0.99+
Paul Dicks	PERSON	0.99+
six years	QUANTITY	0.99+
last year	DATE	0.99+
hundreds of millions	QUANTITY	0.99+
Mongo Influx	ORGANIZATION	0.99+
4 billion times	QUANTITY	0.99+
Two	QUANTITY	0.99+
December	DATE	0.99+
Microsoft	ORGANIZATION	0.99+
Influxed	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Influx	ORGANIZATION	0.99+
IOx	TITLE	0.99+
MySQL	TITLE	0.99+
three	QUANTITY	0.99+
Tuesdays	DATE	0.99+
each one	QUANTITY	0.98+
400 different locations	QUANTITY	0.98+
25 different days	QUANTITY	0.98+
first set	QUANTITY	0.98+
an hour	QUANTITY	0.98+
First	QUANTITY	0.98+
six years ago	DATE	0.98+
The Cube	TITLE	0.98+
One	QUANTITY	0.98+
Neo Forge	ORGANIZATION	0.98+
second thing	QUANTITY	0.98+
Each one	QUANTITY	0.98+
Paul Ds	PERSON	0.97+
IOx	ORGANIZATION	0.97+
today	DATE	0.97+
Teslas	ORGANIZATION	0.97+
MIT	ORGANIZATION	0.96+
Postgres	ORGANIZATION	0.96+
over seven months	QUANTITY	0.96+
one	QUANTITY	0.96+
five	DATE	0.96+
Venetian Expo Center	LOCATION	0.95+
Big Data Lakes	ORGANIZATION	0.95+
Cloudera	ORGANIZATION	0.94+
Columbia	LOCATION	0.94+
InfluxData	ORGANIZATION	0.94+
Wall Street	LOCATION	0.93+
SQL	TITLE	0.92+
Elastic	TITLE	0.92+
Data Bricks	ORGANIZATION	0.92+
Hacker News	TITLE	0.92+
two years ago	DATE	0.91+
Oracle	ORGANIZATION	0.91+
AWS Reinvent 2022	EVENT	0.91+
Elon Musker	PERSON	0.9+
Snowflake	ORGANIZATION	0.9+
Reinvent	ORGANIZATION	0.89+
billions of points a second	QUANTITY	0.89+
four year	QUANTITY	0.88+
Chronograph	TITLE	0.88+
Confluent	TITLE	0.87+
Spark	TITLE	0.86+
Apache	ORGANIZATION	0.86+
Snowflake	TITLE	0.85+
Grafana	TITLE	0.85+
GitHub	ORGANIZATION	0.84+

Anais Dotis Georgiou, InfluxData | Evolving InfluxDB into the Smart Data Platform

>>Okay, we're back. I'm Dave Vellante with theCUBE, and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and it's gonna store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, super useful.
Also, broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. Adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. While Rust is syntactically similar to C++ and has similar performance, and it also compiles to native code like C++, unlike C++ it has much better memory safety. Memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks, and Rust achieves this memory safety through its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are among the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and this control over memory. Also, Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await, to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well.
So essentially it just has all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you see things like, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow, and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data, and so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. In our table we have those two temperature values, as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case, and you're just taking temperature values from the room and a lot of those temperature values are the same, then you might be able to imagine how equal values will neighbor each other in the storage format. This provides a really perfect opportunity for cheap compression, and this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna find the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. >>So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp, and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented storage doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework, so that's where a lot of the advantages come >>from. Okay. So you've basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format. Versus what you're talking about, which is really kind of native, is the former not as effective because it's largely a bolt-on? Can you elucidate on that front?
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. Those are pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >>Yeah. Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework, and it uses Arrow as its in-memory format. The way that it helps InfluxDB IOx is this: it's great if you can write an unlimited amount of cardinality into InfluxDB IOx, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the querying and transformation of that data. It also has a pandas API, so you can take advantage of pandas DataFrames as well, and all of the machine learning tools associated with pandas. >>Okay. You're also leveraging Parquet in the platform, of course. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet, and why is it important? >>Sure. So Parquet is the column-oriented durable file format. It's important because it'll enable bulk import and bulk export. It has compatibility with Python and pandas, so it supports a broader ecosystem. Parquet files also take very little disk space, and they're faster to scan because, again, they're column-oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So, Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here is that if you can improve these upstream projects, then the long-term strategy is that the more you contribute and build those up, the more you will perpetuate that cycle of improvement, and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You've got that virtuous cycle going; people call it the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. If you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, and you just wanna learn more, then I would encourage you to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There are also community forums and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about.
As a developer advocate, I wanna answer your questions. So if there's a particular technology or stack that you wanna dive deeper into, and you want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community: collaborate with your peers, solve problems, and you guys are super responsive, so I really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there, and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this.
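
Editor's sketch: the room-and-stove example from the interview above can be illustrated in a few lines of plain Python. This is only a toy model of the idea, not how Arrow, Parquet, or IOx are implemented. The readings, the field names (`room_temp`, `stove_temp`), and the helper functions `to_columns` and `run_length_encode` are all assumptions made up for illustration, and run-length encoding stands in for "cheap compression" of equal neighboring values.

```python
from itertools import groupby

# Hypothetical readings modeled on the room-and-stove example; all values are made up.
rows = [
    {"time": 1, "room": "kitchen", "room_temp": 70.0, "stove_temp": 150.0},
    {"time": 2, "room": "kitchen", "room_temp": 70.0, "stove_temp": 180.0},
    {"time": 3, "room": "kitchen", "room_temp": 70.0, "stove_temp": 210.0},
    {"time": 4, "room": "kitchen", "room_temp": 71.0, "stove_temp": 205.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists (column-oriented layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows)

# Row-oriented scan: every full row is visited just to pluck out one field.
row_min = min(row["room_temp"] for row in rows)
row_max = max(row["room_temp"] for row in rows)

# Column-oriented scan: only the one column of interest is touched.
col_min = min(columns["room_temp"])
col_max = max(columns["room_temp"])

# Equal values sit next to each other within a column, so they collapse into short runs.
encoded_room = run_length_encode(columns["room_temp"])
encoded_tag = run_length_encode(columns["room"])

print(col_min, col_max)   # 70.0 71.0
print(encoded_room)       # [(70.0, 3), (71.0, 1)]
print(encoded_tag)        # [('kitchen', 4)]
```

Both scans give the same answer; the difference is how much unrelated data each one has to walk past, and the repeated tag value shows why a regulated room temperature or a constant tag compresses so well once it is stored as a column.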

Published Date : Nov 8 2022


ENTITIES

Entity	Category	Confidence
Tim Yokum	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Brian	PERSON	0.99+
Anna	PERSON	0.99+
James Bellenger	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Dave Valante	PERSON	0.99+
James	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
three months	QUANTITY	0.99+
16 times	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Python	TITLE	0.99+
mobile.twitter.com	OTHER	0.99+
Influx Data	ORGANIZATION	0.99+
iOS	TITLE	0.99+
Twitter	ORGANIZATION	0.99+
30,000 feet	QUANTITY	0.99+
Russ Foundation	ORGANIZATION	0.99+
Scala	TITLE	0.99+
Twitter Lite	TITLE	0.99+
two rows	QUANTITY	0.99+
200 megabyte	QUANTITY	0.99+
Node	TITLE	0.99+
Three months ago	DATE	0.99+
one application	QUANTITY	0.99+
both places	QUANTITY	0.99+
each row	QUANTITY	0.99+
Par K	TITLE	0.99+
Anais Dotis Georgiou	PERSON	0.99+
one language	QUANTITY	0.98+
first one	QUANTITY	0.98+
15 engineers	QUANTITY	0.98+
Anna East Otis Georgio	PERSON	0.98+
both	QUANTITY	0.98+
one second	QUANTITY	0.98+
25 engineers	QUANTITY	0.98+
About 800 people	QUANTITY	0.98+
sql	TITLE	0.98+
Node Summit 2017	EVENT	0.98+
two temperature values	QUANTITY	0.98+
one times	QUANTITY	0.98+
c plus plus	TITLE	0.97+
Rust	TITLE	0.96+
SQL	TITLE	0.96+
today	DATE	0.96+
Influx	ORGANIZATION	0.95+
under 600 kilobytes	QUANTITY	0.95+
first	QUANTITY	0.95+
c plus plus	TITLE	0.95+
Apache	ORGANIZATION	0.95+
par K	TITLE	0.94+
React	TITLE	0.94+
Russ	ORGANIZATION	0.94+
About three months ago	DATE	0.93+
8:30 AM Pacific time	DATE	0.93+
twitter.com	OTHER	0.93+
last decade	DATE	0.93+
Node	ORGANIZATION	0.92+
Hadoop	TITLE	0.9+
InfluxData	ORGANIZATION	0.89+
c c plus plus	TITLE	0.89+
Cube	ORGANIZATION	0.89+
each column	QUANTITY	0.88+
InfluxDB	TITLE	0.86+
Influx DB	TITLE	0.86+
Mozilla	ORGANIZATION	0.86+
DB IOx	TITLE	0.85+

Evolving InfluxDB into the Smart Data Platform

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database, for many use cases, was a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken every hour, every minute, every second, down to the millisecond, and how the world was moving toward real-time or near-real-time data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support real-time data in emerging use cases in IoT and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands, specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData, and we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and by specific tools. In this program you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB.
>>Now, these innovations evolve the idea of time series analysis by dramatically increasing the granularity of time series data, compressing the historical time slices, if you will, from, for example, minutes down to milliseconds, and at the same time enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis Georgiou, who is a developer advocate at InfluxData, and we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum. He's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks, Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think for us it's been about staying ahead of the market. If we think about what our customers are coming to us with now, related to requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, staying ahead of the market as well, and anticipating those future needs. So, you know, this is really a transparent change for our customers.
I mean, I think we'll be adding new capabilities over time that leverage this new engine, but initially the customers who are using us are gonna see just great improvements in performance, especially those that are working at the top end of the workload scale, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today, and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? We had our open source, we had an enterprise product, and shifting that technology, especially the open source code base, to a service basis, where we were hosting it through multiple cloud providers, that was a long journey. I guess phase one was, we wanted to host enterprise for our customers, so we created a service where we just managed and ran our enterprise product for them. Phase two of this cloud effort was to optimize for multi-tenant, multi-cloud, to be able to host it in a truly SaaS manner where we could use some type of customer activity or consumption as the pricing vector. And that was sort of the birth of the first real InfluxDB Cloud, which has been really successful.
>>We've seen, I think, like 60,000 people sign up, and we've got tons and tons of both enterprises as well as new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis. And having that big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has pointed us in a really good direction in terms of making sure we're continuously improving that, and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is and what it takes to make that shift from time series specialist to real-time analytics and being able to support both. >>Yeah, I mean, it's much more of an evolution, I think, than a shift or a pivot. Time series data is always gonna be fundamental and the basis of the solutions that we offer our customers, and also the ones that they're building on the raw APIs of our platform themselves. The time series market is one that we've worked diligently to lead, I think, when it comes to metrics, especially sensor data and app and infrastructure metrics. If we're being honest, though, I think our user base is well aware that the way we were architected was much more towards those backwards-looking, historical-type analytics, which are key for troubleshooting and making sure you don't run into the same problem twice.
But we had to ask ourselves, what can we do to better handle those queries from a performance and a time-to-response standpoint, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can limit that window down to minutes and then seconds? >>And now with this new engine, we're really starting to talk about a query window that could be returning results in milliseconds of time since it hit the ingest queue. And that's really getting to the point where, as your data is available, you can use it and you can query it, you can visualize it, and you can do all those magical things with it. And I think getting all of that to a place where we're saying yes to the customer on all of the real-time queries, the multiple-language query support, it was hard, but we're now at a spot where we can start introducing that to a limited number of customers, strategic customers and strategic availability zones to start, but, you know, everybody over time. >>So you're basically going from what happened, and you can still do that obviously, to what's happening now, in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? In the moment right now, whether you're talking about a millisecond ago or a minute ago, that's pretty much right now, I think, for most people, especially in these use cases where you have other components of latency induced by the underlying data collection, the architecture, the infrastructure, the devices, and the highly distributed nature of all of this. So yeah, getting a customer or a user to be able to use the data as soon as it is available is what we're after here.
>>I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, and that's one of the things that really triggered us to know that we were heading in the right direction: just how many operational customers we have. Everything from aerospace and defense, we've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor. And if we can satisfy their demands for real-time historical perspective, that's awesome. I think what we're gonna do here is start to edge into the real time that they're used to in terms of the millisecond response times that they expect of their control systems, certainly not their historians and databases. >>Is this available, these innovations, to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. I think we want to emphasize that's for now; our goal is to get our latest and greatest and our best to everybody over time, of course. One of the things we had to do here was double down on our commitment to open source and availability. So anybody today can take a look at the libraries on our GitHub and can inspect them, and can even try to implement or execute some of it themselves in their own infrastructure. We're committed to bringing our latest and greatest to our cloud customers first for a couple of reasons. Number one, there are big workloads and they have high expectations of us.
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, how the system itself is performing. >>And so, being careful, maybe a little cautious in terms of how big we go with this right away, both limits the risk of any issues that can come with new software rollouts (we haven't seen anything so far) and gives us the opportunity to have meaningful conversations with a small group of users who are using the products. But once we get through that and they give us two thumbs up on it, it'll be like, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. This is a new, very popular systems language. It's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to deliver it in a way that is something we can inspect very closely, but then also rely on the fact that it's going to behave well if it does find error conditions. I mean, we've loved working with Go, and a lot of our libraries will continue to be implemented in Go, but when it came to this particular new engine, that power, performance, and stability of Rust was critical. On top of that, we've also integrated Apache Arrow and Apache Parquet for persistence.
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our time series merge trees, this is a big break from that: Arrow on the in-memory side and then Parquet on the on-disk side. >>It allows us to present a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large historical bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem popping up around Parquet in terms of the machine learning community. And getting that all to work, we had to glue it together with Arrow Flight. That's what we're using as our RPC component. It handles the orchestration and the transportation of the columnar data. Now we're moving to a true columnar database model for this version of the engine, and it removes a lot of overhead for us in terms of having to manage all that serialization and deserialization, and, to that point again, blurring that line between real-time and historical data. It's highly optimized for both streaming micro-batches and then batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mention Rust. It's been around for a long time, but its popularity is really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody watching just to take a look at what we're offering in terms of early access and beta programs. If you wanna participate, or if you wanna work in terms of early access with the new engine, please reach out to the team.
I'm sure there's a lot of communications going out and it'll be highly featured on our website, but reach out to the team. Believe it or not, we have a lot more going on than just the new engine. And so there are also other programs, things we're offering to customers in terms of the user interface, data collection, and things like that. And if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. >>But if there's something new that you wanna try out, we'd just love to hear from you. And then our goal would be that as we give you access to all of these new cool features, you would give us continuous feedback on these products and services, not only what you need today, but what you'll need tomorrow to build the next versions of your business. Because the whole database ecosystem, as it expands out into this vertically oriented stack of cloud services and enterprise databases and edge databases, it's gonna be what we all make it together, not just those of us who are employed by InfluxData. And then finally, I would just say please watch Anais's and Tim's sessions. These are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there's no better takes, honestly, on the technical details of this than theirs, especially when it comes to the value that these investments will bring to our customers and our communities. So I encourage you to pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks Dave. It was awesome.
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time, a really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems, and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data. Multiple layers of redundancy ensure you don't lose any data. Access controls ensure that only the people who should see your data can see it. >>And encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge, or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one.
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with theCUBE and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, which is super useful.
Also broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. Rust is syntactically similar to C++, it has similar performance, and it also compiles to native code like C++. But unlike C++, it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are among the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, protection against buffer overflows, and thread-safe async caching structures as well.
So essentially it just has all the control, all the fine-grained control, you need to take advantage of memory and all your resources as well as possible so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you see things like, you know, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data. And so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each column and group it together.
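The column grouping described here can be sketched in a few lines of Python. This is a minimal illustration, not how Arrow or IOx is implemented: the readings, the column names, and the `to_columns` and `run_length_encode` helpers are all hypothetical, and run-length encoding stands in for whatever compression a real columnar engine applies.

```python
from itertools import groupby

# Hypothetical readings as rows: (time, room, room_temp, stove_temp).
rows = [
    (1, "kitchen", 70, 150),
    (2, "kitchen", 70, 200),
    (3, "kitchen", 70, 350),
    (4, "kitchen", 71, 350),
]


def to_columns(rows, names):
    """Pivot row-oriented tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, ["time", "room", "room_temp", "stove_temp"])

# Equal room temperatures sit next to each other, so they compress cheaply.
encoded_room_temp = run_length_encode(columns["room_temp"])

# Min and max need only the one column, not every field of every row.
room_min, room_max = min(columns["room_temp"]), max(columns["room_temp"])

print(encoded_room_temp)   # [(70, 3), (71, 1)]
print(room_min, room_max)  # 70 71
```

Because the room temperature column is stored contiguously, the repeated 70s collapse into a single run, and the min and max come from scanning that one column without touching timestamps, tags, or stove readings.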
And so if that's the case and you're just taking temperature values from the room and a lot of those temperature values are the same, then you might be able to imagine how equal values will then neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna find the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead so that we can better understand the benefits of column-oriented storage. >>So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework. So that's where a lot of the advantages come from. >>Okay. So you basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format, versus what you're talking about, which is really kind of native. Is it not as effective? Is the format not as effective because it's largely a bolt-on? Can you elucidate on that front?
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. And so that's pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. Yeah. >>Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as its in-memory format. So the way that it helps in InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB IOx, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the querying and transformation of that data. It also has a pandas API so that you can take advantage of pandas DataFrames as well and all of the machine learning tools associated with pandas. >>Okay. You're also leveraging Parquet in the platform. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >>Sure. So Parquet is the column-oriented durable file format. It's important because it'll enable bulk import and bulk export, and it has compatibility with Python and pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column-oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxData first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here, the long-term strategy, is that the more you contribute to these upstream projects and build them up, the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You've got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. And if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, and you just wanna learn more, then I would encourage you to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There's also a community forum and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I wanna answer your questions.
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community: collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there, and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money. And also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language, and as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality because Influx Cloud is entirely managed. It probably saved us at least a full additional person on our team.
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on theCUBE for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud is being built out on open source, mobile, social platforms, key databases, and of course InfluxDB. And InfluxData has been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, you know, Influx really, we thrive at the intersection of commercial services and open source software. So OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services and templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants, and like you've mentioned, even better, we contribute a lot back to the projects that we use, as well as our own product, InfluxDB. >>You know, but I gotta ask you, Tim, because one of the challenges that we've seen, in particular you saw this in the heyday of Hadoop, is the innovations come so fast and furious, and as a software company you gotta place bets, you gotta commit people, and sometimes those bets can be risky and not pay off. So how have you managed this challenge? >>Oh, it moves fast.
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You know, you look at Kubernetes, for example. That ecosystem is driven by thousands of intelligent developers, engineers, builders. They're adding value every day, so we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1,500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly along with cloud. But really Kubernetes is still up and to the right consistently, even with the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere, at AWS, Azure, Google Cloud.
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code, so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, wherever. Is that correct? >>Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but that, I'll call it a PaaS layer if I can use that term, are there specific attributes to InfluxDB, or is it kind of just generally off-the-shelf PaaS? Is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build-versus-buy lens. Some things we want to leverage cloud provider services for, for instance Postgres databases for metadata. Perhaps we'll get that off of our plate, let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can, as an SRE group, as an ops team, manage with very few people really, and we can stamp out clusters across multiple regions in no time.
>>So sometimes you build, sometimes you buy it. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is, like everybody else will do, we're looking for trade-offs that make sense. We really want to protect our customers' data, so we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team, and of course for customers, you don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level. What does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored, so there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high-cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time.
And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change: new features, better ways of doing deployments, safer ways of doing deployments. >>All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end, we want you to focus on getting actual insights from your data instead of running infrastructure. Let us do that for you. >>And that makes sense. But are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially, people want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, and send the rest of that data off to Influx Cloud, where the heavy processing takes place.
So really, us being in all the different clouds and iterating on that and being in all sorts of different regions allows for people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >>And so obviously taking away a lot of the heavy lifting for the infrastructure. Would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure, that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. That's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit, whether it's at rest, is always kept secure and is only viewable by you. You look at things like software bill of materials. If you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, historically, as you know, Tim, they would air gap everything.
That's how they kept it safe. But that's not feasible anymore. Everything's connected now, right? And so you've gotta have a partner that, again, takes away that heavy lifting for R&D so you can focus on some of the other activities. Give us the last word and the key takeaways from your perspective. >>Well, from my perspective, I see it as a two-lane approach with Influx, with any time series data. You've got a lot of stuff that you're gonna run on-prem, what you had mentioned, air gapping. Sure, there's plenty of need for that. But at the end of the day, people that don't want to run big data centers, people that want to trust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching theCUBE. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? Check out InfluxDB University, where you can find our entire catalog of free training that will help you make the most of your time series data. Get started for free at influxdbu.com. >>We'll see you in class. >>Okay, so we heard today from three experts on time series and data how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow, the Rust programming environment, DataFusion, and Parquet are being leveraged to support real-time data analytics at scale.
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of real-time data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You could join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Nov 2 2022



Evolving InfluxDB into the Smart Data Platform Full Episode

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database was, for many use cases, a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken every hour, every minute, every second, down to the millisecond, and how the world was moving toward real-time or near-real-time data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support real-time data in emerging use cases in IoT and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands, specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and by specific tools. And in this program you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB.
>>Now, these innovations evolve the idea of time series analysis by dramatically increasing the granularity of time series data, compressing the historical time slices, if you will, from, for example, minutes down to milliseconds, and at the same time enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis-Georgiou, who is a developer advocate at InfluxData. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum. He's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think, for us, it's been about staying ahead of the market. If we think about what our customers are coming to us with now, related to requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, we're staying ahead of the market as well and anticipating those future needs. So, you know, this is really a transparent change for our customers.
I mean, I think we'll be adding new capabilities over time that leverage this new engine, but initially the customers who are using us are gonna see just great improvements in performance, especially those that are working at the top end of the workload scale, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today, and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I think we had our open source, we had an enterprise product, and shifting that technology, especially the open source code base, to a service basis where we were hosting it through multiple cloud providers, that was a long journey, I guess. Phase one was, we wanted to host enterprise for our customers, so we created a service where we just managed and ran our enterprise product for them. Phase two of this cloud effort was to optimize for multi-tenant, multi-cloud, to be able to host it in a truly SaaS manner where we could use some type of customer activity or consumption as the pricing vector. And that was the birth of the first real InfluxDB Cloud, which has been really successful.
>>We've seen, I think, like 60,000 people sign up, and we've got tons and tons of both enterprises as well as new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis. And having that big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has pointed us in a really good direction in terms of making sure we're continuously improving that, and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is, and what does it take to make that shift from time series specialist to real-time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than a shift or a pivot. Time series data is always gonna be fundamental and the basis of the solutions that we offer our customers, and then also the ones that they're building on the raw APIs of our platform themselves. The time series market is one that we've worked diligently to lead, I think, when it comes to metrics, especially sensor data and app and infrastructure metrics. If we're being honest, though, I think our user base is well aware that the way we were architected was much more towards those backwards-looking, historical-type analytics, which are key for troubleshooting and making sure you don't run into the same problem twice.
But we had to ask ourselves, what can we do to better handle those queries from a performance and a time-to-response perspective, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can limit that window down to minutes and then seconds? >>And now with this new engine, we're really starting to talk about a query window that could be returning results within milliseconds of the time the data hit the ingest queue. And that's really getting to the point where, as your data is available, you can use it and you can query it, you can visualize it, and you can do all those magical things with it. And getting all of that to a place where we're saying yes to the customer on all of the real-time queries and the multiple-language query support, it was hard, but we're now at a spot where we can start introducing that to a limited number of customers, strategic customers and strategic availability zones to start, but everybody over time. >>So you're basically going from what happened, and you can still do that obviously, to what's happening now, in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? In the moment right now, whether you're talking about a millisecond ago or a minute ago, that's pretty much right now, I think, for most people, especially in these use cases where you have other components of latency induced by the underlying data collection, the architecture, the infrastructure, the devices, and the highly distributed nature of all of this. So yeah, getting a customer or a user to be able to use the data as soon as it is available is what we're after here.
>>I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, and that's one of the things that really triggered us to know that we were heading in the right direction: just how many operational customers we have. Everything from aerospace and defense, we've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor. And if we can satisfy their demands for a real-time historical perspective, that's awesome. I think what we're gonna do here is start to edge into the real time that they're used to, in terms of the millisecond response times that they expect of their control systems, certainly not their historians and databases. >>Are these innovations available to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. I think we want to emphasize that that's for now; our goal is to get our latest and greatest and our best to everybody over time, of course. One of the things we had to do here was double down on our commitment to open source and availability. So anybody today can take a look at the libraries on our GitHub, can inspect them, and can even try to implement or execute some of it themselves in their own infrastructure. We're committed to bringing our latest and greatest to our cloud customers first for a couple of reasons. Number one, there are big workloads and they have high expectations of us.
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, how the system itself is performing. >>And so just being careful, maybe a little cautious in terms of how big we go with this right away, both limits the risk of any issues that can come with new software rollouts (we haven't seen anything so far) and gives us the opportunity to have meaningful conversations with a small group of users who are using the products. But once we get through that and they give us two thumbs up on it, it'll be, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. This is a new, very popular systems language. It's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to deliver it in a way that is something we can inspect very closely, but then also rely on the fact that it's going to behave well if it does find error conditions. I mean, we've loved working with Go, and a lot of our libraries will continue to be implemented in Go, but when it came to this particular new engine, the power, performance and stability of Rust was critical. On top of that, we've also integrated Apache Arrow and Apache Parquet for persistence.
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our time series merge trees, this is a big break from that: Arrow on the in-memory side and then Parquet on the on-disk side. >>It allows us to present a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large historical bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem popping up around Parquet in terms of the machine learning community. And to get that all to work, we had to glue it together with Arrow Flight. That's what we're using as our RPC component. It handles the orchestration and the transportation of the columnar data. Now we're moving to a true columnar database model for this version of the engine, and it removes a lot of overhead for us in terms of having to manage all that serialization and deserialization. And, again, on blurring that line between real-time and historical data, it's highly optimized for streaming, micro-batch and then batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mentioned Rust. It's been around for a long time, but its popularity is really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody watching just to take a look at what we're offering in terms of early access and beta programs. If you wanna participate, or if you wanna work in terms of early access with the new engine, please reach out to the team.
I'm sure there's a lot of communications going out, and it'll be highly featured on our website, but reach out to the team. Believe it or not, we have a lot more going on than just the new engine. And so there are also other programs, things we're offering to customers in terms of the user interface, data collection and things like that. And if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. >>But if there's something new that you wanna try out, we'd just love to hear from you. And then our goal would be that as we give you access to all of these new cool features, you would give us continuous feedback on these products and services, not only what you need today, but what you'll need tomorrow to build the next versions of your business. Because the whole database ecosystem, as it expands out into this vertically oriented stack of cloud services and enterprise databases and edge databases, it's gonna be what we all make it together, not just those of us who are employed by InfluxDB. And then finally I would just say, please watch Anais's and Tim's sessions. These are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there are no better takes, honestly, on the technical details of this than theirs, especially when it comes to the value that these investments will bring to our customers and our communities. So I encourage you to pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks Dave. It was awesome.
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time. Really hot area. As Brian said, in a moment I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data. Multiple layers of redundancy ensure you don't lose any data. Access controls ensure that only the people who should see your data can see it. >>And encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one.
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with theCUBE and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching and query processing. Another really important part is the ability to have bulk data export and import, which is super useful.
Also broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even Pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. Rust is syntactically similar to C++ and has similar performance, and it also compiles to native code like C++. But unlike C++, it also has much better memory safety. Memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks, and Rust achieves this memory safety through its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and that gives us this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, protection against buffer overflows, and thread-safe async caching structures as well.
So essentially it has all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx et cetera, you see things like, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data, and so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values, as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have two columns with the temperature values for our room and for the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case and you're just taking temperature values from the room, and a lot of those temperature values are the same, then you might be able to imagine how equal values will then neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high cardinality use cases. It also enables faster scan rates. So if you wanna find the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. >>So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp, and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework. So that's where a lot of the advantages come from. >>Okay. So you basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format, versus what you're talking about, which is really kind of native. Is it not as effective? Is the former not as effective because it's largely a bolt-on? Can you elucidate on that front?
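Editor's note: the column-versus-row comparison in the answer above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not how Apache Arrow, Parquet or InfluxDB IOx are implemented: the sample readings, the `rows` and `columns` names, and the choice of run-length encoding as the "cheap compression" are all hypothetical, picked only to show why equal neighboring values compress well and why a min/max question touches a single column.

```python
from itertools import groupby

# Hypothetical sample readings: (timestamp, room tag, room_temp, stove_temp).
rows = [
    (1, "kitchen", 70, 150),
    (2, "kitchen", 70, 180),
    (3, "kitchen", 70, 210),
    (4, "kitchen", 71, 240),
]

# Row-oriented access: every record is visited, and the unwanted
# fields of each record are skipped, just to read one field.
room_temps_from_rows = [row[2] for row in rows]

# Column-oriented layout: each field is stored contiguously on its own.
columns = {
    "time": [r[0] for r in rows],
    "room": [r[1] for r in rows],
    "room_temp": [r[2] for r in rows],
    "stove_temp": [r[3] for r in rows],
}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# A regulated room temperature and a repeated tag both form long runs.
encoded_room_temp = run_length_encode(columns["room_temp"])
encoded_room = run_length_encode(columns["room"])

# Min and max need only the one column, not the whole table.
room_min = min(columns["room_temp"])
room_max = max(columns["room_temp"])

print(encoded_room_temp)   # [(70, 3), (71, 1)]
print(encoded_room)        # [('kitchen', 4)]
print(room_min, room_max)  # 70 71
```

In this toy table the four room readings collapse to two pairs and the repeated tag to one, while the stove column, whose values all differ, would not shrink at all. That is the intuition behind slowly changing columns being cheap to store and fast to scan.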
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. Those are pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >>Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as its in-memory format. So the way that it helps in InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the querying, processing and transformation of that data. It also has a Pandas API so that you can take advantage of Pandas data frames as well, and all of the machine learning tools associated with Pandas. >>Okay. You're also leveraging Parquet in the platform. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >>Sure. So Parquet is the column-oriented durable file format. It's important because it'll enable bulk import and bulk export. It has compatibility with Python and Pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support of additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here is that if you can improve these upstream projects, then the long-term strategy is that the more you contribute and build those up, the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. And I really encourage you, if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, and you just wanna learn more, to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There are also community forums and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I wanna answer your questions.
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community, collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money, and also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language, and as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality because Influx Cloud is entirely managed. It probably saved us at least a full additional person on our team.
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on theCUBE for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud is being built out on open source, mobile, social platforms, key databases, and of course InfluxDB. And InfluxData has been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, you know, Influx really, we thrive at the intersection of commercial services and open source software. So OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services, templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants, and like you've mentioned, even better, we contribute a lot back to the projects that we use as well as our own product, InfluxDB. >>You know, but I gotta ask you, Tim, because one of the challenges that we've seen, in particular, you saw this in the heyday of Hadoop, the innovations come so fast and furious, and as a software company you gotta place bets, you gotta commit people, and sometimes those bets can be risky and not pay off. So how have you managed this challenge? >>Oh, it moves fast.
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You know, you look at Kubernetes, for example. That ecosystem is driven by thousands of intelligent developers, engineers, builders. They're adding value every day, so we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1,500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly along with cloud. But really Kubernetes is just still up and to the right consistently, even with the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere, at AWS, Azure, Google Cloud.
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code, so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, wherever. Is that correct? >>Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but I'll call it a PaaS layer, if I can use that term. Are there specific attributes to InfluxDB, or is it kind of just generally off-the-shelf PaaS? Is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build-versus-buy lens. Some things we want to leverage cloud provider services for, for instance Postgres databases for metadata. Perhaps we'll get that off of our plate and let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can, as an SRE group, as an ops team, manage with very few people, really, and we can stamp out clusters across multiple regions in no time.
>>So sometimes you build, sometimes you buy. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is, like everybody else will do, we're looking for trade-offs that make sense. You know, we really want to protect our customers' data, so we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team, and of course customers don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level. What does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored, so there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time.
And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change, new features, better ways of doing deployments, safer ways of doing deployments. >>All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end we want you to focus on getting actual insights from your data instead of running infrastructure. You know, let us do that for you. >>And that makes sense. But are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially. People want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, and send the rest of that data off to Influx Cloud, where the heavy processing takes place.
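As a rough picture of the edge downsampling step just described, here is a minimal Python sketch. It assumes readings arrive as (timestamp in seconds, value) pairs and that downsampling means averaging per fixed window; the function name `downsample`, the window size, and the sample points are all hypothetical, and this is not InfluxDB's own downsampling mechanism.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed, non-overlapping windows."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each point to the start of the window it falls in.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    # One averaged point per window, in time order.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up readings from one production line: five raw points over 90 seconds.
points = [(0, 20.0), (10, 22.0), (59, 21.0), (60, 30.0), (90, 32.0)]
downsampled = downsample(points, 60)
print(downsampled)  # [(0, 21.0), (60, 31.0)]
```

Five raw points become two averaged points, so less data has to leave the edge while the overall shape of the series is kept.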
So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >>And so obviously taking away a lot of the heavy lifting for the infrastructure, would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure and that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. That's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit, whether it's at rest, is always kept secure and is only viewable by you. You know, you look at things like software bill of materials. If you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that, you know, as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, historically, as you know, Tim, they would air gap everything.
That's how they kept it safe. But that's not feasible anymore. Everything's connected now, right? And so you've gotta have a partner that can, again, take away that heavy lifting from R&D so you can focus on some of the other activities. Right. Give us the last word and the key takeaways from your perspective. >>Well, you know, from my perspective, I see it as a two-lane approach with Influx, with any time series data. You've got a lot of stuff that you're gonna run on-prem. What you had mentioned, air gapping, sure, there's plenty of need for that. But at the end of the day, people that don't want to run big data centers, people that want to trust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching theCUBE. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? Check out InfluxDB University, where you can find our entire catalog of free training that will help you make the most of your time series data. Get started for free at influxdbu.com. We'll see you in class. >>Okay, so we heard today from three experts on time series and data, how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow and the Rust programming environment, DataFusion, and Parquet are being leveraged to support realtime data analytics at scale.
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of realtime data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You could join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Oct 28 2022

ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Tim Yoakum	PERSON	0.99+
Brian	PERSON	0.99+
Dave	PERSON	0.99+
Tim Yokum	PERSON	0.99+
Dave Valante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Tim	PERSON	0.99+
Google	ORGANIZATION	0.99+
16 times	QUANTITY	0.99+
two rows	QUANTITY	0.99+
New York City	LOCATION	0.99+
60,000 people	QUANTITY	0.99+
Rust	TITLE	0.99+
Influx	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
today	DATE	0.99+
Influx Data	ORGANIZATION	0.99+
Python	TITLE	0.99+
three experts	QUANTITY	0.99+
InfluxDB	TITLE	0.99+
both	QUANTITY	0.99+
each row	QUANTITY	0.99+
two lane	QUANTITY	0.99+
Today	DATE	0.99+
Noble nine	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Flux	ORGANIZATION	0.99+
Influx DB	TITLE	0.99+
each column	QUANTITY	0.99+
270 terabytes	QUANTITY	0.99+
cube.net	OTHER	0.99+
twice	QUANTITY	0.99+
Bryan	PERSON	0.99+
Pandas	TITLE	0.99+
c plus plus	TITLE	0.99+
three years ago	DATE	0.99+
two	QUANTITY	0.99+
more than a decade	QUANTITY	0.98+
Apache	ORGANIZATION	0.98+
dozens	QUANTITY	0.98+
free@influxdbu.com	OTHER	0.98+
30,000 feet	QUANTITY	0.98+
Rust Foundation	ORGANIZATION	0.98+
two temperature values	QUANTITY	0.98+
In Flux Data	ORGANIZATION	0.98+
one time stamp	QUANTITY	0.98+
tomorrow	DATE	0.98+
Russ	PERSON	0.98+
IOT	ORGANIZATION	0.98+
Evolving InfluxDB	TITLE	0.98+
first	QUANTITY	0.97+
Influx data	ORGANIZATION	0.97+
one	QUANTITY	0.97+
first one	QUANTITY	0.97+
Influx DB University	ORGANIZATION	0.97+
SQL	TITLE	0.97+
The Cube	TITLE	0.96+
Influx DB Cloud	TITLE	0.96+
single server	QUANTITY	0.96+
Kubernetes	TITLE	0.96+

Evolving InfluxDB into the Smart Data Platform Open

>> This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database, for many use cases, was a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how in theory those time slices could be taken, you know, every hour, every minute, every second, down to the millisecond, and how the world was moving toward realtime or near-realtime data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support realtime data in emerging use cases in IoT and other use cases. And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante, and I'll be your host today. Now, in this program, we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands and data, and specifically around data analytics use cases in real time. Now, first we're going to hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're going to talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and specific tools. And in this program, you're going to hear a lot about things like the Rust implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB.
Now, these innovations, they evolve the idea of time series analysis by dramatically increasing the granularity of time series data by compressing the historical time slices if you will, from, for example minutes down to milliseconds. And at the same time, enabling real time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're going to hear from Anais Dotis-Georgiou who is a developer advocate at Influx Data. And we're going to get into the "why's" of these open source capabilities, and how they contribute to the evolution of the Influx DB platform. And then we're going to close the program with Tim Yocum. He's the director of engineering at Influx Data, and he's going to explain how the Influx DB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started.

Published Date : Oct 18 2022

ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Brian	PERSON	0.99+
Tim Yocum	PERSON	0.99+
Influx Data	ORGANIZATION	0.99+
Anais Dotis-Georgiou	PERSON	0.99+
Influx DB	TITLE	0.99+
InfluxDB	TITLE	0.94+
first	QUANTITY	0.91+
today	DATE	0.88+
second	QUANTITY	0.85+
Time	TITLE	0.82+
Parquet	TITLE	0.76+
Apache	ORGANIZATION	0.75+
past May	DATE	0.75+
Influx	TITLE	0.75+
IOT	ORGANIZATION	0.69+
Cube	ORGANIZATION	0.65+
influx	ORGANIZATION	0.53+
Arrow	TITLE	0.48+

Anais Dotis Georgiou, InfluxData

(upbeat music) >> Okay, we're back. I'm Dave Vellante with theCUBE and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're going to dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into realtime analytics. Anais, welcome to the program. Thanks for coming on. >> Hi, thank you so much. It's a pleasure to be here. >> Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's going to give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >> Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metric queries. We also want to have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Some other really important parts are the ability to have bulk data export and import, super useful.
Also, broader ecosystem compatibility: where possible we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even Pandas in the future. >> Okay, so a lot there. Now we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns, and you got big guns like Amazon and Google and Microsoft throwing their collective weights behind it. Its adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++ for example? >> Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. So while Rust is syntactically similar to C++ and it has similar performance, it also compiles to native code like C++. But unlike C++, it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await, to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well.
So essentially it just, like, has all the control, all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high cardinality use cases. >> Yeah, and the more I learn about the new engine and the platform, IOx et cetera, you see things like, in the old days and even today, you do a lot of garbage collection in these systems and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I want to talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >> Sure. Yeah. So Arrow is a framework for defining in-memory columnar data. And so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to kind of illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values as well as maybe a measurement value, a timestamp value, maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have, like, two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case and you're just taking temperature values from the room, and a lot of those temperature values are the same, then you might be able to imagine how equal values will then neighbor each other, and when they neighbor each other in the storage format this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high cardinality use cases. It also enables faster scan rates. So if you want to find, like, the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can understand better the benefits of column-oriented storage. So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is. And for every timestamp you then have to pluck out that one temperature value that you want at that one timestamp, and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented doesn't provide the same efficiency as columnar, and Apache Arrow is an in-memory columnar data framework. So that's where a lot of the advantages come from. >> Okay. So you've basically described, like, a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format. Versus what you're talking about, which is really kind of native, is it not as effective as the former because it's largely a bolt-on? Can you elucidate on that front? >> Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly.
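[Editor's sketch, not part of the interview: the column-versus-row contrast described here can be illustrated in a few lines of plain Python. The sample readings, the column names, and the run_length_encode helper are all assumptions made up for illustration; Arrow and IOx use their own, more sophisticated layouts and encodings.]

```python
from itertools import groupby

# Hypothetical sample data, held the way a row-oriented store would hold it.
rows = [
    {"time": 1, "room": "bedroom", "room_temp": 21.0, "stove_temp": 180.0},
    {"time": 2, "room": "bedroom", "room_temp": 21.0, "stove_temp": 195.5},
    {"time": 3, "room": "bedroom", "room_temp": 21.0, "stove_temp": 201.2},
    {"time": 4, "room": "bedroom", "room_temp": 21.5, "stove_temp": 199.8},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into [value, count] pairs."""
    return [[value, len(list(run))] for value, run in groupby(values)]


# Equal values sit next to each other in a column, so they compress cheaply.
encoded_room_temp = run_length_encode(columns["room_temp"])

# A min/max query reads only the single column it needs.
room_min = min(columns["room_temp"])
room_max = max(columns["room_temp"])

# The row-oriented equivalent has to visit every row to pluck out one value.
row_min = min(row["room_temp"] for row in rows)

print(encoded_room_temp, room_min, room_max)
```

[In this toy data the regulated room temperature collapses to two run-length pairs, while the stove readings, which change on every sample, would not compress this way at all.]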
And so that's pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >> Yeah. Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >> Sure. So it's an extensible query execution framework and it uses Arrow as its in-memory format. So the way that it helps InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the querying and transformation of that data. It also has a Pandas API so that you could take advantage of Pandas data frames as well and all of the machine learning tools associated with Pandas. >> Okay. You're also leveraging Parquet in the platform, of course. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >> Sure. So Parquet is the column-oriented durable file format. So it's important because it'll enable bulk import and bulk export. It has compatibility with Python and Pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column-oriented. In particular, I think Parquet files are, like, 16 times cheaper than CSV files, just as kind of a point of reference. And so that's essentially a lot of the benefits of Parquet. >> Got it. Very popular. So what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >> Sure. So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying Influx.
Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think kind of the idea here, the long term strategy, is that the more you contribute to these upstream projects and build those up, then the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >> Yeah. Got it. You got that virtuous cycle going, people call it the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >> So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx, and I really encourage you, if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, or if you have questions and just want to learn more, to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There's also a community forum and a community Slack channel. Look for the InfluxDB underscore IOx channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I want to answer your questions. So if there's a particular technology or stack that you want to dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you.
>> Yeah, that's awesome. You guys have a really rich community, collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >> Thank you. I really appreciate it. >> All right, you're very welcome. Okay, stay right there and in a moment I'll be back with Tim Yocum. He's the Director of Engineering for InfluxData and we're going to talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't want to miss this. (upbeat music)

Published Date : Oct 18 2022

ENTITIES

Entity	Category	Confidence
Tim Yocum	PERSON	0.99+
Brian	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Anais	PERSON	0.99+
two rows	QUANTITY	0.99+
16 times	QUANTITY	0.99+
Influx Data	ORGANIZATION	0.99+
each row	QUANTITY	0.99+
Python	TITLE	0.99+
Rust	TITLE	0.99+
C++	TITLE	0.99+
SQL	TITLE	0.99+
Anais Dotis Georgiou	PERSON	0.99+
InfluxDB	TITLE	0.99+
both	QUANTITY	0.99+
Rust Foundation	ORGANIZATION	0.99+
30,000 feet	QUANTITY	0.99+
first one	QUANTITY	0.99+
Mozilla	ORGANIZATION	0.99+
Pandas	TITLE	0.98+
InfluxData	ORGANIZATION	0.98+
Influx	ORGANIZATION	0.98+
IOx	TITLE	0.98+
each column	QUANTITY	0.97+
one time stamp	QUANTITY	0.97+
first	QUANTITY	0.97+
Influx	TITLE	0.96+
Anais Dotis-Georgiou	PERSON	0.95+
Crates IO	TITLE	0.94+
IOx	ORGANIZATION	0.94+
two temperature values	QUANTITY	0.93+
Apache	ORGANIZATION	0.93+
today	DATE	0.93+
8:30 AM Pacific time	DATE	0.92+
Wednesday	DATE	0.91+
one temperature	QUANTITY	0.91+
two temperature values	QUANTITY	0.91+
InfluxDB IOx	TITLE	0.9+
influx	ORGANIZATION	0.89+
last decade	DATE	0.88+
single row	QUANTITY	0.83+
a ton more data	QUANTITY	0.81+
thousand	QUANTITY	0.8+
dozens of other features	QUANTITY	0.8+
a thousand different points	QUANTITY	0.79+
Hadoop	TITLE	0.77+
Parquet	TITLE	0.76+
points	QUANTITY	0.75+
each	QUANTITY	0.75+
Slack	TITLE	0.74+
Evolving InfluxDB	TITLE	0.68+
kilometer	QUANTITY	0.67+
Arrow	TITLE	0.62+
The Cube	ORGANIZATION	0.61+

Angelo Fausti & Caleb Maclachlan | The Future is Built on InfluxDB

>> Okay. We're now going to go into the customer panel, and we'd like to welcome Angelo Fausti, who's a software engineer at the Vera C. Rubin Observatory, and Caleb Maclachlan, who's senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. You don't want to miss this interview, folks. Caleb, let's start with you. You work for an extremely cool company, you're launching satellites into space. Of course doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem. >> Yeah, absolutely. And thanks for having me here, by the way. So Loft Orbital is a company that's a series B startup now, and our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about just a lot of very specialized engineering. And what we're trying to do is change that from a super specialized problem that has an extremely high barrier of access to an infrastructure problem, so that getting your programs, your mission deployed on orbit, with access to different sensors, cameras, radios, stuff like that, is almost as simple as deploying a VM in AWS or GCP. So, that's kind of our mission, and just to give a really brief example of the kind of customer that we can serve: there's a really cool company called Totum Labs, who is working on building an IoT constellation, for internet of things, basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container, a container that you track from anywhere in the world as it's going across the ocean.
So, and it's really little, and they've been able to stay a small startup that's focused on their product, which is the, that super crazy, complicated, cool radio, while we handle the whole space segment for them, which just, you know, before Loft was really impossible. So that's our mission is providing space infrastructure as a service. We are kind of groundbreaking in this area and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've got to handle. >> Yeah. So amazing Caleb, what you guys do. Now, I know you were lured to the skies very early in your career, but how did you kind of land in this business? >> Yeah, so, I guess just a little bit about me. For some people, they don't necessarily know what they want to do like earlier in their life. For me I was five years old and I knew I want to be in the space industry. So, I started in the Air Force, but have stayed in the space industry my whole career and been a part of, this is the fifth space startup that I've been a part of actually. So, I've kind of started out in satellites, spent some time in working in the launch industry on rockets, then, now I'm here back in satellites and honestly, this is the most exciting of the different space startups that I've been a part of. >> Super interesting. Okay. Angelo, let's talk about the Rubin Observatory. Vera C. Rubin, famous woman scientist, galaxy guru. Now you guys, the Observatory, you're up way up high, you get a good look at the Southern sky. And I know COVID slowed you guys down a bit, but no doubt you continued to code away on the software. I know you're getting close, you got to be super excited, give us the update on the Observatory and your role. >> All right. So, yeah. Rubin is a state of the art observatory that is in construction on a remote mountain in Chile. And, with Rubin we'll conduct the large survey of space and time. 
We're going to observe the sky with an eight meter optical telescope and take 1000 pictures every night with a 3.2 gigapixel camera. And we are going to do that for 10 years, which is the duration of the survey. >> Yeah, amazing project. Now, you earned a doctor of philosophy so you probably spent some time thinking about what's out there, and then you went out to earn a PhD in astronomy and astrophysics. So, this is something that you've been working on for the better part of your career, isn't it? >> Yeah, that's right, about 15 years. I studied physics in college. Then I got a PhD in astronomy. And I worked for about five years on another project, the Dark Energy Survey, before joining Rubin in 2015. >> Yeah, impressive. So it seems like both your organizations are looking at space from two different angles. One thing you guys both have in common of course is software, and you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB, get into it? How do you use the platform? Maybe Caleb you could start. >> Yeah, absolutely. So, the first company that I extensively used InfluxDB in was a launch startup called Astra. And we were in the process of designing our first generation rocket there, and testing the engines, pumps, everything that goes into a rocket. And when I joined the company our data story was not very mature. We were collecting a bunch of data in LabVIEW and engineers were taking that over to MATLAB to process it. And, you know, that's the way that a lot of engineers and scientists are used to working. And at first people weren't entirely sure that that needed to change. But the nice thing about InfluxDB is that it's so easy to deploy. So our software engineering team was able to get it deployed and up and running very quickly, and then quickly also backport all of the data that we had collected thus far into Influx.
And, what was amazing to see and is kind of the super cool moment with Influx is, when we hooked that up to Grafana, Grafana as the visualization platform we used with Influx, 'cause it works really well with it. There was like this aha moment of our engineers who are used to this post process kind of method for dealing with their data, where they could just almost instantly easily discover data that they hadn't been able to see before, and take the manual processes that they would run after a test and just throw those all in Influx and have live data as tests were coming, and, I saw them implementing like crazy rocket equation type stuff in Influx, and it just was totally game changing for how we tested. >> So Angelo, I was explaining in my open, that you could add a column in a traditional RDBMS and do time series, but with the volume of data that you're talking about in the example that Caleb just gave, you have to have a purpose built time series database. Where did you first learn about InfluxDB? >> Yeah, correct. So, I work with the data management team, and my first project was the record metrics that measured the performance of our software, the software that we used to process the data. So I started implementing that in our relational database. But then I realized that in fact I was dealing with time series data and I should really use a solution built for that. And then I started looking at time series databases and I found InfluxDB, and that was back in 2018. The, another use for InfluxDB that I'm also interested is the visits database. If you think about the observations, we are moving the telescope all the time and pointing to specific directions in the sky and taking pictures every 30 seconds. So that itself is a time series. And every point in that time series, we call a visit. So we want to record the metadata about those visits in InfluxDB. That time series is going to be 10 years long, with about 1000 points every night. 
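[Editor's sketch, not part of the interview: a quick back-of-the-envelope check of the visit time series size, using only the round numbers quoted above. Observing on every night of the year is an assumption, so the total is an upper bound.]

```python
# Round numbers quoted in the interview.
VISITS_PER_NIGHT = 1000
SECONDS_PER_VISIT = 30
SURVEY_YEARS = 10

# Assumption: the telescope observes on every night of the year.
NIGHTS_PER_YEAR = 365

total_visits = VISITS_PER_NIGHT * NIGHTS_PER_YEAR * SURVEY_YEARS
observing_hours_per_night = VISITS_PER_NIGHT * SECONDS_PER_VISIT / 3600

print(f"{total_visits:,} visits over the survey")
print(f"{observing_hours_per_night:.1f} hours of observing per night")
```

[Under these assumptions the whole survey is a few million points, and 1000 visits at 30 seconds each fit in a bit over eight hours, which is consistent with one night of observing.]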
It's actually not too much data compared to other problems. It's really just a different time scale. >> The telescope at the Rubin Observatory is like, pun intended, I guess the star of the show. And I believe I read that it's going to be the first of the next gen telescopes to come online. It's got this massive field of view, like three orders of magnitude times the Hubble's widest camera view, which is amazing. Like, that's like 40 moons in an image, amazingly fast as well. What else can you tell us about the telescope? >> This telescope it has to move really fast. And, it also has to carry the primary mirror which is an eight meter piece of glass. It's very heavy. And it has to carry a camera which has about the size of a small car. And this whole structure weighs about 300 tons. For that to work, the telescope needs to be very compact and stiff. And one thing that's amazing about it's design is that, the telescope, this 300 tons structure, it sits on a tiny film of oil, which has the diameter of human hair. And that makes an, almost zero friction interface. In fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide field telescope. So, each image has, in diameter the size of about seven full moons. And, with that, we can map the entire sky in only three days. And of course, during operations everything's controlled by software and it is automatic. There's a very complex piece of software called the Scheduler, which is responsible for moving the telescope, and the camera, which is recording 15 terabytes of data every night. >> And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >> Yeah, actually not. So we use InfluxDB to record engineering data and metadata about the observations. Like telemetry, events, and commands from the telescope. That's a much smaller data set compared to the images. 
But it is still challenging because you have some high frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >> Got it. Thank you. Okay, Caleb, let's bring you back in. Tell us more about the, you got these dishwasher size satellites, kind of using a multi-tenant model, I think it's genius. But tell us about the satellites themselves. >> Yeah, absolutely. So, we have in space some satellites already that, as you said, are like dishwasher, mini fridge kind of size. And we're working on a bunch more that are a variety of sizes, from shoebox to, I guess, a few times larger than what we have today. And we do shoot to have effectively something like a multi-tenant model where we will buy a bus off the shelf. The bus is what you can kind of think of as the core piece of the satellite, almost like a motherboard or something, where it's providing the power, it has the solar panels, it has some radios attached to it. It handles the attitude control, basically steers the spacecraft in orbit, and then we also build in-house what we call our payload hub, which has any customer payloads attached and our own kind of edge processing sort of capabilities built into it. And so we integrate that, we launch it, and those things, because they're in low Earth orbit, they're orbiting the Earth every 90 minutes. That's seven kilometers per second, which is several times faster than a speeding bullet. So one of the unique challenges of operating spacecraft in low Earth orbit is that generally you can't talk to them all the time. So, we're managing these things through very brief windows of time, where we get to talk to them through our ground sites, either in Antarctica or in the North Pole region. >> Talk more about how you use InfluxDB to make sense of this data through all this tech that you're launching into space.
>> We basically, previously we started off when I joined the company, storing all of that as Angelo did in a regular relational database. And we found that it was so slow and the size of our data would balloon over the course of a couple days to the point where we weren't able to even store all of the data that we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. So, that's things like power level, voltage, currents, counts, whatever metadata we need to monitor about the spacecraft, we now store that in InfluxDB. And that has, now we can actually easily store the entire volume of data for the mission life so far without having to worry about the size bloating to an unmanageable amount, and we can also seamlessly query large chunks of data. Like if I need to see, you know, for example, as an operator, I might want to see how my battery state of charge is evolving over the course of the year, I can have, plot in an Influx that loads that in a fraction of a second for a year's worth of data because it does, intelligent, it can intelligently group the data by assigning time interval. So, it's been extremely powerful for us to access the data. And, as time has gone on, we've gradually migrated more and more of our operating data into Influx. >> Yeah. Let's talk a little bit about, we throw this term around a lot of, you know, data driven, a lot of companies say, "Oh yes, we're data driven." But you guys really are, I mean, you got data at the core. Caleb, what does that mean to you? >> Yeah, so, you know, I think the, and the clearest example of when I saw this be like totally game changing is what I mentioned before at Astra where our engineer's feedback loop went from a lot of kind of slow researching, digging into the data to like an instant, instantaneous almost, seeing the data, making decisions based on it immediately rather than having to wait for some processing. 
And that's something that I've also seen echoed in my current role. But to give another practical example, as I said, we have a huge amount of data that comes down every orbit and we need to be able to ingest all of that data almost instantaneously and provide it to the operator in near real time, about a second worth of latency is all that's acceptable for us to react to see what is coming down from the spacecraft. And building that pipeline is challenging from a software engineering standpoint. My primary language is Python which isn't necessarily that fast. So what we've done is started, and the goal of being data-driven is publish metrics on individual, how individual pieces of our data processing pipeline are performing into Influx as well. And we do that in production as well as in dev. So we have kind of a production monitoring flow. And what that has done is allow us to make intelligent decisions on our software development roadmap where it makes the most sense for us to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are. And sometimes we've found ourselves before we started doing this, kind of chasing rabbits that weren't necessarily the real root cause of issues that we were seeing. But now that we're being a bit more data driven there, we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scale from supporting a couple of satellites to supporting many, many satellites at once. >> Yeah, of course is how you reduced those dead ends. Maybe Angelo you could talk about what sort of data-driven means to you and your teams. >> I would say that, having real time visibility to the telemetry data and metrics is crucial for us. We need to make sure that the images that we collect with the telescope have good quality, and, that they are within the specifications to meet our science goals. 
And so if they are not, we want to know that as soon as possible and then start fixing problems. >> Caleb, what are your sort of event, you know, intervals like? >> So I would say that, as of today on the spacecraft, the event, the level of timing that we deal with probably tops out at about 20 Hertz, 20 measurements per second on things like our gyroscopes. But, the, I think the core point here of the ability to have high precision data is extremely important for these kinds of scientific applications and I'll give an example from when I worked at, on the rockets at Astra. There, our baseline data rate that we would ingest data during a test is 500 Hertz. So 500 samples per second, and in some cases we would actually need to ingest much higher rate data, even up to like 1.5 kilohertz, so extremely, extremely high precision data there where timing really matters a lot. And, you know, I can, one of the really powerful things about Influx is the fact that it can handle this. That's one of the reasons we chose it, because, there's, times when we're looking at the results of a firing where you're zooming in, you know, I talked earlier about how on my current job we often zoom out to look at a year's worth of data. You're zooming in to where your screen is preoccupied by a tiny fraction of a second, and you need to see same thing as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers, so that can be something like, "Hey, I opened this valve at exactly this time," and that goes, we want to have that at, micro, or even nanosecond precision so that we know, okay, we saw a spike in chamber pressure at this exact moment, was that before or after this valve opened? 
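[Editor's sketch, not part of the interview: one way to picture the before-or-after question is to keep both telemetry samples and controller events as integer nanosecond timestamps and compare them directly. All of the data, names, and the threshold below are hypothetical and chosen for illustration; this is not Astra's or InfluxDB's code.]

```python
# Hypothetical test-stand data. Every timestamp is an integer count of
# nanoseconds since the Unix epoch.
T0 = 1_700_000_000_000_000_000

# Chamber pressure sampled at 500 Hz, i.e. one sample every 2,000,000 ns.
SAMPLE_PERIOD_NS = 1_000_000_000 // 500
pressure = [
    (T0 + i * SAMPLE_PERIOD_NS, value)
    for i, value in enumerate([101.0, 101.2, 101.1, 250.7, 248.9])
]

# Controller events arrive at their own instants, not aligned to the samples.
events = [(T0 + 5_999_900, "main_valve_open")]


def first_sample_at_or_above(samples, threshold):
    """Return the timestamp of the first sample whose value reaches threshold."""
    for timestamp_ns, value in samples:
        if value >= threshold:
            return timestamp_ns
    return None


spike_ts = first_sample_at_or_above(pressure, 200.0)
valve_ts = events[0][0]
valve_opened_before_spike = valve_ts < spike_ts
gap_ns = spike_ts - valve_ts

# Why integers: a 64-bit float cannot tell apart two epoch-scale nanosecond
# timestamps that are only 100 ns apart.
collapsed_as_float = float(T0 + 100) == float(T0)

print(valve_opened_before_spike, gap_ns, collapsed_as_float)
```

[In this made-up run the valve event lands 100 nanoseconds before the first high-pressure sample, a gap that only stays visible because the timestamps are kept as integers.]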
That kind of visibility is critical in these kinds of scientific applications, and it's absolutely game changing to be able to see that in near real time, and with a really easy way for engineers to be able to visualize this data themselves without having to wait for us software engineers to go build it for them. >> Can the scientists do self-serve or do you have to design and build all the analytics and queries for your scientists? >> Well, from my perspective that's absolutely one of the best things about Influx, and what I've seen be game changing is that, generally I'd say, anyone can learn to use Influx. And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic open source graphing library that is very similar to Influx's own Chronograf. >> Sure. >> And what it does is, it provides a very intuitive UI for building your queries. So, you choose a measurement and it shows a dropdown of available measurements. And then you choose the particular fields you want to look at, and again, that's a dropdown. So, it's really easy for our users to discover, and there's kind of point and click options for doing math, aggregations. You can even do like perfect kind of predictions all within Grafana, the Grafana user interface, which is really just a wrapper around the APIs and functionality that Influx provides. >> Putting data in the hands of those who have the context, the domain experts, is key. Angelo, is it the same situation for you, is it self-serve? >> Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards because they know exactly what they need to visualize. >> Yeah, I mean, it's all about using the right tool for the job.
I think for us, when I joined the company we weren't using InfluxDB and we were dealing with serious issues of the database growing to an incredible size extremely quickly, and even querying short periods of data was taking on the order of seconds, which is just not possible for operations. >> Guys, this has been really informative, it's pretty exciting to see how the edge is mountaintops, low Earth orbits, I mean space is the ultimate edge, isn't it? I wonder if you could answer two questions to wrap here. You know, what comes next for you guys? And is there something that you're really excited about that you're working on? Caleb maybe you could go first and then Angelo you can bring us home. >> Basically what's next for Loft Orbital is more satellites, a greater push towards infrastructure, and really, our mission is to make space simple for our customers and for everyone. And we're scaling the company like crazy now, making that happen. It's an extremely exciting time to be in this company and to be in this industry as a whole. Because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of, and with companies like SpaceX and the now rapidly lowering cost of launch, it's just a really exciting place to be in. We're launching more satellites, we are scaling up for some constellations, and our ground system has to be improved to match. So, there's a lot of improvements that we're working on to really scale up our control software to be best in class and make it capable of handling such a large workload, so. >> Are you guys hiring? >> We are absolutely hiring. We have positions all over the company, so we need software engineers, we need people who do more aerospace specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website, if this is at all interesting. >> All right, Angelo, bring us home. >> Yeah.
So what's next for us is really getting this telescope working and collecting data. And when that's happened is going to be just a deluge of data coming out of this camera and handling all that data is going to be really challenging. Yeah, I want to be here for that, I'm looking forward. Like for next year we have like an important milestone, which is our commissioning camera, which is a simplified version of the full camera, it's going to be on sky, and so yeah, most of the system has to be working by then. >> Nice. All right guys, with that we're going to end it. Thank you so much, really fascinating, and thanks to InfluxDB for making this possible, really groundbreaking stuff, enabling value creation at the Edge, in the cloud, and of course, beyond at the space. So, really transformational work that you guys are doing, so congratulations and really appreciate the broader community. I can't wait to see what comes next from having this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave Vellante, and you're watching theCUBE, the leader in high tech enterprise coverage. >> Welcome. Telegraf is a popular open source data collection agent. Telegraf collects data from hundreds of systems like IoT sensors, cloud deployments, and enterprise applications. It's used by everyone from individual developers and hobbyists, to large corporate teams. The Telegraf project has a very welcoming and active Open Source community. Learn how to get involved by visiting the Telegraf GitHub page. Whether you want to contribute code, improve documentation, participate in testing, or just show what you're doing with Telegraf. We'd love to hear what you're building. >> Thanks for watching Moving the World with InfluxDB, made possible by Influx Data. I hope you learned some things and are inspired to look deeper into where time series databases might fit into your environment. 
If you're dealing with large and/or fast data volumes, and you want to scale cost effectively with the highest performance, and you're analyzing metrics and data over time, time series databases just might be a great fit for you. Try InfluxDB out. You can start with a free cloud account by clicking on the link in the resources below. Remember, all these recordings are going to be available on demand at thecube.net and influxdata.com, so check those out. And poke around Influx Data. They are the folks behind InfluxDB, and one of the leaders in the space. We hope you enjoyed the program. This is Dave Vellante for theCUBE, we'll see you soon. (upbeat music)

Published Date : May 18 2022



The Future Is Built On InfluxDB

>>Time series data is any data that's stamped in time in some way. That could be every second, every minute, every five minutes, every hour, every nanosecond, whatever it might be. And typically that data comes from sources in the physical world, like devices or sensors, temperature gauges, batteries, any device really, or from things in the virtual world. It could be software, maybe software in the cloud, or data in containers or microservices or virtual machines. So all of these items, whether in the physical or virtual world, are generating a lot of time series data. Now, time series data has been around for a long time, and there are many examples in our everyday lives. All you gotta do is punch up any stock ticker and look at its price over time in graphical form. And that's a simple use case that anyone can relate to. And you can build timestamps into a traditional relational database. >>You just add a column to capture time. And as well, there are examples of log data being dumped into a data store that can be searched and captured and ingested and visualized. Now, the problem with the latter example that I just gave you is that you gotta hunt and peck and search and extract what you're looking for. And the problem with the former is that traditional general-purpose databases are designed as sort of a Swiss Army knife for any workload. And there are a lot of functions that get in the way and make them inefficient for time series analysis, especially at scale. Like when you think about OT and edge scale, where things are happening super fast, ingestion is coming from many different sources, and analysis often needs to be done in real time or near real time. And that's where time series databases come in.
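As a rough illustration of the "just add a column to capture time" approach described above, here is a minimal sketch that stores stock ticks in a general-purpose relational database (SQLite, chosen here only because it ships with Python) and averages the price per one-minute window. The table name `ticks`, the function `bucket_averages`, and the sample values are all hypothetical and are not taken from the program; this is not how InfluxDB itself stores or queries data.

```python
import sqlite3

# Hypothetical sample data: (epoch_seconds, price) pairs for one stock ticker.
SAMPLE_TICKS = [(0, 10.0), (30, 20.0), (60, 30.0), (119, 50.0), (120, 5.0)]


def bucket_averages(rows, bucket_seconds=60):
    """Store timestamped rows in a relational table, then average the
    values inside each fixed-size time bucket."""
    conn = sqlite3.connect(":memory:")
    # The "time series" part is nothing more than an extra timestamp column.
    conn.execute("CREATE TABLE ticks (ts INTEGER NOT NULL, price REAL NOT NULL)")
    conn.executemany("INSERT INTO ticks (ts, price) VALUES (?, ?)", rows)
    # Integer division snaps each timestamp to the start of its bucket.
    query = """
        SELECT (ts / ?) * ? AS bucket_start, AVG(price)
        FROM ticks
        GROUP BY bucket_start
        ORDER BY bucket_start
    """
    result = conn.execute(query, (bucket_seconds, bucket_seconds)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for bucket_start, average in bucket_averages(SAMPLE_TICKS):
        print(bucket_start, average)
```

This works at small scale, but every windowing and retention rule has to be written by hand on top of the general-purpose engine, which is the kind of overhead the speaker goes on to describe.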
>>Time series databases are purpose-built and can much more efficiently support ingesting metrics at scale and then comparing data points over time. Time series databases can write and read at significantly higher speeds and deal with far more data than traditional database methods. And they're more cost effective. Instead of throwing processing power at the problem, for example, the underlying architecture and algorithms of time series databases can optimize queries, and they can reclaim wasted storage space and reuse it. At scale, time series databases are simply a better fit for the job. Welcome to Moving the World with InfluxDB, made possible by InfluxData. My name is Dave Vellante and I'll be your host today. InfluxData is the company behind InfluxDB, the open source time series database. InfluxDB is designed specifically to handle time series data, as I just explained. We have an exciting program for you today, and we're gonna showcase some really interesting use cases. >>First, we'll kick it off in our Palo Alto studios, where my colleague John Furrier will interview Evan Kaplan, who's the CEO of InfluxData. After John and Evan set the table, John's gonna sit down with Brian Gilmore. He's the Director of IoT and Emerging Tech at InfluxData. And they're gonna dig into where InfluxData is gaining traction, why adoption is occurring, and why it's so robust. And they're gonna have tons of examples and double-click into the technology. And then we bring it back here to our East Coast studios, where I get to talk to two practitioners doing amazing things in space with satellites and modern telescopes. These use cases will blow your mind. You don't want to miss it. So thanks for being here today. And with that, let's get started. Take it away, Palo Alto. >>Okay. Today we welcome Evan Kaplan, CEO of InfluxData, the company behind InfluxDB. Welcome, Evan. Thanks for coming on. >>Hey John, thanks for having me. >>Great segment here on the InfluxDB story. What is the story?
Take us through the history. Why time series? What's the story? >>(laughs) So the history is actually pretty interesting. Paul Dix, my partner in this and our founder, is super passionate about developers and developer experience. And he had worked on Wall Street building a number of time series platforms, trading platforms for trading stocks. And from his point of view, it was always what he would call a yak shave, which means you had to do a ton of work just to start doing work, which means you had to write a bunch of extrinsic routines. You had to write a bunch of application handling on existing relational databases in order to come up with something that was optimized for a trading platform or a time series platform. And he just developed this real clear point of view: this is not how developers should work. And so in 2013, he went through Y Combinator, and he made his first commit to open source InfluxDB at the end of 2013. And he basically, from my point of view, invented modern time series, which is you start with a purpose-built time series platform to do these kinds of workloads, and you get all the benefits of having something right out of the box, so a developer can be totally productive right away. >>And how many people are in the company? What's the history of employees and stuff? >>Yeah, I always forget the number, but it's something like 230 or 240 people now. I joined the company in 2016, and I love Paul's vision. And I just had a strong conviction about the relationship between time series and IoT. 'Cause if you think about it, what sensors do is they speak time series: pressure, temperature, volume, humidity, light. They're measuring, they're instrumenting something over time. And so I thought that would be super relevant over the long term, and I've not regretted it. >>Oh no.
And it's interesting. At that time, go back in the history, you know, the role of databases: the relational database was the one database to rule the world. And then as clouds started coming in, you started to see more types of databases proliferate, and time series in particular is interesting, 'cause real time has become super valuable from an application standpoint. OT, which speaks time series, means something. It's like, time matters. >>Time. >>Yeah. And sometimes data's not worth it after the time, sometimes it's worth it. And then you get the data lake. So you have this whole new evolution. Is this the momentum? I guess the question is, what's the momentum behind... >>You mean what's causing us to grow? >>Yeah, the time series. Why is time series... >>And the... >>Category momentum? What's the bottom line? >>Well, think about it from a broad sort of frame, which is that what everybody's trying to do is build increasingly intelligent systems. Whether it's a self-driving car or a robotic system that does what you want it to do or a self-healing software system, everybody wants to build increasingly intelligent systems. And so in order to build these increasingly intelligent systems, you have to instrument the system well, and you have to instrument it over time, better and better. And so you need a tool, a fundamental tool, to drive that instrumentation. And it's become clear to everybody that that instrumentation is all based on time: what happened, what's gonna happen? And so you get to these applications like predictive maintenance or smarter systems. And increasingly you want to do that stuff not just intelligently, but fast, in real time.
So millisecond response, so that when you're driving a self-driving car and the system realizes that you're about to do something, essentially you wanna be able to act in something that looks like real time. All systems want to do that. They want to be more intelligent and they want to be more real time. And so we just happened to show up at the right time in the evolution of a... >>Market. It's interesting: near real time isn't good enough when you need real time. >>(laughs) Yeah, it's not, it's not. And it's like everybody wants it. Even when you don't need it, ironically, you want it. It's like when you buy a new television, you want that one feature, even though you're not gonna use it. Real time is a buying criterion. >>So, I mean, what you're saying then is near real time is getting as close to real time as possible, as fast as possible. Right. Okay. So talk about the aspect of data, 'cause we're hearing a lot of conversations on theCUBE in particular around how people are implementing and actually getting better. So iterating on data, but you have to know when it happened to know how to fix it. So this is a big part of what we're seeing, with people saying, "Hey, I wanna make my machine learning algorithms better after the fact, I wanna learn from the data." How do you see that evolving? Is that one of the use cases of sensors, as people bring data in off the network, getting better with the data, knowing when it happened? >>Well, for sure. What you're saying is none of this is non-linear, it's all incremental. And so if you take something, just as an easy example, if you take a self-driving car, what you're doing is you're instrumenting that car to understand where it can perform in the real world in real time.
And if you run the loop, which is: I instrument it, I watch what happens, oh, that's wrong, I have to correct for that, I correct for that in the software. If you do that a billion times, you get a self-driving car. But every system moves along that evolution. And so you get the dynamic of constantly instrumenting, watching the system behave, and doing it. And a self-driving car is one thing. But even in the human genome, if you look at some of our customers, people doing solar arrays, people doing Powerwalls, all of these systems are getting smarter. >>Well, let's get into that. What are the top applications? What are you seeing with InfluxDB, the time series? What's the sweet spot for the application use case, and some customers? Give some... >>Examples. Yeah. So it's pretty easy to understand. On one side of the equation, that's the physical side, sensors are getting cheap. Obviously we know that. And the whole physical world is getting instrumented: your home, your car, the factory floor, your wristwatch, your healthcare, you name it. It's getting instrumented in the physical world. We're watching the physical world in real time. And so there are three or four sweet spots for us, but they're all on that side. They're all about IoT. So think about consumer IoT projects like Google's Nest, Tado, Particle sensors, even delivery engines like Rappi, who deliver the Instacart of South America, like anywhere there's a physical location. And that's on the consumer side. And then another exciting space is the industrial side. Factories are changing dramatically over time, increasingly moving away from proprietary equipment to developer-driven systems that run operations, because what has to get smarter when you're building a factory is that systems all have to get smarter.
And then lastly, a lot in renewables and sustainability. So a lot, you know, Tesla, Lucid Motors, Nikola Motors, lots to do with electric cars, solar arrays, windmill arrays, just anything that's gonna get instrumented, where that instrumentation becomes part of what the purpose... >>Is. It's interesting. The convergence of physical and digital is happening with the data. IoT, you mentioned. You think of IoT, look at the use cases there: it was proprietary OT systems, now becoming more IP-enabled, Internet Protocol, and now edge compute getting smaller, faster, cheaper, AI going to the edge. Now you have all kinds of new capabilities that bring that real time and time series opportunity. Are you seeing IoT going to a new level? Where are the IoT dots connecting to? Because as these two cultures merge, operations, basically industrial, factory, car, they gotta get smarter. Intelligent edge is a buzzword, but I mean, it has to be more intelligent. Where's the action in all this? So the... >>Action, really, at the core, it's at the developer, right? Because you're looking at these things, and it's very hard to get an off-the-shelf system to do the kinds of physical and software interaction. So the action's really happening at the developer. And so what you're seeing is a movement in the world that maybe you and I grew up in, with IT or OT moving increasingly to that developer-driven capability. And so all of these IoT systems are bespoke. They don't come out of the box. And so the developer, the architect, the CTO, they define: what's my business? What am I trying to do? Am I trying to sequence a human genome and figure out when these genes express themselves, or am I trying to figure out when the next heart rate monitor's gonna show up on my Apple Watch, right? What am I trying to do? What's the system I need to build?
And so starting with the developer is where all of the good stuff happens here, which is different than it used to be, right? It used to be you'd buy an application or a service or a SaaS thing. But with this dynamic, with this integration of systems, it's all about bespoke. It's all about building... >>Something. So let's get to the developer real quick. A real highlight point here is the data. I mean, I could see a developer saying, okay, I need to have an application for the edge, IoT edge or car. I mean, Tesla's got applications in the car, it's right there. There's the modern application life cycle now. So take us through how this impacts the developer. Does it impact their CI/CD pipeline? Is it cloud native? Where does this go to? >>Well, so first of all, there was an internal journey that we had to go through as a company, which I think is fascinating for anybody who's interested: we went from primarily monolithic software that was open sourced to building a cloud native platform, which means we had to move from an agile development environment to a CI/CD environment. So to the degree that you are moving your service, whether it's Tesla monitoring your car and updating your Powerwalls, or whether it's a solar company updating the arrays, to the degree that that service is cloud, then increasingly you move from agile development to a CI/CD environment, in which you're shipping code to production every day. And so it's not just the developers, it's all the infrastructure to support the developers to run that service and that sort of stuff. I think that's also gonna happen in a big way. >>With your customer base that you have now, and as you see it evolving with InfluxDB, is it that they're gonna be writing more of the application or relying more on others? I mean, obviously there's an open source component here.
So when you bring in kind of old way, new way: the old way was I've got a proprietary platform running all this OT stuff, and I gotta write an application that's general purpose. Yeah, I have some flexibility, it's somewhat brittle, maybe not a lot of robustness to it, but it does its job... >>A good way to think about this is... >>Versus a new way is what? >>So yeah, a good way to think about this is: what's the role of the developer slash architect, CTO, that chain, within an enterprise or a company? And so the way to think about it is, I started my career in the aerospace industry (laughs), and when you look at what Boeing does to assemble a plane, they build very, very few of the parts. Instead, what they do is they assemble. They buy the wings, they buy the engines, they assemble. Actually, they don't buy the wings. It's the one thing: they buy the material and they build the wings, 'cause there's a lot of tech in the wings. And they end up being assemblers, smart assemblers, of what ends up being a flying airplane, which is a pretty big deal even now. And so what happens with software people is they have the ability to pull from the best of the open source world. So they would pull a time series capability from us. Then they would assemble that with potentially some ETL logic from somebody else, or they'd assemble it with a Kafka interface to be able to stream the data in. And so they become very good integrators and assemblers, but they become masters of that bespoke application. And I think that's where it goes, 'cause you're not writing native code for everything. >>So they're more flexible. They have faster time to market 'cause they're assembling way faster, and they still get to maintain their core competency, their wings in this case. >>They become increasingly not just coders, but designers and developers. They become broadly builders, is how we like to think of it.
People who start and build stuff. By the way, this is not different than what the people just up the road at Google have been doing for years, or the tier one, Amazon, building all their own. >>Well, I think one of the things that's interesting is this idea of developing a system architecture. I mean, systems have consequences when you make changes. So when you now have cloud, data center, on-premises and edge working together, how does that work across the system? You can't have a wing that doesn't work with the other wing kind of thing. >>That's exactly it. But that's where that Boeing or that airplane building analogy comes in. For us, we've really been thoughtful about that, because for IoT it's critical. So our open source edge has the same API as our cloud native stuff and as our enterprise on-prem edge. So our multiple products have the same API and they have a relationship with each other. They can talk with each other. So the builder builds it once. And so this is where, when you start thinking about the components that people have to use to build these services, you wanna make sure that, at least at that base layer, that database layer, those components talk to each other. >>So I'll have to ask you, if I'm the customer, I put my customer hat on. Okay. Hey, I'm dealing with a lot... >>That means you have a PO for... (laughs) >>A big check, a blank check, only if you get the question right. I've got all this important operations stuff. I've got my factory, I've got my self-driving cars. This isn't trivial stuff. This is my business. How should I be thinking about time series? Because now I have to make these architectural decisions, as you mentioned, and it's gonna impact my application development. So a huge decision point for your customers. What should I care about the most? What's in it for me? Why is time series... >>Important? Yeah, that's a great question.
So chances are, if you've got a business that's 20 or 25 years old, you were already thinking about time series. You probably didn't call it that. You built something on Oracle or you built something on IBM's Db2, right? And you made it work within your system. And so that's what you started building. So it's already out there. There are probably hundreds of millions of time series applications out there today. But as you start to think about this increasing need for real time, and you start to think about increasing intelligence, you think about optimizing those systems over time, I hate the word, but digital transformation, then you start with time series. It's a foundational base layer for any system that you're gonna build. There's no system I can think of where time series shouldn't be the foundational base layer. If you just wanna store your data and leave it there and then maybe look it up every five years, that's fine. That's not time series. Time series is when you're building a smarter, more intelligent, more real-time system. And the developers now know that. And so the more they play a role in building these systems, the more obvious it becomes. >>And since I have a PO for you and a big check, what's the value to me when I implement this? What's the end state? What's it look like when it's up and running? What's the value proposition for me? >>So when it's up and running, you're able to handle the queries, the writing of the data, the downsampling of the data, the transforming of it in near real time, so that the other systems that depend on it, for adjusting a solar array or trading energy off of a Powerwall or some sort of human genome work, those systems work better. So time series is foundational.
It's not like it's doing every action that's above, but it's foundational to building a really compelling, intelligent system. I think that's what developers and architects are seeing now. >>Bottom line, final word. What's in it for the customer? What's your statement to the customer? What would you say to someone looking to do something in time series on the edge? >>Yeah. So it's pretty clear to us that if you view yourself as being in the business of building systems, and you want 'em to be increasingly intelligent, self-healing, autonomous, you want 'em to operate in real time, then you start from time series. But I also wanna say what's in it for us at Influx. What's in it for us is people are doing some amazing stuff. I highlighted some of the energy stuff, some of the human genome, some of the healthcare. It's hard not to be proud or feel like, wow, somehow I've been lucky. I've arrived at the right time, in the right place, with the right people to be able to deliver on that. That's also exciting on our side of the equation. >>Yeah. It's critical infrastructure, critical operations. >>Yeah. >>Great stuff, Evan. Thanks for coming on. Appreciate this segment. All right. In a moment, Brian Gilmore, Director of IoT and Emerging Technology at InfluxData, will join me. You're watching theCUBE, the leader in tech coverage. Thanks for watching. >>Time series data from sensors, systems, and applications is a key source in driving automation and prediction in technologies around the world. But managing the massive amount of timestamped data generated these days is overwhelming, especially at scale.
That's why InfluxData developed InfluxDB, a time series data platform that collects, stores, and analyzes data. InfluxDB empowers developers to extract valuable insights and turn them into action by building transformative IoT, analytics, and cloud native applications. Purpose-built and optimized to handle the scale and velocity of timestamped data, InfluxDB puts the power in your hands with developer tools that make it easy to get started quickly with less code. InfluxDB is more than a database. It's a robust developer platform with integrated tooling that's written in the languages you love, so you can innovate faster. Run InfluxDB anywhere you want by choosing the provider and region that best fits your needs across AWS, Microsoft Azure, and Google Cloud. InfluxDB is fast and automatically scalable, so you can spend time delivering value to customers, not managing clusters. Take control of your time series data so you can focus on the features and functionalities that give your applications a competitive edge. Get started for free with InfluxDB. Visit influxdata.com/cloud to learn more. >>Okay. Now we're joined by Brian Gilmore, Director of IoT and Emerging Technologies at InfluxData. Welcome to the show. >>Thank you, John. Great to be here. >>We just spent some time with Evan going through the company and the value proposition with InfluxDB. What's the momentum? Where do you see this coming from? What's the value coming out of this? >>Well, I think we're sort of hitting a point where the adoption of the technology is becoming mainstream. We're seeing it in all sorts of organizations, everybody from the most well-funded, advanced big technology companies to the smaller academics and the startups. And the data that's emitted from that technology is time series, and us being able to give them a platform, a tool that's super easy to use, easy to start.
And then of course one that will grow with them, that's been key to us, sort of riding along with them as they're successful. >>Evan was mentioning that time series has been on everyone's radar, and that's been in the OT business for years. Now, you go back to 2013, '14, even like five years ago, that convergence of physical and digital coming together, IP-enabled edge. Edge has always been kind of hyped up, but why now? Why is the edge so hot right now from an adoption standpoint? Is it because it's just evolution, the tech getting better? >>I think it's twofold. I think that everybody was so focused on cloud over the last probably 10 years that they forgot about the compute that was available at the edge. And I think those, especially in OT and on the factory floor, who weren't able to take full advantage of cloud through their applications still needed to be able to leverage that compute at the edge. I think the big thing that we're seeing now, which is interesting, is that there's a hybrid nature to all of these applications, where there's definitely some data that's generated on the edge and there's definitely some data that's generated in the cloud. And it's the ability for a developer to tie those two systems together and work with that data in a very unified, uniform way that's giving them the opportunity to build solutions that really deliver value to whatever it is they're trying to do, whether it's the outer reaches of outer space or whether it's optimizing the factory floor. >>Yeah. I think one of the things you also mentioned was genome, too. Big data is coming to the real world. And I think IoT has been kind of like this thing for OT and in some use cases, but now with the cloud, all companies have an edge strategy.
So what's the secret sauce? Because now this is a hot product for the whole world, not just industrial, but all businesses. What's the secret sauce? >>Well, I think part of it is just that the technology is becoming more capable, and that's especially on the hardware side, right? Compute is getting smaller and smaller and smaller. And we find that supporting all the way down to the edge, even to the microcontroller layer with our client libraries, and then working hard to make our applications, especially the database, as small as possible so that it can be located as close to the point of origin of that data in the edge as possible, is fantastic. Now you can take that, you can run that locally, you can do your local decision making. You can use InfluxDB as an input to automation, control, the autonomy that people are trying to drive at the edge. But when you link it up with everything that's in the cloud, that's when you get all of the cloud-scale capabilities of parallelized AI and machine learning and all of that. >>So what's interesting is the open source success has been something that we've talked about a lot on theCUBE, about how people are leveraging that. You guys have users in the enterprise, users in the IoT market, but you've got developers now, kind of together. How do you see that emerging? How do developers engage? What are some of the things you're seeing that developers are really getting into with InfluxDB? >>Yeah. Well, I think there are the developers who are building companies, right? And these are the startups and the folks that we love to work with who are building new services, new products, things like that. And especially on the consumer side of IoT, there's a lot of that, just those developers.
But I think you gotta pay attention to those enterprise developers as well, right? There are tons of people with the title of engineer in your regular enterprise organizations. And they're there for systems integration. They're there for looking at what they would build versus what they would buy. And a lot of them come from a strong open source background. They know the communities, they know the top platforms in those spaces, and they're excited to be able to adopt and use those to optimize inside the business, as compared to just building a brand new one. >>You know, it's interesting too, Evan and I were talking about open source versus closed OT systems. So how do you support the backwards compatibility of older systems while maintaining openness? There are dozens of data formats out there, a bunch of standards, protocols, new things are emerging. Everyone wants to have a control plane. Everyone wants to leverage the value of data. How do you guys keep track of it all? What do you guys support? >>Yeah, well, I think either through direct connection, like we have a product called Telegraf. It's unbelievable. It's open source, it's an edge agent. You can run it as close to the edge as you'd like. It speaks dozens of different protocols in its own right, a couple of which, MQTT and OPC UA, are very applicable to these IoT use cases. But then also, because we are not only open source but open in terms of our ability to collect data, we have a lot of partners who have built really great integrations from their own middleware into InfluxDB. These are companies like Kepware and HighByte, who are really experts in those downstream industrial protocols. I mean, that's a business not everybody wants to be in.
It requires some very specialized, very hard work and a lot of support. And so by making those connections and building those ecosystems, we get the best of both worlds. The customers can use the platforms they need up to the point where they would be putting data into our database. >>What are some of the customer testimonials that they share with you? Can you share some anecdotal kind of, like, wow, that's the best thing I've ever used, this really changed my business, or this is a great tech that's helped me in these other areas? What are some of the soundbites you hear from customers when they're successful? >>Yeah. I mean, I think it ranges. You've got customers who are just finally being able to do the monitoring of assets at the edge, in the field. We have a customer who has these tunnel boring machines that go deep into the earth to drill tunnels for cars and trains and things like that. They are just excited to be able to stick a database onto those tunnel boring machines, send them into the depths of the earth, and know that when they come out, all of that telemetry at a very high frequency has been safely stored. And then it can just very quickly and instantly connect up to their centralized database. So just having that visibility is brand new to them, and that's super important. On the other hand, we have customers who are way beyond the monitoring use case, where they're actually using the historical records in the time series database to, like I think Evan mentioned, forecast things.
So for predictive maintenance, being able to pull in the telemetry from the machines, but then also all of that external enrichment data, the metadata, the temperatures, the pressures, who is operating the machine, those types of things, and being able to easily integrate with platforms like Jupyter notebooks or all of those scientific computing and machine learning libraries to be able to build the models, train the models, and then they can send that information back down to InfluxDB to apply it and detect those anomalies, which >>I think that's going to be an area. I personally think that's a hot area, because I think if you look at AI right now, yeah, it's all about training the machine learning algorithms after the fact. So time series becomes hugely important. Yeah. Because now you're thinking, okay, the data matters post time. Yeah. First time. And then it gets updated the new time. Yeah. So it's like constant data cleansing, data iteration, data programming. We're starting to see this new use case emerge in the data field. >>Yep. Yeah. I mean, I agree. Yeah, of course. Yeah. The ability to handle those pipelines of data smartly, intelligently, and then to be able to do all of the things you need to do with that data in stream, before it hits your central repository. And we make that really easy for customers. Telegraf not only has the inputs to connect up to all of those protocols and the ability to capture and connect up to the partner data, but it also has a whole bunch of capabilities around being able to process that data, enrich it, reformat it, route it, do whatever you need. So at that point you're basically handling your data in exactly the way you would want to do it. You're routing it to different destinations, and it's not something that really has been in the realm of possibility until this point. Yeah. Yeah.
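A minimal sketch of the enrich, reformat, and route flow described in the answer above. This is plain Python for illustration, not Telegraf's configuration or plugin API; the function names, the `asset` and `temp_f` fields, and the 80 °C threshold are all assumptions.

```python
from collections import defaultdict

def enrich(point, asset_metadata):
    """Attach static metadata (hypothetical fields) to a raw reading."""
    extra = asset_metadata.get(point["asset"], {})
    return {**point, **extra}

def reformat(point):
    """Normalize units: replace a Fahrenheit reading with Celsius."""
    out = dict(point)
    out["temp_c"] = round((out.pop("temp_f") - 32) * 5 / 9, 2)
    return out

def route(point, limit_c=80.0):
    """Pick a destination name from the processed value (assumed rule)."""
    return "alerts" if point["temp_c"] > limit_c else "archive"

def process(points, asset_metadata):
    """Run every raw point through enrich -> reformat -> route."""
    destinations = defaultdict(list)
    for raw in points:
        point = reformat(enrich(raw, asset_metadata))
        destinations[route(point)].append(point)
    return dict(destinations)

if __name__ == "__main__":
    readings = [
        {"asset": "pump-1", "temp_f": 212.0},
        {"asset": "pump-2", "temp_f": 68.0},
    ]
    metadata = {"pump-1": {"site": "plant-a"}}
    print(process(readings, metadata))
```

The point of the sketch is only the ordering: data is cleaned up and tagged in stream, and the destination is chosen before anything reaches a central store.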
>>And when Evan was on, it's great. He was a CEO, so he sees the big picture with customers. He kind of put the package together that said, hey, we've got a system, we've got customers, people are wanting to leverage our product. What's your PO they're sell. He's selling too as well. So you have that whole CEO perspective, but he brought up this notion that there are multiple personas involved in the InfluxDB system: architects, developers, and users. Can you talk about that reality as customers start to commercialize and operationalize this? From a commercial standpoint, you've got a relationship to the cloud. Yep. The edge is there. Yep. The edge is getting super important, but cloud brings a lot of scale to the table. So what is the relationship to the cloud? Can you share your thoughts on edge and its relationship to the cloud? >>Yeah. I mean, I think edge, you can think of it really as the local information, right? So it's generally compartmentalized to a point of, like, a single asset or a single factory line, whatever. But what people want is to be able to make the decisions there at the edge, locally, quickly, minus the latency of taking that large volume of data, shipping it to the cloud, and doing something with it there. So we allow them to do exactly that. Then what they can do is they can actually downsample that data, or they can detect the really important metrics or the anomalies. And then they can ship that to a central database in the cloud, where they can do all sorts of really interesting things with it. You can get that centralized view of all of your global assets. You can start to compare asset to asset, and then you can do those things like we talked about, where you can do predictive types of analytics or larger scale anomaly detection. >>So in this model you have a lot of commercial operations, industrial equipment. Yep.
The physical plant, physical business with virtual data, cloud, all coming together. What's the future for InfluxDB from a tech standpoint? Because you've got open. Yep. There's an ecosystem there. Yep. You have customers who want operational reliability for sure. I mean, so you got organic <laugh> >>Yeah. Yeah. I mean, I think, again, we got iPhones when everybody was waiting for flying cars. Right. So I don't know that we can absolutely perfectly predict what's coming, but I think there are some givens, and I think those givens are going to be that the world is only going to become more hybrid. Right. And so we are going to have much more widely distributed situations where you have data being generated in the cloud, you have data being generated at the edge, and then there's going to be data generated at all points in between, physical locations as well as things that are very virtual. And I think we're building some technology right now that's going to allow the concept of a database to be much more fluid and flexible, sort of more aligned with what a file would be like. >>And so being able to move data to the compute for analysis, or move the compute to the data for analysis, those are the types of solutions that we'll be bringing to the customers over the next little bit. But I also think we have to start thinking about what happens when the edge is actually off the planet. Right. I mean, we've got customers, you're going to talk to two of them in the panel, who are actually working with data that comes from outside the earth, either in low Earth orbit or all the way on the other side of the universe. Yeah. And to be able to process data like that, we've got to build the fundamentals for that right now on the factory floor and in the mines and in the tunnels.
So that we'll be ready for that one. >>I think you bring up a good point there, because one of the things that's common in the industry right now, people are talking about it, and this is kind of new thinking, is that hyperscalers have always built up full stack development. Even in the old OT world, Evan was pointing out that they built everything, right? And the world's going to more assembly, with core competency and intellectual property being the core of their app. So faster assembly and building, but also integration. You've got all this new stuff happening. Yeah. And that's to separate out the data complexity from the app. Yes. So space, genome. Yep. Driving cars throw off massive data. >>It does. >>So is Tesla, is the car the same as the data layer? >>I mean, yeah, it's certainly a point of origin. I think the thing that we want to do is let the developers work on the world-changing problems, the things that they're trying to solve, whether it's energy or health or any of the other challenges that these teams are building against. And we'll worry about that time series data and the underlying data platform so that they don't have to. Right. I mean, I think you talked about it: for them just to be able to adopt the platform quickly, integrate it with their data sources and the other pieces of their applications, it's going to allow them to bring much faster time to market on these products. It's going to allow them to be more iterative. They're going to be able to do more testing and things like that. And ultimately it'll accelerate the adoption and the creation of technology. >>You mentioned earlier in our talk about unification of data. Yeah. How about APIs? Because developers love APIs in the cloud, unifying APIs. How do you view that? >>Yeah, I mean, we are APIs, that's the product itself.
Like everything. People like to think of it as having this nice front end, but the front end is built on our public APIs. And it allows the developer to build all of those hooks for not only data creation, but then data processing, data analytics, and then data extraction to bring it to other platforms or other applications, microservices, whatever it might be. So, I mean, it is a world of APIs right now, and we bring a very useful set of them for managing the time series data these guys are all challenged with. >>It's interesting. You and I were talking before we came on camera about how data feels like it's going to have this kind of SRE role that DevOps had, site reliability engineers, which manage a bunch of servers. There's so much data out there now. Yeah. >>Yeah. It's like raining data, for sure. And I think one of the best jobs on the planet is going to be to be that data wrangler, to be able to understand what the data sources are, what the data formats are, how to efficiently move that data from point A to point B, and to process it correctly so that the end users of that data aren't doing any of that hard upfront preparation, collection, and storage work. >>Yeah. That's data as code. I mean, data engineering is becoming a new discipline for sure. And the democratization is the benefit. Yeah. To everyone, data science gets easier. I mean, data science, but they want to make it easy. Right. <laugh> Yeah. They want to do the analysis, right? >>Yeah. I mean, I think it's a really good point. I think we try to give our users as many ways as possible to get data in and get data out. We sort of think about it as meeting them where they are. Right.
So we have the client libraries that allow them to just port to us directly from the applications and the languages that they're writing, but then they can also pull it out. And at that point nobody's going to know the users, the end consumers of that data, better than those people who are building those applications. And so they're building these user interfaces, which are making all of that data accessible for their end users inside their organization. >>Well, Brian, great segment, great insight. Thanks for sharing all the complexities in IoT that you guys help take away with the APIs and assembly and all the system architectures that are changing. Edge is real, cloud is real. Yeah, absolutely. Mainstream enterprises. And you've got developer traction too, so congratulations. >>Yeah. It's great. >>Well, thanks. Any last word you want to share? >>No, just, I mean, please, if you're going to check out InfluxDB, download it, try out the open source, contribute if you can. That's a huge thing. It's part of being in the open source community. But definitely just use it. I think once people use it, they try it out, they'll understand very, very quickly. >>So open source with developers, enterprise and edge coming all together. You're going to hear more about that in the next segment, too. Right. Thanks for coming on. Okay. Thanks. When we return, Dave Vellante will lead a panel on edge and data with InfluxDB. You're watching theCUBE, the leader in high tech enterprise coverage. >>As a startup, we move really fast. We find that InfluxDB can move as fast as us. It's just a great group, very collaborative, very interested in manufacturing. And we see a bright future in working with Influx. My name is Aron Semle. I'm the CTO at HighByte.
HighByte is one of the first companies to focus on manufacturing data and apply the concepts of DataOps, to treat that as an asset to deliver to the IT system, to enable applications like overall equipment effectiveness that can help the factory produce better, smarter, faster. Time series data in manufacturing is really important. If you take a piece of equipment, you have the temperature and pressure at the moment that you can look at to kind of see the state of what's going on. So without that context and understanding, you can't do what manufacturers ultimately want to do, which is predict the future. >>InfluxDB represents kind of a new way to store time series data with some more advanced technology and, more importantly, more open technologies. The other thing that Influx does really well is once the data's in Influx, it's very easy to get out, right? They have a modern REST API and other ways to access the data that would be much more difficult to do integrations with in classic historians. HighByte can serve to model data, aggregate data on the shop floor from a multitude of sources, whether that be OPC UA servers, manufacturing execution systems, ERP, et cetera, and then push that seamlessly into Influx to then be able to run calculations. Manufacturing is changing with this Industry 4.0, and what we're seeing is Influx being part of that equation, being used to store data off the unified namespace. We recommend InfluxDB all the time to customers that are exploring a new way to share data in manufacturing called the unified namespace, who have open questions around: how do I share this new data that's coming through my UNS or my MQTT broker? How do I store this and be able to query it over time? And we often point to Influx as a solution for that. It is a great brand, it's a great group of people, and it's a great technology. >>Okay. We're now going to go into the customer panel, and we'd like to welcome Angelo Fausti,
who's a software engineer at the Vera C. Rubin Observatory, and Caleb Maclachlan, who's senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. You don't want to miss this interview, folks. Caleb, let's start with you. You work for an extremely cool company. You're launching satellites into space. I mean, of course doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem. >>Yeah, absolutely. And thanks for having me here, by the way. So Loft Orbital is a company that's a Series B startup now, and our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about just a lot of very specialized engineering. And what we're trying to do is change that from a super specialized problem that has an extremely high barrier of access to an infrastructure problem, so that getting your programs, your mission deployed on orbit, with access to different sensors, cameras, radios, stuff like that, is almost as simple as deploying a VM in AWS or GCP. >>So that's kind of our mission. And just to give a really brief example of the kind of customer that we can serve, there's a really cool company called Totum Labs who is working on building an IoT constellation for Internet of Things, basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container that you track from anywhere in the world as it's going across the ocean.
So they're really little, and they've been able to stay a small startup that's focused on their product, which is that super crazy, complicated, cool radio, while we handle the whole space segment for them, which before Loft was really impossible. So that's our mission: providing space infrastructure as a service. We are kind of groundbreaking in this area, and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've got to handle. Yeah. >>So amazing, Caleb, what you guys do. Now, I know you were lured to the skies very early in your career, but how did you kind of land on this business? >>Yeah, so I guess just a little bit about me. For some people, they don't necessarily know what they want to do early in their life. For me, I was five years old and I knew I wanted to be in the space industry. So I started in the Air Force, but have stayed in the space industry my whole career, and this is actually the fifth space startup that I've been a part of. So I kind of started out in satellites, spent some time working in the launch industry on rockets, and now I'm here back in satellites. And honestly, this is the most exciting of the different space startups that I've been a part of. >>Super interesting. Okay. Angelo, let's talk about the Rubin Observatory. Vera C. Rubin, famous woman scientist, galaxy guru. Now you guys, the observatory, you're way up high. You're going to get a good look at the southern sky. Now I know COVID slowed you guys down a bit, but no doubt you continued to code away on the software. I know you're getting close. You've got to be super excited. Give us the update on the observatory and your role. >>All right.
So yeah, Rubin is a state-of-the-art observatory that is in construction on a remote mountain in Chile. And with Rubin, we will conduct the Legacy Survey of Space and Time. We are going to observe the sky with an eight-meter optical telescope and take a thousand pictures every night with a 3.2-gigapixel camera. And we are going to do that for 10 years, which is the duration of the survey. >>Yeah. Amazing project. Now, you were a doctor of philosophy, so you probably spent some time thinking about what's out there, and then you went on to earn a PhD in astronomy, in astrophysics. So this is something that you've been working on for the better part of your career, isn't it? >>Yeah, that's right. About 15 years. I studied physics in college, then I got a PhD in astronomy, and I worked for about five years on another project, the Dark Energy Survey, before joining Rubin in 2015. >>Yeah. Impressive. So it seems like both your organizations are looking at space from two different angles. One thing you guys both have in common, of course, is software. And you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB and get into it? How do you use the platform? Maybe Caleb, you could start. >>Yeah, absolutely. So the first company that I extensively used InfluxDB in was a launch startup called Astra. And we were in the process of designing our first generation rocket there and testing the engines, pumps, everything that goes into a rocket. And when I joined the company, our data story was not very mature. We were collecting a bunch of data in LabVIEW, and engineers were taking that over to MATLAB to process it. And at first, you know, that's the way that a lot of engineers and scientists are used to working.
And at first, people weren't entirely sure that that needed to change, but the nice thing about InfluxDB is that it's so easy to deploy. So our software engineering team was able to get it deployed and up and running very quickly, and then quickly also backport all of the data that we had collected thus far into Influx. And what was amazing to see, >>and kind of the super cool moment with Influx, was when we hooked that up to Grafana, which is the visualization platform we used with Influx, because it works really well with it. There was this aha moment for our engineers, who were used to this post-process kind of method for dealing with their data, where they could almost instantly discover data that they hadn't been able to see before, and take the manual processes that they would run after a test and just throw those all in Influx and have live data as tests were coming in. And I saw them implementing, like, crazy rocket equation type stuff in Influx, and it just was totally game changing for how we tested. >>So Angelo, I was explaining in my open that you could add a column in a traditional RDBMS and do time series, but with the volume of data that you're talking about, and the example that Caleb just gave, I mean, you have to have a purpose-built time series database. Where did you first learn about InfluxDB? >>Yeah, correct. So I work with the data management team, and my first project was to record metrics that measured the performance of our software, the software that we use to process the data. So I started implementing that in a relational database. But then I realized that in fact I was dealing with time series data, and I should really use a solution built for that. And then I started looking at time series databases, and I found InfluxDB. And that was back in 2018.
Another use for InfluxDB that I'm also interested in is the visits database. If you think about the observations, we are moving the telescope all the time, pointing to specific directions in the sky and taking pictures every 30 seconds. So that itself is a time series, and every point in that time series we call a visit. So we want to record the metadata about those visits in InfluxDB too. That time series is going to be 10 years long, with about 1,000 points every night. It's actually not too much data compared to other problems. It's really just a different time scale. >>The telescope at the Rubin Observatory is, like, pun intended, I guess, the star of the show. And I believe I read that it's going to be the first of the next-gen telescopes to come online. It's got this massive field of view, like three orders of magnitude times Hubble's widest camera view, which is amazing, right? That's like 40 moons in an image, and amazingly fast as well. What else can you tell us about the telescope? >>This telescope has to move really fast, and it also has to carry the primary mirror, which is an eight-meter piece of glass. It's very heavy, and it has to carry a camera, which is about the size of a small car. And this whole structure weighs about 300 tons. For that to work, the telescope needs to be very compact and stiff. And one thing that's amazing about its design is that this 300-ton structure sits on a tiny film of oil, which has the diameter of a human hair. And that makes an almost zero friction interface. In fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide field telescope, so each image has, in diameter, the size of about seven full moons. And with that, we can map the entire sky in only three days.
And of course, during operations everything is controlled by software and it is automatic. There's a very complex piece of software called the scheduler, which is responsible for moving the telescope and the camera, which is recording 15 terabytes of data every night. >>Hmm. And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >>Yeah, actually not. So we are using InfluxDB to record engineering data and metadata about the observations, like telemetry, events, and commands from the telescope. That's a much smaller data set compared to the images, but it is still challenging because you have some high frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >>Got it. Thank you. Okay, Caleb, let's bring you back in, and can you tell us more about the, you've got these dishwasher-size satellites. You're kind of using a multi-tenant model. I think it's genius, but tell us about the satellites themselves. >>Yeah, absolutely. So we have in space some satellites already that, as you said, are like dishwasher, mini fridge kind of size. And we're working on a bunch more that are a variety of sizes, from shoebox to, I guess, a few times larger than what we have today. And we do shoot to have effectively something like a multi-tenant model, where we will buy a bus off the shelf. The bus is what you can kind of think of as the core piece of the satellite, almost like a motherboard or something, where it's providing the power. It has the solar panels, it has some radios attached to it. It handles the attitude control, basically steers the spacecraft in orbit. And then we also build in house what we call our payload hub, which has any customer payloads attached and our own kind of edge processing capabilities built into it.
>>And so we integrate that. We launch it, and those things, because they're in low Earth orbit, they're orbiting the earth every 90 minutes. That's seven kilometers per second, which is several times faster than a speeding bullet. So one of the unique challenges of operating spacecraft in low Earth orbit is that generally you can't talk to them all the time. So we're managing these things through very brief windows of time where we get to talk to them through our ground sites, either in Antarctica or in the north pole region. >>Talk more about how you use InfluxDB to make sense of this data through all this tech that you're launching into space. >>Previously, when I joined the company, we started off storing all of that, as Angelo did, in a regular relational database. And we found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren't able to even store all of the data that we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. So that's things like power levels, voltages, currents, counts, whatever metadata we need to monitor about the spacecraft. We now store that in InfluxDB. And now we can actually easily store the entire volume of data for the mission life so far without having to worry about the size bloating to an unmanageable amount. >>And we can also seamlessly query large chunks of data. For example, as an operator, I might want to see how my battery state of charge is evolving over the course of the year. I can have a plot in Influx that loads that in a fraction of a second for a year's worth of data, because I can intelligently group the data by a sliding time interval.
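A minimal sketch of why grouping by time interval makes a year-long plot cheap: many raw samples collapse into one averaged point per window. This is plain Python rather than an InfluxDB query, it uses fixed (non-overlapping) windows instead of sliding ones for simplicity, and the `downsample` name and sample values are assumptions.

```python
from collections import defaultdict
from statistics import mean

def downsample(samples, window_s):
    """Average (unix_seconds, value) samples into fixed windows.

    Returns a list of (window_start_seconds, mean_value), sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its window.
        buckets[ts - ts % window_s].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]

if __name__ == "__main__":
    # Hypothetical battery state-of-charge readings, one every 30 seconds.
    raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]
    print(downsample(raw, window_s=60))
```

With a one-day window, a year of telemetry becomes 365 points regardless of the raw sample rate, which is the kind of reduction that lets a long-range plot load quickly.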
So it's been extremely powerful for us to access the data, and as time has gone on, we've gradually migrated more and more of our operating data into Influx. >>Let's talk a little bit about, you know, we throw this term around a lot, data driven. A lot of companies say, oh yes, we're data driven, but you guys really are. I mean, you've got data at the core. Caleb, what does that mean to you? >>Yeah, so I think the clearest example of when I saw this be totally game changing is what I mentioned before at Astra, where our engineers' feedback loop went from a lot of slow researching and digging into the data to almost instantaneously seeing the data and making decisions based on it immediately, rather than having to wait for some processing. And that's something that I've also seen echoed in my current role. But to give another practical example, as I said, we have a huge amount of data that comes down every orbit, and we need to be able to ingest all of that data almost instantaneously and provide it to the operator in near real time. About a second's worth of latency is all that's acceptable for us to react, to see what is coming down from the spacecraft, and building that pipeline is challenging from a software engineering standpoint. >>Our primary language is Python, which isn't necessarily that fast. So what we've done, with the goal of being data driven, is publish metrics on how individual pieces of our data processing pipeline are performing into Influx as well. And we do that in production as well as in dev. So we have kind of a production monitoring flow.
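A minimal sketch of publishing per-stage timing metrics as described above, assuming each measurement is rendered as an InfluxDB line protocol string (`measurement,tags fields timestamp`). The measurement name `pipeline_stage`, the tag and field names, and the `timed_stage` helper are hypothetical, and the list used as a sink stands in for whatever client would actually send the lines.

```python
import time
from contextlib import contextmanager

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Render one point as line protocol.

    Simplified: assumes numeric fields and names without spaces or commas,
    so no escaping is performed.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"

@contextmanager
def timed_stage(stage, env, sink):
    """Time the wrapped block and append one metric line to `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink.append(
            to_line_protocol(
                "pipeline_stage",
                {"env": env, "stage": stage},
                {"duration_ms": round(elapsed_ms, 3)},
                time.time_ns(),
            )
        )

if __name__ == "__main__":
    lines = []
    with timed_stage("decode", "dev", lines):
        sum(range(100_000))  # stand-in for real processing work
    print(lines[0])
```

Tagging each line with the environment lets the same instrumentation run in dev and production while keeping the two separable at query time.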
And what that has done is allow us to make intelligent decisions on our software development roadmap, where it makes the most sense for us to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are. Before we started doing this, we sometimes found ourselves chasing rabbits that weren't necessarily the real root cause of the issues that we were seeing. But now that we're being a bit more data driven there, we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scale from supporting a couple of satellites to supporting many, many satellites at once. >>Yeah. Gotcha. So you reduced those dead ends. Maybe Angelo, you could talk about what data driven means to you and your teams? >>I would say that having real time visibility into the telemetry data and metrics is crucial for us. We need to make sure that the images that we collect with the telescope have good quality and that they are within the specifications to meet our science goals. And so if they are not, we want to know that as soon as possible and then start fixing problems. >>Caleb, what are your sort of event intervals like? >>So I would say that, as of today on the spacecraft, the level of timing that we deal with probably tops out at about 20 hertz, 20 measurements per second, on things like our gyroscopes. But I think the core point here, the ability to have high precision data, is extremely important for these kinds of scientific applications. And I'll give an example from when I worked on the rocket at Astra. There, our baseline data rate that we would ingest during a test was 500 hertz. So 500 samples per second.
And in some cases we would actually need to ingest much higher-rate data, even up to 1.5 kilohertz. So, extremely high-precision data there, where timing really matters a lot. And one of the really powerful things about Influx is the fact that it can handle this. That's one of the reasons we chose it, because there are times when we're looking at the results of a firing where you're zooming in. I talked earlier about how in my current job we often zoom out to look at a year's worth of data. Here you're zooming in to where your screen is occupied by a tiny fraction of a second. And you need to see, same thing as Angela just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers. So that can be something like, hey, I opened this valve at exactly this time, and we want to have that at micro- or even nanosecond precision so that we know, okay, we saw a spike in chamber pressure at this exact moment: was that before or after this valve opened? That kind of visibility is critical in these kinds of scientific applications, and it's absolutely game changing to be able to see that in near real time, with a really easy way for engineers to visualize this data themselves without having to wait for software engineers to go build it for them. >>Can the scientists do self-serve, or do you have to design and build all the analytics and queries for your scientists? >>Well, from my perspective, that's absolutely one of the best things about Influx, and what I've seen be game changing is that, generally, I'd say anyone can learn to use Influx.
And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic open source graphing library that is very similar to Influx's own Chronograf. >>Sure. >>And what it does is provide a very intuitive UI for building your queries. So you choose a measurement, and it shows a dropdown of available measurements. And then you choose the particular field you want to look at, and again, that's a dropdown, so it's really easy for our users to discover. And there are point-and-click options for doing math and aggregations. You can even do, like, predictions, all within the Grafana user interface, which is really just a wrapper around the APIs and functionality that Influx provides. >>Putting data in the hands of those who have the context, the domain experts, is key. Angela, is it the same situation for you? Is it self-serve? >>Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards, because they know exactly what they need to visualize. >>Yeah. I mean, it's all about using the right tool for the job. For us, when I joined the company, we weren't using InfluxDB, and we were dealing with serious issues of the database growing to an incredible size extremely quickly, and even querying short periods of data was taking on the order of seconds, which is just not workable for operations. >>Guys, this has been really informative. It's pretty exciting to see how the edge is mountaintops and low orbits. I mean, space is the ultimate edge, isn't it? I wonder if you could answer two questions to wrap here. What comes next for you guys?
And is there something that you're really excited about that you're working on? Caleb, maybe you could go first, and Angela, you can bring us home. >>Basically, what's next for Loft Orbital is more satellites and a greater push towards infrastructure. Our mission is to make space simple for our customers and for everyone, and we're scaling the company like crazy now to make that happen. It's an extremely exciting time to be in this company and to be in this industry as a whole, because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of. And with companies like SpaceX and the now rapidly lowering cost of launch, it's just a really exciting place to be. We're launching more satellites, we are scaling up for some constellations, and our ground system has to be improved to match. So there are a lot of improvements that we're working on to really scale up our control software to be best in class and make it capable of handling such a large workload. >>So you guys are hiring? >><laugh> We are absolutely hiring. We have open positions all over the company. We need software engineers. We need people who do more aerospace-specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website if this is at all interesting. >>All right. Angela, bring us home. >>Yeah. So what's next for us is really getting this telescope working and collecting data. And when that happens, there is going to be just a deluge of data coming out of this camera, and handling all that data is going to be really challenging. Yeah, I want to be here for that.
<laugh> I'm looking forward to it. Next year we have an important milestone, which is our commissioning camera, a simplified version of the full camera. It's going to be on sky, and so most of the system has to be working by then. >>Nice. All right, guys, with that, we're going to end it. Thank you so much. Really fascinating, and thanks to InfluxDB for making this possible. Really groundbreaking stuff, enabling value creation at the edge, in the cloud, and of course beyond, in space. So really transformational work that you guys are doing. Congratulations, and I really appreciate the broader community. I can't wait to see what comes next from this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave Vellante, and you're watching theCUBE, the leader in high tech enterprise coverage. >>Welcome. Telegraf is a popular open source data collection agent. Telegraf collects data from hundreds of systems like IoT sensors, cloud deployments, and enterprise applications. It's used by everyone from individual developers and hobbyists to large corporate teams. The Telegraf project has a very welcoming and active open source community. Learn how to get involved by visiting the Telegraf GitHub page. Whether you want to contribute code, improve documentation, participate in testing, or just show what you're doing with Telegraf, we'd love to hear what you're building. >>Thanks for watching Moving the World with InfluxDB, made possible by InfluxData. I hope you learned some things and are inspired to look deeper into where time series databases might fit into your environment. If you're dealing with large and/or fast data volumes, you want to scale cost effectively with the highest performance, and you're analyzing metrics and data over time, time series databases just might be a great fit for you. Try InfluxDB out.
You can start with a free cloud account by clicking on the link in the resources below. Remember, all these recordings are going to be available on demand at thecube.net and influxdata.com, so check those out. And poke around InfluxData. They are the folks behind InfluxDB and one of the leaders in the space. We hope you enjoyed the program. This is Dave Vellante for theCUBE. We'll see you soon.
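
Caleb's point about event timing (knowing whether a valve opened before or after a chamber-pressure spike) can be illustrated with a small sketch in Python, the language he names as his team's primary one. This is an illustration only, not Loft Orbital's or Astra's code: the measurement names, tag names, field names, and timestamps below are all made up, and the formatter is a simplified take on InfluxDB line protocol that does no escaping.

```python
from typing import Dict, Union

FieldValue = Union[float, int, str, bool]


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, FieldValue], ts_ns: int) -> str:
    """Format one point as InfluxDB line protocol with a nanosecond timestamp.

    Simplified: assumes names and values contain no spaces, commas, or quotes,
    so no escaping is performed.
    """
    def fmt(value: FieldValue) -> str:
        if isinstance(value, bool):   # check bool first: bool is a subclass of int
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"        # integer fields carry an "i" suffix
        if isinstance(value, float):
            return repr(value)
        return f'"{value}"'           # string fields are double-quoted

    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


def event_order(event_ts_ns: int, spike_ts_ns: int) -> str:
    """Report whether the event happened before or after the spike, and by how much."""
    delta = spike_ts_ns - event_ts_ns
    if delta == 0:
        return "simultaneous"
    relation = "before" if delta > 0 else "after"
    return f"event {abs(delta)} ns {relation} spike"


# Hypothetical values: a valve opens 250 microseconds before a pressure spike.
valve_ts = 1_652_313_600_000_000_000
spike_ts = valve_ts + 250_000

valve_line = to_line_protocol("controller_events", {"valve": "main_ox"},
                              {"state": "open"}, valve_ts)
spike_line = to_line_protocol("chamber", {"engine": "e1"},
                              {"pressure_kpa": 812.5}, spike_ts)
print(valve_line)
print(spike_line)
print(event_order(valve_ts, spike_ts))
```

In this made-up case the two timestamps are 250 microseconds apart, so truncating both to millisecond precision would give the same value and the ordering would be lost. That is the practical reason for wanting micro- or nanosecond precision on controller events.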

Published Date : May 12 2022

ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
John	PERSON	0.99+
Angela	PERSON	0.99+
Evan	PERSON	0.99+
2015	DATE	0.99+
SpaceX	ORGANIZATION	0.99+
2016	DATE	0.99+
Dave Valante	PERSON	0.99+
Antarctica	LOCATION	0.99+
Boeing	ORGANIZATION	0.99+
Caleb	PERSON	0.99+
10 years	QUANTITY	0.99+
Chile	LOCATION	0.99+
Brian	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Evan Kaplan	PERSON	0.99+
Aaron Seley	PERSON	0.99+
Angelo Fasi	PERSON	0.99+
2013	DATE	0.99+
Paul	PERSON	0.99+
Tesla	ORGANIZATION	0.99+
2018	DATE	0.99+
IBM	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
two questions	QUANTITY	0.99+
Caleb McLaughlin	PERSON	0.99+
40 moons	QUANTITY	0.99+
two systems	QUANTITY	0.99+
two	QUANTITY	0.99+
Angelo	PERSON	0.99+
230	QUANTITY	0.99+
300 tons	QUANTITY	0.99+
three	QUANTITY	0.99+
500 Hertz	QUANTITY	0.99+
3.2 gig	QUANTITY	0.99+
15 terabytes	QUANTITY	0.99+
eight meter	QUANTITY	0.99+
two practitioners	QUANTITY	0.99+
20 Hertz	QUANTITY	0.99+
25 years	QUANTITY	0.99+
Today	DATE	0.99+
Palo Alto	LOCATION	0.99+
Python	TITLE	0.99+
Oracle	ORGANIZATION	0.99+
Paul dicks	PERSON	0.99+
First	QUANTITY	0.99+
iPhones	COMMERCIAL_ITEM	0.99+
first	QUANTITY	0.99+
earth	LOCATION	0.99+
240 people	QUANTITY	0.99+
three days	QUANTITY	0.99+
apple	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
HBI	ORGANIZATION	0.99+
Dave LAN	PERSON	0.99+
today	DATE	0.99+
each image	QUANTITY	0.99+
next year	DATE	0.99+
cube.net	OTHER	0.99+
InfluxDB	TITLE	0.99+
one	QUANTITY	0.98+
1000 points	QUANTITY	0.98+

Evan Kaplan, InfluxData

>>Okay. Today we welcome Evan Kaplan, CEO of InfluxData, the company behind InfluxDB. Welcome, Evan. Thanks for coming on. >>Hey, John. Thanks for having me. >>Great segment here on the InfluxDB story. What is the story? Take us through the history. Why time series? What's the story? >>So the history is actually pretty interesting. Paul Dix, my partner in this and our founder, is super passionate about developers and developer experience. He had worked on Wall Street building a number of time series platforms, trading platforms for trading stocks. And from his point of view, it was always what he would call a yak shave, which means you have to do a ton of work just to start doing work. You had to write a bunch of extrinsic routines, you had to write a bunch of application handling on existing relational databases, in order to come up with something that was optimised for a trading platform or a time series platform. And he just developed this really clear point of view that this is not how developers should work. And so in 2013 he went through Y Combinator, and he made his first commit to open source InfluxDB at the end of 2013. And basically, from my point of view, he invented modern time series, which is: you start with a purpose-built time series platform to do these kinds of workloads, and you get all the benefits of having something right out of the box, so a developer can be totally productive right away. >>And how many people are in the company? What's the history of employees and stuff? >>Yeah, I think we're, you know, I always forget the number, but it's something like 230 or 240 people now. I joined the company in 2016. I love Paul's vision, and I just had a strong conviction about the relationship between time series and IoT. Because if you think about it, what sensors do is they speak time series: pressure, temperature, volume, humidity, light.
They're measuring, they're instrumenting something over time. And so I thought that would be super relevant over the long term, and I've not regretted it. >>Oh, no, and it's interesting at that time to go back in history. You know, the role of databases was that the relational database is the one database to rule the world. And then, as clouds started coming in, you started to see more databases proliferate, more types of databases. And time series in particular is interesting because real time has become super valuable from an application standpoint. IoT, which speaks time series, means something. It's like time matters, and sometimes data is not worth it after the time. Sometimes it's worth it. And then you get the data lake, so you have this whole new evolution. Is this the momentum? I guess the question is, what's the momentum behind time series? >>What's causing us to grow? >>The time series. Why is time series, and the category, gaining momentum? What's the bottom line? >>Well, think about it from a broad sort of frame, which is that what everybody's trying to do is build increasingly intelligent systems, whether it's a self-driving car or a robotic system that does what you want it to do or a self-healing software system. Everybody wants to build increasingly intelligent systems. And in order to build these increasingly intelligent systems, you have to instrument the system well, and you have to instrument it over time, better and better. And so you need a tool, a fundamental tool, to drive that instrumentation. And it's become clear to everybody that that instrumentation is all based on time: What happened? What's happening? What's going to happen? And so you get to these applications, like predictive maintenance or smarter systems.
And increasingly, you want to do that stuff not just intelligently but fast, in real time, so millisecond response, so that when you're driving a self-driving car and the system realises that you're about to do something, essentially, you want to be able to act in something that looks like real time. All systems want to do that. They want to be more intelligent, and they want to be more real time. So we just happened to show up at the right time in the evolution of the market. >>It's interesting. Near real time isn't good enough when you need real time. >>Yeah, it's not. And it's like everybody wants it even when you don't need it. Ironically, you want it. It's like having the feature for, you know, you buy a new television, you want that one feature even though you're not going to use it, and you decide that's your buying criteria. Real time is a buying criteria. >>So what you're saying, then, is near real time is getting as close to real time as possible. Okay, so talk about the aspect of data, because we're hearing a lot of conversations on theCUBE in particular around how people are implementing and actually getting better, so iterating on data. But you have to know when it happened to know how to fix it. So this is a big part of what we're seeing, with people saying, hey, I want to make my machine learning algorithms better after the fact, I want to learn from the data. How do you see that evolving? Is that one of the use cases of sensors, as people bring data in off the network, getting better with the data, knowing when it happened? >>Well, for sure. What you're saying is none of this is non-linear. It's all incremental. And so take something, just as an easy example. If you take a self-driving car, what you're doing is you're instrumenting that car to understand where it can perform in the real world in real time.
And if you run the loop, which is: I instrument it, I watch what happens, oh, that's wrong, oh, I have to correct for that, and I correct for that in the software; if you do that four billion times, you get a self-driving car. But every system moves along that evolution. And so you get the dynamic of constantly instrumenting, watching the system behave, and doing it again. Self-driving cars are one thing, but even in the human genome, if you look at some of our customers, you know, people doing solar arrays, people doing power walls, all of these systems are getting smarter. >>What are the top applications? What are you seeing with InfluxDB and time series? What's the sweet spot for the application use case? Give some customer examples. >>Yeah, so it's pretty easy to understand. On one side of the equation, the physical side, sensors are getting cheap. Obviously, we know that. And the whole physical world is getting instrumented: your home, your car, the factory floor, your wristwatch, your healthcare, you name it. It's getting instrumented in the physical world. We're watching the physical world in real time, and so there are three or four sweet spots for us, but they're all on that side. They're all about IoT. So there are consumer IoT projects like Google's Nest, Tado, Particle sensors, even delivery engines like Rappi, the Instacart of South America. Anywhere there's a physical location. That's on the consumer side. And then another exciting space is the industrial side. Factories are changing dramatically over time, increasingly moving away from proprietary equipment to developer-driven systems that run operations, because when you're building a factory, the systems all have to get smarter. And then lastly, a lot in renewables and sustainability.
So a lot, you know, Tesla, Lucid Motors, Nikola Motors, lots to do with electric cars, solar arrays, windmill arrays, just anything that's going to get instrumented, where that instrumentation becomes part of what the purpose is. >>It's interesting. The convergence of physical and digital is happening with the data. IoT, you mentioned. You think of IoT, look at the use cases there: it was proprietary OT systems, now becoming more IP enabled, Internet protocol, and now edge compute getting smaller, faster, cheaper, AI going to the edge. Now you have all kinds of new capabilities that bring that real-time and time series opportunity. Are you seeing IoT going to a new level? Where are the IoT dots connecting to? Because, you know, as these two cultures merge, operations, basically industrial, factory, car, they've got to get smarter. Intelligent edge is a buzzword, but it has to be more intelligent. Where's the action in all this? >>So the action really is at the core. It's at the developer, right? Because you're looking at these things, and it's very hard to get an off-the-shelf system to do the kinds of physical and software interaction. So the action really happens at the developer. And so what you're seeing is a movement in the world that maybe you and I grew up in, with IT or OT moving increasingly to that developer-driven capability. And so all of these IoT systems, they're bespoke. They don't come out of the box. And so the developer, the architect, the CTO, they define: what's my business? What am I trying to do? Am I trying to sequence the human genome and figure out when these genes express themselves? Or am I trying to figure out when the next heart rate monitor is going to show up in my Apple Watch, right? What am I trying to do? What's the system I need to build?
And so starting with the developer is where all of the good stuff happens here, which is different than it used to be, right? It used to be you'd buy an application or a service or a SaaS thing. But with this dynamic, with this integration of systems, it's all about bespoke. It's all about building something. >>So let's get to the developer real quick. The real highlight point here is the data. I mean, I could see a developer saying, okay, I need to have an application for the edge, IoT edge or car. I mean, Tesla, look at the applications of the cars right there. There's the modern application lifecycle now. So take us through how this impacts the developer. Does it impact their CI/CD pipeline? Is it cloud native? Where does this all go? >>Well, so first of all, there was an internal journey that we had to go through as a company, which I think is fascinating for anybody who's interested, as we went from primarily monolithic software that was open source to building a cloud native platform, which means we had to move from an agile development environment to a CI/CD environment. So to the degree that you're moving your service, whether it's Tesla monitoring your car and updating your power walls, or whether it's a solar company updating your arrays, to the degree that that service is cloud, then increasingly you move from agile development to a CI/CD environment, which is shipping code to production every day. And so it's not just the developers, it's all the infrastructure to support the developers to run that service and that sort of stuff. I think that's also going to happen in a big way. >>With the customer base that you have now and that you see evolving with InfluxDB, is it that they're going to be writing more of the application or relying more on others? I mean, obviously there's the open source component here.
So when you bring in kind of old way, new way: the old way was, I've got a proprietary platform running all this IoT stuff and I've got to write an application. It's general purpose. I have some flexibility, it's somewhat brittle, maybe not a lot of robustness to it, but it does its job. >>So, yeah, a good way to think about this is: what's the role of the developer slash architect slash CTO, that chain, within a large enterprise or a company? I started my career in the aerospace industry, and when you look at what Boeing does to assemble a plane, they build very, very few of the parts. Instead, what they do is they assemble. They buy the wings, they buy the engines, they assemble. Actually, they don't buy the wings. It's the one thing: they buy the material and they build the wings, because there's a lot of tech in the wings. And they end up being assemblers, smart assemblers of what ends up being a flying aeroplane, which is a pretty big deal even now. And so what happens with software people is they have the ability to pull from the best of the open source world. So they would pull a time series capability from us. Then they would assemble that with potentially some ETL logic from somebody else, or they'd assemble it with a Kafka interface to be able to stream the data in. And so they become very good integrators and assemblers, but they become masters of that bespoke application. And I think that's where it goes, because you're not writing native code for everything. >>So they're more flexible. They have faster time to market because they're assembling way faster, and they still get to maintain their core competency. Okay, the wings, in this case. >>They become increasingly not just coders, but designers and developers. They become broadly builders, is what we like to think of it. People who build stuff. By the way.
This is not different than what the people just up the road at Google have been doing for years, or the tier one, Amazon, building all their own. >>Well, I think one of the things that's interesting is this idea of systems development, a system architecture. I mean, systems have consequences when you make changes. So when you have cloud, data centre, on premises and edge working together, how does that work across the system? You can't have a wing that doesn't work with the other wing. >>That's exactly it. That's where that Boeing or that aeroplane building analogy comes in for us. We've really been thoughtful about that, because for IoT it's critical. So our open source edge has the same API as our cloud native stuff and as our enterprise on-premises edition. Our multiple products have the same API, and they have a relationship with each other. They can talk with each other, so the builder builds it once. And so this is where, when you start thinking about the components that people have to use to build these services, you want to make sure that at least at that base layer, that database layer, those components talk to each other. >>I'll have to ask you, if I'm the customer. I put my customer hat on. Okay. Hey, I'm dealing with a lot. >>I mean, you have a PO for me? >>A big check, a blank check, if you can answer this question, only if you get the question right. I've got all this important operational stuff. I've got my factory. I've got my self-driving cars. This isn't trivial stuff. This is my business. How should I be thinking about time series? Because now I have to make these architectural decisions, as you mentioned, and it's going to impact my application development. So it's a huge decision point for your customers. What should I care about the most? What's in it for me? Why is time series important? >>Yeah, that's a great question.
So chances are, if you've got a business that is 20 or 25 years old, you're already thinking about time series. You probably didn't call it that. You built something on Oracle, or you built something on IBM DB2, right, and you made it work within your system. And so that's what you started building. So it's already out there. There are probably hundreds of millions of time series applications out there today. But as you start to think about this increasing need for real time, and you start to think about increasing intelligence, you think about optimizing those systems over time (I hate the term, but: digital transformation), then you start with time series. It's a foundational base layer for any system that you're going to build. There's no system I can think of where time series shouldn't be the foundational base layer. If you just want to store your data and just leave it there and then maybe look it up every five years, that's fine. That's not time series. Time series is when you're building a smarter, more intelligent, more real-time system, and the developers now know that, and so the more they play a role in building these systems, the more obvious it becomes. >> And since I have a PO for you and a big check, what's the value to me when I implement this? What's the end state? What's it look like when it's up and running? What's the value proposition for me? What's in it for me? >> So when it's up and running, you're able to handle the queries, the writing of the data, the downsampling of the data, and transforming it in near real time. So the other systems that depend on it, a system for adjusting a solar array or trading energy off of a Powerwall or some sort of human genome work, those systems work better. So time series is foundational. It's not like it's doing every action that's above, but it's foundational to building a really compelling intelligent system.
I think that's what developers and architects are seeing now. >> Bottom line. Final word. What's in it for the customer? What's your statement to the customer? What would you say to someone looking to do something in time series and edge? >> Yeah. So it's pretty clear to us that if you view yourself as being in the business of building systems, and you want them to be increasingly intelligent, self-healing, autonomous, and you want them to operate in real time, then you start from time series. I also want to say what's in it for us at Influx. What's in it for us is that people are doing some amazing stuff. I highlighted some of the energy stuff, some of the human genome, some of the health care. It's hard not to be proud or feel like, wow, somehow I've been lucky. I've arrived at the right time, in the right place, with the right people to be able to deliver on that. That's also exciting on our side of the equation. >> It's critical infrastructure, critical operations. >> Yeah. >> Great stuff. Evan, thanks for coming on. Appreciate this segment. All right. In a moment, Brian Gilmore, director of IoT and emerging technology at InfluxData, will join me. You're watching theCUBE, the leader in tech coverage. Thanks for watching.

Published Date : May 8 2022


Moving The World With InfluxDB

(upbeat music) >> Okay, we're now going to go into the customer panel. And we'd like to welcome Angelo Fausti, who's a software engineer at the Vera C. Rubin Observatory, and Caleb Maclachlan, who's a senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. You don't want to miss this interview, folks. Caleb, let's start with you. You work for an extremely cool company. You're launching satellites into space. Doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem. >> Yeah, absolutely. And thanks for having me here, by the way. So Loft Orbital is a Series B startup now. And our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about a lot of very specialized engineering. And what we're trying to do is change that from a super specialized problem that has an extremely high barrier of access to an infrastructure problem, so that getting your programs, your mission deployed on orbit, with access to different sensors, cameras, radios, stuff like that, is almost as simple as deploying a VM in AWS or GCP. So that's kind of our mission. And just to give a really brief example of the kind of customer that we can serve: there's a really cool company called Totum Labs, who is working on building an IoT constellation, for Internet of Things. Basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container, a container that you can track from anywhere in the world as it's going across the ocean. So it's really little.
And they've been able to stay a small startup that's focused on their product, which is that super crazy, complicated, cool radio, while we handle the whole space segment for them, which, before Loft, was really impossible. So that's our mission: providing space infrastructure as a service. We are kind of groundbreaking in this area, and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've got to handle. >> Yeah, so amazing, Caleb, what you guys do. I know you were lured to the skies very early in your career, but how did you kind of land in this business? >> Yeah, so I guess just a little bit about me. Some people don't necessarily know what they want to do early in their life. For me, I was five years old and I knew I wanted to be in the space industry. So I started in the Air Force, but have stayed in the space industry my whole career, and this is actually the fifth space startup that I've been a part of. So I kind of started out in satellites, did spend some time working in the launch industry on rockets. Now I'm here, back in satellites. And honestly, this is the most exciting of the different space startups that I've been a part of. So, I've always been passionate about space and basically writing software for operating in space, for basically extending how we write software into orbit. >> Super interesting. Okay, Angelo. Let's talk about the Rubin Observatory. Vera C. Rubin, famous woman scientist, galaxy guru. Now you guys, the observatory, are up, way up high. You're going to get a good look at the southern sky. I know COVID slowed you guys down a bit. But no doubt you continue to code away on the software. I know you're getting close. You've got to be super excited. Give us the update on the observatory and your role. >> All right. So yeah, Rubin is a state-of-the-art observatory that is under construction on a remote mountain in Chile.
And with Rubin we'll conduct the Legacy Survey of Space and Time. We are going to observe the sky with an eight-meter optical telescope and take 1000 pictures every night with a 3.2 gigapixel camera. And we're going to do that for 10 years, which is the duration of the survey. The goal is to produce an unprecedented data set, which is going to be about 0.5 exabytes of image data. And from these images we'll detect and measure the properties of billions of astronomical objects. We are also building a science platform that's hosted on Google Cloud, so that the scientists and the public can explore this data to make discoveries. >> Yeah, amazing project. Now, you aren't a Doctor of Philosophy. So you probably spent some time thinking about what's out there. And then you went on to earn a PhD in astronomy and astrophysics. So this is something that you've been working on for the better part of your career, isn't it? >> Yeah, that's right. About 15 years. I studied physics in college, then I got a PhD in astronomy. And I worked for about five years on another project, the Dark Energy Survey, before joining Rubin in 2015. >> Yeah, impressive. So it seems like both your organizations are looking at space from two different angles. One thing you guys both have in common, of course, is software. And you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB, get into it? How do you use the platform? Maybe Caleb, you can start. >> Yeah, absolutely. So the first company that I extensively used InfluxDB in was a launch startup called Astra. And we were in the process of designing our first generation rocket there and testing the engines, pumps, everything that goes into a rocket. And when I joined the company, our data story was not very mature. We were collecting a bunch of data in LabVIEW. And engineers were taking that over to MATLAB to process it. And at first, that's the way that a lot of engineers and scientists are used to working.
And at first that was, like, people weren't entirely sure that, that needed to change. But it's something, the nice thing about InfluxDB is that, it's so easy to deploy. So our software engineering team was able to get it deployed and up and running very quickly and then quickly also backport all of the data that we've collected thus far into Influx. And what was amazing to see and it's kind of the super cool moment with Influx is, when we hooked that up to Grafana, Grafana, is the visualization platform we use with influx, because it works really well with it. There was like this aha moment of our engineers who are used to this post process kind of method for dealing with their data, where they could just almost instantly, easily discover data that they hadn't been able to see before. And take the manual processes that they would run after a test and just throw those all in Influx and have live data as tests were coming. And I saw them implementing crazy rocket equation type stuff in Influx and it just was totally game changing for how we tested. And things that previously it would be like run a test, then wait an hour for the engineers to crunch the data and then we run another test with some changed parameters or a changed startup sequence or something like that, became, by the time the test is over, the engineers know what the next step is, because they have this just like instant game changing access to data. So since that experience, basically everywhere I've gone, every company since then, I've been promoting InfluxDB and using it and spinning it up and quickly showing people how simple and easy it is. >> Yeah, thank you. So Angelo, I was explaining in my open that, you know you could add a column in a traditional RDBMS and do time series. But with the volume of data that you're talking about in the example that Caleb just gave, you have to have a purpose built time series database. Where did you first learn about InfluxDB? >> Yeah, correct. 
So I worked with the data management team, and my first project was to record metrics that measure the performance of our software, the software that we use to process the data. So I started implementing that in a relational database. But then I realized that in fact I was dealing with time series data, and I should really use a solution built for that. And then I started looking at time series databases and I found InfluxDB. That was back in 2018. Then I got involved in another project, to record telemetry data from the telescope itself. It's very challenging because you have so many subsystems and sensors producing data. And with that data, the goal is to look at the telescope hardware in real time so we can make decisions and make sure that everything's doing the right thing. And another use for InfluxDB that I'm also interested in is the visits database. If you think about the observations, we are moving the telescope all the time, pointing to specific directions in the sky and taking pictures every 30 seconds. So that itself is a time series. And every point in the time series, we call that a visit. So we want to record the metadata about those visits in InfluxDB. That time series is going to be 10 years long, with about 1000 points every night. It's actually not too much data compared to the other problems. It's really just a different time scale. So yeah, we have plans to continue using InfluxDB and finding new applications in the project. >> Yeah, and the speed with which you can actually get high quality images. Angelo, my understanding is, you use InfluxDB, as you said, for monitoring the telescope hardware and the software, and, let's just say, some of the scientific data as well. The telescope at the Rubin Observatory is like, no pun intended, I guess, the star of the show. And I believe I read that it's going to be the first of the next-gen telescopes to come online.
It's got this massive field of view, like three orders of magnitude times the Hubble's widest camera view, which is amazing. That's like 40 moons in an image, and amazingly fast as well. What else can you tell us about the telescope? >> Yeah, so it's really a challenging project from the point of view of engineering. This telescope has to move really fast. And it also has to carry the primary mirror, which is an eight-meter piece of glass. It's very heavy. And it has to carry a camera, which is about the size of a small car. And this whole structure weighs about 300 tons. For that to work, the telescope needs to be very compact and stiff. And one thing that's amazing about its design is that the telescope, this 300-ton structure, sits on a tiny film of oil, which is about as thick as a human hair, and that provides an almost zero-friction interface. In fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide-field telescope. So each image has, in diameter, the size of about seven full moons. And with that we can map the entire sky in only three days. And of course, during operations, everything's controlled by software, and it's automatic. There's a very complex piece of software called the scheduler, which is responsible for moving the telescope and the camera, which will record 15 terabytes of data every night. >> And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >> Yeah, actually not. So we're using InfluxDB to record engineering data and metadata about the observations, like telemetry, events, and the commands from the telescope. That's a much smaller data set compared to the images. But it is still challenging because you have some high-frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >> Hm.
So at the mountain, we keep the data for 30 days. So the observers use an InfluxDB instance running there to analyze the data. But we also replicate the data to another instance running at the US data facility, where we have more computational resources and so more people can look at the data without interfering with the observations. Yeah, I have to say that InfluxDB has been really instrumental for us, especially at this phase of the project, where we are testing and integrating the different pieces of hardware. And it's not just the database, right. It's the whole platform. So I like to give this example: when we are doing this kind of task, it's hard to know in advance which dashboards and visualizations you're going to need, right. So what you really need is a data exploration tool. And with tools like Chronograf, for example, having the ability to query and create dashboards on the fly was really a game changer for us. So astronomers typically are not software engineers, but they are the ones that know better than anyone what needs to be monitored. And so they use Chronograf and they can create the dashboards and the visualizations that they need. >> Got it. Thank you. Okay, Caleb, let's bring you back in. Tell us more about, you've got these dishwasher-size satellites that are kind of using a multi-tenant model. I think it's genius. But tell us about the satellites themselves. >> Yeah, absolutely. So we have some satellites in space already that, as you said, are like dishwasher, mini-fridge kind of size. And we're working on a bunch more that are a variety of sizes, from shoebox to, I guess, a few times larger than what we have today. And we do shoot to have effectively something like a multi-tenant model, where we will buy a bus off the shelf. The bus is what you can kind of think of as the core piece of the satellite, almost like a motherboard or something.
Where it's providing the power, it has the solar panels, it has some radios attached to it, it handles the attitude control, basically steers the spacecraft in orbit. And then we build, also in house, what we call our payload hub, which has all the customer payloads attached, and our own kind of edge processing capabilities built into it. And so we integrate that, we launch it, and those things, because they're in low Earth orbit, they're orbiting the Earth every 90 minutes. That's seven kilometers per second, which is several times faster than a speeding bullet. So one of the unique challenges of operating spacecraft in low Earth orbit is that generally you can't talk to them all the time. So we're managing these things through very brief windows of time, where we get to talk to them through our ground sites, either in Antarctica or in the North Pole region. So we'll see them for 10 minutes, and then we won't see them for the next 90 minutes as they zip around the Earth collecting data. So one of the challenges that exists for a company like ours is that you have to be able to make real-time decisions operationally, in those short windows, that can sometimes be critical to the health and safety of the spacecraft. And it could be possible that we put ourselves into a low power state in the previous orbit, or something potentially dangerous to the satellite can occur. And so as an operator, you need to very quickly process that data coming in. And not just the live data, but also the massive amounts of data that were collected in what we call the back orbit, which is the time that we couldn't see the spacecraft. >> We got it. So talk more about how you use InfluxDB to make sense of this data from all that tech that you're launching into space. >> Yeah, so basically, when I joined the company, we started off storing all of that, as Angelo did, in a regular relational database.
And we found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren't able to even store all of the data that we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. So things like power levels, voltages, currents, counts, whatever metadata we need to monitor about the spacecraft, we now store that in InfluxDB. And now we can actually easily store the entire volume of data for the mission life so far, without having to worry about the size bloating to an unmanageable amount. And we can also seamlessly query large chunks of data. For example, as an operator, I might want to see how my battery state of charge is evolving over the course of the year. I can have a plot in Influx that loads a year's worth of data in a fraction of a second, because I can intelligently group the data by time interval. So it's been extremely powerful for us to access the data. And as time has gone on, we've gradually migrated more and more of our operating data into Influx. So not only do we store the basic telemetry about the bus and our payload hub, but we're also storing data that our customers are generating on board. One example of a customer that's doing something pretty cool: they have a computer on our satellite, which they can reprogram themselves to do some AI-enabled edge compute type capability in space. And so they're sending us some metrics about the status of their workloads, in addition to the basics, like the temperature of their payload, their computer, or whatever else. And we're delivering that data to them through Influx in a Grafana dashboard that they can plot, where they can see not only whether this pipeline has succeeded or failed, but also where was the spacecraft when this occurred?
What was the voltage being supplied to their payload? Whatever they need to see, it's all right there for them, because we're aggregating all that data in InfluxDB. >> That's awesome. You're measuring everything. Let's talk a little bit about, we throw this term around a lot, data driven. A lot of companies say, oh, yes, we're data driven. But you guys really are. I mean, you've got data at the core. Caleb, what does that mean to you? >> Yeah, so you know, I think the clearest example of when I saw this be totally game changing is what I mentioned before, at Astra, where our engineers' feedback loop went from a lot of kind of slow researching, digging into the data, to an almost instantaneous one: seeing the data and making decisions based on it immediately, rather than having to wait for some processing. And that's something that I've also seen echoed in my current role. But to give another practical example, as I said, we have a huge amount of data that comes down every orbit, and we need to be able to ingest all that data almost instantaneously and provide it to the operator in near real time. About a second's worth of latency is all that's acceptable for us to react to, to see what is coming down from the spacecraft, and building that pipeline is challenging from a software engineering standpoint. Our primary language is Python, which isn't necessarily that fast. So what we've started doing, with the goal of being data driven, is publishing metrics on how individual pieces of our data processing pipeline are performing into Influx as well. And we do that in production as well as in dev. So we have kind of a production monitoring flow. And what that has done is allow us to make intelligent decisions on our software development roadmap, about where it makes the most sense for us to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are.
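The approach Caleb describes, publishing per-stage timing for a Python processing pipeline as time series points, can be sketched roughly as follows. This is a minimal illustration, not Loft Orbital's code: the measurement name `pipeline_stage`, the tag names `env` and `stage`, the field `duration_ms`, and the `StageTimer` helper are all hypothetical. The output uses the general shape of InfluxDB line protocol (`measurement,tags fields timestamp`); escaping of special characters and the actual write to a database are left out.

```python
import time
from contextlib import contextmanager


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as 'measurement,tag=v field=v timestamp'.

    Simplified: assumes tag and field values contain no spaces, commas,
    or equals signs, so no escaping is performed.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


class StageTimer:
    """Collects one timing point per pipeline stage (hypothetical helper)."""

    def __init__(self, env):
        self.env = env
        self.lines = []

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed wall-clock time even if the stage raised.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.lines.append(
                to_line_protocol(
                    "pipeline_stage",
                    {"env": self.env, "stage": name},
                    {"duration_ms": round(elapsed_ms, 3)},
                    time.time_ns(),
                )
            )


# Stand-in workload: two made-up stages of a telemetry pipeline.
timer = StageTimer(env="dev")
with timer.stage("decode"):
    frames = [bytes([i % 256]) for i in range(1000)]
with timer.stage("parse"):
    values = [frame[0] for frame in frames]

for line in timer.lines:
    print(line)
```

Tagging each point with the stage name and environment is what would let a dashboard compare stages side by side, in dev and in production, and show where the time is actually going.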
At times we've found ourselves, before we started doing this, kind of chasing rabbits that weren't necessarily the real root cause of issues that we were seeing. But now that we're being a bit more data driven there, we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scale from supporting a couple of satellites to supporting many, many satellites at once. >> So you reduce those dead ends. Maybe Angelo, you could talk about what data driven means to you and your team? >> Yeah, I would say that having real-time visibility into the telemetry data and metrics is crucial for us. We need to make sure that the images that we collect with the telescope have good quality and that they are within the specifications to meet our science goals. And so if they are not, we want to know that as soon as possible, and then start fixing problems. >> Yeah, so I mean, you think about these big science use cases, Angelo. They are extremely high precision, you have to have a lot of granularity, very tight tolerances. How does that play into your time series data strategy? >> Yeah, so one of the subsystems that produces high volume and high rates is the structure that supports the telescope's primary mirror. So on that structure, we have hundreds of actuators that compensate the shape of the mirror for deformations. That's part of our active optics system. So that's really real time. And we have to record these high data rates, and we have requirements to handle data at a few hundred hertz. So we can easily configure our database with millisecond precision. That's for telemetry data. But for events, sometimes we have events that are very close to each other, and then we need to configure the database with higher precision. >> Um-hm. >> For example, microseconds. >> Yeah, so Caleb, what are your event intervals like?
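Angelo's point about timestamp precision can be illustrated with a small sketch. In InfluxDB, a point is identified by its measurement, tag set, and timestamp, so two events in the same series that round to the same timestamp are stored as one point rather than two. The code below simulates that behavior with a plain dictionary; the series name, event values, and base timestamp are made up for illustration, and nothing here talks to a real database.

```python
# Nanoseconds per unit for each write precision.
UNITS_NS = {"ms": 1_000_000, "us": 1_000, "ns": 1}


def truncate(timestamp_ns, precision):
    """Drop everything below the chosen precision."""
    return timestamp_ns // UNITS_NS[precision]


def store(events, precision):
    """Simulate a store keyed by (series, timestamp).

    A later write with the same key overwrites the earlier one.
    """
    table = {}
    for series, ts_ns, value in events:
        table[(series, truncate(ts_ns, precision))] = value
    return table


base = 1_650_000_000_000_000_000  # arbitrary epoch time in nanoseconds
events = [
    ("actuator_events", base, "command_sent"),
    ("actuator_events", base + 200_000, "command_acked"),      # 200 microseconds later
    ("actuator_events", base + 1_500_000, "motion_complete"),  # 1.5 milliseconds later
]

at_ms = store(events, "ms")
at_us = store(events, "us")
print(len(at_ms), len(at_us))  # 2 3
```

At millisecond precision the first two events share a key and only the second survives. At microsecond precision all three are kept, which is why closely spaced events call for a finer precision than regularly sampled telemetry.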
>> So I would say that, as of today on the spacecraft, the event, the level of timing that we deal with probably tops out at about 20 hertz, 20 measurements per second on things like our gyroscopes. But I think the core point here of the ability to have high precision data is extremely important for these kinds of scientific applications. And I'll give you an example, from when I worked on the rockets at Astra. There, our baseline data rate that we would ingest data during a test is 500 hertz, so 500 samples per second. And in some cases, we would actually need to ingest much higher rate data. Even up to like 1.5 kilohertz. So extremely, extremely high precision data there, where timing really matters a lot. And, I can, one of the really powerful things about Influx is the fact that it can handle this, that's one of the reasons we chose it. Because there's times when we're looking at the results of firing, where you're zooming in. I've talked earlier about how on my current job, we often zoom out to look at a year's worth of data. You're zooming in, to where your screen is preoccupied by a tiny fraction of a second. And you need to see, same thing, as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers. So that can be something like, hey, I opened this valve at exactly this time. And that goes, we want to have that at micro or even nanosecond precision, so that we know, okay, we saw a spike in chamber pressure at this exact moment, was that before or after this valve open? That kind of visibility is critical in these kinds of scientific applications and absolutely game changing, to be able to see that in near real time. And with a really easy way for engineers to be able to visualize this data themselves without having to wait for us software engineers to go build it for them. >> Can the scientists do self serve? 
Or do you have to design and build all the analytics and queries for scientists? >> From my perspective, that's absolutely one of the best things about Influx, and what I've seen be game changing is that, generally, I'd say anyone can learn to use Influx. And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic, open source graphing tool that is very similar to Influx's own Chronograf. >> Sure. >> And what it does is provide a very intuitive UI for building your query. So you choose a measurement, and it shows a drop-down of available measurements, and then you choose the particular field you want to look at. And again, that's a drop-down. So it's really easy for our users to discover it. And there are kind of point-and-click options for doing math and aggregations. You can even do predictions, all within Grafana. The Grafana user interface is really just a wrapper around the APIs and functionality that Influx provides. So yes, absolutely, that's been the most powerful thing about it: it gets us out of the way, us software engineers, who may not know quite as much as the scientists and engineers that are closer to the interesting math. And they build these crazy dashboards where I'm just like, wow, I had no idea you could do that. I had no idea that is something that you would want to see. And absolutely, that's the most empowering piece. >> Yeah, putting data in the hands of those who have the context, the domain experts, is key. Angelo, is it the same situation for you? Is it self serve? >> Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards, because they know exactly what they need to visualize. And I have an example just from last week.
We had an engineer at the observatory that was building a dashboard to monitor the cooling system of the entire building. And he was familiar with InfluxQL, which was the primary query language in version one of InfluxDB. And that was really a challenge, because he had all the data spread across multiple InfluxDB measurements. And he was, like, doing one query for each measurement and was not able to produce what he needed. But that's the perfect use case for Flux, which is the new data scripting language that InfluxData developed and introduced as the main language in version two. And so with Flux, he was able to combine data from multiple measurements and summarize this data in a nice table. So yeah, having a more flexible and powerful language also allows you to make a better visualization. >> So Angelo, where would you be without time series databases, that technology generally, maybe specifically InfluxDB, as one of the leading platforms? Would you be able to do this? >> Yeah, it's hard to imagine doing what we are doing without InfluxDB. And I don't know, perhaps it would be just a matter of time to rediscover InfluxDB. >> Yeah. How about you, Caleb? >> Yeah, I mean, it's all about using the right tool for the job. I think for us, when I joined the company, we weren't using InfluxDB and we were dealing with serious issues of the database growing to an incredible size extremely quickly. And even querying short periods of data was taking on the order of seconds, which is just not possible for operations. So if you're dealing with large volumes of time series data, a time series database is the right tool for the job, and Influx is a great one for it. So, yeah, it's absolutely required to use for this kind of data, there is not really any other option. >> Guys, this has been really informative. It's pretty exciting to see how the edge is mountaintops, low Earth orbits. Space is the ultimate edge.
Isn't it? I wonder if you could answer two questions to wrap here. What comes next for you guys? And is there something that you're really excited about that you're working on? Caleb, maybe you could go first, and then Angelo, you could bring us home. >> Yeah, absolutely. So basically, what's next for Loft Orbital is more satellites, a greater push towards infrastructure, and really, our mission is to make space simple for our customers and for everyone. And we're scaling the company like crazy now, making that happen. It's an extremely exciting time to be in this company and to be in this industry as a whole, because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of, and with companies like SpaceX now rapidly lowering the cost of launch, it's just a really exciting place to be in. And we're launching more satellites. We're scaling up for some constellations, and our ground system has to be improved to match. So there are a lot of improvements that we are working on to really scale up our control systems to be best in class and make them capable of handling such large workloads. So, yeah, what's next for us is just really 10X-ing what we are doing. And that's extremely exciting. >> And anything else you are excited about? Maybe something personal? Maybe, you know, a tidbit you want to share. Are you guys hiring? >> We're absolutely hiring. So, we've positions all over the company. We need software engineers. We need people who do more aerospace-specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website, if this is at all interesting. Personal-wise, I don't have any interesting personal things that are data related. But my current hobby is sea kayaking, so I'm working on becoming a sea kayaking instructor. So if anyone likes to go sea kayaking out in the San Francisco Bay area, hopefully I'll see you out there. >> Love it.
All right, Angelo, bring us home. >> Yeah. So what's next for us is, we're getting this telescope working and collecting data, and when that happens, it's going to be just a deluge of data coming out of this camera. And handling all that data is going to be really challenging. Yeah, I wonder, I might not be here for that. I'm looking forward to it. Like, for next year we have an important milestone, which is our commissioning camera, which is a simplified version of the full camera, is going to be on sky, and so most of the system has to be working by then. >> Any cool hobbies that you are working on, or any side project? >> Yeah, actually, during the pandemic I started gardening. And I live here in Tucson, Arizona. It gets really challenging during the summer because of the lack of water, right? And so we have an automatic irrigation system at the farm, and I'm trying to develop a small system to monitor the irrigation and make sure that our plants have enough water to survive. >> Nice. All right, guys, with that we're going to end it. Thank you so much. Really fascinating, and thanks to InfluxDB for making this possible. Really groundbreaking stuff, enabling value at the edge, in the cloud, and of course beyond, in space. Really transformational work that you guys are doing. So congratulations, and I really appreciate the broader community. I can't wait to see what comes next from this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave Vellante, and you are watching theCUBE, the leader in high tech enterprise coverage. (upbeat music)
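
Editor's sketch, not part of the interview: Caleb's point about needing to know whether a valve opened before or after a chamber pressure spike can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions: telemetry samples and controller events are both stamped with integer nanoseconds, and the sample values, the event time, and the helper names (`first_spike_ns`, `last_sample_before`) are hypothetical. None of it comes from Loft Orbital, Astra, or the InfluxDB API.

```python
from bisect import bisect_right

# Hypothetical chamber-pressure telemetry as (timestamp_ns, pressure_kpa).
# Samples are 2 ms apart, i.e. the 500 Hz rate mentioned in the interview.
SAMPLE_PERIOD_NS = 2_000_000
telemetry = [
    (i * SAMPLE_PERIOD_NS, pressure)
    for i, pressure in enumerate([101.0, 101.2, 101.1, 180.5, 182.0])
]

# Hypothetical controller event, stamped with nanosecond precision.
valve_open_ns = 5_250_000


def first_spike_ns(samples, threshold):
    """Return the timestamp of the first sample above threshold, or None."""
    for ts, value in samples:
        if value > threshold:
            return ts
    return None


def last_sample_before(samples, event_ns):
    """Return the latest sample at or before event_ns, or None."""
    timestamps = [ts for ts, _ in samples]
    idx = bisect_right(timestamps, event_ns)
    return samples[idx - 1] if idx else None


spike_ns = first_spike_ns(telemetry, threshold=150.0)
# Integer nanosecond timestamps compare exactly, so the ordering is unambiguous.
order = "before" if valve_open_ns < spike_ns else "after"
print(f"valve opened {order} the pressure spike")
```

The design point is simply that the event falls between two 2 ms samples, so millisecond-rounded event timestamps could make the ordering ambiguous, while nanosecond integers keep it exact.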

Published Date : Apr 21 2022


ENTITIES

Entity	Category	Confidence
Angela	PERSON	0.99+
2015	DATE	0.99+
Dave Vellante	PERSON	0.99+
Angelo Fausti	PERSON	0.99+
1000 pictures	QUANTITY	0.99+
Loft Orbital	ORGANIZATION	0.99+
Caleb Maclachlan	PERSON	0.99+
40 moons	QUANTITY	0.99+
500 hertz	QUANTITY	0.99+
30 days	QUANTITY	0.99+
Chile	LOCATION	0.99+
SpaceX	ORGANIZATION	0.99+
Caleb	PERSON	0.99+
2018	DATE	0.99+
Antarctica	LOCATION	0.99+
10 years	QUANTITY	0.99+
15 terabytes	QUANTITY	0.99+
San Francisco Bay	LOCATION	0.99+
Earth	LOCATION	0.99+
North Pole	LOCATION	0.99+
Angelo	PERSON	0.99+
Python	TITLE	0.99+
Vera C. Rubin	PERSON	0.99+
Influx	TITLE	0.99+
10 minutes	QUANTITY	0.99+
3.2 gigapixel	QUANTITY	0.99+
InfluxDB	TITLE	0.99+
300 tons	QUANTITY	0.99+
two questions	QUANTITY	0.99+
both	QUANTITY	0.99+
Rubin Observatory	LOCATION	0.99+
last week	DATE	0.99+
each image	QUANTITY	0.99+
1.5 kilohertz	QUANTITY	0.99+
first project	QUANTITY	0.99+
eight meter	QUANTITY	0.99+
today	DATE	0.99+
next year	DATE	0.99+
Vera C Rubin Observatory	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
US	LOCATION	0.99+
one thing	QUANTITY	0.98+
an hour	QUANTITY	0.98+
first	QUANTITY	0.98+
first generation	QUANTITY	0.98+
one	QUANTITY	0.98+
three orders	QUANTITY	0.98+
one example	QUANTITY	0.97+
Tucson, Arizona	LOCATION	0.97+
InfluxQL	TITLE	0.97+
hundreds of actuators	QUANTITY	0.97+
each measurement	QUANTITY	0.97+
about 300 pounds	QUANTITY	0.97+

Evan Kaplan, InfluxData

(upbeat music) >> Okay today, we welcome Evan Kaplan, CEO of InfluxData, the company behind InfluxDB. Welcome Evan, thanks for coming on. >> Hey John, thanks for having me. >> Great segment here on the InfluxDB story. What is the story? Take us through the history, why time series? What's the story? >> So the history history is actually pretty interesting. Paul Dix my partner in this and our founder, super passionate about developers and developer experience. And he had worked on wall street building a number of time series kind of platform, trading platforms for trading stocks. And from his point of view, it was always what he would call a yak shave. Which means you had to do a ton of work just to start doing work. Which means you had to write a bunch of extrinsic routines, you had to write a bunch of application handling on existing relational databases, in order to come up with something that was optimized for a trading platform or a time series platform. And he sort of, he just developed this real clear point of view. This is not how developers should work. And so in 2013, he went through Y Combinator, and he built something for, he made his first commit to open source InfluxDB in the end of 2013. And he basically, you know from my point of view, he invented modern time series, which is you start with a purpose built time series platform to do these kind of workloads, and you get all the benefits of having something right out of the box. So a developer can be totally productive right away. >> And how many people are in the company? What's the history of employees is there? >> Yeah, I think we're, you know, I always forget the number but something like 230 or 240 people now. I joined the company in 2016, and I love Paul's vision. And I just had a strong conviction about the relationship between time series and IOT. 'Cause if you think about it, what sensors do is they speak time series. 
Pressure, temperature, volume, humidity, light, they're measuring, they're instrumenting something over time. And so I thought that would be super relevant over the long term, and I've not regretted it. >> Oh no, and it's interesting at that time if you go back in history, you know, the role of database. It's all relational database, the one database to rule the world. And then as cloud started coming in, you started to see more databases proliferate, types of databases. And time series in particular is interesting 'cause real time has become super valuable from an application standpoint. IOT which speaks time series, means something. It's like time matters >> Times yeah. >> And sometimes data's not worth it after the time, sometimes it's worth it. And then you get the data lake, so you have this whole new evolution. Is this the momentum? What's the momentum? I guess the question is what's the momentum behind it? >> You mean what's causing us to grow so fast? >> Yeah the time series, why is time series- >> And the category- >> Momentum, what's the bottom line? >> Well think about it, you think about it from a broad sort of frame which is, what everybody's trying to do is build increasingly intelligent systems. whether it's a self-driving car or a robotic system that does what you want to do, or a self-healing software system. Everybody wants to build increasing intelligent systems. And so in order to build these increasing intelligent systems, you have to instrument the system well. And you have to instrument it over time, better and better. And so you need a tool, a fundamental tool to drive that instrumentation. And that's become clear to everybody that that instrumentation is all based on time. And so what happened, what happened, what happened, what's going to happen. And so you get to these applications like predictive maintenance, or smarter systems, and increasingly you want to do that stuff not just intelligently, but fast in real time. 
So millisecond response, so that when you're driving a self-driving car, and the system realizes that you're about to do something, essentially you want to be able to act in something that looks like real time. All systems want to do that, they want to be more intelligent, and they want to be more real time. And so we just happen to, you know, we happen to show up at the right time in the evolution of a market. >> It's interesting near real time isn't good enough when you need real time. >> Yeah, it's not, it's not. And it's like everybody wants real even when you don't need it, ironically you want it. It's like having the feature for, you know you buy a new television, you want that one feature, even though you're not going to use it. You decide that's your buying criteria. Real time is criteria for people. >> So I mean, what you're saying then is near realtime is getting closer to real time as fast as possible? >> Right. >> Okay, so talk about the aspect of data, 'cause we're hearing a lot of conversations on theCUBE in particular around how people are implementing and actually getting better. So iterating on data, but you have to know when it happened to get know how to fix it. So this is a big part of what we're seeing with people saying, "Hey, you know I want to "make my machine learning algorithms better "after the fact, I want to learn from the data." How do you see that evolving? Is that one of the use cases of sensors as people bring data in off the network, getting better with the data, knowing when it happened? >> Well, for sure what you're saying is, is none of this is non-linear, it's all incremental. And so if you take something, you know just as an easy example, if you take a self-driving car, what you're doing is you're instrumenting that car to understand where it can perform in the real world in real time. And if you do that, if you run the loop which is, I instrument it, I watch what happens, oh that's wrong, oh I have to correct for that. 
I correct for that in the software. If you do that a billion times, you get a self-driving car. But every system moves along that evolution. And so you get the dynamic of constantly instrumenting, watching the system behave and do it. And so a self-driving car is one thing, but even in the human genome, if you look at some of our customers, you know, people doing solar arrays, people doing power walls, like, all of these systems are getting smarter and smarter. >> Well, let's get into that. What are the top applications? What are you seeing with InfluxDB, the time series, what's the sweet spot for the application use case and some customers? Give some examples. >> Yeah, so it's pretty easy to understand. On one side of the equation, that's the physical side, sensors are getting cheap, obviously we know that. The whole physical world is getting instrumented: your home, your car, the factory floor, your wristwatch, your healthcare, you name it, it's getting instrumented in the physical world. We're watching the physical world in real time. And so there are three or four sweet spots for us, but they're all on that side, they're all about IOT. So think about consumer IOT kinds of projects like Google's Nest, Tudor, Particle sensors, even delivery engines like Rappi, who are the Instacart of South America. Like, anywhere there's a physical location, and that's on the consumer side. And then another exciting space is the industrial side. Factories are changing dramatically over time, increasingly moving away from proprietary equipment to developer-driven systems that run operations. Because what has to get smarter when you're building a factory is, systems all have to get smarter. And then lastly, a lot in the renewables, so sustainability.
So a lot, you know, Tesla, Lucid Motors, Nikola Motors, you know, lots to do with electric cars, solar arrays, windmill arrays, just anything that's going to get instrumented, where that instrumentation becomes part of what the purpose is. >> It's interesting, the convergence of physical and digital is happening with the data. IOT, you mentioned, you know, you think of IOT, look at the use cases there. It was proprietary OT systems, now becoming more IP enabled, internet protocol. And now edge compute, getting smaller, faster, cheaper. AI going to the edge. Now you have all kinds of new capabilities that bring that real time and time series opportunity. Are you seeing IOT going to a new level? Where are the IOT and OT dots connecting to? Because, you know, as these two cultures merge, operations basically, industrial, factory, car, they've got to get smarter. Intelligent edge is a buzzword, but I mean, it has to be more intelligent. Where's the action in all this? >> So the action, really, at the core, it's at the developer, right? Because you're looking at these things, it's very hard to get an off-the-shelf system to do the kinds of physical and software interaction. So the action's really happening at the developer. And so what you're seeing is a movement in the world that maybe you and I grew up in, with IT or OT, moving increasingly to that developer-driven capability. And so all of these IOT systems, they're bespoke, they don't come out of the box. And so the developer, the architect, the CTO, they define: what's my business? What am I trying to do? Am I trying to sequence a human genome and figure out when these genes express themselves? Or am I trying to figure out when the next heart rate monitor is going to show up in my Apple Watch? Right, what am I trying to do? What's the system I need to build? And so starting with the developer is where all of the good stuff happens here. Which is different than it used to be, right.
It used to be you'd buy an application or a service or a SaaS thing, but with this dynamic, with this integration of systems, it's all about bespoke, it's all about building something. >> So let's get to the developer real quick. The real highlight point here is the data. I mean, I could see a developer saying, "Okay, I need to have an application for the edge," IOT edge or car. I mean, Tesla's got applications in the car, it's right there. I mean, there's the modern application life cycle now. So take us through how this impacts the developer. Does it impact their CI/CD pipeline? Is it cloud native? I mean, where does this go to? >> Well, so first of all, there was an internal journey that we had to go through as a company, which I think is fascinating for anybody that's interested: we went from primarily a monolithic software that was open sourced to building a cloud-native platform. Which means we had to move from an agile development environment to a CI/CD environment. So to the degree that you are moving your service, whether it's, you know, Tesla monitoring your car and updating your power walls, right, or whether it's a solar company updating the arrays, right, to the degree that that service is cloud, then increasingly you move from agile development to a CI/CD environment, in which you're shipping code to production every day. And so it's not just the developers, it's all the infrastructure to support the developers to run that service and that sort of stuff. I think that's also going to happen in a big way. >> With your customer base that you have now, and as you see it evolving with InfluxDB, is it that they're going to be writing more of the application or relying more on others? I mean, obviously there's an open source component here. So when you bring in kind of old way, new way, the old way was, I've got a proprietary platform running all this IOT stuff, and I've got to write, here's an application that's general purpose.
I have some flexibility, somewhat brittle, maybe not a lot of robustness to it, but it does this job. >> A good way to think about this is- >> Versus new way which is what? >> So yeah a good way to think about this is what's the role of the developer/architect, CTO, that chain within a large, with an enterprise or a company. And so the way to think about is I started my career in the aerospace industry. And so when you look at what Boeing does to assemble a plane, they build very very few of the parts. Instead what they do is they assemble. They buy the wings, they buy the engines, they assemble, actually they don't buy the wings. That's the one thing, they buy the material for the wing. They build the wings 'cause there's a lot of tech in the wings, and they end up being assemblers, smart assemblers of what ends up being a flying airplane. Which is a pretty big deals even now. And so what happens with software people is, they have the ability to pull from you know, the best of the open source world. So they would pull a time series capability from us, then they would assemble that with potentially some ETL logic from somebody else. Or they'd assemble it with a Kafka interface to be able to stream the data in. And so they become very good integrators and assemblers but they become masters of that bespoke application. And I think that's where it goes 'cause you're not writing native code for everything. >> So they're more flexible, they have faster time to market 'cause they're assembling. >> Way faster. >> And they get to still maintain their core competency, AKA their wings in this case. >> They become increasingly not just coders but designers and developers. They become broadly builders is what we like to think of it. People who start and build stuff. By the way, this is not different than the people just up the road. Google have been doing for years or the tier one Amazon building all their own. 
>> Well, I think one of the things that's interesting is that this idea of a systems developing, a system architecture. I mean systems have consequences when you make changes. So when you have now cloud data center on-premise and edge working together, how does that work across the system? You can't have a wing that doesn't work with the other wing kind of thing. >> That's exactly, but that's where that Boeing or that airplane building analogy comes in. For us, we've really been thoughtful about that because IOT it's critical. So our open source edge has the same API as our cloud native stuff that has enterprise on prem edge. So our multiple products have the same API and they have a relationship with each other. They can talk with each other. So the builder builds it once. And so this is where, when you start thinking about the components that people have to use to build these services is that, you want to make sure at least that base layer, that database layer that those components talk to each other. >> So I'll have to ask you if I'm the customer, I put my customer hat on. Okay, hey, I'm dealing with a lot. >> Does that mean you have a PO for- >> (laughs) A big check, a blank check, if you can answer this question. >> Only if in tech. >> If you get the question right. I got all this important operation stuff, I got my factory, I got my self-driving cars, this isn't like trivial stuff, this is my business. How should I be thinking about time series? Because now I have to make these architectural decisions as you mentioned and it's going to impact my application development. So huge decision point for your customers. What should I care about the most? What's in it for me? Why is time series important? >> Yeah, that's a great question. So chances are, if you've got a business that was 20 years old or 25 years old, you were already thinking about time series. 
You probably didn't call it that, you built something on Oracle, or you built something on IBM's Db2, right, and you made it work within your system. Right, and so that's what you started building. So it's already out there, there are probably hundreds of millions of time series applications out there today. But as you start to think about this increasing need for real time, and you start to think about increasing intelligence, you think about optimizing those systems over time, I hate the word, but digital transformation. Then you start with time series, it's a foundational base layer for any system that you're going to build. There's no system I can think of where time series shouldn't be the foundational base layer. If you just want to store your data and just leave it there and then maybe look it up every five years, that's fine. That's not time series. Time series is when you're building a smarter more intelligent, more real time system. And the developers now know that. And so the more they play a role in building these systems the more obvious it becomes. >> And since I have a PO for you and a big check. >> Yeah. >> What's the value to me when I implement this? What's the end state? What's it look like when it's up and running? What's the value proposition for me? What's in it for me? >> So when it's up and running, you're able to handle the queries, the writing of the data, the down sampling of the data, the transforming it in near real time. So that the other dependencies that a system it gets for adjusting a solar array or trading energy off of a power wall or some sort of human genome, those systems work better. So time series is foundational. It's not like it's doing every action that's above, but it's foundational to build a really compelling intelligence system. I think that's what developers and architects are seeing now. >> Bottom line, final word, what's in it for the customer? What's your statement to the customer? 
What would you say to someone looking to do something in time series and edge? >> Yeah so it's pretty clear to us that if you're building, if you view yourself as being in the business of building systems, that you want 'em to be increasingly intelligent, self-healing autonomous. You want 'em to operate in real time, that you start from time series. But I also want to say what's in it for us, Influx. What's in it for us is, people are doing some amazing stuff. You know, I highlighted some of the energy stuff, some of the human genome, some of the healthcare, it's hard not to be proud or feel like, "Wow." >> Yeah. >> "Somehow I've been lucky, I've arrived at the right time, "in the right place with the right people "to be able to deliver on that." That's also exciting on our side of the equation. >> Yeah, it's critical infrastructure, critical of operations. >> Yeah. >> Great stuff. Evan thanks for coming on, appreciate this segment. All right, in a moment, Brian Gilmore director of IOT and emerging technology at InfluxData will join me. You're watching theCUBE, leader in tech coverage. Thanks for watching. (upbeat music)
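
Editor's sketch, not part of the interview: Evan mentions downsampling as one of the jobs a time series platform handles. As a rough illustration of the idea only, the Python below averages raw readings into fixed time windows. The function name `downsample`, the readings, and the 60-second window are hypothetical choices made for this example, and the code does not describe how InfluxDB implements downsampling internally.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_s):
    """Average (timestamp_s, value) points into fixed windows of window_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of the window it falls in.
        buckets[ts - ts % window_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical sensor readings as (seconds since start, temperature).
readings = [(0, 20.0), (10, 22.0), (20, 24.0), (60, 30.0), (70, 32.0)]
per_minute = downsample(readings, window_s=60)
print(per_minute)
```

Keeping one averaged point per window trades detail for storage and query speed, which is why raw high-rate data is often retained briefly while the downsampled series is kept long term.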

Published Date : Apr 19 2022


ENTITIES

Entity	Category	Confidence
John	PERSON	0.99+
2016	DATE	0.99+
Brian Gilmore	PERSON	0.99+
Boeing	ORGANIZATION	0.99+
Evan Kaplan	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Evan Kaplan	PERSON	0.99+
2013	DATE	0.99+
Tesla	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Paul Dix	PERSON	0.99+
South America	LOCATION	0.99+
230	QUANTITY	0.99+
Evan	PERSON	0.99+
InfluxData	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Paul	PERSON	0.99+
three	QUANTITY	0.99+
today	DATE	0.99+
240 people	QUANTITY	0.99+
first	QUANTITY	0.99+
IOT	ORGANIZATION	0.98+
one	QUANTITY	0.98+
end of 2013	DATE	0.97+
one side	QUANTITY	0.97+
Lucid	ORGANIZATION	0.96+
Y Combinator	ORGANIZATION	0.96+
one thing	QUANTITY	0.96+
tier one	QUANTITY	0.94+
InfluxDB	TITLE	0.93+
one feature	QUANTITY	0.93+
25 years old	QUANTITY	0.93+
20 years old	QUANTITY	0.93+
one database	QUANTITY	0.91+
hundreds of millions of time series	QUANTITY	0.9+
two cultures	QUANTITY	0.89+
Influx	OTHER	0.88+
every five years	QUANTITY	0.87+
InfluxDB	ORGANIZATION	0.84+
Nicola	ORGANIZATION	0.81+
Db2	TITLE	0.76+
theCUBE	ORGANIZATION	0.76+
Rappi	ORGANIZATION	0.76+
a billion times	QUANTITY	0.76+
a ton of work	QUANTITY	0.72+
apple	ORGANIZATION	0.69+
Tudor	ORGANIZATION	0.69+
Kafka	TITLE	0.69+
four sweet spots	QUANTITY	0.65+
years	QUANTITY	0.59+

Brian Gilmore, InfluxData

>>Okay. Now we're joined by Brian Gilmore, director of IOT and emerging technologies at InfluxData. Welcome to the show. >>Thank you, John. Great to be here. >>We just spent some time with Evan going through the company and the value proposition with InfluxDB. What's the momentum? Where do you see this coming from? What's the value coming out of this? >>Well, I think we're sort of hitting a point where the technology, like, the adoption of it is becoming mainstream. We're seeing it in all sorts of organizations, everybody from the most well-funded, sort of advanced big technology companies to the smaller academics, the startups. And the managing of that sort of data that emits from that technology is time series, and us being able to give them a platform, a tool that's super easy to use, easy to start, and then of course will grow with them, has been key to us sort of, you know, riding along with them as they're successful. >>Evan was mentioning that time series has been on everyone's radar, and that's been in the OT business for years. Now, you go back to 2013, '14, even like five years ago, that convergence of physical and digital coming together, IP-enabled edge. Edge has always been kind of hyped up, but why now? Why is the edge so hot right now from an adoption standpoint? Is it because it's just evolution, the tech getting better? >>I think it's twofold. I think that, you know, I would think for some people, everybody was so focused on cloud over the last probably 10 years that they forgot about the compute that was available at the edge. And I think, you know, those, especially in the OT and on the factory floor, who weren't able to take full advantage of cloud through their applications, you know, still needed to be able to leverage that compute at the edge.
I think the big thing that we're seeing now, which is interesting, is that there's, like, a hybrid nature to all of these applications, where there is definitely some data that's generated on the edge. There's definitely some data that's generated in the cloud. And it's the ability for a developer to sort of, like, tie those two systems together and work with that data in a very unified, uniform way that's giving them the opportunity to build solutions that, you know, really deliver value to whatever it is they're trying to do, whether it's, you know, the outer reaches of outer space or whether it's optimizing the factory floor. >>Yeah. I think one of the things you also mentioned, genome too, big data is coming to the real world. And I think IOT has been kind of like this thing for OT and some use cases, but now with the cloud, all companies have an edge strategy now. So what's the secret sauce? Because now this is a hot product for the whole world, and not just industrial, but all businesses. What's the secret sauce? >>Well, I mean, I think part of it is just that the technology is becoming more capable, and that's especially on the hardware side, right? I mean, compute is getting smaller and smaller and smaller. And we find that supporting all the way down to the edge, even to the microcontroller layer, with, you know, our client libraries, and then working hard to make our applications, especially the database, as small as possible, so that it can be located as close to the point of origin of that data in the edge as possible, is fantastic. Now you can take that. You can run that locally. You can do your local decision making.
You can use InfluxDB as sort of an input to automation, control, the autonomy that people are trying to drive at the edge. But when you link it up with everything that's in the cloud, that's when you get all of the sort of cloud-scale capabilities of parallelized AI and machine learning and all of that. >>So what's interesting is the open source success has been something that we've talked about a lot on theCUBE, about how people are leveraging that. You guys have users in the enterprise, users in the IoT market, mm-hmm <affirmative>, but you got developers now. Yeah. Kind of together, you brought that up. How do you see that emerging? How do developers engage? What are some of the things you're seeing that developers are really getting into with InfluxDB? >>Yeah. Well, I mean, I think there are the developers who are building companies, right? I mean, these are the startups and the folks that we love to work with who are building new, you know, new services, new products, things like that. And, you know, especially on the consumer side of IoT, there's a lot of that, just those developers. But I think you gotta pay attention to those enterprise developers as well, right? There are tons of people with the title of engineer in your regular enterprise organizations. And they're there for systems integration. They're there for, you know, looking at what they would build versus what they would buy. And a lot of them come from, you know, a strong open source background, and they know the communities, they know the top platforms in those spaces, and, you know, they're excited to be able to adopt and use, you know, to optimize inside the business as compared to just building a brand new one. >>You know, it's interesting too, when Evan and I were talking about open source versus closed OT systems, mm-hmm <affirmative>. So how do you support the backwards compatibility of older systems while maintaining openness? There are dozens of data formats out there.
A bunch of standards, protocols, new things are emerging, and everyone wants to have a control plane. Everyone wants to leverage the value of data. How do you guys keep track of it all? What do you guys support? >>Yeah, well, I mean, I think either through direct connection, like we have a product called Telegraf, it's unbelievable. It's open source, it's an edge agent. You can run it as close to the edge as you'd like. It speaks dozens of different protocols in its own right, a couple of which, MQTT and OPC UA, are very, very, um, applicable to these IoT use cases. But then we also, because we are sort of not only open source, but open in terms of our ability to collect data, we have a lot of partners who have built really great integrations from their own middleware into InfluxDB. These are companies like Kepware and HighByte who are really experts in those downstream industrial protocols. I mean, that's a business not everybody wants to be in. It requires some very specialized, very hard work and a lot of support, um, you know, and so by making those connections and building those ecosystems, we get the best of both worlds. The customers can use the platforms they need up to the point where they would be putting data into our database. >>What are some of the customer testimonials that they share with you? Can you share some anecdotal, kind of like, wow, that's the best thing I've ever used, that's really changed my business, or this is great tech that's helped me in these other areas? What are some of the, um, sound bites you hear from customers when they're successful? >>Yeah. I mean, I think it ranges. You've got customers who are, you know, just finally being able to do the monitoring of assets, you know, sort of at the edge, in the field. We have a customer who has these tunnel boring machines that go deep into the earth to, like, drill tunnels for, you know, cars and, you know, trains and things like that.
You know, they are just excited to be able to stick a database onto those tunnel boring machines, send them into the depths of the earth, and know that when they come out, all of that telemetry at a very high frequency has been, like, safely stored. And then it can just very quickly and instantly connect up to their, you know, centralized database. So just having that visibility is brand new to them. And that's super important. On the other hand, you have customers who are way beyond the monitoring use case, where they're actually using the historical records in the time series database to, um, like I think Evan mentioned, forecast things. So for predictive maintenance, being able to pull in the telemetry from the machines, but then also all of that external enrichment data, the metadata, the temperatures, the pressures, who was operating the machine, those types of things, and being able to easily integrate with platforms like Jupyter notebooks, yeah, or, you know, all of those scientific computing and machine learning libraries to be able to build the models, train the models, and then they can send that information back down to InfluxDB to apply it and detect those anomalies, which are... >>I think that's gonna be an area. I personally think that's a hot area, because I think if you look at AI right now, yeah, it's all about training the machine learning algorithms after the fact. So time series becomes hugely important. Yeah. 'Cause now you're thinking, okay, the data matters post time. Yeah. For sure. And then it gets updated the new time. Yeah. So it's like constant data cleansing, data iteration, data programming. We're starting to see this new use case emerge in the data field. Yep. >>Yeah. I mean, I think... >>You agree? >>Yeah, of course. Yeah.
The ability to sort of handle those pipelines of data smartly, um, intelligently, and then to be able to do all of the things you need to do with that data in stream, um, before it hits your sort of central repository. And we make that really easy for customers. Like Telegraf, not only does it have the inputs to connect up to all of those protocols and the ability to capture and connect up to the partner data, but also it has a whole bunch of capabilities around being able to process that data, enrich it, reformat it, route it, do whatever you need. So at that point you're basically able to, you're pipelining your data in exactly the way you would wanna do it. You're routing it to different, you know, destinations, and it's not something that really has been in the realm of possibility until this point. Yeah. >>Yeah. And when Evan was on, it's great. He's a CEO, so he sees the big picture with customers. He kind of put the package together that said, hey, we got a system, we got customers, people are wanting to leverage our product. He's selling too as well. So you have that whole CEO perspective, but he brought up this notion that there's multiple personas involved in kind of the InfluxDB system: architects, you got developers and users. Can you talk about that reality as customers start to commercialize and operationalize this? From a commercial standpoint, you got a relationship to the cloud. Yep. The edge is there. Yep. The edge is getting super important, but cloud brings a lot of scale to the table. So what is the relationship to the cloud? Can you share your thoughts on edge and its relationship to the cloud? >>Yeah. I mean, I think edge, you know, you can think of it really as like the local information, right? So it's generally compartmentalized to a point of, like, you know, a single asset or a single factory line, whatever.
Um, but what people do, they wanna be able to make the decisions there at the edge, locally, um, quickly, minus the latency of sort of taking that large volume of data, shipping it to the cloud, and doing something with it there. So we allow them to do exactly that. Then what they can do is they can actually downsample that data, or they can, you know, detect the really important metrics or the anomalies. And then they can ship that to a central database in the cloud where they can do all sorts of really interesting things with it. Like, you can get that centralized view of all of your global assets. You can start to compare asset to asset, and then you can do things like we talked about, where you can do predictive types of analytics or, you know, larger-scale anomaly detection. >>So in this model you have a lot of commercial operations, industrial equipment. Yep. The physical plant, physical business, with virtual data, cloud, all coming together. What's the future for InfluxDB from a tech standpoint? 'Cause you got open. Yep. There's an ecosystem there. Yep. You have customers who want operational reliability for sure. I mean, so you got organic <laugh>. >>Yeah. Yeah. I mean, I think, you know, again, we got iPhones when everybody's waiting for flying cars. Right. So I don't know that we can, like, absolutely perfectly predict what's coming, but I think there are some givens, and I think those givens are gonna be that the world is only gonna become more hybrid. Right. And then, you know, so we are going to have much more widely distributed, you know, situations where you have data being generated in the cloud, you have data being generated at the edge, and then there's gonna be data generated sort of at all points in between, like physical locations as well as things that are very virtual. And I think, you know, we're building some technology right now.
That's going to allow, um, the concept of a database to be much more fluid and flexible, sort of more aligned with what a file would be like. And so being able to move data to the compute for analysis or move the compute to the data for analysis, those are the types of solutions that we'll be bringing to the customers sort of over the next little bit. Um, but I also think we have to start thinking about, like, what happens when the edge is actually off the planet, right? I mean, we've got customers, you're gonna talk to two of them, uh, in the panel, who are actually working with data that comes from, like, outside the earth. Like, you know, either in low earth orbit or, you know, sort of on the other side of the universe. And to be able to process data like that, and to do so in a way, we gotta build the fundamentals for that right now on the factory floor and in the mines and in the tunnels, um, so that we'll be ready for that one. >>I think you bring up a good point there, because one of the things that's common in the industry right now, people are talking about, this is kind of new thinking, is hyperscale's always been built up by full-stack developers. Even the old OT world that Evan was pointing out, they built everything. Right. And the world's going into more assembly, with core competency and IP, intellectual property, being the core of their app. So faster assembly and building, <affirmative>, but also integration. You got all this new stuff happening. Yeah. And that's to separate out the data complexity from the app. Yes. So space, genome. Yep. Driving cars throws off massive data. >>It does. >>So is Tesla, and is the car the same as the data layer? >>I mean, yeah. It's certainly a point of origin.
I think the thing that we wanna do is we wanna let the developers work on the world-changing problems, the things that they're trying to solve, whether it's, you know, energy or, you know, any of the other health or, you know, other challenges that these teams are building against. And we'll worry about that time series data and the underlying data platform so that they don't have to. Right. I mean, I think you talked about it, uh, you know, for them just to be able to adopt the platform quickly, integrate it with their data sources and the other pieces of their applications. It's going to allow them to bring much faster time to market on these products. It's gonna allow them to be more iterative. They're gonna be able to do more sort of testing and things like that. And ultimately it'll accelerate the adoption and the creation of technology. >>You mentioned earlier in our talk about unification of data. Yeah. How about APIs? 'Cause developers love APIs in the cloud, unifying APIs. How do you view that? >>Yeah, I mean, we are APIs, that's the product itself. Like, people like to think of it as sort of having this nice front end, but the front end is built on our public APIs. Um, you know, and it allows the developer to build all of those hooks for not only data creation, but then data processing, data analytics, and then, you know, sort of data extraction to bring it to other platforms or other applications, microservices, whatever it might be. So, I mean, it is a world of APIs right now, and, you know, we bring a very useful set of them for managing the time series data these guys are all challenged with. >>It's interesting. You and I were talking before we came on camera about how, um, the data field's gonna have this kind of SRE role that DevOps had, site reliability engineers, which managed a bunch of... there's so much data out there now. Yeah. >>Yeah. It's like raining data for sure.
And I think that ability, like, one of the best jobs on the planet is gonna be to be able to sort of be that data wrangler, to be able to understand what the data sources are, what the data formats are, how to be able to efficiently move that data from point A to point B, and, you know, to process it correctly so that the end users of that data aren't doing any of that sort of hard upfront preparation, collection, storage work. >>That's data as code. I mean, data engineering, it is becoming a new discipline for sure. And the democratization is the benefit. Yeah. To everyone. Data science gets easier. I mean, data scientists, they wanna make it easy. Right. <laugh> Yeah. They wanna do the analysis, right? >>Yeah. I mean, I think, you know, it's a really good point. I think we try to give our users as many ways as possible to get data in and get data out. We sort of think about it as meeting them where they are. Right. So we have the client libraries that allow them to just port to us, you know, directly from the applications and the languages that they're writing, but then they can also pull it out. And at that point nobody's gonna know the users, the end consumers of that data, better than those people who are building those applications. And so they're building these user interfaces, which are making all of that data accessible for, you know, their end users inside their organization. >>Well, Brian, great segment, great insight. Thanks for sharing all the complexities in IoT that you guys help take away with APIs and assembly and all the system architectures that are changing. Edge is real, cloud is real, absolutely mainstream enterprises. And you got developer traction too. So congratulations. >>Yeah. It's great. >>Well, thank you. Any last word you wanna share? >>
No, just, I mean, please, you know, if you're gonna check out InfluxDB, download it, try out the open source, contribute if you can. That's a huge thing. It's part of being in the open source community. Um, you know, but definitely just use it. I think once people use it, they try it out, they'll understand very, very quickly. >>Awesome. Open source with developers, enterprise and edge coming together. >>All together. You're gonna hear more about that in the next segment, too. >>Thanks for coming on. >>Okay. Thanks. >>When we return, Dave Vellante will lead a panel on edge and data with InfluxDB. You're watching theCUBE, the leader in high-tech enterprise coverage.
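
The edge pattern Brian describes in this segment (downsample locally, pick out the anomalies, then ship only that to a central database) can be sketched roughly as follows. This is a minimal illustration in plain Python, not InfluxDB or Telegraf code: the function names, the 60-second window, and the z-score threshold are all hypothetical choices made for the example, and a real deployment would more likely rely on the platform's own tasks and client libraries.

```python
from statistics import mean, pstdev


def downsample(points, window_s):
    """Average (timestamp_s, value) points into fixed windows of window_s seconds."""
    buckets = {}
    for ts, value in points:
        window_start = ts - (ts % window_s)  # align each point to its window start
        buckets.setdefault(window_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def find_anomalies(points, z_threshold=3.0):
    """Return points more than z_threshold standard deviations from the mean.

    A simple z-score rule, assumed here only for illustration.
    """
    values = [value for _, value in points]
    if len(values) < 2:
        return []  # not enough data to say anything stands out
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all readings identical
    return [(ts, v) for ts, v in points if abs(v - mu) / sigma > z_threshold]


def prepare_upload(points, window_s=60):
    """Bundle what a hypothetical edge node would forward to the cloud."""
    return {
        "summary": downsample(points, window_s),  # low-fidelity view
        "anomalies": find_anomalies(points),      # raw points worth keeping
    }
```

The full-resolution readings stay on the edge device for local decisions, while the much smaller summary and the handful of anomalous points are what travel to the central database.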

Published Date : Apr 19 2022


ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Evan	PERSON	0.99+
Dave Lon	PERSON	0.99+
John	PERSON	0.99+
Brian	PERSON	0.99+
two systems	QUANTITY	0.99+
two	QUANTITY	0.99+
dozens	QUANTITY	0.99+
iPhones	COMMERCIAL_ITEM	0.99+
Tesla	ORGANIZATION	0.99+
apple	ORGANIZATION	0.99+
one	QUANTITY	0.97+
both worlds	QUANTITY	0.96+
five years ago	DATE	0.96+
earth	LOCATION	0.95+
IOT	ORGANIZATION	0.94+
two training	QUANTITY	0.94+
Telegraph	ORGANIZATION	0.9+
single	QUANTITY	0.9+
InfluxData	ORGANIZATION	0.89+
single asset	QUANTITY	0.87+
Jupyter	ORGANIZATION	0.84+
One	QUANTITY	0.82+
dozens of data formats	QUANTITY	0.8+
influx	ORGANIZATION	0.79+
DevOps	ORGANIZATION	0.72+
10 years	QUANTITY	0.68+
tons of people	QUANTITY	0.66+
T	OTHER	0.63+
different	QUANTITY	0.59+
them	QUANTITY	0.57+
20 13	DATE	0.55+
twofold	QUANTITY	0.54+
14	DATE	0.38+

Brian Mullen & Arwa Kaddoura, InfluxData | AWS re:Invent 2021

(upbeat music) >> Everybody welcome back to theCUBE's continuous coverage of AWS re:Invent 2021. This is the biggest hybrid event of the year, theCUBE's ninth year covering AWS re:Invent. My name is Dave Vellante. Arwa Kaddoura is here, CUBE alumni, chief revenue officer now of InfluxData, and Brian Mullen, who's the chief marketing officer. Folks, good to see you. >> Thanks for having us. >> Dave: All right, great to see you face to face. >> It's great to meet you in person finally. >> So Brian, tell us about InfluxData. People might not be familiar with the company. >> Sure, yes. InfluxData, we're the company behind a pretty well-known project called InfluxDB. And we're a platform for handling time series data. And so what time series data is, we think of it as any data that's stamped in time in some way. That could be every second, every two minutes, every five minutes, every nanosecond, whatever it might be. And typically that data comes from, you know, of course, sources, and the sources, you know, they could be things in the physical world like devices and sensors, you know, temperature gauges, batteries. Also things in the virtual world, you know, software that you're building and running in the cloud, you know, containers, microservices, virtual machines. So all of these, whether in the physical world or the virtual world, are kind of generating a lot of time series data, and our platforms are designed specifically to handle that. >> Yeah so, lots to unpack here. Arwa, I mean, I've kind of followed you since we met virtually. Kind of followed your career, and I know when you choose to come to a company, you start with the customer, that's your... Those are your peeps. >> Arwa: Absolutely. >> So what was it that drew you to InfluxData? What were the customers telling you? >> Yeah, I think what I saw happening from a marketplace is a few paradigm shifts, right? And the first paradigm shift is obviously what the cloud is enabling, right?
So everything that we used to take for granted, when, you know, Andreessen Horowitz said, "software was eating the world", right? And then we moved into apps are eating the world. And now you look at the cloud infrastructure that, you know, folks like AWS have empowered, they've allowed services like ours and databases and sort of querying capabilities like InfluxDB to basically run at a scale that we never would have been able to do just sort of with, you know, a host-it-yourself type of situation. And then the other thing that it's enabled is, again, if you go back to sort of database history, relational, right, was humongous, totally transformed what we could do in terms of transactional systems. Then you moved into sort of the big data, the Hadoops, the search, right, the Elastic. And now what we're seeing is time series is becoming the new paradigm that's enabling a whole set of new use cases that have never been enabled before, right? So people that are generating these large volumes of data, like Brian talked about, and needing a platform that can ingest millions of points per second, and then the ability to query that in real time in order to take that action and in order to power things like ML and things like sort of, you know, autonomous type capabilities, now need this type of capability. So that's all to know. >> Okay so, it's the real-timeness, right? It's the use cases. Maybe you could talk a little bit more about those use cases and-- >> Sure, sure. So, yeah, so we have kind of been thinking about things as both the kind of virtual world, where people are pulling data off of sources that are in infrastructure, software infrastructure. We have a number, like PayPal is a customer of ours, and Apple. They pull time series data from the infrastructure that runs their payments platform. So you can imagine the volume that they're dealing with.
Think about how much data you might have in like a regular relational scenario, now multiply that, every piece of data, times however often you're looking at it. Every one second, every 10 minutes, whatever it might be. You're talking about an order of magnitude larger volume, higher volume of data. And so the tools that people were using were just not really equipped to handle that kind of volume, which is unique to time series. So we have customers like PayPal in kind of the software infrastructure side. We also have quite a bit of activity among customers on the IoT side. So Tesla is a customer, they're pulling telematics and battery data off of the vehicle, pulling that back into their cloud platform. Nest is also our customer. So we're pretty used to seeing, you know, connected thermostats in homes. Think of all the data that's coming from those individual units, and it's all time series data and they're pulling it into their platform using Influx. >> So, that's interesting. So Tesla, take that example, they will maybe persist some of the data, maybe not all of it. It's ephemeral, and they end up putting some of it back to the cloud, probably a small portion percentage-wise, but it's a huge amount of data, right? >> Brian: Yeah. >> So, they might want to track some anomalies, okay, capture every time an animal runs across, you know, and put that back into the cloud. So where do you guys fit in that analysis, and what makes you sort of the best platform for time series data? >> Yeah, it's interesting you say that because it is ephemeral, and there are really two parts of it. This is one of the reasons that time series is such a challenge to handle with something that's not really designed to handle it.
In a moment, in that minute, in the last hour, you really want to see all the data, you want all of what's happening, and have full context for what's going on and see these fluctuations, but then maybe a day later, a week later, you may not care about that level of fidelity. And so you downsample it, you have kind of more of a summarized view of what happened in that moment. So being able to kind of toggle between high fidelity and low fidelity, it's a super hard problem to solve. And so our platform InfluxDB really allows you to do that. >> So-- >> And that is different from relational databases, which are great at ingesting, but not great at kicking data out. >> Right. >> And I think what you're pointing to is, in order to optimize these platforms, you have to ingest and get rid of data as quickly as you can. And that is not something that a traditional database can do. >> So, who do you sell to? Who's your ideal customer profile? I mean, pretty diverse. >> Yeah, so it tends to focus on builders, right? And builders is now obviously a much wider audience, right? We used to say developers, right, highly technical folks that are building applications. And part of what we love about InfluxData is we're not necessarily trying to only make it for the most sophisticated builders, right? We are trying to allow you to build an application with the minimum amount of code and the greatest amount of integrations, right. So we really power you to do more with less and get rid of unnecessary code or, you know, give you that simplicity. Because for us, it's all about speed to market. You want an application, you have an idea of what it is that you're trying to measure or monitor or instrument, right? We give you the tools, we give you the integrations. We allow you to work in the IDE that you prefer. We just launched VS Code integration, for example. And that then allows these technical audiences that are solving really hard problems, right?
With today's technologies to really take our product to market very quickly. >> So, I want to follow up on that. So I like the term builder. AWS has kind of popularized that term, but there's sort of two vectors of that. There's the hardcore developers, but there's also increasingly domain experts that are building data products, and then more generalists. And I think you're saying you serve both of those, but you do integrations that maybe make it easier for the latter. And of course, if the former wants to go crazy they can. Is that a right understanding? >> Yes, absolutely. It is about accessibility and meeting developers where they are. For example, you probably still need a solid technical foundation to use a product like ours, but increasingly we're also investing in education, in videos and templates. Again, integrations that make it easier for people to maybe just bring a visualization layer that they themselves don't have to build. So it is about accessibility, but yes, obviously with builders a technical foundation is pretty important. But, you know, right now we're at almost 500,000 active instances of InfluxDB sort of being out there in the wild. So that to me shows that it's a pretty wide variety of audiences that are using us. >> So, you're obviously part of the AWS ecosystem. Help us understand that partnership. They announced today Serverless for Kinesis. Like, what does that mean to you? Do you complement that, is that competitive? Maybe you can address that. >> Yeah, so we're a long-time partner of AWS. We've been in the partner network for several years now. And we think about it now in a couple of ways. First, it's an important channel, go-to-market channel for us with our customers. So as you know, AWS is an ecosystem unto itself, and so many developers, many of these builders, are building their applications for their own end users on AWS, in that ecosystem.
And so it's important for us to, number one, have an offering that allows them to put Influx on that bill, so we're offered in the marketplace. You can sign up for and purchase and pay for InfluxDB Cloud via AWS Marketplace. And then as Arwa mentioned, we have a number of integrations with all the kind of adjacent products and services from Amazon that many of our developers are using. And so when we think about kind of, quote unquote, meeting developers where they are, that's an important part of it. If you're an AWS-focused developer, then we want to give you not only an easy way to pay for and use our product but also an easy way to integrate it into all the other things that you're using. >> And I think it was 2012, it might've even been '11, on theCUBE, Jerry Chen of Greylock. We were asking him, you think AWS is going to move up the stack and develop applications? He said, no, I don't think so. I think they're going to enable developers and builders to do that and then they'll compete with the traditional SaaS vendors. And that's proved to be true, at least thus far. You never say never with AWS. But then recently he wrote a piece called "Castles in the Cloud." And the premise was essentially the ISVs will build on top of clouds. And that seems to be what you're doing with InfluxDB. Maybe you could tell us a little bit more about that. We call it superclouds. >> Arwa: That's right. >> You know, leveraging the 100 billion dollars a year that the hyperscalers spend to develop an abstraction layer that solves a particular problem, but maybe you could describe what that is from your perspective, InfluxDB. >> Yeah, well, increasingly, we grew up originally as an open source software company. >> Dave: Yeah, right. >> People downloaded InfluxDB, ran it locally on a laptop, put it up on the server.
And, you know, that's our kind of origin as a company, but increasingly what we recognize is our customers, our developers, were building in and on the cloud. And so it was really important for us to kind of meet them there. And so we think about, first of all, offering a product that is easily consumed in the cloud and really just allows them to essentially hit an endpoint. So with InfluxDB Cloud, they really don't have to worry about any of that kind of deployment and operation of a cluster or anything like that. Really, from a usage perspective, they just pay for three things. The first is data in, how much data are you putting in? Second is query count. How many queries are you making against it? And then third is storage. How much data do you have and how long are you storing it? And really, it's a pretty simple proposition for the developer to kind of see and understand what their costs are going to be as they grow their workload. >> So it's a managed service, is that right? >> Brian: It is a managed service. >> Okay, and how do you guys price? Is it kind of usage based? >> Totally usage based, yeah. Again, data ingestion, we've got the query count and the storage that Brian talked about. But to your point, back to sort of what the hyperscalers are doing in terms of creating this global infrastructure that can easily be tapped into, we then extend above that, right? We effectively become a platform-as-a-service builder tool. Many of our customers actually use InfluxData to then power their own products, which they then commercialize into a SaaS application. Right, we've got customers that are doing, you know, Kubernetes monitoring or DevOps monitoring solutions, right? That monitor, you know, people's infrastructure or web applications or any of those things. We've got people building us into, you know, Industrial IoT such as PTC's ThingWorx, right? Where they've developed their own platform >> Dave: Very cool.
>> Completely backed by our time series database, right. Rather than them having to build everything, we become that key ingredient. And then of course the fully cloud-managed service means that they can go to market that much quicker. Nobody's procuring servers, nobody is managing, you know, security patches, any of that. It's all fully done for you. And it scales up beautifully, which is the key. And some of our customers also want to scale up or down, right. They know when their peak hours or peak times are, and they need something that can handle that load. >> So looking ahead to next year, so anyway, I'm glad AWS decided to do re:Invent live. (Arwa mumbling) >> You know, that's weird, right? We thought in June, at Mobile World Congress, it was going to be the gateway to returning, but who knows? It's like two steps forward, one step back. One step forward, two steps back, but we're at least moving in the right direction. So what about for you guys, InfluxData? Looking ahead for the coming year, Brian, what can we expect? You know, give us a little view, a sharp view of (mumbles) >> Well, kind of keeping in the theme of meeting developers where they are, we want to build out more in the Amazon ecosystem. So more integrations, more kind of ease of use for adjacent products. Another is just availability. We're now actually on three clouds. In addition to AWS, we're on Azure and Google Cloud, but we're now expanding horizontally and showing up so we can meet our customers that are working in Europe, and expanding into Asia-Pacific, which we did earlier this year. And so I think we'll continue to expand the platform globally to bring it closer to where our customers are. >> Arwa: Can I? >> All right, go ahead, please. >> And I would say also the hybrid capabilities probably will also be important, right? Some of our customers run certain workloads locally and then other workloads in the cloud.
That ability to have that seamless experience regardless, I think, is another really critical advancement that we're continuing to invest in. So that as far as the customer is concerned, it's just an API endpoint and it doesn't matter where they're deploying. >> So where do they go, can they download a freebie version? Give us the last word. >> They go to influxdata.com. We do have a free account that anyone can sign up for. It's, again, fully cloud hosted and managed. It's a great place to get started. Just learn more about our capabilities, and if you're here at AWS re:Invent, we'd love to see you as well. >> Check it out. All right, guys, thanks for coming on theCUBE. >> Thank you. >> Dave: Great to see you. >> All right, thank you. >> Awesome. >> All right, and thank you for watching. Keep it right there. This is Dave Vellante for theCUBE's coverage of AWS re:Invent 2021. You're watching the leader in high-tech coverage. (upbeat music)
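
The three usage dimensions named in the interview (data in, query count, and storage over time) can be sketched as a simple bill calculation. This is only an illustration of the pricing model as described: the function name and every rate below are hypothetical placeholders, not InfluxData's actual prices or API.

```python
# Hypothetical per-unit rates: placeholders only, not InfluxData's real prices.
RATE_PER_MB_IN = 0.002        # data in, per MB written
RATE_PER_100_QUERIES = 0.01   # query count, per 100 queries
RATE_PER_GB_HOUR = 0.002      # storage, per GB retained per hour


def monthly_bill(mb_written: float, queries: int, gb_stored: float, hours: float) -> float:
    """Sum the three usage dimensions: data in, query count, storage."""
    data_in_cost = mb_written * RATE_PER_MB_IN
    query_cost = (queries / 100) * RATE_PER_100_QUERIES
    storage_cost = gb_stored * hours * RATE_PER_GB_HOUR
    return round(data_in_cost + query_cost + storage_cost, 2)


if __name__ == "__main__":
    # 1,000 MB written, 10,000 queries, 5 GB kept for a 720-hour month.
    print(monthly_bill(1000, 10_000, 5, 720))
```

Because each term grows linearly with its own usage figure, a developer can estimate cost growth from workload growth alone, which is the "simple proposition" described above.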

Published Date : Nov 30 2021


ENTITIES

Entity	Category	Confidence
Brian Mullen	PERSON	0.99+
Brian	PERSON	0.99+
Arwa Kaddoura	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Europe	LOCATION	0.99+
Apple	ORGANIZATION	0.99+
PayPal	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Asia	LOCATION	0.99+
Tesla	ORGANIZATION	0.99+
Jerry Chen	PERSON	0.99+
Andreessen Horowitz	PERSON	0.99+
one	QUANTITY	0.99+
two parts	QUANTITY	0.99+
June	DATE	0.99+
2012	DATE	0.99+
two steps	QUANTITY	0.99+
Arwa	PERSON	0.99+
one step	QUANTITY	0.99+
next year	DATE	0.99+
third	QUANTITY	0.99+
Second	QUANTITY	0.99+
First	QUANTITY	0.99+
a week later	DATE	0.99+
first	QUANTITY	0.99+
both	QUANTITY	0.99+
three things	QUANTITY	0.98+
influxdata.com	OTHER	0.98+
Influx DB	TITLE	0.98+
Castles on the Cloud	TITLE	0.98+
One step	QUANTITY	0.98+
a day later	DATE	0.97+
today	DATE	0.97+
CUBE	ORGANIZATION	0.96+
11	QUANTITY	0.95+
InfluxData	ORGANIZATION	0.94+
two vectors	QUANTITY	0.94+
Greylock	ORGANIZATION	0.94+
ThingWorx	ORGANIZATION	0.94+
ninth year	QUANTITY	0.92+
VS Code	TITLE	0.92+
every five minutes	QUANTITY	0.92+
earlier this year	DATE	0.92+
500,000	QUANTITY	0.91+
every nanosecond	QUANTITY	0.9+
first paradigm	QUANTITY	0.9+
every two minutes	QUANTITY	0.9+
three clouds	QUANTITY	0.89+
PTC	ORGANIZATION	0.89+
every 10 minutes	QUANTITY	0.88+
Mobile World Congress	EVENT	0.86+
100 billion dollars a year	QUANTITY	0.86+
Azure	TITLE	0.83+
Influx	TITLE	0.82+
theCUBE	ORGANIZATION	0.82+
DevOps	TITLE	0.81+
coming year	DATE	0.81+

Ven Savage, Morgan School District | Next Level Network Experience

>> From around the globe, it's theCUBE, with digital coverage of the Next Level Network Experience event, brought to you by Infoblox. >> Okay, welcome back, everyone. This is theCUBE's coverage of the Next Level Networking Experience virtual event with Infoblox. I'm John Furrier, your host of theCUBE. We're here in our Palo Alto, Calif. studios as part of our remote access during COVID, getting the interviews and the stories and sharing that with you. We've got a great guest here, Ven Savage, the network operations manager at Morgan School District in Utah, a customer of Infoblox, to share a story. Ven, thanks for coming on. >> Thanks for having me. >> First of all, the Red Sox hat, A-plus interview. I would say right now this is gonna go great. Go Sox. Wish baseball was in season. Great to have you on. >> We'll get there. We'll get there. >> My Yankee fans say when I say that. But anyway, miss baseball, you know. But that brings up COVID-19: baseball season, sports, life has been impacted. Your district, like many school districts around the world, was told to shut down, send workers home. That meant sending kids home, too. So we've got the educators, the administration, and you've got the kids all going home. >> Yeah. >> What did you do to keep things going? Because things didn't stop. They had to do the remote learning and new things were emerging. New patterns, new traffic, new kinds of experiences. What did you learn? What's going on? >> Well, first we tried to lock the doors and pretend we weren't there, but they found us. Really, I mean, real quickly, in our school district we're not a one-to-one operation, so that caused a big change for us. We had to quickly adapt. And we chose to use Chromebooks because that's what we have for the students to use in their classes. So getting that squared away and sent out to the families was a big challenge. But then on top of that, being a school district, we then had to decide.
Okay, how do we protect and provide the filtering that the students are gonna need even though they're at home? So there's some relative safety there when they're online and accessing their email and things like that. So those were probably our two biggest hurdles: ramping up the devices, and then making sure the network access, from a filtering and consistency standpoint, was going to work. >> You know, I've got to ask you, because I see this kind of disruption. You don't read about this in the IT manual around disaster recovery and, you know, disruption to operations. But essentially, the whole thing changes, but you've still got to connect to the network, DNS. You've got to get access to the content. You've got content, you've got systems. You've got security, all to be managed while in flight, dealing with connection points that are remote. So you've got the disruption and the craziness of that, and then you've got this big IoT experiment, basically, edge of the network, all over the place. On one hand, you kind of geek out and say, wow, this is really kind of a challenge, an opportunity to solve the problem. At the same time, you know, what do you do? So take us through that, because that's a challenge, locking down the security in a borderless environment. People are everywhere. The students' business has to get done. You've got to resolve to the resources. >> So thankfully, we had migrated to Infoblox several years ago. And just this last, I would say, October, I finally got us on the cloud, the BloxOne Threat Defense cloud portion of it, too. So from a security standpoint, we already had a really good foundation in place from both the DNS aspect and the DNS security aspect. So that was, to be honest, for most users, a seamless transition.
In many regards, most users didn't even realize they were being pushed through Infoblox's cloud DNS server, which was providing security and filtering. So that was a big plus for us because it was fewer man-hours we had to spend troubleshooting people's DNS resolutions, why sites, you know, maybe weren't being filtered correctly. All that was, to be honest, perfect, where other platforms we had previously were just a nightmare to manage. >> Like, for example, the old way versus the new way here. Was it files, configuration? Take us through that. >> You know, it was a separate product, a content filter that works in conjunction with the firewall. And I'm not going to name the company's name, I don't want to, you know. But it seemed with that product we were spending, on average, about 3 to 4 hours a day fixing false positives just from a filtering aspect, because it would interfere with the DNS. And it didn't really do it. I mean, how it filters is not based on DNS, totally, right? So by migrating to Infoblox, our DNS and the filtering, the security, is all handled at the DNS level. And it was just much more, frankly, honestly, much more invisible to the end user. >> So more efficient. You decouple filtering from DNS resolution. Got it. All right, this is the big topic I've been talking about with Infoblox people on this program, in this event: how this new DDI layer, DNS, DHCP, and IP address management, kind of all together, is super important. It's critical infrastructure. Borderless enterprise, you're a borderless institution. Same thing, you go to school as a customer. How does the DDI layer, this foundational security, play for delivering this next-level experience? What's your take on that? >> Well, for a school platform, we use it in a number of ways.
Besides, I mean, the filtering is huge, but, for example, one of the components is response policy zones, or DNS firewall, as they call it, and that allows you to manage traditional DNS names and IP addresses. You can manage those by creating essentially a zone that is like a whitelist, a blacklist, or a rewrite. So you've got a lot of control, and again, it's filtering at the DNS level, so it's looking based on DNS queries and responses. The other aspect of that is the feeds that you receive from Infoblox. So by subscribing to those, we have access to a lot of information that Infoblox and their partners have created, identifying, you know, bad actors, malware, attack vectors, based on, again, DNS traffic, if you will, and so that takes a load off us, not having to worry about trying to do all that on our own. I mean, we've seen a lot of attacks minimized because of the feeds themselves. So that again frees us up. We're a very small school district. In some regards, I am the only network person in the district, and there's, like, a total of four of us that manage, you know, kind of the support aspect. And so, not having to spend as much time researching or tracking down, you know, breaches and attacks, because of the DNS security, frees me up to do other things, you know, in the more standard networking realm, from a design and implementation standpoint. >> Great. Thanks for sharing that. I want to ask about security. It's a very competitive space, security, and everyone's promising different things with different security products. I've got to ask you, why did you guys decide to use Infoblox and what's the reason behind it? >> Well, to be frankly honest, I'm actually an Infoblox trainer and I've been training for 15 years, so I kind of had an agenda when I first took this job to help out the school district.
I've been working in networking for over 20 years. And in my experience, Infoblox is one of the easiest and best-managed DNS solutions that I've come across. So, you know, I might be a little biased, but I'm okay with that. And so I pushed us, to be honest, to get there, and then the security aspect has all evolved from that. To me it just makes sense. Why not wrap as many things as you can together? And so, you know, when you're talking about attacks, over 90% of attacks use DNS. So if I have a solution that is already providing my DNS and then wraps the security into it, it just makes the most sense for me. >> Yeah. I mean, go back to Infoblox's DNA. You've got Cricket Liu; Stuart Bailey, the founder. They didn't just wake up one day and decide to do a startup. These are practitioners from the early days of the Internet. They know DNS cold, and DNS has evolved. I mean, when you get into the DNS hacks, then you realize, okay, let's build an abstraction layer. You've seen Internet navigation, discovery, all the stuff that's been proven. It is critical infrastructure. >> Well, and to be honest, it's one of those services that you can't filter at the firewall, right? You have to have it. It's that foundation layer. And so it makes sense that attackers are leveraging it, because the firewall has to let it through, in and out. And so it's almost a natural path for them to break in. So having something that speaks native DNS as part of your security platform makes more sense, because it can understand and see those attacks, the more sophisticated they become, as well. >> So I've got to ask you, since you're very familiar with Infoblox and you're actually deploying it, it's a great solution. But I've got this new DDI layer, which is an abstraction, and that's always a great evolution. Take away complexity and add more functionality.
Cloud, certainly; cloud native is everywhere. So what is the update? If I'm watching this, you know, I've been running DNS and I know it's out there. It's been running everything. And I've got to update the foundation of my business. I've got to make my DNS rock solid. What's the new update? What's Infoblox doing now? I know they've got DNS chops, we've seen that. What's new about Infoblox? What do you say? >> Well, you know, they have a couple of things that they've been trying to modify over the last several years, in my opinion, making DNS more like, you know, software as a service, a service-on-demand type of approach. So you have the cloud components, where you can take a lot of the heavy lifting maybe off of your network team's shoulders. Because I think people will be surprised how many customers out there have teams that are managing the DNS and even the DHCP aspect where that's not really what their experience is, and they don't have a true background, maybe, in DNS, and so having something that can help make that easier helps. It's almost, you know, I hate to maybe use this term, it almost sounds like it's too simple, but it's almost like a plug-and-play approach for some environments. You know, you're able to pop that in, and a lot of the problems they've probably been dealing with, without realizing what the root cause was, will be fixed. So that's always a huge component with Infoblox. But their security is really what's come about in the last several years. And as a school district, you know, besides securing traffic, which every customer has to do, we have a lot of laws and regulations around filtering with students and teachers.
So anyone that's using a campus-owned device. And so for us, I don't think people realize the maturity of the filtering aspect of BloxOne Threat Defense now. It's really evolved over the last couple of years. It's become a really, really good product and, like I said earlier, it just works seamlessly with the DNS security. So it is going to be using >> And SD-WAN, unpack everything. You go regular, root-level DNS, is it? So I've got to ask you: how is Infoblox helping you keep network services running and systems secure? >> Well, I think we're more on just the DNS. It does our DNS and DHCP. So from that standpoint, you know, in the almost five years we've been running that aspect, we have had very few, maybe one or two, incidents of problems from a DNS and DHCP standpoint, so our users are able to connect. You know, when they turn on their computer, to them, the Internet's up. There are no bumps in the road stopping them from being able to connect. So that's a huge thing. You know, you don't have to deal with those constant issues, which again, for a small team, just takes time away from the big projects you're trying to do. And then being able to now combine things, a security and filtering solution, that alone has probably saved us, you know, upwards of 500 man-hours in the last eight months. Where normally we would be spending those hours, again, troubleshooting issues, false positives, things like that. And for a small team, that just sucks the life out of you when you always have to spend time on that. >> I mean, you're always chasing your tail, almost. You want to be productive. Automation plays a key role in that, right? >> Yeah. >> So I've got to ask you, you know, just a general question. I'm curious. You know, one of the things I see is the sprawl of devices.
WiFi was a great example: put an access point up, a rogue access point, you know, as you get more connections. DHCP was amazing about this, it's awesome. But also, you had a DHCP problem. The key management is not just around slinging more DHCP around. So the trend is more connections, more IPs. How does Infoblox make that easier? Because for people who may not know, DNS, DHCP, and IP address management, they're all kind of tied together, right? So this is the magic of DDI in my head. I want to get your thoughts on how you see that evolving. >> Yeah, I think that's another, kind of back to that, it's almost like a plug and play for a lot of customer environments. You know, you're getting DHCP, DNS, and IPAM all wrapped in one. You have this product that speaks all those languages, if you will, along with some of the reporting services and things of that nature. When I look for, like, a MAC address in my Infoblox database, I'm not just going to get a MAC address and what the IP address is. I'm not just going to get the DNS, like the host name, maybe, you know, the fully qualified domain name. I have the ability to bring in all this information that the client is communicating with the DHCP and DNS server, on top of things like metadata that you can configure in the database to help really color in the picture of your network. So when you're looking at what device is using this IP, when we talk about rogue devices or things like that, I can get so much more information out of Infoblox, almost to the point where you're able to nail down the location of the device, even if it's a wireless client, because it works in conjunction with some of our wireless access points, too.
So within, you know, a matter of minutes we have almost all the information we would need to take whatever action is appropriate for something like that, which used to take us hours and hours to troubleshoot. >> I appreciate that. In a lot of the other interviews I've done with the Infoblox folks, one of the things that came out of them is the trail. You can see the trail. They've got to get in somewhere. DNS is the footprints there. That's the traffic, and that's been helping on potential attacks, DDoS, for example. No one knows what that is, but DNS is, like you said, a lot of the surface area. DNS is where the hackers are; it makes it easier to find things. >> Well, you know, by integrating with the cloud, the cloud-based BloxOne, it added advanced DNS security, which helps protect against DDoS, as well as anycast to help provide more availability, because I'm pushing my DNS traffic through those cloud servers. It's like I'm almost the equivalent of a very large organization that would normally spend millions and millions of dollars trying to do this on their own. So I'm getting the benefits and kind of the equivalent from that cloud hybrid approach that normally we would never have the resources for. >> Well, Ven, I really appreciate you taking the time out of your busy day to remote into theCUBE studios to talk about the Next Level Networking Experience. So I want to just ask you, just put your experience hat on. You've seen some waves. You've seen the technology evolve. When you hear next-level networking and when you hear next-level networking experience, those are almost two separate meanings. Next-level networking means next level; next-level networking experience means there's some experience behind it. What do those two phrases mean to you, next-level networking and next-level networking experience?
>> Well, to me, I always look at it as the evolution of being able to have a user experience that's consistent no matter where you're located, whether you're at home or in your office, especially in today's environment. We have to be able to provide that consistent experience. But what I think a lot of people may not think about, or may overlook if you're just, you know, more of an end user, is that along with that experience, there has to be a consistent access-and-security approach. So if I'm an end user, I should be able to have the access and the security, which, you know, means filtering and all that fun stuff, to not just allow me the connectivity, but to keep me secure wherever I'm at. And I think schools, you know, obviously with COVID and the one-to-one that everyone was forced to do. But I think businesses, and generally, I think that's, you know, years ago, when I worked with Cisco, we talked about, you know, the remote user, the mobile user, and how Cisco is kind of leading the way on that. And I think, you know, with the nature of things like this pandemic, being able to have your users again have that consistent experience, no matter where they're at, is going to be key. And so when I think of the network evolution, I think that's how it has to go. >> Well, we appreciate your time sharing your insights. A lot of people are learning that you've got to pour the concrete to build the building. DNS is becoming kind of critical infrastructure. But final question for you, while I've got you here, you know? How are you doing? Actually, schools look like they're going to have either fully virtual for the next semester or some sort of time or set schedule. There are all kinds of different approaches. At the end of the day, it's still this big IoT experiment from a traffic standpoint. So new expectations create new solutions. What do you see on the horizon?
What challenges do you see as you ride this wave? Because you've got to hold down the fort there, a school district of 3,000 students. And you've got the administration and the faculty. So, you know, what are you expecting? And what do you hope to see evolve? Or what do you want to stay away from? What's your opinion? >> I think my biggest concern is, you know, making sure our students and staff don't, you know, run into trouble, and I say that more from, you know, being exposed to attacks. Their data, which ultimately comes back to our data as a district. But, you know, the student data, I think, you know, with anything, kids are very vulnerable targets for many reasons. You know, they're quick to use technology, quick to use social media, things like that. But they're probably the first ones for whom security does not, you know, cross their mind. So I think my big concern is, as we're moving to this, you know, hybrid approach where kids can be in school or they're going to be at home, maybe they'll change by the days of the week, it'll fluctuate, keeping them secure, you know, protecting them from themselves, maybe, in a way. If I have to be the guy who's kind of the grumpy old dad, I'm okay with wearing that hat. I think that's our biggest concern, providing that type of stability and security. So parents at the end of the day can, you know, have more peace of mind that their kids, you know, are online even more. >> It's great that you can bring that experience, because, you know, with new environments, whether it's Zoom or trying to get the different software tools that are out there that were built for on-premises, you now have potentially a click here, click there. They could be a target. So, you know, being safe and getting the job done to make sure they have uptime.
So the remote access, again. You've got a new edge now, right? So the edge of the network is the home. Exactly. Yeah. Your service area just got bigger. >> Yeah. Yeah. You know, I'm everybody's guest, whether they like it or not. >> I appreciate that. Appreciate your time and good luck. And let's stay in touch. Thanks for your time. >> Hey, thanks for having me. You guys have a good rest of your week and day, too. Stay safe. >> Thank you very much. It's theCUBE's coverage with Infoblox for a special Next Level Networking Experience pop-up event. I'm John Furrier for theCUBE, your host. Thanks for watching.
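
The response policy zone idea described in the interview (a zone that acts like an allow list, block list, or rewrite, consulted at DNS resolution time) can be sketched in a few lines. This is a minimal illustration of the concept only: it is not Infoblox's API, and the policy table, domain names, and function name are all hypothetical.

```python
# Hypothetical policy entries; the domain names are placeholders.
POLICY = {
    "malware.example": "block",
    "ads.example": "block",
    "search.example": "rewrite:safe.search.example",
    "school.example": "allow",
}


def policy_for(qname: str) -> str:
    """Return the action for a queried name, checking the name and then each parent domain."""
    labels = qname.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in POLICY:
            return POLICY[candidate]
    return "allow"  # no policy entry matched: resolve normally


if __name__ == "__main__":
    for name in ("cdn.malware.example.", "search.example", "unlisted.test"):
        print(name, "->", policy_for(name))
```

Because the decision is made on the queried name itself, the same policy applies wherever the device is, which matches the point made above about filtering students at home.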

Published Date : Jul 27 2020


ENTITIES

Entity	Category	Confidence
Red Sox	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
15 years	QUANTITY	0.99+
John Furrow	PERSON	0.99+
October	DATE	0.99+
Liu Stuart Bailey	PERSON	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
One	QUANTITY	0.99+
3000 students	QUANTITY	0.99+
two phrases	QUANTITY	0.99+
John	PERSON	0.99+
Delta	ORGANIZATION	0.99+
Mac	COMMERCIAL_ITEM	0.99+
five years	QUANTITY	0.99+
four	QUANTITY	0.99+
Adidas	ORGANIZATION	0.99+
Utah	LOCATION	0.99+
over 20 years	QUANTITY	0.98+
both users	QUANTITY	0.98+
over 90%	QUANTITY	0.98+
two incidents	QUANTITY	0.98+
today	DATE	0.98+
first	QUANTITY	0.98+
both	QUANTITY	0.97+
twice	QUANTITY	0.97+
1	QUANTITY	0.97+
Cube	ORGANIZATION	0.96+
several years ago	DATE	0.96+
Day two	QUANTITY	0.96+
D DOS	TITLE	0.95+
First	QUANTITY	0.95+
four blocks	QUANTITY	0.94+
Yankee	ORGANIZATION	0.94+
4 hours a day	QUANTITY	0.94+
about 3	QUANTITY	0.93+
zero	QUANTITY	0.91+
pandemic	EVENT	0.91+
500 man	QUANTITY	0.91+
this month	DATE	0.9+
One threat	QUANTITY	0.89+
years ago	DATE	0.89+
Palo Alto, Calif. Studios	LOCATION	0.88+
Ven Savage	PERSON	0.86+
millions of millions of dollars	QUANTITY	0.86+
DD I Layer	OTHER	0.85+
two separate meetings	QUANTITY	0.85+
one day	QUANTITY	0.84+
first ones	QUANTITY	0.83+
last couple of years	DATE	0.83+
next semester	DATE	0.82+
Go Sox	ORGANIZATION	0.82+
last eight months	DATE	0.82+
19 baseball season sports	QUANTITY	0.81+
Morgan School District	ORGANIZATION	0.72+
last	DATE	0.69+
baseball	TITLE	0.68+
School District	ORGANIZATION	0.66+
years	DATE	0.65+
Indians	PERSON	0.58+
couple	QUANTITY	0.55+
info	ORGANIZATION	0.5+
Morgan	LOCATION	0.48+
influx	ORGANIZATION	0.43+
Covic	EVENT	0.43+
Cube	COMMERCIAL_ITEM	0.35+

Evan Kaplan, InfluxData | CUBEConversation, Sept 2018

(intense orchestral music) >> Hey, welcome back everybody, Jeff Frick here with theCUBE. We are taking a short break from the madness of the conference season to do some CUBE Conversations here in the Palo Alto studio, which we always like to do, and meet new people, hear new stories, and learn about new companies. And today we've got a new company, we've never had 'em on theCUBE before. It's Evan Kaplan, he's the CEO of InfluxData. Evan, great to see you. >> Yeah, hey, thanks for having me. >> Absolutely. So for people that aren't familiar with the company, give 'em kind of the 101 on Influx. >> Yeah, so InfluxData is an opensource platform for collecting metrics and events at scale. The company is almost four years old, has a large selection of tier-one customers, and is broadly accepted by developers as the number one time-series platform out there, so. >> So a lot of people talk about collecting data. We've been doing Splunk since 2012, and they really found something interesting in log files and took it to a whole 'nother level, so there's a lot of people that are capturing events. So what do you guys do that's a little bit different? How are you slicing and dicing this opportunity? >> Yeah, to put this even in the broader context, what we're looking at is the 20-year break-up of the Oracle, DB2, and Informix franchise that dominated, when relational databases were the answer to all problems. And so if you look at a company like Splunk working on logs, they optimized a platform for those logs, for that data set. Elastic also, really interesting space. I think our innovation has been in saying, "Hey, where the world's going, where all of these complex systems are going, particularly IoT, is to a real-time view of the data." And so, rather than collect verbose logs, historical views of the data and things like that, real system operators, real developers and builders want to instrument their applications, their infrastructure, so you can view 'em in real time.
The place where the rubber hits the road is IoT. Sensors spit out metrics and events, period, full stop. And so if you want to be performant in how you handle your instrumentation of the physical world, and how you do your machine learning, and how you want to manage these systems, you use a fundamentally time-series based database. As opposed to Splunk or Elastic, which are primarily search-based databases. >> And are you primarily capturing and standardizing the data to feed other analytics tools, or do you have the whole suite, where you're doing some of the analytics as well? >> Yeah, such a great question. So, the fundamental platform is called the TICK Stack, and it stands for Telegraf which is a collector, which has about 200 different collectors that sit out there in the world and collect everything from SNMP data, to Oracle data, to application, to micro-service data, to Kubernetes, to that sort of stuff. There's Influx, which is the DB, which is highly optimized for millions and millions of writes a second, so collecting data points and samples. There's Chronograf which is the visualization engine and so, it allows you, as soon as the data comes in, to see how it's graphed, see it on time-series oriented graphing, and then there's Kapacitor which takes action on the data. What we don't do is the super high sophisticated analytics. There are lots of companies in Silicon Valley who take our data, pump it up, and then we put it back on the platform to build a control loop for it. >> Right. So with Kapacitor, does your application then take action on those things? >> Yes. Yeah, so, it'd do everything from alerting, to sending out another machine request, to spinning up a new Kubernetes pod, to basically scaling the application, self healing. >> Right. So does it fit in between a lot of those other types of applications that are sending off notifications, and those types of things? >> Yes, yeah. >> So you're in between? 
>> And usually, the way we're instrumented is, a standard developer, or an architect or CTO, they look at a complex application, or a complex set of sensors, they instrument with Influx and Telegraf, and collect that data, they view it in real time, and then they build control loops, automation loops, to make that easier so when you see a problem, it's got a tolerance you can self adjust for. So it's the beginning of kind of the self-healing system. >> Okay, and I know that Telegraf is definitely opensource, are the other three? >> All four are open-source. Everything, in our world, everything for a developer is free, so, and a single node of Influx can handle a couple million writes a second, which is really really performant to run in production. Where our business model is, where we make money is, our closed source clustering, sharding, distributing the database, if you decide you want to run highly available in the production environment, you would buy our closed-source stuff. We have about 430 customers who run our closed source stuff on top of the opensource. >> So, it is kind of like a MapR to Hadoop if you will, where, you know, it's built on the opensource, and then they've got their proprietary stuff kind of wrapped around it, almost like an open core? Or is that a? >> Yeah, it's a little different than the normal Hadoop stuff. One is, our stuff doesn't have any external dependencies. It can work with other third party projects, but just, it's a platform unto itself, there aren't 25 projects. There are four different projects, we own them all, they come across as a single binary, and it's not part of Apache. >> So they're integrated. So the TICK is the full TICK. >> Yes, and then you put the clustering on top. So there's some similarity, but not being part of Apache, we can control and keep clean what that experience is. 
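(Editor's sketch, not part of the conversation.) The control loop described above, where you collect samples, compare them against a tolerance, and adjust automatically, could look roughly like the following. This is a minimal illustration in plain Python, not Kapacitor's actual API; the names Point and decide, the thresholds, and the action labels are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Point:
    """One time-stamped sample of a metric (hypothetical shape)."""
    timestamp: float  # seconds since the epoch
    value: float      # e.g. CPU utilisation between 0.0 and 1.0


def decide(points, window_s, now, high, low):
    """Return 'scale_up', 'scale_down', or 'hold' from recent samples.

    Only samples no older than `window_s` seconds are considered, and
    their average is compared against the tolerance band [low, high].
    """
    recent = [p.value for p in points if now - p.timestamp <= window_s]
    if not recent:
        return "hold"  # no data in the window: take no action
    avg = mean(recent)
    if avg > high:
        return "scale_up"
    if avg < low:
        return "scale_down"
    return "hold"


samples = [Point(100.0, 0.91), Point(105.0, 0.95), Point(110.0, 0.88)]
print(decide(samples, window_s=30.0, now=112.0, high=0.85, low=0.30))
```

In a real deployment the returned action would trigger something like an alert or a request to an orchestrator; here it is only a string so the decision logic stays visible.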
And the thing that's been most successful for us is, well Paul, our founder, who is my partner, calls it "time to awesome," the idea that a developer in 10 minutes can very quickly be up and instrumenting an application or a set of sensors, and see that data pouring in within 10 minutes from going to the site and downloading the opensource. >> So it's interesting, the giant opportunity is really around IoT, just in terms of the explosion of the sensor data, and we see that coming, and we were at the AT&T show a couple weeks ago, talking about 5G which is, slowly, slowly coming down the road, (Evan laughs) they've got the standards fixed. But in terms of the, you said the shorter term, nobody has budget, I always like to joke, nobody has budget for a new platform, they do have budget for new applications, because they've got real problems. So you said you're seeing, your main success now, your go to market application, is around application monitoring? Would that be accurate, or what is kind of your? >> Yeah, there are two broad things, and they're both very similar technology as a service. One is the sensor monitoring stuff so, Tesla's Powerwall, Siemens' windmills, a variety of solar companies build Telegraf into their platforms and then use InfluxData to collect and store that information and analyze it. On the software side, people like IBM's Cloud Service running their network and their fabric, SAP with Ariba, Cisco with all their collaboration stuff, they instrument their software applications. And that's the idea, it's a general purpose platform for collecting and instrumenting the applications or the sensors, either one, or both. >> Okay, and so what are you guys working on now, what's next, kind of raise the profile, get some new stuff? >> Yeah, so we are-- before the whole IoT thing completely explodes, we're not quite there yet but it's coming down the pike. 
>> But we're starting to see it really happen, so that's really exciting for us. And this is just a really, really big market, it's certainly a superset of the log market, it should be. As you think about just the instrumentation of the physical world, how much instrumentation is going on, your clothes, your cars, your homes, your industrial devices, my watch, how much sensor data there is. We think this is a tremendously large market, so we're doing a couple of things. One is, we're about to introduce a new language for querying these kinds of time-series data that's going to be opensource, that a bunch of other people can use with their data stores. We're rolling out a new API-driven service, so that people can store these things directly in the cloud natively, so all they have to do is know our API. So we're really trying to push from the technology limit. We're a product-driven company, and an opensource-driven company, so we're trying to push that, that community is super important to us. >> It's so wild to me, the opportunity to have a closed feedback loop between someone's product back to the barn, you're barely starting to see it, Tesla obviously, is a good example, they're slowly seeing it in other places. But what a fundamental change in manufacturing, from building a product, making some assumptions about use, shipping that product to your distribution, and then, maybe you get some feedback now and then, versus actually monitoring the way that that thing is actually used by your end user, whether it's a product like a car, or even a software application, as you're rolling out all these different apps and features in the apps, how are people using it, are they using it? Where do you double down, where do you back off? And that loop has not really been >> That's pretty insightful. >> opened up very wide. 
>> Yeah, no it's just starting to open up, and that whole notion of product telemetry, my prediction is that, as development teams grow and things like that, you're going to have telemetry experts, people are going to be specializing. How do you instrument these products so you get maximum engagement, and usage, and things like that? So I think that's pretty insightful on your part. If you think about it from a systems point of view, right? Instrumentation is first. You can't do anything 'til you instrument, whether it's telemetry from a product, it's the engagement or this. So instrumentation is first, visibility in real time is second. So observability is the big thought in systems application and building now, this notion of observing your system in real time, because you don't know, a priori, it's impossible to know a complex system, how it's going to behave, then it's automation, right? So like, okay now I can see these behaviors, how do I automate something that makes the experience for you, the user, better? But lastly, we can see this with self-driving cars, it's autonomy. It's the idea that the system becomes self-healing, and AI, and those sorts of things, but that's kind of the last step. There's a lot of learning in that process to get there. >> And it has to be automated because at scale there's no way for people to keep up with this stuff, and then how do you separate signal from noise and how do you know what to do? So you've got to automate a whole bunch of this. >> And you know if we had an aspiration it would be we're not going to write the applications that do these things but what we want to do is be that system of record so that people have a really efficient, effective metrics and events store so they can really track and keep track of all that engagement. Time-stamped data, for lack of a better way to say it. >> It sounds like you're in a pretty good space, Evan. >> Pretty excited (chuckles), thank you. 
Thanks for saying that, but yeah, we're pretty excited. >> Alright, well thanks for taking a few minutes out of your day and sharing the story, we look forward to watching the journey. >> Yeah. Thanks man. Alright, take care. >> Alright, thanks. He's Evan, I'm Jeff, you're watching theCUBE. We're having a CUBE Conversation in Palo Alto, we'll see you next time, thanks for watching. (intense orchestral music)
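
(Editor's sketch, not part of the conversation.) The "time-stamped data" mentioned above can be pictured as points made of a measurement name, tags, fields, and a timestamp. The snippet below formats one such point in the style of InfluxDB's text-based line protocol. It is a simplified illustration: the helper name to_line_protocol is hypothetical, and escaping of spaces, commas, equals signs, and quotes is deliberately omitted, so it assumes inputs that contain none of those characters.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as `measurement,tags fields timestamp`.

    Simplified on purpose: no escaping is performed, so names and
    values must not contain spaces, commas, equals signs, or quotes.
    """
    # Tags are optional key=value pairs, emitted here in sorted key order.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))

    field_items = []
    for key, value in fields.items():
        if isinstance(value, bool):      # check bool before int
            rendered = "true" if value else "false"
        elif isinstance(value, int):     # integers carry an "i" suffix
            rendered = f"{value}i"
        elif isinstance(value, float):
            rendered = repr(value)
        else:                            # anything else is sent as a string
            rendered = f'"{value}"'
        field_items.append(f"{key}={rendered}")

    return f"{measurement}{tag_part} {','.join(field_items)} {timestamp_ns}"


line = to_line_protocol(
    "cpu",
    {"region": "us-west", "host": "web01"},
    {"usage": 0.64, "cores": 4},
    1538092800000000000,
)
print(line)
```

The measurement, tag, and field names here ("cpu", "host", "usage", and so on) are made up for the example; a production client library would also handle escaping and batching.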

Published Date : Sep 28 2018

Search Results for Influx: