Evolving InfluxDB into the Smart Data Platform

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database, for many use cases, was a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken, you know, every hour, every minute, every second, you know, down to the millisecond, and how the world was moving toward real-time or near-real-time data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support real-time data in emerging use cases in IoT and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands and data, and specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and specific tools. And in this program you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which is powering a new engine for InfluxDB.
>>Now, these innovations, they evolve the idea of time series analysis by dramatically increasing the granularity of time series data by compressing the historical time slices, if you will, from, for example, minutes down to milliseconds. And at the same time, enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis-Georgiou, who is a developer advocate at InfluxData. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum. He's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks, Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB, you know, needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think, for us, it's been about staying ahead of the market. I think, you know, if we think about what our customers are coming to us with now, you know, related to requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, we're sort of staying ahead of the market as well and sort of anticipating those future needs. So, you know, this is really a transparent change for our customers.
I mean, I think we'll be adding new capabilities over time that sort of leverage this new engine, but, you know, initially the customers who are using us are gonna see just great improvements in performance, you know, especially those that are working at the top end of the workload scale, you know, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I mean, I think we had our open source, we had an enterprise product, you know, and sort of shifting that technology, especially the open source code base, to a service basis where we were hosting it through, you know, multiple cloud providers, that was a long journey. I guess, you know, phase one was, you know, we wanted to host Enterprise for our customers, so we sort of created a service where we just managed and ran our Enterprise product for them. You know, phase two of this cloud effort was to optimize for, like, multi-tenant, multi-cloud, to be able to host it in a truly, like, SaaS manner where we could use, you know, some type of customer activity or consumption as the pricing vector, you know. And that was sort of the birth of the real first InfluxDB Cloud, you know, which has been really successful.
>>We've seen, I think, like 60,000 people sign up, and we've got tons and tons of both enterprises as well as, like, new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis, you know, and having that sort of big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has, you know, pointed us in a really good direction in terms of making sure we're continuously improving that and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is and what does it take to make that shift from, you know, time series specialist to real-time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than like a shift or a pivot. You know, time series data is always gonna be fundamental and sort of the basis of the solutions that we offer our customers, and then also the ones that they're building on the sort of raw APIs of our platform themselves. You know, the time series market is one that we've worked diligently to lead. I mean, I think when it comes to, like, metrics, especially like sensor data and app and infrastructure metrics, if we're being honest though, I think our user base is well aware that the way we were architected was much more towards those sort of backwards-looking historical type analytics, which are key for troubleshooting and making sure you don't, you know, run into the same problem twice.
But, you know, we had to ask ourselves, like, what can we do to better handle those queries from a performance and a, you know, a time-to-response on the queries, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can, like, limit that window down to minutes and then seconds. >>And now with this new engine, we're really starting to talk about a query window that could be, like, returning results in, you know, milliseconds of time since it hit the ingest queue. And that's really getting to the point where, as your data is available, you can use it and you can query it, you can visualize it, and you can do all those sort of magical things with it, you know? And I think getting all of that to a place where we're saying, like, yes to the customer on, you know, all of the real-time queries, the multiple language query support, you know, it was hard, but we're now at a spot where we can start introducing that to, you know, a limited number of customers, strategic customers and strategic availability zones to start. But, you know, everybody over time. >>So you're basically going from what happened, and you can still do that obviously, to what's happening now, in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? I mean, like, in the moment right now, whether you're talking about, like, a millisecond ago or a minute ago, you know, that's pretty much right now, I think, for most people, especially in these use cases where you have other sort of components of latency induced by the underlying data collection, the architecture, the infrastructure, the, you know, the devices and, you know, the sort of highly distributed nature of all of this. So yeah, I mean, getting a customer or a user to be able to use the data as soon as it is available is what we're after here.
>>I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, you know, and that's one of the things that really triggered us to know that we were heading in the right direction, is just how many sort of operational customers we have. You know, everything from, like, aerospace and defense. We've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor, you know, and if we can satisfy their sort of demands for, like, real-time historical perspective, that's awesome. I think what we're gonna do here is we're gonna start to, like, edge into the real time that they're used to in terms of, you know, the millisecond response times that they expect of their control systems, certainly not their historians and databases. >>Is this available, these innovations, to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. You know, I think we want to emphasize that's for now. Our goal is to get our latest and greatest and our best to everybody over time, of course. You know, one of the things we had to do here was, like, we doubled down on sort of our commitment to open source and availability. So, like, anybody today can take a look at the libraries on our GitHub and, you know, can inspect it and even can try to, you know, implement or execute some of it themselves in their own infrastructure. You know, we're committed to bringing our sort of latest and greatest to our cloud customers first for a couple of reasons. Number one, you know, there are big workloads and they have high expectations of us.
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, like how the system itself is performing. >>And so just, you know, being careful, maybe a little cautious in terms of how big we go with this right away, just sort of both limits, you know, the risk of, you know, any issues that can come with new software rollouts. We haven't seen anything so far, but also it does give us the opportunity to have, like, meaningful conversations with a small group of users who are using the products, but once we get through that and they give us two thumbs up on it, it'll be like, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation and, you know, using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. You know, this is a new, very sort of popular systems language, you know, it's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to, like, deliver it in a way that is, you know, something we can inspect very closely, but then also rely on the fact that it's going to behave well. And if it does find error conditions, I mean, we've loved working with Go and, you know, a lot of our libraries will continue to be sort of implemented in Go, but, you know, when it came to this particular new engine, you know, that power, performance, and stability of Rust was critical. On top of that, like, we've also integrated Apache Arrow and Apache Parquet for persistence.
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our time series merge trees, this is a big break from that, you know, Arrow on the sort of in-memory side and then Parquet on the on-disk side. >>It allows us to present, you know, a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large, you know, historical sort of bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem sort of popping up around Parquet in terms of the machine learning community, you know, and getting that all to work, we had to glue it together with Arrow Flight. That's sort of what we're using as our RPC component. You know, it handles the orchestration and the transportation of the columnar data. Now we're moving to, like, a true columnar database model for this version of the engine, you know, and it removes a lot of overhead for us in terms of having to manage all that serialization, the deserialization, and, you know, to that again, like, blurring that line between real-time and historical data. It's, you know, it's highly optimized for both streaming micro-batch and then batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mentioned Rust. It's been around for a long time, but its popularity is, you know, really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody sort of watching just to, like, take a look at what we're offering in terms of early access and beta programs. I mean, if you wanna participate or if you wanna work sort of in terms of early access with the new engine, please reach out to the team.
I'm sure, you know, there's a lot of communications going out and, you know, it'll be highly featured on our website, you know, but reach out to the team. Believe it or not, like, we have a lot more going on than just the new engine. And so there are also other programs, things we're offering to customers in terms of the user interface, data collection and things like that. And, you know, if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. >>But if there's something new that you wanna try out, we'd just love to hear from you. And then, you know, our goal would be that as we give you access to all of these new cool features that, you know, you would give us continuous feedback on these products and services, not only, like, what you need today, but then what you'll need tomorrow to sort of build the next versions of your business. Because, you know, the whole database, the ecosystem as it expands out into, you know, this vertically oriented stack of cloud services and enterprise databases and edge databases, you know, it's gonna be what we all make it together, not just, you know, those of us who were employed by InfluxDB. And then finally I would just say please, like, watch Anais's and Tim's sessions. Like, these are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there's no better takes, like, honestly, on the sort of technical details of this than theirs, especially when it comes to, like, the value that these investments will bring to our customers and our communities. So I encourage you to, you know, pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks, Dave. It was awesome.
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time, really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data. Multiple layers of redundancy ensure you don't lose any data. Access controls ensure that only the people who should see your data can see it. >>And encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge, or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one.
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with theCUBE and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's, like, a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Some other really important parts: the ability to have bulk data export and import, super useful.
Also broader ecosystem compatibility. Where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even Pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. So while Rust is syntactically similar to C++ and it has similar performance, it also compiles to native code like C++. But unlike C++, it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its, like, innovative type system. Additionally, it doesn't allow for dangling pointers. And dangling pointers are the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust, like, helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well.
So essentially it just, like, has all the control, all the fine-grained control you need to take advantage of memory and all your resources as well as possible so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you know, you see things like, you know, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse, you know, impact relative to performance. So it looks like, you know, the community is modernizing the platform, but I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data. And so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to kind of illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values as well as maybe a measurement value, a timestamp value, maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have, like, two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each row, each column and group it together.
And so if that's the case and you're just taking temperature values from the room and a lot of those temperature values are the same, then you might be able to imagine how equal values will then neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna define, like, the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead so that we can understand better the benefits of column-oriented storage. >>So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is. And every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework. So that's where a lot of the advantages come from. >>Okay. So you basically described, like, a traditional database, a row approach, but I've seen, like, a lot of traditional databases say, okay, now we can handle columnar format. Versus what you're talking about is really, you know, kind of native. Is it not as effective? Is the format not as effective because it's largely a bolt-on? Can you, like, elucidate on that front?
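As a rough illustration of the room-and-stove example described above, here is a minimal Python sketch of the same idea. It is not how Arrow, Parquet, or IOx are implemented. The sample readings, the field names, and the helper functions `to_columns` and `run_length_encode` are all hypothetical, and run-length encoding stands in for the "cheap compression" mentioned in the discussion.

```python
from itertools import groupby

# Hypothetical sensor readings in row-oriented form: one tuple per timestamp.
FIELDS = ("time", "room", "room_temp", "stove_temp")
rows = [
    (1, "kitchen", 21.0, 180.0),
    (2, "kitchen", 21.0, 182.5),
    (3, "kitchen", 21.0, 185.0),
    (4, "kitchen", 21.5, 185.0),
]


def to_columns(rows, fields):
    """Pivot row tuples into one list per column (column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(fields)}


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, FIELDS)

# Equal values sit next to each other inside a column, so they compress cheaply.
room_temp_rle = run_length_encode(columns["room_temp"])

# A min/max query touches only the one column it needs, not every field of every row.
room_temp_range = (min(columns["room_temp"]), max(columns["room_temp"]))

print(room_temp_rle)
print(room_temp_range)
```

In the row-oriented `rows` list, answering the same min/max question means walking every tuple and picking out one field from each, which is the extra scanning described in the conversation.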
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. And so that's pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. Yeah. >>Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as its in-memory format. So the way that it helps in InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB IOx, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the query process and transformation of that data. It also has a Pandas API so that you could take advantage of Pandas DataFrames as well and all of the machine learning tools associated with Pandas. >>Okay. You're also leveraging Parquet in the platform. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >>Sure. So Parquet is the column-oriented durable file format. So it's important because it'll enable bulk import, bulk export. It has compatibility with Python and Pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as kind of a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support of additional SQL features, like support for timestamp arithmetic and support for EXISTS clauses and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think kind of the idea here is that if you can improve these upstream projects, and then the long-term strategy here is that the more you contribute and build those up, then the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and kind of summarize, you know, what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx, and I really encourage, if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it and all of the hard work, questions, and you just wanna learn more, then I would encourage you to go to the monthly tech talks and community office hours, and they are on every second Wednesday of the month at 8:30 AM Pacific time. There's also a community forum and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as ask any questions you have about IOx, what to expect and what you'd like to learn more about. I, as a developer advocate, wanna answer your questions.
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community, collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money. And also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language and, as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality because Influx Cloud is entirely managed. It probably saved us at least a full additional person on our team.
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on the Cube for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud has been built out on open source, as have mobile, social platforms, and key databases, and of course InfluxDB. And InfluxData has been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, you know, Influx really, we thrive at the intersection of commercial services and open source software. So OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services and templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants, and like you've mentioned, even better, we contribute a lot back to the projects that we use as well as our own product, InfluxDB. >>You know, but I gotta ask you, Tim, because one of the challenges that we've seen, in particular, you saw this in the heyday of Hadoop, is that the innovations come so fast and furious, and as a software company you gotta place bets, you gotta, you know, commit people, and sometimes those bets can be risky and not pay off. So how have you managed this challenge? >>Oh, it moves fast. 
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You know, you look at Kubernetes for example, that ecosystem is driven by thousands of intelligent developers, engineers, builders, and they're adding value every day. So we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been kind of off the charts and seen the most significant adoption and velocity, particularly, you know, along with cloud. But really Kubernetes is just, you know, still up and to the right consistently, even with, you know, the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere: at AWS, Azure, Google Cloud. 
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, you know, wherever. Is that correct? >>Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase, you know. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but that, I'll call it a PaaS layer, if I can use that term. Are there specific attributes to InfluxDB, or is it kind of just generally off-the-shelf PaaS? You know, is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build versus buy lens. Some things we want to leverage cloud provider services for, for instance, Postgres databases for metadata. Perhaps we'll get that off of our plate and let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can, as an SRE group, as an ops team, manage with very few people really, and we can stamp out clusters across multiple regions in no time. 
>>So sometimes you build, sometimes you buy. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is like everybody else will do: we're looking for trade-offs that make sense. You know, we really want to protect our customers' data. So we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team. And of course, for customers, you don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level, what does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored. So there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time. 
And it's that continuous deployment that allows us to continue testing things in flight and rolling things out that change: new features, better ways of doing deployments, safer ways of doing deployments. >>All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end we want you to focus on getting actual insights from your data instead of running infrastructure. You know, let us do that for you. So, >>And that makes sense, but are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially, people want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, then send the rest of that data off to Influx Cloud where the heavy processing takes place. 
So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. And, >>And so obviously you're taking away a lot of the heavy lifting for the infrastructure. Would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure and that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. You know, that's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit or at rest, is always kept secure and is only viewable by you. You know, you look at things like software bills of materials. If you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that, you know, as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, you know, historically, as you know, Tim, they would air gap everything. 
That's how they kept it safe. But that's not feasible anymore. Everything's >>That >>Connected now, right? And so you've gotta have a partner that is, again, taking away that heavy lifting to R&D so you can focus on some of the other activities. Right. Give us the last word and the key takeaways from your perspective. >>Well, you know, from my perspective I see it as a two lane approach with Influx, with any time series data. You know, you've got a lot of stuff that you're gonna run on-prem, with what you had mentioned, air gapping. Sure, there's plenty of need for that. But at the end of the day, for people that don't want to run big data centers, people that want to trust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching the Cube. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? >>Check out InfluxDB University, >>where you can find our entire catalog of free training that will help you make the most of your time series data. >>Get started for free at influxdbu.com. >>We'll see you in class. >>Okay, so we heard today from three experts on time series and data how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow, the Rust programming environment, DataFusion, and Parquet are being leveraged to support real-time data analytics at scale. 
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of real-time data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You could join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by the Cube, your leader in enterprise and emerging tech coverage.

Published Date : Nov 2 2022



Evolving InfluxDB into the Smart Data Platform Full Episode

>>This past May, The Cube in collaboration with Influx data shared with you the latest innovations in Time series databases. We talked at length about why a purpose built time series database for many use cases, was a superior alternative to general purpose databases trying to do the same thing. Now, you may, you may remember the time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how in theory, those time slices could be taken, you know, every hour, every minute, every second, you know, down to the millisecond and how the world was moving toward realtime or near realtime data analysis to support physical infrastructure like sensors and other devices and IOT equipment. A time series databases have had to evolve to efficiently support realtime data in emerging use cases in iot T and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving Influx DB into the smart Data platform, made possible by influx data and produced by the Cube. My name is Dave Valante and I'll be your host today. Now in this program we're going to dig pretty deep into what's happening with Time series data generally, and specifically how Influx DB is evolving to support new workloads and demands and data, and specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IOT and emerging technologies at Influx Data. And we're gonna talk about the continued evolution of Influx DB and the new capabilities enabled by open source generally and specific tools. And in this program you're gonna hear a lot about things like Rust, implementation of Apache Arrow, the use of par k and tooling such as data fusion, which powering a new engine for Influx db. 
>>Now, these innovations, they evolve the idea of time series analysis by dramatically increasing the granularity of time series data by compressing the historical time slices, if you will, from, for example, minutes down to milliseconds. And at the same time, enabling real time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anna East Dos Georgio, who is a developer advocate at In Flux Data. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the Influx DB platform. And then we're gonna close the program with Tim Yokum, he's the director of engineering at Influx Data, and he's gonna explain how the Influx DB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of i t and emerging Technology at Influx State of Bryan. Welcome to the program. Thanks for coming on. >>Thanks Dave. Great to be here. I appreciate the time. >>Hey, explain why Influx db, you know, needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think it's, for us, it's been about staying ahead of the market. I think, you know, if we think about what our customers are coming to us sort of with now, you know, related to requests like sql, you know, query support, things like that, we have to figure out a way to, to execute those for them in a way that will scale long term. And then we also, we wanna make sure we're innovating, we're sort of staying ahead of the market as well and sort of anticipating those future needs. So, you know, this is really a, a transparent change for our customers. 
I mean, I think we'll be adding new capabilities over time that sort of leverage this new engine, but you know, initially the customers who are using us are gonna see just great improvements in performance, you know, especially those that are working at the top end of the, of the workload scale, you know, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I mean, I think we had our open source, we had an enterprise product, you know, and, and sort of shifting that technology, especially the open source code base to a service basis where we were hosting it through, you know, multiple cloud providers. That was, that was, that was a long journey I guess, you know, phase one was, you know, we wanted to host enterprise for our customers, so we sort of created a service that we just managed and ran our enterprise product for them. You know, phase two of this cloud effort was to, to optimize for like multi-tenant, multi-cloud, be able to, to host it in a truly like sass manner where we could use, you know, some type of customer activity or consumption as the, the pricing vector, you know, And, and that was sort of the birth of the, of the real first influx DB cloud, you know, which has been really successful. 
>>We've seen, I think like 60,000 people sign up and we've got tons and tons of, of both enterprises as well as like new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using out on a, on a daily basis, you know, and having that sort of big pool of, of very diverse and very customers to chat with as they're using the product, as they're giving us feedback, et cetera, has has, you know, pointed us in a really good direction in terms of making sure we're continuously improving that and then also making these big leaps as we're doing with this, with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is and what, what does it take to make that shift from, you know, time series, you know, specialist to real time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than like a shift or a pivot. You know, time series data is always gonna be fundamental and sort of the basis of the solutions that we offer our customers, and then also the ones that they're building on the sort of raw APIs of our platform themselves. You know, the time series market is one that we've worked diligently to lead. I mean, I think when it comes to like metrics, especially like sensor data and app and infrastructure metrics, if we're being honest though, I think our, our user base is well aware that the way we were architected was much more towards those sort of like backwards looking historical type analytics, which are key for troubleshooting and making sure you don't, you know, run into the same problem twice. 
But, you know, we had to ask ourselves like, what can we do to like better handle those queries from a performance and a, and a, you know, a time to response on the queries, and can we get that to the point where the results sets are coming back so quickly from the time of query that we can like limit that window down to minutes and then seconds. >>And now with this new engine, we're really starting to talk about a query window that could be like returning results in, in, you know, milliseconds of time since it hit the, the, the ingest queue. And that's, that's really getting to the point where as your data is available, you can use it and you can query it, you can visualize it, and you can do all those sort of magical things with it, you know? And I think getting all of that to a place where we're saying like, yes to the customer on, you know, all of the, the real time queries, the, the multiple language query support, but, you know, it was hard, but we're now at a spot where we can start introducing that to, you know, a a limited number of customers, strategic customers and strategic availability zones to start. But you know, everybody over time. >>So you're basically going from what happened to in, you can still do that obviously, but to what's happening now in the moment? >>Yeah, yeah. I mean if you think about time, it's always sort of past, right? I mean, like in the moment right now, whether you're talking about like a millisecond ago or a minute ago, you know, that's, that's pretty much right now, I think for most people, especially in these use cases where you have other sort of components of latency induced by the, by the underlying data collection, the architecture, the infrastructure, the, you know, the, the devices and you know, the sort of highly distributed nature of all of this. So yeah, I mean, getting, getting a customer or a user to be able to use the data as soon as it is available is what we're after here. 
>>I always thought, you know, real, I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, it's, it's, I mean it is operationally or operational real time is different, you know, and that's one of the things that really triggered us to know that we were, we were heading in the right direction, is just how many sort of operational customers we have. You know, everything from like aerospace and defense. We've got companies monitoring satellites, we've got tons of industrial users, users using us as a processes storing on the plant floor, you know, and, and if we can satisfy their sort of demands for like real time historical perspective, that's awesome. I think what we're gonna do here is we're gonna start to like edge into the real time that they're used to in terms of, you know, the millisecond response times that they expect of their control systems, certainly not their, their historians and databases. >>I, is this available, these innovations to influx DB cloud customers only who can access this capability? >>Yeah. I mean commercially and today, yes. You know, I think we want to emphasize that's a, for now our goal is to get our latest and greatest and our best to everybody over time. Of course. You know, one of the things we had to do here was like we double down on sort of our, our commitment to open source and availability. So like anybody today can take a look at the, the libraries in on our GitHub and, you know, can ex inspect it and even can try to, you know, implement or execute some of it themselves in their own infrastructure. You know, we are, we're committed to bringing our sort of latest and greatest to our cloud customers first for a couple of reasons. Number one, you know, there are big workloads and they have high expectations of us. 
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, like how the system itself is performing. >>And so just, you know, being careful, maybe a little cautious in terms of, of, of how big we go with this right away, just sort of both limits, you know, the risk of, of, you know, any issues that can come with new software rollouts. We haven't seen anything so far, but also it does give us the opportunity to have like meaningful conversations with a small group of users who are using the products, but once we get through that and they give us two thumbs up on it, it'll be like, open the gates and let everybody in. It's gonna be exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation and, you know, using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What, what should we know there? >>Well, I mean, I think foundationally we built the, the new core on Rust. You know, this is a new very sort of popular systems language, you know, it's extremely efficient, but it's also built for speed and memory safety, which goes back to that us being able to like deliver it in a way that is, you know, something we can inspect very closely, but then also rely on the fact that it's going to behave well. And if it does find error conditions, I mean we, we've loved working with Go and, you know, a lot of our libraries will continue to, to be sort of implemented in Go, but you know, when it came to this particular new engine, you know, that power performance and stability rust was critical. On top of that, like, we've also integrated Apache Arrow and Apache Parque for persistence. 
I think for anybody who's really familiar with the nuts and bolts of our backend, our TSI and our Time-Structured Merge Trees, this is a big break from that: Arrow on the in-memory side and Parquet on the on-disk side. >>It allows us to present a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large historical bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem popping up around Parquet in the machine learning community. And to get that all to work, we had to glue it together with Arrow Flight. That's what we're using as our RPC component. It handles the orchestration and the transportation of the columnar data. Now we're moving to a true columnar database model for this version of the engine, and it removes a lot of overhead for us in terms of having to manage all that serialization and deserialization. And, again, on blurring that line between real-time and historical data: it's highly optimized for both streaming micro-batch and batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mentioned Rust. It's been around for a long time, but its popularity is really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody watching to take a look at what we're offering in terms of early access and beta programs. If you wanna participate or work with the new engine in early access, please reach out to the team.
I'm sure there's a lot of communications going out, and it'll be highly featured on our website, but reach out to the team. Believe it or not, we have a lot more going on than just the new engine. So there are also other programs, things we're offering to customers in terms of the user interface, data collection, and things like that. And if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. >>But if there's something new that you wanna try out, we'd just love to hear from you. And then our goal would be that, as we give you access to all of these new cool features, you would give us continuous feedback on these products and services: not only what you need today, but what you'll need tomorrow to build the next versions of your business. Because the whole database ecosystem, as it expands out into this vertically oriented stack of cloud services and enterprise databases and edge databases, is gonna be what we all make it together, not just those of us who are employed by InfluxDB. And then finally I would just say, please watch Anais's and Tim's sessions. These are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there are honestly no better takes on the technical details of this than theirs, especially when it comes to the value that these investments will bring to our customers and our communities. So I encourage you to pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks Dave. It was awesome.
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time, a really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems, and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data: multiple layers of redundancy ensure you don't lose any data, and access controls ensure that only the people who should see your data can see it. >>And encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge, or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one.
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with the Cube and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, which is super useful.
Also broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. Rust is syntactically similar to C++, it has similar performance, and it also compiles to native code like C++. But unlike C++, it also has much better memory safety. Memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks, and Rust achieves this memory safety due to its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are one of the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, protection against buffer overflows, and thread-safe async caching structures as well.
So essentially it just has all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you see things like, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow, and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data, and so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values, as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case and you're just taking temperature values from the room and a lot of those temperature values are the same, then you might be able to imagine how equal values will neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna find the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. >>So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp, and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented storage doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework, so that's where a lot of the advantages come from. >>Okay. So you basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format, versus what you're talking about, which is really kind of native. Is it not as effective? Is the former not as effective because it's largely a bolt-on? Can you elucidate on that front?
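Editor's aside: the room-and-stove comparison described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions. The sample readings, the `columns` dictionary, and the `run_length_encode` helper are all hypothetical, and run-length encoding stands in here for "cheap compression" in general; it is not presented as the specific encoding used by Arrow, Parquet, or IOx.

```python
from itertools import groupby

# Hypothetical readings: (timestamp, room tag, room_temp, stove_temp).
rows = [
    (1, "kitchen", 21.0, 180.0),
    (2, "kitchen", 21.0, 195.5),
    (3, "kitchen", 21.0, 210.0),
    (4, "kitchen", 21.5, 204.0),
]

# Row-oriented scan: every full row is visited just to pull out one field.
row_min = min(row[2] for row in rows)
row_max = max(row[2] for row in rows)

# Column-oriented layout: the values of each column are stored together.
columns = {
    "time": [r[0] for r in rows],
    "room": [r[1] for r in rows],
    "room_temp": [r[2] for r in rows],
    "stove_temp": [r[3] for r in rows],
}

# Columnar scan: only the one column of interest is read.
col_min = min(columns["room_temp"])
col_max = max(columns["room_temp"])


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


encoded_room_temp = run_length_encode(columns["room_temp"])
print(col_min, col_max)      # 21.0 21.5
print(encoded_room_temp)     # [(21.0, 3), (21.5, 1)]
```

Because the regulated room temperature rarely changes, its column collapses into two pairs, while the same min and max are found without touching the tags, timestamps, or stove readings.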
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. So those are pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >>Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework, and it uses Arrow as its in-memory format. So the way that it helps in InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB IOx, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the querying, processing, and transformation of that data. It also has a pandas API, so that you can take advantage of pandas DataFrames as well, and all of the machine learning tools associated with pandas. >>Okay. You're also leveraging Parquet in the platform. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >>Sure. So Parquet is the column-oriented durable file format. It's important because it'll enable bulk import and bulk export, and it has compatibility with Python and pandas, so it supports a broader ecosystem. Parquet files also take very little disk space, and they're faster to scan because, again, they're column oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So, Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here is that if you can improve these upstream projects, then the long-term strategy is that the more you contribute and build those up, the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You've got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and summarize what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. And if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, or you have questions and just wanna learn more, then I would encourage you to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There are also community forums and a community Slack channel: look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I wanna answer your questions.
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community: collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there, and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money. And also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language, and as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality because Influx Cloud is entirely managed. It probably saved us at least a full additional person on our team.
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on the Cube for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud has been built out on open source: mobile, social platforms, key databases. And of course InfluxDB and InfluxData have been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, you know, Influx really, we thrive at the intersection of commercial services and open source software. OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services and templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants and, like you've mentioned, even better, we contribute a lot back to the projects that we use as well as our own product, InfluxDB. >>You know, but I gotta ask you, Tim, because one of the challenges that we've seen, in particular in the heyday of Hadoop, is that the innovations come so fast and furious, and as a software company you gotta place bets, you gotta commit people, and sometimes those bets can be risky and not pay off. How have you managed this challenge? >>Oh, it moves fast.
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is fail fast and fail often. We try a lot of things. You look at Kubernetes, for example: that ecosystem is driven by thousands of intelligent developers, engineers, builders, and they're adding value every day. So we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1,500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly along with cloud. But really Kubernetes is still up and to the right consistently, even with the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere: at AWS, Azure, Google Cloud.
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code, so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, wherever. Is that correct? >>Yeah, so we've basically built more or less platform engineering. This is the new hot phrase, you know. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but I'll call it a PaaS layer, if I can use that term. Are there specific attributes to InfluxDB, or is it kind of just a generally off-the-shelf PaaS? Is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build-versus-buy lens. Some things we want to leverage cloud provider services for, for instance Postgres databases for metadata; perhaps we'll get that off of our plate and let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can manage as an SRE group, as an ops team, with very few people really, and we can stamp out clusters across multiple regions in no time.
>>So sometimes you build, sometimes you buy. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is what everybody else will do: we're looking for trade-offs that make sense. We really want to protect our customers' data, so we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team, and of course customers don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps, let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level. What does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored, so there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high-cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time.
And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change: new features, better ways of doing deployments, safer ways of doing deployments. >>All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end we want you to focus on getting actual insights from your data instead of running infrastructure. Let us do that for you. >>And that makes sense. But are the innovations that we're talking about in the evolution of InfluxDB a natural evolution for existing customers, or are they opening up new territory for customers? I'm sure the answer is both, but can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially: people want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, and send the rest of that data off to Influx Cloud, where the heavy processing takes place.
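Editor's aside: the edge pattern mentioned here, ingesting raw points locally and downsampling them before forwarding, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the InfluxDB or Flux implementation. The `downsample` function, the 60-second window, and the sample points are all hypothetical, and a mean per window is just one possible aggregate.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-width time windows."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each point to the start of the window it falls in.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    # One (window_start, mean) pair per window, in time order.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw readings from one production line: (seconds, temperature).
raw = [(0, 20.0), (10, 22.0), (59, 21.0), (60, 30.0), (90, 32.0)]
summary = downsample(raw, window_seconds=60)
print(summary)  # [(0, 21.0), (60, 31.0)]
```

Five raw points become two summary points, which is the kind of reduction that keeps the volume sent from the edge manageable.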
So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >>And so obviously you're taking away a lot of the heavy lifting for the infrastructure. Would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure, that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. That's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit or at rest, is always kept secure and is only viewable by you. You look at things like software bills of materials: if you're running this yourself, you have to go vet all sorts of different pieces of software, and we do that as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, historically, as you know, Tim, they would air gap everything.
That's how they kept it safe. But that's not feasible anymore. Everything's connected now, right? And so you've gotta have a partner that can, again, take away that heavy lifting of R&D so you can focus on some of the other activities. Give us the last word and the key takeaways from your perspective. >>Well, from my perspective, I see it as a two-lane approach with Influx, with any time series data. You've got a lot of stuff that you're gonna run on-prem. You had mentioned air gapping; sure, there's plenty of need for that. But at the end of the day, people that don't want to run big data centers, people that want to entrust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching the Cube. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? >>Check out InfluxDB University, >>where you can find our entire catalog of free training that will help you make the most of your time series data. >>Get started for free at influxdbu.com. >>We'll see you in class. >>Okay, so we heard today from three experts on time series and data how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow, the Rust programming environment, DataFusion, and Parquet are being leveraged to support real-time data analytics at scale.
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of real-time data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You can join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Oct 28 2022

ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Tim Yoakum	PERSON	0.99+
Brian	PERSON	0.99+
Dave	PERSON	0.99+
Tim Yokum	PERSON	0.99+
Dave Valante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Tim	PERSON	0.99+
Google	ORGANIZATION	0.99+
16 times	QUANTITY	0.99+
two rows	QUANTITY	0.99+
New York City	LOCATION	0.99+
60,000 people	QUANTITY	0.99+
Rust	TITLE	0.99+
Influx	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
today	DATE	0.99+
Influx Data	ORGANIZATION	0.99+
Python	TITLE	0.99+
three experts	QUANTITY	0.99+
InfluxDB	TITLE	0.99+
both	QUANTITY	0.99+
each row	QUANTITY	0.99+
two lane	QUANTITY	0.99+
Today	DATE	0.99+
Noble nine	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Flux	ORGANIZATION	0.99+
Influx DB	TITLE	0.99+
each column	QUANTITY	0.99+
270 terabytes	QUANTITY	0.99+
cube.net	OTHER	0.99+
twice	QUANTITY	0.99+
Bryan	PERSON	0.99+
Pandas	TITLE	0.99+
c plus plus	TITLE	0.99+
three years ago	DATE	0.99+
two	QUANTITY	0.99+
more than a decade	QUANTITY	0.98+
Apache	ORGANIZATION	0.98+
dozens	QUANTITY	0.98+
free@influxdbu.com	OTHER	0.98+
30,000 feet	QUANTITY	0.98+
Rust Foundation	ORGANIZATION	0.98+
two temperature values	QUANTITY	0.98+
In Flux Data	ORGANIZATION	0.98+
one time stamp	QUANTITY	0.98+
tomorrow	DATE	0.98+
Russ	PERSON	0.98+
IOT	ORGANIZATION	0.98+
Evolving InfluxDB	TITLE	0.98+
first	QUANTITY	0.97+
Influx data	ORGANIZATION	0.97+
one	QUANTITY	0.97+
first one	QUANTITY	0.97+
Influx DB University	ORGANIZATION	0.97+
SQL	TITLE	0.97+
The Cube	TITLE	0.96+
Influx DB Cloud	TITLE	0.96+
single server	QUANTITY	0.96+
Kubernetes	TITLE	0.96+

Itamar Ankorion, Qlik & Kosti Vasilakakis, AWS | AWS re:Invent 2021

>>Hello, and welcome back to theCUBE's continuous coverage of AWS re:Invent 2021. We're here live with real people, and we're pleased to bring you this hybrid event, the most important hybrid event of the year, to wrap up really 2021 and kick off next year. We're going to dig into the intersection of machine learning and business intelligence. Itamar Ankorion is here as the senior vice president of technology alliances at Qlik, and Kosti Vasilakakis is the head of product growth for low-code, no-code machine learning at AWS. Gentlemen, welcome to theCUBE. >>Thanks for having us. >>I think the first time you were on at re:Invent, definitely early last decade of >>my life. I >>had black hair, and it was maybe 2013, I want to say. So it's been quite a run. >>And it's definitely been a privilege. I had a chance to attend pretty much all the re:Invents from the first one, with much fewer people, and see this growth year over year. And what's amazing about it, beyond the scale, how much it has grown, the number of people, is that the pace of innovation keeps accelerating. It's just phenomenal. >>We're lucky that we chose data as sort of our business passion. But so, speaking of data, what are you hearing from customers about what they want to do with their data, bringing together business intelligence and machine learning? It's being injected in, but what are they telling you that they want, that they need? What's the opportunity that you're hearing now? 
>>So I think, first of all, this is a fascinating topic, because we're talking about the intersection of what everybody wants to do as the next frontier of data, with predictive data. Because descriptive analytics have been around for a long time, but how can you use predictive analytics and prescriptive analytics to enrich what we've had with descriptive analytics to, at the end of the day, improve the business? And what I love, talking to people around here and just listening to customers express their needs, is how they can get more value out of data. So they have the data. They don't use a lot of the data they aggregate, and they want to use it in more ways. And that's what's exciting, to discuss those new ways they want to bring it together. >>Kosti, anything you'd add to that from an AWS perspective? >>I'll tell you what we don't hear from our customers: we've stopped hearing, what is AI and machine learning? On the contrary, we are hearing, how can we make the teams that already use AI and ML a lot more productive and make a lot more of it? For example, how can they iterate a lot faster across the ML workflow? How can they train and build really large, state-of-the-art natural language processing models like GPT-3? How can we help customers build, train, and tune customer-specific models to be able to bring hyper-personalization to their products? And the other thing we're hearing is, how can we help the teams that are not tapping into AI and ML get the most out of it? In a way, how could you either democratize the building and development of machine learning models or, in another way, expose machine learning in applications that analytics users are already using? >>Yeah. 
So when we first met, success was measured in, yeah, I got the Hadoop cluster to work technically. But to your point, customers want to get more value out of that data now. And so they want to operationalize machine intelligence. Is that what active intelligence is? >>So active intelligence is something that we here at Qlik started to talk about, but we believe it really represents what customers are trying to achieve. And the reason we use the words active intelligence is, think about active as not being passive. So traditional BI kind of relied on pre-configured historical data sets, which were great for what they did, but today they're kind of out of gas in terms of supporting real-time decisioning and action. So what active intelligence is all about is really enabling customers to take informed action, not just an informed decision, informed action in the moment, when that action needs to happen. So in order to accommodate that, and again, this is really the difference between active and passive, active intelligence is all about innovations to bring real-time data. So it's not just historical data. >>I need real-time data that's relevant to what's happening now. I need a way to get an intelligent data pipeline, and I need this data pipeline to make that real-time data available in the form and the structure that allows me to make a decision or to take action. And finally, it's really designed to drive action, right? So whether it's a manual action or whether it's even completely automated, it's intelligent, it's informed. So that's what active intelligence is all about, and by the way, predictive data fits really well into that entire paradigm. Right. >>I mean, we've been talking for years about real-time, and it's like, okay, what is real time? Well, real time is before you lose the customer, before you lose the patient, before the machine explodes. Right? 
So your point about predictive. Yeah. Now, you guys made an announcement yesterday, AIDA, which stands for AI for data analytics. What's that all about? >>Well, AIDA aims to address the very point I mentioned before. Our customers are asking us, how can we give our business teams, where there are a lot more business needs, access to machine learning? AI for data analytics is a set of partner solutions that are ML powered, and they focus across the spectrum of analytics, from data warehousing, business intelligence, and business process automation to other business applications. And the idea is to help our partners bring ML to our customers in a lot more ways. For example, we've built integrations with Qlik, Tableau, Snowflake, Workato, and Pegasystems. And those usually take two flavors. Either we help our partners build ML and embed it into their applications and, in a way, make them more intelligent, as Itamar mentioned, or we help our partners expose machine learning capability from AWS right within the UI. >>So for example, yesterday we launched the Snowflake integration with SageMaker. Now Snowflake users can use the same user experience, the same SQL queries that they love, and trigger an AutoML process in SageMaker right from the same UI, and get ML into the same UI. And I'm quite excited to also discuss the integration we announced today with Qlik, the SageMaker integration. So I think, yeah, you mentioned customers want to create more machine learning. They want to build more machine learning capabilities faster, which is where, by the way, no-code, low-code comes to mind. How can you use Autopilot, which is a SageMaker product, for enabling faster creation of models? So I want to create models faster. 
They also want to be able to use models, in a sense monetize them, turn them into value, to make them available to more users where the users are. >>So, you know, BI environments or experiences, as we started to think about them, such as the ones we provide with Qlik. And again, our active intelligence platform is all about weaving the data into the applications, into the environments, into the analytic workflows that users have. So we introduced, and are super excited that we've announced, two integrations, so very robust integration between Qlik Cloud and Amazon SageMaker. And that includes both our new analytic connector for Amazon SageMaker and our integration with Amazon SageMaker Autopilot. So with the integration with SageMaker, we now have Qlik Sense interacting directly and seamlessly with any model deployed within SageMaker. So again, very much like Kosti mentioned, in your experience as a user, seamlessly, you now also have predictive data. So as you're working in the application, as you're interacting with your data, data is dynamically exchanged between Qlik and SageMaker, enriching your decision making and your actions with predictive data sets. And that's what's so cool about it. So again, in the Qlik environment, we bring real-time data in, prepare it for analytics, and then feed that real-time data to SageMaker to get the real-time prediction back in the same experience for the user. So we're really, really excited about that. >>So translate what that means for customers. Is it that everything happens faster? Does it unlock new capabilities? Can we unpack that a little bit? >>Absolutely. So we are, in a way, bridging the chasm between the data science world and the business teams. So the data science teams are building machine learning models to make predictions. 
And now, with the first integration that Itamar mentioned, we actually expose those machine learning models in an application that the business team uses, Qlik, and with the same dashboards that they are very familiar with, they can now trigger those machine learning models and get real-time predictions in the dashboards themselves, powered by machine learning. So in a way, this chasm between the two worlds of data science and business users is completely bridged. And the second integration we built, with Autopilot, helps data engineers use, completely on their own, machine learning technology powered by AWS SageMaker. So data engineers are creating different pipelines, and through those pipelines they can now, with a building block, add AutoML capabilities in that pipeline without really knowing machine learning. So we bridge the gap of the business teams getting access to the data science teams, and we also bridge the skill set gap for the data engineers to tap into machine learning. >>You mentioned monetization before. So this to me is key, because who's going to be doing the monetization? It's the business lines that are going to do that, not the data scientists. They're going to enable that, but ultimately it's those data consumers that are building those, I call them data products, that they can ultimately monetize. And I'm interested in low-code, no-code, which sits in your title too, so that all plays in, doesn't it? >>Yeah, and we're heavily invested in that whole space. So for example, today we just launched SageMaker Canvas. That is a low-code, no-code capability for analysts and business users. But we realized we don't only need to innovate on the technology side. We also need to innovate on the partnerships that we build, and those integrations help expose our technology wherever our customers want to be. They want to be in Qlik? 
So be it. Let them use the machine learning technology that we are innovating on exactly where they want to be. >>Can you give us some customer examples and use cases, maybe make it real for us? >>For sure. And I think, as you think about these use cases, one of the other things I want you to envision is the fact that all this predictive data and all this integration that we're talking about can actually express itself in a lot of different experiences for the user. It can be a dashboard. It can also be conversational analytics, which is part of what we offer in Qlik Cloud. So you can actually interact with the data; you don't have to actually look at it. It can be alerts that automatically inform you that you need to take action. So you don't actually look at the data. The data will come to you when it needs you, including based on predictive data. So there are a lot of options for how you're going to do it. >>Let me give you an example. Let me try and maybe pick one that is intuitive, I think, for many people: sales, right? So you have sales, you have a lot of orders. You're close to closing a quarter, you have a forecast, the deals you expect to close. And then you can use machine learning, for example, to forecast or to try to project which deals you're going to lose. So now, again, that can look at a lot of different aspects of the deal, the timing, the folder, the volume, the amounts, a lot of other parameters, right, and then predict if you're going to lose a deal. So now, if there's a deal that my salesperson is telling me he's going to win, but the model is telling me you may lose, well, I probably want to double-click on that one. >>Right? So I can now bring that information, again, in the moment, to the seller or to the management, so they can identify it and take action. 
Now, not only can I bring it to them, but I can also know, from the machine learning, what is the likely reason that they'd lose. And if I know the likely reason, it also becomes prescriptive: I now know what to do to try and fix it, right? So I can either do it, again, manually, or I can also integrate it. Again, in Qlik Cloud we also have Qlik Application Automation, which is, again, also kind of a low-code, no-code environment to orchestrate processes. I can also take that automatically and update Salesforce or the CRM, so that the metadata management system gets updated. So you've got an example, exactly the example of active intelligence. It allows me to take informed action in the now, in the moment. >>And if I'm a salesperson, maybe I prioritize, and the machine's helping me direct my resources. Is this available today? Is it in general availability? >>Available right now. Anyone can go start it right now in Qlik Cloud. >>Congratulations. Last question. So what does the future hold for this partnership? Where are you guys headed? Give us a little direction. >>First of all, we would love to scale those integrations. So if you're a customer of Qlik, please go ahead and test them and give us feedback. And second, for us, we really want to learn from our customers and improve those integrations we bring to them. We really want to hear what technologies they want to expose to a lot more users. And we are aspiring to build that partnership and get a lot more tightly aligned with Qlik. >>And thank you, Kosti. And we see tremendous additional opportunities. I think, as Amazon would say, we're in day one. That's how we kind of feel about it. There's only so much we've put into it, but the market is so dynamic. There are so many new needs that are coming up. So we kind of think about it that way. 
>>So first of all, we want to continue to expand Qlik Cloud, adding more services. It's actually a platform where we're bringing both data services, so data integration, data management, everything related to the analytics pipeline, and of course the analytic services. So it all comes together in one environment that makes it more agile and faster to build these new, modern, active intelligence type experiences. So as we do that, we're going to be adding more services, creating more opportunities to integrate with more services from the AWS side. So we're really excited to look at that. And just like Kosti mentioned with Canvas, you know, Amazon keeps coming up with new services and new capabilities, so there's going to be a lot more opportunity. We're going to keep, again, within the spirit of our partnership, where we want to, you know, jump first, innovate quickly, and, you know, create integrations that add value to customers. >>That's the flywheel. I love it. Great to have you guys. Awesome to reconnect. All right, appreciate it. Thank you for watching. This is theCUBE, and we're covering AWS re:Invent 2021. We're the leader in high tech coverage. We'll be right back.

Published Date : Dec 1 2021

ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Wright	PERSON	0.99+
Itamar Ankorion	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Kosti Vasilakakis	PERSON	0.99+
second integration	QUANTITY	0.99+
2013	DATE	0.99+
first integration	QUANTITY	0.99+
three-year	QUANTITY	0.99+
yesterday	DATE	0.99+
second	QUANTITY	0.99+
today	DATE	0.99+
two flavors	QUANTITY	0.99+
both	QUANTITY	0.98+
next year	DATE	0.98+
SageMaker	TITLE	0.98+
First	QUANTITY	0.98+
two worlds	QUANTITY	0.98+
2021	DATE	0.98+
first one	QUANTITY	0.97+
Gleevec	ORGANIZATION	0.97+
ADA	ORGANIZATION	0.97+
Applegate	ORGANIZATION	0.97+
ClixSense	TITLE	0.95+
two integrations	QUANTITY	0.94+
one	QUANTITY	0.94+
Lee	ORGANIZATION	0.93+
one environment	QUANTITY	0.92+
Tableau	TITLE	0.92+
SQL	TITLE	0.92+
Glick	ORGANIZATION	0.92+
Qlik	PERSON	0.91+
first	QUANTITY	0.91+
first time	QUANTITY	0.91+
Myra	PERSON	0.88+
SageMaker	ORGANIZATION	0.8+
day	QUANTITY	0.79+
early last decade	DATE	0.77+
double	QUANTITY	0.77+
21	DATE	0.7+
Blake	ORGANIZATION	0.69+
Workato Pegasystems	TITLE	0.67+
Invent	EVENT	0.64+
Salesforce	ORGANIZATION	0.63+
reinvent Sev	EVENT	0.58+
AWS	EVENT	0.56+
years	QUANTITY	0.56+
Wasilla	LOCATION	0.51+
20	DATE	0.48+
DDB DBT three	TITLE	0.44+

old version - Roberto Giordano, Borsa Italiana | Postgres Vision 2021

(upbeat music) >> From around the globe, it's theCUBE! With digital coverage of Postgres Vision 2021, brought to you by EDB. >> Welcome back to Postgres Vision 21, where theCUBE is covering the innovations in open source trends in this new age of application development and how to leverage open source database technologies to create world-class platforms that are cost-effective and also scale. My name is Dave Vellante, and with me is Roberto Giordano, who is the End User Computing, Corporate, and Database Services Manager at Borsa Italiana, the Italian Stock Exchange. Roberto, great to have you. Thanks for coming on. >> Thanks Dave, and thanks to the EDB friends for the invitation. >> Okay, and we're going to dig in to the great customer story here. First, Roberto, tell us a little bit more about Borsa Italiana and your role at the organization. >> Absolutely. Well, as you mentioned, Borsa is the Italian Stock Exchange. We used to be part of the London Stock Exchange, but last month we left that group and joined another group called Euronext. And right now Euronext provides the biggest liquidity pool in Europe, just to mention something. And basically we provide the market infrastructure to our customers across Europe and the whole world. So if you happen to buy a little of, I don't know, Ferrari, for instance, you probably use our infrastructure. >> So I wonder if you could talk about the key drivers in the exchange business in Italy. I don't know how closely you follow what's going on in the United States, but it's crypto madness, there's the Reddit army driving up stocks that have big short positions, and of course the regulators have to look at that, and there's a big debate going on. Well, I don't know what it's like in Italy, but what are the key drivers that are really informing the priorities for your technology strategy? 
>> Well, you mentioned, for instance, those cases that added a little bit of volatility to the global markets and also to our markets. As IT professionals running market infrastructure, our first goal is to provide an infrastructure that is reliable and with the lowest possible latency. So we are very focused on performance and reliability, just to mention the two main drivers within our systems. >> Well, and you have end-user computing in your title, and we're going to get into the database discussion, but I presume with COVID you had to pivot, and that piece of your job was escalated in 2020, I would imagine. And you mentioned latency, which is obviously a key factor in database access, but that must've been a big challenge last year. >> Well, it was really a challenge, but basically we moved the whole organization to working remotely within just a weekend. And it has been like this since February 2020. Think about the challenge of moving almost 1,000 people that used to come to the office every day to start to work remotely. And for my end user computing team this was really a challenge, but it was a good one in the end. We succeeded and everything worked fine. From our perspective, no news is good news, you know, because normally when something doesn't work, we are in the newspapers. So if you didn't hear about us, it means that everything worked out just fine. >> Yeah. It's amazing, Roberto. We're both in the technology business, you as a practitioner and me as an observer, but I mean, if you're in the tech business, most companies actually pivoted quite well. You have always been a digital business, so it's different. I mean, if you're Ferrari and making cars and you can't get semiconductors, but most technology companies actually made the transition, you know, quite amazingly. Let's get into the case study a bit. 
I wonder if you could paint a picture of your organization's infrastructure and applications, what it looks like, and particularly your database infrastructure. What does that look like? >> Well, we are a multi-vendor shop. So we like to pick the right technology for the right service. This means that my database services team currently manages several different technologies, where Postgres plays a big role in our portfolio. We currently support both the fully open source version of PostgreSQL and the EDB distribution. In particular, we prefer to use the EDB distribution where we need specific functionalities that only EDB provides, and when we need the first-class level of support that EDB in recent years was able to provide to us. >> When you say full functioning, are you talking about things like ACID compliance, two-phase commits? I mean, all these enterprise capabilities, is that right? Or maybe you could be... >> Just to mention one, for instance, we recently migrated our intra-site availability solution to EDB Failover Manager. That is an additional component that only EDB provides. >> Yeah. Okay. So, recovery obviously, and so that's a solution that you get from the EDB distro as opposed to having to build it yourself with open source tooling. >> Yeah, correct. Well, basically, historically, we used to rely on OS clustering from that perspective. But over the years we found that, even if it's a technology that works fine and has been around for four decades and so on, we faced some challenges internally, because within my team we don't also own the operating system layers. So we wanted a solution that was 100% within our control and perimeter. So just a few months ago we asked the EDB folks if they could provide something. And after a couple of meetings, also with their pre-sales engineers, we found the right solution for us. 
So, long story short, we launched a quick proof of concept to test it together, again using the EDB consultancy. And then, at the beginning of this year, we went live with the first mission-critical service using this brand new technology. Well, brand new technology for us. You know, EDB created it a few years ago. >> And I do have some follow-up questions, but I want to understand what catalyzed, you know, what was the motivation for going with an open source database? I mean, you're a great example because you're multi-vendor, so you have experience with all of it, the full spectrum. What was it about open source databases generally, and EDB specifically, that triggered the choice? >> Well, thanks for the question. This is one of the questions that I always like. I think what really drove us was the right combination of ease of use, so simplicity, and also good value for money. So we like to pick the right database technology for the right kind of service and the budget that the service has, and for a specific service the open source solution is, you know, our first choice. So we are not, I would say, a company that uses just one technology. We like to take the best of breed that the market can offer. In some cases open source, and PostgreSQL in particular, is our choice. >> How involved was the line of business in this, both the decision and the implementation? Was it kind of invisible to them, or was this really more of a technology decision based on your interpretation of the requirements? I'm interested in who was involved and how you actually got it done. >> Well, I think this decision was transparent for the business. At the end of the day, they don't really have that kind of visibility. You know, they just provide requirements, in particular in terms of performance and reliability.
And so this is something they are not really involved in. And obviously, if we are in a position to save a little bit of money, everybody's happy, even the business. >> No. So what did you have to do? That makes sense to me, I figured that was the case. Who were the stakeholders on your team? I mean, what kind of technical resources did you require, and implementation resources? Take us through what the project, if you will, looked like. How did you do it? >> Well, it's a combination of database expertise. I have the pleasure of running a team that is made up of very senior, very skilled database services professionals who are able to support more than one technology and are also very open to innovation and change. Plus, obviously, we also need the relevant development teams on board when you run this kind of transformation, and it looks like they also liked the idea of using PostgreSQL for the specific service I have in mind. So it was quite easy, not a big discussion, you know. >> What was the elapsed time from when you said, okay, we're in, you know, signed the agreement, we're going here, you made the decision, to actually getting into production? >> Well, as I mentioned, we run services and applications that are really focused on high availability and performance. So generally speaking, we are not a peak organization. Also, we run a business that is highly regulated. So, as you can imagine, we are an organization that doesn't have a lot of appetite for risk, you know. So generally speaking, running this kind of transformation is a matter of several months, I would say six to nine months, to have something delivered in that space. >> Okay. Well, I mean, that's reasonable. If you could do it inside of a year, I think that's quite good, especially in a highly regulated industry.
And then you mentioned the failover, the high availability capabilities. Were there other specific EDB tools that you utilized to address the objectives? >> Yeah, absolutely. In particular, we used Postgres Enterprise Manager, a.k.a. PEM. And very recently we were involved with EDB in specifically developing one functionality that we needed back in the day. I think, together with BART, these are the three EDB-specific tools that we use right now. >> And I'm interested, I want to get to the business impact, and I know it's early days for you, but the real motivation was to save money and simplify. I would imagine your developers were happy because they get to use modern tooling and open source. But really, in your industry it's the bottom line, right? I mean, that's really what the business case was all about. But I wonder if you could add some color there in terms of the business impact that you expect. I mean, I don't know how much visibility you have now, but anything you can share with us. >> Well, thinking about the EFM implementation, the business impact was that, in case of a failure, the DBA team, the database services team, is able to provide a solution that is 100% within our perimeter. So this means that we are fully accountable for it. So in a nutshell, when you run a service, the fewer people, the fewer teams you have to involve, the more control you can deliver. And for some very critical services, that is of great value. >> Okay. So where do you want to take this? I mean, if you're thinking about your Postgres and generally EDB roadmap, where do you want it to go?
>> Well, I see two trends within the organization. The first one is about migrating more existing services to open source solutions, which for databases is going to be Postgres. And the other trend that I see within my organization is about designing applications natively to use PostgreSQL as the base layer. I think both trends are more or less running at the same pace right now. >> Yeah. A lot of the audience members at Postgres Vision 21 are just like you. They're managing day-to-day infrastructure. They're expert practitioners. What advice would you give to somebody who is thinking about, you know, taking this journey? Maybe if you had to do something over again, what would you do differently? How can you help your peers here? >> Well, I think in particular, if you are, I would say, a big organization that runs a highly regulated business, in some cases you are a little bit afraid of open source because there is this, I would say, general concern about the lack of enterprise-level support. I would like to say that this is a thing of the past, because there are a bunch of companies like EDB that are a hundred percent capable of providing enterprise-level support, even on the open source distribution of Postgres. Obviously, then, if you're going to go with their specific distribution, the level of support is going to be even more accurate. But as we know, EDB currently is, I would say, a main contributor to the Postgres community, and I think that is an assurance for every organization. >> Your advice is don't be afraid. >> Yeah. My advice is absolutely, don't be afraid.
And if we can also mention, you know, cloud technologies, this is another topic where, if possible, I would like to suggest not being afraid. EDB, like, I would say, every organization within the IT industry, is really pushing for it. And I think for a lot of cases, not all of them, but a lot of cases, there is great value in designing services and applications to be cloud native, or in migrating existing applications into the cloud. >> Okay. But being a highly regulated industry, and being, you know, very much aware of the narrative around open source, et cetera, you must have had just a little piece of your mind saying, okay, I have to manage this risk. So is there anything specific you did to manage the risks that you would advise? Or is it really just about good change management? >> I think it was mainly about good change management. When you have, you know, the relevant stakeholders that you need on board and everybody's going in the same direction, then basically it is about executing. >> Excellent. Well, Roberto, I really appreciate your time and the knowledge that you shared with the audience. So thanks so much for coming on theCUBE. >> Thank you, Dave. It was a great pleasure. >> And thank you for watching theCUBE's continuous coverage of Postgres Vision 21. We'll be right back. (upbeat music)

Published Date : May 27 2021
