Mandy Whaley & Par Merat, Cisco | Cisco Live EU Barcelona 2020

>> Announcer: Live from Barcelona, Spain, it's theCUBE. Covering Cisco Live 2020, brought to you by Cisco and its ecosystem partners. >> Hi, everyone, welcome back to theCUBE's live coverage, it's our fourth day of four days of coverage here in Barcelona, Spain for Cisco Live 2020. I'm John Furrier with my co-host Stu Miniman and two great guests here in the DevNet studio where theCUBE is sitting all week long, been packed with action, Mandy Whaley, Senior Director Developer Experience, Cisco DevNet and Par Merat, Senior Director, welcome back to theCUBE. Good to see you guys. >> Thank you. >> Thank you. >> Glad to be here. >> So, we have had a lot of history with you guys from day one. >> Mandy: Yes. >> Watching DevNet from an idea of "Hey, we should do a developer thing." And you also have DevNet Create which is a separate, more developer focused. DevNet and Cisco's developer environment. We've been there from the beginning, what a progression! Congratulations on the success. >> Thank you. Thank you so much, it's great to be here in Barcelona with everybody here, you know, learning in the workshops, and we just love these times to connect with our community at Cisco Live, and at DevNet Create, which you mentioned, which is coming up in March. So it's right around the corner. >> DevNet Zone, which we're in, has been really robust, it's been the talk of the show every year, and it gets bigger and the sessions are packed, because people are learning. Developers, new developers as well as Cisco engineers who are certified, are coming, getting new skills as the modern cloud hybrid environments require new skills. It's a technology shift. >> Yeah, exactly and what we have in the DevNet Zone are different ways that the engineers and developers can engage with that technology shift. So, we have demos around IoT and security, and showing how, you know, to prevent threats from attacking the industrial routers and things like that. 
We have coding workshops from beginning, intro to Python, intro to Git, all the way up through advanced, like, Kubernetes topics and things like that. So, people can really dive in with what they're looking for. And this year, we are really excited because we have the new DevNet certifications with those exams coming out right around the corner in February. So, a lot of people are here saying, "I am ready to skill up for those exams, I am starting to dive into these topics." >> Well, Susie Wee was on, she's the chief of DevNet, among other things, and she said, there's going to be a DevNet 500. The first 500 certifications of DevNet are going to be, kind of, like, the hall of fame or, you know, inaugural or founder certifications. So, can you explain what does it mean? It's not a DevNet certification badge. It's a series of different ones, can you go deeper on that? >> Yeah, just like we have our, you know, existing Network Certifications which are so respected and loved around the world, people get CCIE tattoos and things. Just like there is an associate and professional and expert level on the networking track, there's now a DevNet Associate, a DevNet Professional and coming soon DevNet Expert. And then there's also Specialist badges which help you add specific skills like data center automation, IoT, Webex. So, it's a whole new set of certifications that are more focused on the software. So they're about 80% software skills, 20% knowledge of networking, and then how you really connect up and down the stack. >> So these are new certifications, they're not replacing anything else? >> No, no, no, they're all the same stuff? >> They are new, they are part of the same program, they have the same rigor, the same kind of test. They actually have ways to interweave with the existing networking certifications, because we want people to do both skill paths, right, to build this new IT team of the future. And so, it's a completely new set of exams. 
The exams are going to be available to take February 24th, and you can start signing up now. So, with the DevNet 500, you know, that's going to be a special recognition for the first 500 people who get DevNet certifications. It'll be a lifetime achievement, they'll always be in the DevNet 500, right? And I've had people coming up and telling me, you know, "I'm signed up for the first day, I'm taking my exams on the first day, I'm trying to get into that." >> Stu and I always want to be on the list, so I think we might be on the 500 study up there (laughs). >> Of course, yeah. And what's really great is with the certifications, we've heard from people in the Zone that they have been coming and taking classes and learning the skills, but they didn't have a specific way to map that to their career path to get rewarded at work, you know, to have that sort of progression. And so with the certifications they will really have that. And it's also really important for our partners and Par Merat is doing a lot of work with certifications and partners. >> Yeah, Par, definitely, I would love to hear a little bit, we've interviewed on theCUBE over the years some of the DevNet partners from a technology standpoint, of course the channel ecosystem hugely important to Cisco's business. Give us the update as to, you know, DevNet partnering as well as, what will these certifications mean to the technology and go-to-market partners? >> Yeah, the wonderful thing about this is, it really demonstrates Cisco's embracement of software, and making sure that we are providing that common language for software developers and networkers to bring the two together. And what we've found is that our partners are at different levels of maturity along that progression of programmability. 
And this new DevNet specialization, which is anchored in the individuals that are now certified at that partner, allows them to demonstrate from a go-to-market standpoint, from a recognition standpoint, that as a practice they have these skills. And look, at the end of the day, it's all about delivering what our customers need. And our customers are asking us for significant help in automation, digital transformation, they're trying to drive new business outcomes. And this will provide that recognition on who to partner with in the market. >> Yeah, this is so important. I remember when Cisco helped a lot of the partner ecosystem build data center practices. Went from the silos and now embracing, you've got the hardware, the software, we're talking multicloud. It's the practice that is needed today going forward to help customers with where they are going. >> It really is. And another benefit that we are finding in talking to our partners, as we're packaging this up and rolling it out, is not only will it help them from a recognition standpoint, from a practice standpoint and from a competitive differentiation standpoint. But it will also help them attract talent. I mean, it's no secret, there is a talent shortage right now. If you talk to any CEO that's top of mind, and how these partners are able to attract these new skills and attract smart people. Smart people like working on smart things, right? And so this has really been a big traction point for them as well. >> It's also giving ways to really specifically train for new job roles. So some of the ways that you can combine the new DevNet certifications with the network engineering certifications. We've looked at it and said, you know, there's a role of network automation developer. That's a new role. Everyone we ask in one of our sessions, "Who needs that person on their team?" So many customers, partners raise their hands saying, "We want the Network Automation Developers on our team." 
And you can combine your CCNP Enterprise with a DevNet certification and build up the skills to be that Network Automation Developer. >> Certainly it's been great buzz. I've got to get your guys' thoughts because certainly it's great for careers and you guys are betting on the people, and the people are betting on Cisco. This is what's going on, it's a maturity of DevNet, almost. It's like a pinch-me moment for you guys, but you continue to grow. I've got to ask you, what are some of the cool things that you're showing here? As you mature, you still have the start here session, which is intro to Python and other things, pretty elementary, and then there's more advanced things. What are some of the new things that's going on that you could share? >> So some of the new things we've got going on, one of my favorites is the IoT and security demonstration. There's an industrial robot arm that's picking and placing things, and you can see how it's connected to the network and then something goes wrong with that robot arm. And then you can actually show how you can use the software and security tools to see was there code trying to access, you know, something that that robot was using, that's getting in the way of it working? So you could detect threats and move forward on that. We also have a whole automation journey that starts from modeling your network to testing, to how you would deploy automation, to a deep dive on telemetry and then ends with multi-domain automation. So really helping engineers, like, look at that whole progression, that's been really popular. >> Par, talk about the specialization, which ones are more, I'd say popular or entry level, which ones are people coming into getting certified first, network engineering, automation first? Or what's the-- >> Yeah, so the program's going to roll out with three different levels. One is a specialized level, the second is an advanced level, and then we'll look to that third level. 
Again, they're anchored in the individual certs. And so as we look for that entry level, it's really all about automation, right? I mean, some things you take for granted, but you still need these new skills to be able to automate and scale, and have repeatable, scalable benefits from that. The second tier will be more cross-domain and that's where we're really thinking that additional skill set is needed to deliver dashboard experience, compliance experiences, and then that next level, again, will anchor towards the expert level that's coming out. But one thing I want to point out is, in addition to just having the certified people on staff, they also have to demonstrate that they have a practice around it. So it's not just enough to say, "I passed an exam." As we work with them to roll out the practice and they earn the badge, they're demonstrating that they have the full methodology in place so that it really, there's a lot behind it. >> So that means we can't be in the 500 list then even if we pass (laughs). >> Well, you might be able to be in the 500 list, but I don't know that theCUBE would end up being specialized. >> It's good banner advertising. No, seriously all fun, it's all fun. Cisco Live in Europe. Is there a difference between European and U.S.? Are you seeing any differences in geographic talent? >> You know what, the first couple of years that we did it, I think there was a bigger difference. It felt like there were different topics that were very popular in the U.S., slightly different in Europe. Last year and this year I feel like they have converged. It's the same focus on DevOps, automation, security is a huge focus in both places. And it also feels like the interest and level of the people attending has also converged. It's really similar. >> Congratulations, it's been fun to watch the rise and success of DevNet, continues to be strong, obviously in the hub here, and the DevNet zone behind us, packed sessions. >> Mandy: Yes, yes. 
>> What's the biggest surprise for you guys in terms of things that you didn't expect or some of the success, what's jumped out? >> Yeah, I think, you know, one of the points that I want to make sure we also cover and it has been an added benefit. We were hoping it would happen, we just didn't realize it would happen this soon. We're attracting new companies, new partners, so the specialization won't just be available for our traditional VARs. This is also available for our non-resellers and we are finding different companies accessing DevNet resources and learning these skills. So that's been a really great benefit of DevNet overall. >> Definitely, my favorite surprises are when I show up at the community events and I hear from someone I met last year what they went back and did, and the change that they drove in their company. And I think we're seeing those across the board of people who start a grassroots movement, take back some new ideas, really create change, and then they come back and we get to hear about that from them. Those are my favorite surprises. >> And I tell you, we've known for years how important the developer is, but I think the timing on this has been perfect because it is no longer just, "Oh, the developer has some tools that they'd like in the corner." The developer is connected to the business and driving things forward. >> Mandy: Exactly. >> So perfect timing, congratulations on the certification-- >> The other thing that's been great is that Cisco itself, we now have APIs across the whole portfolio and up and down the stack. So that's been a wonderful thing to see come together because it opens up possibilities for all these developers. >> So Cisco is an API-first company? >> We are building APIs everywhere we can, and the community is taking them and finding creative things to build. >> Well, it's been fun to watch you guys change Cisco, but also impact customers has been great to watch. Par, many thanks for coming on, appreciate it. 
>> Yeah, thank you. >> theCUBE's live coverage here in Barcelona for Cisco Live 2020. I'm John Furrier with Dave Vellante, Stu Miniman. Be right back with more after this short break. (upbeat music)

Published Date : Jan 31 2020

SUMMARY :

brought to you by Cisco and Good to see you guys. of history with you guys Congratulations on the success. So its right on the corner. it's been the talk of the show every year, and showing how, you like, the hall of fame and expert level on the networking track, and you can start signing up now. Stu and I always and learning the skills, of course the channel ecosystem and networkers to bring the two together. It's the practice that is and how these partners are able to attract So some of the ways that you can combine and the people are betting on Cisco. and you can see how it's So it's not just enough to be in the 500 list then to be in the 500 list, Are you seeing any differences and level of the people and we are finding different companies and the change that they how important the developer is, on the certification-- and up and down the stack. and the community is taking them Well, it's been fun to I'm John Furrier with Dave

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Stu Miniman	PERSON	0.99+
Cisco	ORGANIZATION	0.99+
Susie Wee	PERSON	0.99+
John Furrier	PERSON	0.99+
Europe	LOCATION	0.99+
Barcelona	LOCATION	0.99+
February 24th	DATE	0.99+
February	DATE	0.99+
U.S.	LOCATION	0.99+
March	DATE	0.99+
Last year	DATE	0.99+
Mandy Whaley	PERSON	0.99+
last year	DATE	0.99+
two	QUANTITY	0.99+
Python	TITLE	0.99+
this year	DATE	0.99+
third level	QUANTITY	0.99+
Barcelona, Spain	LOCATION	0.99+
two great guests	QUANTITY	0.99+
four days	QUANTITY	0.99+
fourth day	QUANTITY	0.99+
DevNet	ORGANIZATION	0.99+
second tier	QUANTITY	0.99+
Par Merat	PERSON	0.99+
One	QUANTITY	0.99+
both	QUANTITY	0.99+
today	DATE	0.99+
one	QUANTITY	0.98+
second	QUANTITY	0.98+
first day	QUANTITY	0.98+
both places	QUANTITY	0.98+
500 list	QUANTITY	0.98+
first 500 people	QUANTITY	0.97+
Cisco DevNet	ORGANIZATION	0.97+
Mandy	PERSON	0.97+
Stu	PERSON	0.96+
about 80%	QUANTITY	0.96+
theCUBE	ORGANIZATION	0.95+
DevNet	TITLE	0.95+
three different levels	QUANTITY	0.94+
Par Merat	ORGANIZATION	0.93+
first	QUANTITY	0.93+
first company	QUANTITY	0.92+
one thing	QUANTITY	0.91+
day one	QUANTITY	0.88+
European	LOCATION	0.88+
Cisco Live 2020	EVENT	0.85+
CUBE	ORGANIZATION	0.85+
first couple of years	QUANTITY	0.84+
Merat	ORGANIZATION	0.83+
DevNet Create	ORGANIZATION	0.81+
20% knowledge	QUANTITY	0.81+
first 500 certifications	QUANTITY	0.75+
500	QUANTITY	0.73+
Par	PERSON	0.73+
one of our sessions	QUANTITY	0.72+
DevNet Create	TITLE	0.71+
Cisco Live	ORGANIZATION	0.7+

K Young, Datadog | AWS Summit SF 2017

>> Voiceover: Live from San Francisco, it's The Cube. Covering AWS Summit 2017. Brought to you by Amazon Web Services. >> Hi, welcome back to The Cube. We are live in San Francisco at the AWS Summit. We've had a great day so far. I'm Lisa Martin here with my co-host George Gilbert. We are very excited to be joined by Datadog. K Young, the Director of Strategic Alliances from Datadog, welcome to The Cube. >> Thank you, hi. Glad to be here. >> So, tell us, besides loving your shirt, as I've already told you, tell us and our viewers a little bit about who Datadog is and what you do. >> Alright, so Datadog does infrastructure monitoring and application performance monitoring. So what that means is we're able to not only look at your hosts and the resources they have available to them, meaning CPU and memory and that sort of thing, but also all the software that's running on top of it. So, if it's off-the-shelf software, like a database, like Postgres, or maybe it's NGINX, we understand over 200 different off-the-shelf types of software and integrate with them directly, so all you have to do is turn on those integrations, and we can tell you whether those pieces of software are performing at the rate that they ought to, with a sufficiently low number of errors. That's the infrastructure monitoring side of things. Then application performance monitoring is where you can actually trace execution of requests, individual requests, across different services, or microservices, and tell where time is being spent and track metadata so that in a forensic case, you can go back and determine, oh, this type of call is producing a lot of errors. Oh, and those errors are coming from here, and then, you know, maybe a lot of time is being spent here, and then because Datadog also does infrastructure monitoring, drill down into, okay, well, what's happening under the hood? Maybe we're having problems because our infrastructure itself is misbehaving in some way. 
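Editor's sketch: the request tracing described in this answer can be illustrated with a few lines of plain Python. This is a toy model, not Datadog's API. The `Span` record, its field names, and the sample services are all hypothetical, and each span's duration is assumed to be time spent in that service alone.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """One unit of work done by one service on behalf of a request (hypothetical shape)."""
    trace_id: str       # identifies the request the span belongs to
    service: str        # which service did the work
    operation: str      # what kind of call it was
    duration_ms: float  # assumed to be time spent in this service only
    error: bool = False


def time_by_service(spans, trace_id):
    """Sum the time each service spent on a single request."""
    totals = defaultdict(float)
    for span in spans:
        if span.trace_id == trace_id:
            totals[span.service] += span.duration_ms
    return dict(totals)


def error_counts_by_operation(spans):
    """Count failed spans per (service, operation) across all requests."""
    counts = defaultdict(int)
    for span in spans:
        if span.error:
            counts[(span.service, span.operation)] += 1
    return dict(counts)


# Made-up spans for two requests that pass through three services.
spans = [
    Span("t1", "web", "GET /checkout", 12.0),
    Span("t1", "cart", "load_cart", 30.0),
    Span("t1", "db", "SELECT cart", 85.0),
    Span("t2", "web", "GET /checkout", 10.0),
    Span("t2", "db", "SELECT cart", 400.0, error=True),
]

print(time_by_service(spans, "t1"))
print(error_counts_by_operation(spans))
```

Grouping by trace answers "where did the time go for this request", while grouping by operation answers "which type of call is producing errors", the two questions raised in the answer above.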
>> You have some pretty big customers: Salesforce, Airbnb, Samsung. I was just reading yesterday an article that was published, that Datadog has been in the top five businesses profiled by IDC as the multi-cloud management vendors to look out for. So, some pretty big accolades, some pretty big customers. How long have you been in business? >> K Young: Since 2010. >> Lisa: 2010. And tell us about what you're doing with Amazon. >> What we're doing with Amazon. So, let's see, where to begin. A lot of people come to Datadog when they have complex systems to manage, meaning highly dynamic, or high scale, or they've adopted Docker, and their infrastructure is changing frequently. More frequently than infrastructure used to change ten years ago. Because Datadog makes it easy or ... Easy, possible even, to make sense of what's happening, even as your infrastructure changes on an hourly basis. So, a lot of customers come to us around the time they're interested in using dynamic infrastructure. Sometimes that's on Amazon, and sometimes that's when you're On-Prem but you're adopting Docker, for example, or microservices. We get a lot of business on Amazon. I think it's fair to say Amazon loves us, because it makes it so much easier to use their service and to adopt their service. And we're sort of the de facto infrastructure monitoring service for Amazon. >> So, you're talking about containers, microservices, hyperscale. Is there a break with earlier monitoring and management software that didn't handle the ephemeral nature of applications and infrastructure? Is that the change? >> Yeah, that's basically it. Ten years ago, you, as a systems administrator or operations person, would have known the names of every one of your servers, and you kind of treated them affectionately. "Oh, you know, old Roger is misbehaving again, we got to give it a reboot." These days you don't know, in many cases, how many servers you have, much less what's running on them. 
So, it used to be that you could set up monitoring where you say, "Okay, I need to look at these things. They should be doing this set of tasks." And you set it up and basically forget it for six months or a year. Now, what's happening on any given machine, or what's inside of a container, is churning very, very frequently. And so, to make sense of that, you have to use tags. So you tag all of your infrastructure with what it's doing, maybe what environment it is, like if it's staging or production, whether it's in AWS or On-Prem. Maybe it's a part of a build. And then you can look at your infrastructure and its performance through those lenses. You don't have to think in advance, "Oh, I'm going to want to know what's happening in US-East-1 in production with build number 1180." You can just do that on the fly with Datadog. And that's the sort of thing that we make possible. It's necessary for modern applications and modern services, and it really wasn't possible before. >> So, it sounds like it's fairly straightforward at the infrastructure level to know what metrics and events you want to collect, in the sense that, you know, CPU utilization, memory utilization and, you know, maybe even a database's number of connections and query time, but as you move up to the application level, the things that you want to ask could become very different between apps. >> K Young: Yeah. >> And then very different across Cloud or On-Prem. >> Yeah, that's right. So, there's sort of two classes of different things you could want to ask. Datadog accepts totally custom metrics. We know about, as I said, 200 different technologies, and we can collect everything automatically. But then, you're going to have your own application and you're going to want to send us things that are specific to your business. We take those just as well. So, for example, I think we have one customer who tracks when cash register drawers open or close. 
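Editor's sketch: the two ideas in this stretch of the conversation, slicing by tags on the fly and sending custom business metrics, can be modeled together in a few lines of Python. Nothing here is Datadog's client library. The `Point` record, the `select` helper, and the metric and tag names (including `register.drawer_open`) are hypothetical, chosen only to mirror the examples the speaker gives.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    """A single metric sample carrying free-form tags (hypothetical shape)."""
    metric: str
    value: float
    tags: dict = field(default_factory=dict)


def select(points, metric, **wanted_tags):
    """Return samples of one metric whose tags match every requested key/value."""
    return [
        p for p in points
        if p.metric == metric
        and all(p.tags.get(key) == val for key, val in wanted_tags.items())
    ]


# Made-up samples: built-in style host metrics plus one custom business metric.
points = [
    Point("cpu.user", 41.0, {"env": "production", "region": "us-east-1", "build": "1180"}),
    Point("cpu.user", 12.0, {"env": "staging", "region": "us-east-1", "build": "1181"}),
    Point("cpu.user", 77.0, {"env": "production", "region": "on-prem", "build": "1180"}),
    Point("register.drawer_open", 1.0, {"env": "production", "store": "42"}),
]

# The question is composed at query time; nothing was configured in advance.
hot = select(points, "cpu.user", env="production", region="us-east-1", build="1180")
print([p.value for p in hot])
```

Because tags are attached when the data is written and filters are composed when it is read, the custom metric is queried exactly like the host metric, which is the point being made about business-specific data.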
You know, that's not built in, but they can send those metrics to us. They get graphed the same way. We can set alerts on them the same way. We can use sophisticated machine learning to make projections about how we expect those patterns to be in the future, and if the cash registers don't open at the right rate, we can let somebody know that something has gone wrong. So, we can collect any kind of metrics. Then on top of that, we've got application performance monitoring. Right, so that's where you've written custom code, and Datadog, since it's already running on all of your servers, can track a request as it moves from service to service, or between microservices, and recompile that request into a visualization that will show you everything that happened, how long it took, and allows you to drill in and get metadata about each thing. So, you can actually reconstruct where time is going or whether there are problems. >> Why don't I ask you about some of the trends? As I mentioned a minute ago, reading that article, or the mention of Datadog by IDC as one of the top five multi-cloud management vendors. What are some of the trends that you were seeing with respect to hybrid cloud, multi-cloud? You know, we've heard some conversation today from AWS, but I'd love to get your feedback, as the Director of Strategic Alliances, what are you seeing? >> So, the trend that ... I'm going to answer this, but the trend that we were seeing a few years ago was more and more people were adopting Cloud, period. And that's continued and continued and continued. 18 months ago, if you went and talked to a large financial services organization and you told them, "We do monitoring," okay, they're interested. "Well, we run only in the Cloud, so you actually have to send your data to the Cloud." They'd show you the door very politely. And now, they say, "Oh well, we're going to the cloud, now, too." It's a great place to be. 
Now, we're seeing organizations of all sizes, all types, are in the Cloud. So, the next leading trend is containerization and microservices. So, we actually published a Docker adoption report. We've done it three times now. We refreshed it yesterday. We do it about every six months, and we take a look at all of the usage that we can see. Because we have this somewhat unique vantage point of being able to see tens of thousands of customers' usage, real usage, of infrastructure, and look at, okay, what percent are using Docker? When they use it, do they dabble with it? Do they fully adopt it? Do they eventually abandon it? What are they running on it? So, we published a very long report. Anyone who's interested can actually Google "Docker adoption" and we'll be the top hit there. We've got eight different facts that talk about how quickly it's being adopted. Docker adoption is really quite remarkable. We're seeing a 40% growth in true adoption, not just dabbling, since last year. At the same time, we've seen a more than 100% increase, a more than doubling, of the companies that use Docker that are using orchestrators, like Kubernetes, to manage even more sophisticated and rapidly changing fleets of machines. And that's really meaningful, because orchestration with containers really enables microservices, which enables DevOps, which enables people to move quickly with very little friction and own specific parts of a stack. >> Does that mean that their On-Prem operations are beginning to look more and more, in terms of processes, like the Clouds? That it's not just a VM, but they're actually orchestrating things? >> Yes, it does. And people will run orchestration on top of the Cloud, or they'll run it On-Prem. But yeah, it's exactly the same. It's the same idea. If you're On-Prem you have a physical machine, you're running several containers in it, and they can just be very fluid and dynamic. >> And then how does machine learning ... 
How do you fit machine learning into the, whether it's at the infrastructure level or at the application performance management level, do you run it and get a baseline of what's normal? Or ... >> So there's some very deep math behind what we do, so we're able to project where metrics ought to be in the future. Across any number of different categories or tags that you give us, it's important that we do that very accurately, because we don't want false positives in our alerts, meaning we don't want to wake people up unnecessarily. We also don't want to have false negatives, meaning we don't want to fail to alert when we should have. So there's a lot of math that goes into that, and we can take care of very complex periodicity even while trends are happening within metrics. Doing that at scale, so it happens in real time, is a challenge, but one where we're very proud of our solution. >> So you've been able to really derive some differentiation in the market. One of the things I was also reading was that a lot of the business, I mentioned some of those great brands, is in the U.S. and your CIO has been quite vocal about wanting to change that. What's happened in the last year, maybe with big rounds of fundraising, that's going to help you get more global, as even Amazon was talking about expansion and geographies this morning? >> Well, so it's even been a while since we've raised money, a year and a half now, I guess, but the company is doing so well. It's a great place to be. The company's doing so well that we're just able to expand our operations and look bigger and bigger. Our two founders are actually French, or they were born in France, at any rate. And so we have a Paris office and we're moving pretty aggressively into Europe now. >> Lisa: Fantastic. >> One question on, again, the hybrid-cloud migration. 
Whether it's On-Prem to, say, Azure, or On-Prem to Azure and Amazon, would the use of Datadog make it easier for the customer to, essentially, run the same workloads on either of the Clouds? >> Absolutely. So we see a lot of people coming to Datadog at the moment when they need to move from pure On-Prem to maybe hybrid or maybe fully into the Cloud. Because you can set up Datadog to look at both those environments and understand the performance characteristics, and then move over bits of it into the Cloud and make sure that nothing's falling apart and that everything is behaving exactly as you expect. >> And then how about for those who say, "Well, we want to be committed to two Clouds, because we don't want to be beholden." >> K Young: Right. >> Do you help with that? >> Yeah, we don't help with literally, like, data movement, which is sometimes one of the challenges. >> But in managing, it's sort of a single pane of glass? >> Yes, exactly. It's all one pane of glass and you can take ... Once metrics are in Datadog, it doesn't really matter where they came from. You can overlay requests per second or latency from Google's Cloud right alongside latency that you're seeing in AWS on the same graph or next to each other, and you can set alerts if they deviate too much from each other. >> So it's kind of an abstraction layer, or at least a commonality, so that customers would be able to have those applications in different clouds from different providers and be able to see the performance of the application and the infrastructure. And so one last question for you, as we're getting ready to wrap up here. You know there's a lot of debate about hybrid-cloud and there are reports that say in the next few years, companies will have to be multi-cloud. Just look at the Snap IPO filing from a couple months ago. Big announcement. Two billion dollars over five years with Google. And then, they revised that S-1 filing to announce a billion dollar deal with Amazon. >> K Young: Yeah. 
>> So I'm just curious. Are you seeing that maybe with the enterprises, like a Snap, more and more, that by default, whether it's for redundancy of infrastructure operations, is that a trend that you're also seeing? That you're quite well-positioned to be able to facilitate? >> Yeah, we're definitely seeing ... You know, it's clear that Amazon is in the commanding position, for sure, but we are definitely seeing more and more interest and actual action in other Clouds as well. >> Fantastic. Well, we thank you first of all for being on the program today. Great. Congratulations on the success that you've had with Amazon, with others, and with the market differentiation. Congrats on expanding globally as well, and we look forward to having you back on the program. >> Right. Well, thanks very much for having me. >> Excellent. So K Young, Director of Strategic Alliances from Datadog. On behalf of K and my co-host George Gilbert, I'm Lisa Martin. You're watching The Cube live from the AWS Summit in San Francisco, but stick around 'cause we're going to be right back. (techno music) (dramatic music)

Published Date : Apr 20 2017

SUMMARY :

Brought to you by Amazon Web Services. We are live in San Francisco at the AWS Summit. Glad to be here. about who Datadog is and what do you do. and the resources they have available to them, How long have you been in business? And tell us about what you're doing with Amazon. and to adopt their service. Is that the change? And so, to make sense of that, you have to use tags. in the sense that, you know, CPU utilization, and if the cash registers don't open at the right rate, What are some of the trends that you were seeing but the trend that we were seeing a few years ago It's the same idea. or at the application performance management level, or tags that you give us, that's going to help you get more global but the company is doing so well. or On-Prem to Azure and Amazon, and that everything is behaving exactly as you expect. because we don't want to be beholden." Yeah, we don't help with literally, like, data movement, on the same graph or next to each other, and be able to see the performance Are you seeing that maybe with the enterprises, is in the commanding position, and we look forward to having you back on the program. Well, thanks very much for having me. from the AWS Summit in San Francisco,

ENTITIES

Entity	Category	Confidence
George Gilbert	PERSON	0.99+
Lisa Martin	PERSON	0.99+
France	LOCATION	0.99+
Europe	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
Samsung	ORGANIZATION	0.99+
six months	QUANTITY	0.99+
Lisa	PERSON	0.99+
Airbnb	ORGANIZATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
US	LOCATION	0.99+
K Young	PERSON	0.99+
San Francisco	LOCATION	0.99+
K	PERSON	0.99+
Datadog	ORGANIZATION	0.99+
40%	QUANTITY	0.99+
2010	DATE	0.99+
Two billion dollars	QUANTITY	0.99+
yesterday	DATE	0.99+
last year	DATE	0.99+
two founders	QUANTITY	0.99+
U.S.	LOCATION	0.99+
Paris	LOCATION	0.99+
Google	ORGANIZATION	0.99+
one	QUANTITY	0.99+
200 different technologies	QUANTITY	0.99+
18 months ago	DATE	0.99+
IDC	ORGANIZATION	0.99+
a year	QUANTITY	0.99+
Salesforce	ORGANIZATION	0.99+
three times	QUANTITY	0.99+
Roger	PERSON	0.99+
a year and a half	QUANTITY	0.98+
One	QUANTITY	0.98+
One question	QUANTITY	0.98+
one customer	QUANTITY	0.98+
two classes	QUANTITY	0.98+
AWS Summit	EVENT	0.98+
both	QUANTITY	0.97+
each thing	QUANTITY	0.97+
Ten years ago	DATE	0.97+
one last question	QUANTITY	0.97+
today	DATE	0.97+
The Cube	TITLE	0.97+
first	QUANTITY	0.96+
tens of thousands	QUANTITY	0.96+
billion dollar	QUANTITY	0.96+
few years ago	DATE	0.95+
over 200 different	QUANTITY	0.95+
Docker	TITLE	0.95+
five businesses	QUANTITY	0.94+
ten years ago	DATE	0.94+
AWS Summit 2017	EVENT	0.92+
more than doubling	QUANTITY	0.88+
French	LOCATION	0.88+

Anais Dotis Georgiou, InfluxData | Evolving InfluxDB into the Smart Data Platform

>>Okay, we're back. I'm Dave Vellante with The Cube and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis Georgiou is here. She's a developer advocate for InfluxData and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into realtime analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, it's gonna store files in object storage. So you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, super useful. 
Also, broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. Adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. So while Rust is syntactically similar to C++ and it has similar performance, it also compiles to native code like C++. But unlike C++, it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its, like, innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are a main class of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust, like, helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await, to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well. 
So essentially it just, like, has all the control, all the fine-grained control you need to take advantage of memory and all your resources as well as possible so that you can handle those really, really high cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx et cetera, you know, you see things like, you know, in the old days, and even today, you do a lot of garbage collection in these systems and there's an inverse, you know, impact relative to performance. So it looks like, you know, the community is really modernizing the platform, but I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain, what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data, and so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to kind of illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values as well as maybe a measurement value, timestamp value, maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have, like, two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. 
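Editor's sketch: the room-and-stove table described in this answer can be written out in plain Python. The snippet holds the same readings row by row and column by column, then run-length encodes one column as a simple stand-in for the kind of compression that repeated neighboring values allow. The field names and readings are made up for illustration, and this is not Arrow's actual memory format or IOx code.

```python
from itertools import groupby

# Row-oriented: each record keeps all of its fields together.
rows = [
    {"time": 1, "room": "kitchen", "room_temp": 21.0, "stove_temp": 20.0},
    {"time": 2, "room": "kitchen", "room_temp": 21.0, "stove_temp": 95.0},
    {"time": 3, "room": "kitchen", "room_temp": 21.0, "stove_temp": 180.0},
    {"time": 4, "room": "kitchen", "room_temp": 21.5, "stove_temp": 175.0},
]


def to_columns(rows):
    """Pivot row records into one list per field (a column-oriented layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows)

# A regulated room temperature repeats, so the column collapses into short runs.
print(run_length_encode(columns["room_temp"]))

# Min and max over one field only need to read that one column.
print(min(columns["room_temp"]), max(columns["room_temp"]))
```

In the row layout, answering the same min/max question means stepping through every record and skipping past the timestamp, tag, and stove fields to reach each room temperature.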
And so if that's the case and you're just taking temperature values from the room and a lot of those temperature values are the same, then you'll, you might be able to imagine how equal values will then neighbor each other and when they neighbor each other in the storage format. This provides a really perfect opportunity for cheap compression. And then this cheap compression enables high cardinality use cases. It also enables for faster scan rates. So if you wanna define like the min and max value of the temperature in the room across a thousand different points, you only have to get those a thousand different points in order to answer that question and you have those immediately available to you. But let's contrast this with a row oriented storage solution instead so that we can understand better the benefits of calmer oriented storage. >>So if you had a row oriented storage, you'd first have to look at every field like the temperature in, in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is. And every timestamp you'd then have to pluck out that one temperature value that you want at that one times stamp and do that for every single row. So you're scanning across a ton more data and that's why row oriented doesn't provide the same efficiency as calmer and Apache Arrow is in memory calmer data, calmer data fit framework. So that's where a lot of the advantages come >>From. Okay. So you've basically described like a traditional database, a row approach, but I've seen like a lot of traditional databases say, okay, now we've got, we can handle colo format versus what you're talking about is really, you know, kind of native it, is it not as effective as the, is the form not as effective because it's largely a, a bolt on? Can you, can you like elucidate on that front? 
>>Yeah, it's, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. And so those are, that's pretty much the main reasons why, why RO row oriented storage isn't as efficient as calm, calmer oriented storage. >>Yeah. Got it. So let's talk about Arrow data fusion. What is data fusion? I know it's written in rust, but what does it bring to to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as its in memory format. So the way that it helps influx DB IOx is that okay, it's great if you can write unlimited amount of cardinality into influx cbis, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So data fusion helps enable the, the query process and transformation of that data. It also has a PANDAS API so that you could take advantage of PDA's data frames as well and all of the machine learning tools associated with pandas. >>Okay. You're also leveraging par K in the platform course. We heard a lot about Par K in the middle of the last decade cuz as a storage format to improve on Hadoop column stores. What are you doing with Par K and why is it important? >>Sure. So Par K is the calm oriented durable file format. So it's important because it'll enable bulk import and bulk export. It has compatibility with Python and pandas so it supports a broader ecosystem. Parque files also take very little disc disc space and they're faster to scan because again they're column oriented in particular, I think PAR K files are like 16 times cheaper than CSV files, just as kind of a point of reference. And so that's essentially a lot of the, the benefits of par k. >>Got it. Very popular. So and these, what exactly is influx data focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure. 
So InfluxData first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic and support for EXISTS clauses and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think kind of the idea here is that if you can improve these upstream projects, then the long term strategy is that the more you contribute and build those up, the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and kind of summarize, you know, what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx, and I really encourage you, if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it and all of the hard work, and you just wanna learn more, to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There's also a community forum and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as ask any questions you have about IOx, what to expect and what you'd like to learn more about. 
I, as a developer advocate, wanna answer your questions. So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community, collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there and in a moment I'll be back with Tim Yokum. He's the director of engineering for InfluxData and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this.

Published Date : Nov 8 2022

Evolving InfluxDB into the Smart Data Platform

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database was, for many use cases, a superior alternative to general-purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken, you know, every hour, every minute, every second, you know, down to the millisecond, and how the world was moving toward real-time or near-real-time data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support real-time data in emerging use cases in IoT and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands, specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and specific tools. And in this program you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB. 
>>Now, these innovations evolve the idea of time series analysis by dramatically increasing the granularity of time series data, compressing the historical time slices, if you will, from, for example, minutes down to milliseconds, and at the same time enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis-Georgiou, who is a developer advocate at InfluxData. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum. He's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks, Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB, you know, needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think for us it's been about staying ahead of the market. I think, you know, if we think about what our customers are coming to us with now, you know, requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, we're staying ahead of the market as well and anticipating those future needs. So, you know, this is really a transparent change for our customers. 
I mean, I think we'll be adding new capabilities over time that leverage this new engine, but, you know, initially the customers who are using us are gonna see just great improvements in performance, you know, especially those that are working at the top end of the workload scale, you know, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I mean, I think we had our open source, we had an enterprise product, you know, and shifting that technology, especially the open source code base, to a service basis where we were hosting it through, you know, multiple cloud providers, that was a long journey. I guess, you know, phase one was, you know, we wanted to host Enterprise for our customers, so we sort of created a service where we just managed and ran our enterprise product for them. You know, phase two of this cloud effort was to optimize for, like, multi-tenant, multi-cloud, to be able to host it in a truly SaaS manner where we could use, you know, some type of customer activity or consumption as the pricing vector, you know. And that was sort of the birth of the first real InfluxDB Cloud, you know, which has been really successful. 
>>We've seen, I think, like 60,000 people sign up, and we've got tons and tons of both enterprises as well as, like, new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis, you know. And having that big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has, you know, pointed us in a really good direction in terms of making sure we're continuously improving that, and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is, and what does it take to make that shift from, you know, time series specialist to real-time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than like a shift or a pivot. You know, time series data is always gonna be fundamental and sort of the basis of the solutions that we offer our customers, and then also the ones that they're building on the raw APIs of our platform themselves. You know, the time series market is one that we've worked diligently to lead, I mean, I think, when it comes to, like, metrics, especially sensor data and app and infrastructure metrics. If we're being honest, though, I think our user base is well aware that the way we were architected was much more towards those sort of backwards-looking, historical-type analytics, which are key for troubleshooting and making sure you don't, you know, run into the same problem twice. 
But, you know, we had to ask ourselves, like, what can we do to better handle those queries from a performance and, you know, a time-to-response standpoint on the queries, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can, like, limit that window down to minutes and then seconds. >>And now with this new engine, we're really starting to talk about a query window that could be, like, returning results in, you know, milliseconds of time since it hit the ingest queue. And that's really getting to the point where, as your data is available, you can use it and you can query it, you can visualize it, and you can do all those sort of magical things with it, you know? And I think getting all of that to a place where we're saying, like, yes to the customer on, you know, all of the real-time queries, the multiple language query support, you know, it was hard, but we're now at a spot where we can start introducing that to, you know, a limited number of customers, strategic customers and strategic availability zones to start, but, you know, everybody over time. >>So you're basically going from what happened, and you can still do that obviously, to what's happening now, in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? I mean, like, in the moment right now, whether you're talking about, like, a millisecond ago or a minute ago, you know, that's pretty much right now, I think, for most people, especially in these use cases where you have other components of latency induced by the underlying data collection, the architecture, the infrastructure, the, you know, the devices and, you know, the highly distributed nature of all of this. So yeah, I mean, getting a customer or a user to be able to use the data as soon as it is available is what we're after here. 
>>I always thought, you know, I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, you know, and that's one of the things that really triggered us to know that we were heading in the right direction, is just how many operational customers we have. You know, everything from, like, aerospace and defense. We've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor, you know, and if we can satisfy their demands for, like, real-time historical perspective, that's awesome. I think what we're gonna do here is we're gonna start to, like, edge into the real time that they're used to in terms of, you know, the millisecond response times that they expect of their control systems, certainly not their historians and databases. >>Is this available, these innovations, to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. You know, I think we want to emphasize that's for now; our goal is to get our latest and greatest and our best to everybody over time, of course. You know, one of the things we had to do here was, like, we doubled down on our commitment to open source and availability. So, like, anybody today can take a look at the libraries on our GitHub and, you know, can inspect it and even can try to, you know, implement or execute some of it themselves in their own infrastructure. You know, we're committed to bringing our latest and greatest to our cloud customers first for a couple of reasons. Number one, you know, there are big workloads and they have high expectations of us. 
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, like, how the system itself is performing. >>And so just, you know, being careful, maybe a little cautious in terms of how big we go with this right away, just sort of both limits, you know, the risk of, you know, any issues that can come with new software rollouts. We haven't seen anything so far, but also it does give us the opportunity to have, like, meaningful conversations with a small group of users who are using the products. But once we get through that and they give us two thumbs up on it, it'll be, like, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation, you know, using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. You know, this is a new, very popular systems language, you know. It's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to, like, deliver it in a way that is, you know, something we can inspect very closely, but then also rely on the fact that it's going to behave well. And if it does find error conditions, I mean, we've loved working with Go and, you know, a lot of our libraries will continue to be implemented in Go, but, you know, when it came to this particular new engine, you know, for that power, performance, and stability, Rust was critical. On top of that, like, we've also integrated Apache Arrow and Apache Parquet for persistence. 
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our time series merge trees, this is a big break from that, you know, Arrow on the in-memory side and then Parquet on the on-disk side. >>It allows us to present, you know, a unified set of APIs for those really fast real-time queries that we talked about, as well as for very large, you know, historical bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem popping up around Parquet in terms of the machine learning community, you know. And to get that all to work, we had to glue it together with Arrow Flight. That's what we're using as our RPC component. You know, it handles the orchestration and the transportation of the columnar data. Now we're moving to, like, a true columnar database model for this version of the engine, you know, and it removes a lot of overhead for us in terms of having to manage all that serialization, the deserialization, and, you know, to that point again, like, blurring that line between real-time and historical data. It's, you know, highly optimized for both streaming micro-batch and then batches, but true streaming as well. >>Yeah. Again, I mean, it's funny you mention Rust. It's been around for a long time, but its popularity is, you know, really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, I mean, I think first I'd like everybody watching just to, like, take a look at what we're offering in terms of early access and beta programs. I mean, if you wanna participate or if you wanna work in terms of early access with the new engine, please reach out to the team. 
I'm sure, you know, there's a lot of communications going out and, you know, it'll be highly featured on our website, you know, but reach out to the team. Believe it or not, like, we have a lot more going on than just the new engine. And so there are also other programs, things we're offering to customers in terms of the user interface, data collection and things like that. And, you know, if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. >>But if there's something new that you wanna try out, we'd just love to hear from you. And then, you know, our goal would be that as we give you access to all of these new cool features, you know, you would give us continuous feedback on these products and services, not only, like, what you need today, but then what you'll need tomorrow to build the next versions of your business. Because, you know, the whole database ecosystem, as it expands out into, you know, this vertically oriented stack of cloud services and enterprise databases and edge databases, you know, it's gonna be what we all make it together, not just, you know, those of us who are employed by InfluxData. And then finally I would just say, please, like, watch Anais's and Tim's sessions. Like, these are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there are no better takes, like, honestly, on the technical details of this than theirs, especially when it comes to, like, the value that these investments will bring to our customers and our communities. So I encourage you to, you know, pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks, Dave. It was awesome. 
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time. Really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data. Multiple layers of redundancy ensure you don't lose any data. Access controls ensure that only the people who should see your data can see it. >>And encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge, or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one. 
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with theCUBE and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Some other really important parts: the ability to have bulk data export and import, super useful. 
Also broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. So while Rust is syntactically similar to C++ and it has similar performance, it also compiles to native code like C++. But unlike C++, it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its, like, innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust, like, helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well. 
So essentially it just, like, has all the control, all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you know, you see things like, you know, in the old days, and even today, you do a lot of garbage collection in these systems, and there's an inverse, you know, impact relative to performance. So it looks like, you know, the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data. And so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to kind of illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have, like, two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. >>So when you have column-oriented storage, essentially you take each column and group it together. 
And so if that's the case and you're just taking temperature values from the room, and a lot of those temperature values are the same, then you might be able to imagine how equal values will then neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna find, like, the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. 
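The columnar idea described above can be sketched in a few lines of Python. This is only a minimal illustration, not how InfluxDB IOx or Apache Arrow actually store data: the sample readings, the `to_columns` and `run_length_encode` helpers, and the choice of run-length encoding as the compression scheme are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical readings: (timestamp, room, room_temp, stove_temp).
rows = [
    (1, "kitchen", 70, 150),
    (2, "kitchen", 70, 200),
    (3, "kitchen", 70, 350),
    (4, "kitchen", 71, 350),
]


def to_columns(rows, names):
    """Pivot row tuples into a dict mapping column name -> list of values."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows, ["time", "room", "room_temp", "stove_temp"])

# Equal room temperatures sit next to each other, so they compress cheaply.
encoded_room_temp = run_length_encode(columns["room_temp"])

# A min/max query only has to scan the single column it cares about.
room_min, room_max = min(columns["room_temp"]), max(columns["room_temp"])

print(encoded_room_temp)
print(room_min, room_max)
```

Because the room temperature barely changes, the four readings collapse into two (value, count) pairs, and the min/max query never touches the timestamp, tag, or stove columns.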
>>Yeah, it's, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. And so those are, that's pretty much the main reasons why, why RO row oriented storage isn't as efficient as calm, calmer oriented storage. Yeah. >>Got it. So let's talk about Arrow Data Fusion. What is data fusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as it's in memory format. So the way that it helps in influx DB IOCs is that okay, it's great if you can write unlimited amount of cardinality into influx Cbis, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So Data fusion helps enable the, the query process and transformation of that data. It also has a PANDAS API so that you could take advantage of PANDAS data frames as well and all of the machine learning tools associated with Pandas. >>Okay. You're also leveraging Par K in the platform cause we heard a lot about Par K in the middle of the last decade cuz as a storage format to improve on Hadoop column stores. What are you doing with Parque and why is it important? >>Sure. So parque is the column oriented durable file format. So it's important because it'll enable bulk import, bulk export, it has compatibility with Python and Pandas, so it supports a broader ecosystem. Par K files also take very little disc disc space and they're faster to scan because again, they're column oriented in particular, I think PAR K files are like 16 times cheaper than CSV files, just as kind of a point of reference. And so that's essentially a lot of the, the benefits of par k. >>Got it. Very popular. So and he's, what exactly is influx data focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure. 
So Influx DB first has contributed a lot of different, different things to the Apache ecosystem. For example, they contribute an implementation of Apache Arrow and go and that will support clearing with flux. Also, there has been a quite a few contributions to data fusion for things like memory optimization and supportive additional SQL features like support for timestamp, arithmetic and support for exist clauses and support for memory control. So yeah, Influx has contributed a a lot to the Apache ecosystem and continues to do so. And I think kind of the idea here is that if you can improve these upstream projects and then the long term strategy here is that the more you contribute and build those up, then the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You got that virtuous cycle going, the people call the flywheel. Give us your last thoughts and kind of summarize, you know, where what, what the big takeaways are from your perspective. >>So I think the big takeaway is that influx data is doing a lot of really exciting things with Influx DB IOx and I really encourage, if you are interested in learning more about the technologies that Influx is leveraging to produce IOCs, the challenges associated with it and all of the hard work questions and you just wanna learn more, then I would encourage you to go to the monthly Tech talks and community office hours and they are on every second Wednesday of the month at 8:30 AM Pacific time. There's also a community forums and a community Slack channel look for the influx DDB unders IAC channel specifically to learn more about how to join those office hours and those monthly tech tech talks as well as ask any questions they have about iacs, what to expect and what you'd like to learn more about. I as a developer advocate, I wanna answer your questions. 
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community, collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money. And also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language, and as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality because InfluxDB Cloud is entirely managed. It probably saved us at least a full additional person on our team. 
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on theCUBE for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud has been built out on open source, along with mobile, social platforms, and key databases. And of course InfluxDB and InfluxData have been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, you know, Influx really, we thrive at the intersection of commercial services and open source software. OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services and templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants, and like you've mentioned, even better, we contribute a lot back to the projects that we use as well as our own product, InfluxDB. >>You know, but I gotta ask you, Tim, because one of the challenges that we've seen, and in particular you saw this in the heyday of Hadoop, is that the innovations come so fast and furious, and as a software company you gotta place bets, you gotta, you know, commit people, and sometimes those bets can be risky and not pay off. How have you managed this challenge? >>Oh, it moves fast.
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You know, you look at Kubernetes for example: that ecosystem is driven by thousands of intelligent developers, engineers, builders. They're adding value every day. So we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day, we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1,500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly, you know, along with cloud. But really Kubernetes is just, you know, still up and to the right consistently, even with the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere: at AWS, Azure, Google Cloud.
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code, so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, you know, wherever. Is that correct? >>Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase, you know. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but I'll call it a PaaS layer if I can use that term. Are there specific attributes to InfluxDB, or is it kind of just generally off-the-shelf PaaS? You know, is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build-versus-buy lens. Some things we want to leverage cloud provider services for, for instance Postgres databases for metadata. Perhaps we'll get that off of our plate and let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, and that we can manage as an SRE group, as an ops team, with very few people really. And we can stamp out clusters across multiple regions in no time.
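The idea of a platform that is "all generated from code" and used to "stamp out clusters" across clouds and regions can be pictured with a minimal sketch. Everything below is hypothetical and for illustration only: the `ClusterSpec` class, the `stamp_out` helper, the `tsdb-` naming scheme, the node count, and the target list are assumptions, not InfluxData's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of one cluster; not a real InfluxData type."""
    cloud: str
    region: str
    node_count: int

    @property
    def name(self):
        # Derive the cluster name from its location so names stay predictable.
        return f"tsdb-{self.cloud}-{self.region}"


def stamp_out(targets, node_count=3):
    """Build one identical cluster spec per (cloud, region) target."""
    return [ClusterSpec(cloud, region, node_count) for cloud, region in targets]


# Illustrative targets only: one region on each of the three clouds mentioned.
targets = [("aws", "us-east-1"), ("azure", "westeurope"), ("gcp", "us-central1")]
clusters = stamp_out(targets)

for cluster in clusters:
    print(cluster.name, cluster.node_count)
```

Because every spec comes from the same function, adding a region would be a one-line change to `targets`, which is one way to read the "very few people" point made above.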
>>So sometimes you build, sometimes you buy. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is what everybody else will do: we're looking for trade-offs that make sense. You know, we really want to protect our customers' data. So we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team. And of course as a customer you don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level. What does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored. So there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time.
And it's that continuous deployment that allows us to continue testing things in flight and rolling things out that change: new features, better ways of doing deployments, safer ways of doing deployments. >>All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end we want you to focus on getting actual insights from your data instead of running infrastructure. You know, let us do that for you. >>And that makes sense, but are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? Or, and I'm sure the answer is both, is it opening up new territory for customers? Can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially, people want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, and send the rest of that data off to Influx Cloud where the heavy processing takes place.
So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >>And so obviously taking away a lot of the heavy lifting for the infrastructure, would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure and that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. You know, that's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit or at rest, is always kept secure and is only viewable by you. You know, you look at things like software bills of materials: if you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that, you know, as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, you know, historically, as you know, Tim, they would air gap everything.
That's how they kept it safe. But that's not feasible anymore. Everything's >>That >>Connected now, right? And so you've gotta have a partner that, again, takes away that heavy lifting in R&D so you can focus on some of the other activities. Right. Give us the last word and the key takeaways from your perspective. >>Well, you know, from my perspective I see it as a two-lane approach with Influx, with any time series data. You know, you've got a lot of stuff that you're gonna run on-prem. You had mentioned air gapping; sure, there's plenty of need for that. But at the end of the day, people that don't want to run big data centers, people that want to trust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching theCUBE. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? >>Check out InfluxDB University, >>where you can find our entire catalog of free training that will help you make the most of your time series data. >>Get started for free at influxdbu.com. >>We'll see you in class. >>Okay, so we heard today from three experts on time series and data how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow, the Rust programming environment, DataFusion, and Parquet are being leveraged to support real-time data analytics at scale.
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of real-time data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You can join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Nov 2 2022

Evolving InfluxDB into the Smart Data Platform Full Episode

>>This past May, theCUBE, in collaboration with InfluxData, shared with you the latest innovations in time series databases. We talked at length about why a purpose-built time series database, for many use cases, was a superior alternative to general purpose databases trying to do the same thing. Now, you may remember that time series data is any data that's stamped in time, and if it's stamped, it can be analyzed historically. And when we introduced the concept to the community, we talked about how, in theory, those time slices could be taken, you know, every hour, every minute, every second, you know, down to the millisecond, and how the world was moving toward real-time or near-real-time data analysis to support physical infrastructure like sensors and other devices and IoT equipment. Time series databases have had to evolve to efficiently support real-time data in emerging use cases in IoT and other use cases. >>And to do that, new architectural innovations have to be brought to bear. As is often the case, open source software is the linchpin to those innovations. Hello and welcome to Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and produced by theCUBE. My name is Dave Vellante and I'll be your host today. Now, in this program we're going to dig pretty deep into what's happening with time series data generally, and specifically how InfluxDB is evolving to support new workloads and demands on data, specifically around data analytics use cases in real time. Now, first we're gonna hear from Brian Gilmore, who is the director of IoT and emerging technologies at InfluxData. And we're gonna talk about the continued evolution of InfluxDB and the new capabilities enabled by open source generally and by specific tools. And in this program you're gonna hear a lot about things like Rust, the implementation of Apache Arrow, the use of Parquet, and tooling such as DataFusion, which are powering a new engine for InfluxDB.
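To make the "time slices" idea above concrete, here is a minimal sketch of grouping timestamped readings into slices of different widths. It uses only the Python standard library, not InfluxDB; the `downsample` function and the sample readings are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_ms):
    """Average (timestamp_ms, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts_ms, value in points:
        # Floor each timestamp to the start of the bucket it falls in.
        buckets[ts_ms - ts_ms % bucket_ms].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical sensor readings: (milliseconds since start, temperature).
readings = [(0, 20.0), (400, 22.0), (1200, 21.0), (1900, 23.0), (61000, 30.0)]

per_second = downsample(readings, 1_000)    # one slice per second
per_minute = downsample(readings, 60_000)   # one slice per minute

print(per_second)
print(per_minute)
```

The same five readings yield three per-second slices but only two per-minute slices, so narrower slices preserve more detail at the cost of more data to store and query.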
>>Now, these innovations evolve the idea of time series analysis by dramatically increasing the granularity of time series data, by compressing the historical time slices, if you will, from, for example, minutes down to milliseconds, and at the same time enabling real-time analytics with an architecture that can process data much faster and much more efficiently. Now, after Brian, we're gonna hear from Anais Dotis-Georgiou, who is a developer advocate at InfluxData. And we're gonna get into the why of these open source capabilities and how they contribute to the evolution of the InfluxDB platform. And then we're gonna close the program with Tim Yocum. He's the director of engineering at InfluxData, and he's gonna explain how the InfluxDB community actually evolved the data engine in mid-flight and which decisions went into the innovations that are coming to the market. Thank you for being here. We hope you enjoy the program. Let's get started. Okay, we're kicking things off with Brian Gilmore. He's the director of IoT and emerging technology at InfluxData. Brian, welcome to the program. Thanks for coming on. >>Thanks, Dave. Great to be here. I appreciate the time. >>Hey, explain why InfluxDB, you know, needs a new engine. Was there something wrong with the current engine? What's going on there? >>No, no, not at all. I mean, I think for us it's been about staying ahead of the market. I think, you know, if we think about what our customers are coming to us with now, you know, related to requests like SQL query support, things like that, we have to figure out a way to execute those for them in a way that will scale long term. And then we also wanna make sure we're innovating, we're sort of staying ahead of the market as well and sort of anticipating those future needs. So, you know, this is really a transparent change for our customers.
I mean, I think we'll be adding new capabilities over time that sort of leverage this new engine, but, you know, initially the customers who are using us are gonna see just great improvements in performance, you know, especially those that are working at the top end of the workload scale, you know, the massive data volumes and things like that. >>Yeah, and we're gonna get into that today and the architecture and the like, but what was the catalyst for the enhancements? I mean, when and how did this all come about? >>Well, I mean, like three years ago we were primarily on premises, right? I mean, I think we had our open source, we had an enterprise product, you know, and sort of shifting that technology, especially the open source code base, to a service basis where we were hosting it through, you know, multiple cloud providers, that was a long journey, I guess. You know, phase one was, we wanted to host Enterprise for our customers, so we sort of created a service where we just managed and ran our enterprise product for them. You know, phase two of this cloud effort was to optimize for, like, multi-tenant, multi-cloud, and be able to host it in a truly SaaS manner where we could use, you know, some type of customer activity or consumption as the pricing vector. And that was sort of the birth of the real first InfluxDB Cloud, you know, which has been really successful.
>>We've seen, I think, like 60,000 people sign up, and we've got tons and tons of both enterprises as well as, like, new companies, developers, and of course a lot of home hobbyists and enthusiasts who are using it on a daily basis. You know, having that sort of big pool of very diverse customers to chat with as they're using the product, as they're giving us feedback, et cetera, has, you know, pointed us in a really good direction in terms of making sure we're continuously improving that, and then also making these big leaps as we're doing with this new engine. >>Right. So you've called it a transparent change for customers, so I'm presuming it's non-disruptive, but I really wanna understand how much of a pivot this is, and what does it take to make that shift from, you know, time series specialist to real-time analytics and being able to support both? >>Yeah, I mean, it's much more of an evolution, I think, than like a shift or a pivot. You know, time series data is always gonna be fundamental and sort of the basis of the solutions that we offer our customers, and then also the ones that they're building on the sort of raw APIs of our platform themselves. You know, the time series market is one that we've worked diligently to lead, I mean, I think when it comes to, like, metrics, especially like sensor data and app and infrastructure metrics. If we're being honest though, I think our user base is well aware that the way we were architected was much more towards those sort of backwards-looking, historical-type analytics, which are key for troubleshooting and making sure you don't, you know, run into the same problem twice.
But, you know, we had to ask ourselves, like, what can we do to better handle those queries from a performance and, you know, a time-to-response perspective, and can we get that to the point where the result sets are coming back so quickly from the time of query that we can, like, limit that window down to minutes and then seconds? >>And now with this new engine, we're really starting to talk about a query window that could be, like, returning results in, you know, milliseconds of time since it hit the ingest queue. And that's really getting to the point where, as your data is available, you can use it and you can query it, you can visualize it, and you can do all those sort of magical things with it, you know? And I think getting all of that to a place where we're saying, like, yes to the customer on, you know, all of the real-time queries, the multiple language query support, you know, it was hard, but we're now at a spot where we can start introducing that to, you know, a limited number of customers, strategic customers and strategic availability zones to start. But, you know, everybody over time. >>So you're basically going from what happened, and you can still do that obviously, to what's happening now in the moment? >>Yeah, yeah. I mean, if you think about time, it's always sort of past, right? I mean, like in the moment right now, whether you're talking about like a millisecond ago or a minute ago, you know, that's pretty much right now, I think, for most people, especially in these use cases where you have other sort of components of latency induced by the underlying data collection, the architecture, the infrastructure, you know, the devices, and, you know, the sort of highly distributed nature of all of this. So yeah, I mean, getting a customer or a user to be able to use the data as soon as it is available is what we're after here.
>>I always thought of real time as before you lose the customer, but now in this context, maybe it's before the machine blows up. >>Yeah, I mean, operational real time is different, you know, and that's one of the things that really triggered us to know that we were heading in the right direction, is just how many sort of operational customers we have. You know, everything from, like, aerospace and defense. We've got companies monitoring satellites, we've got tons of industrial users using us as a process historian on the plant floor, you know, and if we can satisfy their sort of demands for, like, real-time historical perspective, that's awesome. I think what we're gonna do here is we're gonna start to, like, edge into the real time that they're used to in terms of, you know, the millisecond response times that they expect of their control systems, certainly not of their historians and databases. >>Is this available, these innovations, to InfluxDB Cloud customers only? Who can access this capability? >>Yeah. I mean, commercially and today, yes. You know, I think we want to emphasize that's for now. Our goal is to get our latest and greatest and our best to everybody over time, of course. You know, one of the things we had to do here was, like, we doubled down on sort of our commitment to open source and availability. So, like, anybody today can take a look at the libraries on our GitHub and, you know, can inspect it and even can try to, you know, implement or execute some of it themselves in their own infrastructure. You know, we're committed to bringing our sort of latest and greatest to our cloud customers first for a couple of reasons. Number one, you know, there are big workloads and they have high expectations of us.
I think number two, it also gives us the opportunity to monitor a little bit more closely how it's working, how they're using it, like how the system itself is performing. >>And so just, you know, being careful, maybe a little cautious in terms of how big we go with this right away, just sort of both limits, you know, the risk of any issues that can come with new software rollouts. We haven't seen anything so far, but it also does give us the opportunity to have, like, meaningful conversations with a small group of users who are using the products. But once we get through that and they give us two thumbs up on it, it'll be like, open the gates and let everybody in. It's gonna be an exciting time for the whole ecosystem. >>Yeah, that makes a lot of sense. And you can do some experimentation, you know, using the cloud resources. Let's dig into some of the architectural and technical innovations that are gonna help deliver on this vision. What should we know there? >>Well, I mean, I think foundationally we built the new core on Rust. You know, this is a new, very sort of popular systems language. You know, it's extremely efficient, but it's also built for speed and memory safety, which goes back to us being able to, like, deliver it in a way that is, you know, something we can inspect very closely, but then also rely on the fact that it's going to behave well. And if it does find error conditions, I mean, we've loved working with Go and, you know, a lot of our libraries will continue to be sort of implemented in Go, but, you know, when it came to this particular new engine, you know, that power, performance, and stability of Rust was critical. On top of that, like, we've also integrated Apache Arrow and Apache Parquet for persistence.
I think for anybody who's really familiar with the nuts and bolts of our backend and our TSI and our time-structured merge trees, this is a big break from that: Arrow on the in-memory side and then Parquet on the on-disk side. It allows us to present a unified set of APIs for those really fast real-time inquiries that we talked about, as well as for very large historical bulk data archives in that Parquet format, which is also cool because there's an entire ecosystem popping up around Parquet in terms of the machine learning community. And to get that all to work, we had to glue it together with Arrow Flight. That's what we're using as our RPC component. It handles the orchestration and the transportation of the columnar data. Now we're moving to a true columnar database model for this version of the engine, and it removes a lot of overhead for us in terms of having to manage all that serialization and deserialization. And to that point again, blurring that line between real-time and historical data, it's highly optimized for both streaming micro-batch and then batches, but true streaming as well. >>Yeah. Again, it's funny you mention Rust. It's been around for a long time, but its popularity is really starting to hit that steep part of the S-curve. And we're gonna dig into more of that, but is there anything else that we should know about, Brian? Give us the last word. >>Well, first I'd like everybody watching just to take a look at what we're offering in terms of early access and beta programs. If you wanna participate or work in terms of early access with the new engine, please reach out to the team.
I'm sure there's a lot of communications going out, and it'll be highly featured on our website, but reach out to the team. Believe it or not, we have a lot more going on than just the new engine, and so there are also other programs, things we're offering to customers in terms of the user interface, data collection, and things like that. And if you're a customer of ours and you have a sales team, a commercial team that you work with, you can reach out to them and see what you can get access to, because we can flip a lot of stuff on, especially in cloud, through feature flags. But if there's something new that you wanna try out, we'd just love to hear from you. And then our goal would be that as we give you access to all of these new cool features, you would give us continuous feedback on these products and services, not only what you need today, but what you'll need tomorrow to build the next versions of your business. Because the whole database ecosystem, as it expands out into this vertically oriented stack of cloud services and enterprise databases and edge databases, it's gonna be what we all make it together, not just those of us who are employed by InfluxDB. And then finally I would just say, please watch Anais's and Tim's sessions. These are two of our best and brightest. They're totally brilliant, completely pragmatic, and they are most of all customer obsessed, which is amazing. And there are no better takes, honestly, on the technical details of this than theirs, especially when it comes to the value that these investments will bring to our customers and our communities. So I encourage you to pay more attention to them than you did to me, for sure. >>Brian Gilmore, great stuff. Really appreciate your time. Thank you. >>Yeah, thanks Dave. It was awesome.
Look forward to it. >>Yeah, me too. Looking forward to seeing how the community actually applies these new innovations and goes beyond just the historical into the real time, a really hot area, as Brian said. In a moment, I'll be right back with Anais Dotis-Georgiou to dig into the critical aspects of key open source components of the InfluxDB engine, including Rust, Arrow, Parquet, and DataFusion. Keep it right there. You don't wanna miss this. >>Time series data is everywhere. The number of sensors, systems, and applications generating time series data increases every day. All these data sources producing so much data can cause analysis paralysis. InfluxDB is an entire platform designed with everything you need to quickly build applications that generate value from time series data. InfluxDB Cloud is a serverless solution, which means you don't need to buy or manage your own servers. There's no need to worry about provisioning because you only pay for what you use. InfluxDB Cloud is fully managed, so you get the newest features and enhancements as they're added to the platform's code base. It also means you can spend time building solutions and delivering value to your users instead of wasting time and effort managing something else. InfluxDB Cloud offers a range of security features to protect your data: multiple layers of redundancy ensure you don't lose any data, access controls ensure that only the people who should see your data can see it, and encryption protects your data at rest and in transit between any of our regions or cloud providers. InfluxDB uses a single API across the entire platform suite, so you can build on open source, deploy to the cloud, and then easily query data in the cloud, at the edge, or on prem using the same scripts. And InfluxDB is schemaless, automatically adjusting to changes in the shape of your data without requiring changes in your application logic. InfluxDB Cloud is production ready from day one.
All it needs is your data and your imagination. Get started today at influxdata.com/cloud. >>Okay, we're back. I'm Dave Vellante with theCUBE, and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're gonna dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >>Hi, thank you so much. It's a pleasure to be here. >>Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency, it's gonna give you faster query speeds, and you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >>Sure, that's a great question. So some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also wanna have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, which is super useful.
Also broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even pandas in the future. >>Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns. You've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. The adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >>Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. So while Rust is syntactically similar to C++ and has similar performance, and it also compiles to native code like C++, unlike C++ it also has much better memory safety. So memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks. And Rust achieves this memory safety due to its innovative type system. Additionally, it doesn't allow for dangling pointers, and dangling pointers are among the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and this control over memory. And also Rust's packaging system, called crates.io, offers everything that you need out of the box to have features like async and await to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well.
So essentially it has all the control, all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high-cardinality use cases. >>Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you see things like, in the old days and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I wanna talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >>Sure, yeah. So Arrow is a framework for defining in-memory columnar data, and so much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. And I will, if you don't mind, take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our stove. And in our table we have those two temperature values, as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have two rows with the two temperature values for both our room and the stove. Well, usually our room temperature is regulated, so those values don't change very often. So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case and you're just taking temperature values from the room, and a lot of those temperature values are the same, then you might be able to imagine how equal values will neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you wanna find the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. So if you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the stove. You'd have to go across every tag value that maybe describes where the room is located or what model the stove is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework, so that's where a lot of the advantages come from. >>Okay. So you basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format. Versus what you're talking about, which is really kind of native, is it not as effective? Is the former not as effective because it's largely a bolt-on? Can you elucidate on that front?
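The column-versus-row comparison above can be made concrete with a minimal sketch in plain Python. This is only an illustration of the idea, not Arrow's actual memory layout or the compression IOx uses; the sample readings and the helper names `to_columns` and `run_length_encode` are assumptions made up for this example.

```python
from itertools import groupby

# Hypothetical readings: (timestamp, room tag, room_temp, stove_temp).
rows = [
    (1, "kitchen", 70.0, 150.0),
    (2, "kitchen", 70.0, 180.0),
    (3, "kitchen", 70.0, 210.0),
    (4, "kitchen", 71.0, 205.0),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (illustrative helper)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows, ["time", "room", "room_temp", "stove_temp"])

# Equal neighbors in a column collapse into very few pairs.
room_temp_rle = run_length_encode(columns["room_temp"])

# A min/max query touches only the one column it needs.
room_min = min(columns["room_temp"])
room_max = max(columns["room_temp"])

# The row-oriented equivalent walks every full row to pull out one field.
row_min = min(row[2] for row in rows)
```

With a regulated room temperature, the `room_temp` column is mostly one repeated value, so the run-length form stays small no matter how many points arrive, while the row-oriented scan still has to step over the timestamp, tag, and stove reading of every row.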
>>Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly. And so those are pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >>Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >>Sure. So it's an extensible query execution framework and it uses Arrow as its in-memory format. So the way that it helps in InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the query processing and transformation of that data. It also has a pandas API so that you can take advantage of pandas DataFrames as well, and all of the machine learning tools associated with pandas. >>Okay. You're also leveraging Parquet in the platform, 'cause we heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >>Sure. So Parquet is the column-oriented durable file format. So it's important because it'll enable bulk import and bulk export. It has compatibility with Python and pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so that's essentially a lot of the benefits of Parquet. >>Got it. Very popular. So Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >>Sure.
So InfluxDB first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying with Flux. Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here, the long-term strategy, is that the more you contribute to and build up these upstream projects, the more you will perpetuate that cycle of improvement, and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >>Yeah. Got it. You've got that virtuous cycle going, what people call the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >>So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. And if you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, or you have questions and just wanna learn more, then I would encourage you to go to the monthly tech talks and community office hours. They are on every second Wednesday of the month at 8:30 AM Pacific time. There are also community forums and a community Slack channel. Look for the influxdb_iox channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I wanna answer your questions.
So if there's a particular technology or stack that you wanna dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you. >>Yeah, that's awesome. You guys have a really rich community: collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >>Thank you. I really appreciate it. >>All right, you're very welcome. Okay, stay right there, and in a moment I'll be back with Tim Yocum. He's the director of engineering for InfluxData, and we're gonna talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't wanna miss this. >>I'm really glad that we went with InfluxDB Cloud for our hosting because it has saved us a ton of time. It's helped us move faster, it's saved us money, and also InfluxDB has good support. My name's Alex Nauda. I am CTO at Nobl9. Nobl9 is a platform to measure and manage service level objectives, which is a great way of measuring the reliability of your systems. You can essentially think of an SLO, the product we're providing to our customers, as a bunch of time series. So we need a way to store that data and the corresponding time series that are related to those. The main reason that we settled on InfluxDB as we were shopping around is that InfluxDB has a very flexible query language, and as a general-purpose time series database, it basically had the set of features we were looking for. >>As our platform has grown, we found InfluxDB Cloud to be a really scalable solution. We can quickly iterate on new features and functionality. Because Influx Cloud is entirely managed, it probably saved us at least a full additional person on our team.
We also have the option of running InfluxDB Enterprise, which gives us the ability to even host off the cloud or in a private cloud if that's preferred by a customer. InfluxData has been really flexible in adapting to the hosting requirements that we have. They listened to the challenges we were facing and they helped us solve them. As we've continued to grow, I'm really happy we have InfluxData by our side. >>Okay, we're back with Tim Yocum, who is the director of engineering at InfluxData. Tim, welcome. Good to see you. >>Good to see you. Thanks for having me. >>You're really welcome. Listen, we've been covering open source software on theCUBE for more than a decade, and we've kind of watched the innovation from the big data ecosystem. The cloud has been built out on open source, as have mobile, social platforms, and key databases, and of course InfluxDB. And InfluxData has been a big consumer and contributor of open source software. So my question to you is, where have you seen the biggest bang for the buck from open source software? >>So yeah, Influx really thrives at the intersection of commercial services and open source software. So OSS keeps us on the cutting edge. We benefit from OSS in delivering our own service, from our core storage engine technologies to web services and templating engines. Our team stays lean and focused because we build on proven tools. We really build on the shoulders of giants, and like you've mentioned, even better, we contribute a lot back to the projects that we use, as well as our own product, InfluxDB. >>But I gotta ask you, Tim, because one of the challenges that we've seen, in particular you saw this in the heyday of Hadoop, is that the innovations come so fast and furious, and as a software company you gotta place bets, you gotta commit people, and sometimes those bets can be risky and not pay off well. How have you managed this challenge? >>Oh, it moves fast.
Yeah, that's a benefit though, because the community moves so quickly that today's hot technology can be tomorrow's dinosaur. And what we tend to do is we fail fast and fail often. We try a lot of things. You look at Kubernetes, for example: that ecosystem is driven by thousands of intelligent developers, engineers, and builders, and they're adding value every day. So we have to really keep up with that. And as the stack changes, we try different technologies, we try different methods, and at the end of the day we come up with a better platform as a result of just the constant change in the environment. It is a challenge for us, but it's something that we just do every day. >>So we have a survey partner down in New York City called Enterprise Technology Research, ETR, and they do these quarterly surveys of about 1,500 CIOs and IT practitioners, and they really have a good pulse on what's happening with spending. And the data shows that containers generally, but specifically Kubernetes, is one of the areas that has been off the charts and seen the most significant adoption and velocity, particularly along with cloud. But really Kubernetes is still up and to the right consistently, even with the macro headwinds and all of the stuff that we're sick of talking about. So what are you doing with Kubernetes in the platform? >>Yeah, it's really central to our ability to run the product. When we first started out, we were just on AWS, and the way we were running was a little bit like containers junior. Now we're running Kubernetes everywhere: AWS, Azure, Google Cloud.
It allows us to have a consistent experience across three different cloud providers, and we can manage that in code, so our developers can focus on delivering services, not trying to learn the intricacies of Amazon, Azure, and Google and figure out how to deliver services on those three clouds with all of their differences. >>Just to follow up on that, so I presume it sounds like there's a PaaS layer there to allow you guys to have a consistent experience across clouds and out to the edge, wherever. Is that correct? >>Yeah, so we've basically built, more or less, platform engineering. This is the new hot phrase. Kubernetes has made a lot of things easy for us because we've built a platform that our developers can lean on, and they only have to learn one way of deploying their application and managing their application. And so that just gets all of the underlying infrastructure out of the way and lets them focus on delivering Influx Cloud. >>Yeah, and I know I'm taking a little bit of a tangent, but I'll call it a PaaS layer, if I can use that term. Are there specific attributes to InfluxDB, or is it kind of just generally off-the-shelf PaaS? Is there any purpose-built capability there that is value add, or is it pretty much generic? >>So we really look at things through a build-versus-buy lens. Some things we want to leverage cloud provider services for, for instance Postgres databases for metadata. Perhaps we'll get that off of our plate and let someone else run that. We're going to deploy a platform that our engineers can deliver on, that has consistency, that is all generated from code, that we can manage as an SRE group, as an ops team, with very few people really, and we can stamp out clusters across multiple regions in no time.
>>So sometimes you build, sometimes you buy. How do you make those decisions, and what does that mean for the platform and for customers? >>Yeah, so what we're doing is, like everybody else will do, we're looking for trade-offs that make sense. We really want to protect our customers' data, so we look for services that support our own software with the most uptime, reliability, and durability we can get. Some things are just going to be easier to have a cloud provider take care of on our behalf. We make that transparent for our own team, and of course customers don't even see that. But we don't want to try to reinvent the wheel. Like I had mentioned with SQL data stores for metadata, perhaps let's build on top of what these three large cloud providers have already perfected. And we can then focus on our platform engineering, and we can have our developers then focus on the InfluxData software, the Influx Cloud software. >>So take it to the customer level. What does it mean for them? What's the value that they're gonna get out of all these innovations that we've been talking about today, and what can they expect in the future? >>So first of all, people who use the OSS product are really gonna be at home on our cloud platform. You can run it on your desktop machine, on a single server, what have you, but then you want to scale up. We have some 270 terabytes of data across over 4 billion series keys that people have stored, so there's a proven ability to scale. Now, in terms of the open source software and how we've developed the platform, you're getting a highly available, high-cardinality time series platform. We manage it, and really, as I mentioned earlier, we can keep up with the state of the art. We keep reinventing, we keep deploying things in real time. We deploy to our platform every day, repeatedly, all the time.
And it's that continuous deployment that allows us to continue testing things in flight, rolling things out that change: new features, better ways of doing deployments, safer ways of doing deployments. All of that happens behind the scenes. And like we had mentioned earlier, Kubernetes, I mean, that allows us to get that done. We couldn't do it without having that platform as a base layer for us to then put our software on. So we iterate quickly. When you're on the Influx Cloud platform, you really are able to take advantage of new features immediately. We roll things out every day, and as those things go into production, you have the ability to use them. And so in the end we want you to focus on getting actual insights from your data instead of running infrastructure. Let us do that for you. >>And that makes sense, but are the innovations that we're talking about in the evolution of InfluxDB, do you see that as sort of a natural evolution for existing customers? I'm sure the answer is both, but is it opening up new territory for customers? Can you add some color to that? >>Yeah, it really is a little bit of both. Any engineer will say, well, it depends. So cloud native technologies are really the hot thing. IoT, industrial IoT especially: people want to just shove tons of data out there and be able to do queries immediately, and they don't wanna manage infrastructure. What we've started to see are people that use the cloud service as their data store backbone, and then they use edge computing with our OSS product to ingest data from, say, multiple production lines and downsample that data, and send the rest of that data off to Influx Cloud, where the heavy processing takes place.
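The edge pattern described above, downsampling locally before sending data on to the cloud, can be sketched roughly as follows. This is plain Python for illustration only, not InfluxDB's API; the `downsample` function, the sample points, and the 60-second window are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed windows keyed by window start."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical per-second readings from a single production line.
raw = [(0, 10.0), (1, 12.0), (2, 14.0), (60, 20.0), (61, 22.0)]

# One averaged point per minute is what would be forwarded upstream.
summary = downsample(raw, window_seconds=60)
```

Here five raw points shrink to two, one per minute, so the edge node forwards far fewer points than it ingests and the heavier analysis can run on the aggregated series.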
So really, us being in all the different clouds and iterating on that, and being in all sorts of different regions, allows people to really get out of the business of trying to manage that big data and have us take care of that. And of course, as we change the platform, end users benefit from that immediately. >>And so obviously you're taking away a lot of the heavy lifting for the infrastructure. Would you say the same thing about security, especially as you go out to IoT and the edge? How should we be thinking about the value that you bring from a security perspective? >>Yeah, we take security super seriously. It's built into our DNA. We do a lot of work to ensure that our platform is secure and that the data we store is kept private. It's of course always a concern. You see in the news all the time companies being compromised. That's something that you can have an entire team working on, which we do, to make sure that the data that you have, whether it's in transit or at rest, is always kept secure and is only viewable by you. You look at things like software bills of materials: if you're running this yourself, you have to go vet all sorts of different pieces of software. And we do that as we use new tools. That's just part of our jobs, to make sure that the platform that we're running has fully vetted software, and with open source especially, that's a lot of work. And so it's definitely new territory. Supply chain attacks are definitely happening at a higher clip than they used to, but that is really just part of a day in the life for folks like us that are building platforms. >>Yeah, and that's key. I mean, especially when you start getting into, you know, we talk about IoT and the operations technologies, the engineers running that infrastructure, historically, as you know, Tim, they would air gap everything.
That's how they kept it safe. But that's not feasible anymore. Everything's connected now, right? And so you've gotta have a partner that, again, takes away that heavy lifting and R&D so you can focus on some of the other activities. Right. Give us the last word and the key takeaways from your perspective. >>Well, from my perspective, I see it as a two-lane approach with Influx, with any time series data. You've got a lot of stuff that you're gonna run on-prem. What you had mentioned, air gapping, sure, there's plenty of need for that. But at the end of the day, for people that don't want to run big data centers, people that want to trust their data to a company that's got a full platform set up for them that they can build on, send that data over to the cloud. The cloud is not going away. I think a more hybrid approach is where the future lives, and that's what we're prepared for. >>Tim, really appreciate you coming to the program. Great stuff. Good to see you. >>Thanks very much. Appreciate it. >>Okay, in a moment I'll be back to wrap up today's session. You're watching theCUBE. >>Are you looking for some help getting started with InfluxDB, Telegraf, or Flux? Check out InfluxDB University, where you can find our entire catalog of free training that will help you make the most of your time series data. Get started for free at influxdbu.com. >>We'll see you in class. >>Okay, so we heard today from three experts on time series and data how the InfluxDB platform is evolving to support new ways of analyzing large data sets very efficiently and effectively in real time. And we learned that key open source components like Apache Arrow, the Rust programming environment, DataFusion, and Parquet are being leveraged to support real-time data analytics at scale.
We also learned about the contributions and importance of open source software and how the InfluxDB community is evolving the platform with minimal disruption to support new workloads, new use cases, and the future of real-time data analytics. Now remember, these sessions are all available on demand. You can go to thecube.net to find those. Don't forget to check out siliconangle.com for all the news related to things enterprise and emerging tech. And you should also check out influxdata.com. There you can learn about the company's products. You'll find developer resources like free courses. You could join the developer community and work with your peers to learn and solve problems. And there are plenty of other resources around use cases and customer stories on the website. This is Dave Vellante. Thank you for watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData and brought to you by theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Oct 28 2022


ENTITIES

Entity	Category	Confidence
Brian Gilmore	PERSON	0.99+
Tim Yoakum	PERSON	0.99+
Brian	PERSON	0.99+
Dave	PERSON	0.99+
Tim Yokum	PERSON	0.99+
Dave Valante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Tim	PERSON	0.99+
Google	ORGANIZATION	0.99+
16 times	QUANTITY	0.99+
two rows	QUANTITY	0.99+
New York City	LOCATION	0.99+
60,000 people	QUANTITY	0.99+
Rust	TITLE	0.99+
Influx	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
today	DATE	0.99+
Influx Data	ORGANIZATION	0.99+
Python	TITLE	0.99+
three experts	QUANTITY	0.99+
InfluxDB	TITLE	0.99+
both	QUANTITY	0.99+
each row	QUANTITY	0.99+
two lane	QUANTITY	0.99+
Today	DATE	0.99+
Noble nine	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Flux	ORGANIZATION	0.99+
Influx DB	TITLE	0.99+
each column	QUANTITY	0.99+
270 terabytes	QUANTITY	0.99+
cube.net	OTHER	0.99+
twice	QUANTITY	0.99+
Bryan	PERSON	0.99+
Pandas	TITLE	0.99+
c plus plus	TITLE	0.99+
three years ago	DATE	0.99+
two	QUANTITY	0.99+
more than a decade	QUANTITY	0.98+
Apache	ORGANIZATION	0.98+
dozens	QUANTITY	0.98+
free@influxdbu.com	OTHER	0.98+
30,000 feet	QUANTITY	0.98+
Rust Foundation	ORGANIZATION	0.98+
two temperature values	QUANTITY	0.98+
In Flux Data	ORGANIZATION	0.98+
one time stamp	QUANTITY	0.98+
tomorrow	DATE	0.98+
Russ	PERSON	0.98+
IOT	ORGANIZATION	0.98+
Evolving InfluxDB	TITLE	0.98+
first	QUANTITY	0.97+
Influx data	ORGANIZATION	0.97+
one	QUANTITY	0.97+
first one	QUANTITY	0.97+
Influx DB University	ORGANIZATION	0.97+
SQL	TITLE	0.97+
The Cube	TITLE	0.96+
Influx DB Cloud	TITLE	0.96+
single server	QUANTITY	0.96+
Kubernetes	TITLE	0.96+

Anais Dotis Georgiou, InfluxData

(upbeat music) >> Okay, we're back. I'm Dave Vellante with theCUBE, and you're watching Evolving InfluxDB into the Smart Data Platform, made possible by InfluxData. Anais Dotis-Georgiou is here. She's a developer advocate for InfluxData, and we're going to dig into the rationale and value contribution behind several open source technologies that InfluxDB is leveraging to increase the granularity of time series analysis and bring the world of data into real-time analytics. Anais, welcome to the program. Thanks for coming on. >> Hi, thank you so much. It's a pleasure to be here. >> Oh, you're very welcome. Okay, so IOx is being touted as this next-gen open source core for InfluxDB. And my understanding is that it leverages in-memory, of course, for speed. It's a columnar store, so it gives you compression efficiency and it's going to give you faster query speeds. It's going to let you store files in object storage, so you've got a very cost-effective approach. Are these the salient points on the platform? I know there are probably dozens of other features, but what are the high-level value points that people should understand? >> Sure, that's a great question. So, some of the main requirements that IOx is trying to achieve, and some of the most impressive ones to me: the first one is that it aims to have no limits on cardinality and also to allow you to write any kind of event data that you want, whether that's a tag or a field. It also wants to deliver best-in-class performance on analytics queries, in addition to our already well-served metrics queries. We also want to have operator control over memory usage, so you should be able to define how much memory is used for buffering, caching, and query processing. Another really important part is the ability to have bulk data export and import, which is super useful.
Also, broader ecosystem compatibility: where possible, we aim to use and embrace emerging standards in the data analytics ecosystem and have compatibility with things like SQL, Python, and maybe even Pandas in the future. >> Okay, so a lot there. Now, we talked to Brian about how you're using Rust, which is not a new programming language, and of course we had some drama around Rust during the pandemic with the Mozilla layoffs, but the formation of the Rust Foundation really addressed any of those concerns, and you've got big guns like Amazon and Google and Microsoft throwing their collective weight behind it. Its adoption is really starting to get steep on the S-curve. So lots of platforms, lots of adoption with Rust, but why Rust as an alternative to, say, C++, for example? >> Sure, that's a great question. So Rust was chosen because of its exceptional performance and reliability. Rust is syntactically similar to C++ and has similar performance, and it also compiles to native code like C++. But unlike C++, it also has much better memory safety. Memory safety is protection against bugs or security vulnerabilities that lead to excessive memory usage or memory leaks, and Rust achieves this memory safety through its innovative type system. Additionally, it doesn't allow dangling pointers, and dangling pointers are one of the main classes of errors that lead to exploitable security vulnerabilities in languages like C++. So Rust helps meet that requirement of having no limits on cardinality, for example, because we're also using the Rust implementation of Apache Arrow, and this gives us control over memory. Also, Rust's packaging system, crates.io, offers everything that you need out of the box to have features like async and await, to fix race conditions, to protect against buffer overflows, and to ensure thread-safe async caching structures as well.
So essentially, it just has all the fine-grained control you need to take advantage of memory and all your resources as well as possible, so that you can handle those really, really high-cardinality use cases. >> Yeah, and the more I learn about the new engine and the platform, IOx, et cetera, you see things like, in the old days and even today, you do a lot of garbage collection in these systems, and there's an inverse impact relative to performance. So it looks like the community is really modernizing the platform. But I want to talk about Apache Arrow for a moment. It's designed to address the constraints that are associated with analyzing large data sets. We know that, but please explain: what is Arrow and what does it bring to InfluxDB? >> Sure. So Arrow is a framework for defining in-memory columnar data, and much of the efficiency and performance of IOx comes from taking advantage of columnar data structures. If you don't mind, I'll take a moment to illustrate why columnar data structures are so valuable. Let's pretend that we are gathering field data about the temperature in our room and also maybe the temperature of our store. In our table we have those two temperature values, as well as maybe a measurement value, a timestamp value, and maybe some other tag values that describe what room and what house, et cetera, we're getting this data from. And so you can picture this table where we have, like, two rows with the two temperature values for both our room and the store. Well, usually our room temperature is regulated, so those values don't change very often. So when you have column-oriented storage, essentially you take each column and group it together.
And so if that's the case, and you're just taking temperature values from the room and a lot of those temperature values are the same, then you might be able to imagine how equal values will neighbor each other, and when they neighbor each other in the storage format, this provides a really perfect opportunity for cheap compression. And then this cheap compression enables high-cardinality use cases. It also enables faster scan rates. So if you want to find, like, the min and max value of the temperature in the room across a thousand different points, you only have to get those thousand different points in order to answer that question, and you have those immediately available to you. But let's contrast this with a row-oriented storage solution instead, so that we can better understand the benefits of column-oriented storage. If you had row-oriented storage, you'd first have to look at every field, like the temperature in the room and the temperature of the store. You'd have to go across every tag value that maybe describes where the room is located or what model the store is, and every timestamp. You'd then have to pluck out that one temperature value that you want at that one timestamp, and do that for every single row. So you're scanning across a ton more data, and that's why row-oriented storage doesn't provide the same efficiency as columnar. And Apache Arrow is an in-memory columnar data framework, so that's where a lot of the advantages come from. >> Okay. So you've basically described a traditional database, a row approach, but I've seen a lot of traditional databases say, okay, now we can handle columnar format. Versus what you're talking about, which is really kind of native, is the former not as effective because it's largely a bolt-on? Can you elucidate on that front? >> Yeah, it's not as effective because you have more expensive compression and because you can't scan across the values as quickly.
And so those are pretty much the main reasons why row-oriented storage isn't as efficient as column-oriented storage. >> Yeah. Got it. So let's talk about Arrow DataFusion. What is DataFusion? I know it's written in Rust, but what does it bring to the table here? >> Sure. So it's an extensible query execution framework, and it uses Arrow as its in-memory format. The way that it helps InfluxDB IOx is that, okay, it's great if you can write an unlimited amount of cardinality into InfluxDB, but if you don't have a query engine that can successfully query that data, then I don't know how much value it is for you. So DataFusion helps enable the query processing and transformation of that data. It also has a Pandas API so that you can take advantage of Pandas DataFrames as well, and all of the machine learning tools associated with Pandas. >> Okay. You're also leveraging Parquet in the platform, of course. We heard a lot about Parquet in the middle of the last decade as a storage format to improve on Hadoop column stores. What are you doing with Parquet and why is it important? >> Sure. So Parquet is the column-oriented durable file format. It's important because it'll enable bulk import and bulk export. It has compatibility with Python and Pandas, so it supports a broader ecosystem. Parquet files also take very little disk space and they're faster to scan because, again, they're column-oriented. In particular, I think Parquet files are like 16 times cheaper than CSV files, just as a point of reference. And so those are essentially a lot of the benefits of Parquet. >> Got it. Very popular. So Anais, what exactly is InfluxData focusing on as a committer to these projects? What is your focus? What's the value that you're bringing to the community? >> Sure. So InfluxData first has contributed a lot of different things to the Apache ecosystem. For example, they contributed an implementation of Apache Arrow in Go, and that will support querying Influx.
Also, there have been quite a few contributions to DataFusion for things like memory optimization and support for additional SQL features, like support for timestamp arithmetic, support for EXISTS clauses, and support for memory control. So yeah, Influx has contributed a lot to the Apache ecosystem and continues to do so. And I think the idea here is that if you can improve these upstream projects, then the long-term strategy is that the more you contribute and build those up, the more you will perpetuate that cycle of improvement and the more we will invest in our own project as well. So it's just that kind of symbiotic relationship and appreciation of the open source community. >> Yeah. Got it. You've got that virtuous cycle going; people call it the flywheel. Give us your last thoughts and kind of summarize what the big takeaways are from your perspective. >> So I think the big takeaway is that InfluxData is doing a lot of really exciting things with InfluxDB IOx. If you are interested in learning more about the technologies that Influx is leveraging to produce IOx, the challenges associated with it, and all of the hard work, and you just want to learn more, then I would encourage you to go to the monthly tech talks and community office hours, which are on the second Wednesday of every month at 8:30 AM Pacific time. There are also community forums and a community Slack channel. Look for the InfluxDB underscore IOx channel specifically to learn more about how to join those office hours and those monthly tech talks, as well as to ask any questions you have about IOx, what to expect, and what you'd like to learn more about. As a developer advocate, I want to answer your questions. So if there's a particular technology or stack that you want to dive deeper into and want more explanation about how InfluxDB leverages it to build IOx, I will be really excited to produce content on that topic for you.
>> Yeah, that's awesome. You guys have a really rich community: collaborate with your peers, solve problems, and you guys are super responsive, so really appreciate that. All right, thank you so much, Anais, for explaining all this open source stuff to the audience and why it's important to the future of data. >> Thank you. I really appreciate it. >> All right, you're very welcome. Okay, stay right there, and in a moment I'll be back with Tim Yoakam. He's the director of engineering for InfluxData, and we're going to talk about how you update a SaaS engine while the plane is flying at 30,000 feet. You don't want to miss this. (upbeat music)
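
As a rough illustration of the row-versus-column comparison in the interview above, the following Python sketch pivots a few made-up readings into columns, run-length encodes the slowly changing room temperature, and finds its min and max by reading only that one column. The sample data and the helper names `to_columns` and `run_length_encode` are assumptions made for illustration; this is not how Arrow, Parquet, or IOx are actually implemented, only a small model of why equal neighboring values compress cheaply and why a single-column scan touches less data.

```python
from itertools import groupby

# Hypothetical readings in row-oriented form:
# (timestamp, room tag, room temperature, store temperature)
rows = [
    (1, "kitchen", 21.0, 180.0),
    (2, "kitchen", 21.0, 182.5),
    (3, "kitchen", 21.0, 185.0),
    (4, "kitchen", 21.5, 184.0),
    (5, "kitchen", 21.5, 183.0),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal neighboring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, ["time", "room", "room_temp", "store_temp"])

# Equal values sit next to each other in the column, so they compress cheaply.
room_temp = columns["room_temp"]
encoded = run_length_encode(room_temp)

# Min and max need only this one column, not every field of every row.
lowest, highest = min(room_temp), max(room_temp)

print(encoded)
print(lowest, highest)
```

In the row-oriented list, answering the same min/max question means walking every tuple and plucking out the third field each time, which is the extra scanning described in the interview.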

Published Date : Oct 18 2022


ENTITIES

Entity	Category	Confidence
Tim Yoakam	PERSON	0.99+
Brian	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Anais	PERSON	0.99+
two rows	QUANTITY	0.99+
16 times	QUANTITY	0.99+
Influx Data	ORGANIZATION	0.99+
each row	QUANTITY	0.99+
Python	TITLE	0.99+
Rust	TITLE	0.99+
C++	TITLE	0.99+
SQL	TITLE	0.99+
Anais Dotis Georgiou	PERSON	0.99+
InfluxDB	TITLE	0.99+
both	QUANTITY	0.99+
Rust Foundation	ORGANIZATION	0.99+
30,000 feet	QUANTITY	0.99+
first one	QUANTITY	0.99+
Mozilla	ORGANIZATION	0.99+
Pandas	TITLE	0.98+
InfluxData	ORGANIZATION	0.98+
Influx	ORGANIZATION	0.98+
IOx	TITLE	0.98+
each column	QUANTITY	0.97+
one time stamp	QUANTITY	0.97+
first	QUANTITY	0.97+
Influx	TITLE	0.96+
Anais Dotis-Georgiou	PERSON	0.95+
Crates IO	TITLE	0.94+
IOx	ORGANIZATION	0.94+
two temperature values	QUANTITY	0.93+
Apache	ORGANIZATION	0.93+
today	DATE	0.93+
8:30 AM Pacific time	DATE	0.92+
Wednesday	DATE	0.91+
one temperature	QUANTITY	0.91+
two temperature values	QUANTITY	0.91+
InfluxDB IOx	TITLE	0.9+
influx	ORGANIZATION	0.89+
last decade	DATE	0.88+
single row	QUANTITY	0.83+
a ton more data	QUANTITY	0.81+
thousand	QUANTITY	0.8+
dozens of other features	QUANTITY	0.8+
a thousand different points	QUANTITY	0.79+
Hadoop	TITLE	0.77+
Par-K	TITLE	0.76+
points	QUANTITY	0.75+
each	QUANTITY	0.75+
Slack	TITLE	0.74+
Evolving InfluxDB	TITLE	0.68+
kilometer	QUANTITY	0.67+
Arrow	TITLE	0.62+
The Cube	ORGANIZATION	0.61+
