Ajay Vohora and Lester Waters, Io-Tahoe | Io-Tahoe Adaptive Data Governance
>> Narrator: From around the globe, it's "theCUBE" presenting Adaptive Data Governance, brought to you by Io-Tahoe.

>> And we're back with the Data Automation series. In this episode we're going to learn more about what Io-Tahoe is doing in the field of adaptive data governance, how it can help achieve business outcomes and mitigate data security risks. I'm Lisa Martin and I'm joined by Ajay Vohora, the CEO of Io-Tahoe, and Lester Waters, the CTO of Io-Tahoe. Gentlemen, it's great to have you on the program.

>> Thank you Lisa, it's good to be back.

>> Great to see you Lisa.

>> Likewise. (indistinct) Lester, we're going to start with you: what's going on at Io-Tahoe, what's new?

>> Well, I've been with Io-Tahoe for a little over a year, and one thing I've learned is every customer's needs are just a bit different. So we've been working on our next major release of the Io-Tahoe product to really try to address these customer concerns, because we want to be flexible enough to come in and not just profile the data, and not just understand data quality and lineage, but also to address the unique needs of each and every customer that we have. And so that required a platform rewrite of our product, so that we could extend the product without building a new version of it; we wanted to be able to have pluggable modules. We're also focused a lot on performance. That's very important with the bulk of data that we deal with, and we're able to pass through that data in a single pass and do the analytics that are needed, whether it's lineage, data quality, or just identifying the underlying data. And we're incorporating all that we've learned: we're tuning up our machine learning, we're analyzing on more dimensions than we've ever done before, and we're able to do data quality without writing an initial regex, for example, just out of the box. So all of these things are coming together to form the next version of our product, and we're really excited about it.

>> Sounds exciting. Ajay, from the CEO's level, what's going on?

>> Well, just building on what Lester mentioned, we're growing pretty quickly with our partners, and today, here with Oracle, we're excited to explain how that's shaping up: lots of collaboration already with Oracle, in government, in insurance, and in banking. And we're excited because we get to have an impact. It's really satisfying to see how we're able to help businesses transform and redefine what's possible with their data. And having Oracle there as a partner to lean in with is definitely helping.

>> Excellent, we're going to dig into that a little bit later. Lester, let's go back over to you: explain adaptive data governance, help us understand that.

>> Really, adaptive data governance is about achieving business outcomes through automation. It's also about establishing a data-driven culture and pushing what's traditionally managed in IT out to the business. And to do that, you've got to enable an environment where people can actually access and look at the information about the data, not necessarily access the underlying data itself, because we've got privacy concerns, but they need to understand what kind of data they have, what shape it's in, what's dependent on it upstream and downstream, so that they can make their educated decisions on what they need to do to achieve those business outcomes.
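(Aside for technical readers: the pluggable, single-pass design Lester described a moment ago could be sketched in Python along the lines below. This is a minimal illustration of the pattern only, not Io-Tahoe's actual implementation; every class and function name here is invented. The idea is that many analytic modules are fed from one scan of the data, instead of one scan per module.)

```python
from abc import ABC, abstractmethod

class Analyzer(ABC):
    """One pluggable analytic module, fed during a single pass over the data."""

    @abstractmethod
    def observe(self, row: dict) -> None: ...

    @abstractmethod
    def report(self) -> dict: ...

class NullRateAnalyzer(Analyzer):
    """Example module: tracks how often each column is empty."""

    def __init__(self):
        self.total = 0
        self.nulls = {}

    def observe(self, row):
        self.total += 1
        for col, val in row.items():
            if val in (None, ""):
                self.nulls[col] = self.nulls.get(col, 0) + 1

    def report(self):
        return {col: n / self.total for col, n in self.nulls.items()}

def single_pass(rows, analyzers):
    """Feed every registered module from one scan, instead of one scan each."""
    for row in rows:
        for a in analyzers:
            a.observe(row)
    return {type(a).__name__: a.report() for a in analyzers}

rows = [{"name": "Ann", "ssn": "123-45-6789"}, {"name": "", "ssn": None}]
print(single_pass(rows, [NullRateAnalyzer()]))
# {'NullRateAnalyzer': {'name': 0.5, 'ssn': 0.5}}
```

New analytics plug in as additional Analyzer subclasses without rereading the data, which is the point of the platform rewrite Lester mentions.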
A lot of frameworks these days are hardwired, so you can set up a set of business rules, and that set of business rules works for a very specific database and a specific schema. But imagine a world where you could just say, you know, (tapping) the start date of a loan must always be before the end date of a loan, and have that generic rule apply regardless of the underlying database, even when a new database comes online. That's what adaptive data governance is about. I like to think of it as the intersection of three circles: it's the technical metadata coming together with policies and rules, and coming together with the business ontologies that are unique to that particular business. Bringing this all together allows you to enable rapid change in your environment. So, it's a mouthful, adaptive data governance, but that's what it comes down to.

>> So Ajay, help me understand this: is this what enterprise companies are doing now, or are they not quite there yet?

>> Well, you know Lisa, every organization is going at its own pace, but markets are changing, and the speed at which some of the changes in the economy are happening is compelling more businesses to look at being more digital in how they serve their own customers. So what we're seeing is a number of trends here from heads of data, chief data officers, CIOs: stepping back from a one-size-fits-all approach, because they've tried that before and it just hasn't worked. They've spent millions of dollars on IT programs trying to drive value from that data, and they've ended up with large teams doing manual processing around data, trying to hardwire these policies to fit the context of each line of business, and that hasn't worked. So the trends that we're seeing emerge really relate to how do I, as a chief data officer, as a CIO, inject more automation into these common tasks. And we've been able to see that impact. I think the news here is: if you're trying to create a knowledge graph, a data catalog, or a business glossary, and you're trying to do that manually, well, stop, you don't have to do that manually anymore. The best example I can give is, Lester and I, we like Chinese food and Japanese food, and if you were sitting there with your chopsticks, you wouldn't eat a bowl of rice with the chopsticks one grain at a time; what you'd want to do is find a more productive way to enjoy that meal before it gets cold. And that's similar to how we're able to help organizations digest their data: get through it faster, and enjoy the benefits of putting that data to work.

>> And if it was me eating that food with you guys, I would not be using chopsticks, I would be using a fork and probably a spoon. So Lester, how then does Io-Tahoe go about doing this and enabling customers to achieve this?

>> Let me show you a little story here. If you take a look at the challenges that most customers have, they're very similar, but every customer is on a different data journey. It all starts with: what data do I have, what shape is that data in, how is it structured, what's dependent on it upstream and downstream, what insights can I derive from that data, and how can I answer all of those questions automatically? So if you look at the challenges for these data professionals, you know, they're either on a journey to the cloud, maybe they're doing a migration to Oracle, maybe they're doing some data governance changes, and it's about enabling this.
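(Aside for technical readers: to make the schema-independent loan-date rule Lester described concrete, here is a minimal Python sketch. It assumes profiling has already mapped semantic tags to physical columns; the tag names, column names, and function are all invented for illustration.)

```python
from datetime import date

def start_before_end(rows, tag_to_column):
    """Generic rule: the value tagged loan_start_date must precede loan_end_date.

    The rule is written against semantic tags, so it applies to any database
    whose columns have been profiled and tagged, whatever they are named.
    """
    start_col = tag_to_column["loan_start_date"]
    end_col = tag_to_column["loan_end_date"]
    return [i for i, row in enumerate(rows)
            if row[start_col] >= row[end_col]]

# Two systems, different physical column names, one rule.
legacy = [{"STRT_DT": date(2020, 1, 1), "END_DT": date(2021, 1, 1)},
          {"STRT_DT": date(2020, 6, 1), "END_DT": date(2020, 3, 1)}]
print(start_before_end(legacy, {"loan_start_date": "STRT_DT",
                                "loan_end_date": "END_DT"}))  # [1]
```

When a new database comes online, only the tag-to-column mapping changes; the rule itself is untouched, which is the adaptivity being described.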
So if you look at these challenges, I'm going to take you through a story here, and I want to introduce Amanda. Amanda is an analyst, like you'd find in any large organization. She is looking around and she just sees stacks of data: different databases, the ones she knows about, the ones she doesn't know about but should know about, various different kinds of databases. And Amanda is tasked with understanding all of this so that she can embark on her data journey program. So Amanda goes through and she's great, (snaps finger) "I've got some handy tools, I can start looking at these databases and getting an idea of what we've got." But when she digs into the databases, she starts to see that not everything is as clear as she might've hoped it would be. Column names are ambiguous, like Attribute one and Attribute two, or maybe Date one and Date two, so Amanda is starting to struggle. Even though she's got tools to visualize and look at these databases, she still knows she's got a long road ahead, and with 2,000 databases in her large enterprise, yes, it's going to be a long journey. But Amanda is smart, so she pulls out her trusty spreadsheet to track all of her findings, and for what she doesn't know about, she raises a ticket or maybe tries to track someone down in order to find what that data means. She's tracking all this information, but clearly this doesn't scale that well for Amanda. So maybe the organization will get 10 Amandas to sort of divide and conquer that work. But even that doesn't work that well, 'cause there are still ambiguities in the data.

With Io-Tahoe, what we do is we actually profile the underlying data. By looking at the underlying data, we can quickly see that Attribute one looks very much like a US social security number, and Attribute two looks like an ICD-10 medical code. And we do this by using ontologies, and dictionaries, and algorithms to help identify the underlying data and then tag it. Key to this automation is really being able to normalize things across different databases, so that where there are differences in column names, I know that in fact they contain the same data. And by going through this exercise with Io-Tahoe, not only can we identify the data, but we can also gain insights about the data. So for example, we can see that 97% of the time, that column named Attribute one that's got US social security numbers has something that looks like a social security number. But 3% of the time it doesn't quite look right: maybe there's a dash missing, maybe there's a digit dropped, or maybe there are even characters embedded in it. That may be indicative of a data quality issue, so we try to find those kinds of things.

Going a step further, we also try to identify data quality relationships. So for example, we have two columns, Date one and Date two. Through observation we can see that Date one, 99% of the time, is less than Date two; 1% of the time it's not, probably indicative of a data quality issue. But going a step further, we can also build a business rule that says Date one must actually be less than Date two, and so then when it pops up again, we can quickly identify and remediate that problem. So these are the kinds of things that we can do with Io-Tahoe. Going even a step further, we can take your favorite data science solution, productionize it, and incorporate it into our next version as what we call a worker process, to do your own bespoke analytics.
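(Aside for technical readers: a rough sketch of the kind of pattern-based profiling Lester walks through, assuming a dictionary of candidate patterns. The names and the 90% threshold are invented for illustration and are not Io-Tahoe's algorithm.)

```python
import re

SSN = re.compile(r"^\d{3}-\d{2}-\d{4}$")  # one candidate pattern from a dictionary

def profile_column(values, pattern=SSN, threshold=0.9):
    """Tag a column when most values fit a known pattern; flag the outliers."""
    good = [v for v in values if pattern.match(str(v))]
    rate = len(good) / len(values)
    suspects = [v for v in values if not pattern.match(str(v))]
    tag = "US_SSN" if rate >= threshold else None
    return tag, rate, suspects

# Like Lester's example: mostly well-formed values, a few that aren't.
col = ["123-45-6789"] * 97 + ["123456789", "123-45-67a9", "23-45-6789"]
tag, rate, suspects = profile_column(col)
print(tag, rate, suspects)
# US_SSN 0.97 ['123456789', '123-45-67a9', '23-45-6789']
```

Run against a column that is 97% well-formed, it reproduces the outcome Lester describes: the column is tagged, and the 3% of suspect values are surfaced as candidate data quality issues.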
>> Bespoke analytics, excellent, Lester thank you. So Ajay, talk us through some examples of where you're putting this to use, and also what is some of the feedback from customers?

>> Yeah, to bring that to life a little bit, Lisa, let's just talk through a case study. We put something together, I know it's available for download, but in a well-known telecommunications media company, they have a lot of the issues that Lester just spoke about: lots of teams of Amandas, super bright data practitioners, who are maybe looking to get more productivity out of their day and deliver a good result for their own customers, for cell phone subscribers and broadband users. One of the examples we can see here is how we went about auto-generating a lot of that understanding of their data within hours. So Amanda had her data catalog populated automatically, a business glossary built up, and from there she might start to say, "Okay, where do I want to apply some policies to the data to set in place some controls? Do I want to adapt how different lines of business, maybe tasks versus customer operations, have different access or permissions to that data?" And what we've been able to do is build up that picture to see how data moves across the entire organization, across the estate, and monitor that over time for improvement. So we've taken it from being reactive, "let's do something to fix something," to now being more proactive. We can see what's happening with our data: who's using it, who's accessing it, how it's being used, how it's being combined. And from there, taking a proactive approach is a real smart use of the talents in that telco organization and the folks that work there with data.

>> Okay Ajay, so digging into that a little bit deeper: one of the things I was thinking when you were talking through some of those outcomes that you're helping customers achieve is ROI. How do customers measure ROI? What are they seeing with the Io-Tahoe solution?

>> Yeah, right now the big-ticket item is time to value. And I think in data, a lot of the upfront investment costs are quite expensive; that happens today with a lot of the larger vendors and technologies. Well, a CIO, an economic buyer, really needs to be certain about this: how quickly can I get that ROI? And I think we've got something that we can show, just pull up a before and after, and it really comes down to hours, days, and weeks where we've been able to have that impact. And in this playbook that we put together, the before and after picture really shows those savings, delivered by getting data into some actionable form within hours and days to drive agility, but at the same time being able to enforce the controls to protect the use of that data and who has access to it. So Lisa, the number one thing I'd have to say is time, and we can see that on the graphic that we've just pulled up here.

>> Excellent, so tangible, measurable outcomes: that time to value. We talk about achieving adaptive data governance. Lester, you guys talk about automation, you talk about machine learning. How are you seeing those technologies being a facilitator of organizations adopting adaptive data governance?

>> Well, the days of manual effort are over. I think this is a multi-step process, but the very first step is understanding what you have and normalizing that across your data estate.
So, you couple this with the ontologies that are unique to your business, and algorithms, and you basically go across it and you identify and tag that data. That allows the next steps to happen. So now I can write business rules not in terms of named columns, but in terms of the tags. Using that automated pattern recognition, where we observed the loan start should be before the loan end, being able to automate that is a huge time saver, and so is the fact that we can suggest that as a rule, rather than waiting for a person to come along and say, "Oh wow, okay, I need this rule, I need this rule." These are steps that decrease that time to value that Ajay talked about. And then lastly, couple that with machine learning, because even with great automation, being able to profile all your data and getting a good understanding brings you to a certain point, but there's still ambiguity in the data. So for example, I might have two columns, Date one and Date two. I may have even observed that Date one should be less than Date two, but I don't really know what Date one and Date two are, other than dates. So this is where the machine learning comes in, and I ask the user, "Can you help me identify what Date one and Date two are in this table?" It turns out they're a start date and an end date for a loan. That gets remembered and cycled into the machine learning, so step by step, when it sees this pattern of Date one and Date two elsewhere, it's going to ask, "Is it a start date and an end date?" Bringing all these things together, with all this automation, is really what's key to enabling your adaptive data governance program.

>> Great, thanks Lester. And Ajay, I do want to wrap things up with something that you mentioned in the beginning about what you guys are doing with Oracle. Take us out by telling us what you're doing there; how are you guys working together?

>> Yeah, I think those of us who have worked in IT for many years, we've learned to trust Oracle's technology, and they're shifting now to a hybrid on-prem, Generation 2 cloud platform, which is exciting, and their existing customers and new customers moving to Oracle are on a journey. So Oracle came to us and said, "Now, we can see how quickly you're able to help us change mindsets." And mindsets are often locked into a way of thinking around operating models of IT that are maybe not agile, or more siloed, and they're wanting to break free of that and adopt a more agile, API-driven approach with their data. So a lot of the work that we're doing with Oracle is around accelerating what customers can do with understanding their data and building digital apps, by identifying the underlying data that has value. And the fact that we're able to do that in hours, days, and weeks, rather than many months, is opening up the eyes of chief data officers and CIOs to say, "Well, maybe we can do this whole digital transformation this year, maybe we can bring that forward and transform who we are as a company." And that's driving innovation, which we're excited about, and which I know Oracle is keen to drive, too.

>> And helping businesses transform digitally is so incredibly important in this time, as we look to things changing in 2021. Ajay and Lester, thank you so much for joining me on this segment, explaining adaptive data governance, how organizations can use it, benefit from it, and achieve ROI. Thanks so much guys.

>> Thank you.

>> Thanks again Lisa. (bright music)
Lester Waters, Patrick Smith & Ezat Dayeh | IoTahoe | Data Automated
>> Announcer: From around the globe, it's theCUBE, with digital coverage of Data Automated, an event series brought to you by IO Tahoe.

>> Welcome back everybody to the power panel: driving business performance with smart data life cycles. Lester Waters is here, he's the chief technology officer from IO Tahoe, and he's joined by Patrick Smith, who is field CTO from Pure Storage, and Ezat Dayeh, who's a system engineering manager at Cohesity. Gentlemen, good to see you. Thanks so much for coming on this panel.

>> Thank you, Dave.

>> Let's start with Lester. I wonder if each of you could just give us a quick overview of your role, and what's the number one problem that you're focused on solving for your customers? Let's start with Lester please.

>> Yes, I'm Lester Waters, chief technology officer for IO Tahoe, and really the number one problem that we are trying to solve for our customers is to help them understand what they have. 'Cause if they don't understand what they have in terms of their data, they can't manage it, they can't control it, they can't monitor it, they can't ensure compliance. So really, finding out all you can about the data that you have, and building a catalog that can be readily consumed by the entire business, is what we do.

>> Great. All right, Patrick, field CTO in your title. That says to me you're talking to customers all the time, so you've got a good perspective on it. Give us, you know, your take on things here.

>> Yeah, absolutely. So my patch is EMEA, and I talk to customers and prospects in lots of different verticals across the region. And as they look at their environments and their data landscape, they're faced with massive growth in the data that they're trying to analyze, and demands to be able to get insight faster and to deliver business value faster than they've ever had to in the past. So big challenges that we're seeing across the region.

>> Got it. And Ezat, Cohesity, you're like the new kid on the block. You guys are really growing rapidly, created this whole notion of data management, backup and beyond. But from a system engineering manager's perspective, what are you seeing from customers, your role, and the number one problem that you're solving?

>> Yeah, sure. So the number one problem I see, time and again, speaking with customers, falls around data fragmentation. So due to things like organic growth, you know, even maybe budgetary limitations, infrastructure has grown over time very piecemeal, and it's highly distributed internally. And just to be clear, when I say internally, that could mean it's on multiple platforms or silos within an on-prem infrastructure, but it also does extend to the cloud as well. So we've seen, over the past few years, a big drive towards cloud consumption, almost at any cost in some cases. You know, there could be business reasons, like moving from CapEx to more of an OPEX model. And what this has done is it's gone on to create further silos, both on-prem and also in the cloud. And while short-term needs may be met by doing that, it's causing longer-term problems, and it's reducing the agility of these customers to be able to change and transform.

>> Right, hey, cloud is cool, everybody wants to be in the cloud, right? So you're right, it creates maybe unintended consequences. So let's start with the business outcome and kind of try to work backwards. I mean, people, you know, they want to get more insights from data.
They want to have a more efficient data life cycle. So Lester, let me start with you: thinking about, like, the North star to creating data-driven cultures, you know, what is the North star for customers here?

>> I think the North star, in a nutshell, is driving value from your data, without question. I mean, we differentiate ourselves these days by even the nuances in our data. Now, underpinning that, there are a lot of things that have to happen to make that work out well. You know, for example, are you adequately protecting your data? Do you have a good storage subsystem? Do you have good backup, and recovery point objectives, recovery time objectives? Are you fully compliant? Are you ensuring that you're ticking all the boxes? There are a lot of regulations these days with respect to compliance, data retention, data privacy, and so forth. Are you ticking those boxes? Are you being efficient with your data? You know, there's a statistic that someone mentioned to me the other day, that 53% of all businesses have between three and 15 copies of the same data. So finding and eliminating those is part of the problem you need to chase.

>> Yeah, so Patrick and Ezat, Lester touched on a lot of the areas that you guys are involved in. You're right, Lester, no doubt, business value, and a lot of that comes from reducing the end-to-end cycle times. But anything that you guys would add to that? Maybe start with you, Patrick.

>> Yeah, I think getting value from data really hits on what everyone wants to achieve, but I think there are a couple of key steps in doing that. First of all is getting access to the data, and that really hits three big problems. Firstly, working out what you've got. Secondly, after working out what you've got, how to get access to it, because it's all very well knowing you've got some data, but if you can't get access to it, either because of privacy reasons or security reasons, then that's a big challenge. And then finally, once you've got access to the data, making sure that you can process that data in a timely manner and at the scale that you need to deliver your business objectives. So I think those are really three key steps in successfully getting value from the data within your organization.

>> Ezat, I'll ask you, anything else you'd fill in?

>> Yeah, so the guys have touched on a lot of things already. For me, you know, it would be that an organization has got a really good global view of all of its data; it understands the data flow and dependencies within its infrastructure; it understands the precise legal and compliance requirements; and it has the ability to action changes or initiatives within its environment, forgive the pun, but with a cloud-like agility. You know, and that's no easy feat, right? That is hard work. Another thing as well is for companies to be mature enough to truly delete and get rid of unneeded data from their systems. You know, I've seen so many times in the past, organizations paying more than they need to because they've acquired a lot of data baggage that just gets carried over from refresh to refresh. And, you know, if you can afford it, great, but chances are you want to be as competitive as possible.
And what happens is that this results in spend that is unnecessary, not just in terms of acquisition, but also in terms of maintaining the infrastructure. But then the other knock-on effect is that, from a compliance and a security point of view, you're exposing yourself. So, you know, if you don't need it, delete it, or at least archive it.

>> Okay, so we've talked about the challenges and some of the objectives, but there are a lot of blockers out there, and I want to understand how you guys are helping remove them. So Lester, what are some of those blockers? I can mention a couple: there are skill sets, there's the problem of siloed data you talked about, but there's also data ownership, "that's my data," and there are budget issues. What do you see as some of the big blockers in terms of people really leaning in to this smart data life cycle?

>> Yeah, silos is probably one of the biggest ones I see in businesses. "Yes, it's my data, not your data." Lots of compartmentalization, and breaking that down is one of the challenges, and having the right tools to help you do that is only part of the solution. There are obviously a lot of cultural things that need to take place to break down those silos and work together. If you can identify where you have redundant data across your enterprise, you might be able to consolidate those, you know, bring together applications. It's not uncommon for a large enterprise to have several thousand applications, many of which have their own instance of the very same data. So if there's a customer list, for example, it might be in five or six different sources of truth, and there's no reason to have that. By bringing those things together, you will start to tear down the business boundary silos that automatically exist. I think one of the other challenges, too, is self-service. As Patrick mentioned, gaining access to your data and being able to work with it in a safe and secure fashion is key here. You know, right now you typically raise a ticket, wait for access to the data, and then maybe a week later out pops the bit you need. And really, with data being such a commodity, and having timeliness to it, being able to have quick access to that data is key.

>> Yeah, so I want to go to Patrick. So, you know, one of the blockers that I see is legacy infrastructure, technical debt sucking all the budget. You've got too many people having to look after, you know, storage; it's just too complicated. And I wonder, obviously that's my perspective, what's your perspective on that?

>> Yeah, absolutely, we'd agree with that. As you look at the infrastructure that supports people's data landscapes today, for primarily legacy reasons, the infrastructure itself is siloed. So you have different technologies with different underlying hardware, different management methodologies, that are there for good reason, because historically you had to have specific fitness for purpose for different data requirements. That's one of the challenges that we tackled head on at Pure with the FlashBlade technology and the concept of the data hub: a platform that can deliver different characteristics for the different workloads, but from a consistent data platform. And it means that we get rid of those silos. It means that from an operational perspective, it's far more efficient.
And once your data set is consolidated into the data hub, you don't have to move that data around. You can bring your applications and your workloads to the data, rather than the other way around.

>> Now, Ezat, I want to go to you, because in your world, which to me goes beyond backup, one of the challenges is, you know, they say backup is one thing, recovery is everything. But as well, the CFO doesn't want to pay for just protection. And one of the things that I like about what you guys have done is you've broadened the perspective to get more value out of what was once seen as an insurance policy. I wonder if you could talk about that as a blocker, and how you're having success removing it.

>> Yeah, absolutely. So, as well as what the guys have already said, I do see one of the biggest blockers as the fact that the task at hand can be overwhelming for customers, and it can overwhelm them very, very quickly. And that's because this stuff is complicated, it's got risk, and people are used to the status quo. But the key here is to remember that it's not an overnight change, it's not a flick of a switch; it's something that can be tackled in a very piecemeal manner. And absolutely, like you said, a reduction in TCO and being able to leverage the data for other purposes is a key driver for this. So for us specifically, one of the areas that we help customers with first of all is usually data protection. It can also be things like consolidation of unstructured file data. And the reason why customers are doing this is because legacy data protection is very costly. You know, you'd be surprised how costly it is; a lot of people don't actually know how expensive it can be. And it's very complicated, involving multiple vendors, and it's there really to achieve one goal. The thing is, it's very inflexible, and it doesn't help towards being an agile, data-driven company. So this can be resolved, and it can be pretty straightforward, quite painless as well. The same goes for unstructured data, which is very complex to manage. And we've all heard the stats from the analysts: data obviously is growing at an extremely rapid rate. But actually, when you look at how it's growing, 80% of that growth is actually in unstructured data, and only 20% of that growth is in structured data. So these are quick-win areas where customers can realize immediate TCO improvement and increased agility when it comes to managing and automating their infrastructure. So yeah, it's all about doing more with what you have.

>> So let's paint a picture of this, guys. If you could bring up the life cycle, I want to explore that a little bit and ask each of you to provide a perspective on this. And so what you can see here is this cycle, the data life cycle, and what we're wanting to do is really inject intelligence, or smarts, into this life cycle. You can see you start with ingestion or creation of data. You're storing it; you've got to put it somewhere, right? You've got to classify it, you've got to protect it. And then of course you want to reduce the copies, make it efficient, and then you want to prepare it so the business can actually consume it. And then you've got compliance and governance and privacy issues.
And at some point, when it's legal to do so, you want to get rid of it. We never get rid of stuff in technology; we keep it forever. But I wonder if we could start with you, Lester. This is, you know, the picture of the life cycle. What role does automation play in terms of injecting smarts into the life cycle?

>> Automation is key here, especially from the discover, catalog, and classify perspective. I've seen companies where they go and dump all of their database schemas into a spreadsheet so that they can sit down and manually figure out what attribute 37 means for a column name. And that's only the tip of the iceberg. So being able to automatically detect what you have, automatically deduce what's consuming the data, you know, upstream and downstream, and being able to understand all of the things related to the life cycle of your data, backup, archive, deletion, is key. So having good tools is very important.
And one of the things that we can do is as part of that automation, is that we can make copies of data without consuming additional capacity available, pretty much instantaneously. You might want to do that for many different purposes. So examples of that could be, you know, for example, reproducing copies of production data for development purposes, or for testing new applications for example. And you know, how would you, how would you go about doing that in a legacy environment? The simple answer is it's painfully, right? So you just can't do those kinds of things. You know, I need more infrastructure to store the data. I need more compute to actually perform the things that I want to do on it, such as analytics, and to actually get a copy of that data, you know, I have to either manually copy it myself or I restore from a backup. And obviously all of that takes time, additional energy. And you end up with a big sprawling infrastructure, which isn't a manageable, like Patrick said, it's just the sheer amount of data, you know, it doesn't, it doesn't warrant doing that anymore. So, you know, if I have a modern day platform such as, you know, the Cohesity data platform, I can actually do a lot of analytics on that through applications. So we have a marketplace for apps. And the other great thing is that it's an open system, right? So anybody can develop an app. It's not just apps that are developed by us. It can be third parties, it could be customers. And with the data being consolidated in one place, you can then start to start to realize some of these benefits of deriving insights out of your data. >> Yeah, I'm glad you brought that up earlier in your little example there, because you're right. You know, how do you deal with that? You throw people at the problem and it becomes nights and weekends, and that sort of just fails. It doesn't scale. I wonder if we could talk about metadata. It's increasingly important. Metadata is data about the data, but Lester, maybe explain why it's so important and what role it plays in terms of creating smart data lifecycle. >> Well, yes, metadata, it does describe the data, but it's, a lot of people think it's just about the data itself, but there's a lot of extended characteristics about your data. So, imagine if for my data life cycle, I can communicate with the backup system from Cohesity and find out when the last time that data was backed up, or where it's backed up to. I can communicate exchange data with Pure Storage and find out what tier it's on. Is the data at the right tier commensurate with its use level that Patrick pointed out? And being able to share that metadata across systems. I think that's the direction that we're going in. Right now we're at the stage, we're just identifying the metadata and trying to bring it together and catalog it. The next stage will be, okay using the APIs that we have between our systems. Can we communicate and share that data and build good solutions for our customers to use? >> I think it's a huge point that you just made. I mean, you know, 10 years ago, automating classification was the big problem and it was machine intelligence. You know, we're obviously attacking that, but your point about as machines start communicating to each other and you start, you know, it's cloud to cloud, there's all kinds of metadata, kind of new metadata that's being created. I often joke that someday there's going to be more metadata than the data. So that brings us to cloud. 
So that brings us to cloud. And Ezat, I'd like to start with you, because you were talking about some cloud creep before. So what's your take on cloud? I mean, you've got private clouds, you've got hybrid clouds, public clouds, inter-clouds; IoT and the edge are sort of another form of cloud. So how does cloud fit into the data life cycle? How does it affect the data life cycle?

>> Yeah, sure. So I do think having the cloud is a great thing, and it has got its role to play, and you can have many different permutations and iterations of how you use it. And as I mentioned previously, I've seen customers go into the cloud very, very quickly, and actually, recently, they're starting to remove workloads from the cloud. And the reason why this happens is that cloud has got its role to play, but it's not right for absolutely everything, especially in its current form. So, a good analogy I like to use, and this may sound a little bit cliché, but when you compare clouds versus on-premises data centers, you can use the analogy of houses and hotels. So to give you an idea: when we look at hotels, that's like the equivalent of a cloud, right? I can get everything I need from there: I can get my food, my water, my outdoor facilities. If I need to accommodate more people, I can rent some more rooms. I don't have to maintain the hotel, it's all done for me. When you look at houses, the equivalent of on-premises infrastructure, I pretty much have to do everything myself, right? So I have to purchase the house, I have to maintain it, I have to buy my own food and water, make improvements myself. But then why do we all live in houses, not in hotels? And the simple answer I can think of is that it's cheaper, right? It's cheaper to do it myself, but that's not to say that hotels haven't got their role to play. So for example, if I've got loads of visitors coming over for the weekend, I'm not going to go and build an extension to my house just for them. I will burst into my hotel, into the cloud, and use it for things like that. And if I want to go somewhere on holiday, for example, I'm not going to go buy a house there; I'm going to stay in a hotel, same thing: I need some temporary usage, so I'll use the cloud for that as well. Now, look, this is a loose analogy, right? But it kind of works, and it resonates with me at least. So what I'm really saying is the cloud is great for many things, but it can work out costlier for certain applications, while others are a perfect fit. So when customers do want to look at using the cloud, it really does need to be planned in an organized way, so that you can avoid some of the pitfalls we're talking about, like creating additional silos, which are just going to make your life more complicated in the long run. So things like security planning and adequate training for staff are absolutely a must. We've all seen the horror stories in the press where certain data may have been left exposed in the cloud; obviously nobody wants to see that. So as long as it's a well-planned and considered approach, the cloud is great, and it really does help customers out.

>> Yeah, it's an interesting analogy. I hadn't thought of that before, but you're right.
'Cause I was going to say, well, part of it is you want the cloud experience everywhere, but you don't always want the cloud experience; especially, you know, when you're with your family, you want certain privacy. I've not heard that before, Ezat, so that's a new perspective, thank you. But Patrick, I do want to come back to that cloud experience, because in fact, that's what's happening in a lot of cases. Organizations are extending the cloud properties of automation on-prem and in hybrid, and certainly you guys have done that; you've created cloud-based capabilities that can run in AWS or wherever. But what's your take on cloud? What's Pure's perspective?

>> Yeah, I thought Ezat brought up a really interesting point and a great analogy for the use of the public cloud, and it really reinforces the importance of the hybrid and multi-cloud environment, because it gives you that flexibility to choose the optimal environment to run your business workloads. And that's what it's all about: the flexibility to change which environment you're running in, either from one month to the next or from one year to the next, because workloads change, and the characteristics that are available in the cloud change on a pretty frequent basis. It's a fast-moving world. So one of the areas of focus for us, with our Cloud Block Store technology, is to provide effectively a bridge between the on-prem cloud and the public cloud, to provide that consistent data management layer that allows customers to move their data where they need it, when they need it. And the hybrid cloud is something that we've lived with ourselves at Pure: our Pure1 management technology actually sits in a hybrid cloud environment. We started off entirely cloud native, but now we use the public cloud for compute, and we use our own technology at the end of a high-performance network link to support our data platform. So we get the best of both worlds. And I think that's where a lot of our customers are trying to get to: cloud flexibility, but also efficiency and optimization.
So, you know, by aiming to consolidate into fewer platforms, customers can realize a lot better control over their data. And then natural effect of this is that it makes meeting compliance and governance a lot easier. So when it's consolidated, you can start to confidently understand who is accessing your data, how frequently are they accessing the data? You can also do things like detecting anomalous file access activities, and quickly identify potential threats. You know, and this can be delivered by apps which are running on one platform that has consolidated the data as well. And you can also start getting into lots of things like, you know, rapidly searching for PII. So personally identifiable information across different file types. And you can report on all of this activity back to the business, by identifying, you know, where are you storing your copies of data? How many copies have you got and who has access to them? These are all becoming table stakes as far as I'm concerned. >> Right, right. >> The organizations continue that move into digital transformation and more regulation comes into law. So it's something that has to be taken very, very seriously. The easier you make your infrastructure, the easier it will be for you to comply with it. >> Okay, Patrick, we were talking, you talked earlier about storage optimization. We talked to Adam Worthington about the business case. You get the sort of numerator, which is the business value and then the denominator, which is the cost. And so storage efficiency is obviously a key part of it. It's part of your value proposition to pick up on your sort of earlier comments, and what's unique about Pure in this regard? >> Yeah, and I think there are, there are multiple dimensions to that. Firstly, if you look at the difference between legacy storage platforms, they used to take up racks or isles of space in a data center with flash technology that underpins flash blade, we effectively switch out racks for rack units. And it has a big play in terms of data center footprint, and the environmentals associated with the data center, but it doesn't stop at that. You know, we make sure that we efficiently store data on our platforms. We use advanced compression techniques to make sure that we make flash storage as cost competitive as we possibly can. And then if you look at extending out storage efficiencies and the benefits it brings, just the performance has a direct effect on staff, whether that's, you know, the staff and the simplicity of the platform, so that it's easy and efficient to manage, or whether it's the efficiency you get from your data scientists who are using the outcomes from the platform and making them more efficient. If you look at some of our customers in the financial space, their time to results are improved by 10 or 20 X by switching to our technology from legacy technologies for their analytics platforms. >> So guys we've been running, you know, CUBE interviews in our studios remotely for the last 120 days, it's probably the first interview I've done where I haven't started off talking about COVID, but digital transformation, you know, BC, before COVID. Yeah, it was real, but it was all of a buzzy wordy too. And now it's like a mandate. So Lester, I wonder if you could talk about smart data life cycle and how it fits into this isolation economy and hopefully what will soon be a post isolation economy? >> Yeah, COVID has dramatically accelerated the data economy. 
I think, you know, first and foremost, we've all learned to work at home. I, you know, we've all had that experience where, you know, there were people who would um and ah about being able to work at home just a couple of days a week. And here we are working five days a week. That's had a knock on impact to infrastructure to be able to support that. But going further than that, you know, the data economy is all about how a business can leverage their data to compete in this new world order that we are now in. So, you know, they've got to be able to drive that value from their data and if they're not prepared for it, they're going to falter. We've unfortunately seen a few companies that have faltered because they weren't prepared for this data economy. This is where all your value is driven from. So COVID has really been a forcing function to, you know, it's probably one of the few good things that have come out of COVID, is that we have been forced to adapt. And it's been an interesting journey and it continues to be so. >> Well, is that too, you know, everybody talks about business resiliency, ransomware comes into effect here, and Patrick, you, you may have some thoughts on this too, but Ezat, your thoughts on the whole work from home pivot and how it's impacting the data life cycle. >> Absolutely, like, like Lester said, you know, we've, we're seeing a huge impact here. You know, working from home has, has pretty much become the norm now. Companies have been forced into basically making it work. If you look at online retail, that's accelerated dramatically as well. Unified communications and video conferencing. So really, you know, the point here is that yes, absolutely. You know, we've compressed you know, in the past maybe four months, what probably would have taken maybe even five years, maybe 10 years or so. And so with all this digital capability, you know, when you talk about things like RPOs and RTOs, these things are, you know, very much, you know, front of mind basically and they're being taken very seriously. You know, with legacy infrastructure, you're pretty much limited with what you can do around that. But with next generation, it puts it front and center. And when it comes to, you know, to ransomware, of course, it's not a case of if it's going to happen, it's a case of when it's going to happen. Again, we've all seen lots of stuff in the press, different companies being impacted by this, you know, both private and public organizations. So it's a case of, you know, you have to think long and hard about how you're going to combat this, because actually malware also, it's becoming, it's becoming a lot more sophisticated. You know, what we're seeing now is that actually, when, when customers get impacted, the malware will sit in their environment and it will have a look around it, it won't actually do anything. And what it's actually trying to do is, it's trying to identify things like your backups, where are your backups? Because you know, what do, what do we all do? If we get hit by a situation like this, we go to our backups. But you know, the bad actors out there, they, you know, they're getting pretty smart as well. And if your legacy solution is sitting on a system that can be compromised quite easily, that's a really bad situation, you know, waiting to happen. And, you know, if you can't recover from your backups, essentially, unfortunately, you know, people are going to be making trips to the bank because you're going to have to pay to get your data back. 
And of course, nobody wants to see that happening. So one of the ways, for example, that we look to help customers defend against this is actually we have, we have a three pronged approach. So protect, detect, and respond. So what we mean by protect, and let me say, you know, first of all, this isn't a silver bullet, right? Security is an industry all of itself. It's very complicated. And the approach here is that you have to layer it. What Cohesity, for example, helps customers with, is around protecting that insurance policy, right? The backups. So by ensuring that that data is immutable, cannot be edited in any way, which is inherent to our file system. We make sure that nothing can affect that, but it's not just external actors you have to think about, it's also potentially internal bad actors as well. So things like being able to data lock your information so that even administrators can't change, edit or delete data, is just another way in which we help customers to protect. And then also you have things like multifactor authentication as well, but once we've okay, so we've protected the data. Now, when it comes, now it comes to detection. So again, being, you know, ingrained into data protection, we have a good view of what's happening with all of this data that's flowing around the organization. And if we start to see, for example, that backup times, or, you know, backup quantities, data quantities are suddenly spiking all of a sudden, we use things like, you know, AI machine learning to highlight these, and once we detect an anomaly such as this, we can then alert our users to this fact. And not only do we alert them and just say, look, we think something might be going on with your systems, but we'll also point them to a known good recovery point as well, so that they don't have to sit searching, well, when did this thing hit and you know, which recovery point do I have to use? And so, you know, and we use metadata to do all of these kinds of things with our global management platform called Helios. And that actually runs in the cloud as well. And so when we find this kind of stuff, we can basically recover it very, very quickly. And this comes back now to the RPOs and the RTOs. So your recovery point objective, we can shrink that, right? And essentially what that means is that you will lose less data. But more importantly, the RTO, your recovery time objective, it means that actually, should something happen and we need to recover that data, we can also shrink that dramatically. So again, when you think about other, you know, legacy technology out there, when something like this happens, you might be waiting hours, most likely days, possibly even weeks and months, depending on the severity. Whereas we're talking about being able to bring data back, you know, we're talking maybe, you know, a few hundred virtual machines in seconds and minutes. And so, you know, when you think about the value that that can give an organization, it becomes, it becomes a no brainer really, as far as, as far as I'm concerned. So, you know, that really covers how we respond to these situations. So protect, detect, and respond. >> Great, great summary. I mean, my summary is adverse, right? The adversaries are very, very capable. You got to put security practices in place. The backup Corpus becomes increasingly important. You got to have analytics to detect anomalous behavior and you got to have, you know, fast recovery. And thank you for that. 
We've got to wrap, but Lester, let me ask you to sort of paint a picture of the journey, or the maturity model, that people have to take. If they want to get into it, where do they start and where are they going? Give us that view. >> I think first it's knowing what you have. If you don't know what you have, you can't manage it, you can't control it, you can't secure it, and you can't ensure it's compliant. So that's first and foremost. The second is really ensuring that you're compliant. Once you know what you have, are you securing it? Are you following the applicable regulations? Are you able to evidence that? How are you storing your data? Are you archiving it? Are you storing it effectively and efficiently? Nirvana, from my perspective, is really getting to a point where you've consolidated your data, you've broken down the silos, and you have a virtually self-service environment by which the business can consume and build upon their data. And really, at the end of the day, as we said at the beginning, it's all about driving value out of your data, and automation is key to this journey. >> That's awesome. And you just described sort of a winning data culture. Lester, Patrick, Ezat, thanks so much for participating in this power panel. >> Thank you, David. >> Thank you. >> And thank you for watching, everybody. This is Dave Vellante for theCUBE. (bright music)
Lester Waters, Io-Tahoe | Enterprise Data Automation
(upbeat music) >> Reporter: From around the globe, it's The Cube, with digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. >> Okay, we're back. Focusing on enterprise data automation, we're going to talk about the journey to the cloud. Remember, the hashtag is data automated. We're here with Lester Waters, who's the CTO of Io-Tahoe. Lester, good to see you from across the pond on video; wish we were face to face, but it's great to have you on The Cube. >> Likewise; thank you for having me. >> Oh, you're very welcome. Hey, give us a little background. As CTO you've got deep expertise in a lot of different areas, but what do we need to know? >> Well, David, I started my career basically at Microsoft, where I started the Information Security Cryptography Group, the very first one that the company had, and that led to a career in information security. And of course, as you go along with information security, data is the key element to be protected. So I always had my hands in data, and that naturally progressed into a role with Io-Tahoe as their CTO. >> I'll have to invite you back; we'd love to talk crypto all day. But we're here talking about the cloud, the journey to the cloud, and how to accelerate it. Everybody's really interested in cloud, obviously, even more interested now with the pandemic. But what's that all about? >> Well, moving to the cloud is quite an undertaking for most organizations. First of all, if you're a large enterprise, you probably have thousands of applications and hundreds and hundreds of database instances, and trying to shed some light on that just to plan your move to the cloud is a real challenge. Some organizations try to tackle that manually. What Io-Tahoe brings is a way to tackle it in an automated fashion, to help you with your journey to the cloud. >> Well, look, migrations are sometimes just an evil word to a lot of organizations. But at the same time, building up technical debt, veneer after veneer, year after year, is something that many companies are saying, okay, it's got to stop. So what's the prescription for that automation journey, and for simplifying that migration to the cloud? >> Well, I think the very first thing it's all about is data hygiene. You don't want to pick up your bad habits and take them to the cloud. You've got an opportunity here, so I see the journey to the cloud as an opportunity to really clean house, reorganize things. It's like moving house: you might move all your boxes, but you're probably going to cherry-pick what you take with you, and then you're going to organize it as you end up at your new destination. From that, I get to seven key principles that I like to operate by when I advise on a cloud migration. >> Okay. So, where do you start? >> Well, I think the first thing is understanding what you've got: discovering and cataloging your data and your applications. If I don't know what I have, I can't move it, I can't improve it, I can't build upon it. And I have to understand the dependencies, so building that data catalog is the very first step. What have I got? >> Now, is that a metadata exercise? Sometimes there's more metadata than there is data. Is metadata part of that first step? >> Indeed, metadata is the first step, so the metadata really describes the data you have.
So, the metadata is going to tell me I have 2,000 tables, and maybe those tables average 25 columns each, and that gives me a sketch, if you will, of what I need to move. How big are the boxes I need to pack for my move to the cloud? >> Okay, and you're saying you can automate that data classification, categorization, and discovery using machine intelligence, is that correct? >> Yeah, that's correct. So basically, we will discover all of the schema, if you will; that's the metadata description of the tables and columns in your database and their data types. We will ingest that, and we will build some insights around it. And we do that across a variety of platforms, because every organization has a mix: you've got an Oracle Database here, a Microsoft SQL Database there, and you might have something else over there that you need to bring into view. And part of this journey is going to be about breaking down your data silos and understanding what you've got. >> Okay. So, we've done the audit, we know what we've got. What's next? Where do we go next? >> So the next thing is remediating that data. Where do I have duplicate data? Oftentimes in an organization, data will get duplicated: somebody will take a snapshot of the data and end up building a new application, which suddenly becomes dependent on that data. So it's not uncommon for an organization to have 20 master instances of a customer, and you can see where that goes; trying to keep all that stuff in sync becomes a nightmare all by itself. So you want to understand where all your redundant data is, so that when you go to the cloud, maybe you have an opportunity to consolidate it. >> Yeah, because you'd like to apply an Einstein bromide, right: keep as much data as you need, but no more. >> Correct. >> Okay. So you get to that second step, you've got an opportunity to reduce costs. Then what? You figure out what to get rid of, or actually get rid of it. What's next? >> Yes, that would be the next step: figuring out what you need and what you don't need. Oftentimes I've found that there are obsolete columns of data in your databases that you just don't need, or maybe they've been superseded; you've got tables that have been superseded by other tables in your database. So you've got to understand what's being used and what's not, and then from that you can decide: I'm going to leave this stuff behind, or I'm going to archive this stuff because I might need it for data retention, or I'm just going to delete it, I don't need it at all. >> Well, Lester, most organizations, if they've been around a while, the so-called incumbents, have got data all over the place: data marts, data warehouses, all kinds of different systems, and the data lives in silos. So how do you deal with that problem? Is that part of the journey? >> That's a great point, Dave, because you're right, the data silos happen because this business unit is chartered with this task and another business unit has that task, and that's how you get those instantiations of the same data occurring in multiple places. So as part of your cloud migration journey, you really want to plan where there's an opportunity to consolidate your data, because that means there'll be less to manage, there'll be less data to secure, and it'll have a smaller footprint, which means reduced costs.
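As a rough illustration of that first discovery pass, the schema metadata alone can be harvested from a source database and used to sketch the size of the move. This sketch assumes a PostgreSQL source and the standard information_schema; the connection details are placeholders.

```python
import psycopg2  # assumes a PostgreSQL source

conn = psycopg2.connect("dbname=legacy host=onprem-db user=auditor")
cur = conn.cursor()
cur.execute("""
    SELECT table_name, column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = 'public'
""")

# Build the skeleton of a catalog: table -> [(column, type), ...]
catalog = {}
for table, column, dtype in cur.fetchall():
    catalog.setdefault(table, []).append((column, dtype))

# The metadata sketches how big the boxes are: 2,000 tables averaging
# 25 columns each is roughly 50,000 columns to classify and map.
n_tables = len(catalog)
n_columns = sum(len(cols) for cols in catalog.values())
print(f"{n_tables} tables, {n_columns} columns, "
      f"avg {n_columns / max(n_tables, 1):.1f} columns per table")
```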
>> So, people always talk about a single version of the truth, and data quality is a huge issue. I've talked to data practitioners, and they've indicated that their quality metrics are in the single digits while they're trying to get to 90% plus. Maybe you could address data quality: where does that fit in on the journey? >> That's a very important point. First of all, you don't want to bring your legacy issues with you. As I said earlier, if you've got data quality issues, this is a good time to find them, identify them, and remediate them. But that can be a laborious task. We've had customers that have tried to do this by hand, and it's very, very time consuming; imagine the manual labor involved if you've got 200 tables and 50,000 columns. You could probably accomplish it, but it'll take a lot of work. So the opportunity to use tools here and automate that process will really help you find those outliers, that bad data, and correct it before you move to the cloud. >> And you're talking about automation; it's the same thing with the data catalog in one of the earlier steps. Organizations would do this manually, or they'd try to, and that's a lot of the reason for failure. It's like cleaning house: you just don't want to do it (laughs). Okay, so then what's next? I think we're plowing through your steps here. What's next on the journey? >> The next one is, in a nutshell, preserve your data format. Don't boil the ocean here, to use a cliché. You want to do a certain degree of lift and shift, because you've got application dependencies on that data and on the data format: the tables on which it sits, the columns, and the way they're named. So to some degree you are going to be doing a lift and shift, but it's an intelligent lift and shift, using all the insights you've gathered by cataloging the data, looking for data quality issues, looking for duplicate columns, and planning consolidation. You don't want to rewrite your application either. So in that respect, I think it's important to do a bit of lift and shift and preserve those data formats as they sit. >> Okay, so let me follow up on that. That sounds really important to me, because if you're doing a conversion and you're rewriting applications, that means you're going to have to freeze the existing application, and then you're going to be refueling the plane in midair. And a lot of times, especially with mission-critical systems, you're never going to bring those together, and that's a recipe for disaster, isn't it? >> Great analogy, unless you're with the air force (mumbles) (laughs). But that's correct. You want to have bite-sized steps, and that's why it's important to plan your journey and take these steps, using automation where you can, to make that journey to the cloud much easier and more straightforward. >> All right, I like that. So we're taking a kind of systems view, an end-to-end view, of the data pipeline, if you will. What's next? I think we're through; I think I've counted six. What's the lucky seven? >> Lucky seven: involve your business users. Really, when you think about it, your data is in silos, and part of this migration to the cloud is an opportunity to break down those silos, the silos that naturally occur as part of the business unit.
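A hedged sketch of what that automated quality pass might look like per column: measure the null rate and count values that fail a format check, so the 50,000-column scan happens in code rather than by hand. The rules and thresholds here are illustrative placeholders, not Io-Tahoe's actual checks.

```python
import re

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def profile_column(values, pattern=None, max_null_rate=0.05):
    """Return a list of quality issues found in one column's values."""
    issues = []
    nulls = sum(1 for v in values if v in (None, "", "N/A"))
    null_rate = nulls / len(values)
    if null_rate > max_null_rate:
        issues.append(f"null rate {null_rate:.0%}")
    if pattern is not None:
        bad = [v for v in values if v and not pattern.match(str(v))]
        if bad:
            issues.append(f"{len(bad)} malformed value(s), e.g. {bad[0]!r}")
    return issues

print(profile_column(["2020-01-02", "02/01/2020", None], pattern=ISO_DATE))
# ['null rate 33%', "1 malformed value(s), e.g. '02/01/2020'"]
```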
You've got to break the cultural barriers that sometimes exist between business units. So, for example, I always advise: there's an opportunity here to consolidate your sensitive data, your PII, your personally identifiable information. If three different business units have the same source of truth for that, there's an opportunity to consolidate it into one as you migrate. That might mean a little bit of tweaking to some of the apps that depend on it, but in the long run, that's what you really want to do. You want to have a single source of truth, you want to ring-fence that sensitive data, and you want all your business users talking together so that you're not reinventing the wheel. >> Well, the reason I think that's so important is that you're now, I would say, creating a data-driven culture. I know that's sort of a buzzword, but it's true, and what it means to me is that your users, your lines of business, feel like they actually own the data, rather than pointing fingers at the data group, the IT group, the data quality people, the data engineers, saying, oh, I don't believe it. If the lines of business own the data, they're going to lean in; they're going to maybe bring their own data science resources to the table, and it's going to be a much more collaborative effort, as opposed to a non-productive argument. >> Yeah, and that's where we want to get to. DataOps is key, and maybe that's a term that's still evolving. But really, you want the data to drive the business, because that's where your insights are, that's where your value is. You want to break down the silos not only between the business units, as I mentioned, but also, as you pointed out, between the roles of the people working with the data. A self-service data culture is the right way to go, with the right security controls (putting on my security hat, of course) in place, so that if I'm a developer building a new application, I'd love to be able to go to the data catalog and see: oh, there's already a database that has what the customers have clicked on when shopping. I could use that. I don't have to rebuild it; I'll just use it for my application. That's the kind of problem you want to be able to solve, and that's where your cost reductions come in across the board. >> Yeah. I want to talk a little bit about the business context here. We always talk about data as the new source of competitive advantage; I think there's not a lot of debate about that, but it's hard. A lot of companies are struggling to get value out of their data because it's so difficult: all the things we've talked about, the silos, the data quality, et cetera. So, you mentioned the term DataOps. DataOps is all about streamlining that data pipeline, infusing automation and machine intelligence into it, and ultimately taking a systems view and compressing that time to insight so that you can drive monetization, whether it's cutting costs, new revenue, or productivity. It's that end-to-end cycle-time reduction that successful practitioners talk about as having the biggest business impact. Are you seeing that? >> Absolutely, but it is a journey, and it's a huge cultural change for some companies.
I've worked in many companies that are ticket-based and IT-driven, where to make even the most marginal change or get an insight, you raise a ticket, wait a week, and then out the other end pops maybe the change you needed. It'll take a while for us to get to a culture that truly has a self-service, data-driven nature, where I'm the business owner and I want to bring in a data scientist because we're losing ground. For example, a business might be losing to a competitor, and they want to find out why: why is customer churn happening every Tuesday? What is it about Tuesday? This is where your data scientist comes in. The last thing you want is to raise a ticket and wait for a snapshot of the data. You want to enable that data scientist to come in, securely connect to the data, do their analysis, and come back and give you those insights, which will give you that competitive advantage. >> Well, I love your point about churn. It calls to mind the Andreessen quote that "software's eating the world": all companies are software companies, and SaaS companies, and churn is the killer of SaaS companies. So that's a very, very important point you're making. My last question for you, before we summarize, is the tech behind all of this. What makes Io-Tahoe unique in its ability to help automate that data pipeline? >> Well, we've done a lot of research; we have, I think, 11 pending patent applications now, and I think one has been approved to be issued (mumbles). But really, it's about sitting down, doing the right kind of analysis, and figuring out how we can optimize this journey. Some of this stuff isn't rocket science; you can read a schema into an open source solution. But you can't necessarily find the hidden insights. If I want to find my foreign key dependencies, which aren't always declared in the database, or I want to identify columns by their content (because the columns might be labeled attribute one, attribute two, attribute three), or I want to find out how my data flows between the various tables in my database, that's the point at which you need to bring in automation and data science solutions. And there's even a degree of machine learning, because, for example, we might deduce that data is flowing from this table to that table, present that to the user with an 87% confidence, and the user or the administrator can confirm it, or say no, it actually flows the other way, that was an invalid conclusion. That's the machine learning cycle: the next time we see that pattern in that environment, we'll be able to make a better recommendation, because some things aren't black and white; they need that human-in-the-loop intervention. >> All right, I just want to summarize with Lester Waters' playbook for moving to the cloud, and I'll go through the steps; hopefully I took good notes. Step one, you want to do that data discovery audit; you want to be fact-based. Two is you want to remediate that data redundancy. Three, identify what you can get rid of (oftentimes you don't get rid of stuff in IT), or maybe archive it to cheaper media. Four is consolidate those data silos, which is critical: breaking down those data barriers. Five is attack the quality issues before you do the migration. Six, which I thought was really intriguing, is preserve that data format; you don't want to rewrite applications and do that conversion.
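The foreign-key example can be made concrete with a naive containment score: if nearly every value in one column appears in another table's key column, propose the relationship with a confidence a human can accept or reject, and keep that answer as a training label. The scoring below is deliberately simple and purely illustrative; the product's actual models are the patented part.

```python
def containment(child_values, parent_values):
    """Fraction of child values that appear in the candidate key column."""
    parent = set(parent_values)
    return sum(1 for v in child_values if v in parent) / len(child_values)

orders_customer_id = [101, 102, 101, 104, 105, 101, 102, 99]
customers_id = [100, 101, 102, 103, 104, 105, 106]

score = containment(orders_customer_id, customers_id)
print(f"orders.customer_id -> customers.id, confidence {score:.0%}")
# Prints about 88%: suggest the link, let an administrator confirm or
# reject it, and feed that answer back into the next recommendation.
```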
It's okay to do a little bit of lifting and shifting. >> That comes in after the fact. >> Yeah. And then finally, and probably most important, you've got to have that relationship with the lines of business, your users: get them involved, begin that cultural shift. So I think that's a great recipe, Lester, for safe cloud migration. I really appreciate your time. I'll give you the final word; bring us home. >> All right. Well, the journey to the cloud is a tough one, but you will save money. I have heard people say, you go to the cloud and it's too expensive, it's too this, too that; but really, there is an opportunity for savings. I'll tell you, when I run data services as a PaaS service in the cloud, it's wonderful, because I can scale up and scale down virtually by turning a knob, and so I have complete control and visibility of my costs. And so for me, that's very important. It also gives me the opportunity to really ring-fence my sensitive data, because let's face it: most organizations are like a cheese grater when you talk about security; there are so many ways in and out. So I find that consolidating and bringing together the crown jewels, if you will, makes things much easier to control as a security practitioner. But it's very important: you can't get there without some automation, automating this discovery and analysis process. >> Well, great advice. Lester, thanks so much. It's clear that the capex investments in data centers are generally not a good investment for most companies. Lester Waters, CTO of Io-Tahoe, really appreciate it. Let's watch this short video, and we'll come right back. You're watching The Cube. Thank you. (upbeat music)
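For the knob Lester mentions, resizing a managed database really can be a single API call. Here is a sketch with boto3 against Amazon RDS; the instance identifier and target class are placeholders.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Scale the instance up for a heavy reporting window; run the same call
# with a smaller class afterwards to bring the cost back down.
rds.modify_db_instance(
    DBInstanceIdentifier="analytics-db",   # placeholder identifier
    DBInstanceClass="db.r5.2xlarge",       # target size
    ApplyImmediately=True,
)
```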
Ajay Vohora & Lester Waters, Io-Tahoe | AWS re:Invent 2019
>> Narrator: From Las Vegas, it's theCUBE, covering AWS re:Invent 2019, brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >> Welcome back here to Las Vegas. We are live at AWS re:Invent, along with Justin Warren. I'm John Walls, day one of a jam-packed show. We had great keynotes this morning from Andy Jassy, and also representatives from Goldman Sachs and a number of other enterprises. Right now we're going to talk about data; it's all about data with Io-Tahoe and a couple of the company's representatives: CEO Ajay Vohora. Ajay, thanks for being with us. >> Thank you, John. >> And Lester Waters, the CSO at Io-Tahoe. Lester, good afternoon to you. Thanks for being with us. >> Thank you for having us. >> Ajay, you brought a football with you there, I see, so you've come prepared for sport. I love it. All right. This is your booth, you're exhibiting here, and I know you've got a big offering we're going to talk about a little bit later on. First, tell us about Io-Tahoe a little bit to inform our viewers right now who might not be too familiar with the company. >> Sure. Well, our background was dealing with enterprise-scale data issues that were really about the complexity, the amount of data, and the different types of data. Around 2014 we were in stealth, kind of working on our technology, and a lot of the common technologies around then were Apache-based, so Hadoop. Large enterprises that we were working with, like GE and Comcast, helped us come out of stealth in 2017, and that gave us a great story of solving petabyte-scale data challenges using machine learning, taking out that manual overhead. And more and more, as we look at AWS services, it's how do we drive the automation and get the value from data. >> It's got to be the way forward. All right, so let's jump onto that notion then. You've got this exponential growth in data, obviously, working off the edge, internet of things, all these inputs, right? And we have so much more information at our disposal; some of it's great, some of it's not. How do we know the difference, especially in this world where this exponential increase has happened? Lester, just tackle that from a company perspective: first off, how do we ever figure out what we have that's valuable? Where do we get the value out of that? And then how do we make sense of it, how do we put it into practice? >> Yeah. So I think most enterprises have a problem with data sprawl. There's a project startup, we get a block of data, and then all of a sudden a new project comes along and they take a copy of that data; there's another instance of it. Then there's another instance for another project. And suddenly these different data sources become authoritative and become production, so now I have three, four, or five different instances. Oh, and then there's the three or four that got canceled, and they're still sitting around. And as an information security professional, my challenge is to know where all of those pieces of data are, so that I can govern them and make sure that the stuff I don't need is gotten rid of, deleted. So, you know, using the Io-Tahoe software, I'm able to catalog all of that.
I'm able to garner insights into that data using the nine patent-pending algorithms that we have, to do intelligent tagging, if you will. So from my perspective, I'm very interested in making sure that I'm adhering to compliance rules. The really cool thing here is that we go and tag data, we look at it, and we actually tie it to lines of regulations. So you could go to CCPA: this bit of text here applies to this. And that's really helpful for me as an information security professional, because I'm not necessarily versed in every line of regulation, but when I can go and look at it handily like that, it makes it easier for me to go, oh, okay, that's great, I know how to treat that in terms of controls. So that's the important bit for me. If you don't know where your data is, you can't control it, and you can't monitor it. >> Governance, yeah. On knowing where stuff is, I'm familiar with a framework that was developed at Telstra back in Australia called the Five Knows, which is about exactly that: knowing where your data is, what it is, and who has access to it. Actually being able to catalog the data, knowing what it is that you have, is a mammoth task. That was hard enough 12 years ago, let alone today, with the amount of data that's actively being created every single day. So how does your system help CSOs tackle this kind of issue? Maybe, Lester, you can start off, and then, Ajay, you can tell us a bit more. >> Yeah, I mean, I'll start off on that. The feedback from our enterprise customers is that as the veracity and volume of data increase, the challenge is definitely there to keep on top of governing it. So it's continually discovering that new data as it's created: how is it different, how is it adding to the existing data? Using machine learning and the models that we create, whether it's anomaly detection or classifying the data based on certain features in the data, allows us to tag it and load it into our catalog. So I've discovered it, and now we've made it accessible; any BI developer or data engineer can search for that data in a catalog and make something from it. So if there were 10 steps in that data mile, we definitely solve the first four or five, to bring momentum to getting value from that data. So discovering it, cataloging it, tagging the data to make it searchable, and then it's free to pick up for whatever use case is out there, whether it's migration, security, or compliance. Security is a big one for you. >> And I would also add, too, for the data scientists: knowing all the assets they have available to them in order to drive those business-value insights is so important these days, because a lot of companies compete on very thin margins, and having insights into their data and the way customers can use their data really can make or break a company. So that's critical. And as Ajay pointed out, being able to automate that through DataOps, if you will, and drive those insights automatically, is great. For example, from an information security standpoint, I want to fingerprint my data and feed it into a DLP system, so that I can really keep an eye out for whether this data is actually going out,
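A toy version of that intelligent-tagging idea: classify a column by its content and attach the regulatory references the tag maps to. The regex, the hit-rate threshold, and the clause citations below are illustrative assumptions, not the product's patented algorithms and not a legal mapping to rely on.

```python
import re

SSN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Hypothetical tag-to-regulation mapping, for illustration only.
TAG_POLICIES = {
    "US_SSN": ["CCPA 1798.100 (personal information)", "GDPR Art. 4(1)"],
}

def tag_column(values, min_hit_rate=0.8):
    """Tag a column as SSN-like when most of its values match the pattern."""
    hits = sum(1 for v in values if SSN.match(str(v)))
    if hits / len(values) >= min_hit_rate:
        return {"tag": "US_SSN", "regulations": TAG_POLICIES["US_SSN"]}
    return None

print(tag_column(["123-45-6789", "987-65-4321", "078-05-1120"]))
# -> {'tag': 'US_SSN', 'regulations': [...]}
```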
and verify it really is my data, versus a standard regex kind of matching, which isn't the best technique. >> Yeah. So walk us through that in a bit more detail. You've mentioned tagging a couple of times; let's go into the details a little bit about what that actually means for customers. My understanding is that you're looking for things like a social security number that could be sitting somewhere in this data: finding out where all these social security numbers are that I may not be aware of, which could be being shared with someone who shouldn't have access to them. Is that what it is, or are there other kinds of data that you're able to tag? >> Yeah. Straight out of the box, you've got your PII, or personally identifiable information: the kind of data that is covered under CCPA and GDPR. So there are those standards, regulatory-driven definitions that a social security number, name, or address would fall under. Beyond that, in a large enterprise you've got clever data scientists and data engineers who, through the nature of their work, can combine sets of data that could include work patterns, IDs, lots of activity. You bring that together, and it suddenly comes under that umbrella of sensitive data. So it's being able to tag and classify data under those regulatory policies, but also by what could be an operational risk to the organization, whether it's a bank, insurance, utility, or healthcare in particular; we work across all of those, agnostic to any vertical. >> Okay. All right. >> And the nature of being able to do that is having machine learning set up a baseline around what is sensitive, and then honing that to what is particular to the organization. Lots of people will use everything you've seen here at AWS: S3, Aurora, Postgres, MySQL, Redshift. And in the underlying sources of that data, whether it's a CRM system or IoT, all of those sources have nuances that make every enterprise data landscape just slightly different. So trying to make a rules-based, one-size-fits-all approach work is going to be limiting, and it increases your manual overhead. Customers like GE and Comcast have moved way beyond throwing people at the problem; that's no longer possible. So it's about being smart in how to approach this: classifying the data using features in the data, and curating that metadata as an asset, just as a data warehouse would be, which allows you to enable the rest of the organization. >> So, I mean, you've talked about deriving value and identifying value. Once you catalog and tag, what does this ultimately mean to the bottom line in terms of ROI? How does AWS play into that? Why am I, as a company, getting value out of your capabilities with AWS? >> Yeah. We did a great study with Forrester. They calculated the ROI, and it's a mixture of things. It's that manual personnel overhead: people who are locked into the pretty unpleasant, low-productivity role of wrangling with data, for want of a better word, to make something of it. They'd much rather be creating the dashboards, the BI, or the insights.
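One way to read "set up a baseline, then hone it" is as a supervised model over cheap column features. The features, labels, and tiny training set below are invented for illustration; a real deployment would learn from far richer signals.

```python
from sklearn.ensemble import RandomForestClassifier

def column_features(values):
    """Three cheap content features: average length, digit ratio, uniqueness."""
    strs = [str(v) for v in values]
    total_chars = sum(len(s) for s in strs)
    return [
        total_chars / len(strs),
        sum(c.isdigit() for s in strs for c in s) / max(total_chars, 1),
        len(set(strs)) / len(strs),
    ]

# Tiny hand-labeled baseline: 1 = sensitive, 0 = not sensitive.
labeled = [
    (["123-45-6789", "987-65-4321"], 1),          # SSN-like
    (["jane@example.com", "lw@example.com"], 1),  # email-like
    (["red", "blue", "red"], 0),                  # enum-ish code
    (["active", "closed", "active"], 0),          # status flag
]
X = [column_features(values) for values, _ in labeled]
y = [label for _, label in labeled]
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# A new digit-heavy, high-uniqueness column lands near the SSN profile.
print(model.predict([column_features(["555-12-3456", "222-33-4444"])]))
```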
>> So you've talked about deriving value and identifying value. Once you've cataloged and tagged the data, what does this mean to the bottom line in terms of ROI? And how does AWS play into that — as a company, what value am I getting out of your capabilities combined with AWS? >> Yeah, we did a great study with Forrester. They calculated the ROI, and it's a mixture of things. One part is the manual personnel overhead — people locked into that pretty unpleasant, low-productivity role of wrangling data, for want of a better word, to make something of it. They'd much rather be creating the dashboards, the BI, the insights. So moving dozens of people from back-office manual wrangling onto what's going to make a difference to the chief marketing officer and the CFO — bringing down the cost to serve your customer by getting those operational insights — is how they want to get to working with that data. The automation takes out the manual overhead of the upfront tasks, allowing that resource to be better deployed onto the more interesting, productive work. So that's one part of the ROI. >> The other is with AWS. What we've found, engaging with the AWS ecosystem, is the speed of migration to AWS — we can take months out of that by cataloging what's on-premise and saying, here's my data estate. Our customers' data engineering teams want to create products for their own customers using SageMaker, Redshift, Athena — but what is the exact data that we need to push into the cloud to use those services? Is it the 20 petabytes that we've accumulated over the last 20 years? That's probably not going to be the case. So tiering the on-prem and cloud base of that data is really helpful to a data officer and an information architect in setting themselves up to accelerate that migration to AWS. >> So for people who've used this kind of system, who've run through the tagging and seen the power of the platform — what are some of the things they're now able to do once they've got these high-quality tagged data sets? >> Well, it's not just tagging. We also do fuzzy matching, so we can find relationships in the data — or even relationships within the data, in terms of duplicates. So, for example, somebody got married, so their surname has changed, but they're really the same person. We can help companies find those matches. I think we had one customer where we saved them about a hundred thousand a year in mailing costs, because they were sending mail addressed to a "Mrs." whose name wasn't right anymore. Being able to deduplicate that kind of data really helps people save money.
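The surname-change example maps onto a classic record-linkage pattern: fuzzy-match the volatile fields (names), exact-match the stable ones (date of birth, postcode), and blend the scores. A rough Python sketch — the records, weights and 0.75 threshold are all made up, and real matchers use far richer features and trained models:

```python
from difflib import SequenceMatcher

customers = [
    {"id": 1, "name": "Jane Smith",  "dob": "1980-04-02", "zip": "06101"},
    {"id": 2, "name": "Jane Harris", "dob": "1980-04-02", "zip": "06101"},  # same person, new surname
    {"id": 3, "name": "Bob Jones",   "dob": "1975-11-19", "zip": "06510"},
]

def similarity(a, b):
    """Blend fuzzy name similarity with exact matches on stable attributes."""
    name_score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    stable_hits = (a["dob"] == b["dob"]) + (a["zip"] == b["zip"])
    return 0.5 * name_score + 0.25 * stable_hits

pairs = [(a["id"], b["id"], round(similarity(a, b), 2))
         for i, a in enumerate(customers) for b in customers[i + 1:]]
print([p for p in pairs if p[2] >= 0.75])  # e.g. [(1, 2, 0.79)]: likely duplicates
```

Records 1 and 2 clear the threshold on stable attributes plus a partial name match, while record 3 stays well below it — the kind of signal that lets a mailing list drop the stale entry.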
>> Yep. And that's the next phase in our journey. Moving beyond the tagging and the classification, our roadmap working with AWS is very much machine learning driven. What our engineering team is excited about is: what's the next model, what's the next problem we can solve by throwing AI and machine learning at the large-scale data problem? We'll continually be curating and creating that metadata catalog asset, and allowing it to be used as a resource to enable the rest of the data landscape. >> And I think what's interesting about our product is that we really have multiple audiences for it. We've got the chief data officer, who wants to make sure we're completely compliant, because they don't want that potential 4% fine. Being able to evidence due diligence in their data management will go a long way if there is a breach — because zero days do happen — and if you can show that you've really had good discipline, then you won't get that fine, or hopefully you won't get a big one. The second audience is the information security professionals, who want to secure the perimeter. The third is the data architects, who are trying to manage and create new solutions with that data. And the fourth, of course, is the data scientists, trying to drive new business value. >> Alright. Well, before we let y'all take off, I want to hear about an offering that you've launched this week, apparently to great success — and about your presence here. Tell us a little bit about that before you go. >> Yeah, so we're also here sponsoring the jam lounge, and everybody's welcome to sign up. A number of our friends are there to competitively take on some challenges: come into the jam lounge, use our products, and understand what it means to accelerate that journey onto AWS. >> What can I do if I show up? Give me an idea. >> You can take some challenges to discover data, understand what data is there and isn't there, find relationships, and intuitively, through our UI, start exploring it and joining the dots — knowing your data and then creating policies to drive that data into use. >> Cool, good. And maybe pick up a football along the way. Thanks for being with us — thank you both for your time. And again, the jam lounge, right here at AWS re:Invent. We are live, and you're watching theCUBE.
Io-Tahoe Smart Data Lifecycle CrowdChat | Digital
(upbeat music) >> Voiceover: From around the globe, it's theCUBE, with digital coverage of Data Automated, an event series brought to you by Io-Tahoe. >> Welcome everyone to the second episode in our Data Automated series, made possible with support from Io-Tahoe. Today we're going to drill into the data lifecycle — meaning the sequence of stages that data travels through from creation to consumption to archive. The problem, as we discussed in our last episode, is that data pipelines are complicated, they're cumbersome, they're disjointed, and they involve highly manual processes. A smart data lifecycle uses automation and metadata to improve agility, performance, data quality and governance, and ultimately to reduce costs and time to outcomes. Now, in today's session we'll define the data lifecycle in detail and provide perspectives on what makes a data lifecycle smart — and importantly, how to build smarts into your processes. In a moment we'll be back with Adam Worthington from Ethos to kick things off, and then we'll go into an expert power panel to dig into the tech behind smart data lifecycles. Then we'll hop into the crowd chat and give you a chance to ask questions. So stay right there — you're watching theCUBE. (upbeat music) >> Voiceover: Innovation. Impact. Influence. Welcome to theCUBE. Disruptors, developers and practitioners learn from the voices of leaders who share their personal insights from the hottest digital events around the globe. Enjoy the best this community has to offer on theCUBE, your global leader in high tech digital coverage. >> Okay, we're back with Adam Worthington. Adam, good to see you — how are things across the pond? >> Good, thank you. I'm sure our weather's a little bit worse than yours is over on the other side, but good. >> Hey, so let's set it up. Tell us about yourself — what's your role as CTO? >> Yeah, Adam Worthington, as you said, CTO and co-founder of Ethos. We're a pretty young company — we're in our sixth year — and we specialize in emerging, disruptive technology within the infrastructure, data center and cloud space. My role as technical lead means it's kind of my job to be an expert in all of the technologies that we work with, which can be a bit of a challenge if you have a huge portfolio — one of the reasons we keep ours deliberately focused. I also lead the technical validation and evaluation of new technologies. >> So you guys are really technology experts, data experts, and probably also experts in process and in delivering customer outcomes, right? >> That's a great word there, Dave — outcomes. That's a lot of what I like to speak to customers about. >> Let's talk about smart data. When you throw out terms like this it can feel buzzwordy, but what are the critical aspects of so-called smart data? >> Cool. Well, to answer that I typically have to step back a little bit and set the scene in terms of where I came from and the types of problems I've solved. I'm really an infrastructure or solution architect by trade, and relatively organically over time I developed a personal framework and approach focused on three core design principles: simplicity, flexibility and efficiency — whatever it was I was designing — and obviously these need different things depending on the technology area we're working with. So that, for me, is a pretty good starting point.
Those are the kinds of areas that a smart approach to data will directly address. Reducing silos — that comes from simplifying: moving away from complexity of infrastructure, reducing the number of copies of data that we have across the infrastructure, and reducing the number of application environments needed for different areas. So the smarter we get with data — in my eyes, anyway — the further we move away from those traditional legacy approaches. >> But how does it work? In other words, what's involved in injecting smarts into your data lifecycle? >> Well, I didn't have this quote ready, but genuinely one of my favorite quotes is from the French philosopher and mathematician Blaise Pascal, who said — if I get this right — "I'd have written you a shorter letter, but I didn't have the time." I love that quote for lots of reasons. >> Dave: Right. >> It has direct application to what we're talking about, in that it's actually really complicated to develop a technology capability that makes things simple — that more directly meets the needs of the business through tech. So you provide self-service capability — and I don't just mean self-provisioning, I mean making data and infrastructure make sense to the business users who are using them. >> Your job, correct me if I'm wrong, is to kind of put all that together in a solution, and then help the customer realize what we talked about earlier — that business outcome. >> Yeah, and that's sitting at both sides, and understanding both sides. What's key to our ability to deliver on exactly what you've just said is being experts in the capabilities and the new and better ways of doing things — but also having the business understanding to ask the right questions, to identify how you can better approach this and help solve these issues. Another area that I really like is that with these platforms you can do more with less. That's not just about reducing data redundancy; it's about creating application environments — an infrastructure — that can service different requirements, handling random I/O as well as sequential, without getting too low-level on the tech. What that means is that you don't necessarily have to move data from application environment A, do one thing with it, collate it, and then move it to application environment B, then application environment C, in an analytics kind of left-to-right workload. You keep your data where it is, use it for different requirements within the infrastructure, and again do more with less. And what that does — it's not just about simplicity and efficiency — it significantly reduces the time to value that these businesses face, as well. >> Do you have examples that you can share with us — even anonymized — of customers you've worked with who are maybe a little further down the journey?
So, whether it was virtualization, that was different to if they were backing up, different if they were backing up another data base environment they were using something else in the cloud. So, a consolidated approach that we recommended to work with them on. They were able to significantly reduce complexity and reduce the amount of time that it took them. So, what they were able to achieve and this was again, one of the key departments they had. They'd gone above the threshold of being able to backup all of them. >> Adam, give us the final thoughts, bring us home in this segment. >> Well, the final thoughts, so this is something, yeah we didn't particularly touch on. But, I think it's kind of slightly hidden, it isn't spoken about as much as I think it could be. Is the traditional approaches to infrastructure. We've already touched on that they can be complicated and there's a lack of efficiency. It impacts a user's ability to be agile. But, what you find with traditional approaches and we've already touched on some of the kind of benefits to new approaches there, is that they're often very prescriptive. They're designed for a particular firm. The infrastructure environment, the way that it's served up to the users in a kind of a packaged kind of way, means that they need to use it in that, whatever way it's been dictated. So, that kind of self-service aspect, as it comes in from a flexibility standpoint. But, these platforms and these platform approaches is the right way to address technology in my eyes. Enables the infrastructure to be used flexibly. So, the business users and the data users, what we find is that if we put in this capability into their hands. They start innovating the way that they use that data. And, the way that they bring benefits. And, if a platform is too prescriptive and they aren't able to do that, then what you're doing with these new approaches is get all of the metrics that we've touched on. It's fantastic from a cost standpoint, from an agility standpoint. But, what it means is that the innovators in the business, the ones that really understand what they're looking to achieve, they now have the tools to innovate with that. And, I think, and I've started to see that with projects that we've completed, if you do it in the right way, if you articulate the capability and you empower the business users in the right way. Then, they're in a significantly better position, these businesses to take advantages and really sort of match and significantly beat off their competition environment spaces. >> Super Adam, I mean a really exciting space. I mean we spent the last 10 years gathering all this data. You know, trying to slog through it and figure it out and now, with the tools that we have and the automation capabilities, it really is a new era of innovation and insight. So, Adam Worthington, thanks so much for coming in theCUBE and participating in this program. >> Yeah, exciting times and thank you very much Dave for inviting me, and yeah big pleasure. >> Now, we're going to go into the power panel and go deeper into the technologies that enable smart data lifecyles. And, stay right there, you're watching theCUBE. (light music) >> Voiceover: Are you interested in test-driving the Io-Tahoe platform? Kickstart the benefits of Data Automation for your business through the IoLabs program. A flexible, scalable, sandbox environment on the cloud of your choice. With setup, service and support provided by Io-Tahoe. 
Click on the link and connect with a data engineer to learn more and see Io-Tahoe in action. >> Welcome back, everybody, to the power panel: driving business performance with smart data lifecycles. Lester Waters is here — he's the Chief Technology Officer of Io-Tahoe. He's joined by Patrick Smith, who is field CTO at Pure Storage, and Ezat Dayeh, who is a Systems Engineering Manager at Cohesity. Gentlemen, good to see you — thanks so much for coming on this panel. >> Thank you, Dave. >> Yes. >> Thank you, Dave. >> I wonder if each of you could just give us a quick overview of your role, and the number one problem that you're focused on solving for your customers. Let's start with Lester, please. >> Yes, I'm Lester Waters, Chief Technology Officer for Io-Tahoe, and really the number one problem that we're trying to solve for our customers is to help them understand what they have. Because if they don't understand what they have in terms of their data, they can't manage it, they can't control it, they can't monitor it, they can't ensure compliance. So really, finding out all that you can about the data you have, and building a catalog that can be readily consumed by the entire business, is what we do. >> Patrick, "field CTO" in your title says to me you're talking to customers all the time, so you've got a good perspective on it. Give us your take on things here. >> Yeah, absolutely. My patch is EMEA, and I talk to customers and prospects in lots of different verticals across the region. As they look at their environments and their data landscapes, they're faced with massive growth in the data they're trying to analyze, and demands to get insights faster and to deliver business value faster than they've ever had to in the past. >> Got it. And then Ezat, at Cohesity — you're like the new kid on the block, you guys are really growing rapidly, and you created this whole notion of data management, backup and beyond. From the systems engineering side, what are you seeing from customers — your role, and the number one problem that you're solving? >> Yeah, sure. The number one problem I see, time and again, speaking with customers, is all around data fragmentation. Due to things like organic growth, and maybe budgetary limitations, infrastructure has grown over time very piecemeal, and it's highly distributed internally. And just to be clear, when I say internally, that could mean it's on multiple platforms or silos within an on-prem infrastructure — but it also extends to the cloud as well. >> Right. Hey, cloud is cool — everybody wants to be in the cloud, right? So you're right, it creates maybe unintended consequences. Let's start with the business outcome and try to work backwards. People want to get more insights from data, and they want a more efficient data lifecycle. Lester, let me start with you: in thinking about the North Star of creating data-driven cultures, what is the North Star for customers here? >> I think the North Star, in a nutshell, is driving value from your data. Without question — we differentiate ourselves these days by even the nuances in our data. Now, underpinning that, there are a lot of things that have to happen to make that work out well. For example, making sure you adequately protect your data: do you have a good storage system?
Do you have good backup, with solid recovery point objectives and recovery time objectives? Are you fully compliant — are you ticking all the boxes? There are a lot of regulations these days with respect to compliance, data retention, data privacy and so forth — are you ticking those boxes? Are you being efficient with your data? In other words, there's a statistic someone mentioned to me the other day that 53% of all businesses have between three and 15 copies of the same data. Finding and eliminating those is part of the problem you need to chase. >> You're right, Lester — no doubt it's about business value, and a lot of that comes from reducing the end-to-end cycle times. But is there anything you guys would add to that? Patrick and Ezat — maybe start with Patrick. >> Yeah, I think getting value from data hits on what everyone wants to achieve, but there are a couple of key steps in doing that. First of all is getting access to the data, and that really hits three big problems. Firstly, working out what you've got. Secondly, after working out what you've got, how to get access to it — because it's all very well knowing you've got some data, but if you can't get access to it, whether for privacy reasons or security reasons, that's a big challenge. And then finally, once you've got access to the data, making sure you can process it in a timely manner. >> For me, it would be that an organization has a really good global view of all of its data, understands the data flows and dependencies within its infrastructure, understands the precise legal and compliance requirements, and has the ability to action changes or initiatives within its environment — forgive the pun — with cloud-like agility. And that's no easy feat, right? That is hard work. >> Okay, so we've talked about the challenges and some of the objectives, but there are a lot of blockers out there, and I want to understand how you guys are helping remove them. Lester, what do you see as some of the big blockers in terms of people really leaning in to this smart data lifecycle? >> Silos are probably one of the biggest ones I see in businesses — it's my data, not your data, lots of compartmentalization — and breaking that down is one of the challenges. Having the right tools to help you do that is only part of the solution; there are obviously a lot of cultural things that need to take place to break down those silos and work together. And if you can identify where you have redundant data across your enterprise, you might be able to consolidate it.
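Lester's redundancy point — and the "3 to 15 copies" statistic above — is easy to demonstrate at the file level. A minimal sketch that groups files by content hash across a couple of mount points; it assumes the silos are reachable as file paths, whereas a real platform would reach into databases and object stores through connectors:

```python
import hashlib
from pathlib import Path

def digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Content hash of one file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(roots):
    """Group files by content hash across several storage mount points."""
    seen = {}
    for root in roots:
        for p in Path(root).rglob("*"):
            if p.is_file():
                seen.setdefault(digest(p), []).append(str(p))
    return {h: paths for h, paths in seen.items() if len(paths) > 1}

# Hypothetical silos mounted side by side:
# print(find_duplicates(["/mnt/warehouse_exports", "/mnt/team_shares"]))
```

Anything returned with more than one path is a candidate for consolidation — the quick win the panel keeps coming back to.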
>> Yeah. And over to Patrick — one of the blockers that I see is legacy infrastructure, technical debt sucking all the budget, and too many people having to look after it. >> As you look at the infrastructure that supports people's data landscapes today, for primarily legacy reasons the infrastructure itself is siloed. You have different technologies with different underlying hardware and different management methodologies, which are there for good reason, because historically you had to have specific fitness for purpose for different data requirements. >> Dave: Mm-hm. >> And that's one of the challenges we tackled head-on at Pure, with the FlashBlade technology and the concept of the data hub: a platform that can deliver different characteristics for the different workloads, but from a consistent data platform. >> Now, Ezat, I want to go to you, because your world, to me, goes beyond backup. One of the challenges is, they say backup is one thing, recovery is everything — but as well, the CFO doesn't want to pay for just protection. One of the things I like about what you guys have done is you've broadened the perspective, to get more value out of what was once seen as an insurance policy. >> I do see one of the biggest blockers as the fact that the task at hand can be overwhelming for customers. But the key here is to remember that it's not an overnight change — it's not the flick of a switch. It's something that can be tackled in a very piecemeal manner. And absolutely, like you've said, reduction in TCO and being able to leverage the data for other purposes is a key driver for this. So this can be resolved — it can be pretty straightforward, and quite painless as well. The same goes for unstructured data, which is very complex to manage. We've all heard the stats from the analysts: data obviously is growing at an extremely rapid rate, but when you actually look at how it's growing, 80% of that growth is in unstructured data and only 20% is in structured data. So these are quick-win areas where customers can realize immediate TCO improvement and increased agility as well. >> Let's paint a picture of this, guys, if I can bring up the lifecycle. What you can see here is you've got this cycle — the data lifecycle — and what we want to do is inject intelligence, or smarts, into it. You start with ingestion, or creation, of data. You're storing it — you've got to put it somewhere, right? You've got to classify it, you've got to protect it. Then of course you want to reduce the copies, make it efficient. Then you want to prepare it so that the business can actually consume it, and then you've got compliance, governance and privacy issues. I wonder if we could start with you, Lester — this is the picture of the lifecycle. What role does automation play in terms of injecting smarts into the lifecycle? >> Automation is key here, especially from the discover, catalog and classify perspective. I've seen companies that will take and dump all of their database schemas into a spreadsheet, so that they can sit down and manually figure out what "attribute 37" means as a column name. And that's only the tip of the iceberg. So being able to automatically detect what you have, automatically deduce what's consuming the data upstream and downstream, and understand all the things related to the lifecycle of your data — backup, archive, deletion — is key. And having good tooling in this area is very important.
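The "attribute 37 in a spreadsheet" problem is what automated discovery replaces. The sketch below shows the first, crudest pass — walking a database's own catalog and tagging columns by name heuristics. It uses SQLite purely as a self-contained stand-in, and the hint patterns are invented; a production tool would introspect Oracle, SQL Server or Postgres catalogs through connectors, then sample content and apply machine learning on top:

```python
import re
import sqlite3

# Stand-in source system with one deliberately cryptic column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (cust_id INT, email_addr TEXT, ssn TEXT, attr_37 TEXT)")

# Name-based heuristics: a first pass before content sampling and ML kick in.
HINTS = [
    (re.compile(r"ssn|social", re.I), "PII.SSN"),
    (re.compile(r"email", re.I), "PII.EMAIL"),
    (re.compile(r"^attr_\d+$", re.I), "NEEDS_REVIEW"),  # the 'attribute 37' case
]

catalog = []
for (table,) in db.execute("SELECT name FROM sqlite_master WHERE type='table'"):
    for _, column, col_type, *_ in db.execute(f"PRAGMA table_info({table})"):
        tags = [tag for pattern, tag in HINTS if pattern.search(column)]
        catalog.append({"table": table, "column": column, "type": col_type, "tags": tags})

for entry in catalog:
    print(entry)  # attr_37 comes out tagged NEEDS_REVIEW instead of hiding in a spreadsheet
```

Even this toy version beats the spreadsheet in one crucial respect: it re-runs in seconds whenever a new schema appears, which is what makes cataloging repeatable rather than a one-off manual exercise.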
Once you understand the data and the value of the data, then that's where you can work out where the data needs to be at any point in time. >> Right, so Pure and Cohesity obviously partnered to do that and of course, Ezat you guys are part of the protect, you're certainly part of the retain. But also, you provide data management capabilities and analytics, I wonder if you could add some color there? >> Yeah absolutely, so like you said you know, we focus pretty heavily on data protection as just one of our areas. And, that infrastructure it is just sitting there really can you know, the legacy infrastructure it's just sitting there you know, consuming power, space, cooling and pretty inefficient. And, automating that process is a key part of that. If I have a modern day platform such as you know, the Cohesity data platform I can actually do a lot of analytics on that through applications. So, we have a marketplace for apps. >> I wonder if we could talk about metadata. It's increasingly important you know, metadata is data about the data. But, Lester maybe explain why it's so important and what role it plays in terms of creating smart data lifecycle. >> A lot of people think it's just about the data itself. But, there's a lot of extended characteristics about your data. So, imagine if for my data lifecycle I can communicate with the backup system from Cohesity. And, find out when the last time that data was backed up or where it's backed up to. I can communicate, exchange data with Pure Storage and find out what tier it's on. Is the data at the right tier commencer with it's use level? If I could point it out. And, being able to share that metadata across systems. I think that's the direction that we're going in. Right now, we're at the stage we're just identifying the metadata and trying to bring it together and catalog it. The next stage will be okay, using the APIs and that we have between our systems. Can we communicate and share that data and build good solutions for customers to use? >> I think it's a huge point that you just made, I mean you know 10 years ago, automating classification was the big problem. And you know, with machine intelligence you know, we're obviously attacking that. But, your point about as machines start communicating to each other and you start you know, it's cloud to cloud. There's all kinds of metadata, kind of new metadata that's being created. I often joke that some day there's going to be more metadata than data. So, that brings us to cloud and Ezat, I'd like to start with you. >> You know, I do think that you know, having the cloud is a great thing. And, it has got its role to play and you can have many different you know, permutations and iterations of how you use it. And, you know, as I've may have sort of mentioned previously you know, I've seen customers go into the cloud very, very quickly and actually recently they're starting to remove workloads from the cloud. And, the reason why this happens is that you know, cloud has got its role to play but it's not right for absolutely everything. Especially in their current form, as well. A good analogy I like to use and this may sound a little bit clique but you know, when you compare clouds versus on premises data centers. You can use the analogies of houses and hotels. So, to give you an idea, so you know, when we look at hotels that's like the equivalent of a cloud, right? I can get everything I need from there. 
>> I think that's a huge point you just made. Ten years ago, automating classification was the big problem, and with machine intelligence we're obviously attacking that. But your point about machines starting to communicate with each other — cloud to cloud — there's all kinds of new metadata being created. I often joke that some day there's going to be more metadata than data. So that brings us to cloud. Ezat, I'd like to start with you. >> You know, I do think having the cloud is a great thing. It has got its role to play, and you can have many different permutations and iterations of how you use it. As I may have mentioned previously, I've seen customers go into the cloud very, very quickly — and recently they're actually starting to remove workloads from the cloud. The reason this happens is that cloud has got its role to play, but it's not right for absolutely everything, especially in its current form. A good analogy I like to use — and this may sound a little bit cliché — is that when you compare clouds to on-premises data centers, you can use the analogy of houses and hotels. Hotels are the equivalent of the cloud: I can get everything I need from there — my food, my water, my facilities. If I need to accommodate more people, I can rent more rooms. I don't have to maintain the hotel — it's all done for me. Houses, on the other hand, are the equivalent of on-premises infrastructure: I pretty much have to do everything myself. I have to purchase the house, I have to maintain it, I have to buy my own food and water, I have to make improvements myself. But then why do we all live in houses and not in hotels? The simple answer I can come up with is that it's cheaper — it's cheaper to do it myself. That's not to say hotels haven't got a role to play: if I've got loads of visitors coming over for the weekend, I'm not going to build an extension to my house just for them — I'll burst into the hotel, into the cloud, and use it for things like that. So what I'm really saying is, the cloud is great for many things, but it can work out costlier for certain applications, while for others it's a perfect fit. >> That's an interesting analogy — I hadn't thought of that before. You're right, because I was going to say that part of it is you want the cloud experience everywhere. But you don't always want the cloud experience — especially when you're with your family and you want certain privacy. I'd not heard that before, Ezat, so that's a new perspective — thank you. But Patrick, I do want to come back to that cloud experience, because in fact that's what's happening in a lot of cases: organizations are extending the cloud properties of automation on-prem. >> Yeah, I thought Ezat brought up a really interesting point and a great analogy for the use of the public cloud, and it really reinforces the importance of the hybrid and multicloud environment, because it gives you the flexibility to choose the optimal environment for your business workloads. That's what it's all about — and the flexibility to change which environment you're running in, from one month to the next or from one year to the next, because workloads change and the characteristics available in the cloud change. The hybrid cloud is something we've lived with ourselves at Pure: our Pure management technology actually sits in a hybrid cloud environment. We started off entirely cloud-native, but now we use the public cloud for compute, and we use our own technology at the end of a high-performance network link to support our data platform. We're getting the best of both worlds, and I think that's where a lot of our customers are trying to get to. >> All right, I want to come back to that in a moment. But before we do, Lester, I wonder if we could talk a little bit about compliance, governance and privacy. For the Brits on this panel — we're still in the EU for now, but the EU is looking at new rules and regulations going beyond GDPR. Where do privacy, governance and compliance fit into the data lifecycle? And Ezat, I want your thoughts on this as well. >> Ah yes, this is a very important point, because the landscape for compliance around data privacy and data retention is changing very rapidly, and being able to keep up with those changing regulations in an automated fashion is the only way you're going to manage it. I think there's even some sort of ruling coming out today or tomorrow with a change to GDPR. So these are all very key points, and being able to codify those rules into software — whether it's Io-Tahoe, or your storage system, or Cohesity — to help you be compliant, is crucial.
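"Codifying the rules" can be as simple as expressing each regulation as data and evaluating cataloged assets against it. The sketch below invents a toy rule set and asset shape purely to illustrate the idea — the retention window, region list and tags are not drawn from any actual regulation:

```python
from datetime import date

# Invented, illustrative rules: each one is data, so accommodating a new
# regulation means adding a record, not rewriting code.
RULES = [
    {"name": "retention-limit", "applies_to": "PII", "max_age_days": 7 * 365},
    {"name": "eu-residency",    "applies_to": "PII",
     "allowed_regions": {"eu-west-1", "eu-central-1"}},
]

def check(asset, today=None):
    """Return the list of rule violations for one cataloged asset."""
    today = today or date.today()
    findings = []
    for rule in RULES:
        if rule["applies_to"] not in asset["tags"]:
            continue
        if "max_age_days" in rule and (today - asset["created"]).days > rule["max_age_days"]:
            findings.append(f"{rule['name']}: past retention window")
        if "allowed_regions" in rule and asset["region"] not in rule["allowed_regions"]:
            findings.append(f"{rule['name']}: stored in {asset['region']}")
    return findings

asset = {"name": "s3://bucket/customers.csv", "tags": {"PII"},
         "created": date(2011, 5, 1), "region": "us-east-1"}
print(check(asset))  # both rules fire for this stale, misplaced asset
```

Run over a catalog like the one discussed earlier, a checker of this shape is also how you evidence due diligence: every finding is a dated, reproducible record.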
>> Yeah, Ezat — anything you can add there? This really is your wheelhouse. >> Yeah, absolutely. I think anybody who's watching this has probably gotten the message that fewer silos is better, and that absolutely applies to data in the cloud as well. By aiming to consolidate onto fewer platforms, customers can realize much better control over their data, and the natural effect of this is that it makes meeting compliance and governance a lot easier. When it's consolidated, you can start to confidently understand who's accessing your data and how frequently they're accessing it. You can also do things like detecting anomalous file access activity and quickly identifying potential threats. >> Okay, Patrick — you talked earlier about storage optimization. We talked to Adam Worthington about the business case: you've got the numerator, which is the business value, and then a denominator, which is the cost. What's unique about Pure in this regard? >> Yeah, I think there are multiple dimensions to that. Firstly, if you look at the difference from legacy storage platforms — they used to take up racks or aisles of space in a data center. With the flash technology that underpins FlashBlade, we effectively switch out racks for rack units, and that has a big play in terms of data center footprint and the environmentals associated with a data center. If you look at extending out storage efficiencies and the benefits they bring, the performance has a direct effect on staff — whether that's the simplicity of the platform, so that it's easy and efficient to manage, or the efficiency you get from your data scientists, who are using the outcomes from the platform and becoming more productive. If you look at some of our customers in the financial space, their time to results has improved by 10 or 20x by switching to our technology from legacy technologies for their analytics platforms. >> So guys, we've been running CUBE interviews in our studios remotely for the last 120 days, and this is probably the first interview I've done where I haven't started off talking about COVID. Lester, I wonder if you could talk about the smart data lifecycle and how it fits into this isolation economy — and hopefully what will soon be a post-isolation economy. >> Yeah, COVID has dramatically accelerated the data economy. First and foremost, we've all learned to work at home. We've all had that experience where people would hem and haw about being able to work at home just a couple of days a week, and here we are working five days a week. That's had a knock-on impact on infrastructure to be able to support it. But going further than that, the data economy is all about how a business can leverage its data to compete in this new world order that we are now in. COVID has really been a forcing function — it's probably one of the few good things to come out of COVID, that we've been forced to adapt — and it's been an interesting journey, and it continues to be so. >> Like Lester said, we're seeing huge impact here.
Working from home has pretty much become the norm now — companies have been forced into making it work. If you look at online retail, that's accelerated dramatically as well, and so have unified communications and video conferencing. So really, the point here is that, yes, absolutely, we've compressed into the past four months what would probably have taken five years, maybe even ten years or so. >> We've got to wrap. But Lester, let me ask you to paint a picture of the journey — the maturity model that people have to take. If they want to get into it, where do they start, and where are they going? Give us that view. >> Yeah. I think first is knowing what you have. If you don't know what you have, you can't manage it, you can't control it, you can't secure it, and you can't ensure it's compliant — so that's first and foremost. The second is really ensuring that you're compliant once you know what you have. Are you securing it? Are you following the regulations, and are you able to evidence that? How are you storing your data — are you archiving it, are you storing it effectively and efficiently? Nirvana, from my perspective, is really getting to a point where you've consolidated your data, you've broken down the silos, and you have a virtually self-service environment by which the business can consume and build upon its data. And really, at the end of the day, as we said at the beginning, it's all about driving value out of your data — and automation is key to this journey. >> That's awesome — you've just described sort of a winning data culture. Lester, Patrick, Ezat, thanks so much for participating in this power panel. >> Thank you, David. >> Thank you. >> All right, so that's a great overview of the steps in the data lifecycle and how to inject smarts into the processes, really to drive business outcomes. Now it's your turn: hop into the crowd chat — please log in with Twitter or LinkedIn or Facebook, ask questions, answer questions, and engage with the community. Let's crowd chat! (bright music)
Enterprise Data Automation | Crowdchat
>> Narrator: From around the globe, it's theCUBE, with digital coverage of Enterprise Data Automation, an event series brought to you by Io-Tahoe. >> Welcome everybody to Enterprise Data Automation, a co-created digital program on theCUBE with support from Io-Tahoe. My name is Dave Vellante, and today we're using the hashtag #DataAutomated. You know, organizations really struggle to get more value out of their data — time to data-driven insights that drive cost savings or new revenue opportunities simply takes too long. So today we're going to talk about how organizations can streamline their data operations through automation, machine intelligence, and really simplifying data migrations to the cloud. We'll be talking to technologists, visionaries, hands-on practitioners and experts who are not just talking about streamlining their data pipelines — they're actually doing it. So keep it right there; we'll be back shortly with Ajay Vohora, who's the CEO of Io-Tahoe, to kick off the program. You're watching theCUBE, the leader in digital global coverage. We're right back after this short break. >> Voiceover: Innovation. Impact. Influence. Welcome to theCUBE. Disruptors, developers and practitioners learn from the voices of leaders who share their personal insights from the hottest digital events around the globe. Enjoy the best this community has to offer on theCUBE, your global leader in high tech digital coverage. >> Okay, we're back. Welcome back to Data Automated. Ajay Vohora is CEO of Io-Tahoe. Ajay, good to see you — how are things in London? >> Thanks, doing well. The customers that I speak to day in, day out, that we partner with, they're busy adapting their businesses to serve their own customers. It's very much a game of ensuring that we can serve our customers to help their customers, and the adaptation that's happening here is trying to be more agile — you've got to be more flexible. There's a lot of pressure on data, a lot of demand on data, to deliver more value to the business and to those customers. >> As I said, we've been talking about DataOps a lot — the idea being DevOps applied to the data pipeline. But talk about enterprise data automation: what is it to you, and how is it different from DataOps? >> DevOps, you know, has been great for breaking down the silos between different roles and functions and bringing people together to collaborate, and we definitely see those tools, those methodologies, those processes — that kind of thinking — lending itself to data. DataOps is exciting. What we look to do is build on top of that with data automation. It's the nuts and bolts of the algorithms, the models behind the machine learning, the functions — that's where we invest our R&D — and we bring that in to build on top of the methods and ways of thinking that break down those silos, injecting that automation into the business processes that are going to drive a business to serve its customers. It's a layer beyond DevOps and DataOps; the way I think about it is the automation behind a new dimension. We've come a long way in the last few years — where we started out was automating some of those simple-to-codify tasks that have a high impact on an organization, across the data warehouse, in a cost-effective way:
data-related tasks that classify data — and a lot of the original patents and value that we built up is very much around that. >> I'd love to get into the tech a little bit in terms of how it works, and I think we have a graphic here that gets into that a little bit. So, guys, if you'd bring that up. >> Sure. Right there in the middle, at the heart of what we do, is the intellectual property that we've built up over time. It takes heterogeneous data sources — your Oracle relational database, your mainframe, and increasingly APIs and devices that produce data — and creates the ability to automatically discover that data and classify it. After it's classified, we then have the ability to form relationships across those different source systems, silos and lines of business. And once we've automated that, we can start to do some cool things: put context and meaning around that data. So it's moving now to being data-driven — and increasingly, where we have really smart people in our customer organizations who want to do some of those advanced knowledge tasks, data scientists and, yeah, quants in some of the banks that we work with, the onus is then on putting everything we've done there with automation — classifying it, understanding relationships, the quality, the policies you can apply to that data — and putting it in context. Once a professional using data has the ability to put that data in context and search across the entire enterprise estate, they can start to do some exciting things and piece together the tapestry, that fabric, across the different systems. That could be a CRM or ERP system such as SAP, and some of the newer cloud databases that we work with — Snowflake is a great example. If I look back maybe five years ago, we had a prevalence of data lake technologies at the cutting edge; those are converging to some of the cloud platforms that we work with, Google and AWS. And I think, as you said, those manual attempts to try and grasp such a complex challenge quickly run out of steam at scale, because once you've got your hands around the details of what's in your data estate, it's changed: you've onboarded a new customer, you've signed up a new partner, a customer has adopted a new product that you've just launched — and that slew of data keeps coming. So it's keeping pace with that, and the only answer really is some form of automation.
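One way to picture the "form relationships across source systems" step is join inference by value overlap — if most of the values in one column appear in a column from a different silo, the two are probably related. A toy sketch with invented column names and values; production discovery weighs many more signals (names, types, cardinality, learned models):

```python
def overlap(a: set, b: set) -> float:
    """Containment score: how much of the smaller column appears in the other."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

crm_customer_ids = {"C001", "C002", "C003", "C004"}
erp_account_refs = {"C002", "C003", "C004", "C999"}
iot_device_ids   = {"D17", "D18"}

candidates = {
    ("crm.customer_id", "erp.account_ref"): overlap(crm_customer_ids, erp_account_refs),
    ("crm.customer_id", "iot.device_id"):   overlap(crm_customer_ids, iot_device_ids),
}
print({pair: s for pair, s in candidates.items() if s >= 0.7})
# {('crm.customer_id', 'erp.account_ref'): 0.75} -> candidate cross-silo relationship
```

Each inferred link becomes an edge in the catalog, which is what lets a user "piece together the tapestry" without hand-drawn mappings.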
>> You're working with AWS, you're working with Google, and you've got Red Hat and IBM as partners. What is attracting those folks to your ecosystem, and give us your thoughts on the importance of ecosystem. >> That's fundamental. When I came in as CEO of Io-Tahoe, one of the trends that I wanted us to be part of was being open — having an open architecture. That was close to my heart, because as a CIO you've got a budget and a vision, and you've already made investments into your organization — and some of those are pretty long-term bets, going out five to ten years: a CRM system, training up your people, getting everybody working together around a common business platform. What I wanted to ensure is that we could openly plug in, using the APIs that were available, to leverage the investment and cost that has already gone into managing an organization's IT, and to let business users benefit from it. So part of the reason we've been able to be successful with partners like Google and AWS — and increasingly a number of technology players: Red Hat; MongoDB is another one where we're doing a lot of good work; and Snowflake here as well — is that those investments have already been made by the organizations that are our customers. We want to make sure we're adding to that, so they're leveraging the value they've already committed to. >> Yeah, and maybe you could give us some examples of the ROI and the business impact. >> Yeah, the ROI, David, is built upon the three things that I mentioned. It's a combination of leveraging the existing investment in the existing estate — whether that's on Microsoft Azure or AWS or Google or IBM — and putting that to work, because the customers we work with have made those choices. On top of that, it's ensuring that the automation is working right down to the level of the data, at the column level or the file level. We don't just deal with metadata — we're very specific, at the most granular level. So as we run our processes and the automation — classification, tagging, applying policies from across the different compliance and regulatory needs that an organization has — everything that happens downstream from that is ready to serve a business outcome. With the automation we can run those processes within hours of getting started, build that picture, visualize it, and bring it to life. The ROI right off the bat is finding data that should have been deleted, and data that was copies, and being able to give the architect — whether we're working on GCP, or a migration to any other cloud such as AWS, or a multicloud landscape — that view right off the bat. >> Ajay, thanks so much for coming on theCUBE and sharing your insights and your experiences — great to have you. >> Thank you, David. Look forward to speaking again soon. >> Now we want to bring in the customer perspective. We have a great conversation with Paula D'Amico, Senior Vice President of Enterprise Data Architecture at Webster Bank. So keep it right there. >> Voiceover: Io-Tahoe — Data Automated. Improve efficiency, drive down costs, and make your enterprise data work for you. We're on a mission to enable our customers to automate the management of data, to realize maximum strategic and operational benefits. We envisage a world where data users consume accurate, up-to-date, unified data distilled from many silos to deliver transformational outcomes. Activate your data and avoid manual processing. Accelerate data projects by enabling non-IT resources and data experts to consolidate, categorize and master data. Automate your data operations: power digital transformations by automating a significant portion of data management through human-guided machine learning. Get value from the start: increase the velocity of business outcomes with complete, accurate data, curated automatically for data visualization tools and analytic insights. Improve the security and quality of your data: data automation improves security by reducing the number of individuals who have access to sensitive data, and it can improve quality — many companies report double-digit error reduction in data entry and other repetitive tasks.
Trust the way data works for you. Data automation by Io-Tahoe learns as it works, and can augment business user behavior. It learns from exception handling, and scales up or down as needed to prevent system or application overloads or crashes. It also allows innate knowledge to be socialized rather than individualized: no longer will your company struggle when the employee who knows how a report is done retires or takes another job — the work continues without the need for detailed information transfer. Continue supporting the digital shift. Perhaps most importantly, data automation allows companies to begin making moves toward a broader, more aspirational transformation, but on a small scale that is easy to implement and manage and delivers quick wins. Digital is the buzzword of the day, but many companies recognize that it is a complex strategy requiring time and investment. Once you get started with data automation, the digital transformation is initiated, and leaders and employees alike become more eager to invest time and effort in a broader digital transformation agenda. >> Hi everybody, we're back. This is Dave Vellante, and we're covering the whole notion of automating data in the enterprise. I'm really excited to have Paula D'Amico here — she's Senior Vice President of Enterprise Data Architecture at Webster Bank. Paula, good to see you — thanks for coming on. >> Nice to see you too, yes. >> So let's start with Webster Bank. You guys are kind of a regional bank — I think New York, New England, I believe headquartered out of Connecticut — but tell us a little bit about the bank. >> Yeah, Webster Bank is regional: Boston, and into New York, very focused on Westchester and Fairfield County. It's a really highly rated regional bank for this area. They hold quite a few awards for being supportive of the community, and they're really moving forward technology-wise. Currently we have a small group that is working toward moving into a more futuristic, more data-driven data warehouse — that's our first item. And the other item is to drive new revenue by anticipating what customers do when they go to the bank, or when they log in, so we can give them the best offer. The only way to do that is if you have timely, accurate, complete data on the customer, and something of real value to offer them. >> At the top level, what are some of the key business drivers that are catalyzing your desire for change? >> The ability to give the customer what they need at the time they need it. And what I mean by that is, we have customer interactions in multiple ways, right? I want the customer to be able to walk into a bank, or go online, and see the same format — to have the same feel, the same look — and also to be offered the next best offer for them. >> Part of it is really the cycle time — the end-to-end cycle time — that you're compressing. And then there are, if I understand it, residual benefits that are pretty substantial from a revenue opportunity. >> Exactly. It's to drive new customers to new opportunities, it's to manage risk, and it's to optimize the banking process and then, obviously, to create new business. And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door, or right when they open up their app.
>> Do you see the potential to increase the data sources and hence the quality of the data, or is that sort of premature? >> Oh no, exactly right. So right now we ingest a lot of flat files from our mainframe-type running systems that we've had for quite a few years. But now that we're moving to the cloud and off-prem, you know, moving off-prem into, like, an S3 bucket where we can process that data and get that data faster, by using real-time tools to move it into a place where, like, Snowflake can utilize that data, or we can give it out to our data scientists, who are out in the lines of business right now, which is great, because I think that's where data science belongs. That's what we're working towards now: giving them more self-service, giving them the ability to access the data in a more robust way, and it's a single source of truth, so they're not pulling the data down into their own Tableau dashboards and then pushing the data back out. I have eight engineers, data architects, database administrators, and then traditional data-warehousing people. And some customers that I have, business customers in the lines of business, just want to subscribe to a report; they don't want to go out and do any data science work, and we still have to provide that. So we still want to provide them some kind of report regimen where they wake up in the morning, open up their email, and there's the report they need, which is great, and it works out really well. And one of the things, this is why we purchased Io-Tahoe: I wanted the ability to give the lines of business the ability to do search within the data, and to review the data flows and data redundancy and things like that, and help me clean up the data, and also to give it to the data analysts. They'd be asked for a certain report, and it used to take, "Okay, well, give us four weeks, we're going to go look at the data, and then we'll come back and tell you what we can do." But now, with Io-Tahoe, they're able to look at the data and then, in one or two days, go back and say, "Yes, we have the data, this is where it is, and these are the data flows that we've found." Also, which is what I call it, there's the birth of a column: it's where the column was created, where it went live as a teenager, and then where it went to, you know, die or be archived.
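The flat-file-to-cloud move Paula describes is, at its simplest, a landing-zone upload. A minimal sketch, assuming hypothetical bucket and path names and standard AWS credentials from the environment; Snowflake or other tools would then pick the file up from S3:

```python
# Minimal sketch: land a mainframe flat-file extract in an S3 bucket so
# cloud tools can consume it. Bucket/prefix names are hypothetical.
import boto3

s3 = boto3.client("s3")

def land_extract(local_path: str, bucket: str, prefix: str) -> str:
    """Upload one flat-file extract and return the S3 key it landed at."""
    key = f"{prefix}/{local_path.rsplit('/', 1)[-1]}"
    s3.upload_file(local_path, bucket, key)  # boto3 handles multipart for big files
    return key

# Hypothetical usage (requires real credentials and an existing bucket):
# land_extract("extracts/customers_20200630.dat",
#              "bank-landing-zone", "mainframe/customers")
```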
>> In researching Io-Tahoe, it seems like one of the strengths of their platform is the ability to visualize data, the data structure, and actually dig into it, but also see it, and that speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI, or machine intelligence, into the data pipeline, which is really how you're attacking automation, right? >> Exactly. So, let's say I have seven lines of business that are asking me questions, and one of the questions they'll ask me is, we want to know if this customer is okay to contact, right? And, you know, there are different avenues, so you can go online and say "do not contact me," or you can go to the bank and say, "I don't want email, but I'll take texts, and I want, you know, phone calls," all that information. Seven different lines of business asked me that question in different ways, and each project before I got there used to be siloed. So one analyst would spend 100 hours to do that analytical work for one project, and then another analyst would do another 100 hours on the other project. Well, now I can do that all at once. I can do those types of searches and say, yes, we already have that documentation, here it is, and this is where you can find where the customer has said, you know, "I don't want to be contacted by email," or "I've subscribed to get emails from you." I'm using Io-Tahoe's automation right now to bring in the data and start analyzing the data flows, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse that I'm working off is on-prem, it's an Oracle database, and it's 15 years old, so it has extra data in it, it has things that we don't need anymore, and Io-Tahoe is helping me shake out that extra data that does not need to be moved into my S3. So it's saving me money when I'm moving off-prem. >> What's your vision for your data-driven organization? >> I want the bankers to be able to walk around with an iPad in their hands and be able to access data for that customer really fast, and be able to give them the best deal that they can get. I want Webster to be right there on top, being able to add new customers and to serve our existing customers, who may have had bank accounts since they were 12 years old and are now, you know, multi-... whatever. I want them to have the best experience with our bankers. >> That's really what I want as a banking customer. I want my bank to know who I am, anticipate my needs, and create a great experience for me, and then let me go on with my life. So that's a great story. Love your experience, your background and your knowledge; can't thank you enough for coming on theCUBE. >> No, thank you very much, and you guys have a great day.
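Paula's "okay to contact" example is a classic consolidation problem: the same fact captured in different shapes by different lines of business. A minimal sketch of the merge, with illustrative column names rather than Webster Bank's actual schemas, taking the most conservative answer when sources disagree:

```python
# Minimal sketch: consolidate contact-permission flags captured differently
# by two lines of business. Column names are illustrative assumptions.
import pandas as pd

# Two business units, two formats for the same fact.
lob_a = pd.DataFrame({"cust_id": [1, 2], "ok_to_contact": ["Y", "N"]})
lob_b = pd.DataFrame({"customer": [2, 3], "email_opt_out": [True, False]})

# Normalize both to (cust_id, boolean contact flag).
a = lob_a.rename(columns={"ok_to_contact": "contact_ok"})
a["contact_ok"] = a["contact_ok"].eq("Y")
b = lob_b.rename(columns={"customer": "cust_id"})
b["contact_ok"] = ~b.pop("email_opt_out")  # opted out means do not contact

# Most conservative answer wins: if any source says no, it's a no.
merged = (
    pd.concat([a, b])
    .groupby("cust_id", as_index=False)["contact_ok"]
    .min()
)
print(merged)
```

Doing this once, against a single consolidated source, is what replaces the 100 hours per line of business that Paula describes.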
>> Next, we'll talk with Lester Waters, the CTO of Io-Tahoe. Lester takes us through the key considerations of moving to the cloud. >> The entire platform: automated data discovery. Data discovery is the first step to knowing your data. Auto-discover data across any application on any infrastructure, and identify all unknown data relationships across the entire siloed data landscape. Smart data catalog: know how everything is connected, understand everything in context, regain ownership and trust in your data, and maintain a single source of truth across cloud platforms, SaaS applications, reference data and legacy systems. Empower business users to quickly discover and understand the data that matters to them with a smart data catalog, continuously updated, ensuring business teams always have access to the most trusted data available. Automated data mapping and linking: automate the identification of unknown relationships within and across data silos throughout the organization, and build your business glossary automatically using in-house common business terms, vocabulary and definitions. Discovered relationships appear as connections or dependencies between data entities, such as customer, account, address and invoice, and these data entities have many discovered properties at a granular level. Data signals dashboards: get up-to-date feeds on the health of your data for faster, improved data management. See trends, view history, compare versions, and get accurate and timely visual insights from across the organization. Automated data flows: automatically capture every data flow to locate all the dependencies across systems, visualize how they work together collectively, and know who within your organization has access to data. Understand the source and destination for all your business data, with comprehensive data lineage constructed automatically during the data discovery phase, and continuously load results into the smart data catalog. ActiveDQ, automated data quality assessments: powered by ActiveDQ, ensure data is fit for consumption and meets the needs of enterprise data users, and keep information about the current data quality state readily available for faster, improved decision-making. Data policy governance: automate data governance end to end, over the entire data lifecycle, with automation, instant transparency and control. Automate data policy assessments with glossaries, metadata and policies for sensitive-data discovery that automatically tag, link and annotate with metadata, to provide enterprise-wide search for all lines of business. Self-service knowledge graph: digitize and search your enterprise knowledge, turn multiple siloed data sources into machine-understandable knowledge, and, from a single data canvas, search and explore data content across systems, including ERP, CRM and billing systems and social media, to fuel data pipelines. >> Yeah, focusing on enterprise data automation, we're going to talk about the journey to the cloud. Remember, the hashtag is #DataAutomated, and we're here with Lester Waters, who's the CTO of Io-Tahoe. Give us a little background, CTO: you've got deep expertise in a lot of different areas, but what do we need to know? >> Well, David, I started my career basically at Microsoft, where I started the Information Security Cryptography group, the very first one that the company had, and that led to a career in information security. And of course, with information security, data is the key element to be protected, so I've always had my hands in data, and that naturally progressed into a role at Io-Tahoe as their CTO. >> What's the prescription for that automation journey, and for simplifying that migration to the cloud? >> Well, I think the first thing is understanding what you've got: discovering and cataloging your data and your applications. If I don't know what I have, I can't move it, I can't improve it, I can't build upon it, and I have to understand the dependencies. So building that data catalog is the very first step: know what I've got. >> Okay, so we've done the audit, we know what we've got. What's next, where do we go next? >> So the next thing is remediating that data: where do I have duplicate data? Oftentimes in an organization, data will get duplicated: somebody will take a snapshot of the data and then end up building a new application which suddenly becomes dependent on that data. So it's not uncommon for an organization to have 20 master instances of a customer, and you can see where that will go; trying to keep all that stuff in sync becomes a nightmare all by itself. So you want to understand where all your redundant data is, so that when you go to the cloud, maybe you have an opportunity to consolidate that data.
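One simple way to spot the duplicated snapshots Lester describes is to fingerprint each table and compare fingerprints across stores. A minimal, illustrative sketch; real tooling also samples large tables and matches at the column level:

```python
# Minimal sketch: find candidate duplicate tables by hashing sorted rows.
# Table names and rows are illustrative assumptions.
import hashlib

def table_fingerprint(rows):
    """Order-insensitive hash of a table's rows."""
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):
        digest.update(row.encode())
    return digest.hexdigest()

tables = {
    "crm.customers":       [(1, "Ann"), (2, "Bob")],
    "marketing.cust_copy": [(2, "Bob"), (1, "Ann")],  # a snapshot someone took
    "billing.accounts":    [(10, "acme")],
}

seen = {}
for name, rows in tables.items():
    fp = table_fingerprint(rows)
    if fp in seen:
        print(f"{name} looks like a duplicate of {seen[fp]}")
    else:
        seen[fp] = name
```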
>> Then what? You figure out what to get rid of, or actually get rid of it? What's next? >> Yes, that would be the next step: figure out what you need and what you don't need. Oftentimes I've found that there are obsolete columns of data in your databases that you just don't need, or maybe they've been superseded; you've got tables that have been superseded by other tables in your database. So you've got to understand what's being used and what's not, and from that you can decide: I'm going to leave this stuff behind, or I'm going to archive this stuff because I might need it for data retention, or I'm just going to delete it because it's not needed at all. >> We're plowing through your steps here. What's next on the journey? >> The next one, in a nutshell: preserve your data format. Don't boil the ocean here, to use a cliché. You want to do a certain degree of lift and shift, because you've got application dependencies on that data and on the data format: the tables in which it sits, the columns and the way they're named. So to some degree you are going to be doing a lift and shift, but it's an intelligent lift and shift. >> The data lives in silos, so how do you deal with that problem? Is that part of the journey? >> That's a great point, because you're right, those data silos happen because, you know, this business unit is chartered with this task, another business unit has that task, and that's how you get those instantiations of the same data occurring in multiple places. So as part of your cloud migration, you really want to plan for where there's an opportunity to consolidate your data, because that means there will be less to manage, less data to secure, and a smaller footprint, which means reduced costs. >> But maybe you could address data quality; where does that fit in on the journey? >> That's a very important point. First of all, you don't want to bring your legacy issues with you; as per the point I made earlier, if you've got data quality issues, this is a good time to find, identify and remediate them. But that can be a laborious task, and it would take a lot of work to accomplish, so the opportunity to use tools and automate that process is really what will help you find those outliers. >> What's next? I think we're through, I think I've counted six. What's the lucky seven? >> Lucky seven: involve your business users. Really, when you think about it, your data is in silos, and part of this migration to cloud is an opportunity to break down the silos, these silos that naturally occur. You've got to break the cultural barriers that sometimes exist between business units. So, for example, I always advise that there's an opportunity here to consolidate your sensitive data, your PII, personally identifiable information: if three different business units have the same data, there's an opportunity to consolidate that into one source of truth. >> Well, great advice, Lester, thanks so much. I mean, it's clear that the CapEx investments in data centers are generally not a good investment for most companies. We really appreciate it; Lester Waters, CTO of Io-Tahoe. Let's watch this short video, and we'll come right back.
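Lester's "figure out what you need and what you don't" step can be partly automated: columns that are entirely NULL, or that carry a single constant value, are obvious candidates to leave behind or archive. A minimal sketch, using an in-memory SQLite table with an illustrative schema as a stand-in for a real database:

```python
# Minimal sketch: flag all-NULL or constant columns as obsolescence candidates.
# The table and data are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER, amount REAL, legacy_flag TEXT, unused TEXT)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?, ?)",
    [(1, 100.0, "Y", None), (2, 250.0, "Y", None), (3, 80.0, "Y", None)],
)

def obsolete_columns(conn, table):
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    flagged = []
    for col in cols:
        # COUNT(col) skips NULLs; COUNT(DISTINCT col) counts distinct non-NULLs.
        non_null, distinct = conn.execute(
            f"SELECT COUNT({col}), COUNT(DISTINCT {col}) FROM {table}"
        ).fetchone()
        if non_null == 0:
            flagged.append((col, "all NULL"))
        elif distinct == 1:
            flagged.append((col, "constant value"))
    return flagged

print(obsolete_columns(conn, "loans"))
# [('legacy_flag', 'constant value'), ('unused', 'all NULL')]
```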
>> Use cases. Data migration: accelerate digitization of the business by providing automated data migration workflows that save time in achieving project milestones, eradicate operational risk, and minimize labor-intensive manual processes that demand costly overhead. Data quality: drain the data swamp and re-establish trust in the data, to enable data science and data analytics. Data governance: ensure that business and technology understand critical data elements and have control over the enterprise data landscape. Data analytics enablement: data discovery to enable data scientists and data analytics teams to identify the right data sets through self-service, for business demands or analytical reporting from advanced to complex. Regulatory compliance: government-mandated data privacy requirements such as GDPR, CCPA, ePR and HIPAA. Data lake management: identify lake contents, clean up, and manage ongoing activity. Data mapping and knowledge graph: create BKG (business knowledge graph) models on business enterprise data, with automated mapping to a specific ontology enabling semantic search across all sources in the data estate. DataOps: scale as a foundation to automate data management processes. >> Are you interested in test-driving the Io-Tahoe platform? Kickstart the benefits of data automation for your business through the Io-Tahoe Labs program: a flexible, scalable sandbox environment on the cloud of your choice, with setup, service and support provided by Io-Tahoe. Click on the link and connect with a data engineer to learn more and see Io-Tahoe in action. >> Everybody, we're back. We're talking about enterprise data automation; the hashtag is #DataAutomated, and we're going to really dig into data migrations. Data migrations are risky, they're time-consuming, and they're expensive. Yusef Khan is here, he's the head of partnerships and alliances at Io-Tahoe, coming again from London. Hey, good to see you, Yusef, thanks very much. >> Thank you. >> So let's set up the problem a little bit, and then I want to get into some of the data. You said that migrations are risky, time-consuming and expensive, and they're oftentimes a blocker for organizations to really get value out of data. Why is that? >> I think, I mean, all migrations have to start with knowing the facts about your data, and you can try to do this manually. But when you have an organization that may have been going for decades or longer, it will probably have a pretty large legacy data estate, so it will have everything from on-premise mainframes, it may have stuff that's already in the cloud, and it will probably have hundreds, if not thousands, of applications and potentially hundreds of different data stores.
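"Knowing the facts about your data" across hundreds of stores starts with a uniform discovery loop. A minimal sketch of that first pass, listing tables, row counts and columns per store; the connection URLs are hypothetical, and SQLAlchemy is used here simply because the same loop then works across Oracle, Postgres and other engines:

```python
# Minimal sketch: first-pass discovery over multiple data stores.
# Connection URLs are hypothetical placeholders.
from sqlalchemy import create_engine, inspect, text

DATA_STORES = {
    "legacy_oracle": "oracle+cx_oracle://user:pass@host/db",  # hypothetical
    "cloud_pg": "postgresql://user:pass@host/db",             # hypothetical
}

def discover(url):
    """Return (table, row_count, column_names) for every table in one store."""
    engine = create_engine(url)
    insp = inspect(engine)
    inventory = []
    with engine.connect() as conn:
        for table in insp.get_table_names():
            count = conn.execute(text(f"SELECT COUNT(*) FROM {table}")).scalar()
            cols = [c["name"] for c in insp.get_columns(table)]
            inventory.append((table, count, cols))
    return inventory

# Hypothetical usage, once real URLs are configured:
# for name, url in DATA_STORES.items():
#     print(name, discover(url))
```

Feeding this inventory into a shared catalog is what turns "we think we have hundreds of data stores" into a scoped, costed migration plan.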
>> So I want to dig into this migration, and let's pull up a graphic that talks about what a typical migration project looks like. So what you see here is very detailed, I know it's a bit of an eye test, but let me call your attention to some of the key aspects of this, and then, Yusef, I want you to chime in. So at the top here you see that area graph: that's operational risk for a typical migration project, and you can see the timeline and the milestones. That blue bar is the time to test, so you can see the second step, data analysis, is 24 weeks, so very time-consuming. We won't dig into the fine print in the middle, though there's some real good detail there, but go down to the bottom: that's labor intensity, and you can see "high" is that sort of brown, and a number of steps — data analysis, data staging, data prep, the trial, the implementation, post-implementation fixes, and the transition to BAU, which I think is business as usual — are all high labor intensity. >> The key thing is, when you don't understand your data up front, it's very difficult to scope and set up a project, because you go to business stakeholders and decision-makers and you say, okay, we want to migrate these data stores, we want to put them in the cloud, most often, but actually you probably don't know how much data is there, you don't necessarily know how many applications it relates to, you don't know the relationships between the data, and you don't know the flow of the data, the direction in which it's going between different data stores and tables. So you start from a position of pretty high risk, and to cover that risk you stack your project team with lots and lots of people to do the next phase, which is analysis. So you set up a project which has a pretty high cost: big projects, more people, heavier governance, obviously. And then you're in the phase of trying to do lots and lots of manual analysis — manual processes, as we all know — and the work of trying to relate data that's in different data stores, relating individual tables and columns, is very time-consuming and expensive. If you're hiring in resource from consultants or systems integrators externally, you might need to buy or use third-party tools; and as I said earlier, the people who understood some of those systems may have left a while ago. So you're in an even higher-risk, quite high-cost situation from the off, and the same pressures develop through the project. What we're doing at Io-Tahoe is automating a lot of this process from the very beginning, because we can do the initial data discovery run, for example, automatically: very quickly you have automated validation, and a data map of the data flows has been generated automatically, with much less time and effort and much less cost. >> Yeah. And now let's bring up the same chart, but with automation injected in here, so you now see the same schedule accelerated by Io-Tahoe. Okay, great, and we're going to talk about this, but look what happens to the operational risk: a dramatic reduction in that graph. And then look at the bars, those blue bars: data analysis went from 24 weeks down to four weeks. And then look at the labor intensity: data analysis, data staging, data prep, trialing, post-implementation fixes and the transition to BAU all went from high to low labor intensity. So we've now attacked that. Explain how that magic happened. >> Take the example of a data catalog. Every large enterprise wants to have some kind of repository where they put all their understanding about their data, an enterprise data estate catalog, if you like. Imagine trying to do that manually: you need a DBA and a business analyst for each data store, they need to do an extract of the data and audit the tables individually, and they need to cross-reference that with other data stores and schemas and tables, probably with the mother of all Excel spreadsheets. It would be a very, very difficult exercise to do.
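The cross-referencing Yusef describes, relating columns across stores without "the mother of all Excel spreadsheets", can be approximated by measuring value overlap between candidate column pairs. A minimal sketch with illustrative sample data; production discovery engines combine this with name, type and pattern matching:

```python
# Minimal sketch: propose candidate relationships between columns in
# different stores by value overlap. Names and values are illustrative.
def overlap(a, b):
    """Fraction of the smaller column's distinct values found in the other."""
    a, b = set(a), set(b)
    return len(a & b) / max(1, min(len(a), len(b)))

columns = {
    ("crm", "customers", "cust_id"): [1, 2, 3, 4],
    ("billing", "invoices", "customer_ref"): [2, 3, 4, 9],
    ("hr", "staff", "badge_no"): [700, 701],
}

names = list(columns)
for i, left in enumerate(names):
    for right in names[i + 1:]:
        score = overlap(columns[left], columns[right])
        if score >= 0.5:  # threshold is an assumption; tune per estate
            print(f"candidate link {left} <-> {right} (overlap {score:.0%})")
```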
I mean, in fact, one of our reflections as we automate lots of these things is that it not only accelerates the work, in some cases it makes it possible at all for enterprise customers with legacy systems. Take banks, for example: they quite often end up staying on mainframe systems that they've had in place for decades, not migrating away from them, because they're not able to actually do the work of understanding the data, de-duplicating the data, deleting data that isn't relevant, and then confidently going forward to migrate. So they stay where they are, with all the attendant problems of systems that are out of support. You know, the biggest frustration for a lot of them, and the thing they spend far too much time doing, is trying to work out what the right data is and cleaning data, which really you don't want highly paid data scientists doing with their time. But if you sort out your data in the first place, get rid of duplication, and migrate to a cloud store where things are really accessible and it's easy to build connections and to use native machine learning tools, you're well on the way up the maturity curve, and you can start to use some of the more advanced applications. >> Massive opportunities, not only for technology companies but for those organizations that can apply technology for business advantage. Yusef Khan, thanks so much for coming on theCUBE, much appreciated.