
Ajay Vohora, Io-Tahoe | SmartData Marketplaces


 

>> Narrator: From around the globe, it's theCUBE, with digital coverage of smart data marketplaces. Brought to you by Io-Tahoe.

>> Digital transformation has really gone from a buzzword to a mandate, but digital business is a data business. And for the last several months we've been working with Io-Tahoe on an ongoing content series focused on smart data and automation to drive better insights and outcomes, essentially putting data to work. And today we're going to do a deeper dive on automating data discovery. And one of the thought leaders in this space is Ajay Vohora, who's the CEO of Io-Tahoe. Once again joining me, Ajay, good to see you. Thanks for coming on.

>> Great to be here, David, thank you.

>> So let's start by talking about some of the business realities. What are the economics that are driving automated data discovery? Why is that so important?

>> Yeah, on this one, David, it's a number of competing factors. We've got the reality of data, which may be sensitive, so there's control. Then there are three other elements: wanting to drive value from that data through innovation, being able to exchange data, because you can't really drive a lot of value without exchanging data, and managing those cost overheads. And data discovery is at the root of managing all of that in an automated way, to classify that data and set some policies to put that automation in place.

>> Yeah, look, we have a picture of this. If we could bring it up, guys, because I want to, Ajay, help the audience understand kind of where data discovery fits in here. This is, as we talked about, a complicated situation for a lot of customers. They've got a variety of different tools, and you've really laid it out nicely here in this diagram. So take us through sort of where that piece fits.

>> Yeah, I mean, we're at the right hand side of this exchange, you know. We're really now in a data driven economy where everything's connected through APIs that we consume online through mobile apps. And what's not apparent is the chain of activities and tasks that have to go into serving that data to an API. At the outset there may be many legacy systems, technologies, platforms, on-premise, in cloud, hybrid, you name it, and across those silos, getting to a unified view is the heavy lifting. I think we've seen some great impacts that BI tools such as Power BI, Tableau, Looker, Qlik and so on, which are in our ecosystem, have had on visualizing data, and, you know, CEOs, managers, people that are working in companies day-to-day get a lot of value from asking, "What's the real time activity? What was the trend over this month versus last month?" The tools to enable that, you know, we hear a lot of good things about what we're doing with Snowflake, MongoDB, and the public cloud platforms, GCP and Azure, about enabling building those pipelines to feed into those analytics. But what often gets hidden is how do you source that data that could be locked into a mainframe, a data warehouse, IoT data, and pull all of that together. And the reality of that is it's a lot of heavy lifting. It's hands-on work that can be time consuming. And the issue there is that data may have value. It might have potential to have an impact on the top line for a business, on outcomes for consumers, but you're never really sure unless you've done the investigation, discovered it, unified it, and been able to serve it through to other technologies.
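To make the automated data discovery idea above a little more concrete, here is a minimal, illustrative sketch of rule-based column classification across two disparate extracts. It is not Io-Tahoe's method; the source names, columns and rules are hypothetical, and real discovery tooling layers profiling, ML-based matching and lineage on top of this kind of pass.

```python
# Illustrative only: a rule-based pass over extracts from two hypothetical
# source systems, tagging columns that look like sensitive or joinable data.
import re
import pandas as pd

# Hypothetical extracts standing in for a mainframe table and an IoT feed.
sources = {
    "mainframe_customers": pd.DataFrame({
        "CUST_EMAIL": ["a@example.com", "b@example.com"],
        "CUST_PHONE": ["+1-202-555-0101", "+1-202-555-0102"],
        "BALANCE": [120.50, 88.00],
    }),
    "iot_device_events": pd.DataFrame({
        "device_id": ["d-001", "d-002"],
        "owner_email": ["a@example.com", "c@example.com"],
        "temperature_c": [21.4, 22.9],
    }),
}

# Simple classification rules; real classifiers would be much richer.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\-\s]{7,}$"),
}

def classify_column(series: pd.Series) -> str:
    """Return a tag when most values in a column match a rule."""
    values = series.astype(str)
    for tag, pattern in RULES.items():
        share = values.apply(lambda v: bool(pattern.match(v))).mean()
        if share >= 0.8:
            return tag
    return "unclassified"

# Build catalog entries: source, column, and a tag a policy can act on.
catalog = [
    {"source": name, "column": col, "tag": classify_column(df[col])}
    for name, df in sources.items()
    for col in df.columns
]
print(pd.DataFrame(catalog))
```

The output is essentially the seed of a data catalog entry: a source, a column, and a tag that a governance policy can then be applied to.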
>> Guys, if you would bring that picture back up again, because Ajay, you made a point and I want to land on that for a second. There's a lot of manual curating. An example would be the data catalog. You know, data scientists complain all the time that they're manually wrangling data, and so you're trying to inject automation into the cycle. And then the other piece that I want you to address is the importance of APIs. You really can't do this without an architecture that allows you to connect things together, that sort of enables some of the automation.

>> Yep, I mean, I'll take that in two parts, David. The APIs: virtual machines connected by APIs, business rules and business logic driven by APIs, applications, so everything across the stack, from infrastructure down to the network and hardware, is all connected through APIs. And the work of serving data through to an API, building those pipelines, is often miscalculated, just how much manual effort that takes. And that manual effort, we've got a nice list here of what we automate down at the bottom, those tasks of indexing, labeling, mapping across different legacy systems. All of that takes away from the job of a data scientist or data engineer looking to produce value, monetize data, and help that business deliver value to consumers.

>> Yeah, it's that top layer that the business sees; of course, there's a lot of work that has to go into achieving that. I want to talk about some of the key tech trends that you're seeing. And one of the things that we talk about a lot is metadata. The importance of metadata, you know, can't be overstated. What are some of the big trends that you're seeing, metadata and others?

>> Yeah, I'll summarize it as five. There's a trend now to look at metadata more holistically across the enterprise. And that really makes sense when you're trying to look across different data silos and apply a policy to manage that data. So that's the control piece, that's that lever. The other side, sometimes competing with that control around sensitive data and around managing the cost of data, is innovation. Innovation is being able to speculate and experiment and try things out where you don't really know what the outcome is. If you're a data scientist or engineer, you've got a hypothesis, and therefore you've got that tension between control over data and innovation and driving value from it. So enterprise-wide metadata management is really helping to unlock where that latent value might be across those sets of data. The other piece is adaptive data governance. Those controls that stick, from the data policemen, the data stewards, where they're trying to protect the organization, protect the brand, protect consumers' data, are necessary, but in different use cases you might want to nuance and apply a different policy to govern that data, relevant to the context, where you might have data that is less sensitive and can be used for innovation. Adapting the style of governance to fit the context is another trend that we're seeing coming up here. A few others: one area where we're working quite extensively is automating data discovery. We're now breaking that down into what we can direct, where a business outcome is a known upfront objective, and directing that data discovery towards it. That means applying our algorithms and our tools towards solving a known problem. The other one is autonomous data discovery.
And that means, you know, trying to allow background processes to understand what changes are happening with data over time, flagging those anomalies. And the reason that's important is when you look over a length of time you see different spikes, different trends in activity, and that's really giving a data ops team the ability to manage and calibrate how they're applying policies and controls to the data. And the last two, David, that we're seeing: there's this huge drive towards self-service, so re-imagining how to put policy and data governance into the hands of a data consumer inside a business, or indeed the consumer themselves, to self-serve if they're a banking customer or a healthcare customer, and making sure that the policies and the controls and rules are all in place to adaptively serve those data marketplaces that we're involved in creating.

>> I want to ask you about the autonomous data discovery and the adaptive data governance. Is the problem we're addressing there one of quality, in other words, machines are better than humans at doing this? Is it one of scale, that humans just don't scale that well? Is it both? Can you add some color to that?

>> Yeah, honestly, it's the same equation that existed 10 years ago, 20 years ago. It's being exacerbated, but it's that equation of how do I control all the things that I need to protect? How do I enable innovation where it is going to deliver business value? How do I exchange data between a customer, somebody in my supply chain, safely, and do all of that whilst managing the fourth leg, which is cost overheads? There's not an open checkbook here. I've got to figure out, if I'm the CIO or the CDO, how I do all of this within a fixed budget. So those aspects have always been there. Now, with more choices, infrastructure in the cloud, API-driven applications, on-premises, that is expanding the choices that a business has in how they put their data to work. It's also then creating a layer of management and data governance that really has to manage those four aspects: control, innovation, exchange of data, and the cost overhead.

>> That top layer of the first slide that we showed was all about the business value. So I wonder if we could drill into the business impact a little bit. What are your customers seeing specifically in terms of the impact of all this automation on their business?

>> Yeah, so we've had some great results. I think a few of the biggest have been helping customers move away from manually curating their data and their metadata. It used to be that for data initiatives or data governance initiatives there'd be teams of people manually feeding a data catalog. And it's great to have that inventory of classified data, to be able to understand a single version of the truth, but having 10, 15 people manually process that and keep it up to date when it's a moving feast, the reality of it is, what's true about data today changes: add another few sources in a few months' time, start collaborating with new partners, and suddenly the landscape has changed and the amount of work has gone up. What we're finding is that through automating that data discovery and feeding our data catalog, we're releasing a lot more time for our customers to spend on innovating and managing their data. A couple of others are around self-service data analytics, moving the choices of what data might have business value into the hands of business users and data consumers, to have faster cycle times around generating insights.
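As a rough illustration of the autonomous data discovery Ajay describes above, a background job that profiles a dataset over time and flags anomalous shifts might look something like the sketch below. The metrics, thresholds and history are made up for the example, and this is not a description of Io-Tahoe's product.

```python
# Illustrative only: "autonomous" discovery as a background job that profiles
# a table each day and flags unusual shifts in simple metrics (row count,
# null rate) using a basic z-score against recent history.
from statistics import mean, stdev
import pandas as pd

def profile(df: pd.DataFrame) -> dict:
    """Capture a minimal daily profile of a dataset."""
    return {"rows": len(df), "null_rate": float(df.isna().mean().mean())}

def flag_anomalies(history: list, latest: dict, z_threshold: float = 3.0) -> list:
    """Compare today's profile against history and report large deviations."""
    flags = []
    for metric, value in latest.items():
        past = [h[metric] for h in history]
        if len(past) < 2:
            continue
        mu, sigma = mean(past), stdev(past)
        if sigma > 0 and abs(value - mu) / sigma > z_threshold:
            flags.append(f"{metric}: {value:.3f} deviates from recent mean {mu:.3f}")
    return flags

# Hypothetical two weeks of daily profiles, plus today's snapshot with a
# sudden jump in missing values.
history = [{"rows": 1000 + i, "null_rate": 0.010 + (i % 3) * 0.001} for i in range(14)]
today = profile(pd.DataFrame({"reading": [None] * 300 + list(range(700))}))
print(flag_anomalies(history, today))
```

A data ops team would then use flags like these to recalibrate the policies and controls applied to that dataset, which is the calibration loop described above.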
And we're really helping them by automating the creation of those data sets that are needed for that. And the last piece, I'd have to say, where we're seeing impact more recently is in the exchange of data. There are a number of marketplaces out there who are now being compelled to become more digital, to rewire their business processes, and everything from an RPA initiative to automation involving digital transformation is having CIOs, chief data officers and enterprise architects rethink how they rewire the pipelines for their data to feed that digital transformation.

>> Yeah, to me, it comes down to monetization. Now, of course, that's for a for-profit industry. For non-profits, for sure, it's the cost cutting, or in the case of healthcare, which we'll talk about in a moment, it's patient outcomes. But the job of a Chief Data Officer has gone from data quality and governance and compliance to really figuring out how data can be monetized, not necessarily selling the data, but how it contributes to the monetization of the company, and then really understanding, specifically for that organization, how to apply that. And that is a big challenge. We sort of chatted about this 10 years ago, in the early days of Hadoop, and then only 1% of companies had enough engineers to figure it out. But now the tooling is available, the technology is there and the practices are there. And that really, to me, is the bottom line, Ajay: it's show me the money.

>> Absolutely. It definitely is focusing in on the single view of that customer, and where we're helping there is to pull together those disparate, siloed sources of data to understand what the needs of the patient are, or of the broker if it's insurance, or of the supply chain manager if it's manufacturing. And providing that 360 view of data is helping that individual see and unlock the value for the business. So data's providing the lens, provided you know which data it is that can assist in doing that.

>> And, you know, you mentioned RPA before. I had an RPA customer tell me she was a Six Sigma expert, and she told me, "We would never try to apply Six Sigma to a business process, but with RPA we can do so very cheaply." Well, what that means is lower costs. It means better employee satisfaction and, really importantly, better customer satisfaction and better customer outcomes. Let's talk about healthcare for a minute, because it's a really important industry. It's one that is ripe for disruption and has really been, up until recently, pretty slow to adopt a lot of the major technologies that have been made available. But what are you seeing in terms of this theme we've been using, putting data to work, in healthcare specifically?

>> Yeah, I mean, healthcare has had a lot thrown at it. There's been a lot of change in terms of legislation recently, particularly in the U.S. market, and in other economies healthcare is on a path to becoming more digital. And part of that is around transparency of price. To operate effectively as a healthcare marketplace, being able to have that price transparency around what an elective procedure is going to cost before taking that step forward is super important to making an informed decision. So if we look at the U.S., for example, we've seen that healthcare costs annually have risen to $4 trillion, but even with all of that cost, we have healthcare consumers who are sometimes reluctant to take up healthcare even if they have symptoms.
And a lot of that is driven by not knowing what they're opening themselves up to. And, you know, I think, David, if you or I were to book travel, a holiday maybe, or a trip, we'd want to know what we're in for, what we're paying for, upfront. But sometimes in healthcare that choice, the option, might be the plan, but the cost that comes with it isn't there. So recent legislation in the U.S. is certainly helpful in bringing forward that price transparency. The underlying issue there, though, is the disparate formats and types of data that are being used by payers, patients, employers and different healthcare departments to try and make that work. And where we're helping on that aspect, in particular related to price transparency, is to help make that data machine readable. Sometimes with data the beneficiary might be a person, but in a lot of cases now we're seeing that the ability to have different systems interact and exchange data in order to process the workflow, to generate online lists of pricing from a provider that have been negotiated with a payer, is really an enabling factor.

>> So guys, I wonder if you could bring up the next slide, which is kind of the nirvana. So, if you saw the previous slide, the middle there was all different shapes and presumably disparate data. This is the outcome that you want to get to, where everything fits together nicely and you've got this open exchange. It's not opaque as it is today. It's not bubble gum, band-aids and duct tape. Describe this sort of outcome that you're trying to achieve and maybe a little bit about what it's going to take to get there.

>> Ajay: Yeah, that's the culmination of a number of things. It's making sure that the data is machine readable and making it available to APIs. Those could be RPA tools. We're working with technology companies that employ RPA for healthcare, specifically to manage that patient and payer data and bring it together. In our data discovery, what we're able to do is classify that data and have it made available to a downstream tool, technology or person to apply that workflow to the data. So this looks like nirvana, it looks like utopia, but it's, you know, the end objective of a journey that we can see in different economies that are at different stages of maturity in turning healthcare into a digital service, even so that you can consume it from where you live, from home, with telemedicine and telecare.

>> Yeah, and this is not just for healthcare; you know, you want to achieve that self-service data marketplace in virtually any industry. You're working with TCS, Tata Consultancy Services, to achieve this. You know, a company like Io-Tahoe has to have partnerships with organizations that have deep industry expertise. Talk about your relationship with TCS and what you guys are doing specifically in this regard.

>> Yeah, we've been working with TCS now for a long while, and we'll be announcing some of those initiatives here, where we're now working together to reach their customers. They've got a brilliant framework, Business 4.0, where they're re-imagining with their clients how their business can operate with AI, with automation, and become more agile and digital. Our technology, and the reams of patents that we have in our portfolio, being able to apply that at scale, on a global scale, across industries such as banking, insurance and healthcare, is really allowing us to see a bigger impact on consumer outcomes, patient outcomes.
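Earlier in this exchange Ajay talks about making price transparency data machine readable so that other systems, including RPA tools, can consume it. A minimal sketch of that idea follows; the record layout, procedure entries and field names are hypothetical, not any particular regulatory schema or Io-Tahoe format.

```python
# Illustrative only: turning a spreadsheet-style extract of negotiated prices
# into machine-readable JSON that another system (or an RPA bot) could consume.
import csv
import io
import json

# Hypothetical extract as it might arrive from a provider's billing system.
raw = """procedure_code,procedure_name,payer,negotiated_price_usd
47562,Laparoscopic cholecystectomy,ExamplePayer,6200.00
29881,Knee arthroscopy,ExamplePayer,4100.00
"""

records = []
for row in csv.DictReader(io.StringIO(raw)):
    records.append({
        "code": row["procedure_code"],
        "description": row["procedure_name"],
        "payer": row["payer"],
        "negotiated_price": {"amount": float(row["negotiated_price_usd"]), "currency": "USD"},
    })

# Publish a structured rates document for downstream systems to consume.
print(json.dumps({"provider": "Example Hospital", "rates": records}, indent=2))
```

Once pricing sits in a structured, machine-readable form like this, a downstream system can validate, compare and publish it without manual re-keying, which is the enabling factor described above.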
And the feedback from TCS is that we're really helping in those initiatives to remove that friction. They talk a lot about data friction. I think that's a polite term for the image that we just saw, with the disparate technologies and the legacy that has built up. So if we want to create a transformation, having that partnership with TCS across industries is giving us that reach and that impact on many different people's day-to-day jobs and lives.

>> Let's talk a little bit about the Cloud. It's a topic that we've hit on quite a bit here in this content series. But, you know, the Cloud companies, the big hyper-scalers, they've put everything into the Cloud, right? But customers are more circumspect than that. At the same time, machine intelligence, ML, AI, the Cloud is a place to do a lot of that. That's where a lot of the innovation occurs. So what are your thoughts on getting to the Cloud, putting data to work, if you will, with machine learning, stuff that you're doing with AWS? What's your fit there?

>> Yeah, David, we work with all of the Cloud platforms, Microsoft Azure, GCP, IBM, but we're expanding our partnership now with AWS. And we're really opening up the ability to work with their greenfield accounts, where a lot of that data and technology is in the customer's own data centers. And that's across banking, healthcare, manufacturing and insurance. And for good reason: a lot of companies have taken the time to see what works well for them with the technologies that the Cloud providers are offering, and in a lot of cases, testing services or analytics, using the Cloud to move workloads to the Cloud to drive data analytics is a real game changer. So there's good reason to maintain a lot of systems on-premise, if that makes sense from a cost and liability point of view, and the number of clients that we work with that have, and will keep, their mainframe systems written in COBOL is no surprise to us. But equally, they want to tap into technologies that AWS has, such as SageMaker. The issue is, as a Chief Data Officer, I don't have the budget to move everything to the Cloud in one go. I might want to show some results first, upfront, to my business users and work closely with my Chief Marketing Officer to look at what's happening in terms of customer trends and customer behavior. What are the customer outcomes, patient outcomes and partner outcomes that you can achieve through analytics and data science? So we're working with AWS and with clients to manage that hybrid topology, with some of that data being in the Cloud, being put to work with AWS SageMaker, and Io-Tahoe being used to identify where the data is that needs to be amalgamated and curated to provide the dataset for machine learning and advanced analytics to have an impact for the business.

>> So what are the critical attributes of what you're looking at to help customers decide what to move and what to keep, if you will?

>> Well, one of the quickest outcomes that we help customers achieve is to build that business glossary, you know, the items of data that mean something to them across those different silos, and pull all of that together into a unified view. Once they've got that, a data engineer working with a business manager can think through, how do we want to create this application? What is the churn model, the loyalty or the propensity model that we want to put in place here?
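To make the churn and propensity modeling Ajay mentions concrete, here is a small, self-contained sketch of a propensity model trained on a hypothetical curated dataset. It uses synthetic data and scikit-learn locally, purely for illustration; in the workflow described here, the same kind of model would typically be built in SageMaker, Databricks or a Jupyter notebook against data that discovery and classification have already assembled and cleared for use.

```python
# Illustrative only: a tiny churn-propensity model over a hypothetical curated
# dataset (the kind of unified view automated discovery is meant to produce).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Made-up features: tenure in months, support tickets, monthly spend.
X = np.column_stack([
    rng.integers(1, 60, n),
    rng.poisson(2, n),
    rng.normal(70, 20, n),
])

# Synthetic label: short tenure and many tickets make churn more likely.
churn_prob = 1 / (1 + np.exp(0.08 * X[:, 0] - 0.9 * X[:, 1]))
y = rng.random(n) < churn_prob

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("holdout accuracy:", round(model.score(X_test, y_test), 3))
print("churn propensity for one customer:", model.predict_proba([[3, 5, 95.0]])[0, 1])
```

The propensity score is what ultimately gets served back out through an API, which is the end of the pipeline Ajay goes on to describe.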
How do we use predictive analytics to understand the needs of a patient? That sort of innovation is what we're unlocking, and applying tools such as SageMaker on AWS to then do the computation and build those models to deliver that outcome runs across that value chain. And it goes back to the first picture that we put up, David. You know, the outcome is that API; on the back of it you've got a machine learning model that's been developed in a tool such as Databricks or a Jupyter notebook. That data has to be sourced from somewhere. Somebody has to say, "Yep, you've got permission to do what you're trying to do without falling foul of any compliance around data." And it all goes back to discovering that data, classifying it and indexing it in an automated way, to cut those timelines down to hours and days.

>> Yeah, it's the innovation part of your data portfolio, if you will, that you're going to put into the Cloud and apply tools like SageMaker and others to, or Azure tools, I mean, whatever your favorite tool is, you don't care; the customer's going to choose that. And you know, the Cloud vendors, maybe they want you to use their tool, but they're making their marketplaces available to everybody. But it's that innovation piece, the one where you want to apply that self-service data marketplace and really drive, as I said before, monetization. All right, give us your final thoughts. Ajay, bring us home.

>> So final thoughts on this, David: at the moment we're seeing a lot of value in helping customers discover their data using automation and automatically curating a data catalog. That unified view is then being put to work through our API, with an open architecture to plug in whatever tool or technology our clients have decided to use. And that open architecture is really feeding into the reality of what CIOs and Chief Data Officers are managing, which is a hybrid on-premise and Cloud approach to using best of breed. Business users want to use a particular technology to get their business outcome, and having the flexibility to do that no matter where your data is sitting, on-premise or in the Cloud, is where self-service comes in. That self-service view of what data I can plug together, exchange and monetize is where we're starting to see some real traction with customers, now accelerating, becoming more digital to serve their own customers.

>> Yeah, we really have seen a cultural mind shift, going from sort of complacency, and obviously COVID has accelerated this, but the combination of that cultural shift, the Cloud and machine intelligence tools gives me a lot of hope that the promises of big data will ultimately be lived up to in this next 10 years. So Ajay Vohora, thanks so much for coming back on theCUBE. You're a great guest, and appreciate your insights.

>> Appreciate it, David. See you next time.

>> All right, keep it right there, everybody. We're right back after this short break. (techno music)

Published Date : Sep 17 2020

