Jeremy Rader, Intel | CUBE Conversation, May 2020


 

>> From the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube Conversation. >> All right, welcome back. Jeff Frick here, and we're excited for this next segment. We're joined by Jeremy Rader. He is the GM of Digital Transformation and Scale Solutions for Intel Corporation. Jeremy, great to see you. >> Hey, thanks for having me. >> I love the flowers in the backyard. I thought maybe you ran over to the Japanese Garden or the Rose Garden, two very beautiful places to visit in Portland. >> Yeah, you know, you only get them for a couple of weeks here, so we got the timing just right. >> Excellent. All right, so let's jump into it. This conversation really is all about making AI real. You guys are working with Dell, and not only Dell, right? There's the hardware and software, but also a lot of smaller AI solution providers. So what are some of the key attributes needed to make AI real for your customers out there? >> Yeah, so you know, it's a complex space, so it helps when you can bring the best of the Intel portfolio, which is expanding a lot. It's not just the CPU anymore: you're getting into memory technologies, network technologies, and, a little less known, how many resources we have focused on the software side of things, optimizing frameworks and these key ingredients and libraries that you can stitch into that portfolio to really get more performance and value out of your machine learning and deep learning work. And so what we've really done here with Dell is start to bring a bunch of that portfolio together with Dell's capabilities, and then bring in that ISV partner, that software vendor, where we can really stitch together and bring the most value out of a broad portfolio, ultimately reducing the complexity of what it takes to deploy an AI capability. So there's a lot going on there: you bring the three-legged stool of software vendor, hardware vendor, and Dell into the mix, and you get a really strong outcome. >> Right. So before we get to the solutions piece, let's dig a little bit into the Intel world. I don't know if a lot of people are aware that, obviously, you guys make CPUs and have been making great CPUs forever, but there's a whole lot more that you've added, kind of around the core CPU, if you will, in terms of actual libraries and ways to really optimize Xeon processors to operate in an AI world. I wonder if you can take us a little below the surface on how that works. What are some examples of things you can do to get more from your Intel Xeon processors for AI-specific applications and workloads? >> Yeah, well, you know, there's a ton of software optimization that goes into this. Having a great CPU is definitely step one, but ultimately you want to get down into the libraries, like TensorFlow. We have data analytics acceleration libraries. That really allows you to get under the covers a little bit and look at how we get the most out of the kinds of capabilities that are ultimately used in machine learning and deep learning, then bring that forward and enable it with our software vendors so that they can take advantage of those acceleration components and ultimately, you know, get to less training time, or it could be a cost factor, right?
Those are the kinds of capabilities we want to expose to software vendors through these kinds of partnerships. >> That's terrific. And I do think that's a big part of the story that a lot of people are probably not as aware of: there are a lot of these optimization opportunities that you guys have been leveraging for a while. So, shifting gears a little bit: AI and machine learning is all about the data. In doing a little research for this, I found you on stage talking about a company that had something like 315 petabytes of data, 140,000 sources of that data, and, probably not a great quote, six months of access time to actually get at it and work with it. And the company you were referencing was Intel. So you guys know a lot about data: managing data, everything from your manufacturing to obviously supporting a global organization for IT, with a lot of complexity and secrets and good stuff. So what have you guys leveraged as Intel in the way you work with data, in getting a good data pipeline, that's enabling you to put that into these other solutions you're providing to customers? >> Right. Well, you know, it's absolutely a journey, and it doesn't happen overnight. We've seen it at Intel, and we see it with many of our customers that are on the same journey we've been on. And so this idea of building that pipeline really starts with what kind of problems you're trying to solve. What are the big issues that are holding the company back? Where do you see that competitive advantage you're trying to get to? And then ultimately, how do you build the structure to enable the right kind of pipeline for that data? Because that's what machine learning and deep learning is: that data journey. So really, a lot of focus around how we can understand those business challenges, bring forward those kinds of capabilities along the way, through to where we structure our entire company around those assets. And then ultimately, some of the partnerships that we're going to be talking about, these companies that are out there to help us really squeeze the most out of that data as quickly as possible, because otherwise it goes stale real fast, sits on the shelf, and you're not getting that value out of it, right? So yeah, we've been on the journey. It's a long journey, but ultimately we can take a lot of those learnings and apply them to our silicon technology, the software optimizations that we're doing, and ultimately how we talk to our enterprise customers about how they can overcome some of the same challenges that we did. >> Well, let's talk about some of those challenges specifically, because I think part of what kind of knocked big data and Hadoop, if you will, off the rails a little bit was that there's a whole lot that goes into it. Besides just doing the analysis, there's a lot of data prep: data collection, data organization, a whole bunch of things that have to happen before you can actually start to do the sexy stuff of AI. So what are some of those challenges? How are you helping people get over these baby steps before they can really get into the deep end of the pool?
>> Yeah, well, you know, one is you have to have the resources. Do you even have the resources? If you can acquire those resources, can you keep them interested in the kind of work that you're doing? So that's a big challenge, and we'll talk about how that fits into some of the partnerships we've been establishing in the ecosystem. It's also that you get stuck in this POC do-loop, right? You finally get those resources, and they start to get access to that data we talked about. They play out some scenarios, they theorize a little bit, maybe they show you some really interesting value, but it never seems to make its way into full production mode. And I think that is a challenge facing so many enterprises that are stuck in that loop. So that's where we look at who's out there in the ecosystem that can help more readily move through that whole process of the evaluation, the proof of ROI, the POC, and ultimately move that capability into production mode as quickly as possible. That, to me, is one of the fundamental aspects: if you're stuck in the POC, nothing's happening from this. This is not helping your company. We want to move things more quickly. >> Right, right. And let's just talk about some of these companies that you guys are working with, that you've got some reference architectures with: DataRobot, Grid Dynamics, H2O, just down the road. A lot of companies we've worked with at theCUBE. And I think another part that's interesting, and again we can learn from the old days of big data, is generalized AI versus solution-specific AI. I think where there's a real opportunity is not AI for AI's sake, but AI applied to a specific solution, a specific problem, so that you have, you know, better chatbots, a better customer service experience, better something. So when you were working with these folks and trying to design solutions, what were some of the opportunities you saw to work with them to have an applied application or solution, versus just kind of AI for AI's sake? >> Yeah. I mean, that could be anything from fraud detection in financial services, or even taking a step back and looking more horizontally, like back to that data challenge: if you're stuck where you've built a fantastic data lake but haven't been able to pull anything back out of it, who are some of the companies out there that can help overcome some of those big data challenges and ultimately get you to where you don't have a data scientist spending 60% of their time on data acquisition and pre-processing? That's not where we want them, right? We want them building out that next theory, looking at the next business challenge, selecting the right models. But ultimately, they have to do that as quickly as possible so they can move that capability forward into the next phase. So really, it's about that connection of looking at those problems or challenges in the whole pipeline. And companies like DataRobot and H2O are all addressing specific challenges in that end-to-end, and that's why they've bubbled up as ones we want to continue to collaborate with, because they can help enterprises overcome those issues more readily. >> Great. Well, Jeremy, thanks for taking a few minutes and giving us the Intel side of the story.
It's a great company; it has been around forever. I worked there many, many moons ago, but that's a story for another time. But really appreciate it. >> I'll interview you. >> We'll go there. All right, so, super. Thanks a lot. So that's Jeremy, I'm Jeff Frick. Now it's time to go ahead and jump into the CrowdChat. It's crowdchat.net/makeaireal. We'll see you in the chat, and thanks for watching.
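
Rader's library point above is concrete enough to sketch. Below is a minimal, hedged illustration of the drop-in acceleration he describes, assuming Intel's scikit-learn extension (the scikit-learn-intelex package) is installed; the data is synthetic and purely illustrative:

```python
# Hedged sketch: Intel's oneDAL-backed extension patches scikit-learn
# in place, so an unchanged KMeans call runs on Intel-optimized kernels.
import numpy as np
from sklearnex import patch_sklearn

patch_sklearn()  # must run before importing the estimators below

from sklearn.cluster import KMeans

X = np.random.default_rng(0).random((100_000, 50), dtype=np.float32)
model = KMeans(n_clusters=8, n_init=3, random_state=0).fit(X)
print(f"inertia: {model.inertia_:.1f}")
```

The same pattern applies to the optimized TensorFlow builds he mentions: the model code stays unchanged, and only the underlying kernels differ.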

Published Date : May 20 2020

SUMMARY :

Jeff Frick talks with Jeremy Rader, GM of Digital Transformation and Scale Solutions at Intel, about making AI real for customers. Rader describes how Intel's portfolio now spans CPUs, memory, networking, and software optimizations such as accelerated TensorFlow and data analytics libraries; how Intel's own data journey informs customer data pipelines; why many enterprises get stuck in the POC loop; and how partnerships with Dell and ISVs like DataRobot, Grid Dynamics, and H2O turn general-purpose AI into applied solutions.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Jeremy | PERSON | 0.99+
Jeremy Raider | PERSON | 0.99+
Jeff Frick | PERSON | 0.99+
Portland | LOCATION | 0.99+
Jeremy Rader | PERSON | 0.99+
Brian | PERSON | 0.99+
Antigua | LOCATION | 0.99+
Dell | ORGANIZATION | 0.99+
60% | QUANTITY | 0.99+
Palo Alto | LOCATION | 0.99+
315 petabytes | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
350 | QUANTITY | 0.99+
six months | QUANTITY | 0.99+
140,000 sources | QUANTITY | 0.99+
Intel | ORGANIZATION | 0.99+
Intel Corporation | ORGANIZATION | 0.98+
intel | ORGANIZATION | 0.98+
Cube Studios | ORGANIZATION | 0.98+
one | QUANTITY | 0.97+
step one | QUANTITY | 0.96+
Cube | ORGANIZATION | 0.96+
Data robot | ORGANIZATION | 0.95+
Japanese | LOCATION | 0.93+
couple weeks | QUANTITY | 0.92+
Grid Dynamics | ORGANIZATION | 0.92+
Rose Garden | LOCATION | 0.92+
Gambira | ORGANIZATION | 0.81+
H 20 | COMMERCIAL_ITEM | 0.76+
three legged | QUANTITY | 0.76+
many moons | DATE | 0.72+
Japanese garden | LOCATION | 0.72+
20 | TITLE | 0.53+
crowdchat | ORGANIZATION | 0.52+
couple | QUANTITY | 0.5+
H | COMMERCIAL_ITEM | 0.34+

Matt Carroll, Immuta | CUBEConversation, November 2019


 

>> From the Silicon Angle Media office in Boston, Massachusetts, it's the Cube. Now, here's your host, Dave Vellante. >> Hi everybody, welcome to this Cube Conversation here in our studios outside of Boston. My name is Dave Vellante. I'm here with Matt Carroll, who's the CEO of Immuta. Matt, good to see ya. >> Good to be here, thanks for having me on. >> So we're going to talk about governance, how to automate governance, data privacy, but let me start with Immuta. What is Immuta? Why did you guys start this company? >> Yeah, Immuta is an automated data governance platform. We started this company back in 2014 because we saw a gap in the market to be able to control data. What's changed in the market is that every enterprise wants to leverage its data. Data's the new app. But governments want to regulate it, and consumers want to protect it. These were at odds with one another, so we saw a need for a platform that could meet the needs of everyone: to democratize access to data in the enterprise, but at the same time provide the necessary controls on the data to enforce any regulation, and ensure there is transparency as to who is using it and why. >> So let's unpack that a little bit and try to dig into the problem here. We all know about the data explosion, of course, and I often say data used to be a liability; now it's turned into an asset. People used to say get rid of the data; now everybody wants to mine it and take advantage of it. But that causes privacy concerns for individuals. We've seen this with Facebook and many others. Regulations now come into play, GDPR, different states applying different regulations, so you have all these competing forces. The business guys just want to go and get out to the market, but then there are the lawyers, the compliance officers, and others. So are you attacking that problem? Maybe you could describe that problem a little further and talk about how you guys approach it. >> Yeah, absolutely. As you described, there are over 150 privacy regulations being proposed across over 25 states, just in 2019 alone. GDPR has opened the flood gates, if you will, for people to start thinking about how we want to insert our values into data: how should people use it? And so the challenge now is, you're right, your most sensitive data in an enterprise is most likely going to give you the most insight into driving your business forward, creating new revenue channels, and optimizing your operational expenses. But the challenge is that consumers have awoken to: we're not exactly sure we're okay with that, right? We signed an EULA with you to just use our data for marketing, but now you're using it for other revenue channels? Why? And so where Immuta is trying to play in there is: how do we give the line of business the ability to access data instantaneously, but also give the CISO, the Chief Information Security Officer, and the governance teams the ability to take control back? So it's a delicate balance between speed and safety. And I think what's really happening in the market is, we used to think about security as building firewalls; we invested in physical security controls to keep external adversaries from stealing our data. But now it's not necessarily someone trying to steal it; it's just potentially being misused by accident in the enterprise. The CISO is having to step in and provide that level of control. And it's also the collision of the cloud and these privacy regulations.
Cause now we have data everywhere; it's not just inside our firewalls. And that's the big challenge. That's the opportunity at hand: democratization of data in the enterprise. The problem is that data's not all in the enterprise. Data's in the cloud, data's in SaaS, data's in the infrastructure. >> It's distributed by its very nature. All right, so there's a lot I want to follow up on. First, there's GDPR. GDPR went into effect in May of 2018, I think. It actually came out in 2017, but the penalties didn't take effect till '18. And I thought, okay, maybe this can be a framework for governments around the world and the states. It sounds like, yeah, sort of, but not really. Maybe there are elements of GDPR that people are adopting, but then it sounds like they're putting in their own twists, which is going to be a nightmare for companies. So are you not seeing GDPR becoming a global standard? It sounds like no. >> I don't think it's going to be necessarily a global standard, but I do think the spirit of the GDPR is. At the core of it is: why are you using my data? What was the purpose? Traditionally, when we think about using data, we think about, all right, who's the user, and what authorizations do they have, right? But now there's a third question. Sure, you're authorized to see this data, depending on your role or organization, right? But why are you using it? Are you using it for a certain business use? Are you using it for personal use? Why are you using this? That's the spirit of GDPR that everyone is adopting across the board. And then, of course, each state or each federal organization is thinking about its own unique lens on it, right? And so you're right, this is going to be incredibly complex. Think about the number of policies being enforced at query time. My favorite example: let's just say I'm in Tableau or Looker, right? I'm just some simple analyst, a young kid, 22, in my first job, right? And I'm running these queries. I don't know where the data is, right? I don't know what I'm combining. And what we found is that, on average in these large enterprises, any query at any moment in time might have over 500 thousand policies that need to be enforced in real time. >> Wow. >> And it's only getting worse. We have to automate it. No human can handle all those edge cases. We have to automate. >> So I want to get into how you guys actually do that. Before I do, there seems to be a lot of confusion in the marketplace. Take the words "data management," "data protection." All the backup guys use those terms, the database guys use them, the GRC folks use them, so there's a lot of confusion there. You have all these adjacent markets coming together. You've got the whole governance, risk, and compliance space; you've got cyber security; there are privacy concerns, which are kind of two sides of the same coin. How do you see these adjacencies coming together? It seems like you sit in the middle of all that. >> Yeah, welcome to why my marketing budget is getting bigger and bigger. The challenge we're facing now is, I think, who owns the problem, right? The Chief Data Officer is taking on a much larger role in these organizations; the CISO is taking a much larger role in reporting up to the board. You have the line of business, who now is almost self-sustaining; they don't have to depend on IT as much any longer because of the cloud and because of the new compute layers that make it easier. So who owns it?
At the end of the day, where we see it is, we think there's a next generation of cyber tools coming out, and we think the CISO has to own this. And the reason is that the CISO's job is to protect the enterprise from cyber risk, and at the core of cyber risk is data. They must own the data problem. The CDO must find the data, explain what that data is, and make sure it's quality, but it is the CISO that must protect the enterprise from these threats. And so I see us as part of this next wave of cyber tools that are coming out. There are other companies equally in our stratosphere, like BigID. We're seeing AWS with Macie doing sensitive data discovery; Google has their data loss prevention service. So the cloud players are starting to see, hey, we've got to identify sensitive data. There are other startups saying, hey, we've got to identify and catalog sensitive data. And for us, we're saying, hey, we need to be able to consume all that cataloging, understand what's sensitive, and automatically apply policies to ensure that any regulation in that environment is met. >> I want to ask you about the cloud too. So much to talk to you about here, Matt. I also wanted to get your perspective on variances within industries. So you mentioned Chief Data Officers. The ascendancy of the Chief Data Officer started in financial services, healthcare, and government, where we had highly regulated industries, and now it's seeped into more commercial ones. In terms of those regulated industries, take healthcare for example: there are specific nuances. Can you talk about what you're seeing in terms of industry variance? >> Yeah, it's a great point. Starting with healthcare: what does it mean to be HIPAA compliant anymore? There are different types of devices now where I can point one at your heartbeat from a distance away and have 99 percent accuracy in identifying you, right? It takes three data points in any data set to identify 87 percent of US citizens. If I have your age, sex, and location, I can identify you. So what does it mean anymore to be HIPAA compliant? The challenge is: how do we build guarantees of trust that we've de-identified these data sets? Because we have to use them, right? No one's going to go into a hospital and say, "You know what, I don't want you to save my life, cause I want my data protected," right? No one's ever going to say that. So the challenge we face now across these regulated industries is that the most sensitive data sets are critical for those businesses to operate. So there has to be a compromise. What we're trying to do in these organizations is help them leverage their data and build levels of proportionality for accessing it, right? The key isn't to stop people from using data. The key is to build the controls necessary to leverage a small bit of the data. Let's just say we've made it indistinguishable: you can only ask aggregate and statistical questions. Well, you know what, we actually found some really interesting things there, but we need it to be a little more useful. It's this trade-off between privacy and utility. It's a pendulum that swings back and forth. As someone proves "I need more of this," you can swing it, or just mask it. I need more of it? All right, we'll just redact certain things. Nope, this is really important, it's going to save someone's life? Okay, completely unmasked, you have the raw data. But it's that control that's necessary in these environments; that's what's missing.
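
Carroll's privacy-utility pendulum can be made concrete. Here is a hedged, minimal sketch, with invented field names, levels, and functions rather than Immuta's actual policy engine, of dialing a single column between disclosure levels:

```python
# Hypothetical illustration of the privacy-utility "pendulum":
# one column, four disclosure levels, most restrictive by default.
import hashlib

def apply_policy(value: str, level: str) -> str:
    """Render one field at a given disclosure level."""
    if level == "raw":          # full utility, no privacy
        return value
    if level == "redacted":     # keep only a coarse prefix
        return value[:3] + "***"
    if level == "generalized":  # bucket into a category
        return "902xx-area" if value.startswith("902") else "other-area"
    # default: indistinguishable; a hash is useful only for joins and counts
    return hashlib.sha256(value.encode()).hexdigest()[:12]

for level in ("raw", "redacted", "generalized", "masked"):
    print(level, "->", apply_policy("90210", level))
```

Swinging the pendulum is then just a policy change, not a new copy of the data, which is the point he makes next.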
You know, we came out of the US Intelligence community. We understood this better than anyone, because it's highly regulated, very sensitive data, but we knew we needed the ability to rapidly change control. Is this just a hunch, or is this a 9/11 event? You need the ability to switch like that. That's the difference. And so healthcare is going through a change where we have all these new algorithms. Like Facebook the other day said, hey, we have machine learning algorithms that can look at MRI scans, and we're going to be better than anyone in the world at identifying these. Do you feel good about giving your data to Facebook? I don't know, but maybe we can provide guaranteed anonymization to them, to prove to the world they're going to do right. That's where we have to get to. >> Well, this is huge, especially for the consumer, cause you just gave several examples. Facebook's going to know a lot about me, a mobile device, a Fitbit, and yet if I want to get access to my own medical records, it's like Fort Knox: please give this to my insurance company, you've got to go through all these forms. So you've got those diverging objectives, and as a consumer, I want to be able to trust that when I say yes, you can use it, go, and that I can get access to it, and others can get access to it. I want to understand exactly what it is that you guys do, what you sell. Is it software, is it SaaS? And then let's get into how it works. So what is it? >> Yeah, so we're a software platform. We deploy into any infrastructure, but it is not multi-tenant, so we can deploy on any cloud or on premises for any customer, and we do that with customers across the world. But if you think about the core of what Immuta is, think of Immuta as a system of record for the CISO or the line of business, where I can connect to any data, on any infrastructure, on any compute layer, and we connect into over 61 different storage platforms. We then have built a UI where lawyers, and we actually have three lawyers as employees that act as product managers, can help any lawyer of any stature take what's on paper, these regulations, these rules and policies, and digitize it, essentially, into active code. So they can build any policy they want, on any data in the enterprise, and enforce it globally without having to write any code. And then, because we're this plane where you can connect any tool to this data and enforce any regulation, because we're the man in the middle, we can audit who is using what data and why: every action, and any change in policy. So if you think about it, it's connect any tool to any data, control it with any regulation, and prove compliance in a court of law. >> So you can set the policy at the data set level? >> Correct. >> And so how does one do that? Can you automate that on the creation of that data set? I mean, you've got, you know, dependencies. How does that all work? >> Yeah, a really interesting part of our secret sauce is that, one, we can do that at the column level, we can do it at the row level, and we can do it at the cell level. >> So very granular. >> Very, very granular. This is something, again, we learned from the US Intelligence community: we have to have very fine-grained access to every little bit of the data. The reason is that, especially in the age of data, people are going to combine many data sets together.
The challenge isn't enforcing the policy on a static data set; the challenge is enforcing the policy across three data sets, where you merge three pieces of data together that have conflicting policies. What do you do then? That's the beauty of our system: we deal with that policy inheritance, we manage that lineage of the policy, and we can tell you what the policy will be. >> In other words, you can manage to the highest common denominator, as an example. >> Or we can automate it to the lowest common denominator, where you can work in projects together, recognizing, hey, we're going to bring someone into the project who's not going to have that level of access, and everyone else will automatically change to the lowest common denominator. But then you share that work with another team, and it'll automatically be brought to the highest common denominator. And we've built all these workflows in. That was what was missing, and that's why I call it a system of record. It's really a symbiotic relationship between IT, the data owner; governance and the CISO, who are trying to protect the data; and the consumer, who just wants to access the data as fast as possible to make better, more informed decisions. >> So the other mega-trend you have is, obviously, the superpower of machine intelligence, or artificial intelligence, and then you've got edge devices and machine-to-machine communication, where there's just an explosion of IP addresses and data. So it sounds like you guys can attack that problem as well. >> Any of this data coming in on any system, the idea is that eventually it's going to land somewhere, right? And you've got to protect it. We call that rogue data, right? This is why I said earlier, when we talk about data, we have to stop thinking about it as being in some building. Data's everywhere. It's going to be on a cloud infrastructure, it's going to be on premises, and it's likely, in the future, going to be in many distributed data centers around the world, cause business is global. And so what's interesting to us is that no matter where the data's sitting, we can protect it, we can connect to it, and we allow people to access it. And that's the key thing: it's not about locking down your physical infrastructure, it's about logically separating it. And what differentiates us from other people is, one, we don't copy the data, right? That's always the barrier for these types of platforms. We leave the data where it is. The second is that we take all those regulations and we can actually, at query time, push them down to where that data is. So rather than bring the data to us, we push the policy to the data. And that's what differentiates us from everyone else: it allows us to guarantee that protection no matter where the data's living. >> So you're essentially virtualizing the data? >> Yeah, yeah. It's virtual views of data, but it's not all the data. What people have to realize is that in the day of apps, we cared about storage. We put all the data into a database, we built some services on top of it and a UI, and it was controlled that way, right? You had all the nice business logic to control it. In the age of data, data is the new app, right? We have all these automation tools: DataRobot and H2O and Domino, and Tableau's building all these automation workflows. >> The robotic process automation. >> Yeah, RPA: UiPath, Work Fusion, right?
They're making it easier and easier for any user to connect to any data and then automate the process around it. They don't need an app to build unique workflows; these new tools do that for them. The key is getting to the data. And the challenge with the supply chain of data is that time to data is the most critical aspect of it, cause time to insight is perishable. I always tell people a little story: I came from the government, and I worked in Baghdad. We had 42 minutes to know whether or not we could go after a bad guy in the environment. After that, the data was perishable, right? We didn't know where he was. It's the same thing in the real world. Imagine if Google told you, well, in 42 minutes it might be a good time to take 495. (laughter) It's not very useful; I need to know the information now. That's the key. What we see is that policy enforcement and regulations are the key barrier to entry. So our ability to rapidly, with no latency, connect anyone to that data and enforce those policies where the data lives, that's the critical nature. >> Okay, so you can apply the policies and you do it quickly, and so now you can help solve the problem. You mentioned cloud before, and on-prem. What is the strategy there with regard to the various clouds, and how do you approach multi-cloud? >> I think cloud, which used to be an infrastructure-as-a-service game, is now becoming a compute game. I think large, regulated enterprises, government, healthcare, financial services, insurance, are all moving to cloud now in a different way. >> What do you mean by that? Cause people think infrastructure as a service, they'll say, oh, that's compute, storage, and some networking. What do you mean by that? >> I think there's a whole new age of software that's being laid on top of the availability of compute and the availability of storage. That's companies like Databricks, companies like Snowflake, and what they're doing is dramatically changing how people interact with data: the availability zones, the different types of features, the ability to rip and replace legacy warehouses and mainframes. It's changing not just the access, but also the types of users that can even come on to leverage this data. And so these enterprises are now thinking through, "How do I move my entire infrastructure of data to them? And what are these new capabilities that I could get out of that?" That is just happening now. A lot of people have been thinking, "Oh, this has been happening over the past five years." No, the compute game is now the new war. I used to think about Big Data, right? Big Data got everyone to understand, "Ah, if we've got our data assets together, we can get value." Now they're thinking, "All right, let's move beyond that." The new cloud war is Snowflake and Databricks. What they're thinking about is, "How do I take all your metadata and allow anyone to connect any BI tool, any data science tool, and provide highly performant and highly dependable compute services to process petabytes of data?" It's pretty fantastic. >> And very cost-efficient, being able to scale compute independent of storage, from an architectural perspective. A lot of people claim they can do that, but it doesn't scale the same way. >> Yeah. Cause that's the thing you've got to remember: these financial systems especially, they depend on these transactions.
They cannot go down, and they're processing petabytes of data. That's what the new war is over: that data and the compute layer. >> And the opportunity for you is that data can come from anywhere. It's not sitting in a God box where you can enforce policies on that corpus; you don't know where it's coming from. >> We want to be invisible to that, right? You're using Snowflake, it's just automatically enforced. You're using Databricks, it's automatically enforced. All these policies are enforced in flight. No one should even truly care about us. We just want to allow you to use the data the way you're used to using it. >> And you do this... this secret sauce you talked about, is it math, artificial intelligence? >> It's math. I wish I could say it was super fancy unsupervised neural nets or whatnot, but it's 15 years of working in the most regulated, sticky environments. We learned some very simple, novel ways of pushing it down. Great engineering's always simple. What's really neat is that at query time we take user attributes from the identity management system and combine that with a purpose, and then we've built all these libraries to connect into all these disparate storage and compute systems, to push it in there. The nice thing about that is, prior to this, what people were doing was making copies. They'd go to the data engineering team and say, hey, I need to ETL this and get a copy, and it'll be anonymized. Think about that for a second. One, the load on your production systems of all those copies, all the time, right? The second is, for the CISO, the surface area. Now you've got all this data that, in a snapshot in time, is legal and ethical, but that might change tomorrow. And so now you've got an increased surface area of risk. So the pushing it down, and then the no-copy aspect, really changed the game for enterprises. >> And you've got provenance issues, like you say. You've got governance and compliance. >> And imagine if Congress said, hey, for any data source that you've processed over the past five years, I want to know if these three people were in any of those data sources, and if they were, who touched that data and why they touched it. >> Yeah, and storage is cheap, but there are unintended consequences; storage is cheap, but management isn't. >> And we just don't have a unified way to look at all of the logs, cross-listed. >> So we started to talk about cloud, and then I took you down a different path. But you offer your software on any cloud, is that right? >> Yeah, so right now we are in production on the AWS Marketplace. And that is a managed service, so you can deploy it there, it'll go into your VPC, and we can manage the updates for you. We have no insight into your infrastructure, but we can push those updates; it'll automatically update, so you're getting our quarterly releases, and we release every season. But yeah, we started with AWS, and then we will grow out. We see cloud as just too ubiquitous. Currently we also support BigQuery and Dataproc, and we support Azure, Azure Data Lake Storage Gen2, as well as Azure Databricks. But you can get us through the AWS Marketplace. We're also investing in re:Invent; we'll be out there in Vegas in a couple of weeks. It's a big event for us, just because obviously the government has a very big stake in AWS, but also commercial customers. It's been a massive endeavor to move. We've seen lots of infrastructure.
Most of our deals now are on cloud infrastructure. >> Great, so tell us about the company. You've raised, I think, through a Series B, about 28 million to date. Maybe you could give us the head count, and whatever you can share about momentum, maybe customer examples. >> Yeah, so we've raised 32 million to date... >> 32 million. >> ...from some great investors. The company's about 70 people now, so not too big, but not small anymore. Just this year, and I haven't closed my fiscal year yet, so I don't want to give too much, but we've doubled our ARR and we've tripled our logo count this year alone, and we've still got one more quarter here; we just started our fourth quarter. As for customer cases, the way I think about our business is: I love healthcare, I love government, I love finance. To give you an example, Cognoa is a really great one. What Cognoa is trying to solve is: can they predict where a child is on the autism spectrum? They're trying to use machine learning to narrow these children down so that they can see patterns in how a provider, a therapist, is helping these families give these kids the skills to operate in the real world. And so it's this symbiotic relationship, utilizing software, surveys, video, and whatnot, to connect kids that are in similar areas of the spectrum, to say, hey, this is a successful treatment, right? The problem with that is that we need lots of training data. And this is children, one; two, this is healthcare. So how do you guarantee HIPAA compliance? How do you get through FDA trials, through third-party blind testing, and still continue to validate and retrain your models, while protecting the identity of these children? So we provide a platform where we can anonymize all the data for them, and we can guarantee there are blind studies, where the company doesn't have access to certain subsets of the data. We can also connect providers to gain access to the HIPAA data as needed. We can automate the whole thing for them. And they're a startup too; they're 100 people. But imagine if you were a startup in this health-tech industry and you had to invest in the backend infrastructure to handle all of that. It's too expensive. What we're unlocking for them... I mean, yes, it's great that they're HIPAA compliant and all that, that's what we want, right? But the more important thing is that we're providing a value-add to innovate, using machine learning, in areas that regulations would have stymied, right? We're allowing startups in that ecosystem to really push us forward and help those families. >> Cause HIPAA compliance is table stakes, compulsory. But now you're talking about enabling new business models. >> Yeah, yeah, exactly. >> How did you get into all this? You're the CEO, you're business savvy, but it sounds like you're pretty technical as well. What's your background? >> Yeah, I worked in the intelligence community before this. Most of my focus was on how we take data and leverage it, from counter-terrorism missions to different non-kinetic operations. And so where I kind of grew up is in this age of, think about billions of dollars in Baghdad. What I learned is that through the computing infrastructure there, everything changed. 2006 Baghdad created this boom of technology. We had drones, right? We had all these devices on our trucks that were collecting information in real time and telling us things.
And then we started building computing infrastructure, and it birthed Hadoop. So I kind of grew up in this era of Big Data. We were collecting it all; we had no idea what to do with it; we had nowhere to process it. And so I saw, like, there's a problem here. If we can find the unique little nuggets of information in all of that, we can make some really smart decisions and save lives. So once I left that community, I dedicated myself to that. The birth of this company, again, was spun out of the US Intelligence community, and it was really a simple problem: they had a bunch of data scientists who couldn't access data fast enough, so they couldn't solve problems at the speed they needed to. It took four to six months to get to data; the mission said they needed it in less than 72 hours. Those were orthogonal to one another, and so it was very clear we had to solve that problem fast. So that weird world of very secure, really sensitive data, but also the success that we saw from using data, made it so obvious that we needed to democratize access to data, but do it securely, and be able to prove it. We work with more lawyers in the intelligence community than you could ever imagine, so the goal was always: how do we make a lawyer happy? If you figure that problem out, you have some success, and I think we've done it. >> Well, that's awesome, applying that example to the commercial business world. Scott McNealy is famous for saying there is no privacy on the internet, get over it. Well, guess what: people aren't going to get over it. It's individuals who are much more concerned with it after the whole Facebook and fake news debacle. And as well, organizations putting data in the cloud need to govern their data; they need that privacy. So Matt, thanks very much for sharing your perspectives on the market with us, and the best of luck with Immuta. >> Thanks so much, I appreciate it. Thanks for having me out. >> All right, you're welcome. And thank you, everybody, for watching this Cube Conversation. This is Dave Vellante; we'll see ya next time. (digital music)
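
Carroll's query-time pushdown is also sketchable: user attributes plus a declared purpose select the policy, and the policy is rewritten into the query itself, so no copy of the data is ever made. The names, rules, and SQL below are illustrative assumptions, not Immuta's actual API:

```python
# Hypothetical sketch of attribute- and purpose-based query rewriting:
# the policy travels to the data as SQL, so no copy is made.
from dataclasses import dataclass

@dataclass
class User:
    role: str
    purpose: str  # the "third question": why is the data being used?

def rewrite_query(user: User, base_query: str) -> str:
    """Push masking and row filters into the query at execution time."""
    if user.purpose != "fraud-review":
        # mask the direct identifier for any other declared purpose
        base_query = base_query.replace("ssn", "sha2(ssn, 256) AS ssn")
    if user.role != "analyst-eu":
        base_query += " WHERE region <> 'EU'"  # GDPR-style row filter
    return base_query

q = rewrite_query(User("analyst-us", "marketing"),
                  "SELECT ssn, region, amount FROM payments")
print(q)  # the masked, filtered query runs where the data lives
```

The design point is the one he stresses: enforcing the rewritten query on the source system avoids both the ETL copies and the growing surface area of stale, once-compliant snapshots.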

Published Date : Nov 7 2019

SUMMARY :

Dave Vellante talks with Matt Carroll, CEO of Immuta, about automated data governance. Carroll explains why query-time policy enforcement is needed as privacy regulations multiply beyond GDPR; how Immuta enforces column-, row-, and cell-level policies without copying data by pushing policy down to where the data lives; how the company grew out of the US Intelligence community; and how customers like Cognoa use the platform to innovate on sensitive data while staying compliant.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Matt Carroll | PERSON | 0.99+
Boston | LOCATION | 0.99+
Immuta | ORGANIZATION | 0.99+
Matt | PERSON | 0.99+
2014 | DATE | 0.99+
Google | ORGANIZATION | 0.99+
2017 | DATE | 0.99+
15 years | QUANTITY | 0.99+
32 million | QUANTITY | 0.99+
Facebook | ORGANIZATION | 0.99+
2019 | DATE | 0.99+
November 2019 | DATE | 0.99+
Vegas | LOCATION | 0.99+
99 percent | QUANTITY | 0.99+
Congress | ORGANIZATION | 0.99+
Baghdad | LOCATION | 0.99+
Snowflake | ORGANIZATION | 0.99+
42 minutes | QUANTITY | 0.99+
GDPR | TITLE | 0.99+
four | QUANTITY | 0.99+
third question | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
six months | QUANTITY | 0.99+
22 | QUANTITY | 0.99+
three people | QUANTITY | 0.99+
Boston Massachusetts | LOCATION | 0.99+
May of 2018 | DATE | 0.99+
Bigquery | ORGANIZATION | 0.99+
three pieces | QUANTITY | 0.99+
87 percent | QUANTITY | 0.99+
two sides | QUANTITY | 0.99+
Data Praq | ORGANIZATION | 0.99+
Scott McNeely | PERSON | 0.99+
Databricks | ORGANIZATION | 0.99+
less than 72 hours | QUANTITY | 0.99+
two | QUANTITY | 0.99+
100 people | QUANTITY | 0.99+
first | QUANTITY | 0.99+
tomorrow | DATE | 0.99+
first job | QUANTITY | 0.98+
second | QUANTITY | 0.98+
2006 | DATE | 0.98+
ReInvent | ORGANIZATION | 0.98+
each state | QUANTITY | 0.98+
US | LOCATION | 0.98+
this year | DATE | 0.98+
AWBS | ORGANIZATION | 0.98+
over 500 thousand policies | QUANTITY | 0.98+
over 25 states | QUANTITY | 0.98+
one | QUANTITY | 0.98+
over 150 privacy regulations | QUANTITY | 0.98+
'18 | DATE | 0.98+
495 | QUANTITY | 0.98+
fourth quarter | DATE | 0.98+
One | QUANTITY | 0.97+
about 70 people | QUANTITY | 0.96+
three data sets | QUANTITY | 0.96+
billions of dollars | QUANTITY | 0.95+
Series B | OTHER | 0.95+
one more quarter | QUANTITY | 0.95+
YULU | ORGANIZATION | 0.95+
CISO | ORGANIZATION | 0.95+
Looker | ORGANIZATION | 0.94+
over 61 different storage platforms | QUANTITY | 0.93+
Fort Knox | ORGANIZATION | 0.92+
about 28 million | QUANTITY | 0.92+
Immuta | TITLE | 0.92+
Tableau | ORGANIZATION | 0.88+

The Truth About AI and RPA | UiPath


 

>> From the SiliconANGLE Media Office in Boston, Massachusetts, it's theCUBE! (techno music) Now, here's your host, Stu Miniman. >> Hi, I'm Stu Miniman, and this is a Cube Conversation from our Boston-area studio. Welcome back to the program, Bobby Patrick, who is the Chief Marketing Officer of UiPath. Bobby, good to see you. >> Great to be here, Stu. >> All right. Bobby, we're going to tackle head-on an interesting discussion that's been going on in the industry. Of course, artificial intelligence is this wave that is impacting a lot: when you look at earnings reports, everyone's talking about it, and most companies are working out how they're doing it. It is not a new term. I go back, reading my history of technology, to Ada Lovelace, 150 years ago, when she was helping to define what a computer was. She made the Lovelace objection, I believe it's called, which was later quoted by Turing and the like: if we can describe it in code, it's probably not artificial intelligence, because we're not building new things. So there's hype around AI itself, but UiPath is one of the leaders in Robotic Process Automation, and how does that fit in with AI and Machine Learning and all of these other terms? It can get to be a bit of an acronym soup, and we can't all agree on what the terms mean. So let's start with some of the basics, Bobby. Please give us RPA and AI, and we'll get into it from there. >> Well, Robotic Process Automation, according to analysts like Forrester, is part of the overall AI kind of massive, massive market. AI itself has many different routes: deep learning, right, and machine learning, natural language processing, and so on. I think AI is a term that covers many different grounds. And in RPA, AI applies two ways. It applies within RPA, in that we have a technology called Computer Vision. It's how a robot looks at a screen like a human does, which is very, very difficult, actually. A Citrix terminal session, or a VDI session, looks different than an Excel sheet, different than SAP, and most processes cross all of those, so a robot has to be able to look at all of those screen elements and understand them, right. There's AI within Computer Vision, around understanding documents, looking at unstructured data, looking at handwriting; conversational understanding, looking at text in an email and determining context, helping with chatbots. But for a number of those components, it doesn't mean we have to build it all ourselves. What RPA does is bring it all together. We make it easy to automate, and to build and create the data flow of a process. Then you can apply AI to that, right. Two years ago, when I first joined UiPath, if you put RPA and AI in the same sentence, people laughed. A year ago we said, you know what, RPA is really the path to AI in business operations. Now, you know, we say that we're the most highly valued AI company in the world, and no one has ever disagreed. >> Yeah, so it's good to lay out some of that adoption, cause one thing to note is, if I looked at this product two or three years ago, it's not the product it is today. We know how fast software... >> Right. >> ...is making changes along the line. Second thing: automation itself is something we've been talking about my entire career.
RPA is taking that automation in a very strategic direction for many companies. The conversation we had last year at your conference was that RPA is the gateway drug, if you will... >> Right. >> ...to that environment, because automation has scared a lot of people. Am I just doing scripts? What do I control, what do I set? Maybe just give us that first grounding of where that automation path has come from and where it's going. >> So there are different kinds of automation, right, as you said. We've had automation for decades, primarily in IT. Automation was primarily around API-to-API integration, and that's really hard, right? It requires developers and engineers, and it requires them to keep it current. It's expensive and takes a longer time. Along comes the technology of RPA and UiPath, right, where you can automate fairly quickly. There are built-in recorders, and you can do it with drag and drop, like a flow chart. You can automate a process, and that automation is immediately beneficial, meaning the outcome is immediate, and the cost of doing it is small in comparison. Maybe it's the long tail of automation, in some ways. It's all of these things that we do around an SAP process. The reality is that if you have SAP, or you have Oracle, or you have Workday, the human processes around that still involve a spreadsheet; they involve PDF documents. A great example, one of my favorites right now, on YouTube with Microsoft, is Chevron. Chevron has hundreds of thousands of PDFs generated from every oil rig, every day, with all kinds of data in different formats: tables, different structured and semi-structured data. They would actually extract that data manually, to be able to process and analyze it, right. Working with Microsoft AI and UiPath RPA, they're able to automate that entire massive process. And now they're on stage talking about it at Microsoft and UiPath events, right. And they call that AI. That's applying AI to a massive problem for them. They need the robot to be completely accurate, though. You don't want to worry that the data being extracted from the PDFs is inaccurate, right. So machine learning goes into that, and there's exception management as part of that process as well. They call it AI. >> Yeah, some of this is just that people in the industry, the industry watchers, get very particular on different terminology. Let's not conflate artificial intelligence, or augmented intelligence, with machine learning, because they're different environments. I've heard Forrester talk about it as a spectrum, though; there's an umbrella for some of these. So we like to not get too pedantic on individual terms. >> Let me give you more examples. I think with the terms robotic and RPA, yes, it's true that the vast majority of the last couple of years of RPA has been very rules-based, right. Because most processes today, like in a call center, follow a rule: do this and this, then this and this. And so you're automating that same rules-based structure. But once that data's flowing through, you can actually look at the history of that data and turn a rules-based automation into an experience-based automation. And how do you do that? You apply machine learning algorithms. You apply DataRobot, Element AI, IBM Watson to it, right. But it's still the RPA platform that is driving that automation; it's just no longer rules-based, it's experience-based. A great example, at UiPath Together Dubai recently, was Dubai Customs.
They had a process where, when you declared something, let's say your box of chocolate, they had to open up a binder and find a classification code for that box of chocolate. Well, now they use our RPA product and make a call out to IBM Watson as part of the automation, and they just write in "pink box of candy-filled chocolate." Its deep learning comes back with a classification code, all part of an automated process. What happens? Dubai Customs lines go from two hours to a few minutes, right. It's a combination of our RPA capability, our automation capability, and the ability to bring in IBM Watson. Dubai Customs says they applied AI and solved a big problem. >> One of the things I was reading in the recent Gartner Magic Quadrant on RPA is that they had two classifications. One was, kind of, the automation does it all, and the other was people and machines together. Things like chatbots, some of the examples you've been giving, seem to be that combination. Where do those two fit together, or are those distinctions that you make? >> Yeah, I mean, Gartner's interesting. Gartner's a very IT-centric analyst firm, right, and IT, often, in my view, are very conventional thinkers and not the fastest to adopt breakthrough technologies. They weren't the fastest to adopt cloud, they weren't the fastest to adopt on-demand CRM, and they weren't the fastest to jump onto RPA, because they believe, why can't we use an API for everything? And the Gartner analysts, in the beginning of the Magic Quadrant process, spent a lot of time with us, and they were trying hard to say you should solve everything with an API. That's just not reality, right? It's not feasible, and it's not affordable, right? But RPA is not just the automation of a task or process; it's then applying a whole other set of technologies. We have 700 partners today in our ecosystem: natural language processing partners, right, machine learning partners, chatbot partners, as you mentioned. So we want to make it very easy, in a drag-and-drop way, to apply these great technologies to an automation to solve some big problem. What's fun to me right now is that there are a lot of great startups. They come out of, say, insurance, or they come out of financial services, and they've got a great algorithm and they know the business really well. And they probably have one or two amazing customers, and they're stuck. For them, and this came from a partner of ours: "You, UiPath, are becoming our best route to market, because you have the data. You have the workflow." Our job, I think, in some ways, is to make it easy to bring these technologies together and apply them to an automation, in a democratized way where a non-engineer can do it, and I think that's what's happening. >> Yeah, those integrations between environments can be very powerful, something we see. Every shop has lots of applications and lots of technical data, and they're not just sweeping the floor of everything they have. What are some of the limits of AI and RPA today, and where do you see things going? >> I think deep learning, we see very little of that. It's probably applied to some kind of science project here and there within companies. For the vast majority of our customers, they use machine learning within RPA for Computer Vision by default. But, you know, they're still not really at a stage of mass adoption of "what algorithms do I want to apply to a process?"
I think we're trying to make it easier for you to be able to drag and drop AI, as we call it, to make it easier to apply. But I think we're in very early days. And as you mentioned, there's market confusion on it. I know one thing from our 90-plus customers that are on our advisory boards. I know from them that their companies struggle with finding an ROI in AI, and, you know, I think we're helping there because we're applying it to real operations. They say the same thing about blockchain. I don't know, Stu. Do you know of a single example of a blockchain ROI, a great example? >> Yeah, it reminds me, big data was one of those, over half of the people failed to get the ROI they wanted. It's one of those promises of certain technologies - >> Right. >> That high-level hype, you know. Let's not pooh-pooh things that actually have tangible results - >> Yeah. >> And get things done. Even if you weren't following the strict guidelines of the API economy. >> Right, well, true, exactly right. What I find amazing is, I mentioned in another one of our conversations that 23,000 people have come to UiPath events this year. To our own events, not trade events and other shows, that's different. They want to get on stage and talk. They're delighted about this. And they're talking about, generally speaking, RPA helping them go digital. But they're all saying their ambition is to apply AI to make those processes smarter. To learn from it - to go from rules based to experience based. I think what's beautiful about UiPath is that we're a platform you can get there with over time. You can't necessarily predict the algorithms you're going to want to use in two or three years. We're not going to force you; you can apply any algorithm you want to the automation work going through. I think customers actually find that flexibility very comforting. >> It's one of those things I say: most companies have a cloud strategy. That needs to be written in pencil, not etched in stone. You need to revisit it every quarter. Same thing with what's happening in AI, and in your space things are changing so fast, and they need to be agile. >> That's right. >> They need to be able to make changes. In October, you're going to have a lot of those customers up on stage talking. Where will this AI discussion fit into UiPath Forward in Las Vegas? >> We talk a lot about our AI fabric, a framework around document understanding, getting the robots smarter and smarter about what they see on the screen, what they see on a document, what they see with handwriting, and improving the accuracy of visual understanding. Looking at face recognition and other types of images and being able to understand the images. Conversational understanding. The tone of an email. Is this person really upset? How upset? Or a conversational chatbot. Really evolving from mimicking humans with RPA to augmenting humans, and I think with that story, both in the innovations and the customer examples on stage, you're going to see the sophistication of the automations that are being used through UiPath grow exponentially. >> Okay, so I want to give you the final word on this. And I don't want to talk to the people that might poo-poo or argue RPA and AI and ML and all these things. Bring us inside your customers. What... where, how does that conversation start? Are they coming at it from AI, ML, RPA, or is there, you know, a business discussion that usually catalyzes this engagement? >> Our customers are starting with digital. They're trying to go digital.
They know they need digital transformation; it's been very, very hard. There's a real outcome that comes quickly from taking a mundane task that is expensive, and automating it. The outcomes are quick, often on projects that involve our partners like Accenture and others. The payback period on the entire project with RPA can be 6 months; it's self-funding. What other technology in B2B is self-funding in one year? That's part of the incredible adoption curve. But every single customer doesn't stop there. They say okay, I also want to know that I can go apply AI to this. It's in every conversation. So there's two big booms with UiPath and our RPA. The first is when you go digital, there's some great outcome. There's productivity gain, it's immediate, right. I guess I said the payback period is quick. The second big one is when you go and turn it from a rules-based to an experience-based process, or you apply AI to it, and there's another set of business benefits down the road. As more algorithms come out, you keep applying them to it. This is sort of the gift that keeps on giving. I think if we didn't have that connection to machine learning or AI, the enthusiasm level of the majority of our customers would not be anywhere near what it is today. >> Alright, well Bobby, really appreciate digging into the customer reality, RPA, AI, all the acronym soup that was going on, and we look forward to UiPath Forward at the Bellagio in Las Vegas this October. >> It'll be fun. >> Alright, I'm Stu Miniman, as always thank you so much for watching theCUBE.

Published Date : Jul 17 2019



Keynote Analysis | UiPath Forward 2018


 

(energetic music) >> Live from Miami Beach, Florida. It's theCUBE covering UiPathForward Americas. Brought to you by UiPath. >> Welcome to Miami everybody. This is theCUBE, the leader in live tech coverage. We're here covering the UiPathForward Americas conference. UiPath is a company that has come out of nowhere, really, and is a leader in robotic process automation, RPA. It really is about software robots. I am Dave Vellante and I am here with Stu Miniman. We have one day of coverage, Stu. We are all over the place this weekend, aren't we? Stu and I were in Orlando earlier. Flew down. Quick flight to Miami and we're getting the Kool-Aid injection from the RPA crowd. We're at the Fontainebleau in Miami. Kind of a cool hotel. Stu, you might remember, I'm sure you do, several years ago we did the very first .NEXT tour. .NEXT from Nutanix at this event. About this same size, maybe a little smaller. This is a little bigger. >> Dave, this is probably twice the size, about 1,500 people here. I remember about a year ago you started buzzing about RPA. Big growth in the market, you know, really enjoyed getting into the keynote here. You know, you said we were at Splunk and data was at the center of everything, and for the CEO here (mumbles), it's automation first. We talked about mobile first, cloud first, automation first. I know we've got a lot of things we want to talk about because, you know, I think back through my career, and I know you do too, automation is something we've been talking about for years. We struggle with it. There's challenges there, but there's a lot of things coming together, and that's why we have this new era that RPA is striking at to really explode this market. >> Yeah, so I made a little prediction that I put out on Twitter, I'll share with folks. I said there's a widening gap between the number of jobs available worldwide and the number of people to fill them. That's something that we know. And there's a productivity gap. And the numbers aren't showing up. We're not seeing bump-ups in productivity even though spending on technology is kind of through the roof. Robotic process automation is going to become a fundamental component of closing that gap, because companies, as part of the digital process transformation, want to automate. The market today is around a billion. We see it growing 10x over the next five to seven years. We're going to have some analysts on today from Forrester, we'll dig into that a little bit, they cover this market really, really closely. So, we're hearing a lot more about RPA. We heard it last week at Infor, Charles Phillips was a big proponent of this. UiPath has been in this business now for a few years. It came out of Romania. Daniel Dines, former Microsoft executive, very interesting fellow. First time I've seen him speak. We're going to meet him today. He is a techie. Comes on stage with a T-shirt, you know. He's very sort of thoughtful, he's talking about open, about culture, about having fun. Really dedicated to listening to customers and growing this business. He gave us a data point that they went from nothing, just a couple of million dollars, two years ago. They're doing 140 million now in annual recurring revenue. On their way to 200. I would estimate they'll probably get there, if not by the end of the year, probably by the first quarter next year. So let's take a look at some of the things that we heard in the keynote. We heard from customers. A lot of partners here.
Seen a lot of the big SIs diving in. That's always a sign of big markets. What did you learn today at the keynotes? >> Yeah, Dave, first thing, there is definitely, one of the pushbacks about automation is, "Oh wait, what is that going to do for jobs?" You touched on it. There's a lot of stats they threw out. They said that RPA can really bring, you know, 75% productivity improvement, because we know productivity improvement has kind of stalled out overall in the market. And what we want to do is get rid of mundane tasks. Dave, I spent a long time of my career helping to figure out, you know, how do we make infrastructure simpler? How do we get rid of those routine things? In the storage world they said if you were configuring LUNs, you need to go find another job. If you were networking certain basic things, we're going to automate that with software. But there are things that automation is going to be able to do so that you can be more creative. You can spend more time doing some higher-level functions. And that's where we have a skills gap. I'm excited we're going to have Tom Clancy, who you and I know. I've got his book on the shelf, and not Tom Clancy the fiction author, but you know, the Tom Clancy who has done certifications and education through storage and cloud, and now, how do we get people ready for this next wave of how people and machines can work together. One of my favorite events, Dave, that we ever did was the Second Machine Age with MIT in London. Talking about how it's really people plus machines where you're going to get that boom. You've interviewed Garry Kasparov on this topic, and it's just fascinating, and it really excites me as someone, I mean, I've lived with my computers all my life, and just as a technologist, I'm optimistic at how, you know, the two sides together can be much more powerful than either alone. >> Well, it's an important topic, Stu. A lot of the shows that we go to, the vendors don't want to talk about that. "Oh, we don't want to talk about displacing humans." UiPath's perspective on that, and we'll poke at them a little on that, is, "That's old news. People are happy because they're replacing their 'mundane tasks.'" And while that's true, there's some action on Twitter. (mumbles name) just tweeted out, replying to some of the stuff that we were talking about here on the hashtag, which is #UiPathForward, "Automation displaces unskilled workers, that's the crux of the problem. We need best algorithms to automate re-training and re-skilling of workers. That's what we need the most for best socio-economic outcomes, in parallel to automation through algorithm-driven machines." He's right. That gap, and we talked about this at 2MA, is it going to be a creativity gap? It's an education issue, it's an education challenge. 'Cause you just don't want to displace unskilled workers, we want to re-train people. >> Right, absolutely. You could have this hollowing out of the marketplace otherwise, where you have really low-paid workers on the one end, and you have really high-end creative workers, but the middle, you know, the middle-class workers, could be displaced if they are not re-trained, if they're not put forward. The World Economic Forum actually said that this automation is going to create 60 million net new jobs. Now, 60 million, it sounds like a big number, but it is a large global workforce.
And actually, Dave, one of the things that really struck me is, not only do you have a Romanian founder, but up on stage we had a Japanese customer giving a video in Japanese with the subtitles in English. Not something that you typically see at a U.S. show. Very global in their reach. You talked about the community and the very open-source focus, something we've seen. This is how software grows very fast, as you get those people working. It's something I want to understand. UiPath has 2,000 customers, but they've got 114,000 certified RPA developers. So, I'm like, okay, wait. Those numbers don't make sense to me yet, but I'm sure our guests are going to be able to explain them. >> And so you're right about the need for education. I was impressed that UiPath is actually spending some of the money that it's raised. This company just did a monster raise, 225 million. We had Carl Eschenbach on in theCUBE studio to talk about that. Jeff Frick interviewed him last week. You can find that interview on our YouTube playlist and I think on our website as well. But they invested, I think it was 10 million dollars, with the goal of training a million students in the next three years. They've hired Tom Clancy, who we know from the old EMC education world. EMC training and education world. So they've got a pro in here who knows how to scale training. So that's huge. They've also started a 20-million investment fund investing in startups and ecosystem companies, so they're putting their money where their mouth is. The company has raised over 400 million dollars to date. They've got a 3-billion dollar valuation. Some of the other things we've heard from the keynote today, um, they've got about 1,400 employees, which is way up. They were just 270, I believe, last year. And they're claiming, and I think it's probably true, they're the fastest growing enterprise software company in history, which is kind of astounding. Like you said, given that they came out of Romania, this global company, maybe that's part of the reason why.
Daniel Dines, talked about taking things from hours to minutes, from sort of accurate to perfectly accurate. You know, slow to fast. From very time consuming to automated. So, he puts forth this vision of automation first. He talked about the waves, main frames, you know the traditional waves client server, internet, etc. And then, you know I really want to poke at this and dig into it a little bit. He talked about a computer vision and that seemed to be a technical enabler. So, I'm envisioning this sort of computer vision, this visual, this ability to visualize a robot, to visualize what's happening on the screen, and then a studio to be able to program these things. I think those are a couple of the components I discerned. But, it's really about a cultural shift, a mind shift, is what Daniel talked about, towards an automation first opportunity. >> And Dave, one of the things you said right there... Three things, the convergence of computer vision, the Summer of AI, and what he meant by that is that we've lived through a bunch of winters. And we've been talking about this. And, then the business.. >> Ice age of a, uh... >> Business, process, automation together, those put together and we can create that automation first era. And, he talked about... We've been talking about automation since the creation of the first computer. So, it's not a new idea. Just like, you know we've been talking on theCUBE for years. You know, data science isn't a new thing. We sometimes give these things new terms like RPA. But, I love digging into why these are real, and just as we've seen these are real indicators, you know, intelligence with like, whether you call it AI or ML, are doing things in various environments that we could not do in the past. Just borders of magnitude, more processing, data is more important. We could do more there. You know, are we on the cusp of really automation. being able to deliver on the things that we've been trying to talk about a couple of generations? >> So a couple of other stats that I thought were interesting. Daniel put forth a vision of one robot for every person to use. A computer for every person. A chicken for every pot, kind of thing (laughs) So, that was kind of cool. >> "PC for every person," Bill Gates. >> Right, an open and free mind set, so he talked a about, Daniel talked about of an era of openness. And UiPath has a market place where all the automations. you can put automations in there, they're all free to use. So, they're making money on the software and not on the automation. So, they really have this... He said, "We're making our competitors better. "They're copying what we're doing, "and we think that's a good thing. "Because it's going to help change the world." It's about affecting society, so the rising tides lift all boats. >> Yeah Dave, it reminds me a lot of, you know, you look at GitHub, you look at Docker Hub. There's lots of places. This is where code lives in these open market places. You know, not quite like the AWS or IBM market places where you can you can just buy software, but the question is how many developers get in there. They say they got 250,000 community members already there. So, and already what do they have? I think hundreds of processes that are built in there, so that will be a good metric we can see to how fast that scales. >> We had heard from a couple of customers, and Wells Fargo was up there, and United Health. Mr. Yamomoto from SNBC, they have 1,000 robots. 
So, they are really completely transforming their organization. We heard from a partner, Data Robot, Jeremy Atchins, somebody who's been on theCUBE before, Data Robot. They showed an automated loan processing where you could go in, talk to a chat bot and within minutes get qualified for a loan. I don't know if you noticed the loan amount was $7,000 and the interest rate was 13.6% so the applicant, really, must not of had great credit history. Cause that's kind of loan shark rates, but anyway, it was kind of a cool demo with the back end data robot munging all the data, doing whatever they had to do, transferring through a CSV into the software robot and then making that decision. So, that was kind of cool, those integrations seemed to be pretty key. I want to learn more about that. >> I mean it reminds me of chat box have been hot in a lot of areas lately, as how we can improve customer support and automate things on infrastructure in the likes of, we'll see how those intersections meet. >> Yeah, so we're going to be covering this all day. We got technologists coming on, customers, partners. Stu and I will be jamming. He's @Stu and I'm @Dvellante. Shoot us any questions, comments. Thanks for the ones we've had so far. We're here at the Fontainebleau in Miami Beach. Pretty crazy hotel. A lot of history here. A lot of pictures of Frank Sinatra on the wall. Keep it right there, buddy. You're watching theCUBE. We'll be right back after this short break. (energetic music)

Published Date : Oct 4 2018



Paul Barth, Podium Data | The Podium Data Marketplace


 

(light techno music) >> Narrator: From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now here's your host, Stu Miniman. >> Hi, I'm Stu Miniman, and welcome to theCUBE conversation here in our Boston area studio. Happy to welcome back to the program Paul Barth, who's the CEO of Podium Data, also a Boston area company. Paul, great to see you. >> Great to see you, Stu. >> Alright, so we last caught up with you at a fun event that we do at MIT talking about information, data quality, so we kind of understand why your company would be there. For our audience that doesn't know, just give us a quick summary: your background, and what was kind of the why of Podium Data back when it was founded in 2014. >> Oh that's great, Stu, thank you. I've spent most of my career helping large companies with their data and analytic strategies, next-generation architectures, new technologies, et cetera, and in doing this work, we kept stumbling across the complexity of adopting new technologies. And around the time that big data and Hadoop were getting popular, with lots of hype in the marketplace, we realized that traditional large businesses couldn't manage data on it because the technology was so new and different. So we decided to form a software company that would automate a lot of the processing, manage a catalog of the data, and make it easy for nontechnical users to access their data. >> Yeah, that's great. You know, when I think back to when we were trying to help people understand this whole big data wave, one of the pithy things we did was talk about turning all this glut of data from a problem into an opportunity, how do we put this into the hands of users. But a lot of things kind of, we hit bumps in the road as an industry. Studies said more than 50 percent of these projects fail. You brought up a great point, tooling is tough, changing processes is really challenging. But that focus on data is core to our research, what we talk about all the time. But now it's automation and AIML, choose your favorite acronym of the day. This is going to solve all the ills that the big data wave didn't do right. Right, Paul? So maybe you can help us connect the dots a little bit, because I see a lot of the foundation of that trend carrying from big data into the automation and AI thing. So you were maybe just a little ahead of your time. >> Well thanks. I saw an opportunity before there was anything in the marketplace that could help companies really corral their data, get some of the benefits of consolidation, some oversight and management through an automated catalog and the like. As AI has started to emerge as the next hype wave, what we're seeing consistently from our partners like Data Robot and others who have great AI technology is that they're starved for good information. You can't learn automatically, or even do human learning, if you're given inconsistent information, data that's not conformed or ready or consistent, where you can look at a lot of different events and start to build correlations. So we believe that we're still a central part of large companies building out their analytics infrastructure. >> Okay, help us kind of look at your users and how you fit into this changing ecosystem. We all know things are just changing so fast. From 2014 to today, cloud is so much bigger, the big waves of IoT keep coming. Everybody's got some kind of machine learning initiative. So what are the customers looking for, and how do you fit in some of those different environments?
I think when we formed the company, we recognized that the cost-performance differential of the open-source data management platforms like Hadoop, and now Spark, was so dramatically better than the traditional databases and data warehouses that we could transform the business process of how you get data from raw to ready. And that's a consistent problem for large companies: they have data in legacy formats, on mainframes, they have it in relational databases, they have it in flat files, in the cloud, behind the firewall, and these silos continue to grow. This consistent view of your business, your customers, your processes, your operations, is central to optimizing and automating the business today. So our business users are looking for a couple of things. One thing they are looking for is some manageability and a consistent view of their data no matter where it lives, and our catalog can create that automatically in days or weeks, depending on how big or how broadly we go. They're looking for that visibility, but they're also looking for productivity enhancements, which means they can start leveraging that data without a big IT project. And finally they're looking for agility, which means there's self-service, there's an ability to access data that you know is trusted and secured and safe for the end users to use, without having to call IT and have a programmer spin something up. So they're really looking for a totally new paradigm of data delivery. >> I tell you, that hits on so many things that we've been seeing, and a challenge that we've seen in the marketplace. In my world, talking about people, they had their data centers, and if I look at my data and I look at my applications, it's this heterogeneous nightmare. We call it hybrid or multi-cloud these days, and it shows the promise of making me faster and all this stuff. But as you said, my data is all over the place, my applications are getting spun up, and maybe I'm moving them and federating things and all that. But my data is one of the most critical components of my business. Maybe explain a little bit how that works. Where do the customers come in and say, oh my gosh, I've got a challenge and Podium Data's helping, and the marketplace and all that. >> Sure, first of all, we targeted from the start large regulated businesses, financial services, pharmaceutical, healthcare, and we've broadened since then. But these companies' data issues were really pressure from both ends. One was compliance pressure. They needed to develop regulatory reports that could be audited and proven correct. If your data is in many silos and it's compiled manually using spreadsheets, that's not only incredibly expensive and nonreproducible, it's really not auditable. So a lot of these folks were pressured to prove that the data they were reporting was accurate. On the other side, it's the opportunity cost. Fintech companies are coming into their space offering loans and financial products without any human interaction, without any branches. They knew that data was central to that. The only way you can make an offer to someone for a financial product is if you know enough about them that you understand the risk. So the use and leverage of data reached a critical mass. There was good money to invest in it, and they also saw that the old ways of doing this just weren't working. >> Paul, does your company help with the incoming GDPR challenges that are being faced?
Sure, last year we introduced a PII detector and protection scheme. That may not sound like such a big deal, but in the Hadoop open-source world it is. At the end of the day, this technology, while cheap and powerful, is incredibly immature. So when you land data, for example, into these open data platforms like S3 out in the cloud, Podium takes the time to analyze that data and tell you what the structures of the data are, where you might have issues with sensitive data, and has the tooling, like obfuscation and encryption, to protect the data so you can create safe-to-use data. I'd say our customers right now, they started out behind the firewall. Again, these regulated businesses were very nervous about breaches. They're looking and realizing they need to get to the cloud, 'cause frankly, not only is it a better platform for them from a cost basis and scalability, it's actually where the data comes from these days; their data suppliers are in the cloud. So we're helping them catalog their data, identify the sensitive data, and prepare data sets to move to the cloud, and then migrate it to the cloud and manage it there. >> Such a critical piece. I lived in the storage world for about a decade. There was a little acquisition that they made of a company called Pi, P-I. It was Paul Maritz, who a lot of people know; Paul had a great career at Microsoft and went on to run VMware for a bunch of years. But the vision you talk about reminds me of what I heard Paul Maritz talking about. Gosh, that was a decade ago. Information, so much sensitivity. Expand a little bit on the security aspect there. When I looked through your website, you're not a security company per se, but are there partnerships? How do you help customers with, I want to leverage data but I need to be secure, all the GRC and security things that are super challenging. >> In this space, to achieve agility and scale on a new technology, you have to be enterprise ready. So in version one of our product, we had security features that included field-level encryption and protection, but also integration with LDAP and Kerberos and other enterprise-standard mechanisms and systems that would protect data. We can interoperate with Protegrity and other kinds of encryption and protection algorithms with our open architecture. But it's kind of table stakes to get your data into a secured, monitorable infrastructure if you're going to enable this agility and self-service. Otherwise you restrict the use of the new data technologies to sandboxes. The failures you hear about are not in the sandboxes, in the exploration; they're in getting those to production. I had one of my customers talk about how, before Podium, they had 50 different projects on Hadoop, and all of them were in code red, and none of them could go to production. >> Paul, you mentioned catalogs, give us the update. What's the newest from Podium Data? Help explain that a little bit more. >> So we believe that the catalog has to help operationalize the data delivery process. So one of the things we did from the very start was say, let's use the analytical power of big data technologies, Spark, Hadoop, and others, to analyze the data on its way into the platform and build a metadata catalog out of that. So we have over 100 profiling statistics that we automatically calculate and maintain for every field of every file we ever load. It's not something you do as an afterthought or selectively.
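To make the profiling idea concrete, here is a minimal sketch of catalog-style field profiling in Python: a few statistics per field, a "this field looks like PII" inference of the kind just mentioned, and a crude similarity score between two field profiles, anticipating the duplicate-detection analytics Paul describes next. The chosen statistics, the email regex, and the similarity formula are illustrative assumptions, not Podium's actual implementation, which maintains over 100 statistics per field.

import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def profile_field(values):
    """Return a small profile dict for one column of string values."""
    non_null = [v for v in values if v not in ("", None)]
    lengths = [len(v) for v in non_null] or [0]
    return {
        "null_rate": 1 - len(non_null) / max(len(values), 1),
        "distinct_rate": len(set(non_null)) / max(len(non_null), 1),
        "min_len": min(lengths),
        "max_len": max(lengths),
        # An inference, not a hard rule: tag the field, let governance decide.
        "looks_like_pii": sum(bool(EMAIL_RE.match(v)) for v in non_null)
                          > 0.8 * max(len(non_null), 1),
    }

def profile_similarity(p1, p2):
    """Crude 0-to-1 similarity between two profiles, from the numeric stats."""
    keys = ["null_rate", "distinct_rate", "min_len", "max_len"]
    diffs = [abs(p1[k] - p2[k]) / (1 + max(p1[k], p2[k])) for k in keys]
    return 1 - sum(diffs) / len(keys)

a = profile_field(["bob@x.com", "amy@y.org", "", "joe@z.net"])
b = profile_field(["bob@x.com", "amy@y.org", "joe@z.net", "kim@w.io"])
print(a["looks_like_pii"], round(profile_similarity(a, b), 2))

In a production catalog, these profiles would be computed at load time, with Spark or similar, and stored with the catalog entry, which is what lets duplicate detection run over metadata instead of re-scanning the data itself.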
We knew from our experience that we needed to do that, data validation, and then bring in inferences, such as this field looks like PII data, and tag that in the metadata. That process of taking in data, and this even applies to legacy mainframe data coming in a VSAM format, it gets converted and landed to a usable format automatically. But the most important part is the catalog gets enriched with all this statistical profiling information, validation, all of the technical information, and we interoperate, as well as have a GUI to help, with business tagging, business definitions, and the like. >> Paul, just a little bit of a broader industry question. We talked about the value of data; I think everybody understands how important it is. How are we doing in understanding the value of that data, though? Is that a monetization thing? You've got academia in your background, there's debates, we've talked to some people at MIT about this. How do you look at data value as an industry in general? Is there anything from Podium Data that helps people identify, are we leveraging it, are we doing the most? What are your thoughts around that? >> So I'd say, for someone who's looking for a good framework to think about this, I'd recommend Doug Laney's book on infonomics; we've collaborated for a while, and he's doing a great job there. But there's also just blocking and tackling, which is, what data is getting used? Or a common one for our customers is, where do I have data that's duplicate, or comes from the same source but is not exactly the same? That often causes reconciliation issues in finance, or in forecasting, in sales analysis. So what we've done with our data catalog, with all these profiling statistics, is start to build some analytics that identify similar data sets that don't have to be exactly the same, to say you may have a version of the data that you're trying to load here already available. Why don't you look at that data set and see if that one is preferred? And the data governance community really likes this. For one of our customers, there were literally millions of dollars in savings from eliminating duplication, but the more important thing is the inconsistency, when people are using similar but not the same data sets. So we're seeing that as a real driver. >> I want to give you the final word. Just what are you seeing out in the industry these days, biggest opportunities, biggest challenges, from the users you're talking to? >> Well, what I'd say is, when we started this, it was very difficult for traditional businesses to use Hadoop in production, and they needed an army of programmers, and I think we solved that. Last year we started on our work to move to a post-Hadoop world, so the first thing we've done is open up our cataloging tools so we can catalog any data set in any source, and allow the data to be brought into an analytical environment or production environment more on demand, rather than the idea that you're going to build a giant data lake with everything in it and replicate everything. That's become really interesting, because you can build the catalog in a few weeks and then actually use the analysis and all the contents to drive the strategy. What do I prioritize, where do I put things? The other big initiative is, of course, cloud. As I mentioned earlier, you have to protect and make cloud-ready data behind your firewall, and then you have to know where it's used and how it's used externally.
We automate a lot of that process and make that transition something that you can manage over time, and that is now going to be extended into multi-cloud, multi-lake types of technologies. >> Multi-cloud, multi-lake, alright. Well, Paul Barth, I appreciate getting the update on everything happening with Podium Data. Well, theCUBE has had so many events this year; be sure to check out thecube.net for all the upcoming events and all the existing interviews. I'm Stu Miniman, thanks for watching theCUBE. (light techno music)

Published Date : Apr 26 2018



Next-Generation Analytics Social Influencer Roundtable - #BigDataNYC 2016 #theCUBE


 

>> Narrator: Live from New York, it's the Cube, covering big data New York City 2016. Brought to you by headline sponsors, CISCO, IBM, NVIDIA, and our ecosystem sponsors. Now here's your host, Dave Vellante. >> Welcome back to New York City, everybody, this is the Cube, the worldwide leader in live tech coverage, and this is a Cube first: we've got a nine person, actually eight person, panel of experts, data scientists all alike. I'm here with my co-host, James Kobielus, who has helped organize this panel of experts. James, welcome. >> Thank you very much, Dave, it's great to be here, and we have some really excellent brain power up there, so I'm going to let them talk. >> Okay, well thank you again-- >> And I'll interject my thoughts now and then, but I want to hear them. >> Okay, great, we know you well, Jim, we know you'll do that, so thank you for that, and appreciate you organizing this. Okay, so what I'm going to do with our panelists is ask you to introduce yourself. I'll introduce you, but tell us a little bit about yourself, and talk a little bit about what data science means to you. A number of you started in the field a long time ago, perhaps data warehouse experts before the term data science was coined. Some of you started probably after Hal Varian said it was the sexiest job in the world. (laughs) So think about how data science has changed and/or what it means to you. We're going to start with Greg Piatetsky, who's from Boston. A Ph.D., KDnuggets. Greg, tell us about yourself and what data science means to you. >> Okay, well thank you Dave, and thank you Jim for the invitation. Data science, in a sense, is the second oldest profession. I think people have this built-in need to find patterns, and whatever we find, we want to organize the data. But we do it well on a small scale, and we don't do it well on a large scale, so really, data science takes our need and helps us organize what we find, the patterns that we find that are really valid and useful and not just random; I think this is a big challenge of data science. I actually started in this field before the term data science existed. I started as a researcher and organized the first few workshops on data mining and knowledge discovery, and the term data mining became less fashionable, became predictive analytics, now it's data science, and it will be something else in a few years. >> Okay, thank you. Eves Mulkearns. Eves, I of course know you from Twitter. A lot of people know you as well. Tell us about your experiences and what data science means to you. >> Well, data science to me is, if you take the two words, the data and the science: the science holds a lot of expertise and skills, it's statistics, it's mathematics, it's understanding the business and putting that together with the digitization of what we have. It's not only the structured data or the unstructured data you store in the database, trying to get it out and understand what is in there, but even video that is coming in, and then trying to find, like Greg already said, the patterns in there, and bringing value to the business, looking at it from a technical perspective but still linking that to the business insights. You can do that on a technical level, but then you don't know yet what you need to find, or what you're looking for. >> Okay great, thank you. Craig Brown, Cube alum. How many people have been on the Cube actually before? >> I have. >> Okay, good. I always like to ask that question.
So Craig, tell us a little bit about your background and, you know, data science, how has it changed, what's it all mean to you? >> Sure, so I'm Craig Brown, I've been in IT for almost 28 years, and that was obviously before the term data science, but I've evolved from, I started out as a developer, and evolved through the data ranks, as I call it, working with data structures, working with data systems, data technologies, and now we're working with data pure and simple. Data science to me is an individual, or team of individuals, that dissect the data, understand the data, help folks look at the data differently than just the information that, you know, we usually use in reports, and get more insights on how to utilize it and better leverage it as an asset within an organization. >> Great, thank you Craig. Okay, Jennifer Shin? Math is obviously part of being a data scientist. You're good at math, I understand. Tell us about yourself. >> Yeah, so I'm a senior principal data scientist at the Nielsen Company. I'm also the founder of 8 Path Solutions, which is a data science, analytics, and technology company, and I'm also on the faculty in the Master of Information and Data Science program at UC Berkeley, teaching statistics for data science this semester, actually. And I think, for me, I consider myself a scientist primarily, and data science is a nice day job to have, right? Something where there's industry need for people with my skill set in the sciences, and data gives us a great way of being able to communicate what we know in science in a way that can be used out there in the real world. I think the best benefit for me is that now that I'm a data scientist, people know what my job is, whereas before, maybe five, ten years ago, no one understood what I did. Now, people don't necessarily understand what I do now, but at least they understand kind of what I do, so it's still an improvement. >> Excellent. Thank you Jennifer. Joe Caserta, you're somebody who started in the data warehouse business, and saw that snake swallow a basketball and grow into what we now know as big data, so tell us about yourself. >> So I've been doing data for 30 years now, and I wrote the Data Warehouse ETL Toolkit with Ralph Kimball, which is the best selling book in the industry on preparing data for analytics, and with the big paradigm shift that's happened, you know, for me the past seven years has been, instead of preparing data for people to analyze to make decisions, now we're preparing data for machines to make the decisions, and I think that's the big shift from data analysis to data analytics and data science. >> Great, thank you. Miriam, Miriam Fridell, welcome. >> Thank you. I'm Miriam Fridell, I work for Elder Research, we are a data science consultancy, and I came to data science through a very circuitous route. I started off as a physicist, went to work as a consultant and software engineer, then became a research analyst, and finally came to data science. And I think one of the most interesting things to me about data science is that it's not simply about building an interesting model and doing some interesting mathematics, or maybe wrangling the data, all of which I love to do, but it's really the entire analytics lifecycle, and the value that you can actually extract from data at the end. And that's one of the things that I enjoy most, is seeing a client's eyes light up, or a wow, I didn't really know we could look at data that way, that's really interesting.
I can actually do something with that, so I think that, to me, is one of the most interesting things about it. >> Great, thank you. Justin Sadeen, welcome. >> Absolutely, thank you, thank you. So my name is Justin Sadeen, I work for Morph EDU, an artificial intelligence company in Atlanta, Georgia, and we develop learning platforms for non-profit and private educational institutions. I'm a Marine Corps veteran turned data enthusiast, and what I think about data science is the intersection of information, intelligence, and analysis, and I'm really excited about the transition from big data into smart data, and that's what I see data science as. >> Great, and last but not least, Dez Blanchfield, welcome mate. >> Good day. Yeah, I'm the one with the funny accent. So data science for me is probably the funniest job I've ever had to describe to my mom. I've had quite a few different jobs, and she's never understood any of them, and this one she understands the least. I think a fun way to describe what we're trying to do in the world of data science and analytics now is that it's the equivalent of high-altitude mountain climbing. It's like the extreme sport version of the computer science world, because we have to be this magical unicorn of a human that can understand plain-English problems from the C-suite down and then translate them into code, either solo or as teams of developers. And so there's this black art where we're expected to be able to transmogrify something that we just say in plain English, I would like to know X, and we have to go and figure it out. So there's this neat extreme sport view I have of rushing down the side of a mountain on a mountain bike and just dodging rocks and trees and things occasionally, because invariably, we do have things that go wrong, and they don't quite give us the answers we want. But I think we're at an interesting point in time now with the explosion in the types of technology that are at our fingertips, and the scale at which we can do things now. Once upon a time we would sit at a terminal and write code and just look at data and watch it in columns, and then we ended up with spreadsheet technologies at our fingertips. Nowadays it's quite normal to instantiate a small high-performance distributed cluster of computers, effectively a supercomputer, in a public cloud, and throw some data at it and see what comes back. And we can do that on a credit card. So I think we're at a really interesting tipping point now where this coinage of data science needs to be slightly better defined, so that we can help organizations who have weird and strange questions that they want to ask, tell them solutions to those questions, and deliver on them in, I guess, a commodity deliverable. I want to know xyz and I want to know it in this time frame and I want to spend this much money to do it, and I don't really care how you're going to do it. And there's so many tools we can choose from and there's so many platforms we can choose from, it's this little black art of computing, if you'd like; we're effectively making it up as we go in many ways. So I think it's one of the most exciting challenges that I've had, and I'm pretty sure I speak for most of us in that we're lucky that we get paid to do this amazing job. That we get to make it up on a daily basis, in some cases. >> Excellent, well okay. So we'll just get right into it. I'm going to go off script-- >> Do they have unicorns down under? I think they have some strange species, right?
Well, we put the pointy bit on the back. You guys have it on the front. >> So I was at an IBM event on Friday. It was a chief data officer summit, and I attended what was called the Data Divas' breakfast. It was a women in tech thing, and one of the CDOs, she said that 25% of chief data officers are women, which is much higher than you would normally see in the profile of IT. We happen to have 25% of our panelists as women. Is that common? Miriam and Jennifer, is that common for the data science field? Or is this a higher percentage than you would normally see-- >> James: Or a lower percentage? >> I think certainly for us, we have hired a number of additional women in the last year, and they are phenomenal data scientists. I don't know that I would say, I mean, I think it's certainly typical that this is still a male-dominated field, but I think like many male-dominated fields, physics, mathematics, computer science, that that is slowly changing and evolving, and I think certainly that's something that we've noticed in our firm over the years at our consultancy, as we're hiring new people. So I don't know if I would say 25% is the right number, but hopefully we can get it closer to 50. Jennifer, I don't know if you have... >> Yeah, so I know at Nielsen we have actually more than 25% of our team as women, at least the team I work with, so there seems to be a lot of women who are going into the field. Which isn't too surprising, because with a lot of the issues that come up in STEM, one of the reasons why a lot of women drop out is because they want real-world jobs and they feel like they want to be in the workforce, and so I think this is a great opportunity, with data science being so popular, for these women to actually have a job where they can still maintain that engineering and science background that they learned in school. >> Great, well Hilary Mason, I think, was the first data scientist that I ever interviewed, and I asked her about the sort of skills required, and the first question that we wanted to ask, I just threw other women in tech in there, 'cause we love women in tech, is about this notion of the unicorn data scientist, right? It's been put forth that the skill sets required to be a data scientist are so numerous that it's virtually impossible to have a data scientist with all those skills. >> And I love Dez's extreme sports analogy, because that plays into the whole notion of data science; we like to talk about the theme now of data science as a team sport. Must it be an extreme sport is what I'm wondering, you know. The unicorns of the world seem to be... Is that realistic now in this new era? >> I mean, when automobiles first came out, they were concerned that there wouldn't be enough chauffeurs to drive all the people around. Is there an analogy with data, to be a data-driven company? Do I need a data scientist, and does that data scientist, you know, need to have this unbelievable mixture of skills? Or are we doomed to always have a skill shortage? Open it up. >> I'd like to have a crack at that. So it's interesting, when automobiles were a thing, when they first brought cars out, and before they were modernized by the likes of Ford's Model T, when we got away from the horse and carriage, they actually had human beings walking down the street with a flag, warning the public that the horseless carriage was coming, and I think data scientists are very much like that.
That we're kind of expected to go ahead of the organization and try and take the challenges we're faced with today and see what's going to come around the corner. And so we're like the little flag-bearers, if you'd like, in many ways of, this is where we're at today, tell me where I'm going to be tomorrow, and try and predict the day after as well. It is very much becoming a team sport, though. But I think the concept of data science being a unicorn has come about because the coinage hasn't been very well defined, you know. If you were to ask 10 people what a data scientist is, you'd get 11 answers, and I think this is a really challenging issue for hiring managers and C-suites when the job specs say, I want data science, I want big data, I want an analyst. They don't actually really know what they're asking for. Generally, if you ask for a database administrator, it's a well-described job spec, and you can just advertise it and some 20 people will turn up, and you interview to decide whether you like the look and feel and smell of 'em. When you ask for a data scientist, there's 20 different definitions of what that one data science role could be. So we don't initially know what the job is, we don't know what the deliverable is, and we're still trying to figure that out, so yeah. >> Craig, what about you? >> So from my experience, when we talk about data science, we're really talking about a collection of experiences with multiple people. I've yet to find, at least from my experience, a data science effort with a lone wolf. So you're talking about a combination of skills, and so no one individual needs to have all that makes a data scientist a data scientist, but you definitely have to have the right combination of skills amongst a team in order to accomplish the goals of a data science team. So from my experiences and from the clients that I've worked with, we refer to the data science effort as a data science team. And I believe that's very appropriate to the team sport analogy. >> For us, we look at a data scientist as a full-stack web developer, a jack of all trades. I mean, they need to have a multitude of backgrounds, coming from a programmer, from an analyst. You can't find one subject matter expert, it's very difficult. And if you're able to find a subject matter expert, you know, through the lifecycle of product development, you're going to require that individual to interact with a number of other members from your team who are analysts, and then you just end up, well, training this person to be, again, a jack of all trades, so it comes full circle. >> I own a business that does nothing but data solutions, and we've been in business 15 years, and the transition over time has been going from being a conventional-wisdom-run company with a bunch of experts at the top to becoming more of a data-driven company using data warehousing and BI, but now the trend is absolutely analytics driven. So if you're not becoming an analytics-driven company, you are going to be behind the curve very, very soon, and it's interesting that IBM is now coining the phrase of a cognitive business. I think that is absolutely the future. If you're not a cognitive business from a technology perspective, and an analytics-driven perspective, you're going to be left behind, that's for sure.
So in order to stay competitive, you know, you need to really think about data science, think about how you're using your data, and I also see that what's considered the data expert has evolved over time too, where it used to be just someone really good at writing SQL, or someone really good at writing queries in any language, but now it's becoming more of an interdisciplinary practice where you need soft skills and you also need the hard skills, and that's why I think there's more females in the industry now than ever. Because you really need to have a really broad range of experiences that really wasn't required in the past. >> Gregory Piatetsky, you have a comment? >> So there are not too many unicorns in nature or as data scientists, so I think organizations that want to hire data scientists have to look for teams, and there are a few unicorns like Hillary Mason or maybe Usama Fayyad, but they generally tend to start companies and it's very hard to retain them as data scientists. What I see is another evolution: automation. And, you know, steps like IBM Watson and the DataFirst platform are eventually a great advance for data scientists in the short term, but probably what's likely to happen in the longer term is more and more of those skills becoming subsumed by a machine learning layer within the software. How long will it take, I don't know, but I have a feeling that the paradise for data scientists may not be very long-lived. >> Greg, I have a follow-up question to what I just heard you say. When a data scientist, let's say a unicorn data scientist, starts a company, as you've phrased it, and the company's product is built on data science, do they give up being a data scientist in the process? It would seem that they become a data scientist of a higher order if they've built a product based on that knowledge. What are your thoughts on that? >> Well, I know a few people like that, so I think maybe they remain data scientists at heart, but they don't really have the time to do the analysis and they really have to focus more on strategic things. For example, today actually is the birthday of Google, founded 18 years ago, and Larry Page and Sergey Brin wrote a very influential paper back in the '90s about PageRank. Have they remained data scientists? Perhaps a very, very small part, but that's not really what they do, so I think those unicorn data scientists quickly evolve to where they really have to look for teams to capture those skills. >> Clearly they come to a point in their career where they build a company based on teams of data scientists and data engineers and so forth, which relates to the topic of team data science. What is the right division of roles and responsibilities for team data science? >> Before we go, Jennifer, did you have a comment on that? >> Yeah, so I guess I would say for me, when data science came out and there was, you know, the Venn diagram that came out about all the skills you were supposed to have? I took a very different approach than all of the people who I knew who were going into data science. Most people started interviewing immediately, they were like this is great, I'm going to get a job. I went and learned how to develop applications, and learned computer science, 'cause I had never taken a computer science course in college, and made sure I trued up that one part where I didn't have those skills from school, so I went headfirst and just learned it, and now I actually have a lot of technology patents as a result of that.
So to answer Jim's question: I actually started my company about five years ago. It originally started out as a consulting firm slash data science company, then it evolved, and one of the reasons I went back into industry, and why I'm now at Nielsen, is because you really can't do the same sort of data science work when you're actually doing product development. It's a very very different sort of world. You know, when you're developing a product, you're developing a core feature or functionality that you're going to offer clients and customers, so I think definitely you really don't get to have that wide range of, sort of, looking at 8 million models and testing things out. That flexibility really isn't there as your product starts getting developed. >> Before we go into the team sport, the hard skills that you have, are you all good at math? Are you all computer science types? How about math? Are you all math? >> What were your GPAs? (laughs) >> David: Anybody not math oriented? Anybody not love math? You don't love math? >> I love math, I think it's required. >> David: So math yes, check. >> You dream in equations, right? You dream. >> Computer science? Do I have to have computer science skills? At least the basic knowledge? >> I don't know that you need to have formal classes in any of these things, but I think certainly, as Jennifer was saying, if you have no skills in programming whatsoever and you have no interest in learning how to write SQL queries or R or Python, you're probably going to struggle a little bit. >> James: It would be a challenge. >> So I think yes, I have a Ph.D. in physics, I did a lot of math, it's my love language, but I think you don't necessarily need to have formal training in all of these things. You do need to have a curiosity and a love of learning, and however you gain that knowledge is fine, but yeah, if you have no technical interests whatsoever, and don't want to write a line of code, maybe data science is not the field for you. Even if you don't do it everyday. >> And statistics as well? You would put that in that same general category? How about data hacking? You got to love data hacking, is that fair? Yves, you have a comment? >> Yeah, I think so. While we've been discussing that, for me, the most important part is that you have a logical mind and you have the capability to absorb new things and the curiosity you need to dive into that. While I don't have an education in IT or whatever, I have a background in chemistry, and those things that I learned there, I apply to information technology as well. And beyond the part where you say, okay, I'm a tech-savvy guy, I'm interested in the tech part of it, you need to speak that business language, and if you can do that crossover and understand what the other skill sets or parts of the roles are telling you, I think the communication in that aspect is very important. >> I'd like to throw in something really quickly, and I think there's an interesting thing that happens in IT, particularly around technology. We tend to forget that we've actually solved a lot of these problems in the past. If we look at history, if we look around the second World War, and Bletchley Park in the UK, you had a very similar experience as humans to what we're having currently around the whole issue of data science: there was an interesting challenge with the Enigma and the Shark code, right?
And there was a bunch of men put in a room and told, you're mathematicians and you come from universities, and you can crack codes, but they couldn't. And so what they ended up doing was running these ads and putting out challenges; they actually put, I think it was crossword puzzles, in the newspaper, and this deluge of women came out of all kinds of different roles, without math degrees, without science degrees, but who could solve problems, and they were thrown at the challenge of cracking codes, and invariably, they did the heavy lifting on a daily basis, converting messages from one format to another, so that this very small team at the end could actually get to play with the sexy piece of it. And I think we're going through a similar shift now with what we refer to as data science in the technology and business world. The people who are doing the heavy lifting aren't necessarily what we'd think of as the traditional data scientists, and so, there have been some unicorns and we've championed them, and they're great. But I think the shift's going to be to accountants, actuaries, and statisticians who understand the business, people who come from an MBA-style background, who can learn the relevant pieces of math and models that we need to apply to get the data science outcome. I think we've already been here, we've solved this problem, we've just got to learn not to try and reinvent the wheel, 'cause the media hypes this whole thing of data science as exciting and new, but we've been here a couple times before, and there's a lot to be learned from that, in my view. >> I think we had Joe next. >> Yeah, so I was going to say that data science is a funny thing. To use the word science is kind of a misnomer, because there is definitely a level of art to it, and I like to use the analogy: when Michelangelo would look at a block of marble, everyone else looked at the block of marble and saw a block of marble. He looks at a block of marble and he sees a finished sculpture, and then he figures out, what tools do I need to actually make my vision? And I think data science is a lot like that. We hear a problem, we see the solution, and then we just need the right tools to do it, and I think that's part of consulting and data science in particular. It's not so much what we know out of the gate, but it's how quickly we learn. And I think everyone here, what makes them brilliant, is how quickly they can learn any tool that they need to see their vision get accomplished. >> David: Justin? >> Yeah, I think you make a really great point. For me, I'm a Marine Corps veteran, and the reason I mention that is 'cause I work with two veterans who are problem solvers. And I think that's what data scientists really are; in the long run, they're problem solvers, and you mentioned a great point that, yeah, I think just problem solving is the key. You don't have to be a subject matter expert, just be able to take the tools and intelligently use them. >> Now when you look at the whole notion of team data science, what is the right mix of roles, like role definitions within a high-quality or high-performing data science team? Now IBM, with, of course, our announcement of Project DataWorks and so forth, we're splitting the role division in terms of data scientists versus data engineers versus application developers versus business analysts. Is that the right breakdown of roles?
Or what would the panelists recommend in terms of understanding what kind of roles make sense within, like I said, a high-performing team that's looking to develop applications that depend on data, machine learning, and so forth? Anybody want to? >> I'll tackle that. So the teams that I have created over the years, the data science teams that I brought into customer sites, have a combination of developer capabilities, and some of them are IT developers, but some of them were developers of things other than applications. They designed buildings, they did other things with their technical expertise besides building technology. The other piece besides the developer is the analytics, and analytics can be taught as long as they understand how algorithms work and the code behind the analytics; in other words, how are we analyzing things, and from a data science perspective, we are leveraging technology to do the analyzing through the tool sets, so ultimately, as long as they understand how tool sets work, then we can train them on the tools. Having that analytic background is an important piece. >> Craig, is it easier to, I'll go to you in a moment Joe, is it easier to cross-train a data scientist to be an app developer, than to cross-train an app developer to be a data scientist, or does it not matter? >> Yes. (laughs) And not the other way around. It depends on the-- >> It's easier to cross-train a data scientist to be an app developer than-- >> Yes. >> The other way around. Why is that? >> Developing code can be as difficult as the tool set one uses to develop code. Today's tool sets are very user friendly, whereas it's very difficult to teach a person to think along the lines of developing code when they don't have any idea of the aspects of code, of building something. >> I think it was Joe, or you next, or Jennifer, who was it? >> I would say that one of the reasons for that is data scientists will probably know if the answer's right after you process data, whereas a data engineer might be able to manipulate the data but may not know if the answer's correct. So I think that is one of the reasons why having a data scientist learn the application development skills might be an easier time than the other way around. >> I think Miriam had a comment? Sorry. >> I think that what we're advising our clients to do is to not think, before data science and before analytics became so required by companies to stay competitive, it was more of a waterfall: you have a data engineer build a solution, you know, then you throw it over the fence and the business analyst would have at it. Where now, it must be agile, and you must have a scrum team where you have the data scientist and the data engineer and the project manager and the product owner and someone from the chief data office all at the table at the same time and all accomplishing the same goal. Because all of these skills are required, collectively, in order to solve this problem, and it can't be done daisy-chained anymore, it has to be a collaboration. And that's why I think Spark is so awesome, because you know, Spark is a single interface that a data engineer can use, a data analyst can use, and a data scientist can use. And now with what we've learned today, having a data catalog on top so that the chief data office can actually manage it, I think is really going to take Spark to the next level.
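To make that single-interface point concrete, here is a minimal sketch (an editorial illustration, not something shown at the panel) of one Spark session being used three ways: SQL for the data analyst, DataFrame transformations for the data engineer, and feature aggregation for the data scientist. The table name, column names, and values are all hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("shared-interface").getOrCreate()

    # Hypothetical sales data; in practice this would be loaded from a real source.
    df = spark.createDataFrame(
        [("east", 100.0), ("west", 250.0), ("east", 75.0)],
        ["region", "amount"],
    )
    df.createOrReplaceTempView("sales")

    # Analyst's view: plain SQL against the same data.
    spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region").show()

    # Engineer's view: the DataFrame API, same engine, same catalog.
    cleaned = df.filter(F.col("amount") > 0).withColumn("amount_k", F.col("amount") / 1000)

    # Scientist's view: aggregated features that could feed a model downstream.
    cleaned.groupBy("region").agg(F.avg("amount").alias("avg_amount")).show()

The design point is that all three styles run on the same engine against the same catalog, which is why nothing has to be daisy-chained between separate tools.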
>> James: Miriam? >> I wanted to comment on your question to Craig about whether it is harder to teach a data scientist to build an application or vice versa, and one of the things that we have worked on a lot in our data science team is incorporating a lot of best practices from software development: agile, scrum, that sort of thing. And I think particularly with a focus on deploying models, that we don't just want to build an interesting data science model, we want to deploy it, and get some value. You need to really incorporate these processes from someone who might know how to build applications, and that, I think, for some data scientists can be a challenge, because one of the fun things about data science is you get to get into the data, and you get your hands dirty, and you build a model, and you get to try all these cool things, but then when the time comes for you to actually deploy something, you need deployment-grade code in order to make sure it can go into production at your client site and be useful, for instance. So I think that there's an interesting challenge on both ends, but one of the things I've definitely noticed with some of our data scientists is it's very hard to get them to think in that mindset, which is why you have a team of people, because everyone has different skills and you can mitigate that. >> Dev-ops for data science? >> Yeah, exactly. We call it insight ops, but yeah, I hear what you're saying. Data science is becoming increasingly an operational function as opposed to strictly exploratory or developmental. Did someone else have a, Dez? >> One of the things I was going to mention, one of the things I like to do when someone gives me a new problem is take all the laptops and phones away. And we just end up in a room with a whiteboard. And developers find that challenging sometimes, so I had this one line where I said to them, don't write the first line of code until you actually understand the problem you're trying to solve, right? And I think where the data science focus has changed the game for organizations who are trying to get some systematic repeatable process that they can throw data at and just keep getting answers, no matter what the industry might be, is that developers will come with a particular mindset on how they're going to codify something without necessarily getting the full spectrum and understanding the problem in the first place. What I'm finding is the people that come at data science tend to have more of a hacker ethic. They want to hack the problem, they want to understand the challenge, and they want to be able to get it down to plain English, simple phrases, and then apply some algorithms and then build models, and then codify it, and so most of the time we sit in a room with whiteboard markers just trying to build a model in a graphical sense and make sure it's going to work and that it's going to flow, and once we can do that, we can codify it. I think when you come at it from the other angle, from the developer ethic, and you're like, I'm just going to codify this from day one, I'm going to write code, I'm going to hack this thing out and it's just going to run and compile, often you don't truly understand what you're trying to get to at the end point, and you can just spend days writing code, and I think someone made the comment that sometimes you don't actually know whether the output is actually accurate in the first place. So I think there's a lot of value being provided from the data science practice
of understanding the problem in plain English at a team level: so, what am I trying to do from the business consulting point of view? What are the requirements? How do I build this model? How do I test the model? How do I run a sample set through it? Train the thing and then make sure what I'm going to codify actually makes sense in the first place, because otherwise, what are you trying to solve in the first place? >> Wasn't it Einstein who said if I had an hour to solve a problem, I'd spend 55 minutes understanding the problem and five minutes on the solution, right? It's exactly what you're talking about. >> Well I think, I will say, getting back to the question, one thing with building these teams that I think a lot of times people don't talk about is that engineers are actually very very important for data science projects and data science problems. For instance, if you were just trying to prototype something or just come up with a model, then data science teams are great; however, if you need to actually put that into production, the code that the data scientist has written may not be optimal, so as we scale out, it may be actually very inefficient. At that point, you kind of want an engineer to step in and actually optimize that code, so I think it depends on what you're building, and that kind of dictates what kind of division you want among your teammates, but I do think that a lot of times, the engineering component is really undervalued out there. >> Jennifer, it seems that the data engineering function, data discovery and preparation and so forth, is becoming automated to a greater degree, but if I'm listening to you, I don't hear that data engineering as a discipline is becoming extinct in terms of a role that people can be hired into. You're saying that there's a strong ongoing need for data engineers to optimize the entire pipeline to deliver the fruits of data science in production applications, is that correct? So they play that very much operational role as the backbone for... >> So I think a lot of times businesses will go to a data scientist to build a better model, to build a predictive model, but that model may not be something that you really want to implement out there when there's like a million users coming to your website, 'cause it may not be efficient, it may take a very long time, so I think in that sense, it is important to have good engineers, and your whole product may fail; you may build the best model, it may have the best output, but if you can't actually implement it, then really what good is it? >> What about calibrating these models? How do you go about doing that and sort of testing that in the real world? Has that changed over time? Or is it... >> So one of the things that I think can happen, and we found with one of our clients, is when you build a model, you do it with the data that you have, and you try to use a very robust cross-validation process to make sure that it's robust and it's sturdy, but one thing that can sometimes happen is after you put your model into production, there can be external factors, societal or whatever, things that have nothing to do with the data that you have or the quality of the data or the quality of the model, which can actually erode the model's performance over time. So as an example, we think about cell phone contracts, right?
Those have changed a lot over the years, so maybe five years ago, the type of data plan you had might not be the same as it is today, because a totally different type of plan is offered, so if you're building a model on that, say to predict who's going to leave and go to a different cell phone carrier, the validity of your model over time is going to completely degrade based on nothing that you put into the model or the data that was available, so I think you need to have this sort of model management and monitoring process to take these factors into account and then know when it's time to do a refresh.
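As a concrete illustration of the management-and-monitoring loop Miriam describes, here is a minimal sketch (my own, with made-up data and a hypothetical AUC floor) of scoring fresh labeled batches from production and flagging when performance has eroded enough to warrant a refresh:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)

    # Train on "old" data, e.g. churn behavior under last year's phone plans.
    X_train = rng.normal(size=(1000, 3))
    y_train = (X_train[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
    model = LogisticRegression().fit(X_train, y_train)

    AUC_FLOOR = 0.75  # hypothetical threshold; in practice, agreed with the business

    def monitor(batch_X, batch_y):
        # Score a fresh batch of labeled production data and flag drift.
        auc = roc_auc_score(batch_y, model.predict_proba(batch_X)[:, 1])
        if auc < AUC_FLOOR:
            print(f"AUC {auc:.2f} is below {AUC_FLOOR}: time to refresh the model")
        return auc

    # Simulate the world changing: the signal the model learned has weakened,
    # the way a new kind of data plan changes who actually churns.
    X_new = rng.normal(size=(500, 3))
    y_new = (0.2 * X_new[:, 0] + rng.normal(scale=1.0, size=500) > 0).astype(int)
    monitor(X_new, y_new)

The point is not the particular metric; it is that the trigger for retraining comes from watching production performance, not from anything inside the original training set.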
>> Cross-validation, even at one point in time, for example: there was an article in the New York Times recently where they gave the same data set to five different data scientists, survey data for the upcoming presidential election, and the five data scientists came to five different predictions. They were all high-quality data scientists, and the cross-validation showed a wide variation about who was on top, whether it was Hillary or whether it was Trump, so that shows you that even at any point in time, cross-validation is essential to understand how robust the predictions might be. Does somebody else have a comment? Joe? >> I just want to say that this even drives home the case for having the scrum team for each project, and having the engineer and the data scientist, data engineer and data scientist, working side by side, because it is important that whatever we're building, we assume will eventually go into production. In the data warehousing world, you'd get the data out of the systems, out of your applications, you'd do analysis on your data, and the nirvana was maybe that data would go back to the system, but typically it didn't. Nowadays, the applications are dependent on the insight coming from the data science team; the behavior of the application and the personalization and individual experience for a customer are highly dependent on it, so, you asked is data science part of the dev-ops team: absolutely, now it has to be. >> Whose job is it to figure out the way in which the data is presented to the business? Where's the sort of presentation, the visualization plan, is that the data scientist's role? Does that depend on whether or not you have that gene? Do you need a UI person on your team? Where does that fit? >> Wow, good question. >> Well usually that's the output. I mean, once you get to the point where you're visualizing the data, you've created an algorithm or some sort of code that produces that to be visualized, so at the end of the day the customers can see what all the fuss is about from a data science perspective. But it's usually post the data science component. >> So do you run into situations where you can see it and it's blatantly obvious, but it doesn't necessarily translate to the business? >> Well there's an interesting challenge with data, and we throw the word data around a lot, and I've got this fun line I like throwing out there: if you torture data long enough, it will talk. So the challenge then is to figure out when to stop torturing it, right? And it's the same with models, and so I think in many other parts of organizations, we'll take something, if someone's doing a financial report on performance of the organization and they're doing it in a spreadsheet, they'll get two or three peers to review it, and validate that they've come up with a working model and the answer actually makes sense. And I think we're rushing so quickly at doing analysis on data that comes to us in various formats and at high velocity that I think it's very important for us to actually stop and do peer reviews of the models and the data and the output as well, because otherwise we start making decisions very quickly about things that may or may not be true. It's very easy to get the data to paint any picture you want, and you gave the example of the five different attempts at that thing, and I've had this shoot-out thing as well where I'll take a team, I'll get two different people to do exactly the same thing in completely different rooms, and come back and challenge each other, and it's quite amazing to see the looks on their faces when they're like, oh, I didn't see that, and then go back and do it again, and just keep iterating until we get to the point where they both get the same outcome. In fact there's a really interesting anecdote about when the UNIX operating system was being written, and a couple of the authors went away and wrote the same program without realizing that each other were doing it, and when they came back, they actually had, line for line, the same piece of C code, 'cause they'd actually gotten to a truth, a perfect version of that program. And I think we need to often look at, when we're building models and playing with data, if we can't come at it from different angles and get the same answer, then maybe the answer isn't quite true yet, so there's a lot of risk in that. And it's the same with presentation, you know, you can paint any picture you want with the dashboard, but who's actually validating when the dashboard's painting the correct picture? >> James: Go ahead, please. >> There is a science, actually, behind data visualization: you know, if you're doing trending, it's a line graph; if you're doing comparative analysis, it's a bar graph; if you're doing percentages, it's a pie chart. There is a certain science to it, it's not as much of a mystery as the novice thinks, but what makes it challenging is that, just like any presentation, you have to consider your audience. And whenever we're delivering a solution, either insight, or just data in a grid, we really have to consider who is the consumer of this data, and actually cater the visual to that person or to that particular audience. And that is part of the art, and that is what makes a great data scientist. >> The consumer may in fact be the source of the data itself, like in a mobile app, so you're tuning their visualization, and then their behavior is changing as a result, and then the data on their changed behavior comes back, so it can be a circular process. >> So Jim, at a recent conference, you were tweeting about the citizen data scientist, and you got emasculated by-- >> I spoke there too. >> Okay. >> TWI on that same topic, I got-- >> Kirk Borne I hear came after you. >> Kirk meant-- >> Called foul, flag on the play. >> Kirk meant well. I love Claudia Imhoff too, but yeah, it's a controversial topic. >> So I wonder what our panel thinks of that notion, citizen data scientist. >> Can I respond about citizen data scientists? >> David: Yeah, please. >> I think this term was introduced by a Gartner analyst in 2015, and I think it's a very dangerous and misleading term.
I think definitely we want to democratize the data and have access for more people, not just data scientists but managers, BI analysts; but when there is already a term for such people, we can call them business analysts, because it implies some training, some understanding of the data. If you use the term citizen data scientist, it implies that without any training you take some data and then you find something there, and I think, as Dez mentioned, we've seen many examples; it's very easy to find completely spurious random correlations in data. So we don't want citizen dentists to treat our teeth or citizen pilots to fly planes, and if data's important, having citizen data scientists is equally dangerous, so I'm hoping that, I think actually Gartner did not use the term citizen data scientist in their 2016 hype cycle, so hopefully they will put this term to rest. >> So Gregory, you apparently are defining citizen to mean incompetent as opposed to simply self-starting. >> Well, self-starting is very different, but that's not what I think the intention was. I think what we see in terms of data democratization, there is a big trend toward automation. There are many tools; for example there are many companies like Data Robot, and probably IBM, that have interesting machine learning capability towards automation. I recently started a page on KDnuggets for automated data science solutions, and there are already 20 different firms that provide different levels of automation. So one can deliver full automation together with maybe some expertise, but it's very dangerous to have a partly automated tool and at some point ask citizen data scientists to try to take the wheel. >> I want to chime in on that. >> David: Yeah, pile on. >> I totally agree with all of that. I think the comment I just want to quickly put out there is that the space we're in is a very young and rapidly changing world, and so what we haven't had yet is time to stop and take a deep breath and actually define ourselves. So if you look at computer science in general, a lot of the traditional roles have had 10 or 20 years of history, and so through the hiring process and the development of those spaces, we've actually had time to breathe and define what those jobs are, so we know what a systems programmer is, and we know what a database administrator is. But we haven't yet had a chance as a community to stop and breathe and say, well, what do we think these roles are? And so to fill that void, the media creates coinages, and I think this is the risk we've got now: the concept of a data scientist was just a term that was coined to fill a void, because no one quite knew what to call somebody who didn't come from a data science background if they were tinkering around data science, and I think that's something that we need to sort of sit up and pay attention to, because if we don't own that and drive it ourselves, then somebody else is going to fill the void and they'll create these very frustrating concepts like data scientist, which drives us all crazy. >> James: Miriam's next. >> So I wanted to comment. I agree with both of the previous comments, but in terms of a citizen data scientist, and whether you're a citizen data scientist or an actual data scientist, whatever that means, I think one of the most important things you can have is a sense of skepticism, right? Because you can get spurious correlations and it's like, wow, my predictive model is so excellent, you know? And being aware of things like leaks from the future, right? This actually isn't predictive at all, it's a result of the thing I'm trying to predict, and so one thing that we try and do is, if something really looks too good, we need to go back in and make sure: did we not look at the data correctly? Is something missing? Did we have a problem with the ETL? And so I think that a healthy sense of skepticism is important to make sure that you're not taking a spurious correlation and trying to derive some significant meaning from it.
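A leak from the future is easy to demonstrate. In this small sketch (an editorial illustration; the data and the leaked column are invented), a feature that is really just a noisy copy of the outcome, like a field that only gets recorded after a customer has already churned, inflates cross-validated accuracy to near-perfect, which is exactly the too-good-to-be-true signal that should send you back to audit the ETL:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n = 2000
    X = rng.normal(size=(n, 5))
    # The outcome depends noisily on one honest feature.
    y = (X[:, 0] + rng.normal(scale=1.0, size=n) > 0).astype(int)

    # A leak from the future: effectively the label itself, dressed up as a feature.
    leak = y + rng.normal(scale=0.01, size=n)
    X_leaky = np.column_stack([X, leak])

    honest = cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=5).mean()
    leaky = cross_val_score(RandomForestClassifier(n_estimators=50), X_leaky, y, cv=5).mean()
    print(f"honest features: {honest:.2f}  with leaked column: {leaky:.2f}")
    # Expect roughly 0.75 versus 1.00; the second number is the red flag.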
>> I think there's a Dilbert cartoon that I saw that described that very well. Joe, did you have a comment? >> I think that in order for citizen data scientists to really exist, we do need to have more maturity in the tools that they would use. My vision is that the BI tools of today are all going to be replaced with natural language processing and searching, you know, just being able to open up a search bar and say, give me sales by region, and to take that one step into the future even further, it should actually say, what are my sales going to be next year? And it should trigger a simple linear regression, or be able to say which features of the televisions are actually affecting sales and do a clustering algorithm. You know, I think hopefully that will be the future, but I don't see anything of that today, and I think in order to have a true citizen data scientist, you would need to have that, and that is pretty sophisticated stuff.
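The mapping Joe sketches, from a plain-English question to a canned analysis, can be mocked up in a few lines. This is a toy illustration of the idea only; the sales table, the keyword matching, and the questions are all hypothetical, and a real product would need genuine natural language understanding rather than substring checks:

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    # Hypothetical sales history; a real tool would pull this from a warehouse.
    sales = pd.DataFrame({
        "year":   [2012, 2013, 2014, 2015, 2016] * 2,
        "region": ["east"] * 5 + ["west"] * 5,
        "amount": [100, 110, 125, 130, 142, 90, 98, 105, 118, 127],
    })

    def answer(question: str) -> str:
        q = question.lower()
        if "by region" in q:
            # "Give me sales by region" becomes a simple aggregation.
            return sales.groupby("region")["amount"].sum().to_string()
        if "next year" in q:
            # "What are my sales going to be next year?" triggers the
            # simple linear regression Joe mentions.
            yearly = sales.groupby("year")["amount"].sum().reset_index()
            model = LinearRegression().fit(yearly[["year"]], yearly["amount"])
            nxt = int(yearly["year"].max()) + 1
            pred = model.predict(pd.DataFrame({"year": [nxt]}))[0]
            return f"forecast for {nxt}: {pred:.0f}"
        return "sorry, I only know two kinds of question so far"

    print(answer("Give me sales by region"))
    print(answer("What are my sales going to be next year?"))

Everything hard about the vision lives in the part this sketch fakes: deciding which analysis a free-form question actually calls for.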
>> I think for me, the idea of a citizen data scientist, I can relate to that. For instance, when I was in graduate school, I started doing some research on FDA data. It was an open source data set with about 4.2 million data points. Technically, when I graduated, the paper was still not published, and so in some sense you could think of me as a citizen data scientist, right? I wasn't getting funding, I wasn't doing it for school, but I was still continuing my research, so I'd like to hope that with all the new data sources out there, there might be scientists, or people who were maybe kept out of a field, people who wanted to be in STEM and for whatever life circumstance couldn't be in it, who might be encouraged to actually go and look into the data and maybe build better models or validate information that's out there. >> So Justin, I'm sorry, you had one comment? >> It seems data science was termed before academia adopted formalized training for data science. But yeah, like Dez said, you can make data work for whatever problem you're trying to solve; whatever answer you see, you want data to work around it, you can make it happen. And I kind of consider that like, in project management, data creep: you're so hyper-focused on a solution, you're trying to find the answer, that you create an answer that works for that solution, but it may not be the correct answer, and I think the crossover discussion works well for that case. >> So but the term comes up 'cause there's a frustration, I guess, right? That data science skills are not plentiful, and it's potentially a bottleneck in an organization. Supposedly 80% of your time is spent on cleaning data, is that right? Is that fair? So there's a problem. How much of that can be automated and when? >> I'll have a shot at that. So I think there's a shift that's going to come about where we're going to move from centralized data sets to data at the edge of the network, and this is something that's happening very quickly now, where we can't just haul everything back to a central spot, when the internet of things actually wakes up. Things like the Boeing 787 Dreamliner: that thing's got 6,000 sensors in it and produces half a terabyte of data per flight. There are 87,400 flights per day in domestic airspace in the U.S. That's 43.5 petabytes of raw data; now that's about three years' worth of disk manufacturing in total, right? We're never going to copy that across to one place, we can't process it, so I think the challenge we've got ahead of us is looking at how we're going to move the intelligence and the analytics to the edge of the network and pre-cook the data in different tiers: have a look at the raw material we get, boil it down to a slightly smaller data set, bring a metadata version of that back, and eventually get to the point where we've only got the very minimum data set and data points we need to make key decisions. Without that, we're already at the point where we have too much data, and we can't munch it fast enough, and we can't spin up enough tin even if we switch the cloud on, and there's just this never-ending deluge of noise, right? And you've got that signal-versus-noise problem, so we're now seeing a shift where people are looking at how we move the intelligence back to the edge of the network, which we actually solved some time ago in the security space. You know, spam filtering: if an email hits Google on the west coast of the U.S., they create a checksum for that spam email, it immediately goes into a database, and nothing gets through on the opposite side of the country, because they already know it's spam. They recognize that email coming in, that's evil, stop it. So we've already fixed this in security with intrusion detection, we've fixed it in spam, so we now need to take that learning and bring it into business analytics, if you like, and see where we're finding patterns and behavior, and push that out to the edge of the network, so if I'm seeing a demand over here for tickets on a new sale of a show, I need to be able to see where else I'm going to see that demand and start responding to that before the demand comes about. I think that's a shift that we're going to see quickly, because we'll never keep up with the data-munching challenge and the volume's just going to explode. >> David: We just have a couple minutes. >> That does sound like a great topic for a future Cube panel, which is data science on the edge of the fog. >> I've got a hundred questions around that. So we're wrapping up here. Just got a couple minutes. Final thoughts on this conversation, or any other pieces that you want to punctuate? >> I think one thing that's been really interesting for me, being on this panel, is hearing all of my co-panelists talking about common themes and things that we are also experiencing, which isn't a surprise, but it's interesting to hear how ubiquitous some of the challenges are; and also, at the announcement earlier today, some of the things that they're talking about and thinking about, we're also talking about and thinking about. So I think it's great to hear that we're all in different countries and different places, but we're experiencing a lot of the same challenges, and I think that's been really interesting for me to hear about. >> David: Great, anybody else, final thoughts?
>> To echo Dez's thoughts, it's that we're never going to catch up with the amount of data that's produced, so it's about transforming big data into smart data. >> I would just say that with the shift from normal data, small data, to big data, the answer is automate, automate, automate, and we've been talking about advanced algorithms and machine learning for the science, for changing the business, but there also needs to be machine learning and advanced algorithms for the back room, where we're actually getting smarter about how we ingest and how we fix data as it comes in. Because we can actually train the machines to understand data anomalies and what we want to do with them over time. And I think the further upstream we get with data correction, the less work there will be downstream. I also think that the concept of being able to fix data at the source is gone, that's behind us. Right now the data that we're using to analyze, to change the business, we typically have no control over. Like Dez said, it's coming from sensors and machines and the internet of things, and if it's wrong, it's always going to be wrong, so we have to figure out how to do that in our laboratory. >> Yves, final thoughts? >> I think it's a mind shift, being a data scientist. If you look back, why did you start developing or writing code? Because you liked to code, just for the sake of building a nice algorithm or a piece of software, or whatever. And now, I think, with the spirit of a data scientist, you're looking at a problem and saying, this is where I want to go, so you have more of a top-down approach than a bottom-up approach. And you have the big picture, and that is what you really need as a data scientist: just look across technologies, look across departments, look across everything, and then on top of that, try to apply as many skills as you have available. And that's the kind of unicorn that they're trying to look for, because it's pretty hard to find people with that wide vision on everything that is happening within the company. You need to be aware of technology, you need to be aware of how a business is run and how it fits within a cultural environment, and you have to work with people, and all those things together, to my mind, make it very difficult to find those good data scientists. >> Jim? Your final thought? >> My final thought is that this is an awesome panel, and I'm so glad that you've come to New York, and I'm hoping that you all stay, of course, for the IBM DataFirst launch event that will take place this evening about a block over at Hudson Mercantile, so that's pretty much it. Thank you, I really learned a lot. >> I want to second Jim's thanks, really, great panel. Awesome expertise, really appreciate you taking the time, and thanks to the folks at IBM for putting this together. >> And I'm a big fan of most of you, all of you, on this session here, so it's great just to meet you in person, thank you. >> Okay, and I want to thank Jeff Frick for being a human curtain there with the sun setting here in New York City. Well thanks very much for watching. We are going to be across the street at the IBM announcement, we're going to be on the ground, and we open up again tomorrow at 9:30 at Big Data NYC, Big Data Week, Strata plus Hadoop World. Thanks for watching everybody, that's a wrap from here. This is the Cube, we're out. (techno music)

Published Date: Sep 28, 2016
