Ricardo Guerra, Itaú Unibanco | AWS re:Invent 2020
>> From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. >> Welcome back to theCUBE's live coverage of re:Invent 2020. I'm your host, John Furrier, here for three weeks with theCUBE Virtual. This year we're not in person, we're doing remote because of the pandemic. A great guest, Ricardo Guerra, CIO at Itaú Unibanco in Brazil, a great customer of Amazon, really a good reference point to this transformation story that Andy Jassy has been talking about on stage. Ricardo, great to have you on remotely. Thanks for coming on from Brazil. >> Thank you. Thanks for having me. >> I'd love to get down there, lie on the beach for a while, just relax after all the virtual tension from re:Invent, all the coverage, it's been wild. Anyway, thanks for coming on. I want to get into it. You know, one of the things that Andy Jassy was really leaning forward on this year was the story of: you've got to be on the cloud to have agility and the digital transformation, which has been talked about for years. People, process, technology. We've heard that this year more than ever, it's been quite the acceleration. You're either on the right side of history or not here as a business. Can you share the story of your transformation with Amazon? >> Sure, John. So this story goes back to, actually, a decade ago, when we started discussing our digital transformation, right? So, we are a bank that is almost 100 years old, 96 years old, and we are a big one. We have in Brazil 56 million customers, so it's a big company. And we have pretty much all the businesses of a universal bank, including insurance, here in Brazil banks also have insurance and all the rest. So from corporate banking to retail, from credit cards to investments, all sorts of products. And we started in technology as early as the seventies, 1973 to be exact, when we started our current account system on the mainframes, right? So you can imagine that we have invested a lot in technology over the last almost 50 years, and the bank has always been very well known here in the country for the use of technology. We have been pioneers in online transfers in the eighties, we have been pioneers using ATMs and Internet banking and so on. But what happened until 2010 is that we were pretty much putting up applications one on top of the other. So we were offering products and services as fast as we could, looking at the customer, trying to differentiate ourselves from the competition. We were definitely trying to move as fast as we could, but we were not taking the right care of the platform. In the sense that today we see a lot of transformation in technology happening all the time, right? There's new stuff coming out all the time. We're seeing, here at re:Invent, the amount of things that AWS has launched. So all the time we're seeing new stuff, which is good for the business. So the speed of this transformation is only getting faster and faster. So in order to use all of those features, to be able to leverage new technology, your platform has to be flexible. You have to be able to adopt those new technologies without losing momentum in offering your products and services, right? So, again, a decade ago,
we started discussing this and we said, we have to invest in technology in a different way, not only producing solutions in financial services, but we definitely have to take care of the platform and understand how we should evolve the platform in order to, again, better offer products and services to our customers, which is, at the end of the day, why we exist. So we started this journey and we said, okay, there are mainly three things that we have to take care of in order to go on this journey. First of all, it's about people. We have to have a new mindset, a new culture, a mindset where we empower people in decisions, where we have people thinking about the customer all the time and being aggressive in building solutions for them and using technology to build those solutions, right? So we need more entrepreneur-type people, people who really want to differentiate themselves and provide better services to the customer. The second pillar, I would say, is that the methodology has to change, right? You have to have an agile methodology, an agile approach, as opposed to a traditional waterfall approach with silos internally in the organization. That allows you to be faster as well and to adapt to customer needs faster. And third of all, which is the main subject here in our conversation, you have to have a flexible platform, as I will explain. So we decided to pretty much rewrite our applications in an architecture where we can be flexible. Pretty much what we had was a sort of monolith where we wrote code. We have written code in sequence, building huge applications, right? And those huge applications are bottlenecks, and they are very hard to maintain and to evolve. What we're doing, in a simple way, is building microservices. We're breaking up those applications so we can adapt faster to whatever we see that our customers need, or to any business opportunity. In that sense, cloud is the perfect platform to host those services, right? Because, again, we're able to have the DevOps methodology, we're able to have site reliability engineering, we're able to have all kinds of things and services that will help us on this journey to be more flexible and faster, right? So that's why we have chosen to go to the cloud. In that sense, we looked for a partner that was reliable, that was a leader in the market, that was able to keep up with all the technology that is coming out in the market and offer innovation at the level that we need. And that's the reason why we partnered up with AWS for the next 10 years. >> So, are you happy? Are you happy with Amazon? Just while I've got you there, are you happy with their response to you and their interfacing with you guys? Are you happy with them? >> Yes. Yes, John. We started working with AWS more intensely back in 2018, when our central bank allowed us to go to the public cloud. So we started working with them and we learned a lot. Of course, we were very young and immature at the time in the knowledge of the technology, so we learned a lot in these two and a half years, and we have built nice stuff together. We have very important core systems running on AWS already. They have been a good partner. What I like to say is that our cultures match. We are both customer-centric, we are both concerned with the success of the partnership. A long-term partnership doesn't work
if you're not concerned with the relationship, right? You've got to make sure that both parties will profit from this, right? So I think we have had a good match, and in terms of technology, we are very happy. We have all the infrastructure and services that we need. >> Yeah, when you're building a bridge to the future together, your relationships matter, I would agree. And I think that's a differentiator. I want to just touch upon, you mentioned you guys were pioneers, going back, and the way you tell your story. I was growing up in the seventies and kind of cut my teeth in the eighties in computer science, and I remember those days, it was a very cool time. We went from mainframe to client-server, but there's a point where you become bloated with the monolithic. You get stuck with all this, we called it spaghetti code, right? It's all over the place, all intertwined. Then you have that moment of truth. That's something that Andy Jassy was saying on stage that I thought was interesting, and it was almost like a business school lesson of, hey leaders, you've got to get to the truth. When you guys saw the cloud, what was the mindset? Because it sounds like you guys are a pioneering culture, you like to be innovative. What was the moment? Take me through the mindset of, hey, we'd better get busy building or we're going to get busy dying. Take me through that mindset. >> That's a very good question, John. So we literally started to suffer with our own speed, to be really transparent, right? We were seeing the market starting to speed up, all these new startups and tech companies in other industries, and we were seeing all the industries moving faster and everyone building solutions that were way better for the customer than the solutions that we were seeing, I don't know, five or 10 years before. And it doesn't matter which industry you're talking about, not only finance, right? So we said, okay, it looks like the financial industry is going to go through the same path. And we were trying to make things more and more, I would say, oriented towards the customer, and we were trying to understand customer needs better. And we actually did that when we implemented a design thinking methodology back in 2010, a big one at the bank, and we came up with a lot of solutions, building along with the customer, and we were piling up backlogs, right? We were saying, okay, there's lots of stuff that we have to do, and we were lost in our spaghetti. That's pretty much it, right? So that's when we saw, okay, that's the Andy Jassy moment that you describe. Stop. We have to have a better platform. We have to reorganize ourselves, otherwise we're going to get lost in ourselves. There's no way we can grow and invest more, because we're going to get stuck in this spaghetti anyway. So we structured a very robust program of platform modernization, where we have invested over the last few years mainly, and we're going to keep on investing looking ahead, where we try to build solutions while modernizing our platform. From our perspective, our platform is big enough that we cannot just rebuild it, put a team aside and rebuild the bank. That would take, I don't know, seven, eight, nine years, I don't know how long, and then we would be legacy again whenever we finished. So what we do is we break up our reasoning.
So we say, let's take each business as a very small component of the bank, and let's build the technological components that support that business, and let's extract those components from the monolith, from the legacy. And that's the strategy, that's the technology strategy that we have today. So we have empowered the business to own the platform. They understand today that the platform is not a problem of technology, it's actually the business's. And by owning the platform, they understand that the monolith doesn't allow them to be as fast as they want, as the customer wants, so it's very straightforward for them to understand that they have to break up the platform, they have to prioritize building microservices in the cloud, so all the ideas and needs that they can identify will be much easier to implement after that. >> You put it on them. They have to own the app, they have to own the business model, the platform is theirs. If the keys to the kingdom are in their hands, you're enabling that. That's a great strategy. And I love the microservices strategy, breaking out, picking things out rather than trying to boil the ocean over seven years. That's a big mistake people make, and they end up having a legacy, outdated platform that's ready for no one, right? Ricardo, that's a masterclass right there in strategy. Thank you very much for sharing that insight into your bank, and congratulations on all your innovation, and continued success. Thanks for coming on theCUBE. >> Thank you so much. >> Okay, I'm John Furrier, host of theCUBE. We're virtual, we're remote this year. We've got great content. Stay with us on theCUBE channel here at AWS re:Invent. Thanks for watching.
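Ricardo's approach, extracting one business capability at a time from the monolith rather than rewriting the whole bank in one program, is essentially the strangler-fig pattern. The interview does not show Itaú's actual implementation, so the sketch below is only a hypothetical illustration of the idea: a thin routing facade decides, per request, whether a capability has already been carved out into a microservice or still lives in the legacy monolith. The capability names and URLs are invented for the example.

```python
# Minimal sketch of a strangler-fig routing facade (illustrative only).
# Capability names and backend URLs are hypothetical, not Itau's systems.

MONOLITH_BASE = "https://core-banking.internal"   # the legacy monolith
EXTRACTED = {
    # business capabilities already rebuilt as cloud microservices
    "credit-card-statements": "https://cards-svc.example.internal",
    "investment-quotes": "https://investments-svc.example.internal",
}

def resolve_backend(path: str) -> str:
    """Return the base URL that should serve this request.

    The first path segment names the business capability; if that
    capability has already been extracted, route to its microservice,
    otherwise fall through to the monolith.
    """
    capability = path.strip("/").split("/", 1)[0]
    return EXTRACTED.get(capability, MONOLITH_BASE)

if __name__ == "__main__":
    for p in ("/credit-card-statements/123", "/mortgages/apply"):
        print(p, "->", resolve_backend(p))
```

Each time the business prioritizes another capability, the only change at the facade is one more entry in the routing table, which is what lets the extraction happen one component at a time instead of as a multi-year rewrite.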
Joe Gonzalez, MassMutual | Virtual Vertica BDC 2020
(bright music) >> Announcer: It's theCUBE. Covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. Hello everybody, welcome back to theCUBE's coverage of the Vertica Big Data Conference, the Virtual BDC. My name is Dave Volante, and you're watching theCUBE. And we're here with Joe Gonzalez, who is a Vertica DBA, at MassMutual Financial. Joe, thanks so much for coming on theCUBE I'm sorry that we can't be face to face in Boston, but at least we're being responsible. So thank you for coming on. >> (laughs) Thank you for having me. It's nice to be here. >> Yeah, so let's set it up. We'll talk about, you know, a little bit about MassMutual. Everybody knows it's a big financial firm, but what's your role there and kind of your mission? >> So my role is Vertica DBA. I was hired January of last year to come on and manage their Vertica cluster. They've been on Vertica for probably about a year and a half before that started out on on-prem cluster and then move to AWS Enterprise in the cloud, and brought me on just as they were considering transitioning over to Vertica's EON mode. And they didn't really have anybody dedicated to Vertica, nobody who really knew and understood the product. And I've been working with Vertica for about probably six, seven years, at that point. I was looking for something new and landed a really good opportunity here with a great company. >> Yeah, you have a lot of experience in Vertica. You had a role as a market research, so you're a data guy, right? I mean that's really what you've been doing your entire career. >> I am, I've worked with Pitney Bowes, in the postage industry, I worked with healthcare auditing, after seven years in market research. And then I've been with MassMutual for a little over a year now, yeah, quite a lot. >> So tell us a little bit about kind of what your objectives are at MassMutual, what you're kind of doing with the platform, what application just supporting, paint a picture for us if you would. >> Certainly, so my role is, MassMutual just decided to make Vertica its enterprise data warehouse. So they've really bought into Vertica. And we're moving all of our data there probably about to good 80, 90% of MassMutual's data is going to be on the Vertica platform, in EON mode. So, and we have a wide usage of that data across corporation. Right now we're about 50 terabytes and growing quickly. And a wide variety of users. So there's a lot of ETLs coming in overnight, loading a lot of data, transforming a lot of data. And a lot of reporting tools are using it. So currently, Tableau MicroStrategy. We have Alteryx using it, and we also have API's running against it throughout the day, 24/7 with people coming in, especially now these days with the, you know, some financial uncertainty going on. A lot of people coming and checking their 401k's, checking their insurance and status and what not. So we have to handle a lot of concurrent traffic on top of the normal big query. So it's a quite diverse cluster. And I'm glad they're really investing in using Vertica as their overall solution for this. >> Yeah, I mean, these days your 401k like this, right? (laughing) Afraid to look. So I wonder, Joe if you could share with our audience. 
I mean, for those who might not be as familiar with the history of Vertica, and specifically about MPP: historically you had, you know, traditional RDBMS, whether it's Db2 or Oracle, and then you had a spate of companies that came out with this notion of MPP. Vertica is the one that, I think, is probably one of the few, if only, brands that survived. But what did that bring to the industry, and why is that important for people to understand, just in terms of, whatever it is, scale, performance, cost? Can you explain that? >> To me, it actually brought scale at good cost, and that's why I've been a big proponent of Vertica ever since I started using it. There are a number, like you said, of different platforms where you can load big data and store and house big data. But the purpose of having that big data is not just for it to sit there, but to be used, and used in a variety of ways. And that's from, you know, something small, like the first installation I was on was about 10 terabytes, and, you know, I've worked with data warehouses up to 100 terabytes, and, you know, there are Vertica installations with, you know, hundreds of petabytes on them. You want to be able to use that data, so you need a platform that's going to be able to access that data and get it to the clients, get it to the customers, as quickly as possible, and not pay an arm and a leg for the privilege to do so. And Vertica allows companies to do that, not only get their data to clients and, you know, in-company users quickly, but save money while doing so. >> So, but why couldn't I just use a traditional RDBMS? Why not just throw it all into Oracle? >> One, cost. Oracle is very expensive, while Vertica's a lot more affordable than that. But the column-store structure of Vertica allows for a lot more optimized queries. Some of the queries that you can run in Vertica in 2, 3, 4 seconds will take minutes and sometimes hours in an RDBMS like Oracle, like SQL Server. They have the capability to store that amount of data, no question, but the usability really lacks when you start querying tables that are 180 billion rows, tables in Vertica that are over 1000 columns. Those will take hours to run on a traditional RDBMS, and running them in Vertica, I get my queries back in seconds. >> You know what's interesting to me, Joe, and I wonder if you could comment, it seems that Vertica has done a good job of embracing, you know, riding the waves, whether it was HDFS and big data in the early part of the big data era, the machine learning, machine intelligence, whether it's, you know, TensorFlow and other data science tools, and it seems like the cloud is the other one, right? A lot of times cloud is super disruptive, particularly to companies that started on-prem, but it seems like Vertica somehow has been able to adopt and embrace some of these trends. Why, from your standpoint, first of all, from your standpoint as a customer, is that true? And why do you think that is? Is it architectural? Is it the mindset, the engineering? I wonder if you could comment on that. >> It's absolutely true. I started out, again, on an on-prem Vertica data warehouse, and we kind of, you know, rolled along with them. You know, more and more people have been using data, they want to make it accessible to people on the web now.
And you know, having that, the option to provide that data from an on-prem solution, from AWS is key, and now Vertica is offering even a hybrid solution, if you want to keep some of your data behind a firewall, on-prem, and put some in the cloud as well. So data at Vertica has absolutely evolved along with the industry in ways that no other company really has that I've seen. And I think the reason for it and the reason I've stayed with Vertica, and specifically have remained at Vertica DBA for the last seven years, is because of the way Vertica stays in touch with it's persons. I've been working with the same people for the seven, eight years, I've been using Vertica, they're family. I'm part of their family, and you know, I'm good friends with some of these people. And they really are in tune not only with the customer but what they're doing. They really sit down with you and have those conversations about, you know, what are your needs? How can we make Vertica better? And they listen to their clients. You know, just having access to the data engineers who develop Vertica to be arranged on a phone call or whatnot, I've never had that with any other company. Vertica makes that available to their customers when they need it. So the personal touch is a huge for them. >> That's good, it's always good to get the confirmation from the practitioners, just not hear from the vendor. I want to ask you about the EON transition. You mentioned that MassMutual brought you in to help with that. What were some of the challenges that you faced? And how did you get over them? And what did, what is, why EON? You know, what was the goal, the outcome and some of the challenges maybe that you had to overcome? >> Right. So MassMutual had an interesting setup when I first came in. They had three different Vertica clusters to accommodate three different portions of their business. The data scientists who use the data quite extensively in very large queries, very intense queries, their work with their predictive analytics and whatnot. It was a separate one for the API's, which needed, you know, sub-second query response times. And the enterprise solution, they weren't always able to get the performance they needed, because the fast queries were being overrun by the larger queries that needed more resources. And then they had a third for starting to develop this enterprise data platform and started, you know, looking into their future. The first challenge was, first of all, bringing all those three together, and back into a single cluster, and allowing our users to have both of the heavy queries and the API queries running at the same time, on the same platform without having to completely separate them out onto different clusters. EON really helps with that because it allows to store that data in the S3 communal storage, have the main cluster set up to run the heavy queries. And then you can set up sub clusters that still point to that S3 data, but separates out the compute so that the API's really have their own resources to run and not be interfered with by the other process. >> Okay, so that, I'm hearing a couple of things. One is you're sort of busting down data silos. So you're able to have a much more coherent view of your data, which I would imagine is critical, certainly. Companies like MassMutual, have been around for 100 years, and so you've got all kinds of data dispersed. 
So to the extent that you can break down those silos, that's important, but also being able to, I guess, have granular increments of compute and storage is what I'm hearing. What does that do for you? Does it make that more efficient? Or are there other business benefits? Maybe you could elucidate. >> Well, one, cost is again a huge benefit. The cost of running three different clusters, even in AWS, in the enterprise solution was a little costly, you know, you had to have your dedicated servers here and there. So you're paying for, like, you know, 12, 15 different servers, for example. Whereas when we bring them all back into EON, I can run everything on a six-node production cluster. And, you know, when things are busy, I can spin up the three-node subcluster for the APIs, only pay for it when I need it, and then bring them back into the main cluster when things have slowed down a bit, and they can get that performance that they need. So that saves a ton on resource costs. You know, you're not paying for the storage, you're paying for one S3 bucket, you're only paying for the nodes, the EC2 instances, that are up and running when you need them, and that is huge. And again, like you said, it gives us the ability to silo our data without having to completely separate our data into different storage areas, which is a big benefit. It gives us the ability to query everything from one single cluster without having to synchronize it to, you know, three different ones, so this one's going to have theirs and this one's going to have theirs, but everyone's still looking at the same data. And we replicate that in QA and Dev so that people can do it outside of production and do some testing as well. >> So EON, obviously a very important innovation, and of course Vertica touts the separation of compute and storage, and you know, they're not the only one that does that, but they are really, I think, the only one that does it for on-prem and virtually across clouds. So my question is, and I think you're doing a breakout session at the Virtual BDC, we were going to be in Boston, now we're doing it online. If I'm in the audience, I'm imagining I'm a junior DBA at an organization that maybe doesn't have a Joe, I haven't been an expert for seven years. How hard is it for me, what do I need to do to get up to speed on EON? It sounds great, I want it, I'm going to save my company money, but I'm nervous 'cause I've only been a Vertica DBA for, you know, a year, and I'm sort of, you know, not as experienced as you. What are the things that I should be thinking about? Do I need to bring somebody in? Do I need to hire somebody? Do I need to bring in a consultant? Can I learn it myself? What would you advise? >> It's definitely easy enough that if you have at least a little bit of work experience, you can learn it yourself, okay? 'Cause the concepts are still there. There are some, you know, little bits of nuances where you do need to be aware of certain changes between the Enterprise and EON editions. But I would also say consult with your Vertica account manager, consult with, you know, let them bring in the right people from Vertica to help you get up to speed, and if you need to, there are also resources available as far as consultants go that will help you get up to speed very quickly. And we did work together with Vertica and with one of their partners, Clarity, in helping us to understand EON better, set it up the right way, you know, how do we determine the number of shards for our data warehouse?
You know, they helped us evaluate all that and pick the right number of shards, the right number of nodes, to get set up and going. And, you know, they helped us figure out the best ways to get our data over from the Enterprise Edition into EON very quickly and very efficiently. So definitely do the same yourself. >> I wanted to ask you about organizational, you know, issues, because, you know, the practitioners like you always tell me, "Look, the tech, the technology comes and goes, that's kind of the easy part, we're good at that. It's the people, it's the processes, the skill sets." What does your, you know, team regimen look like? And do you have any sort of ideal team makeup or, you know, ideal advice? Is it two-pizza teams? Is it what kind of skills? What kind of interaction and communications to senior leadership? I wonder if you could just give us some color on that. >> One of the things that makes me extremely proud to be working for MassMutual right now is that they do what a lot of companies have not been doing, and that is investing in IT. They have put a lot of thought, a lot of money, and a lot of support into setting up their enterprise data platform and putting Vertica at the center. And not only did they put the money into getting the software that they needed, like Vertica, you know, MicroStrategy, and all the other tools that we were using to use that, they put the money into the people. Our managers are extremely supportive of us. We hired about 40 to 45 different people within a four-month time frame, data engineers, data analysts, data modelers, a nice mix of people across the board who can help shape your data and bring the data in and help the users use the data properly, and allow me, as the database administrator, to make sure that they're doing what they're doing most efficiently, and focus on my job. So you have to have that diversity among the different data skills in order to make your team successful. >> That's awesome. Kind of a side question, and it's really not Vertica's wheelhouse, but I'm curious, you know, in the early days of the big data, you know, movement, a lot of the data scientists would complain, and they still do, that "80% of my time is spent wrangling data." The tools for the data engineer, the data scientists, the database, you know, experts, they're all different. And is that changing? And to what degree is that changing? Kind of what inning are we in, just in terms of a more facile environment for all those roles? >> Again, I think it depends from company to company, you know, what resources they make available to the data scientists. And the data scientists, we have a lot of them at MassMutual, and they're very much into doing a lot of machine learning, model training, predictive analytics. And they are, you know, used to doing it outside of Vertica too, you know, pulling that data out into Python and Scala and R, and tools like that. And they're also now just getting into using Vertica's in-database analytics and machine learning, which is a skill that, you know, definitely nobody else out there has. So being able to have somebody who understands Vertica like myself, and being able to train other people to use Vertica in the way that is most efficient for them, is key. But also just having people who understand not only the tools that you're using, but how to model data, how to architect your tables, your schemas, the interaction between your tables and schemas and whatnot, you need to have that diversity in order to make this work.
And our data scientists have benefited immensely from the structure that MassMutual put in place, by our data management and delivery team. >> That's great. I think I saw somewhere in your background that you've trained about 100 people in Vertica. Did I get that right? >> Yes. Since I started here, I've gone to our Boston location, our Springfield location, and our New York City location and trained, probably at this point, about 120, 140 of our Vertica users. And I'm trying to do, you know, a couple of follow-up sessions per year. >> So adoption, obviously, is a big goal of yours. Getting people to adopt the platform, but then more importantly, I guess, deliver business value and outcomes. >> Absolutely. >> Yeah, I wanted to ask you about encryption. You know, in a perfect world, everything would be encrypted, but there are trade-offs. Are you using encryption? What are you doing in that regard? >> We are actually just getting into that now due to the New York and the CCPA regulations that are now in place. We do have a lot of personally identifiable information in our data store that does require encryption. So we are going through a months-long process that started in December, I think it was actually a bit earlier than that, to start identifying all the columns, not only in our Vertica database, but in, you know, the other databases that we do use, you know, we have a Postgres database, SQL Server, Teradata for the time being, until that moves into Vertica. And we identify where that data sits, what downstream applications pull that data from the data sources and store it locally as well, and start encrypting that data. And because of the tight relationship between Voltage and Vertica, we settled on Voltage as the major platform to start doing that encryption. So we're going to be implementing that in Vertica probably within the next month or two, and roll it out to all the teams that have data that requires encryption. We're going to start rolling it out to the downstream application owners to make sure that they are encrypting the data as they get it pulled over. And we're also using another product for several other applications that don't mesh as well with both. >> Voltage being Micro Focus's encryption solution, correct? >> Right, yes. >> Yes, and of course Micro Focus, for the audience, is the company that owns Vertica, and Vertica is a separate brand. So I want to ask you, kind of to close, what success looks like. You've been at this for a number of years, coming into MassMutual, which was great to hear. I've had some past experience with MassMutual, it's an awesome company, I've been to the Springfield facility and in Boston as well, and I have great respect for them, and they've really always been a leader. So it's great to hear that they're investing in technology as a differentiator. What does success look like for you? Let's say you're at MassMutual for a few years, you're looking back, what does success look like? Go. >> A good question. It's changing every day, just, you know, with more and more, you know, applications coming onboard, more and more data being pulled in, more uses being found for the data that we have. I think success for me is making sure that Vertica, first of all, is always up, is always running at its most optimal, to keep our users happy.
I think when I started, you know, we had a lot of processes that were running, you know, six, seven hours, some of them were taking, you know, almost a day long, because they were so complicated. We've got those running in under an hour now, some of them running in a matter of minutes. I want to keep that optimization going for all of our processes. Like I said, there are a lot of users using this data, and it's been hard over the first year of me being here to get to all of them. And thankfully, you know, I'm getting a bit of help now, I have a couple of systems DBAs that I'm training up to help out with these optimizations, you know, fixing queries, fixing projections to make sure that queries do run as quickly as possible. So getting that to its optimal stage is one. Two, getting our data encrypted and protected, so that even if, for whatever reason, somehow somebody breaks into our data, they're not going to be able to get anything at all, because our data is 100% protected. And I think more companies need to be focusing on that as well. And third, I want to see our data science teams using more and more of Vertica's in-database predictive analytics, in-database machine learning products, and really helping make their jobs more efficient by doing so. >> Joe, you're an awesome guest. I mean, like I said, we always love having the practitioners on and getting the straight skinny from the pros. You're welcome back anytime, and as I say, I wish we could have met in Boston, maybe next year at the BDC. But it's great to have you online, and thanks for coming on theCUBE. >> And thank you for having me, and hopefully we'll meet next year. >> Yeah, I hope so. And thank you everybody for watching. Remember, theCUBE is running concurrent with the Vertica Virtual BDC, it's vertica.com/bdc2020, if you want to check out all the keynotes and all the breakout sessions. I'm Dave Volante for theCUBE. We'll be going on with more interviews for you right there. Thanks for watching. (bright music)
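Joe's EON setup, one communal S3 data store, a primary subcluster for the heavy ETL and reporting work, and a secondary subcluster that is spun up only when the API traffic needs isolated compute, can be illustrated with a short sketch. This is not MassMutual's code: the host names, subcluster layout, credentials, and table names below are hypothetical, and creating, starting, or stopping the subclusters themselves is done through Vertica's admin tooling rather than from this script. The sketch only shows the workload-routing idea using the vertica-python client.

```python
# Illustrative sketch only: route workloads to different EON subclusters.
# Hosts, credentials, and table names are hypothetical, not MassMutual's.
import vertica_python

COMMON = {"port": 5433, "user": "dbadmin", "password": "***", "database": "analytics"}

# A node in the primary subcluster: heavy ETL and reporting queries go here.
PRIMARY_HOST = "primary-sc.vertica.internal"
# A node in the secondary subcluster: short, highly concurrent API lookups go here.
API_HOST = "api-sc.vertica.internal"

def run(host: str, sql: str):
    """Open a connection against the chosen subcluster and run one query."""
    conn = vertica_python.connect(host=host, **COMMON)
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()

if __name__ == "__main__":
    # Long-running analytical work stays on the primary subcluster...
    run(PRIMARY_HOST, "SELECT account_type, COUNT(*) FROM policies GROUP BY 1;")
    # ...while latency-sensitive API calls hit the isolated API subcluster,
    # which reads the same shared S3 data but uses its own compute.
    run(API_HOST, "SELECT balance FROM accounts WHERE account_id = 42;")
```

Because both subclusters subscribe to the same communal storage, nothing has to be copied or synchronized when the API subcluster is brought up for busy periods and shut down afterwards, which is where the cost savings Joe describes come from.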
COVID-19: IT Spending Impact March 26, 2020
>> From theCUBE studios in Palo Alto and Boston, connecting with our leaders all around the world, this is theCUBE Conversation. >> Hello everyone, and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this breaking analysis, we're changing the format a little bit, we're going right to the new data from ETR. You might recall that last week, ETR received survey results from over 1000 CIOs and IT practitioners, and they made a call at that time which said that, actually surprisingly, a large number of respondents, about 40%, said they didn't expect a change in their 2020 IT spending. At the same time, about 20% of the survey said they were going to spend more, largely related to Work From Home infrastructure. ETR was really the first to report on this. And it wasn't just collaboration tools like Zoom and video conferencing; it was infrastructure around that: security, network bandwidth, and other types of infrastructure to support Work From Home, like desktop virtualization. ETR made the call at that time that it looked like budgets were going to be flat for 2020. Now, you also might recall consensus estimates for 2020 came into the year at about 4%, slightly ahead of GDP. Obviously, that's all changed. Last week, ETR took the forecast down, and we're going to update you today. We've now gone slightly negative. And with me to talk about that again is Sagar Kadakia, who's the Director of Research at ETR. Sagar, great to see you again, thank you for coming on. >> Thanks for having me again, David, really appreciate it. >> Let's get right into it. I mean, if you look at the time series chart that we showed last week, you can see how sentiment changed over time. That blue line was basically people who responded to the survey starting at 3/11. Now you've updated that forecast, really tracking after COVID-19 really kicked in. Can you explain what we're seeing here in this chart? >> Yeah, no problem. The last time we spoke, we were around an N, or sample size, of about 1000, and we were right around that zero percent growth rate. One of the unique things that we've done is we've left this survey open, and so what that allows us to do is really track the impact on annual IT growth essentially daily. And so as things have progressed, as you look at that blue line, you can really see the growth rate has continued to trend downwards, and as of just a day or two ago, we're now below zero. And so I think because of what's occurring right now, the overall current climate continues to slightly deteriorate. You're seeing that in a lot of the CIOs' responses. >> If you bring that slide back up, Andrew, I want to just sort of stay on this for a second. What I really like about what you guys are doing is you're essentially bringing event analysis into this. So if you see that blue line, you see on 3/13 a national emergency was declared, and that's really when the blue line started to decline. What ETR has done is kind of reset that, reset the data since 3/13, because it's now a more accurate reflection of what's actually happening in the market. Notice in the upper right, it says the US approved... the Senate last night approved a stimulus package. Actually, they're calling it an aid package. It's really not a stimulus package, it's an aid package that they're injecting to help a number of our workers, it sounds like existing workers, and small businesses, and even large businesses like Boeing. Boeing was up significantly yesterday, powering the Dow, and potentially airlines.
As you can see, ETR is going to continue to monitor the impact and roll this out. Really, ETR is the only company that I know of, anyway, that can track this stuff on a daily basis. So Sagar, that event analysis is really key, and you're going to be watching the impact of this stimulus slash aid package. >> Yeah, so here's what we're doing on that chart. If you look at that yellow line again, effectively what you're seeing is, if we remove the first, I think, six or seven hundred respondents that took the survey and start tracking how budgets are changing as of 3/13, that's when the US declared a national emergency, we can recalculate the growth rate. And we can see it's around... it's almost negative one and a half. And so the beauty of doing this, really polling daily, is it allows us to be just as dynamic as a lot of these organizations are. I think one of the things we talked about the last time was that some of these budget changes are going to be temporary, and organizations are figuring out what they're doing day by day, and a lot of that is dictated based on government actions. And so uniquely here, what we're able to do is kind of give people a range and also say, "based on these events, this is how things are changing." And so I think the first biggest event was on 3/13, where the US effectively declared a national emergency over COVID-19. And now what we're going to start tracking between today, over the weekend, and Monday is: Are people getting more positive? Is there no change? Or is there further deterioration because of this aid package that got passed this morning? >> Now I want to share with our audience, I've been down to ETR's headquarters in New York, it's staffed with a number of data scientists and statistical experts. The Ns here are well over 1000, I think we're over 1100 now, is that correct? What is the N that we're at today?
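The recalculation Sagar describes is simple to state: keep the survey open, keep only the responses received on or after an event date such as 3/13, and recompute the average expected change in 2020 IT spend. As a rough, hypothetical illustration of that arithmetic (the column names and sample values below are invented, not ETR's data):

```python
# Hypothetical sketch of the event-anchored recalculation described above.
# Column names and sample values are invented; this is not ETR's dataset.
import pandas as pd

responses = pd.DataFrame({
    "submitted": pd.to_datetime(["2020-03-11", "2020-03-12", "2020-03-16", "2020-03-24"]),
    "budget_change_pct": [4.0, 0.0, -2.5, -5.0],   # each CIO's expected 2020 change
})

def growth_since(df: pd.DataFrame, event_date: str) -> float:
    """Average expected IT budget change using only responses on/after an event."""
    after = df[df["submitted"] >= pd.Timestamp(event_date)]
    return after["budget_change_pct"].mean()

print(growth_since(responses, "2020-03-13"))   # about -3.75 for this toy sample
```

Recomputing that number daily as new responses arrive is what makes the blue and yellow lines in the chart move with each new event.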
And you guys are experts at really digging into that in trying to understand and read the tea leaves. >> That's right. The key to this survey is, it's not anonymous, we know who is taking the survey. Now to your point, we do anonymize and aggregate it when we display those results. But one of the unique capabilities is we're able to see all of these trend lines. The entire drill down survey that we did on COVID-19 through the lenses of different verticals so we can take a look at industrials materials manufacturing, healthcare, pharma, airlines, delivery services, health, and all these other verticals and get a feel for which ones are deteriorating the most, which ones look stable. And, we talked about last week and it continues to remain true this week. And again, the ends have gone up on all these verticals on the supply chain side. Industrials, materials manufacturing, healthcare, pharma, they continue and they also anticipate to see these things in the next few months, broken supply chains and on the demand side, it's really retail consumer airlines delivery services. That's coming down quite substantial. And I think, based on what United and some of these other airlines have done these last few days in terms of cutting capacity, that's just a reflection of what we're seeing. >> Let's dig into the data a little bit more and bring up the next chart. Last week, we're about 40% actually, exactly 40% where that gray line that said: CIOs and IT practitioners said, "no change." They're like the budget of the green. The green was actually at about 20 21%. So it's slightly up now at 22%. And you can see, most of the the green is in that one to 10% range. And you can see in the left hand side, it's obviously changing. Now we're at 37% in the gray line, slightly up in the green, and a little bit more down and in the red. So take us through what's changed Sagar. >> Yeah, to reiterate what we were talking about last week, and then I'll kind of talk about some of the change is, I think the market and a lot of our clients, they were expecting the growth rate to be more negative. Last week when we talked about zero percent. The reason that, it wasn't more negative is because we saw all these organizations accelerating spend because they had to keep employees productive. They don't want to catastrophe in productivity. And so you saw this acceleration, as you mentioned earlier in the interview around Work From Home tools, like collaboration tools, increasing bandwidth on the VPN networking side, laptops, MDM, so forth and so on. That continues to hold true today. Again, if we use the same example that we talked about last week, (mumbles) organizations, they have 40 50 60,000 employees or more working from home. You have to be able to support these individuals and that's why we're actually seeing some organizations accelerate spend and the majority organizations even though they are declining spend, some of that is still being offset by having to spend more on what we're calling kind of this Work From Home infrastructure. But I will say this: you are seeing more organizations versus last week, which is why the growth rate has come down, moving more and more towards the negative buckets. Again, there is some offset there. But the offset we talked about last week, Work From Home infrastructure is not a one-for-one when it comes to taking down your IT budget, and that continues to hold true. 
>> Let's talk a little bit about some of the industries retail, airlines, industrials, pharma, healthcare, what are you seeing in terms of the industry impact, particularly when it relates to supply chains, but other industry data that went through? >> I think the biggest takeaway is that healthcare pharma, industry materials, manufacturing organizations, they've indicated the highest levels of broken supply chains today. And they think in three months from now, it's actually going to get worse. And so we spoke about this last time, I don't think this is going to be a V shaped recovery from the standpoint of things are going to get better in the next few weeks or the next month or two. CIOs are indicating that they expect conditions to worsen over the next three months on the supply chain side and even demand the ones that are getting hit the hardest on the retail consumer side airlines, delivery services, they are again indicating that they anticipate demand to be worse three months from now. The goal is to continue serving and pulling these individuals over the next few weeks and months and to see if we can get a better timeline as we get into two edge but for the next few months, conditions look like they're going to get worse. >> I want to highlight some of the industries and let's make some comments here. Retail... You guys called out retail airlines, delivery services, industrials, materials, manufacturing, pharma and healthcare, there's some of the highest impact. I'll just make a few comments here. I think retail really, this accelerates the whole digital transformation. We already saw this starting, I think you'll see further consolidation and some permanence in the way in which companies are pivoting to digital. Obviously, the big guys like Walmart and the like are competing very effectively with Amazon. But, there's going to be some more consolidation there. I would say potentially the same thing in airlines that really are closely watching what the government is going to do. But, do we need this this many airlines? Do we need all this capacity? Maybe yes, maybe no. So watching that. And of course, healthcare right now, as I said last week in the braking analysis, they're just too distracted right now to buy anything. And they're overwhelmed. Now, of course, pharma, they're manufacturing, so they've got disruptions in supply chain and obviously the business. But there could be an upside down the road as COVID-19 vaccines come to the market. >> On the upside, I think you kind of hit it, right on the nail. When you get these type of events that occur. Sometimes it speeds up digital transformation. one of the things that the team and I have been talking about internally is: this is not your father's Keep The Lights On strategy so to speak. Organizations are very focused on maintaining productivity versus significantly cutting costs. What does that mean? Maybe three to five years ago, if this had occurred, you would have seen a lot of infrastructure as a service platform, as a service... A lot of these cloud providers, you'd have seen those projects decline as organization spent more on on plan. And we're not seeing that. We're seeing continued elevated budgets on the Cloud side and Micron just reported this morning and again, cited strong demand on the Cloud and data center side. That just goes to show that organizations are trying to maintain productivity. 
They want to continue these IT roadmaps and they're going to cut budgets where they can, but it's not going to be on the Cloud side. >> You know what, that's a really important point. This is not post Y2K, not 2008, 2007, 2008, 2009 because we've, pretended but a 10 year bull market, companies are doing pretty well, balance sheets are generally strong. They somewhat in whether, it was used to stronger companies, whether they're so they're not focused right now anyway, on cut cut cut as it was in the last few downturns. Let's go into some of the vendor data and some of the sector data, Andrew if you'd bring up the next chart. What we're showing here is really comparing the the blue is the January survey to the current survey in the yellow, and you're seeing some of the sectors that are up taking. You've identified mobile device management, big data and Cloud, some of the productivity, you mentioned DocuSign, Adobe zoom, Citrix, even VMware with the desktop virtualization. We've talked about security, you've got marketing and LinkedIn, my LinkedIn inbound is going through the roof as people are probably signing up for a LinkedIn premium. Let's talk about this a little bit. What you're seeing... Help us interpret this data. >> Yeah, sure. One of the things that everybody wants to know is, okay, so Work From Home infrastructures getting more spend for the vendors that are benefiting the most. One of the unique things that we can do is because we're kind of collecting all the DNA, from a tech stack aside from these organizations, we can overlap, how they're spending on these vendors. And also with the data that they provide in terms of whether they are increasing or decelerating their IT budgets because of COVID-19. What you're looking at here, is we isolated to all of those organizations and customers that indicated that they're increasing their budgets because of COVID-19. Because of the Work From Home infrastructure. And what we're doing is we're then isolating to vendors that are getting the most upticks in spend. This actually really nicely aligns with a lot of the themes that we were talking about collaboration tools. You see that VMware, they're all right on the virtualization side, MDM with Microsoft. And you're seeing a lot of other vendors with Citrix and Zoom and Adobe. These are the ones that we think are going to benefit from this kind of Work From home infrastructure movement. And again, it's all very... It's not just the qualitative and the commentary. This is all analytics, we really went in and analyzed every single one of these organizations that were increasing their budgets and tried to pinpoint using different data analysis techniques, and to see which vendors were really getting the majority or the largest, pie of that span. >> We had Sanjay Poonen, who's the CEO of VMware on yesterday and he was very sensitive but not trying to hear as your ambulance chasing because obviously they do desktop virtualization and VDI big workload. At the same time. I think he was also being cautious because there's probably portions of their business that are going to get hit, Michael Dell similarly, I think he was quoted in CRN as saying, "hey, are we seeing momentum in our laptop "business in our mobile business?" But as you guys pointed out, the flip side of that is their on prem business is probably going to suffer somewhat. It's a kind of like the Work From Home is a partial offset, but it's not a total offset. You're seeing that with a lot of these companies. 
Obviously, Microsoft, AWS, a lot of the cloud companies are very well positioned. How about some of the guys that are going to get impacted? Obviously, as I said, the on-prem folks. You guys talked about earlier, it's not your father's Keep The Lights On strategy. Okay, but this... You asked the question, is this a reprieve for the legacy guys? Not quite, was your conclusion. What did you mean by that? >> I think a lot of times when you have these types of events, the clients, a lot of the market, think, okay, "some of the legacy vendors are going to do well because we're in uncertain times, and we don't want to keep on this kind of next generation strategy." We're not seeing that, to the point that you highlighted earlier. Even with companies like Dell, like Cisco, where they're seeing some products accelerate, there are products, to your point, that are not doing as well. The desktops, right? As an example for Dell, or the storage. On the negative side, or the legacy side, where we're just not seeing any traction, it's the IBMs, the Oracle on-prem, Symantec, which got acquired by Broadcom, Checkpoint, MicroStrategy. And there's another half dozen other vendors that we're seeing where they are not capitalizing. There is no reprieve for these legacy names. And we don't anticipate them getting additional spend because of this Work From Home infrastructure kind of movement. >> Let's unpack that a little bit. It's interesting, Symantec and Checkpoint in security; security, you'd think, would get an uplift there, but what you're seeing here is... Let me just tell the audience who you called out. Symantec, Teradata, MicroStrategy, NetApp, Checkpoint, Oracle and IBM, and I know there are others. But I would say this: these are companies that are getting impacted in a big way by the Cloud. Particularly Symantec and Checkpoint; the cloud security companies are actually probably still doing pretty well. You take Teradata, their data warehouse business is getting impacted by the Cloud, from folks like Snowflake and Redshift; MicroStrategy, there's a lot of modern BI coming out. NetApp, here's a company that's embraced the Cloud, but the vast majority of the business continues to be on-prem. I think IBM and Oracle are interesting. They're somewhat different, actually a lot different. IBM has services exposure, and you guys call that out, particularly around outsourcing. At the same time, IBM has a lot of resources; it's going to be interesting to see if they start coming out with coronavirus-related services. So watching for that. And then Oracle, their whole story is, "okay, we've got Gen 2 Cloud and mission critical in the Cloud," but their on-prem business, I think, is clearly going to be affected here, which is kind of what you guys pointed out, and I would agree with your thoughts. >> I think what we're seeing is, organizations had a Cloud roadmap, and that roadmap is continuing. The one thing that is changing in some of that roadmap is, we need to be able to support employees as they work from home while we execute that roadmap. And so that's why we're not seeing a reprieve on the legacy side. But we are seeing upticks in spend where we just wouldn't otherwise anticipate them, right, maybe on Citrix, on Dell laptops, Adobe and a few other areas. Now, in terms of the security side, some of the next-gen security vendors like CrowdStrike, and the vendors in MFA, those vendors are doing well.
It makes sense: where you have more people working from home, you have more devices that are connecting to data and applications, and from that alone you would expect spend to continue going up, as you need more authentication, more endpoint protection. Cisco Meraki, they do cloud networking. That piece is looking very good, even though hardware networking is not looking very good at all. The cloud networking is looking good, which again makes sense, as you're increasing bandwidth on that side. >> Definitely stories of two sides of that coin. >> That's right. >> I want to... Andrew, if you want to... If you wouldn't mind bringing up the next chart, we're going to go back to the first one that we showed you with the time series. This is a very important point. Again, we can't stress it enough. We want to understand the impact of the stimulus or aid package. And ETR is going to continue to track that. What can we expect from you guys over the next week or so? >> The goal is to determine whether or not the stimulus is having an impact on how people are responding to our survey as it relates to how they're changing their budgets. Over the next four or five days, if we start seeing an uptick in these yellow and blue lines here, I think that's a positive. I think that shows that people are kind of wrapping their heads around, okay, the government is taking action here. There is a roadmap in place to help us get out of this. But if the line continues coming down, it just may be that in the last few weeks or the last month or so, there was just so much damage. There's not really... there's no coming back from this, at least in the near term. So we are kind of watching out for that. >> Well, the Fed is definitely active. >> They're doing what they can, they're pushing liquidity into the marketplace. People think they're out of bullets. I don't agree. The Fed has quite a bit of headroom and some dry powder, (murmurs) which is awesome. But the Fed, by itself, can't do it. You needed to have this fiscal stimulus. So we're excited to see that come to market. What I would say to our audience is, my concern is uncertainty. The markets don't like uncertainty, and right now there's a lot of uncertainty. If you saw the piece on Medium, The Hammer and the Dance, it lays out some scenarios about what could happen to the healthcare system. You see people who say, "hey, we should shut down for 10 weeks." The president is saying, "hey, we want to get back to work by April." The big concern that I have is: okay, maybe we can stamp it out in the near term and get back to work by late April, early May. But then what happens? Are people going to start traveling again? Are people going to start holding events again? And I think there's going to be some real question marks around that. That uncertainty, I think, is something that we obviously have to watch. I think there is light at the end of the tunnel, when you look at China and some of the other things that are happening around the world, but we still don't know how long that tunnel is. I'll give you final thoughts before we wrap. >> I think that's the biggest thing here, the uncertainty, which is why we're doing a lot of this event analysis. We're trying to figure out: after each one of these big events, is there more certainty in people's responses? And just as we were talking about sectors and verticals and vendors that are not doing well: because of the uncertainty, we're seeing a lot of downticks in spend amongst outsourced IT and IT consulting vendors.
And as long as the uncertainty continues, you're going to see more and more IT projects frozen, less and less spend on those outsourced IT and IT consulting vendors and others. And until there's something really in place here where people feel comfortable, you're going to probably see budgets remain where they are, which right now is negative. >> Folks, as we said last week, Sagar and I, ETR is committed, theCUBE is committed to keeping you updated on a regular basis, right now on a weekly cadence. As we have new information, we will bring it to you. Sagar, thanks so much for coming on and supporting us. >> You're welcome and thanks for having me again. >> You're welcome. Thank you for watching this CUBE Insights powered by ETR. And remember, all these breaking analyses are available on podcast. Go to etr.plus, that's where all the action is in terms of the survey work. siliconangle.com covers these breaking analyses, and I publish weekly on wikibon.com. Thanks for watching everybody. Stay safe. And we'll see you next time.
SUMMARY :
The host and Sagar Kadakia of ETR review survey data on how COVID-19 is affecting IT budgets. Healthcare, pharma, industrials, materials and manufacturing organizations report the most broken supply chains and expect conditions to worsen over the next three months, while retail, airlines and delivery services anticipate weaker demand; neither group expects a V-shaped recovery. Organizations are focused on maintaining productivity rather than cutting costs, so cloud budgets are holding up and Work From Home vendors such as Zoom, Citrix, Adobe, VMware and Microsoft are seeing upticks, while legacy on-prem names like IBM, Oracle, Symantec, Teradata and Checkpoint see no reprieve. The teams will keep tracking the weekly surveys to gauge whether the fiscal stimulus restores certainty; until then, overall budget expectations remain negative.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Boeing | ORGANIZATION | 0.99+ |
Walmart | ORGANIZATION | 0.99+ |
Sanjay Poonen | PERSON | 0.99+ |
David | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Symantec | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
six | QUANTITY | 0.99+ |
Andrew | PERSON | 0.99+ |
March 26, 2020 | DATE | 0.99+ |
United | ORGANIZATION | 0.99+ |
New York | LOCATION | 0.99+ |
10 weeks | QUANTITY | 0.99+ |
Last week | DATE | 0.99+ |
2008 | DATE | 0.99+ |
UPS | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Sagar | PERSON | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
3/13 | DATE | 0.99+ |
22% | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
January | DATE | 0.99+ |
2020 | DATE | 0.99+ |
2007 | DATE | 0.99+ |
COVID-19 | OTHER | 0.99+ |
Mars | ORGANIZATION | 0.99+ |
2009 | DATE | 0.99+ |
Monday | DATE | 0.99+ |
Adobe | ORGANIZATION | 0.99+ |
two sides | QUANTITY | 0.99+ |
ETR | ORGANIZATION | 0.99+ |
37% | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
40 | QUANTITY | 0.99+ |
3/11 | DATE | 0.99+ |
ORGANIZATION | 0.99+ | |
40% | QUANTITY | 0.99+ |
Fed | ORGANIZATION | 0.99+ |
this week | DATE | 0.99+ |
Senate | ORGANIZATION | 0.99+ |
Citrix | ORGANIZATION | 0.99+ |
Michael Dell | PERSON | 0.99+ |
Broadcom | ORGANIZATION | 0.99+ |
10 year | QUANTITY | 0.99+ |
Micron | ORGANIZATION | 0.99+ |
VMware | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
one and a half | QUANTITY | 0.99+ |
Sagar Kadakia | PERSON | 0.99+ |
Stephanie McReynolds, Alation | CUBEConversation, November 2019
>> Announcer: From our studios, in the heart of Silicon Valley, Palo Alto, California, this is a CUBE conversation. >> Hello, and welcome to theCUBE studios, in Palo Alto, California for another CUBE conversation where we go in depth with thought leaders driving innovation across the tech industry. I'm your host, Peter Burris. The whole concept of self-service analytics has been with us for decades in the tech industry. Sometimes it's been successful, most times it hasn't been. But we're making great progress and have over the last few years as the technology matures, as the software becomes more potent, but very importantly as the users of analytics become that much more familiar with what's possible and that much more wanting of what they could be doing. But this notion of self-service analytics requires some new invention, some new innovation. What are they? How's that going to play out? Well, we're going to have a great conversation today with Stephanie McReynolds, she's Senior Vice President of Marketing at Alation. Stephanie, thanks again for being on theCUBE. >> Thanks for inviting me, it's great to be back. >> So, tell us a little, give us an update on Alation. >> So as you know, Alation was one of the first companies to bring a data catalog to the market. And that market category has now been cemented and defined, depending on the industry analyst you talk to. There could be 40 or 50 vendors now who are providing data catalogs to the market. So this has become one of the hot technologies to include in a modern analytics stack. Particularly, we're seeing a lot of demand as companies move from on-premise deployments into the cloud. Not only are they thinking about how do we migrate our systems, our infrastructure into the cloud, but with data cataloging, more importantly, how do we migrate our users to the cloud? How do we get self-service users to understand where to go to find data, how to understand it, how to trust it, what reuse can we make of existing assets so we're not just exploding the amount of processing we're doing in the cloud. So that's been very exciting, it's helped us grow our business. We've now seen four straight years of triple digit revenue growth, which is amazing for a high growth company like us. >> Sure. >> We also have over 150 different organizations in production with a data catalog as part of their modern analytics stack. And many of those organizations are moving into the thousands of users. So eBay was probably our first customer to move into, you know, over a thousand weekly logins; they're now up to about 4,000 weekly logins through Alation. But now we have customers like Boeing and General Electric and Pfizer, and we just closed a deal with the US Air Force. So we're starting to see all sorts of different industries and all sorts of different users, from the analytics specialist in your organization, like a data scientist or a data engineer, all the way out to maybe a product manager or someone who doesn't really think of themselves as an analytics expert, using Alation either directly or sometimes through one of our partnerships with folks like Tableau or MicroStrategy or Power BI. >> So, if we think about this notion of self-service analytics, Stephanie, and again, Alation has been a leader in defining this overall category, we think in terms of an individual who has some need for data but, most importantly, has questions they think data can answer, and now they're out looking for data. Take us through that process.
They need to know where the data is, they need to know what it is, they need to know how to use it, and they need to know what to do if they make a mistake. How is that, how are the data catalogs, like Alation, serving that, and what's new? >> Yeah, so as consumers, this world of data cataloging is very similar if you go back to the introduction of the internet. >> Sure. >> How did you find a webpage in the '90s? Pretty difficult, you had to know the exact URL to go to, in most cases, to find a webpage. And then Yahoo was introduced, and Yahoo did a whole bunch of manual curation of those pages so that you could search for a page and find it. >> So Yahoo was like a big catalog. >> It was like a big catalog, an inventory of what was out there. So the original data catalogs, you could argue, were what we would call from a technical perspective a metadata repository. No business user wants to use a metadata repository, but it created an inventory of what all the data assets are that we have in the organization and what the descriptions of those data assets are. The metadata. So metadata repositories were kind of the original catalogs. The big breakthrough for data catalogs was: how do we become the Google of finding data in the organization? So rather than manually curating everything that's out there and providing an end user with an answer, how could we use machine learning and AI to look at patterns of usage, what people are clicking on in terms of data assets, and surface those as data recommendations to any end user, whether they're an analytics specialist or they're just a self-service analytics user. And so that has been the real breakthrough of this new category called data cataloging. And so most folks are accessing a data catalog through a search interface, or maybe they're writing a SQL query and there's SQL recommendations that are being provided by the catalog-- >> Or using a tool that utilizes SQL. >> Or using a tool that utilizes SQL, and for most people, most employees in a large enterprise, when you get to those thousands of users, they're using some other tool like Tableau or MicroStrategy or, you know, a variety of different data visualization providers or data science tools to actually access that data. So a big part of our strategy at Alation has been, how do we surface this data recommendation engine in those third party products. And then if you think about it, once you're surfacing that information and providing some value to those end users, the next thing you want to do is make sure that they're using that data accurately. And that's a non-trivial problem to solve, because analytics and data is complicated. >> Right. >> And metadata is extremely complicated-- >> And metadata is-- because often it's written in a language that's arcane and designed to be precise from a data standpoint, that's not easily consumable or easily accessible by your average human being. >> Right, so a label, for example, on a table in a database might be cust_seg_257, what does that mean? >> It means we can process it really quickly in the system. >> Yeah, but as-- >> But it's useless to a human being-- >> As a marketing manager, right?
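As a rough illustration of the usage-driven ranking Stephanie describes, here is a minimal Python sketch that orders candidate tables by how often they appear in a usage log. The table names, log format, and scoring are hypothetical; Alation's actual recommendation algorithms are not described in this conversation beyond the general idea of learning from patterns of usage.

```python
# Hypothetical sketch of usage-based ranking for a data catalog's search results.
# The log format and scoring are invented; a production catalog would draw on
# real query logs, BI tool activity, and lineage, not a toy list.
from collections import Counter

# Imagine each entry is one observed use of a table (a query, a click, a dashboard).
usage_log = [
    "sales.cust_seg_257",
    "sales.cust_seg_257",
    "finance.revenue_quarterly",
    "sales.cust_seg_257",
    "finance.revenue_quarterly",
    "staging.tmp_export_old",
]

def rank_assets(search_hits, log):
    """Order candidate tables by how often analysts actually use them."""
    popularity = Counter(log)
    return sorted(search_hits, key=lambda table: popularity[table], reverse=True)

# A keyword search might match several tables; observed usage decides what surfaces first.
hits = ["staging.tmp_export_old", "sales.cust_seg_257", "finance.revenue_quarterly"]
print(rank_assets(hits, usage_log))
# ['sales.cust_seg_257', 'finance.revenue_quarterly', 'staging.tmp_export_old']
```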
I'm like, hey, I want to do some customer segmentation analysis and I want to find out if people who live in California might behave differently if I provide them an offer than people that live in Massachusetts. It's not intuitive to say, oh yeah, that's in customer_seg_257. So what data catalogs are doing is they're thinking about that marketing manager, they're thinking about that peer business user, and helping make that translation between business terminology, "Hey I want to run some customer segmentation analysis for the West," and the technical, physical model that underlies the data in that database: customer_seg_257 is the table you need to access to get the answer to that question. So as organizations start to adopt more self-service analytics, it's important that we're managing not just the data itself and this translation from technical metadata to business metadata, but there's another layer that's becoming even more important as organizations embrace self-service analytics. And that's how is this data actually being processed? What is the logic that is being used to traverse the different data sets that end users now have access to? So if I take gender information in one table and I have information on income in another table, and I have some private information that identifies the same customer in those two tables, in some use cases I can join that data; if I'm doing marketing campaigns, I likely can join that data. >> Sure. >> If I'm running a loan approval process here in the United States, I cannot join that data. >> That's a legal limitation, that's not a technical issue-- >> That's a legal, federal, government issue. Right? And so here's where there's a discussion, among folks that are knowledgeable about data and data management, there's a discussion of how do we govern this data? But I think by saying how we govern this data, we're kind of covering up what's actually going on, because you don't have to govern that data so much as you have to govern the analysis. How is this joined, how are we combining these two data sets? If I just govern the data for accuracy, I might not know the usage scenario, which is that someone wants to combine these two things, which makes it illegal. Separately, it's fine; combined, it's illegal. So now we need to think about, how do we govern the analytics themselves, the logic that is being used. And that gets kind of complicated, right? For a marketing manager, the difference between those things on the surface doesn't really make sense. It only makes sense when the context of that government regulation is shared and explained, and in the course of your workflow, dragging and dropping in a Tableau report, you might not remember that, right? >> That's right, and the derivative output that you create, that other people might then be able to use because it's back in the data catalog, doesn't explicitly note, often, that this data was generated as a combination of a join that might not be in compliance with any number of different rules. >> Right, so about a year and a half ago, we introduced a new feature in our data catalog called Trust Check. >> Yeah, I really like this. This is a really interesting thing. >> And that was meant to be a way where we could alert end users to these issues: hey, you're trying to run this analytic and that's not allowed. We're going to give you a warning, we're not going to let you run that query, we're going to stop you right there.
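To make the "govern the analysis, not just the data" point concrete, here is a minimal, hypothetical Python sketch of a pre-query check in the spirit of what Trust Check is described as doing: the same column combination passes for one use case and is blocked, with an explanation, for another. The policy list, function names, and messages are invented for illustration and are not Alation's actual implementation.

```python
# Hypothetical sketch of governing the analysis rather than the data.
# Policy contents and names are invented; this is not Alation's rules engine or API.
BLOCKED_COMBINATIONS = [
    # (columns that may not be combined, use case, reason shown to the analyst)
    ({"gender", "income"}, "loan_approval",
     "Combining gender with income is not permitted in U.S. lending decisions."),
]

def check_query(columns_used, use_case):
    """Return (allowed, explanation) before a query is executed."""
    for blocked_cols, blocked_use, reason in BLOCKED_COMBINATIONS:
        if blocked_cols <= set(columns_used) and use_case == blocked_use:
            return False, reason
    return True, "No policy violations found for this use case."

# The same join gets a different verdict depending on the declared use case.
print(check_query({"gender", "income", "customer_id"}, "marketing_campaign"))
# (True, 'No policy violations found for this use case.')
print(check_query({"gender", "income", "customer_id"}, "loan_approval"))
# (False, 'Combining gender with income is not permitted in U.S. lending decisions.')
```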
So that was a way, in the workflow of someone while they're typing a SQL statement or while they're dragging and dropping in Tableau, to surface that up. Now, some of the vendors we work with, like Tableau, have doubled down on this concept of how do they integrate with an enterprise data catalog to make this even easier. So at Tableau Conference last week, they introduced a new metadata API, they introduced a Tableau catalog, and the opportunity for these types of alerts to be pushed into the Tableau catalog as well as directly into reports and worksheets and dashboards that end users are using. >> Let me make sure I got this. So it means that you can put a lot of the compliance rules inside Alation and have a metadata API so that Alation effectively is governing the utilization of data inside the Tableau catalog. >> That's right. So think about the integration with Tableau as this communication mechanism to surface up these policies that are stored centrally in your data catalog. And so this is important, this notion of a central place of reference. We used to talk about data catalogs just as a central place of reference for where all your data assets lie in the organization, and we have some automated ways to crawl those sources and create a centralized inventory. What we've added in our new release, which is coming out here shortly, is the ability to centralize all your policies in that catalog as well as the pointers to your data in that catalog. So you have a single source of reference for how this data needs to be governed, as well as a single source of reference for how this data is used in the organization. >> So does that mean, ultimately, that someone could try to do something, Trust Check will say, no you can't, but this new capability will say, and here's why, or here's what you do. >> Exactly. >> A descriptive step that says let me explain why you can't do it. >> That's right. Let me not just stop your query and tell you no, let me give you the details as to why this query isn't a good query and what you might be able to do to modify that query should you still want to run it. And so all of that context is available for any end user to be able to become more aware of what the system is doing and why it is recommending. And on the flip side, in the world before we had something like Trust Check, the only opportunity for an IT team to stop those queries was just to stop them without explanation, or to try to publish manuals and ask people to take tests, like the DMV, so that they memorized all those rules of governance. >> Yeah, self-service, but if there's a problem you have to call us. >> That's right. That's right. So what we're trying to do is trying to make the work of those governance teams, those IT teams, much easier by scaling them. Because we all know the volume of data that's being created, the volume of analysis that's being created is far greater than any individual can keep up with, so we're trying to scale those precious data expert resources-- >> Digitize them-- >> Yeah, exactly. >> It's a digital transformation of how we acquire data necessary-- >> And then-- >> for data transformation. >> Make it super transparent for the end user as to why they're being told yes or no, so that we remove this friction that's existed between business and IT when trying to perform analytics.
>> But I want to build a little bit on one of the things I thought I heard you say, and that is the idea that this new feature, this new capability will actually prescribe an alternative, logical way for you to get your information that might be in compliance. Have I got that right? >> Yeah, that's right. Because what we also have in the catalog is a workflow that allows individuals called Stewards, analytics Stewards, to be able to make recommendations and certifications. So if there's a policy that says thou shalt not use the data in this way, the Stewards can then say, but here's an alternative mechanism, here's an alternative method, and by the way, not only are we making this a recommendation but this is certified for success. We know that our best analysts have already tried this out, or we know that this complies with government regulation. And so this is a more active way, then, for the two parties to collaborate together in a distributed way, that's asynchronous, and so it's easy for everyone no matter what hour of the day they're working or where they're globally located. And it helps progress analytics throughout the organization. >> Oh and more importantly, it increases the likelihood that someone who is told, you now have self-service capability, doesn't find themselves abandoning it the first time that somebody says no, because we've seen that over and over with a lot of these query tools, right? That somebody says, oh wow, look at this new capability, until the screen, you know, metaphorically, goes dark. >> Right, until it becomes too complicated-- >> That's right-- >> and then you're like, oh I guess I wasn't really trained on this. >> And then they walk away. And it doesn't get adopted. >> Right. >> And this is a way, it's a very human-centered way, to bring that self-service analyst into the system and be a full participant in how you generate value out of it. >> And help them along. So you know, the ultimate goal that we have as an organization is to help organizations, our customers, become data literate populations. And you can only become data literate if you get comfortable working with the data and it's not a black box to you. So the more transparency that we can create through our policy center, through documenting the data for end users, and making it easier for them to access, the better. And so, in the next version of the Alation product, not only have we implemented features for analytics Stewards to use, to certify these different assets, to log their policies, to ensure that they can document those policies fully with examples and use cases, but we're also bringing to market a professional services offering from our own team that says look, given that we've now worked with about 20% of our installed base, and observed how they roll out Stewardship initiatives and how they assign Stewards and how they manage this process, and how they manage incentives, we've done a lot of thinking about what are some of the best practices for having a strong analytics Stewardship practice if you're a self-service analytics oriented organization. And so our professional services team is now available to help organizations roll out this type of initiative, make it successful, and have that be supported with product. So the psychological incentives of how you get one of these programs really healthy is important. >> Look, you guys have always been very focused on ensuring that your customers were able to adopt the value proposition, not just buy the value proposition.
>> Right. >> Stephanie McReynolds, Senior Vice President of Marketing at Alation, once again, thanks for being on theCUBE. >> Thanks for having me. >> And thank you for joining us for another CUBE conversation. I'm Peter Burris. See you next time.
SUMMARY :
Peter Burris talks with Stephanie McReynolds of Alation about the state of self-service analytics and data cataloging. Alation, one of the first data catalog vendors, reports four straight years of triple-digit revenue growth and over 150 organizations in production, including eBay, Boeing, General Electric, Pfizer and the US Air Force. McReynolds describes the catalog as the Google of finding data, using machine learning on patterns of usage to recommend data, and argues that governance has to cover the analysis and its logic, not just the data: Trust Check stops non-compliant queries and explains why, and the new Tableau metadata API lets those alerts surface directly in BI tools. A coming release centralizes policies alongside data pointers as a single source of reference, supported by analytics Stewards and a new professional services offering for stewardship programs, all aimed at building data literate organizations.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Boeing | ORGANIZATION | 0.99+ |
Pfizer | ORGANIZATION | 0.99+ |
General Electric | ORGANIZATION | 0.99+ |
Stephanie McReynolds | PERSON | 0.99+ |
Stephanie | PERSON | 0.99+ |
Peter Burris | PERSON | 0.99+ |
40 | QUANTITY | 0.99+ |
California | LOCATION | 0.99+ |
Massachusetts | LOCATION | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
November 2019 | DATE | 0.99+ |
Alation | ORGANIZATION | 0.99+ |
eBay | ORGANIZATION | 0.99+ |
two parties | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
two tables | QUANTITY | 0.99+ |
two customers | QUANTITY | 0.99+ |
one table | QUANTITY | 0.99+ |
United States | LOCATION | 0.99+ |
50 vendors | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Palo Alto, California | LOCATION | 0.99+ |
SQL | TITLE | 0.99+ |
last week | DATE | 0.99+ |
US Air Force | ORGANIZATION | 0.99+ |
Microstrategy | ORGANIZATION | 0.99+ |
first customer | QUANTITY | 0.99+ |
Tableau | ORGANIZATION | 0.98+ |
Tableau | TITLE | 0.98+ |
Stewards | ORGANIZATION | 0.98+ |
Power BI | ORGANIZATION | 0.98+ |
over 150 different organizations | QUANTITY | 0.98+ |
90's | DATE | 0.97+ |
today | DATE | 0.97+ |
single | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
about 20% | QUANTITY | 0.97+ |
four straight years | QUANTITY | 0.97+ |
first time | QUANTITY | 0.97+ |
CUBE | ORGANIZATION | 0.96+ |
over a thousand weekly logins | QUANTITY | 0.96+ |
thousands of users | QUANTITY | 0.96+ |
two data | QUANTITY | 0.94+ |
Microstrategy | TITLE | 0.94+ |
first companies | QUANTITY | 0.92+ |
Tableau | EVENT | 0.9+ |
about | DATE | 0.9+ |
Silicon Valley, Palo Alto, California | LOCATION | 0.89+ |
a year and a half ago | DATE | 0.88+ |
about 4,000 weekly logins | QUANTITY | 0.86+ |
Trust Check | ORGANIZATION | 0.82+ |
single source | QUANTITY | 0.79+ |
Trust Check | TITLE | 0.75+ |
theCUBE | ORGANIZATION | 0.75+ |
customer_seg_257 | OTHER | 0.74+ |
up | QUANTITY | 0.73+ |
Alation | PERSON | 0.72+ |
decades | QUANTITY | 0.7+ |
cust_seg_257 | OTHER | 0.66+ |
Senior Vice President | PERSON | 0.65+ |
years | DATE | 0.58+ |
CUBEConversation | EVENT | 0.51+ |
Stephanie McReynolds, Alation | DataWorks Summit 2018
>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Stephanie McReynolds. She is the Vice President of Marketing at Alation. Thanks so much for, for returning to theCUBE, Stephanie. >> Thank you for having me again. >> So, before the cameras were rolling, we were talking about Kevin Slavin's talk on the main stage this morning, and talking about, well really, a background to sort of this concern about AI and automation coming to take people's jobs, but really, his overarching point was that we really, we shouldn't, we shouldn't let the algorithms take over, and that humans actually are an integral piece of this loop. So, riff on that a little bit. >> Yeah, what I found fascinating about what he presented were actual examples where having a human in the loop of AI decision-making had a more positive impact than just letting the algorithms decide for you, and turning it into kind of a black, a black box. And the issue is not so much that, you know, there's very few cases where the algorithms make the wrong decision. What happens the majority of the time is that the algorithms actually can't be understood by a human. So if you have to roll back >> They're opaque, yeah. >> in your decision-making, or uncover it, >> I mean, who can crack what a convolutional neural network does, layer by layer, nobody can. >> Right, right. And so, his point was, if we want to avoid not just poor outcomes, but also make sure that the robots don't take over the world, right, which is where every like, media person goes first, right? (Rebecca and James laugh) That you really need a human in the loop of this process. And a really interesting example he gave was what happened with the 2015 storm, and he talked about 16 different algorithms that do weather predictions, and only one algorithm mis-predicted that there would be a huge weather storm on the east coast. So if there had been a human in the loop, we wouldn't have, you know, caused all this crisis, right? The human could've >> And this is the storm >> Easily seen. >> That shut down the subway system, >> That's right. That's right. >> And really canceled New York City for a few days there, yeah. >> That's right. So I find this pretty meaningful, because Alation is in the data cataloging space, and we have a lot of opportunity to take technical metadata and automate the collection of technical and business metadata and do all this stuff behind the scenes. >> And you make the discovery of it, and the analysis of it. >> We do the discovery of this, and leading to actual recommendations to users of data, that you could turn into automated analyses or automated recommendations. >> Algorithmic, algorithmically augmented human judgment is what it's all about, the way I see it. What do you think? >> Yeah, but I think there's a deeper insight that he was sharing, is it's not just human judgment that is required, but for humans to actually be in the loop of the analysis as it moves from stage to stage, so that we can try to influence or at least understand what's happening with that algorithm. And I think that's a really interesting point.
You know, there's a number of data cataloging vendors, you know, some analysts will say there's anywhere from 10 to 30 different vendors in the data cataloging space, and as vendors, we kind of have this debate. Some vendors have more advanced AI and machine learning capabilities, and other vendors haven't automated at all. And I think that the answer is, if you really want humans to adopt analytics, and to be comfortable with the decision-making of those algorithms, you need to have a human in the loop, in the middle of that process, of not only making the decision, but actually managing the data that flows through these systems. >> Well, algorithmic transparency and accountability is an increasing requirement. It's a requirement for GDPR compliance, for example. >> That's right. >> I don't quite see it yet, though; we don't see a lot of solution providers offering solutions to enable more of an automated roll-up of a narrative of an algorithmic decision path. But that clearly is a capability as it comes along, and it will. That will absolutely depend on a big data catalog managing the data, the metadata, but also helping to manage the tracking of what models were used to drive what decision, >> That's right. >> And what scenario. So that, that plays into what Alation >> So we talk, >> And others in your space do. >> We call it a data catalog, almost as if the data's the only thing that we're tracking, but in addition to that metadata or the data itself, you also need to track the business semantics, how the business is using or applying that data, and that algorithmic logic, so that might be logic that's just being used to transform that data, or it might be logic to actually make and automate a decision, like what they're talking about with GDPR. >> It's a data artifact catalog. These are all artifacts that, they are derived in many ways, or supplement and complement the data. >> That's right. >> They're all, it's all the logic, like you said. >> And what we talk about is, how do you create transparency into all those artifacts, right? So, a catalog starts with this inventory that creates a foundation for transparency, but if you don't make those artifacts accessible to a business person, who might not understand what is metadata, what is a transformation script. If you can't make that, those artifacts accessible to a, what I consider a real, or normal human being, right, (James laughs) I love to geek out, but, (all laugh) at some point, not everyone is going to understand. >> She's the normal human being in this team. >> I'm normal. I'm normal. >> I'm the abnormal human being among the questioners here. >> So, yeah, most people in the business are just getting their arms around how do we trust the output of analytics, how do we understand enough statistics and know what to apply to solve a business problem or not, and then we give them this like, hairball of technical artifacts and say, oh, go at it. You know, here's your transparency. >> Well, I want to ask about that, that human that we're talking about, that needs to be in the loop at every stage. What, that, surely, we can make the data more accessible, and, but it also requires a specialized skill set, and I want to ask you about the talent, because I noticed on your LinkedIn, you said, hey, we're hiring, so let me know. >> That's right, we're always hiring. We're a startup, growing well. >> So I want to know from you, I mean, are you having difficulty with filling roles? I mean, what is the pipeline here?
Are people getting the skills that they need? >> Yeah, I mean, there's a wide, what I think is a misnomer is there's actually a wide variety of skills, and I think we're adding new positions to this pool of skills. So I think what we're starting to see is an expectation that true business people, if you are in a finance organization, or you're in a marketing organization, or you're in a sales organization, you're going to see a higher level of data literacy be expected of that, that business person, and that's, that doesn't mean that they have to go take a Python course and learn how to be a data scientist. It means that they have to understand statistics enough to realize what the output of an algorithm is, and how they should be able to apply that. So, we have some great customers, who have formally kicked off internal training programs that are data literacy programs. Munich Re Insurance is a good example. They spoke with James a couple of months ago in Berlin. >> Yeah, this conference in Berlin, yeah. >> That's right, that's right, and their chief data officer has kicked off a formal data literacy training program for their employees, so that they can get business people comfortable enough and trusting the data, and-- >> It's a business culture transformation initiative that's very impressive. >> Yeah. >> How serious they are, and how comprehensive they are. >> But I think we're going to see that become much more common. Pfizer, who's another customer of ours, has taken on a similar initiative, and how do they make all of their employees be able to have access to data, but then also know when to apply it to particular decision-making use cases. And so, we're seeing this need for business people to get a little bit of training, and then for new roles, like information stewards, or data stewards, to come online, folks who can curate the data and the data assets, and help be kind of translators in the organization. >> Stephanie, will there be a need for an algorithm curator, or a model curator, you know, like a model whisperer, to explain how these AI, convolutional, recurrent, >> Yeah. >> whatever, all these neural nets, how, what they actually do, you know. Would there be a need for that going forward? Someone, as a normal human being, who can somehow be bilingual in neural net and in standard language? >> I think, I think so. I mean, I think we've put this pressure on data scientists to be that person. >> Oh my gosh, they're so busy doing their job. How can we expect them to explain, and I mean, >> Right. >> And to spend 100% of their time explaining it to the rest of us? >> And this is the challenge with some of the regulations like GDPR. We aren't set up yet, as organizations, to accommodate this complexity of understanding, and I think that this part of the market is going to move very quickly, so as vendors, one of the things that we can do is continue to help by building out applications that make it easy for information stewardship. How do you lower the barrier for these specialist roles and make it easy for them to do their job by using AI and machine learning, where appropriate, to help scale the manual work, but keeping a human in the loop to certify that data asset, or to add additional explanation, and then taking their work and using AI, machine learning, and automation to propagate that work out throughout the organization, so that everyone then has access to those explanations.
So you're no longer requiring the data scientists to hold, like, I know other organizations that hold office hours, and the data scientist, like, sits at a desk, like you did in college, and people can come in and ask them questions about neural nets. That's just not going to scale at today's pace of business. >> Right, right. >> You know, the term that I used just now, the algorithm or model whisperer, you know, the recommender function that is built into your environment, and similar data catalogs, is a key piece of infrastructure to rank the relevance, you know, of the outputs of the catalog or responses to queries that human beings might make. You know, the recommendation ranking is critically important to help human beings assess the, you know, what's going on in the system, and give them some advice about how to, what avenues to explore, I think, so. >> Yeah, yeah. And that's part of our definition of data catalog. It's not just this inventory of technical metadata. >> That would be boring, and dry, and useless. >> But that's where, >> For most human beings. >> That's where a lot of vendor solutions start, right? >> Yeah. >> And that's an important foundation. >> Yeah, for people who don't live 100% of their work day inside the big data catalog. I hear what you're saying, you know. >> Yeah, so people who want a data catalog, how you make that relevant to the business is you connect those technical assets, that technical metadata, with how the business is actually using this in practice, and how can we have proactive recommendation or the recommendation engines, and certifications, and this information steward then communicating through this platform to others in the organization about how do you interpret this data and how do you use it to actually make business decisions. And I think that's how we're going to close the gap between technology adoption and actual data-driven decision-making, which we're not quite seeing yet. We're only seeing about 30, when they survey, only about 36% of companies are actually confident they're making data-driven decisions, even though there have been, you know, millions, if not billions of dollars that have gone into the data analytics market and investments, and it's because as a manager, I don't quite have the data literacy yet, and I don't quite have the transparency across the rest of the organization to close that trust gap on analytics. >> Here's my feeling, in terms of cultural transformations across businesses in general. I think the legal staff of every company is going to need to get real savvy on using those kinds of tools, like your catalog, with recommendation engines, to support e-discovery, or discovery of the algorithmic decision paths that were taken by their company's products, 'cause they're going to be called by judges and juries, under a subpoena and so forth, and so on, to explain all this, and they're human beings who've got law degrees, but who don't know data, and they need the data environment to help them frame up a case for what we did, and you know, so, we being the company that's involved. >> Yeah, and our politicians. I mean, anyone who's read Cathy's book, Weapons of Math Destruction, there are some great use cases of where, >> Math, M-A-T-H, yeah. >> Yes, M-A-T-H. But there are some great examples of where algorithms can go wrong, and many of our politicians and our representatives in government aren't quite ready to have that conversation.
I think anyone who watched the Zuckerberg hearings, you know, in Congress, saw the gap of knowledge that exists between >> Oh my gosh. >> The legal community, and you know, and the tech community today. So there's a lot of work to be done to get ready for this new future. >> But just getting back to the cultural transformation needed to make data-driven decisions, one of the things you were talking about is getting the managers to trust the data, and we're hearing about what are the best practices to have that happen, in the sense of starting small, being willing to experiment, getting out of the lab, trying to get to insight right away. What are, what would your best advice be, to gain trust in the data? >> Yeah, I think the biggest gap is this issue of transparency. How do you make sure that everyone understands each step of the process and has access to be able to dig into that. If you have a foundation of transparency, it's a lot easier to trust, rather than, you know, right now, we have kind of like the high priesthood of analytics going on, right? (Rebecca laughs) And some believers will believe, but a lot of folks won't, and, you know, the origin story of Alation is really about taking these concepts of the scientific revolution and scientific process and how can we support, for data analysis, those same steps of scientific evaluation of a finding. That means that you need to publish your data set, you need to allow others to rework that data, and come up with their own findings, and you have to be open and foster conversations around data in your organization. One other customer of ours, Meijer, who's a grocery store in the mid-west, and if you're west coast or east coast-based, you might not have heard of them-- >> Oh, Meijers, thrifty acres. I'm from Michigan, and I know them, yeah. >> Gigantic. >> Yeah, there you go. Gigantic grocery chain in the mid-west, and, Joe Oppenheimer there actually introduced a program that he calls the social contract for analytics, and before anyone gets their license to use Tableau, or MicroStrategy, or SAS, or any of the tools internally, he asks those individuals to sign a social contract, which basically says that I'll make my work transparent, I will document what I'm doing so that it's shareable, I'll use certain standards on how I format the data, so that if I come up with a, with a really insightful finding, it can be easily put into production throughout the rest of the organization. So this is a really simple example. His inspiration for that social contract was his high school freshman. He was entering high school and had to sign a social contract, that he wouldn't make fun of the teachers, or the students, you know, >> I love it. >> Very simple basics. >> Yeah, right, right, right. >> I wouldn't make fun of the teacher. >> We all need social contract. >> Oh my gosh, you have to make fun of the teacher. >> I think it was a little more formal than that, in the language, but that was the concept. >> That's violating your civil rights as a student. I'm sorry. (Stephanie laughs) >> Stephanie, always so much fun to have you here. Thank you so much for coming on. >> Thanks for having me. >> It's a pleasure to be here. >> I'm Rebecca Knight, for James Kobielus. We'll have more of theCUBE's live coverage of DataWorks just after this.
SUMMARY :
Rebecca Knight and James Kobielus talk with Stephanie McReynolds of Alation at DataWorks Summit 2018 about keeping humans in the loop of AI and analytics. Building on Kevin Slavin's keynote, McReynolds argues that the real problem is not that algorithms are usually wrong but that they can't be understood, so catalogs should pair machine learning automation with human curation and transparency into data, metadata, business semantics and logic. The conversation covers GDPR and algorithmic accountability, the rise of data literacy programs at customers like Munich Re and Pfizer, new roles such as data stewards, and Meijer's social contract for analytics as a way to build trust in data.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
James Kobielus | PERSON | 0.99+ |
Stephanie McReynolds | PERSON | 0.99+ |
Rebecca Knight | PERSON | 0.99+ |
Rebecca | PERSON | 0.99+ |
Michigan | LOCATION | 0.99+ |
Stephanie | PERSON | 0.99+ |
Berlin | LOCATION | 0.99+ |
James | PERSON | 0.99+ |
100% | QUANTITY | 0.99+ |
Kevin Slavin | PERSON | 0.99+ |
San Jose | LOCATION | 0.99+ |
millions | QUANTITY | 0.99+ |
Cathy | PERSON | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Pfizer | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Munich Re Insurance | ORGANIZATION | 0.99+ |
San Jose, California | LOCATION | 0.99+ |
congress | ORGANIZATION | 0.99+ |
New York City | LOCATION | 0.99+ |
Joe Oppenheimer | PERSON | 0.99+ |
Python | TITLE | 0.99+ |
10 | QUANTITY | 0.99+ |
Meijers | ORGANIZATION | 0.99+ |
Zuckerberg | PERSON | 0.99+ |
16 different algorithms | QUANTITY | 0.99+ |
Weapons of Math Destruction | TITLE | 0.99+ |
GDPR | TITLE | 0.99+ |
One | QUANTITY | 0.98+ |
each step | QUANTITY | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
about 36% | QUANTITY | 0.98+ |
DataWorks Summit 2018 | EVENT | 0.97+ |
Tableau | TITLE | 0.97+ |
about 30 | QUANTITY | 0.97+ |
Hortonworks | ORGANIZATION | 0.97+ |
Alation | ORGANIZATION | 0.96+ |
one algorithm | QUANTITY | 0.96+ |
30 different vendors | QUANTITY | 0.95+ |
billions of dollars | QUANTITY | 0.95+ |
2015 | DATE | 0.95+ |
SaaS | TITLE | 0.94+ |
one | QUANTITY | 0.94+ |
Gigantic | ORGANIZATION | 0.93+ |
first | QUANTITY | 0.9+ |
MicroStrategy | TITLE | 0.88+ |
this morning | DATE | 0.88+ |
couple of months ago | DATE | 0.84+ |
today | DATE | 0.81+ |
Meijer | ORGANIZATION | 0.77+ |
Wiki | TITLE | 0.74+ |
Vice President | PERSON | 0.72+ |
DataWorks | ORGANIZATION | 0.71+ |
Alation | PERSON | 0.53+ |
DataWorks | EVENT | 0.43+ |
Aaron Kalb, Alation | BigData NYC 2017
>> Announcer: Live from midtown Manhattan, it's theCUBE, covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Welcome back everyone, we are here live in New York City, in Manhattan for BigData NYC, our event we've been doing for five years in conjunction with Strata Data, which is formerly Strata Hadoop, which was formerly Strata Conference, formerly Hadoop World. We've been covering the big data space going on ten years now. This is theCUBE. I'm here with Aaron Kalb, who's Head of Product and co-founder at Alation. Welcome to theCUBE. >> Aaron Kalb: Thank you so much for having me. >> Great to have you on, so co-founder, head of product, love these conversations because you're also co-founder, so it's your company, you got a lot of equity interest in that, but also as head of product you get to have the 20-mile stare on what the future looks like, while inventing it today, bringing it to market. So you guys have an interesting take on the collaboration of data. Talk about what that means, what's the motivation behind that positioning, what's the core thesis around Alation? >> Totally, so the thing we've observed is a lot of people working in the data space are concerned about the data itself. How can we make it cheaper to store, faster to process. And we're really concerned with the human side of it. Data's only valuable if it's used by people; how do we help people find the data, understand the data, trust in the data, and that involves a mix of algorithmic approaches and also human collaboration, both human to human and human to computer, to get that all organized. >> John Furrier: It's interesting you have a symbolic systems background from Stanford, worked at Apple, involved in Siri, all this kind of futuristic stuff. You can't go a day without hearing about Alexa and voice-activated assistants, you've got Siri. AI is taking a really big part of this. Obviously all of the hype right now, but what it means is the software is going to play a key role as an interface. And this symbolic systems background almost brings on this neural network kind of vibe, where objects, data, play a critical role. >> Oh, absolutely, yeah, and in the early days when we were co-founding the company, we talked about what is Siri for the enterprise? Right, I was you know very excited to work on Siri, and it's really a kind of fun gimmick, and it's really useful when you're in the car, your hands are covered in cookie dough, but if you could answer questions like what was revenue last quarter in the UK and get the right answer fast, and have that dialogue, oh, do you mean fiscal quarter or calendar quarter? Do you mean UK including Ireland, or whatever it is. That would really enable better decisions and a better outcome. >> I was worried that Siri might do something here. Hey Siri, oh there it is, okay be careful, I don't want it to answer and take over my job. >> (laughs) >> Automation will take away the job, maybe Siri will be doing interviews. Okay let's take a step back. You guys are doing well as a startup, you've got some great funding, great investors. How are you guys doing on the product? Give us a quick highlight on where you guys are, obviously this is BigData NYC, a lot going on, it's Manhattan, you've got financial services, big industry here. You've got the Strata Data event, which is the classic Hadoop industry that's morphed into data, which really is overlapping with cloud, IoT, application development, all kind of coming together.
How do you guys fit into that world? >> Yeah, absolutely, so the idea of the data lake is kind of interesting. Psychologically it's sort of a hoarder mentality, oh, everything I've ever had I want to keep in the attic, because I might need it one day. Great opportunity to involve these new streams of data, with IoT and whatnot, but just 'cause you can get to it physically doesn't mean it's easy to find the thing you want, the needle in all that big haystack, and to distinguish, from among all the different assets that are available, which is the one that is actually trustworthy for your need. So we find that all these trends make the need for a catalog, to kind of organize that information and get what you want, all the more valuable. >> This has come up a lot, I want to get into the integration piece and how you're dealing with your partnerships, but the data lake integration has been huge, and having the catalog has come up, has been the buzz. Foundationally, if you will, saying the catalog is important. Why is it important to do the catalog work up front, with a lot of the data strategies? >> It's a great question, so, we see data cataloging as step zero. Before you can prep the data in a tool like Trifacta, Paxata, or Kylo. Before you can visualize it in a tool like Tableau, or MicroStrategy. Before you can do some sort of cool prediction of what's going to happen in the future, with a data science engine, before any of that. These are all garbage in, garbage out processes. The step zero is find the relevant data. Understand it so you can get it in the right format. Trust that it's good, and then you can do whatever comes next. >> And governance has become a key thing here, we've heard of the regulations, GDPR outside of the United States, but that's also going to have a long reach over into the United States and have an impact. So these little decisions, and there's going to be an Equifax someday out there. Another one's probably going to come around the corner. How does the policy injection change the catalog equation? A lot of people are building machine learning algorithms on top of catalogs, and they're worried they might have to rewrite everything. How do you balance the trade-off between good catalog design and flexibility on the algorithm side? >> Totally, yes, it's a complicated thing with governance and consumption, right. There's people who are concerned with keeping the data safe, and there are people concerned with turning that data into real value, and these can seem to be at odds. What we find is actually a catalog is a foundation for both, and they are not as opposed as they seem. What Alation fundamentally does is we make a map of where the data is, who's using what data, when, how. And that can actually be helpful if your goal is to say, let's follow in the footsteps of the best analysts and generate more insights, or if you want to say, hey, this data is being used a lot, let's make sure it's being used correctly. >> And by the right people. >> And by the right people, exactly. >> Equifax, they were fishing that pond dry for months, months before it actually happened. With good tools like this they might have seen this, right? Am I getting it right? >> That's exactly right, how can you observe what's going on to make sure it's compliant and that the answers are correct and that it's happening quickly and driving results. >> So in a way you're taking the collective intelligence of the user behavior and using that to understand what to do with the data modeling? >> That's exactly right.
We want to make each person in your organization as knowledgeable as all of their peers combined. >> So the benefit then for the customer would be if you see something that's developing you can double down on it. And if the users are using a lot of data, then you can provision more technology, more software. >> Absolutely, absolutely. It's sort of like when I was going to Stanford, there was a place where the grass was all dead, because people were riding their bikes diagonally across it. And then somebody smart said, we're going to put a real gravel path there. So the infrastructure should follow the usage, instead of being something you try to enforce on people. >> It's a classic design meme that goes around. Good design is here; the more effective design is the path. >> Exactly. >> So let's get into the integration. One of the hot topics here this year, obviously besides cloud and AI, with cloud really being more the driver, the tailwind for the growth, and AI being more the futuristic headroom, is integration. You guys have some partnerships that you announced around integration. What are some of the key ones, and why are they important? >> Absolutely. So there have been attempts in the past to centralize all the data in one place: have one warehouse or one lake, have one BI tool. And those generally fail, for different reasons; different teams pick different stacks that work for them. What we think is important is the single source of reference: one hub with spokes out to all those different points. If you think about it, it's like Google. It's one index of the whole web, even though the web is distributed all over the place. To make that happen it's very important that we have partnerships to get data in from various sources. So we have partnerships with database vendors, with Cloudera and Hortonworks, with different BI tools. What's new are a few things. One is with Cloudera Navigator. They have great technical metadata around security and lineage over HDFS, and that's a way to bolster our catalog, to go even deeper into what's happening in the files before things get surfaced, and to go further in places where we already have a deeper offering today. >> So it's almost a connector to them in a way, you kind of share data. >> That's exactly right. We have a lot of different connectors; this is one new one that we have. Another, go ahead. >> I was going to say, go ahead, continue. >> I was just going to say another place that is exciting is data prep tools. Trifacta and Paxata are both places where you can find and understand data in Alation and then begin to manipulate it in those tools. We announced with Paxata yesterday the ability to click to profile, so if you want to actually see what's in some raw compressed Avro file, you can see that in one click. >> It's interesting, Paxata has really been almost lapping Trifacta, because they were the leader in my mind, but now you've got like a NASCAR race going on between the two firms, because data wrangling is a huge issue. Data prep is where everyone is stuck right now; they just want to do the data science. It's interesting. >> They are both amazing companies and I'm happy to partner with both. And actually Trifacta and Alation have a lot of joint customers we're psyched to work with as well. I think what's interesting is how data prep is being redefined, and this is beginning to happen with analyst definitions of that field.
It isn't just preparing the data to be used, getting it cleaned and shaped; it's also preparing the humans to use the data, giving them the confidence, the tools, the knowledge to know how to manipulate it. >> And it's great progress. So the question I wanted to ask is about the other big trend here. It's kind of a subtext in this show, it's not really front and center, but we've been seeing it emerge as a concept in the cloud world: on premise versus cloud. On premise, a lot of people bring the dev ops model in and say, I may move to the cloud for bursting and some native applications, but at the end of the day there is a lot of work going on on premise. A lot of companies are kind of cleaning house, retooling, replatforming, whatever you want to call it, resetting. They are kind of getting their house in order to do on-prem cloud ops, meaning a business model of cloud operations on site. A lot of people are doing that, and that will impact the story; it's going to impact some of the server modeling. That's a hot trend. How do you guys deal with the on premise cloud dynamic? >> Totally. So we just want to do what's right for the customer, so we deploy both on prem and in the cloud, and then from wherever the Alation server is, it will point to usually a mix of sources, some that are in the cloud, like Redshift or S3, often with Amazon today, and also sources that are on prem. I do think I'm seeing a trend more and more toward the cloud, and people migrating from HDFS to S3 is one thing we hear a lot about, especially at Strata with its Hadoop interest. But I think what's happening is people are realizing, as each Equifax in turn happens, that this old wild west model, where you surround your bank with people on horseback because it's physically in one place, doesn't hold anymore. With data it isn't like that; most people are saying I'd rather have the A+ teams at Salesforce or Amazon or Google be responsible for my security than the people I can get over in the midwest. >> And the Paxata guys have loved the term Data Democracy, because that is really democratization, making the data free but also having the governance thing. So tell me about the data lake governance, because I've never loved the term data lake, I think it's more of a data ocean, but now you see data lake, data lake, data lake. Are they just silos of data lakes happening now? Are people trying to connect them? That's key, so that's been a key trend here. How do you handle the governance across multiple data lakes? >> That's right, so the key is to have that single source of reference, so that regardless of which lake or warehouse, or little siloed SQL Server somewhere, you can search in a single portal and find that thing no matter where it is. >> John: Can you guys do that? >> We can do that, yeah. I think the metaphor for people who haven't seen it really is Google. If you think about it, you don't even know what physical server a webpage is hosted from. >> Data lakes should just be invisible. >> Exactly. >> So you're interfacing with multiple data lakes, that's a value proposition for you. >> That's right, so it could be on prem or in the cloud, multi-cloud. >> Can you share an example of a customer that uses that and kind of how it's laid out? >> Absolutely. One great example of an interesting data environment is eBay. They have the biggest Teradata warehouse in the world.
They also have I believe two huge data lakes, they have hive on top of that, and Presto is used to sort of virtualize it across a mixture of teradata, and hive and then direct Presto query It gets very complicated, and they have, they are a very data driven organization, so they have people who are product owners who are in jobs where data isn't in their job title and they know how to look at excel and look at numbers and make choices, but they aren't real data people. Alation provides that accessibility so that they can understand it. >> We used to call the Hadoop world the car show for the data world, where for a long time it was about the engine what was doing what, and then it became, what's the car, and now how's it drive. Seeing that same evolution now where all that stuff has to get done under the hood. >> Aaron: Exactly. >> But there are still people who care about that, right. They are the mechanics, they are the plumbers, whatever you want to call them, but then the data science are the guys really driving things and now end users potentially, and even applications bots or what nots. It seems to evolve, that's where we're kind of seeing the show change a little bit, and that's kind of where you see some of the AI things. I want to get your thoughts on how you or your guys are using AI, how you see AI, if it's AI at all if it's just machine learning as a baby step into AI, we all know what AI could be, but it's really just machine learning now. How do you guys use quote AI and how has it evolved? >> It's a really insightful question and a great metaphor that I love. If you think about it, it used to be how do you build the car, and now I can drive the car even though I couldn't build it or even fix it, and soon I don't even have to drive the car, the car will just drive me, all I have to know is where I want to go. That's sortof the progression that we see as well. There's a lot of talk about deep learning, all these different approaches, and it's super interesting and exciting. But I think even more interesting than the algorithms are the applications. And so for us it's like today how do we get that turn by turn directions where we say turn left at the light if you want to get there And eventually you know maybe the computer can do it for you The thing that is also interesting is to make these algorithms work no matter how good your algorithm is it's all based on the quality of your training data. >> John: Which is a historical data. Historical data in essence the more historical data you have you need that to train the data. >> Exactly right, and we call this behavior IO how do we look at all the prior human behavior to drive better behavior in the future. And I think the key for us is we don't want to have a bunch of unpaid >> John: You can actually get that URL behavioral IO. >> We should do it before it's too late (Both laugh) >> We're live right now, go register that Patrick. >> Yeah so the goal is we don't want to have a bunch of unpaid interns trying to manually attack things, that's error prone and that's slow. I look at things like Luis von Ahn over at CMU, he does a thing where as you're writing in a CAPTCHA to get an email account you're also helping Google recognize a hard to read address or a piece of text from books. 
>> John: If you shoot the arrow forward, you just take this kind of forward, you almost think augmented reality is a pretext to what we might see for what you're talking about and ultimately VR are you seeing some of the use cases for virtual reality be very enterprise oriented or even end consumer. I mean Tom Brady the best quarterback of all time, he uses virtual reality to play the offense virtually before every game, he's a power user, in pharma you see them using virtual reality to do data mining without being in the lab, so lab tests. So you're seeing augmentation coming in to this turn by turn direction analogy. >> It's exactly, I think it's the other half of it. So we use AI, we use techniques to get great data from people and then we do extra work watching their behavior to learn what's right. And to figure out if there are recommendations, but then you serve those recommendations, either it's Google glasses it appears right there in your field of view. We just have to figure out how do we make sure, that in a moment of you're making a dashboard, or you're making a choice that you have that information right on hand. >> So since you're a technical geek, and a lot of folks would love to talk about this, so I'll ask you a tough question cause this is something everyone is trying to chase for the holy grail. How do you get the right piece of data at the right place at the right time, given that you have all these legacy silos, latencies and network issues as well, so you've got a data warehouse, you've got stuff in cold storage, and I've got an app and I'm doing something, there could be any points of data in the world that could be in milliseconds potentially on my phone or in my device my internet of thing wearable. How do you make that happen? Because that's the struggle, at the same time keep all the compliance and all the overhead involved, is it more compute, is it an architectural challenge how do you view that because this is the big challenge of our time. >> Yeah again I actually think it's the human challenge more than the technology challenge. It is true that there is data all over the place kind of gathering dust, but again if you think about Google, billions of web pages, I only care about the one I'm about to use. So for us it's really about being in that moment of writing a query, building a chart, how do we say in that moment, hey you're using an out of date definition of profit. Or hey the database you chose to use, the one thing you chose out of the millions that is actually is broken and stale. And we have interventions to do that with our partners and through our own first party apps that actually change how decisions get made at companies. >> So to make that happen, if I imagine it, you'd have to need access to the data, and then write software that is contextually aware to then run, compute, in context to the user interaction. >> It's exactly right, back to the turn by turn directions concept you have to know both where you're trying to go and where you are. And so for us that can be the from where I'm writing a Sequel statement after join we can suggest the table most commonly joined with that, but also overlay onto that the fact that the most commonly joined table was deprecated by a data steward data curator. So that's the moment that we can change the behavior from bad to good. 
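A minimal sketch of the "turn by turn directions" behavior just described: given the table a user is querying, suggest the table most often joined with it in past queries, and warn when a steward has deprecated the popular choice. The co-occurrence counts and the deprecation note are made-up examples, not real metadata.

```python
# Toy join recommender: most-commonly-joined table, with a deprecation check.
# Counts and the deprecation entry are invented for illustration.
join_history = {
    ("web.clicks", "web.sessions_v1"): 412,   # historically popular, now stale
    ("web.clicks", "web.sessions"): 287,
    ("web.clicks", "crm.accounts"): 35,
}
deprecated = {"web.sessions_v1": "superseded by web.sessions (per data steward)"}

def suggest_join(current_table):
    candidates = [
        (count, pair[1] if pair[0] == current_table else pair[0])
        for pair, count in join_history.items()
        if current_table in pair
    ]
    for count, other in sorted(candidates, reverse=True):
        if other in deprecated:
            print(f"skipping {other}: {deprecated[other]}")
            continue
        return other
    return None

print(suggest_join("web.clicks"))   # warns about web.sessions_v1, suggests web.sessions
```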
>> So a chief data officer out there, we've got to wrap up, but I wanted to ask one final question, There's a chief data officer out there they might be empowered or they might be just a CFO assistant that's managing compliance, either way, someone's going to be empowered in an organization to drive data science and data value forward because there is so much proof that data science works. From military to play you're seeing examples where being data driven actually has benefits. So everyone is trying to get there. How do you explain the vision of Alation to that prospect? Because they have so much to select from, there's so much noise, there's like, we call it the tool shed out there, there's like a zillion tools out there there's like a zillion platforms, some tools are trying to turn into something else, a hammer is trying to be a lawnmower. So they've got to be careful on who the select, so what's the vision of Alation to that chief data officer, or that person in charge of analytics to scale operational analytics. >> Absolutely so we say to the CDO we have a shared vision for this place where your company is making decisions based on data, instead of based on gut, or expensive consultants months too late. And the way we get there, the reason Alation adds value is, we're sort of the last tool you have to buy, because with this lake mentality, you've got your tool shed with all the tools, you've got your library with all the books, but they're just in a pile on the floor, if you had a tool that had everything organized, so you just said hey robot, I need an hammer and this size nail and this text book on this set of information and it could just come to you, and it would be correct and it would be quick, then you could actually get value out of all the expense you've already put in this infrastructure, that's especially true on the lake. >> And also tools describe the way the works done so in that model tools can be in the tool shed no one needs to know it's in there. >> Aaron: Exactly. >> You guys can help scale that. Well congratulations and just how far along are you guys in terms of number of employees, how many customers do you have? If you can share that, I don't know if that's confidential or what not >> Absolutely, so we're small but growing very fast planning to double in the next year, and in terms of customers, we've got 85 customers including some really big names. I mentioned eBay, Pfizer, Safeway Albertsons, Tesco, Meijer. >> And what are they saying to you guys, why are they buying, why are they happy? >> They share that same vision of a more data driven enterprise, where humans are empowered to find out, understand, and trust data to make more informed choices for the business, and that's why they come and come back. >> And that's the product roadmap, ethos, for you guys that's the guiding principle? >> Yeah the ultimate goal is to empower humans with information. >> Alright Aaron thanks for coming on the Cube. Aaron Kalb, co-founder head of product for Alation here in New York City for BigData NYC and also Strata Data I'm John Furrier thanks for watching. We'll be right back with more after this short break.
Wikibon Presents: Software is Eating the Edge | The Entangling of Big Data and IIoT
>> So as folks make their way over from Javits I'm going to give you the least interesting part of the evening and that's my segment in which I welcome you here, introduce myself, lay out what what we're going to do for the next couple of hours. So first off, thank you very much for coming. As all of you know Wikibon is a part of SiliconANGLE which also includes theCUBE, so if you look around, this is what we have been doing for the past couple of days here in the TheCUBE. We've been inviting some significant thought leaders from over on the show and in incredibly expensive limousines driven them up the street to come on to TheCUBE and spend time with us and talk about some of the things that are happening in the industry today that are especially important. We tore it down, and we're having this party tonight. So we want to thank you very much for coming and look forward to having more conversations with all of you. Now what are we going to talk about? Well Wikibon is the research arm of SiliconANGLE. So we take data that comes out of TheCUBE and other places and we incorporated it into our research. And work very closely with large end users and large technology companies regarding how to make better decisions in this incredibly complex, incredibly important transformative world of digital business. What we're going to talk about tonight, and I've got a couple of my analysts assembled, and we're also going to have a panel, is this notion of software is eating the Edge. Now most of you have probably heard Marc Andreessen, the venture capitalist and developer, original developer of Netscape many years ago, talk about how software's eating the world. Well, if software is truly going to eat the world, it's going to eat at, it's going to take the big chunks, big bites at the Edge. That's where the actual action's going to be. And what we want to talk about specifically is the entangling of the internet or the industrial internet of things and IoT with analytics. So that's what we're going to talk about over the course of the next couple of hours. To do that we're going to, I've already blown the schedule, that's on me. But to do that I'm going to spend a couple minutes talking about what we regard as the essential digital business capabilities which includes analytics and Big Data, and includes IIoT and we'll explain at least in our position why those two things come together the way that they do. But I'm going to ask the august and revered Neil Raden, Wikibon analyst to come on up and talk about harvesting value at the Edge. 'Cause there are some, not now Neil, when we're done, when I'm done. So I'm going to ask Neil to come on up and we'll talk, he's going to talk about harvesting value at the Edge. And then Jim Kobielus will follow up with him, another Wikibon analyst, he'll talk specifically about how we're going to take that combination of analytics and Edge and turn it into the new types of systems and software that are going to sustain this significant transformation that's going on. And then after that, I'm going to ask Neil and Jim to come, going to invite some other folks up and we're going to run a panel to talk about some of these issues and do a real question and answer. So the goal here is before we break for drinks is to create a community feeling within the room. That includes smart people here, smart people in the audience having a conversation ultimately about some of these significant changes so please participate and we look forward to talking about the rest of it. 
All right, let's get going! What is digital business? One of the nice things about being an analyst is that you can reach back on people who were significantly smarter than you and build your points of view on the shoulders of those giants including Peter Drucker. Many years ago Peter Drucker made the observation that the purpose of business is to create and keep a customer. Not better shareholder value, not anything else. It is about creating and keeping your customer. Now you can argue with that, at the end of the day, if you don't have customers, you don't have a business. Now the observation that we've made, what we've added to that is that we've made the observation that the difference between business and digital business essentially is one thing. That's data. A digital business uses data to differentially create and keep customers. That's the only difference. If you think about the difference between taxi cab companies here in New York City, every cab that I've been in in the last three days has bothered me about Uber. The reason, the difference between Uber and a taxi cab company is data. That's the primary difference. Uber uses data as an asset. And we think this is the fundamental feature of digital business that everybody has to pay attention to. How is a business going to use data as an asset? Is the business using data as an asset? Is a business driving its engagement with customers, the role of its product et cetera using data? And if they are, they are becoming a more digital business. Now when you think about that, what we're really talking about is how are they going to put data to work? How are they going to take their customer data and their operational data and their financial data and any other kind of data and ultimately turn that into superior engagement or improved customer experience or more agile operations or increased automation? Those are the kinds of outcomes that we're talking about. But it is about putting data to work. That's fundamentally what we're trying to do within a digital business. Now that leads to an observation about the crucial strategic business capabilities that every business that aspires to be more digital or to be digital has to put in place. And I want to be clear. When I say strategic capabilities I mean something specific. When you talk about, for example technology architecture or information architecture there is this notion of what capabilities does your business need? Your business needs capabilities to pursue and achieve its mission. And in the digital business these are the capabilities that are now additive to this core question, ultimately of whether or not the company is a digital business. What are the three capabilities? One, you have to capture data. Not just do a good job of it, but better than your competition. You have to capture data better than your competition. In a way that is ultimately less intrusive on your markets and on your customers. That's in many respects, one of the first priorities of the internet of things and people. The idea of using sensors and related technologies to capture more data. Once you capture that data you have to turn it into value. You have to do something with it that creates business value so you can do a better job of engaging your markets and serving your customers. And that essentially is what we regard as the basis of Big Data. 
Including operations, including financial performance and everything else, but ultimately it's taking the data that's being captured and turning it into value within the business. The last point here is that once you have generated a model, or an insight or some other resource that you can act upon, you then have to act upon it in the real world. We call that systems of agency, the ability to enact based on data. Now I want to spend just a second talking about systems of agency 'cause we think it's an interesting concept and it's something Jim Kobielus is going to talk about a little bit later. When we say systems of agency, what we're saying is increasingly machines are acting on behalf of a brand. Or systems, combinations of machines and people are acting on behalf of the brand. And this whole notion of agency is the idea that ultimately these systems are now acting as the business's agent. They are at the front line of engaging customers. It's an extremely rich proposition that has subtle but crucial implications. For example I was talking to a senior decision maker at a business today and they made a quick observation, they talked about they, on their way here to New York City they had followed a woman who was going through security, opened up her suitcase and took out a bird. And then went through security with the bird. And the reason why I bring this up now is as TSA was trying to figure out how exactly to deal with this, the bird started talking and repeating things that the woman had said and many of those things, in fact, might have put her in jail. Now in this case the bird is not an agent of that woman. You can't put the woman in jail because of what the bird said. But increasingly we have to ask ourselves as we ask machines to do more on our behalf, digital instrumentation and elements to do more on our behalf, it's going to have blow back and an impact on our brand if we don't do it well. I want to draw that forward a little bit because I suggest there's going to be a new lifecycle for data. And the way that we think about it is we have the internet or the Edge which is comprised of things and crucially people, using sensors, whether they be smaller processors in control towers or whether they be phones that are tracking where we go, and this crucial element here is something that we call information transducers. Now a transducer in a traditional sense is something that takes energy from one form to another so that it can perform new types of work. By information transducer I essentially mean it takes information from one form to another so it can perform another type of work. This is a crucial feature of data. One of the beauties of data is that it can be used in multiple places at multiple times and not engender significant net new costs. It's one of the few assets that you can say about that. So the concept of an information transducer's really important because it's the basis for a lot of transformations of data as data flies through organizations. So we end up with the transducers storing data in the form of analytics, machine learning, business operations, other types of things, and then it goes back and it's transduced, back into to the real world as we program the real world and turning into these systems of agency. So that's the new lifecycle. And increasingly, that's how we have to think about data flows. Capturing it, turning it into value and having it act on our behalf in front of markets. 
That could have enormous implications for how ultimately money is spent over the next few years. So Wikibon does a significant amount of market research in addition to advising our large user customers. And that includes doing studies on cloud, public cloud, but also studies on what's happening within the analytics world. And if you take a look at it, what we basically see happening over the course of the next few years is significant investments in software and also services to get the word out. But we also expect there's going to be a lot of hardware. A significant amount of hardware that's ultimately sold within this space. And that's because of something that we call true private cloud. This concept of ultimately a business increasingly being designed and architected around the idea of data assets means that the reality, the physical realities of how data operates, how much it costs to store it or move it, the issues of latency, the issues of intellectual property protection as well as things like the regulatory regimes that are being put in place to govern how data gets used in between locations. All of those factors are going to drive increased utilization of what we call true private cloud. On premise technologies that provide the cloud experience but act where the data naturally needs to be processed. I'll come a little bit more to that in a second. So we think that it's going to be a relatively balanced market, a lot of stuff is going to end up in the cloud, but as Neil and Jim will talk about, there's going to be an enormous amount of analytics that pulls an enormous amount of data out to the Edge 'cause that's where the action's going to be. Now one of the things I want to also reveal to you is we've done a fair amount of data, we've done a fair amount of research around this question of where or how will data guide decisions about infrastructure? And in particular the Edge is driving these conversations. So here is a piece of research that one of our cohorts at Wikibon did, David Floyer. Taking a look at IoT Edge cost comparisons over a three year period. And it showed on the left hand side, an example where the sensor towers and other types of devices were streaming data back into a central location in a wind farm, stylized wind farm example. Very very expensive. Significant amounts of money end up being consumed, significant resources end up being consumed by the cost of moving the data from one place to another. Now this is even assuming that latency does not become a problem. The second example that we looked at is if we kept more of that data at the Edge and processed at the Edge. And literally it is a 85 plus percent cost reduction to keep more of the data at the Edge. Now that has enormous implications, how we think about big data, how we think about next generation architectures, et cetera. But it's these costs that are going to be so crucial to shaping the decisions that we make over the next two years about where we put hardware, where we put resources, what type of automation is possible, and what types of technology management has to be put in place. Ultimately we think it's going to lead to a structure, an architecture in the infrastructure as well as applications that is informed more by moving cloud to the data than moving the data to the cloud. That's kind of our fundamental proposition is that the norm in the industry has been to think about moving all data up to the cloud because who wants to do IT? It's so much cheaper, look what Amazon can do. 
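As a back-of-the-envelope companion to the wind-farm comparison described above, the sketch below contrasts streaming all raw telemetry to a central cloud with filtering it at the Edge and shipping only aggregates. Every rate and price here is an invented assumption, and it models only the data-transfer piece of the cost, not the full 85-plus percent figure from the study.

```python
# Illustrative edge-vs-cloud transfer-cost comparison. All figures are
# assumptions for the sake of the arithmetic, not numbers from the study.
sensors           = 200         # devices on the site
bytes_per_sec     = 10_000      # raw telemetry per sensor
egress_per_gb     = 0.09        # assumed WAN/cloud transfer price, $/GB
reduction_at_edge = 0.99        # fraction of raw data filtered out locally
seconds_per_month = 30 * 24 * 3600

raw_gb          = sensors * bytes_per_sec * seconds_per_month / 1e9
ship_everything = raw_gb * egress_per_gb
ship_aggregates = raw_gb * (1 - reduction_at_edge) * egress_per_gb

print(f"raw data per month:   {raw_gb:10,.0f} GB")
print(f"stream it all:       ${ship_everything:10,.2f}")
print(f"process at the edge: ${ship_aggregates:10,.2f}")
print(f"transfer saving:      {100 * (1 - ship_aggregates / ship_everything):.0f}%")
```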
Or what AWS can do. All true statements. Very very important in many respects. But most businesses today are starting to rethink that simple proposition and asking themselves do we have to move our business to the cloud, or can we move the cloud to the business? And increasingly what we see happening as we talk to our large customers about this, is that the cloud is being extended out to the Edge, we're moving the cloud and cloud services out to the business. Because of economic reasons, intellectual property control reasons, regulatory reasons, security reasons, any number of other reasons. It's just a more natural way to deal with it. And of course, the most important reason is latency. So with that as a quick backdrop, if I may quickly summarize, we believe fundamentally that the difference today is that businesses are trying to understand how to use data as an asset. And that requires an investment in new sets of technology capabilities that are not cheap, not simple and require significant thought, a lot of planning, lot of change within an IT and business organizations. How we capture data, how we turn it into value, and how we translate that into real world action through software. That's going to lead to a rethinking, ultimately, based on cost and other factors about how we deploy infrastructure. How we use the cloud so that the data guides the activity and not the choice of cloud supplier determines or limits what we can do with our data. And that's going to lead to this notion of true private cloud and elevate the role the Edge plays in analytics and all other architectures. So I hope that was perfectly clear. And now what I want to do is I want to bring up Neil Raden. Yes, now's the time Neil! So let me invite Neil up to spend some time talking about harvesting value at the Edge. Can you see his, all right. Got it. >> Oh boy. Hi everybody. Yeah, this is a really, this is a really big and complicated topic so I decided to just concentrate on something fairly simple, but I know that Peter mentioned customers. And he also had a picture of Peter Drucker. I had the pleasure in 1998 of interviewing Peter and photographing him. Peter Drucker, not this Peter. Because I'd started a magazine called Hired Brains. It was for consultants. And Peter said, Peter said a number of really interesting things to me, but one of them was his definition of a customer was someone who wrote you a check that didn't bounce. He was kind of a wag. He was! So anyway, he had to leave to do a video conference with Jack Welch and so I said to him, how do you charge Jack Welch to spend an hour on a video conference? And he said, you know I have this theory that you should always charge your client enough that it hurts a little bit or they don't take you seriously. Well, I had the chance to talk to Jack's wife, Suzie Welch recently and I told her that story and she said, "Oh he's full of it, Jack never paid "a dime for those conferences!" (laughs) So anyway, all right, so let's talk about this. To me, things about, engineered things like the hardware and network and all these other standards and so forth, we haven't fully developed those yet, but they're coming. As far as I'm concerned, they're not the most interesting thing. The most interesting thing to me in Edge Analytics is what you're going to get out of it, what the result is going to be. Making sense of this data that's coming. 
And while we're on data, something I've been thinking a lot lately because everybody I've talked to for the last three days just keeps talking to me about data. I have this feeling that data isn't actually quite real. That any data that we deal with is the result of some process that's captured it from something else that's actually real. In other words it's proxy. So it's not exactly perfect. And that's why we've always had these problems about customer A, customer A, customer A, what's their definition? What's the definition of this, that and the other thing? And with sensor data, I really have the feeling, when companies get, not you know, not companies, organizations get instrumented and start dealing with this kind of data what they're going to find is that this is the first time, and I've been involved in analytics, I don't want to date myself, 'cause I know I look young, but the first, I've been dealing with analytics since 1975. And everything we've ever done in analytics has involved pulling data from some other system that was not designed for analytics. But if you think about sensor data, this is data that we're actually going to catch the first time. It's going to be ours! We're not going to get it from some other source. It's going to be the real deal, to the extent that it's the real deal. Now you may say, ya know Neil, a sensor that's sending us information about oil pressure or temperature or something like that, how can you quarrel with that? Well, I can quarrel with it because I don't know if the sensor's doing it right. So we still don't know, even with that data, if it's right, but that's what we have to work with. Now, what does that really mean? Is that we have to be really careful with this data. It's ours, we have to take care of it. We don't get to reload it from source some other day. If we munge it up it's gone forever. So that has, that has very serious implications, but let me, let me roll you back a little bit. The way I look at analytics is it's come in three different eras. And we're entering into the third now. The first era was business intelligence. It was basically built and governed by IT, it was system of record kind of reporting. And as far as I can recall, it probably started around 1988 or at least that's the year that Howard Dresner claims to have invented the term. I'm not sure it's true. And things happened before 1988 that was sort of like BI, but 88 was when they really started coming out, that's when we saw BusinessObjects and Cognos and MicroStrategy and those kinds of things. The second generation just popped out on everybody else. We're all looking around at BI and we were saying why isn't this working? Why are only five people in the organization using this? Why are we not getting value out of this massive license we bought? And along comes companies like Tableau doing data discovery, visualization, data prep and Line of Business people are using this now. But it's still the same kind of data sources. It's moved out a little bit, but it still hasn't really hit the Big Data thing. Now we're in third generation, so we not only had Big Data, which has come and hit us like a tsunami, but we're looking at smart discovery, we're looking at machine learning. We're looking at AI induced analytics workflows. And then all the natural language cousins. You know, natural language processing, natural language, what's? Oh Q, natural language query. Natural language generation. Anybody here know what natural language generation is? 
Yeah, so what you see now is you do some sort of analysis and that tool comes up and says this chart is about the following and it used the following data, and it's blah blah blah blah blah. I think it's kind of wordy and it's going to refined some, but it's an interesting, it's an interesting thing to do. Now, the problem I see with Edge Analytics and IoT in general is that most of the canonical examples we talk about are pretty thin. I know we talk about autonomous cars, I hope to God we never have them, 'cause I'm a car guy. Fleet Management, I think Qualcomm started Fleet Management in 1988, that is not a new application. Industrial controls. I seem to remember, I seem to remember Honeywell doing industrial controls at least in the 70s and before that I wasn't, I don't want to talk about what I was doing, but I definitely wasn't in this industry. So my feeling is we all need to sit down and think about this and get creative. Because the real value in Edge Analytics or IoT, whatever you want to call it, the real value is going to be figuring out something that's new or different. Creating a brand new business. Changing the way an operation happens in a company, right? And I think there's a lot of smart people out there and I think there's a million apps that we haven't even talked about so, if you as a vendor come to me and tell me how great your product is, please don't talk to me about autonomous cars or Fleet Managing, 'cause I've heard about that, okay? Now, hardware and architecture are really not the most interesting thing. We fell into that trap with data warehousing. We've fallen into that trap with Big Data. We talk about speeds and feeds. Somebody said to me the other day, what's the narrative of this company? This is a technology provider. And I said as far as I can tell, they don't have a narrative they have some products and they compete in a space. And when they go to clients and the clients say, what's the value of your product? They don't have an answer for that. So we don't want to fall into this trap, okay? Because IoT is going to inform you in ways you've never even dreamed about. Unfortunately some of them are going to be really stinky, you know, they're going to be really bad. You're going to lose more of your privacy, it's going to get harder to get, I dunno, mortgage for example, I dunno, maybe it'll be easier, but in any case, it's not going to all be good. So let's really think about what you want to do with this technology to do something that's really valuable. Cost takeout is not the place to justify an IoT project. Because number one, it's very expensive, and number two, it's a waste of the technology because you should be looking at, you know the old numerator denominator thing? You should be looking at the numerators and forget about the denominators because that's not what you do with IoT. And the other thing is you don't want to get over confident. Actually this is good advice about anything, right? But in this case, I love this quote by Derek Sivers He's a pretty funny guy. He said, "If more information was the answer, "then we'd all be billionaires with perfect abs." I'm not sure what's on his wishlist, but you know, I would, those aren't necessarily the two things I would think of, okay. Now, what I said about the data, I want to explain some more. Big Data Analytics, if you look at this graphic, it depicts it perfectly. It's a bunch of different stuff falling into the funnel. All right? It comes from other places, it's not original material. 
And when it comes in, it's always used as second hand data. Now what does that mean? That means that you have to figure out the semantics of this information and you have to find a way to put it together in a way that's useful to you, okay. That's Big Data. That's where we are. How is that different from IoT data? It's like I said, IoT is original. You can put it together any way you want because no one else has ever done that before. It's yours to construct, okay. You don't even have to transform it into a schema because you're creating the new application. But the most important thing is you have to take care of it 'cause if you lose it, it's gone. It's the original data. It's the same way, in operational systems for a long long time we've always been concerned about backup and security and everything else. You better believe this is a problem. I know a lot of people think about streaming data, that we're going to look at it for a minute, and we're going to throw most of it away. Personally I don't think that's going to happen. I think it's all going to be saved, at least for a while. Now, the governance and security, oh, by the way, I don't know where you're going to find a presentation where somebody uses a newspaper clipping about Vladimir Lenin, but here it is, enjoy yourselves. I believe that when people think about governance and security today they're still thinking along the same grids that we thought about it all along. But this is very very different and again, I'm sorry I keep thrashing this around, but this is treasured data that has to be carefully taken care of. Now when I say governance, my experience has been over the years that governance is something that IT does to make everybody's lives miserable. But that's not what I mean by governance today. It means a comprehensive program to really secure the value of the data as an asset. And you need to think about this differently. Now the other thing is you may not get to think about it differently, because some of the stuff may end up being subject to regulation. And if the regulators start regulating some of this, then that'll take some of the degrees of freedom away from you in how you put this together, but you know, that's the way it works. Now, machine learning, I think I told somebody the other day that claims about machine learning in software products are as common as twisters in trail parks. And a lot of it is not really what I'd call machine learning. But there's a lot of it around. And I think all of the open source machine learning and artificial intelligence that's popped up, it's great because all those math PhDs who work at Home Depot now have something to do when they go home at night and they construct this stuff. But if you're going to have machine learning at the Edge, here's the question, what kind of machine learning would you have at the Edge? As opposed to developing your models back at say, the cloud, when you transmit the data there. The devices at the Edge are not very powerful. And they don't have a lot of memory. So you're only going to be able to do things that have been modeled or constructed somewhere else. But that's okay. Because machine learning algorithm development is actually slow and painful. So you really want the people who know how to do this working with gobs of data creating models and testing them offline. And when you have something that works, you can put it there. Now there's one thing I want to talk about before I finish, and I think I'm almost finished. 
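A small sketch of the split Neil describes: the heavy model building happens offline with plenty of data and compute, and only the small fitted model is shipped to the Edge device, which then does cheap inference with a few arithmetic operations. The synthetic sensor data and the "vibration plus temperature predicts failure" framing are invented for illustration.

```python
# Train offline (cloud / data center), ship only the fitted parameters,
# and run lightweight inference at the edge. Data and labels are synthetic.
import math
import numpy as np
from sklearn.linear_model import LogisticRegression

# --- offline side: build and test the model on historical readings ---
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))                    # [vibration, temperature]
y = (X[:, 0] + 0.5 * X[:, 1] > 1.0).astype(int)   # synthetic failure label
model = LogisticRegression().fit(X, y)

# ship just the numbers, not the framework
weights = model.coef_[0].tolist()
bias = float(model.intercept_[0])

# --- edge side: no ML library needed, just a dot product and a sigmoid ---
def edge_predict(reading):
    z = sum(w * x for w, x in zip(weights, reading)) + bias
    return 1 / (1 + math.exp(-z))   # probability of imminent failure

print(edge_predict([1.4, 0.9]))
```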
I wrote a book about 10 years ago about automated decision making and the conclusion that I came up with was that little decisions add up, and that's good. But it also means you don't have to get them all right. But you don't want computers or software making decisions unattended if it involves human life, or frankly any life. Or the environment. So when you think about the applications that you can build using this architecture and this technology, think about the fact that you're not going to be doing air traffic control, you're not going to be monitoring crossing guards at the elementary school. You're going to be doing things that may seem fairly mundane. Managing machinery on the factory floor, I mean that may sound great, but really isn't that interesting. Managing well heads, drilling for oil, well I mean, it's great to the extent that it doesn't cause wells to explode, but they don't usually explode. What it's usually used for is to drive the cost out of preventative maintenance. Not very interesting. So use your heads. Come up with really cool stuff. And any of you who are involved in Edge Analytics, the next time I talk to you I don't want to hear about the same five applications that everybody talks about. Let's hear about some new ones. So, in conclusion, I don't really have anything in conclusion except that Peter mentioned something about limousines bringing people up here. On Monday I was slogging up and down Park Avenue and Madison Avenue with my client and we were visiting all the hedge funds there because we were doing a project with them. And in the miserable weather I looked at him and I said, for godsake Paul, where's the black car? And he said, that was the 90s. (laughs) Thank you. So, Jim, up to you. (audience applauding) This is terrible, go that way, this was terrible coming that way. >> Woo, don't want to trip! And let's move to, there we go. Hi everybody, how ya doing? Thanks Neil, thanks Peter, those were great discussions. So I'm the third leg in this relay race here, talking about of course how software is eating the world. And focusing on the value of Edge Analytics in a lot of real world scenarios. Programming the real world for, to make the world a better place. So I will talk, I'll break it out analytically in terms of the research that Wikibon is doing in the area of the IoT, but specifically how AI intelligence is being embedded really to all material reality potentially at the Edge. But mobile applications and industrial IoT and the smart appliances and self driving vehicles. I will break it out in terms of a reference architecture for understanding what functions are being pushed to the Edge to hardware, to our phones and so forth to drive various scenarios in terms of real world results. So I'll move a pace here. So basically AI software or AI microservices are being infused into Edge hardware as we speak. What we see is more vendors of smart phones and other, real world appliances and things like smart driving, self driving vehicles. What they're doing is they're instrumenting their products with computer vision and natural language processing, environmental awareness based on sensing and actuation and those capabilities and inferences that these devices just do to both provide human support for human users of these devices as well as to enable varying degrees of autonomous operation. So what I'll be talking about is how AI is a foundation for data driven systems of agency of the sort that Peter is talking about. 
Infusing data driven intelligence into everything or potentially so. As more of this capability, all these algorithms for things like, ya know for doing real time predictions and classifications, anomaly detection and so forth, as this functionality gets diffused widely and becomes more commoditized, you'll see it burned into an ever-wider variety of hardware architecture, neuro synaptic chips, GPUs and so forth. So what I've got here in front of you is a sort of a high level reference architecture that we're building up in our research at Wikibon. So AI, artificial intelligence is a big term, a big paradigm, I'm not going to unpack it completely. Of course we don't have oodles of time so I'm going to take you fairly quickly through the high points. It's a driver for systems of agency. Programming the real world. Transducing digital inputs, the data, to analog real world results. Through the embedding of this capability in the IoT, but pushing more and more of it out to the Edge with points of decision and action in real time. And there are four capabilities that we're seeing in terms of AI enabled, enabling capabilities that are absolutely critical to software being pushed to the Edge are sensing, actuation, inference and Learning. Sensing and actuation like Peter was describing, it's about capturing data from the environment within which a device or users is operating or moving. And then actuation is the fancy term for doing stuff, ya know like industrial IoT, it's obviously machine controlled, but clearly, you know self driving vehicles is steering a vehicle and avoiding crashing and so forth. Inference is the meat and potatoes as it were of AI. Analytics does inferences. It infers from the data, the logic of the application. Predictive logic, correlations, classification, abstractions, differentiation, anomaly detection, recognizing faces and voices. We see that now with Apple and the latest version of the iPhone is embedding face recognition as a core, as the core multifactor authentication technique. Clearly that's a harbinger of what's going to be universal fairly soon which is that depends on AI. That depends on convolutional neural networks, that is some heavy hitting processing power that's necessary and it's processing the data that's coming from your face. So that's critically important. So what we're looking at then is the AI software is taking root in hardware to power continuous agency. Getting stuff done. Powered decision support by human beings who have to take varying degrees of action in various environments. We don't necessarily want to let the car steer itself in all scenarios, we want some degree of override, for lots of good reasons. They want to protect life and limb including their own. And just more data driven automation across the internet of things in the broadest sense. So unpacking this reference framework, what's happening is that AI driven intelligence is powering real time decisioning at the Edge. Real time local sensing from the data that it's capturing there, it's ingesting the data. Some, not all of that data, may be persistent at the Edge. Some, perhaps most of it, will be pushed into the cloud for other processing. When you have these highly complex algorithms that are doing AI deep learning, multilayer, to do a variety of anti-fraud and higher level like narrative, auto-narrative roll-ups from various scenes that are unfolding. 
A lot of this processing is going to begin to happen in the cloud, but a fair amount of the more narrowly scoped inferences that drive real time decision support at the point of action will be done on the device itself. Contextual actuation, so it's the sensor data that's captured by the device along with other data that may be coming down in real time streams through the cloud will provide the broader contextual envelope of data needed to drive actuation, to drive various models and rules and so forth that are making stuff happen at the point of action, at the Edge. Continuous inference. What it all comes down to is that inference is what's going on inside the chips at the Edge device. And what we're seeing is a growing range of hardware architectures, GPUs, CPUs, FPGAs, ASIC, Neuro synaptic chips of all sorts playing in various combinations that are automating more and more very complex inference scenarios at the Edge. And not just individual devices, swarms of devices, like drones and so forth are essentially an Edge unto themselves. You'll see these tiered hierarchies of Edge swarms that are playing and doing inferences of ever more complex dynamic nature. And much of this will be, this capability, the fundamental capabilities that is powering them all will be burned into the hardware that powers them. And then adaptive learning. Now I use the term learning rather than training here, training is at the core of it. Training means everything in terms of the predictive fitness or the fitness of your AI services for whatever task, predictions, classifications, face recognition that you, you've built them for. But I use the term learning in a broader sense. It's what's make your inferences get better and better, more accurate over time is that you're training them with fresh data in a supervised learning environment. But you can have reinforcement learning if you're doing like say robotics and you don't have ground truth against which to train the data set. You know there's maximize a reward function versus minimize a loss function, you know, the standard approach, the latter for supervised learning. There's also, of course, the issue, or not the issue, the approach of unsupervised learning with cluster analysis critically important in a lot of real world scenarios. So Edge AI Algorithms, clearly, deep learning which is multilayered machine learning models that can do abstractions at higher and higher levels. Face recognition is a high level abstraction. Faces in a social environment is an even higher level of abstraction in terms of groups. Faces over time and bodies and gestures, doing various things in various environments is an even higher level abstraction in terms of narratives that can be rolled up, are being rolled up by deep learning capabilities of great sophistication. Convolutional neural networks for processing images, recurrent neural networks for processing time series. Generative adversarial networks for doing essentially what's called generative applications of all sort, composing music, and a lot of it's being used for auto programming. These are all deep learning. There's a variety of other algorithm approaches I'm not going to bore you with here. Deep learning is essentially the enabler of the five senses of the IoT. Your phone's going to have, has a camera, it has a microphone, it has the ability to of course, has geolocation and navigation capabilities. It's environmentally aware, it's got an accelerometer and so forth embedded therein. 
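The sensing, inference, and actuation capabilities described above can be pictured as a simple loop running on the device, with raw samples buffered for later upload so the centralized learning step can keep improving the model. Everything in this toy is a stand-in: a real device would read hardware sensors and call a trained model rather than a threshold rule.

```python
# Toy sense -> infer -> actuate loop at the edge; training happens elsewhere.
# Sensor, "model", and actuator are simulated stand-ins for illustration.
import random
import time

def read_sensor():                       # sensing (simulated)
    return {"temp_c": random.gauss(70, 5)}

def infer(sample):                       # inference: trivial local rule here
    return "overheating" if sample["temp_c"] > 78 else "normal"

def actuate(decision):                   # actuation at the point of action
    if decision == "overheating":
        print("-> throttling motor / opening cooling valve")

buffered = []                            # held back for cloud-side training
for _ in range(5):
    s = read_sensor()
    actuate(infer(s))                    # local, low-latency decision
    buffered.append(s)
    time.sleep(0.1)

print(f"{len(buffered)} samples queued for upload to the training pipeline")
```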
The reason that your phone and all of the devices are getting scary sentient is that they have the sensory modalities and the AI, the deep learning, that enables them to make environmentally correct decisions in a wider range of scenarios. So machine learning is the foundation of all of this, and artificial neural networks are in turn the foundation of deep learning. But there are other approaches for machine learning I want to make you aware of, because support vector machines and these other established approaches for machine learning are not going away, but really what's driving the show now is deep learning, because it's scary effective. And so that's where most of the investment in AI is going these days, into deep learning. AI Edge platforms, tools and frameworks are just coming along like gangbusters. Much development of AI, of deep learning, happens in the context of your data lake. This is where you're storing your training data. This is the data that you use to build, test and validate your models. So we're seeing a deepening stack of Hadoop, Kafka, Spark and so forth that are driving the training (coughs) excuse me, of AI models that power all these Edge Analytic applications, so that lake will continue to broaden and deepen in terms of the scope and range of data sets and the range of AI modeling it supports. Data science is critically important in this scenario because the data scientists, the data science teams, the tools and techniques and flows of data science are the fundamental development paradigm or discipline or capability that's being leveraged to build, train, deploy and iterate all this AI that's being pushed to the Edge. So clearly data science is at the center, and data scientists of an increasingly specialized nature are necessary to the realization of this value at the Edge. AI frameworks are coming along, you know, a mile a minute. TensorFlow, which is open source, most of these are open source, has achieved sort of a de facto standard status, and I'm using the word de facto in air quotes. There's Theano and Keras and MXNet and CNTK and a variety of other ones. We're seeing a range of AI frameworks come to market, most open source. Most are supported by most of the major tool vendors as well. So at Wikibon we're definitely tracking that, and we plan to go deeper in our coverage of that space. And then next best action powers recommendation engines. Next best action, decision automation of the sort of thing Neil's covered in a variety of contexts in his career, is fundamentally important to Edge Analytics, to systems of agency, 'cause it's driving the process automation, decision automation, the targeted recommendations that are made at the Edge to individual users as well as to processes. That's absolutely necessary for self driving vehicles to do their jobs and for industrial IoT. So what we're seeing is more and more recommendation engine or recommender capabilities powered by ML and DL going to the Edge, or already at the Edge, for a variety of applications. Edge AI capabilities, like I said, there's sensing. And sensing at the Edge is becoming ever more rich, with mixed reality Edge modalities of all sorts for augmented reality and so forth. We're just seeing a growth in the range of sensory modalities that are enabled or filtered and analyzed through AI and that are being pushed to the Edge, into the chip sets.
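For a sense of what building and training a model in one of the frameworks named above looks like in practice, here is a minimal, hypothetical TensorFlow/Keras sketch. The features and labels are synthetic stand-ins for data that would really come out of the data lake, and the model architecture and file name are assumptions for illustration only.

```python
# Minimal sketch of the build-and-train step a data science team might run in the cloud.
# Synthetic data stands in for training data drawn from the data lake.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(42)
X_train = rng.normal(size=(1000, 8)).astype("float32")   # stand-in features
y_train = (X_train.sum(axis=1) > 0).astype("float32")    # stand-in binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),       # e.g. a "next best action" score
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# The trained artifact is what would later be versioned and pushed toward the Edge.
# (File name is illustrative; assumes a recent TF/Keras version.)
model.save("edge_candidate_model.keras")
```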
Actuation, that's where robotics comes in. Robotics is coming into all aspects of our lives. And you know, it's brainless without AI, without deep learning and these capabilities. Inference, autonomous edge decisioning. Like I said, there's a growing range of inferences that are being done at the Edge. And that's where it has to happen 'cause that's the point of decision. Learning, training: much training, most training, will continue to be done in the cloud because it's very data intensive. It's a grind to train and optimize an AI algorithm to do its job. It's not something that you necessarily want to do or can do at the Edge, on Edge devices, so the models that are built and trained in the cloud are pushed down through a dev ops process to the Edge, and that's the way it will work pretty much in most AI environments, Edge analytics environments. You centralize the modeling, you decentralize the execution of the inference models. The training engines will be in the cloud. Edge AI applications. I'll just run you through a core list of the ones that are coming into, or have already come into, the mainstream at the Edge. Multifactor authentication, clearly the Apple announcement of face recognition is just a harbinger of the fact that that's coming to every device. Computer vision, speech recognition, NLP, digital assistants and chat bots powered by natural language processing and understanding, it's all AI powered. And it's becoming very mainstream. Emotion detection, face recognition, you know I could go on and on, but these are the core things that everybody has access to, or will by 2020, and they're core devices, mass market devices. Developers, designers and hardware engineers are coming together to pool their expertise to build and train not just the AI, but also the entire package of hardware and UX and the orchestration of the real world business scenarios or life scenarios that all this embedded intelligence enables, and much of what they build in terms of AI will be containerized as micro services through Docker and orchestrated through Kubernetes as full cloud services in an increasingly distributed fabric. That's coming along very rapidly. We can see a fair amount of that already on display at Strata in terms of what the vendors are doing or announcing or who they're working with. The hardware itself, at the Edge, some data will be persistent, needs to be persistent, to drive inference and to drive a variety of different application scenarios that need some degree of historical data related to what the device in question happens to be sensing or has sensed in the immediate past, or you know, whatever. The hardware itself is geared towards both sensing and, increasingly, persistence and Edge driven actuation of real world results. The whole notion of drones and robotics being embedded into everything that we do, that's where that comes in. That has to be powered by low cost, low power commodity chip sets of various sorts. What we see right now in terms of chip sets is GPUs; Nvidia has gone real far and GPUs have come along very fast in terms of powering inference engines, you know like the Tesla cars and so forth. But GPUs are in many ways the core hardware substrate for inference engines in DL so far. But to become a mass market phenomenon, it's got to get cheaper and lower powered and more commoditized, and so we see a fair number of CPUs being used as the hardware for Edge Analytic applications.
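Here is a hedged sketch of the "centralize the modeling, decentralize the inference" pattern just described, using TensorFlow Lite as one plausible packaging format (the talk doesn't prescribe a specific one): the cloud side converts a trained Keras model, such as the one sketched earlier, into a compact artifact, and the Edge side loads that artifact and runs a single local inference. File names and the input shape are illustrative assumptions.

```python
# Minimal sketch: train in the cloud, convert, push the artifact down, infer at the Edge.
import numpy as np
import tensorflow as tf

# --- Cloud side: convert the trained model to a lightweight, deployable format ---
model = tf.keras.models.load_model("edge_candidate_model.keras")   # from the earlier sketch
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()
with open("edge_model.tflite", "wb") as f:
    f.write(tflite_bytes)           # this artifact is what a dev-ops pipeline would ship

# --- Edge side: load the shipped artifact and run one local inference ---
interpreter = tf.lite.Interpreter(model_path="edge_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

sample = np.zeros((1, 8), dtype=np.float32)     # stand-in sensor features
interpreter.set_tensor(inp["index"], sample)
interpreter.invoke()
print("local inference:", interpreter.get_tensor(out["index"]))
```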
Some vendors are fairly big on FPGAs, I believe Microsoft has gone fairly far with FPGAs inside DL strategy. ASIC, I mean, there's neuro synaptic chips like IBM's got one. There's at least a few dozen vendors of neuro synaptic chips on the market so at Wikibon we're going to track that market as it develops. And what we're seeing is a fair number of scenarios where it's a mixed environment where you use one chip set architecture at the inference side of the Edge, and other chip set architectures that are driving the DL as processed in the cloud, playing together within a common architecture. And we see some, a fair number of DL environments where the actual training is done in the cloud on Spark using CPUs and parallelized in memory, but pushing Tensorflow models that might be trained through Spark down to the Edge where the inferences are done in FPGAs and GPUs. Those kinds of mixed hardware scenarios are very, very, likely to be standard going forward in lots of areas. So analytics at the Edge power continuous results is what it's all about. The whole point is really not moving the data, it's putting the inference at the Edge and working from the data that's already captured and persistent there for the duration of whatever action or decision or result needs to be powered from the Edge. Like Neil said cost takeout alone is not worth doing. Cost takeout alone is not the rationale for putting AI at the Edge. It's getting new stuff done, new kinds of things done in an automated consistent, intelligent, contextualized way to make our lives better and more productive. Security and governance are becoming more important. Governance of the models, governance of the data, governance in a dev ops context in terms of version controls over all those DL models that are built, that are trained, that are containerized and deployed. Continuous iteration and improvement of those to help them learn to do, make our lives better and easier. With that said, I'm going to hand it over now. It's five minutes after the hour. We're going to get going with the Influencer Panel so what we'd like to do is I call Peter, and Peter's going to call our influencers. >> All right, am I live yet? Can you hear me? All right so, we've got, let me jump back in control here. We've got, again, the objective here is to have community take on some things. And so what we want to do is I want to invite five other people up, Neil why don't you come on up as well. Start with Neil. You can sit here. On the far right hand side, Judith, Judith Hurwitz. >> Neil: I'm glad I'm on the left side. >> From the Hurwitz Group. >> From the Hurwitz Group. Jennifer Shin who's affiliated with UC Berkeley. Jennifer are you here? >> She's here, Jennifer where are you? >> She was here a second ago. >> Neil: I saw her walk out she may have, >> Peter: All right, she'll be back in a second. >> Here's Jennifer! >> Here's Jennifer! >> Neil: With 8 Path Solutions, right? >> Yep. >> Yeah 8 Path Solutions. >> Just get my mic. >> Take your time Jen. >> Peter: All right, Stephanie McReynolds. Far left. And finally Joe Caserta, Joe come on up. >> Stephie's with Elysian >> And to the left. So what I want to do is I want to start by having everybody just go around introduce yourself quickly. Judith, why don't we start there. >> I'm Judith Hurwitz, I'm president of Hurwitz and Associates. We're an analyst research and fault leadership firm. I'm the co-author of eight books. Most recent is Cognitive Computing and Big Data Analytics. 
I've been in the market for a couple years now. >> Jennifer. >> Hi, my name's Jennifer Shin. I'm the founder and Chief Data Scientist 8 Path Solutions LLC. We do data science analytics and technology. We're actually about to do a big launch next month, with Box actually. >> We're apparent, are we having a, sorry Jennifer, are we having a problem with Jennifer's microphone? >> Man: Just turn it back on? >> Oh you have to turn it back on. >> It was on, oh sorry, can you hear me now? >> Yes! We can hear you now. >> Okay, I don't know how that turned back off, but okay. >> So you got to redo all that Jen. >> Okay, so my name's Jennifer Shin, I'm founder of 8 Path Solutions LLC, it's a data science analytics and technology company. I founded it about six years ago. So we've been developing some really cool technology that we're going to be launching with Box next month. It's really exciting. And I have, I've been developing a lot of patents and some technology as well as teaching at UC Berkeley as a lecturer in data science. >> You know Jim, you know Neil, Joe, you ready to go? >> Joe: Just broke my microphone. >> Joe's microphone is broken. >> Joe: Now it should be all right. >> Jim: Speak into Neil's. >> Joe: Hello, hello? >> I just feel not worthy in the presence of Joe Caserta. (several laughing) >> That's right, master of mics. If you can hear me, Joe Caserta, so yeah, I've been doing data technology solutions since 1986, almost as old as Neil here, but been doing specifically like BI, data warehousing, business intelligence type of work since 1996. And been doing, wholly dedicated to Big Data solutions and modern data engineering since 2009. Where should I be looking? >> Yeah I don't know where is the camera? >> Yeah, and that's basically it. So my company was formed in 2001, it's called Caserta Concepts. We recently rebranded to only Caserta 'cause what we do is way more than just concepts. So we conceptualize the stuff, we envision what the future brings and we actually build it. And we help clients large and small who are just, want to be leaders in innovation using data specifically to advance their business. >> Peter: And finally Stephanie McReynolds. >> I'm Stephanie McReynolds, I had product marketing as well as corporate marketing for a company called Elysian. And we are a data catalog so we help bring together not only a technical understanding of your data, but we curate that data with human knowledge and use automated intelligence internally within the system to make recommendations about what data to use for decision making. And some of our customers like City of San Diego, a large automotive manufacturer working on self driving cars and General Electric use Elysian to help power their solutions for IoT at the Edge. >> All right so let's jump right into it. And again if you have a question, raise your hand, and we'll do our best to get it to the floor. But what I want to do is I want to get seven questions in front of this group and have you guys discuss, slog, disagree, agree. Let's start here. What is the relationship between Big Data AI and IoT? Now Wikibon's put forward its observation that data's being generated at the Edge, that action is being taken at the Edge and then increasingly the software and other infrastructure architectures need to accommodate the realities of how data is going to work in these very complex systems. That's our perspective. Anybody, Judith, you want to start? 
>> Yeah, so I think that if you look at AI machine learning, all these different areas, you have to be able to have the data learned. Now when it comes to IoT, I think one of the issues we have to be careful about is not all data will be at the Edge. Not all data needs to be analyzed at the Edge. For example if the light is green and that's good and it's supposed to be green, do you really have to constantly analyze the fact that the light is green? You actually only really want to be able to analyze and take action when there's an anomaly. Well if it goes purple, that's actually a sign that something might explode, so that's where you want to make sure that you have the analytics at the edge. Not for everything, but for the things where there is an anomaly and a change. >> Joe, how about from your perspective? >> For me I think the evolution of data is really becoming, eventually oxygen is just, I mean data's going to be the oxygen we breathe. It used to be very very reactive and there used to be like a latency. You do something, there's a behavior, there's an event, there's a transaction, and then you go record it and then you collect it, and then you can analyze it. And it was very very waterfallish, right? And then eventually we figured out to put it back into the system. Or at least human beings interpret it to try to make the system better and that is really completely turned on it's head, we don't do that anymore. Right now it's very very, it's synchronous, where as we're actually making these transactions, the machines, we don't really need, I mean human beings are involved a bit, but less and less and less. And it's just a reality, it may not be politically correct to say but it's a reality that my phone in my pocket is following my behavior, and it knows without telling a human being what I'm doing. And it can actually help me do things like get to where I want to go faster depending on my preference if I want to save money or save time or visit things along the way. And I think that's all integration of big data, streaming data, artificial intelligence and I think the next thing that we're going to start seeing is the culmination of all of that. I actually, hopefully it'll be published soon, I just wrote an article for Forbes with the term of ARBI and ARBI is the integration of Augmented Reality and Business Intelligence. Where I think essentially we're going to see, you know, hold your phone up to Jim's face and it's going to recognize-- >> Peter: It's going to break. >> And it's going to say exactly you know, what are the key metrics that we want to know about Jim. If he works on my sales force, what's his attainment of goal, what is-- >> Jim: Can it read my mind? >> Potentially based on behavior patterns. >> Now I'm scared. >> I don't think Jim's buying it. >> It will, without a doubt be able to predict what you've done in the past, you may, with some certain level of confidence you may do again in the future, right? And is that mind reading? It's pretty close, right? >> Well, sometimes, I mean, mind reading is in the eye of the individual who wants to know. And if the machine appears to approximate what's going on in the person's head, sometimes you can't tell. So I guess, I guess we could call that the Turing machine test of the paranormal. >> Well, face recognition, micro gesture recognition, I mean facial gestures, people can do it. 
Maybe not better than a coin toss, but if it can be seen visually and captured and analyzed, conceivably some degree of mind reading can be built in. I can see when somebody's angry looking at me so, that's a possibility. That's kind of a scary possibility in a surveillance society, potentially. >> Neil: Right, absolutely. >> Peter: Stephanie, what do you think? >> Well, I hear a world of it's the bots versus the humans being painted here and I think that, you know at Elysian we have a very strong perspective on this and that is that the greatest impact, or the greatest results is going to be when humans figure out how to collaborate with the machines. And so yes, you want to get to the location more quickly, but the machine as in the bot isn't able to tell you exactly what to do and you're just going to blindly follow it. You need to train that machine, you need to have a partnership with that machine. So, a lot of the power, and I think this goes back to Judith's story is then what is the human decision making that can be augmented with data from the machine, but then the humans are actually training the training side and driving machines in the right direction. I think that's when we get true power out of some of these solutions so it's not just all about the technology. It's not all about the data or the AI, or the IoT, it's about how that empowers human systems to become smarter and more effective and more efficient. And I think we're playing that out in our technology in a certain way and I think organizations that are thinking along those lines with IoT are seeing more benefits immediately from those projects. >> So I think we have a general agreement of what kind of some of the things you talked about, IoT, crucial capturing information, and then having action being taken, AI being crucial to defining and refining the nature of the actions that are being taken Big Data ultimately powering how a lot of that changes. Let's go to the next one. >> So actually I have something to add to that. So I think it makes sense, right, with IoT, why we have Big Data associated with it. If you think about what data is collected by IoT. We're talking about a serial information, right? It's over time, it's going to grow exponentially just by definition, right, so every minute you collect a piece of information that means over time, it's going to keep growing, growing, growing as it accumulates. So that's one of the reasons why the IoT is so strongly associated with Big Data. And also why you need AI to be able to differentiate between one minute versus next minute, right? Trying to find a better way rather than looking at all that information and manually picking out patterns. To have some automated process for being able to filter through that much data that's being collected. >> I want to point out though based on what you just said Jennifer, I want to bring Neil in at this point, that this question of IoT now generating unprecedented levels of data does introduce this idea of the primary source. Historically what we've done within technology, or within IT certainly is we've taken stylized data. There is no such thing as a real world accounting thing. It is a human contrivance. And we stylize data and therefore it's relatively easy to be very precise on it. But when we start, as you noted, when we start measuring things with a tolerance down to thousandths of a millimeter, whatever that is, metric system, now we're still sometimes dealing with errors that we have to attend to. 
So, the reality is we're not just dealing with stylized data, we're dealing with real data, and it's more, more frequent, but it also has special cases that we have to attend to as in terms of how we use it. What do you think Neil? >> Well, I mean, I agree with that, I think I already said that, right. >> Yes you did, okay let's move on to the next one. >> Well it's a doppelganger, the digital twin doppelganger that's automatically created by your very fact that you're living and interacting and so forth and so on. It's going to accumulate regardless. Now that doppelganger may not be your agent, or might not be the foundation for your agent unless there's some other piece of logic like an interest graph that you build, a human being saying this is my broad set of interests, and so all of my agents out there in the IoT, you all need to be aware that when you make a decision on my behalf as my agent, this is what Jim would do. You know I mean there needs to be that kind of logic somewhere in this fabric to enable true agency. >> All right, so I'm going to start with you. Oh go ahead. >> I have a real short answer to this though. I think that Big Data provides the data and compute platform to make AI possible. For those of us who dipped our toes in the water in the 80s, we got clobbered because we didn't have the, we didn't have the facilities, we didn't have the resources to really do AI, we just kind of played around with it. And I think that the other thing about it is if you combine Big Data and AI and IoT, what you're going to see is people, a lot of the applications we develop now are very inward looking, we look at our organization, we look at our customers. We try to figure out how to sell more shoes to fashionable ladies, right? But with this technology, I think people can really expand what they're thinking about and what they model and come up with applications that are much more external. >> Actually what I would add to that is also it actually introduces being able to use engineering, right? Having engineers interested in the data. Because it's actually technical data that's collected not just say preferences or information about people, but actual measurements that are being collected with IoT. So it's really interesting in the engineering space because it opens up a whole new world for the engineers to actually look at data and to actually combine both that hardware side as well as the data that's being collected from it. >> Well, Neil, you and I have talked about something, 'cause it's not just engineers. We have in the healthcare industry for example, which you know a fair amount about, there's this notion of empirical based management. And the idea that increasingly we have to be driven by data as a way of improving the way that managers do things, the way the managers collect or collaborate and ultimately collectively how they take action. So it's not just engineers, it's supposed to also inform business, what's actually happening in the healthcare world when we start thinking about some of this empirical based management, is it working? What are some of the barriers? >> It's not a function of technology. What happens in medicine and healthcare research is, I guess you can say it borders on fraud. (people chuckling) No, I'm not kidding. I know the New England Journal of Medicine a couple of years ago released a study and said that at least half their articles that they published turned out to be written, ghost written by pharmaceutical companies. 
(man chuckling) Right, so I think the problem is that when you do a clinical study, the one that really killed me about 10 years ago was the women's health initiative. They spent $700 million gathering this data over 20 years. And when they released it they looked at all the wrong things deliberately, right? So I think that's a systemic-- >> I think you're bringing up a really important point that we haven't brought up yet, and that is is can you use Big Data and machine learning to begin to take the biases out? So if you let the, if you divorce your preconceived notions and your biases from the data and let the data lead you to the logic, you start to, I think get better over time, but it's going to take a while to get there because we do tend to gravitate towards our biases. >> I will share an anecdote. So I had some arm pain, and I had numbness in my thumb and pointer finger and I went to, excruciating pain, went to the hospital. So the doctor examined me, and he said you probably have a pinched nerve, he said, but I'm not exactly sure which nerve it would be, I'll be right back. And I kid you not, he went to a computer and he Googled it. (Neil laughs) And he came back because this little bit of information was something that could easily be looked up, right? Every nerve in your spine is connected to your different fingers so the pointer and the thumb just happens to be your C6, so he came back and said, it's your C6. (Neil mumbles) >> You know an interesting, I mean that's a good example. One of the issues with healthcare data is that the data set is not always shared across the entire research community, so by making Big Data accessible to everyone, you actually start a more rational conversation or debate on well what are the true insights-- >> If that conversation includes what Judith talked about, the actual model that you use to set priorities and make decisions about what's actually important. So it's not just about improving, this is the test. It's not just about improving your understanding of the wrong thing, it's also testing whether it's the right or wrong thing as well. >> That's right, to be able to test that you need to have humans in dialog with one another bringing different biases to the table to work through okay is there truth in this data? >> It's context and it's correlation and you can have a great correlation that's garbage. You know if you don't have the right context. >> Peter: So I want to, hold on Jim, I want to, >> It's exploratory. >> Hold on Jim, I want to take it to the next question 'cause I want to build off of what you talked about Stephanie and that is that this says something about what is the Edge. And our perspective is that the Edge is not just devices. That when we talk about the Edge, we're talking about human beings and the role that human beings are going to play both as sensors or carrying things with them, but also as actuators, actually taking action which is not a simple thing. So what do you guys think? What does the Edge mean to you? Joe, why don't you start? >> Well, I think it could be a combination of the two. And specifically when we talk about healthcare. So I believe in 2017 when we eat we don't know why we're eating, like I think we should absolutely by now be able to know exactly what is my protein level, what is my calcium level, what is my potassium level? And then find the foods to meet that. 
What have I depleted versus what I should have, and eat very very purposely and not by taste-- >> And it's amazing that red wine is always the answer. >> It is. (people laughing) And tequila, that helps too. >> Jim: You're a precision foodie is what you are. (several chuckle) >> There's no reason why we should not be able to know that right now, right? And when it comes to healthcare is, the biggest problem or challenge with healthcare is no matter how great of a technology you have, you can't, you can't, you can't manage what you can't measure. And you're really not allowed to use a lot of this data so you can't measure it, right? You can't do things very very scientifically right, in the healthcare world and I think regulation in the healthcare world is really burdening advancement in science. >> Peter: Any thoughts Jennifer? >> Yes, I teach statistics for data scientists, right, so you know we talk about a lot of these concepts. I think what makes these questions so difficult is you have to find a balance, right, a middle ground. For instance, in the case of are you being too biased through data, well you could say like we want to look at data only objectively, but then there are certain relationships that your data models might show that aren't actually a causal relationship. For instance, if there's an alien that came from space and saw earth, saw the people, everyone's carrying umbrellas right, and then it started to rain. That alien might think well, it's because they're carrying umbrellas that it's raining. Now we know from real world that that's actually not the way these things work. So if you look only at the data, that's the potential risk. That you'll start making associations or saying something's causal when it's actually not, right? So that's one of the, one of the I think big challenges. I think when it comes to looking also at things like healthcare data, right? Do you collect data about anything and everything? Does it mean that A, we need to collect all that data for the question we're looking at? Or that it's actually the best, more optimal way to be able to get to the answer? Meaning sometimes you can take some shortcuts in terms of what data you collect and still get the right answer and not have maybe that level of specificity that's going to cost you millions extra to be able to get. >> So Jennifer as a data scientist, I want to build upon what you just said. And that is, are we going to start to see methods and models emerge for how we actually solve some of these problems? So for example, we know how to build a system for stylized process like accounting or some elements of accounting. We have methods and models that lead to technology and actions and whatnot all the way down to that that system can be generated. We don't have the same notion to the same degree when we start talking about AI and some of these Big Datas. We have algorithms, we have technology. But are we going to start seeing, as a data scientist, repeatability and learning and how to think the problems through that's going to lead us to a more likely best or at least good result? >> So I think that's a bit of a tough question, right? Because part of it is, it's going to depend on how many of these researchers actually get exposed to real world scenarios, right? Research looks into all these papers, and you come up with all these models, but if it's never tested in a real world scenario, well, I mean we really can't validate that it works, right? 
So I think it is dependent on how much of this integration there's going to be between the research community and industry and how much investment there is. Funding is going to matter in this case. If there's no funding in the research side, then you'll see a lot of industry folk who feel very confident about their models that, but again on the other side of course, if researchers don't validate those models then you really can't say for sure that it's actually more accurate, or it's more efficient. >> It's the issue of real world testing and experimentation, A B testing, that's standard practice in many operationalized ML and AI implementations in the business world, but real world experimentation in the Edge analytics, what you're actually transducing are touching people's actual lives. Problem there is, like in healthcare and so forth, when you're experimenting with people's lives, somebody's going to die. I mean, in other words, that's a critical, in terms of causal analysis, you've got to tread lightly on doing operationalizing that kind of testing in the IoT when people's lives and health are at stake. >> We still give 'em placebos. So we still test 'em. All right so let's go to the next question. What are the hottest innovations in AI? Stephanie I want to start with you as a company, someone at a company that's got kind of an interesting little thing happening. We start thinking about how do we better catalog data and represent it to a large number of people. What are some of the hottest innovations in AI as you see it? >> I think it's a little counter intuitive about what the hottest innovations are in AI, because we're at a spot in the industry where the most successful companies that are working with AI are actually incorporating them into solutions. So the best AI solutions are actually the products that you don't know there's AI operating underneath. But they're having a significant impact on business decision making or bringing a different type of application to the market and you know, I think there's a lot of investment that's going into AI tooling and tool sets for data scientists or researchers, but the more innovative companies are thinking through how do we really take AI and make it have an impact on business decision making and that means kind of hiding the AI to the business user. Because if you think a bot is making a decision instead of you, you're not going to partner with that bot very easily or very readily. I worked at, way at the start of my career, I worked in CRM when recommendation engines were all the rage online and also in call centers. And the hardest thing was to get a call center agent to actually read the script that the algorithm was presenting to them, that algorithm was 99% correct most of the time, but there was this human resistance to letting a computer tell you what to tell that customer on the other side even if it was more successful in the end. And so I think that the innovation in AI that's really going to push us forward is when humans feel like they can partner with these bots and they don't think of it as a bot, but they think about as assisting their work and getting to a better result-- >> Hence the augmentation point you made earlier. >> Absolutely, absolutely. >> Joe how 'about you? What do you look at? What are you excited about? >> I think the coolest thing at the moment right now is chat bots. Like to be able, like to have voice be able to speak with you in natural language, to do that, I think that's pretty innovative, right? 
And I do think that eventually, for the average user, not for techies like me, but for the average user, I think keyboards are going to be a thing of the past. I think we're going to communicate with computers through voice and I think this is the very very beginning of that and it's an incredible innovation. >> Neil? >> Well, I think we all have myopia here. We're all thinking about commercial applications. Big, big things are happening with AI in the intelligence community, in military, the defense industry, in all sorts of things. Meteorology. And that's where, well, hopefully not on an every day basis with military, you really see the effect of this. But I was involved in a project a couple of years ago where we were developing AI software to detect artillery pieces in terrain from satellite imagery. I don't have to tell you what country that was. I think you can probably figure that one out right? But there are legions of people in many many companies that are involved in that industry. So if you're talking about the dollars spent on AI, I think the stuff that we do in our industries is probably fairly small. >> Well it reminds me of an application I actually thought was interesting about AI related to that, AI being applied to removing mines from war zones. >> Why not? >> Which is not a bad thing for a whole lot of people. Judith what do you look at? >> So I'm looking at things like being able to have pre-trained data sets in specific solution areas. I think that that's something that's coming. Also the ability to really have a machine assist you in selecting the right algorithms based on what your data looks like and the problems you're trying to solve. Some of the things that data scientists still spend a lot of their time on can be augmented; basically we have to move to levels of abstraction before this becomes truly ubiquitous across many different areas. >> Peter: Jennifer? >> So I'm going to say computer vision. >> Computer vision? >> Computer vision. So computer vision ranges from image recognition to being able to say what content is in the image. Is it a dog, is it a cat, is it a blueberry muffin? Like a sort of popular post out there where it's a blueberry muffin versus, I think, a chihuahua, and then it compares the two. And can the AI really actually detect the difference, right? So I think that's really where a lot of people who are in both the AI space as well as data science are looking for the new innovations. For instance, Cloud Vision, I think that's what Google still calls it. The Vision API they've released in beta allows you to actually use an API to send your image and then have it be recognized, right, by their API. There's another startup in New York called Clarifai that also does a similar thing, and you know Amazon has their Rekognition platform as well. So from images, being able to detect what's in the content, as well as from videos, being able to say things like how many people are entering a frame? How many people enter the store? Not having to actually go look at it and count it, but having a computer actually tally that information for you, right? >> There's actually an extra piece to that. So if I have a picture of a stop sign, and I'm an automated car, is it a picture on the back of a bus of a stop sign, or is it a real stop sign? So that's going to be one of the complications. >> Doesn't matter to a New York City cab driver. How 'about you Jim?
>> Probably not. (laughs) >> Hottest thing in AI is Generative Adversarial Networks, GANs. What's hot about that? Well, I'll be very quick: most AI, most deep learning, machine learning is analytical, it's distilling or inferring insights from the data. Generative takes that same algorithmic basis but uses it to build stuff. In other words, to create realistic looking photographs, to compose music, to build CAD CAM models essentially that can be constructed on 3D printers. So GANs, they're a huge research focus all around the world and are increasingly used for natural language generation. In other words it's institutionalizing, or having a foundation for, nailing the Turing test every single time, building something with machines that looks like it was constructed by a human and doing it over and over again to fool humans. I mean you can imagine the fraud potential. But you can also imagine just the sheer, like it's going to shape the world, GANs. >> All right so I'm going to say one thing, and then we're going to ask if anybody in the audience has an idea. So the thing that I find interesting is traditional programs, or when you tell a machine to do something, you don't need incentives. When you tell a human being something, you have to provide incentives. Like how do you get someone to actually read the text. And this whole question of elements within AI that incorporate incentives as a way of trying to guide human behavior is absolutely fascinating to me. Whether it's gamification, or even some things we're thinking about with block chain and bitcoins and related types of stuff. To my mind that's going to have an enormous impact, some good, some bad. Anybody in the audience? I don't want to lose everybody here. What do you think sir? And I'll try to do my best to repeat it. Oh we have a mic. >> So my question's about, okay, so the question's pretty much about what Stephanie's talking about, which is human-in-the-loop training, right? I come from a computer vision background. That's the problem, we need millions of images trained, we need humans to do that. And that's like you know, the workforce is essentially people that aren't necessarily part of the AI community, they're people that are just able to use that data and analyze the data and label that data. That's something that I think is a big problem everyone in the computer vision industry at least faces. I was wondering-- >> So again, but the problem is the difficulty of methodologically bringing together people who have domain expertise and people who have algorithm expertise and getting them working together? >> I think the expertise issue comes in healthcare, right? In healthcare you need experts to be labeling your images. With contextual information, where essentially augmented reality applications are coming in, you have ARKit and everything coming out, but there is a lack of context based intelligence. And all of that comes through training images, and all of that requires people to do it. And that's kind of the foundational basis of AI going forward, it's not necessarily an algorithm, right? It's how well the data is labeled. Who's doing the labeling and how do we ensure that it happens? >> Great question. So for the panel. So if you think about it, a consultant talks about being on the bench. How much time are they going to have to spend on trying to develop additional business? How much time should we set aside for executives to help train some of the assistants?
>> I think that the key is not, to think of the problem a different way is that you would have people manually label data and that's one way to solve the problem. But you can also look at what is the natural workflow of that executive, or that individual? And is there a way to gather that context automatically using AI, right? And if you can do that, it's similar to what we do in our product, we observe how someone is analyzing the data and from those observations we can actually create the metadata that then trains the system in a particular direction. But you have to think about solving the problem differently of finding the workflow that then you can feed into to make this labeling easy without the human really realizing that they're labeling the data. >> Peter: Anybody else? >> I'll just add to what Stephanie said, so in the IoT applications, all those sensory modalities, the computer vision, the speech recognition, all that, that's all potential training data. So it cross checks against all the other models that are processing all the other data coming from that device. So that the natural language process of understanding can be reality checked against the images that the person happens to be commenting upon, or the scene in which they're embedded, so yeah, the data's embedded-- >> I don't think we're, we're not at the stage yet where this is easy. It's going to take time before we do start doing the pre-training of some of these details so that it goes faster, but right now, there're not that many shortcuts. >> Go ahead Joe. >> Sorry so a couple things. So one is like, I was just caught up on your incentivizing programs to be more efficient like humans. You know in Ethereum that has this notion, which is bot chain, has this theory, this concept of gas. Where like as the process becomes more efficient it costs less to actually run, right? It costs less ether, right? So it actually is kind of, the machine is actually incentivized and you don't really know what it's going to cost until the machine processes it, right? So there is like some notion of that there. But as far as like vision, like training the machine for computer vision, I think it's through adoption and crowdsourcing, so as people start using it more they're going to be adding more pictures. Very very organically. And then the machines will be trained and right now is a very small handful doing it, and it's very proactive by the Googles and the Facebooks and all of that. But as we start using it, as they start looking at my images and Jim's and Jen's images, it's going to keep getting smarter and smarter through adoption and through very organic process. >> So Neil, let me ask you a question. Who owns the value that's generated as a consequence of all these people ultimately contributing their insight and intelligence into these systems? >> Well, to a certain extent the people who are contributing the insight own nothing because the systems collect their actions and the things they do and then that data doesn't belong to them, it belongs to whoever collected it or whoever's going to do something with it. But the other thing, getting back to the medical stuff. It's not enough to say that the systems, people will do the right thing, because a lot of them are not motivated to do the right thing. The whole grant thing, the whole oh my god I'm not going to go against the senior professor. 
A lot of these, I knew a guy who was a doctor at University of Pittsburgh and they were doing a clinical study on the tubes that they put in little kids' ears who have ear infections, right? And-- >> Google it! Who helps out? >> Anyway, I forget the exact thing, but he came out and said that the principle investigator lied when he made the presentation, that it should be this, I forget which way it went. He was fired from his position at Pittsburgh and he has never worked as a doctor again. 'Cause he went against the senior line of authority. He was-- >> Another question back here? >> Man: Yes, Mark Turner has a question. >> Not a question, just want to piggyback what you're saying about the transfixation of maybe in healthcare of black and white images and color images in the case of sonograms and ultrasound and mammograms, you see that happening using AI? You see that being, I mean it's already happening, do you see it moving forward in that kind of way? I mean, talk more about that, about you know, AI and black and white images being used and they can be transfixed, they can be made to color images so you can see things better, doctors can perform better operations. >> So I'm sorry, but could you summarize down? What's the question? Summarize it just, >> I had a lot of students, they're interested in the cross pollenization between AI and say the medical community as far as things like ultrasound and sonograms and mammograms and how you can literally take a black and white image and it can, using algorithms and stuff be made to color images that can help doctors better do the work that they've already been doing, just do it better. You touched on it like 30 seconds. >> So how AI can be used to actually add information in a way that's not necessarily invasive but is ultimately improves how someone might respond to it or use it, yes? Related? I've also got something say about medical images in a second, any of you guys want to, go ahead Jennifer. >> Yeah, so for one thing, you know and it kind of goes back to what we were talking about before. When we look at for instance scans, like at some point I was looking at CT scans, right, for lung cancer nodules. In order for me, who I don't have a medical background, to identify where the nodule is, of course, a doctor actually had to go in and specify which slice of the scan had the nodule and where exactly it is, so it's on both the slice level as well as, within that 2D image, where it's located and the size of it. So the beauty of things like AI is that ultimately right now a radiologist has to look at every slice and actually identify this manually, right? The goal of course would be that one day we wouldn't have to have someone look at every slice to like 300 usually slices and be able to identify it much more automated. And I think the reality is we're not going to get something where it's going to be 100%. And with anything we do in the real world it's always like a 95% chance of it being accurate. So I think it's finding that in between of where, what's the threshold that we want to use to be able to say that this is, definitively say a lung cancer nodule or not. I think the other thing to think about is in terms of how their using other information, what they might use is a for instance, to say like you know, based on other characteristics of the person's health, they might use that as sort of a grading right? So you know, how dark or how light something is, identify maybe in that region, the prevalence of that specific variable. 
So that's usually how they integrate that information into something that's already existing in the computer vision sense. I think that's, the difficulty with this of course, is being able to identify which variables were introduced into data that does exist. >> So I'll make two quick observations on this then I'll go to the next question. One is radiologists have historically been some of the highest paid physicians within the medical community partly because they don't have to be particularly clinical. They don't have to spend a lot of time with patients. They tend to spend time with doctors which means they can do a lot of work in a little bit of time, and charge a fair amount of money. As we start to introduce some of these technologies that allow us to from a machine standpoint actually make diagnoses based on those images, I find it fascinating that you now see television ads promoting the role that the radiologist plays in clinical medicine. It's kind of an interesting response. >> It's also disruptive as I'm seeing more and more studies showing that deep learning models processing images, ultrasounds and so forth are getting as accurate as many of the best radiologists. >> That's the point! >> Detecting cancer >> Now radiologists are saying oh look, we do this great thing in terms of interacting with the patients, never have because they're being dis-intermediated. The second thing that I'll note is one of my favorite examples of that if I got it right, is looking at the images, the deep space images that come out of Hubble. Where they're taking data from thousands, maybe even millions of images and combining it together in interesting ways you can actually see depth. You can actually move through to a very very small scale a system that's 150, well maybe that, can't be that much, maybe six billion light years away. Fascinating stuff. All right so let me go to the last question here, and then I'm going to close it down, then we can have something to drink. What are the hottest, oh I'm sorry, question? >> Yes, hi, my name's George, I'm with Blue Talon. You asked earlier there the question what's the hottest thing in the Edge and AI, I would say that it's security. It seems to me that before you can empower agency you need to be able to authorize what they can act on, how they can act on, who they can act on. So it seems if you're going to move from very distributed data at the Edge and analytics at the Edge, there has to be security similarly done at the Edge. And I saw (speaking faintly) slides that called out security as a key prerequisite and maybe Judith can comment, but I'm curious how security's going to evolve to meet this analytics at the Edge. >> Well, let me do that and I'll ask Jen to comment. The notion of agency is crucially important, slightly different from security, just so we're clear. And the basic idea here is historically folks have thought about moving data or they thought about moving application function, now we are thinking about moving authority. So as you said. That's not necessarily, that's not really a security question, but this has been a problem that's been in, of concern in a number of different domains. How do we move authority with the resources? And that's really what informs the whole agency process. But with that said, Jim. >> Yeah actually I'll, yeah, thank you for bringing up security so identity is the foundation of security. Strong identity, multifactor, face recognition, biometrics and so forth. 
Clearly AI, machine learning, deep learning are powering a new era of biometrics and you know it's behavioral metrics and so forth that's organic to people's use of devices and so forth. You know getting to the point that Peter was raising is important, agency! Systems of agency. Your agent, you have to, you as a human being should be vouching in a secure, tamper proof way, your identity should be vouching for the identity of some agent, physical or virtual that does stuff on your behalf. How can that, how should that be managed within this increasingly distributed IoT fabric? Well a lot of that's been worked. It all ran through webs of trust, public key infrastructure, formats and you know SAML for single sign and so forth. It's all about assertion, strong assertions and vouching. I mean there's the whole workflows of things. Back in the ancient days when I was actually a PKI analyst three analyst firms ago, I got deep into all the guts of all those federation agreements, something like that has to be IoT scalable to enable systems agency to be truly fluid. So we can vouch for our agents wherever they happen to be. We're going to keep on having as human beings agents all over creation, we're not even going to be aware of everywhere that our agents are, but our identity-- >> It's not just-- >> Our identity has to follow. >> But it's not just identity, it's also authorization and context. >> Permissioning, of course. >> So I may be the right person to do something yesterday, but I'm not authorized to do it in another context in another application. >> Role based permissioning, yeah. Or persona based. >> That's right. >> I agree. >> And obviously it's going to be interesting to see the role that block chain or its follow on to the technology is going to play here. Okay so let me throw one more questions out. What are the hottest applications of AI at the Edge? We've talked about a number of them, does anybody want to add something that hasn't been talked about? Or do you want to get a beer? (people laughing) Stephanie, you raised your hand first. >> I was going to go, I bring something mundane to the table actually because I think one of the most exciting innovations with IoT and AI are actually simple things like City of San Diego is rolling out 3200 automated street lights that will actually help you find a parking space, reduce the amount of emissions into the atmosphere, so has some environmental change, positive environmental change impact. I mean, it's street lights, it's not like a, it's not medical industry, it doesn't look like a life changing innovation, and yet if we automate streetlights and we manage our energy better, and maybe they can flicker on and off if there's a parking space there for you, that's a significant impact on everyone's life. >> And dramatically suppress the impact of backseat driving! >> (laughs) Exactly. >> Joe what were you saying? >> I was just going to say you know there's already the technology out there where you can put a camera on a drone with machine learning within an artificial intelligence within it, and it can look at buildings and determine whether there's rusty pipes and cracks in cement and leaky roofs and all of those things. And that's all based on artificial intelligence. And I think if you can do that, to be able to look at an x-ray and determine if there's a tumor there is not out of the realm of possibility, right? >> Neil? >> I agree with both of them, that's what I meant about external kind of applications. 
Instead of figuring out what to sell our customers. Which is most what we hear. I just, I think all of those things are imminently doable. And boy street lights that help you find a parking place, that's brilliant, right? >> Simple! >> It improves your life more than, I dunno. Something I use on the internet recently, but I think it's great! That's, I'd like to see a thousand things like that. >> Peter: Jim? >> Yeah, building on what Stephanie and Neil were saying, it's ambient intelligence built into everything to enable fine grain microclimate awareness of all of us as human beings moving through the world. And enable reading of every microclimate in buildings. In other words, you know you have sensors on your body that are always detecting the heat, the humidity, the level of pollution or whatever in every environment that you're in or that you might be likely to move into fairly soon and either A can help give you guidance in real time about where to avoid, or give that environment guidance about how to adjust itself to your, like the lighting or whatever it might be to your specific requirements. And you know when you have a room like this, full of other human beings, there has to be some negotiated settlement. Some will find it too hot, some will find it too cold or whatever but I think that is fundamental in terms of reshaping the sheer quality of experience of most of our lived habitats on the planet potentially. That's really the Edge analytics application that depends on everybody having, being fully equipped with a personal area network of sensors that's communicating into the cloud. >> Jennifer? >> So I think, what's really interesting about it is being able to utilize the technology we do have, it's a lot cheaper now to have a lot of these ways of measuring that we didn't have before. And whether or not engineers can then leverage what we have as ways to measure things and then of course then you need people like data scientists to build the right model. So you can collect all this data, if you don't build the right model that identifies these patterns then all that data's just collected and it's just made a repository. So without having the models that supports patterns that are actually in the data, you're not going to find a better way of being able to find insights in the data itself. So I think what will be really interesting is to see how existing technology is leveraged, to collect data and then how that's actually modeled as well as to be able to see how technology's going to now develop from where it is now, to being able to either collect things more sensitively or in the case of say for instance if you're dealing with like how people move, whether we can build things that we can then use to measure how we move, right? Like how we move every day and then being able to model that in a way that is actually going to give us better insights in things like healthcare and just maybe even just our behaviors. >> Peter: Judith? >> So, I think we also have to look at it from a peer to peer perspective. So I may be able to get some data from one thing at the Edge, but then all those Edge devices, sensors or whatever, they all have to interact with each other because we don't live, we may, in our business lives, act in silos, but in the real world when you look at things like sensors and devices it's how they react with each other on a peer to peer basis. >> All right, before I invite John up, I want to say, I'll say what my thing is, and it's not the hottest. 
It's the one I hate the most. I hate AI-generated music. (people laughing) Hate it. All right, I want to thank all the panelists, every single person, some great commentary, great observations. I want to thank you very much. I want to thank everybody that joined. John, in a second you'll kind of announce who's the big winner. But the one thing I want to do is, as I was listening, I learned a lot from everybody, but I want to call out the one comment that I think we all need to remember, and I'm going to give you the award, Stephanie. And that is, increasingly we have to remember that the best AI is probably AI that we don't even know is working on our behalf. The flip side of that is all of us have to be very cognizant of the idea that AI is acting on our behalf and we may not know it. So, John why don't you come on up. Who won the, whatever it's called, the raffle? >> You won. >> Thank you! >> How 'about a round of applause for the great panel. (audience applauding) Okay, we had people put their business cards in the basket, we're going to have that brought up. We're going to have two raffle gifts, some nice Bose headsets and a speaker, Bluetooth speaker. Got to wait for that. I just want to say thank you for coming, and for the folks watching, this is our fifth year doing our own event called Big Data NYC, which is really an extension of the landscape beyond the Big Data world, that's Cloud and AI and IoT and other great things happening, and great experts and influencers and analysts here. Thanks for sharing your opinion. Really appreciate you taking the time to come out and share your data and your knowledge, appreciate it. Thank you. Where's the? >> Sam's right in front of you. >> There's the thing, okay. Got to be present to win. We saw some people sneaking out the back door to go to a dinner. >> First prize first. >> Okay first prize is the Bose headset. >> Bluetooth and noise canceling. >> I won't look, Sam you got to hold it down, I can see the cards. >> All right. >> Stephanie you won! (Stephanie laughing) Okay, Sawny Cox, Sawny Allie Cox? (audience applauding) Yay look at that! He's here! The bar's open so help yourself, but we got one more. >> Congratulations. Picture right here. >> Hold that, I saw you. Wake up a little bit. Okay, all right. Next one is, my kids love this. This is great, great for the beach, great for everything, portable speaker, great gift. >> What is it? >> Portable speaker. >> It is a portable speaker, it's pretty awesome. >> Oh you grabbed mine. >> Oh that's one of our guys. >> (laughing) But who was it? >> Can't be related! Ava, Ava, Ava. Okay Gene Penesko (audience applauding) Hey! He came in! All right look at that, the timing's great. >> Another one? (people laughing) >> Hey thanks everybody, enjoy the night, thank Peter Burris, head of research for SiliconANGLE, Wikibon, and the great guests and influencers and friends. And you guys for coming in the community. Thanks for watching and thanks for coming. Enjoy the party and some drinks and that's out, that's it for the influencer panel and analyst discussion. Thank you. (logo music)
Josh Klahr & Prashanthi Paty | DataWorks Summit 2017
>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Hey, welcome back to theCUBE. Day two of the DataWorks Summit, I'm Lisa Martin with my cohost, George Gilbert. We've had a great day and a half so far, learning a ton in this hyper-growth, big data world meets IoT, machine learning, data science. George and I are excited to welcome our next guests. We have Josh Klahr, the VP of Product Management from AtScale. Welcome George, welcome back. >> Thank you. >> And we have Prashanthi Paty, the Head of Data Engineering for GoDaddy. Welcome to theCUBE. >> Thank you. >> Great to have you guys here. So, wanted to kind of talk to you guys about, one, how you guys are working together, but two, also some of the trends that you guys are seeing. So as we talked about, in the tech industry, it's two degrees of Kevin Bacon, right. You guys worked together back in the day at Yahoo. Talk to us about what you both visualized and experienced in terms of the Hadoop adoption maturity cycle. >> Sure. >> You want to start, Josh? >> Yeah, I'll start, and you can chime in and correct me. But yeah, as you mentioned, Prashanthi and I worked together at Yahoo. It feels like a long time ago. In our central data group. And we had two main jobs. First job was, collect all of the data from our ad systems, our audience systems, and stick that data into a Hadoop cluster. At the time, we were kind of doing it while Hadoop was kind of being developed. And the other thing that we did was, we had to support a bunch of BI consumers. So we built cubes, we built data marts, we used MicroStrategy, Tableau, and I would say the experience there was a great experience with Hadoop in terms of the ability to have low-cost storage, scale out data processing of all of, what were really, billions and billions, tens of billions of events a day. But when it came to BI, it felt like we were doing stuff the old way. And we were moving data off cluster, and making it small. In fact, you did a lot of that. >> Well, yeah, at the end of the day, we were using Hadoop as a staging layer. So we would process a whole bunch of data there, and then we would scale it back, and move it into, again, relational stores or cubes, because basically we couldn't afford to give any accessibility to BI tools or to our end users directly on Hadoop. So while we surely did a large-scale data processing in Hadoop layer, we failed to turn on the insights right there. >> Lisa: Okay. >> Maybe there's a lesson in there for folks who are getting slightly more mature versions of Hadoop now, but can learn from also some of the experiences you've had. Were there issues in terms of, having cleaned and curated data, were there issues for BI with performance and the lack of proper file formats like Parquet? What was it that where you hit the wall? >> It was both, you have to remember this, we were probably one of the first teams to put a data warehouse on Hadoop. So we were dealing with Pig versions of like, 0.5, 0.6, so we were putting a lot of demand on the tooling and the infrastructure. Hadoop was still in a very nascent stage at that time. That was one. And I think a lot of the focus was on, hey, now we have the ability to do clickstream analytics at scale, right. So we did a lot of the backend stuff. But the presentation is where I think we struggled. 
>> So would that mean that you did do, the idea is that you could do full resolution without sampling on the backend, and then you would extract and presumably sort of denormalize so that you could, essentially, run data marts for subject matter interests. >> Yeah, and that's exactly what we did is, we took all of this big data, but to make it work for BI, which were two things, one was performance. It was really, can you get an interactive query and response time. And the other thing was the interface. Can a Tableau user connect and understand what they're looking at. You had to make the data small again. And that was actually the genesis of AtScale, which is where I am today, was, we were frustrated with this big data platform and having to then make the data small again in order to support BI. >> That's a great transition, Josh. Let's actually talk about AtScale. You guys saw BI on Hadoop as this big white space. How have you succeeded there, and then let's talk about what GoDaddy is doing with AtScale and big data. >> Yeah, I think that we definitely learned, we took the learnings from our experience at Yahoo, and we really thought about, if we were to start from scratch, and solve the problem the way we wanted it to be solved, what would that system look like. And it was a few things. One was an interface that worked for BI. I don't want to date myself, but my experience in the software space started with OLAP. And I can tell you OLAP isn't dead. When you go and talk to an enterprise, a Fortune 1000 enterprise, and you talk about OLAP, that's how they think. They think in terms of measures and dimensions and hierarchies. So one important thing for us was to project an OLAP interface on top of data that's Hadoop native. It's Hive tables, Parquet, ORC, you kind of talk about all of the mess that may sit underneath the covers. So one thing was projecting that interface, the other thing was delivering performance. So we've invested a lot in using the Hadoop cluster natively to deliver performing queries. We do this by creating aggregate tables and summary tables and being smart about how we route queries. But we've done it in a way that makes a Hadoop admin very happy. You don't have to buy a bunch of AtScale servers in addition to your Hadoop cluster. We scale the way the Hadoop cluster scales. So we don't require separate technology. So we fit really nicely into that Hadoop ecosystem. >> So how do you make, making the Hadoop admin happy is a good thing. How do you make the business user happy, who needs now, as we were here yesterday, to kind of merge more with the data science folks to be able to understand or even have the chance to articulate, "These are the business outcomes we want to look for and we want to see." How do you guys, maybe, under the hood, if you will, AtScale, make the business guys and gals happy? >> I'll share my opinion and then Prashanthi can comment on her experience but, as I've mentioned before, the business users want an interface that's simple to use. And so that's one thing we do, is, we give them the ability to just look at measures and dimensions. If I'm a business, I grew up using Excel to do my analysis. The thing I like most as an analyst is a big fat wide table. And so that's what, we make an underlying Hadoop cluster and what could be tens or hundreds of tables look like a single big fat wide table for a data analyst. You talk to a data scientist, you talk to a business analyst, that's the way they want to view the world. So that's one thing we do. 
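Josh's point above about creating summary tables and being smart about query routing can be sketched in a few lines. The following is a hypothetical illustration, not AtScale's actual engine; the table names, dimensions, and routing rule are invented for the example, and a real system would also have to keep the aggregates in sync with the raw fact data.

```python
# Hypothetical illustration of aggregate-aware query routing (not AtScale's engine).
# Idea: if a pre-built summary table already covers the dimensions and measures a
# query needs, answer from the small table instead of scanning the raw fact data.

RAW_FACT = "sales_fact"  # the large Hive fact table (name invented)

# aggregate table name -> (dimensions it is grouped by, measures it pre-aggregates)
AGGREGATES = {
    "agg_sales_by_day_region": ({"order_date", "region"}, {"revenue", "units"}),
    "agg_sales_by_product":    ({"product_id"},           {"revenue"}),
}

def route(dimensions, measures):
    """Return the smallest table that can answer the request."""
    for table, (dims, meas) in AGGREGATES.items():
        if set(dimensions) <= dims and set(measures) <= meas:
            return table      # served from the summary table
    return RAW_FACT           # fall back to the raw fact table

def to_sql(dimensions, measures):
    table = route(dimensions, measures)
    select = ", ".join(dimensions + [f"SUM({m}) AS {m}" for m in measures])
    group_by = ", ".join(dimensions)
    return f"SELECT {select} FROM {table} GROUP BY {group_by}"

print(to_sql(["region"], ["revenue"]))       # routed to agg_sales_by_day_region
print(to_sql(["customer_id"], ["revenue"]))  # no aggregate fits -> sales_fact
```

The design choice worth noting is that the routing is invisible to the analyst: the same "measures and dimensions" request is issued either way, and the engine decides whether a summary table or the raw fact table serves it.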
And then, we give them response times that are fast. We give them interactivity, so that you could really quickly start to get a sense of the shape of the data. >> And allowing them to get that time to value. >> Yes. >> I can imagine. >> Just a follow-up on that. When you have to prepare the aggregates, essentially like the cubes, instead of the old BI tools running on a data mart, what is the additional latency that's required from data coming fresh into the data lake and then transforming it into something that's consumption ready for the business user? >> Yeah, I think I can take that. So again, if you look at the last 10 years, in the initial period, certainly at Yahoo, we just threw engineering resources at that problem, right. So we had teams dedicated to building these aggregates. But the whole premise of Hadoop was the ability to do unstructured optimizations. And by having a team find out the new data coming in and then integrating that into your pipeline, so we were adding a lot of latency. And so we needed to figure out how we can do this in a more seamless way, in a more real-time way. And get the, you know, the real premise of Hadoop. Get it into the hands of our business users. I mean, I think that's where AtScale is doing a lot of the good work in terms of dynamically being able to create aggregates based on the design that you put in the cube. So we are starting to work with them on our implementation. We're looking forward to the results. >> Tell us a little bit more about what you're looking to achieve. So GoDaddy is a customer of AtScale. Tell us a little bit more about that. What are you looking to build together, and kind of, where are you in your journey right now? >> Yeah, so the main goal for us is to move beyond predefined models, dashboards, and reports. So we want to be more agile with our schema changes. Time to market is one. And performance, right. Ability to put BI tools directly on top of Hadoop, is one. And also to push as much of the semantics as possible down into the Hadoop layer. So those are the things that we're looking to do. >> So that sounds like a classic business intelligence component, but sort of rethought for a big data era. >> I love that quote, and I feel it. >> Prashanthi: Yes. >> Josh: Yes. (laughing) >> That's exactly what we're trying to do. >> But that's also, some of the things you mentioned are non-trivial. You want to have this, time goes into the pre-processing of data so that it's consumable, but you also wanted it to be dynamic, which is sort of a trade-off, which means, you know, that takes time. So is that a sort of a set of requirements, a wishlist for AtScale, or is that something that you're building on your own? >> I think there's a lot happening in that space. They are one of the first people to come out with their product, which is solving a real problem that we tried to solve for a long time. And I think as we start using them more and more, we'll surely be pushing them to bring in more features. I think the algorithm that they have to dynamically generate aggregates is something that we're giving quite a lot of feedback to them on. >> Our last guest from Pentaho was talking about, there was, in her keynote today, the quote from, I think, a McKinsey report that said, "40% of machine learning data is either not fully exploited or not used at all." So, tell us, kind of, where is GoDaddy regarding machine learning? 
What are you seeing at AtScale and how are you guys going to work together to maybe venture into that frontier? >> Yeah, I mean, I think one of the key requirements we're placing on our data scientists is, not only do you have to be very good at your data science job, you have to be a very good programmer too to make use of the big data technologies. And we're seeing some interesting developments like very workload-specific engines coming into the market now for search, for graph, for machine learning, as well. Which is supposed to give the tools right into the hands of data scientists. I personally haven't worked with them to be able to comment. But I do think that the next realm in big data is these workload-specific engines, coming on top of Hadoop, and realizing more of the insights for the end users. >> Curious, can you elaborate a little more on those workload-specific engines, that sounds rather intriguing. >> Well, I think for interactive, interacting with Hadoop on a real-time basis, we see search-based engines like Elasticsearch, Solr, and there is also Druid. At Yahoo, we were quite a big shop of Druid actually. And we were using it as an interactive query layer directly with our applications, BI applications. These are our JavaScript-based BI applications, and Hadoop. So I think there are quite a few means to realize insights from Hadoop now. And that's the space where I see workload-specific engines coming in. >> And you mentioned earlier before we started that you were using Mahout, presumably for machine learning. And I guess I thought the center of gravity for that type of analytics has moved to Spark, and you haven't mentioned Spark yet. >> We are not using Mahout though. I mentioned it as something that's in that space. But yeah, I mean, Spark is pretty interesting. Spark SQL, doing ETL with Spark, as well as using Spark SQL for queries, is something that looks very, very promising lately. >> Quick question for you, from a business perspective, so you're the Head of Engineering at GoDaddy. How do you interact with your business users? The C-suite, for example, where data science, machine learning, they understand, we have to have, they're embracing Hadoop more and more. They need to really be embracing big data and leveraging Hadoop as an enabler. What's the conversation like, or maybe even the influence of the GoDaddy business C-suite on engineering? How do you guys work collaboratively? >> So we do have very regular stakeholder meetings. And these are business stakeholders. So we have representatives from our marketing teams, finance, product teams, and data science team. We consider data science as one of our customers. We take requirements from them. We give them a peek into the work we're doing. We also let them be part of our agile team so that when we have something released, they're the first ones looking at it and testing it. So they're very much part of the process. I don't think we can afford to just sit back and work on this monolithic data warehouse and at the end of the day say, "Hey, here is what we have" and ask them to go get the insights from it. So it's a very agile process, and they're very much part of it. 
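Prashanthi's mention of Spark SQL for both ETL and queries is worth a concrete illustration. The sketch below is hypothetical — the file paths, schema, and column names are invented — but it shows the common pattern of cleansing raw clickstream data, landing it as partitioned Parquet, and then querying it with Spark SQL:

```python
# Minimal Spark SQL sketch of the ETL-then-query pattern (paths/columns invented).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-etl").getOrCreate()

raw = spark.read.json("hdfs:///raw/clickstream/2017-06-14/")      # ingest raw events
clean = (raw
         .filter(F.col("user_id").isNotNull())                    # basic cleansing
         .withColumn("event_date", F.to_date("event_ts")))        # derive a partition key

(clean.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("hdfs:///curated/clickstream/"))                   # land curated Parquet

clean.createOrReplaceTempView("clicks")
daily = spark.sql("""
    SELECT event_date, page, COUNT(*) AS views
    FROM clicks
    GROUP BY event_date, page
    ORDER BY views DESC
""")
daily.show()
```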
>> Yeah, but the premise is, I mean, so Josh and I, we were part of the same team at Yahoo, where we faced problems that AtScale is trying to solve. So the premise of being able to solve those problems, which is, like their name, basically delivering data at scale, that's the premise that I'm very much looking forward to from them. >> Well, excellent. Well, we want to thank you both for joining us on theCUBE. We wish you the best of luck in attaining the world. (all laughing) >> Josh: There we go, thank you. >> Excellent, guys. Josh Klahr, thank you so much. >> My pleasure. Prashanthi, thank you for being on theCUBE for the first time. >> No problem. >> You've been watching theCUBE live at the day two of the DataWorks Summit. For my cohost George Gilbert, I am Lisa Martin. Stick around guys, we'll be right back. (jingle)
Satyen Sangani, Alation | SAP Sapphire Now 2017
>> Narrator: It's theCUBE covering Sapphire Now 2017 brought to you by SAP Cloud Platform and HANA Enterprise Cloud. >> Welcome back everyone to our special Sapphire Now 2017 coverage in our Palo Alto Studios. We have folks on the ground in Orlando. It's the third day of Sapphire Now and we're bringing our friends and experts inside our new 4500 square foot studio where we're starting to get our action going and covering events anywhere they are from here. If we can't get there we'll do it from here in Palo Alto. Our next guest is Satyen Sangani, CEO of Alation. A hot start-up funded by Costanoa Ventures, Data Collective, and I think Andreessen Horowitz is also an investor? >> Satyen: That's right. >> Satyen, welcome to the cube conversation here. >> Thank you for having me. >> So we are doing this special coverage, and I wanted to bring you in and discuss Sapphire Now as it relates to the context of the biggest waves hitting the industry. One of those waves is cloud. We've known that for a while. People are surfing that one, then the data wave is coming fast, and I think this is a completely different animal in the sense of it's going to look different, but be just as big. Your business is in the data business. You help companies figure this out. Give us the update on, first take a minute to talk about Alation, for the folks who aren't following you, what do you guys do, and then let's talk about data. >> Yeah. So for those of you that don't know about what Alation is, it's basically a data catalog. You know, if you think about all of the databases that exist in the enterprise, stuff on-prem, stuff in the cloud, all the BI tools like Tableau and MicroStrategy and Business Objects. When you've got a lot of data that sits inside the enterprise today and a wide variety of legacy and modern tools, what Alation does is, it creates a catalog, crawling all of those systems like Google crawls the web, and effectively looks at all the logs inside of those systems to understand how the data is interrelated, and we create this data social graph, and it kind of looks-- >> John: It's a metadata catalog? >> We call, you know, we don't use the word metadata, because metadata is the word that people use when, you know, that's Johnny back in the corner office, right? And people don't want to talk about metadata; if you're a business person you think about metadata, you're like, I don't, not my thing. >> So you guys are democratizing what data means to an organization? >> That's right. We just like to talk about context. We basically say, look, in the same way that information, or in the same way when you're eating your food, you need, you know, organic labeling to understand whether or not that's good or bad, we have on some level a provenance problem, a trust problem inside of data in the enterprise, and you need a layer of, you know, trust, and understanding in context. >> So you guys are a SaaS, or you guys are a SaaS solution, or are you a software subscription? >> We are both. Most of this is actually on-prem, because most of the people that have the problem that Alation solves are very big complicated institutions, or institutions with a lot of data, or a lot of people trying to analyze it, but we do also have a SaaS offering, and actually that's how we intersect with SAP Altiscale, and so we have a cloud-based offering that we work with. >> Tell me about your relationship with SAP, because you kind of backdoored in through an acquisition, quick note, then we'll get into the conversation. 
>> Yeah that's right, So Altiscale to big intersections, big data, and then they do big data in the cloud SAP acquired them last year and what we do is we provide a front-end capability for people to access that data in the cloud, so that as analysts want to analyze that data, as data governance folks want to manage that data, we provide them with a single catalog to do that. >> So talk about the dynamics in the industry because SAP clearly the big news there is the Leonardo, they're trying to create this framework, we just announced an alpha because everyone's got these names of dead creative geniuses, (Satyen laughs) We just ingest our Nostradamus products, Since they have Leonardo and, >> That's right. >> SAP's got Einstein, and IBM's got Watson, and Informatica has got Claire, so who thought maybe we just get our own version, but anyway, everyone's got some sort of like bot, or like AI program. >> Yep. >> I mean I get that, but the reality is, the trend is, they're trying to create a tool chest of platform re-platforming around tooling >> Satyen: Yeah. >> To make things easier. >> Satyen: Yeah. >> You have a lot of work in this area, through relation, trying to make things easier. >> Satyen: Yeah. >> And also they get the cloud, On-premise, HANA Enterprise Cloud, SAV cloud platform, meaning developers. So the convergence between developers, cloud, and data are happening. What's your take on that strategy? You think SAP's got a good move by going multi cloud, or should they, should be taking a different approach? >> Well I think they have to, I mean I think the economics in cloud, and the unmanageability, you know really human economics, and being able to have more and more being managed by third-party providers that are, you know, effectively like AWS, and how they skill, in the capability to manage at scale, and you just really can't compete if you're SAP, and you can't compete if your customers are buying, and assembling the toolkits On-premise, so they've got to go there, and I think every IT provider has to >> John: Got to go to the cloud you mean? >> They've got to go to the cloud, I think there's no question about it, you know I think that's at this point, a foregone conclusion in the world of enterprise IT. >> John: Yeah it's pretty obvious, I mean hybrid cloud is happening, that's really a gateway to multi-cloud, the submission is when I build Norton, a guest in latency multi-cloud issues there, but the reality is not every workloads gone there yet, a lot of analytics going on in the cloud. >> Satyen: Yeah. >> DevTest, okay check the box on DevTest >> Satyen: That's right. >> Analytics is all a ballgame right now, in terms of state of the art, your thoughts on the trends in how companies are using the cloud for analytics, and things that are challenges and opportunities. >> Yeah, I think there's, I think the analytics story in the cloud is a little bit earlier. I think that the transaction processing and the new applications, and the new architectures, and new integrations, certainly if you're going to build a new project, you're going to do that in the cloud, but I think the analytics in a stack, first of all there's like data gravity, right, you know there's a lot of gravity to that data, and moving it all into the cloud, and so if you're transaction processing, your behavioral apps are in the cloud, then it makes sense to keep the data in an AWS, or in the cloud. 
Conversely, you know, if it's not, then you're not going to take a whole bunch of data that sits on-prem and move it whole hog all the way to the cloud just because, right, that's super expensive, >> Yeah. >> You've got legacy. >> A lot of risks too and a lot of governance and a lot of compliance stuff as well. >> That's exactly right. I mean if you're trying to comply with Basel II or GDPR, and you know you want to manage all that privacy information. How are you going to do that if you're going to move your data at the same time? >> John: Yeah. >> And so it's a tough >> John: Great point. >> It's a tough move, I think from our perspective, and I think this is really important, you know we sort of say look, in a world where data is going to be on-prem, on the cloud, you know in BI tools, in databases and NoSQL databases, on Hadoop, you're going to have data everywhere, and in that world where data is going to be in multiple locations and multiple technologies you got to figure out a way to manage. >> Yeah. I mean data sprawls all over the place, it's a big problem, oh and by the way that's a good thing, store it, storage is getting cheaper and cheaper, data lakes are popping up, but you have data lakes, oh, you have data everywhere. >> That's right. >> How are you looking at that problem as a start-up, and how are customers dealing with that, and is this a real issue, or is this still too early to talk about data sprawl? >> It's a real issue, I mean it, we liken it to the advent of the Internet in the time of traditional media, right, so you had traditional media, there were single sort of authoritative sources, we all watched, maybe CNN, maybe CBS, we had the nightly news, we had Newsweek, we got our information; then the Internet comes along, and anybody can blog about anything, right, and so the cost of creating information is now this much lower, anybody can create any reality, anybody can store data anywhere, right, and so now you've got a world where, with Tableau, with Hadoop, with Redshift, you can build any stack you want to at any cost, and so now what do you do? Because everybody's creating their own thing, every Dev is doing their own thing, everybody's got new databases, new applications, you know software is eating the world right? >> And data is eating software. >> And data is eating software, and so now you've got this problem where you're like, look, I got all this stuff, and I don't know what's fake news, what's real, what's alternative fact, what doesn't make any sense, and so you've got a signal and noise problem, and I think in that world you got to figure out how to get to truth, right, >> John: Yeah. And what's the answer to that in your mind, not that you have the answer, if you did, we'd be solving it better. >> Yeah. >> But I mean directionally where's the vector going in your mind? I try to talk to Paul Martino about this at Bullpen Capital, he's a total analytics geek, he doesn't think big data can solve that yet, but they started to see some science around trying to solve these problems with data. What's your vision on this? 
>> Satyen: Yeah, you know, so I believe that every developer is going to start building applications based on data. I think that every business person is going to have an analytical role in their job, because if they're not dealing with the world on the certainty, and they're not using all the evidence at their disposal, they're not making the best decisions, and obviously there are going to be more and more analysts, and so you know at some level everybody is an analyst >> I wrote a post in 2008, my old blog was hosted on WordPress, before I started SiliconANGLE, data is the new development kit. >> That's right. >> And I saw that early, and it was still not as clear then as it is now, as obvious at least to us because we're in the middle, in this industry, but it's now part of the software fabric, it's like a library, like as a developer you'd call a library of code to come in and be part of your program >> Yeah >> Building blocks approach, Lego blocks, but now data as Lego blocks completely changes the game on things if you think of it that way. Where are we on that notion of really using data as a development component, I mean it seems to be early, I don't, haven't seen any proof points, that says, well that company's actually using the data programmatically with software. >> Satyen: Yeah. Well I mean look, I think there's features in almost every software application, whether it's, you know, 27% of the people clicked on this button in this particular thing, I mean that's a data-based application right, and so I think there is this notion that we talked a lot about, which is data literacy, right, and so that's kind of a weird thing, so what does that exactly mean? Well data is just information, like a news article is information, and you got to decide whether it's good or it's bad, and whether you can come to a conclusion, or whether you can't, just as if you're using an API from a third-party developer you need documentation, you need context about that data, and people have to be intelligent about how they use it. >> And literacy also makes it addressable. >> That's right. >> If you have knowledge about data, at some point it's named and addressed at some point in a network. >> Satyen: Yeah. >> Especially data in motion, I mean data lakes I get, data at rest, we start getting into data in motion, real-time data, every piece of data counts. Right? >> That's exactly right. And so now you've got to teach people about how to use this stuff, you've got to give them the right data, you got to make that discoverable, you got to make that information usable, you've got to get people to know who the experts are about the data, so they can ask questions, you know these are tougher problems, especially as you get more and more systems. >> All right, as a start-up, you're a growing start-up, you guys are lean and mean, doing well. You have to go compete in this war. It's a lot of, you know, a lot of big whales in there, I mean you got Oracle, SAP, IBM, they're all trying to transform, everybody is transforming, all the incumbent winners, potential buyers of your company, or potentially you displacing them, as a young CEO, you know, eating their lunch, you have to go compete in a big game. How are you guys looking at that competition, I see your focus so I know a little bit about your plan, but take us through the mindset of a start-up CEO that has to go into this world, you guys have to be good, I mean this is a big wave, see it's a big wave. >> Yeah. 
Nobody buys from a start-up unless you get, and a start-up could be even a company, less than a 100-200 people, I mean nobody's buying from a company unless there's a 10x return to value relative to the next best option, and so in that world how do you build 10x value? Well one, you've got to have great technology, and then that's the start point, but the other thing is you've got to have deep focus on your customers, right, and so I think from our perspective, we build focus by just saying, look, nobody understands data in your company, and by and large you've got to make money by understanding this data, as you do the digital transformation stuff, a big part of that is differentiating and making better products and optimizing based upon understanding your data, because that helps you and your business make better decisions, >> John: Yeah. >> And so what we're going to do is help you understand that data better and faster than any other company can do. >> You really got to pick your shots, but what you're saying, if I hear you right, is as a start-up you got to hit the beachhead segment you want to own. >> Satyen: That's right. >> And own it. >> Satyen: That's exactly it. >> No other decision, just get it, and then maybe get to a bigger scope later, and sequence around, and grow it that way. >> Satyen: You can't solve 10 problems >> Can't be groping for a beachhead if you don't know what you want, you're never going to get it. >> That's right. You can't solve 10 problems unless you solve one, right, and so you know I think we're at a phase where we've proven that we can scalably solve one, we've got customers like, you know, Pfizer and Intuit and Citrix and Tesco and Tesla and eBay and Munich Reinsurance, and so these are all, you know, amazing brands that are traditionally difficult to sell into, but you know I think from our perspective it's really about focus and just helping customers that are making that digital analytical transformation. Do it faster, and do it by enabling their people. >> But a lot going on this week for events, we had Informatica World this week, we've got VeeamON. We had Google I/O. We had Sapphire. There's a variety of other events going on, but I want to ask you kind of more of an entrepreneurial industry question, which is, if we're going through the so-called digital transformation, that means a new modern era, an old one being transformed, yet I go to every event, and everyone's number one at something, that's like, I was just at Informatica, they're number one in six quadrants. Michael Dell, they're number one in every category, Mark Hurd at the press meeting said they're number one in all categories, the Ross Perot quote about how you could be number one depending on how you slice the market seems to be in play, my point is I kind of get a little bit, you know, weirded out by that, but that is okay, you know I guess theCUBE's number one in overall live videos produced at an enterprise event, you know I, so we're number one at something, but my point is. >> Satyen: You really are. 
>> My point is, in a new transformation, what is the new scoreboard going to look like because a lot of things that you're talking about is horizontally integrated, there's new use cases developing, a new environment is coming online, so if someone wanted to actually try to keep score of who number one is and who's winning, besides customer wins, because that's clearly the one that you can point to and say hey they're winning customers, customer growth is good, outside of customer growth, what do you think will be the key requirements to get some sort of metric on who's really doing well these are the others, I mean we're not yet there with >> Yeah it's a tough problem, I mean you know used to be the world was that nobody gets fired for choosing choosing IBM. >> John: Yeah. >> Right, and I think that that brand credibility worked in a world where you could be conservative right, in this world I think, that looking for those measures, it is going to be really tough, and I think on some level that quest for looking for what is number one, or who is the best is actually the sort of fool's errand, and if that's what you're looking for, if you're looking for, you know what's the best answer for me based upon social signal, you know it's kind of like you know I'm going to go do the what the popular kids do in high school, I mean that could lead to you know a path, but it doesn't lead to the one that's going to actually get you satisfaction, and so on some level I think that customers, like you are the best signal, you know, always, >> John: Yeah, I mean it's hard, it's a rhetorical question, we ask it because, you know, we're trying to see not mystical with the path of fact called the fashion, what's fashionable. >> Satyen: Yeah. >> That's different. I mean talk about like really a cure metro, in the old days market share is one, actually IDC used a track who had market shares, and they would say based upon the number of shipments products, this is the market share winner, right? yeah that's pretty clean, I mean that's fairly clean, so just what it would be now? Number of instances, I mean it's so hard to figure out anyway, I digress. >> No, I think that's right, I mean I think I think it's really tough, that I think customers stories that, sort of map to your case. >> Yeah. It all comes back down to customer wins, how many customers you have was the >> Yeah and how much value they are getting out of your stuff. >> Yeah. That 10x value, and I think that's the multiplier minimum, if not more and with clouds and the scale is happening, you agree? >> Satyen: Yeah. >> It's going to get better. Okay thanks for coming on theCUBE. We have Satyen Sangani. CEO, co-founder of Alation, great start-up. Follow them on Twitter, these guys got some really good focus, learning about your data, because once you understand the data hygiene, you start think about ethics, and all the cool stuff happening with data. Thanks so much for coming on CUBE. More coverage, but Sapphire after the short break. (techno music)
Nenshad Bardoliwalla, Paxata - #BigDataNYC 2016 - #theCUBE
>> Voiceover: Live from New York, it's The Cube, covering Big Data New York City 2016. Brought to you by headline sponsors, Cisco, IBM, Nvidia, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and George Gilbert. >> Welcome back to New York City, everybody. Nenshad Bardoliwalla is here, he's the co-founder and chief product officer at Paxata, a company that, three years ago, I want to say three years ago, came out of stealth on The Cube. >> October 27, 2013. >> Right, and we were at the Warwick Hotel across the street from the Hilton. Yeah, Prakash came on The Cube and came out of stealth. Welcome back. >> Thank you very much. >> Great to see you guys. Taking the world by storm. >> Great to be here, and of course, Prakash sends his apologies. He couldn't be here so he sent his stunt double. (Dave and George laugh) >> Great, so give us the update. What's the latest? >> So there are a lot of great things going on in our space. The thing that we announced here at the show is what we're calling Paxata Connect, OK? We are moving, just in the same way that we created the self-service data preparation category, and now there are 50 companies that claim they do self-service data prep. We are moving the industry to the next phase of what we are calling our business information platform. Paxata Connect is one of the first major milestones in getting to that vision of the business information platform. What Paxata Connect allows our customers to do is, number one, to have visual, completely declarative, point-and-click browsing access to a variety of different data sources in the enterprise. For example, we support, we are the only company that we know of that supports connecting to multiple, simultaneous, different Hadoop distributions in one system. So a Paxata customer can connect to MapR, they can connect to Hortonworks, they can connect to Cloudera, and they can federate across all of them, which is a very powerful aspect of the system. >> And part of this involves, when you say declarative, it means you don't have to write a program to retrieve the data. >> Exactly right. Exactly right. >> Is this going into HDFS, into Hive, or? >> Yes it is. In fact, so Hadoop is one part of, this multi-source Hadoop capability is one part of Paxata Connect. The second is, as we've moved into this information platform world, our customers are telling us they want read-write access to more than just Hadoop. Hadoop is obviously a very important part, but we're actually supporting NoSQL data sources like Cloudant, MongoDB, we're supporting read and write, we're supporting, for the first time, relational databases, we already supported read, but now we actually support write to relational databases. So Paxata is really becoming kind of this fabric, a business-centric information fabric, that allows people to move data from anywhere to any destination, and transform it, profile it, explore it along the way. >> Excellent. Let's get into some of the use cases. >> Yeah, tell us where the banks are. The sense at the conference is that everyone sort of got their data lakes to some extent up and running. Now where are they pushing to go next? >> Sure, that's an excellent question. So we have really focused on the enterprise segment, as you know. So the customers that are working with Paxata from an industry perspective, banking is, of course, a very important one, we were really proud to share the stage yesterday with both Citi and Standard Chartered Bank, two of our flagship banking customers. 
But Paxata is also heavily used in the United States government, in the intelligence community, I won't say any more about that. It's used heavily in retail and consumer products, it's used heavily in the high-tech space, it's used heavily by data service providers, that is, companies whose entire business is based on data. But to answer your question specifically, what's happening in the data lake world is that a lot of folks, the early adopters, have jumped onto the data lake bandwagon. So they're pouring terabytes and petabytes of data into the data lake. And then the next question the business asks is, OK, now what? Where's the data, right? One of the simplest use cases, but actually one that's very pervasive for our customers, is they say, "Look, we don't even know, "our business people, they don't even know "what's in Hadoop right now." And by the way, I will also say that the data lake is not just Hadoop, but Amazon S3 is also serving as a data lake. The capabilities inside Microsoft's cloud are also serving as a data lake. Even the notion of a data lake is becoming this sort of polymorphic distributed thing. So what they do is, they want to be able to get what we like to say is first eyes on data. We let people with Paxata, especially with the release of Connect, to just point and click their way and to actually explore the data in all of the native systems before they even bring it in to something like Paxata. So they can actually sneak preview thousands of database tables or thousands of compressed data sets inside of Amazon S3, or thousands of data sets inside of Hadoop, and now the business people for the first time can point and click and actually see what is in the data lake in the first place. So step number one is, we have taken the approach so far in the industry of, there have been a lot of IT-driven use cases that have motivated people to go to the data lake approach. But now, we obviously want to show, all of our companies want to show business value, so tools and platforms like Paxata that sit on top of the data lake, that can federate across multiple data lakes and provide business-centric access to that information is the first significant use case pattern we're seeing. >> Just a clarification, could there be two roles where one is for slightly more technical business user exposes views summarizing, so that the ultimate end user doesn't have to see the thousands of tables? >> Absolutely, that's a great question. So when you look at self-service, if somebody wants to roll out a self-service strategy, there are multiple roles in an organization that actually need to intersect with self-service. There is a pattern in organizations where people say, "We want our people to get access to all the data." Of course it's governed, they have to have the right passwords and SSO and all that, but they're the companies who say, yes, the users really need to be able to see all of the data across these different tables. But there's a different role, who also uses Paxata extensively, who are the curators, right? 
These are the people who say, look, I'm going to provision the raw data, provide the views, provide even some normalization or transformation, and then land that data back into another layer, as people call the data relay, they go from layer zero to layer one to layer two, they're different directory structures, but the point is, there's a natural processing frame that they're going through with their data, and then from the curated data that's created by the data stewards, then the analysts can go pick it up. >> One of the other big challenges that our research is showing, that chief data officers express, is that they get this data in the data lake. So they've got the data sources, you're providing access to it, the other piece is they want to trust that data. There's obviously a governance piece, but then there's a data quality piece, maybe you could talk about that? >> Absolutely. So use case number one is about access. The second reason that people are not so -- So, why are people doing data prep in the first place? They are trying to make information-driven decisions that actually help move their business forward. So if you look at researchers from firms like Forrester, they'll say there are two reasons that slow down the latency of going from raw data to decision. Number one is access to data. That's the use case we just talked about. Number two is the trustworthiness of data. Our approach is very different on that. Once people actually can find the data that they're looking for, the big paradigm shift in the self-service world is that, instead of trying to process data based on transforming the metadata attributes, like I'm going to draw on a work flow diagram, bring in this table, aggregate with this operator, then split it this way, filter it, which is the classic ETL paradigm. The, I don't want to say profound, but maybe the very obvious thing we did was to say, "What if people could actually look at the data in the first place --" >> And sort of program it by example? >> We can tell, that's right. Because our eyes can tell us, our brains help us to say, we can immediately look at a data set, right? You look at an age column, let's say. There are values in the age column of 150 years. Maybe 20 years from now there may be someone who, on Earth, lives to 150 years. But pretty much -- >> Highly unlikely. >> The customers at the banks you work with are not 150 years old, right? So just being able to look at the data, to get to the point that you're asking, quality is about data being fit for a specific purpose. In order for data to be fit for a specific purpose, the person who needs the data needs to make the decision about what is quality data. Both of you may have access to the same transactional data, raw data, that the IT team has landed in the Hadoop cluster. But now you pull it up for one use case, you pull it up for another use case, and because your needs are different, what constitutes quality to you and where you want to make the investment is going to be very different. So by putting the power of that capability into the hands of the person who actually knows what they want, that is how we are actually able to change the paradigm and really compress the latency from "Here's my raw data" to "Here's the decision I want to make on that data." >> Let me ask, it sounds like, having put all of the self-service capabilities together, you've democratized access to this data. 
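Nenshad's age-column example is easy to make concrete. Below is a minimal, hypothetical sketch in pandas of the kind of "first eyes on data" profiling he is describing — the sample data, column names, and the 120-year threshold are all invented for illustration, and the point is only that fitness-for-purpose rules belong to whoever is consuming the data:

```python
# Minimal "first eyes on data" profiling sketch (pandas); data and rules are invented.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "age":         [34, 150, None, 41],   # 150 is recorded but implausible
    "country":     ["BR", "US", "US", None],
})

profile = pd.DataFrame({
    "dtype":    customers.dtypes.astype(str),
    "nulls":    customers.isna().sum(),
    "distinct": customers.nunique(),
    "min":      customers.min(numeric_only=True),
    "max":      customers.max(numeric_only=True),
})
print(profile)

# Fitness for purpose is the consumer's call: one analyst might exclude these rows,
# another might not care about the age column at all.
suspect = customers[(customers["age"] > 120) | (customers["age"].isna())]
print(suspect)
```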
Now, what happens in terms of governance, or more importantly, just trust, when the pipeline, you know, has to go beyond where you're working on it, to some of the analytics or some of the basic ingest? To say, "I know this data came from here "and it's going there." >> That's right, how do we verify the fidelity of these data sources? It's a fantastic question. So, in my career, having worked in BI for a couple of decades, I know I look much younger but it actually has been a couple of decades. Remember, the camera adds about 15 pounds, for those of you watching at home. (Dave and George laugh) >> George: But you've lost already. >> Thank you very much. >> So you've lost net 30. (Nenshad laughs) >> Or maybe I'm back to where I'm supposed to be. What I've seen as the two models of governance in the enterprise when it comes to analytics and information management, right? There's model one, which is, we're going to build an enterprise data warehouse, we're going to know all the possible questions people are going to ask in advance, we're going to preprogram the ETL routines, we're going to put something like a MicroStrategy or BusinessObjects, an enterprise-reporting factory tool. Then you spend 10 million dollars on that project, the users come in and for the first time they use the system, and they say, "Oh, I kind of want to change this, this way. "I want to add this calculation." It takes them about five minutes to determine that they can't do it for whatever reason, and what is the first feature they look for in the product in order to move forward? Download to Excel, right? So you invested 15 million dollars to build a download to Excel capability which they already had before. So if you lock things down too much, the point is, the end users will go around you. They've been doing it for 30 years and they'll keep doing it. Then we have model two. Model two is, Excel spreadsheet. Excel Hell, or spreadmarts. There are lots of words for these things. You have a version of the data, you have a version of the data, I have a version of the data. We all started from the same transactional data, yet you're the head of sales, so suddenly your forecast looks really rosy. You're the head of finance, you really don't like what the forecast looks like. And I'm the product guy, so why am I even looking at the forecast in the first place, but somehow I got access to the data, right? These are the two polarities of the enterprise that we've worked with for the last 30 years. We wanted to find sort of a middle path, which is to say, let's give people the freedom and flexibility to be able to do the transformations they need to. If they want to add a column, let them add a column. If they want to change a calculation, let them add a a calculation. But, every single step in the process must be recorded. It must be versioned, it must be auditable. It must be governed in that way. So why the large banks and the intelligence community and the large enterprise customers are attracted to Paxata is because they have the ability to have perfect retraceability for every decision that they make. I can actually sit next to you and say, "This is why the data looks like this. "This is how this value, which started at one million, "became 1.5 million." That covers the Paxata part. But then the answer to the question you asked is, how do you even extend that to a broader ecosystem? 
I think that's really about some of the metadata interchange initiatives that a lot of the vendors in the Hadoop space, but also in the traditional enterprise space, have had for the last many years. If you look at something like Apache Atlas or Cloudera Navigator, they are systems designed to collect, aggregate, and connect these different metadata steps so you can see in an end-to-end flow, this is the raw data that got ingested into Hadoop. These are the transformations that the end user did in Paxata in order to make it ready for analytics. This is how it's getting consumed in something like Zoom Data, and you actually have the entire life cycle of data now actually manifested as a software asset. >> So those not, in other words, those are not just managing within the perimeter of Hadoop. They are managers of managers. >> That's right, that's right. Because the data is coming from anywhere, and it's going to anywhere. And then you can add another dimension of complexity which is, it's not just one Hadoop cluster. It's 10 Hadoop clusters. And those 10 Hadoop clusters, three of them are in Amazon. Four of them are in Microsoft. Three of them are in Google Cloud platform. How do you know what people are doing with data then? >> How is this all presented to the user? What does the user see? >> Great question. The trick to all of this, of self service, first you have to know very clearly, who is the person you are trying to serve? What are their technical skills and capabilities, and how can you get them productive as fast as possible? When we created this category, our key notion was that we were going to go after analysts. Now, that is a very generic term, right? Because we are all, in some sense, analysts in our day-to-day lives. But in Paxata, a business analyst, in an enterprise organizational context, is somebody that has the ability to use Microsoft Excel, they have to have that skill or they won't be successful with today's Paxata. They have to know what a VLOOKUP is, because a VLOOKUP is a way to actually pull data from a second data source into one. We would all know that as a join or a lookup. And the third thing is, they have to know what a pivot table is and know how a pivot table works. Because the key insight we had is that, of the hundreds of millions of analysts, people who use Excel on a day-to-day basis, a lot of their work is data prep. But Excel, being an amazing generic tool, is actually quite bad for doing data prep. So the person we target, when I go to a customer and they say, "Are we a good candidate to use Paxata?" and we're talking to the actual person who's going to use the software, I say, "Do you know what a VLOOKUP is, yes or no? "Do you know what a pivot table is, yes or no?" If they have that skill, when they come into Paxata, we designed Paxata to be very attractive to those people. So it's completely point-and-click. It's completely visual. It's completely interactive. There's no scripting inside that whole process, because do you think the average Microsoft Excel analyst wants to script, or they want to use a proprietary wrangling language? I'm sorry, but analysts don't want to wrangle. Data scientists, the 1% of the 1%, maybe they like to wrangle, but you don't have that with the broader analyst community, and that is a much larger market opportunity that we have targeted. >> Well, very large, I mean, a lot of people are familiar with those concepts in Excel, and if they're not, they're relatively easy to learn. >> Nenshad: That's right. Excellent. 
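For readers who think in Excel terms, the mapping described above can be made concrete: a VLOOKUP is a join against a second data source, and a pivot table is a grouped aggregation. The snippet below is a minimal sketch in Python with pandas, not anything from Paxata itself; the table names, columns, and sample values are invented for the illustration.

```python
import pandas as pd

# Two small "data sources": transactions and a customer lookup table.
transactions = pd.DataFrame({
    "customer_id": [101, 102, 101, 103],
    "region": ["East", "West", "East", "West"],
    "amount": [250.0, 90.0, 40.0, 310.0],
})
customers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "segment": ["Retail", "Enterprise", "Retail"],
})

# The Excel VLOOKUP: pull each transaction's customer segment from the
# second table. In join terms this is a left join on customer_id.
enriched = transactions.merge(customers, on="customer_id", how="left")

# The Excel pivot table: total amount by region and segment.
pivot = enriched.pivot_table(
    index="region", columns="segment", values="amount",
    aggfunc="sum", fill_value=0,
)
print(pivot)
```

The point of the analogy is that an analyst who already knows the lookup and pivot concepts is doing joins and aggregations every day, just inside a tool that was not built for data prep.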
All right, Nenshad, we have to leave it there. Thanks very much for coming on The Cube, appreciate it. >> Thank you very much for having me. >> Congratulations for all the success. >> Thank you. >> All right, keep it right there, everybody. We'll be back with our next guest. This is The Cube, we're live from New York City at Big Data NYC. We'll be right back. (electronic music)
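The retraceability described in this conversation, where every preparation step is recorded, versioned, and explainable after the fact, is an idea that can be sketched independently of any product. The Python below is a rough illustration of that pattern, not Paxata's implementation: each transformation runs through a small wrapper that appends the step description, its parameters, and before/after row counts to an audit log. The sample data, the 120-year age cutoff, and the step names are assumptions made for the example.

```python
import pandas as pd

audit_log = []

def apply_step(df, description, func, **params):
    """Apply one prep step and record what was done, with before/after row counts."""
    before = len(df)
    result = func(df, **params)
    audit_log.append({
        "step": len(audit_log) + 1,
        "description": description,
        "params": params,
        "rows_before": before,
        "rows_after": len(result),
    })
    return result

raw = pd.DataFrame({"name": ["a", "b", "c"], "age": [34, 150, 61]})

# Step 1: drop implausible ages (the "150-year-old customer" case above).
clean = apply_step(
    raw, "drop rows with implausible age",
    lambda df, max_age: df[df["age"] <= max_age], max_age=120,
)

# Step 2: add a derived column.
clean = apply_step(
    clean, "bucket age into decades",
    lambda df: df.assign(age_decade=(df["age"] // 10) * 10),
)

for entry in audit_log:
    print(entry)
```

Running it prints one log entry per step, which is the kind of trail that lets someone sit next to a colleague and explain how a value moved from one number to another.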
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Citi | ORGANIZATION | 0.99+ |
October 27, 2013 | DATE | 0.99+ |
George | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
Nenshad | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Prakash | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
New York City | LOCATION | 0.99+ |
Nvidia | ORGANIZATION | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Earth | LOCATION | 0.99+ |
15 million dollars | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
30 years | QUANTITY | 0.99+ |
Forrester | ORGANIZATION | 0.99+ |
Excel | TITLE | 0.99+ |
thousands | QUANTITY | 0.99+ |
50 companies | QUANTITY | 0.99+ |
10 million dollars | QUANTITY | 0.99+ |
Standard Chartered Bank | ORGANIZATION | 0.99+ |
New York City | LOCATION | 0.99+ |
Nenshad Bardoliwalla | PERSON | 0.99+ |
two reasons | QUANTITY | 0.99+ |
one million | QUANTITY | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
two roles | QUANTITY | 0.99+ |
two polarities | QUANTITY | 0.99+ |
1.5 million | QUANTITY | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
150 years | QUANTITY | 0.99+ |
Hadoop | TITLE | 0.99+ |
Paxata | ORGANIZATION | 0.99+ |
second reason | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
two models | QUANTITY | 0.99+ |
second | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
Both | QUANTITY | 0.99+ |
three years ago | DATE | 0.99+ |
first time | QUANTITY | 0.98+ |
first time | QUANTITY | 0.98+ |
New York | LOCATION | 0.98+ |
both | QUANTITY | 0.98+ |
1% | QUANTITY | 0.97+ |
third thing | QUANTITY | 0.97+ |
one system | QUANTITY | 0.97+ |
about five minutes | QUANTITY | 0.97+ |
Paxata | PERSON | 0.97+ |
first feature | QUANTITY | 0.97+ |
Data | LOCATION | 0.96+ |
one part | QUANTITY | 0.96+ |
United States government | ORGANIZATION | 0.95+ |
thousands of tables | QUANTITY | 0.94+ |
20 years | QUANTITY | 0.94+ |
Model two | QUANTITY | 0.94+ |
10 Hadoop clusters | QUANTITY | 0.94+ |
terabytes | QUANTITY | 0.93+ |
Anjul Bhambri - IBM Information on Demand 2013 - theCUBE
okay welcome back to IBM's information on demand live in Las Vegas this is the cube SiliconANGLE movie bonds flagship program we go out to the events it's check the student from the noise talk to the thought leaders get all the data share that with you and you go to SiliconANGLE com or Wikibon or to get all the footage and we're if you want to participate with us we're rolling out our new innovative crowd activated innovation application called crowd chat go to crouch at net / IBM iod just login with your twitter handle or your linkedin and participate and share your voice is going to be on the record transcript of the cube conversations I'm John furrier with silicon items with my co-host hi buddy I'm Dave vellante Wikibon dork thanks for watching aren't you Oh bhambri is here she's the vice president of big data and analytics at IBM many time cube guests as you welcome back good to see you again thank you so we were both down at New York City last week for the hadoop world really amazing to see how that industry has evolved I mean you guys I've said the number of times today and I said this to you before you superglued your your big data or your analytics business to the Big Data meme and really created a new category I don't know if that was by design or you know or not but it certainly happened suddenly by design well congratulations then because because I think that you know again even a year a year and a half ago those two terms big data and analytics were sort of separate now it's really considered as one right yeah yeah I think because initially as people our businesses started getting really flooded with big data right dealing with the large volumes dealing with structured semi-structured or unstructured data they were looking at that you know how do you store and manage this data in a cost-effective manner but you know if you're just only storing this data that's useless and now obviously it's people realize that they need and there is insights from this data that has to be gleaned and there's technology that is available to do that so so customers are moving very quickly to that it's not just about cost savings in terms of handling this data but getting insights from it so so big data and analytics you know is becoming it's it's becoming synonymous heroes interesting to me on Jules is you know just following this business it's all it's like there's a zillion different nails out there and and and everybody has a hammer and they're hitting the nail with their unique camera but I've it's like IBM as a lot of different hammers so we could talk about that a little bit you've got a very diverse portfolio you don't try to force one particular solution on the client you it sort of an it's the Pens sort of answer we could talk about that a little bit yeah sure so in the context of big data when we look at just let's start with transactional data right that continues to be the number one source where there is very valuable insights to be gleaned from it so the volumes are growing that you know we have retailers that are handling now 2.5 million transactions per hour a telco industry handling 10 billion call data detailed records every day so when you look at that level that volume of transactions obviously you need to be you need engines that can handle that that can process analyze and gain insights from this that you can get you can do ad hoc analytics on this run queries and get information out of this at the same speed at which this data is getting generated so you know we we announced 
the blu acceleration rate witches are in memory columnstore which gives you the power to handle these kinds of volumes and be able to really query and get value out of this very quickly so but now when you look at you know you go beyond the structured data or beyond transactional data there is semi structured unstructured data that's where which is still data at rest is where you know we have big insights which leverages Apache Hadoop open source but we've built lots of capabilities on top of that where we get we give the customers the best of open source plus at the same time the ability to analyze this data so you know we have text analytics capabilities we provide machine learning algorithms we have provided integration with that that customers can do predictive modeling on this data using SPSS using open source languages like our and in terms of visualization they can visualize this data using cognos they can visualize this data using MicroStrategy so we are giving customers like you said it's not just you know there's one hammer and they have to use that for every nail the other aspect has been around real time and we heard that a lot at strada right in the like I've been going to start us since the beginning and those that time even though we were talking about real time but nobody else true nobody was talking nobody was back in the hadoop world days ago one big bats job yeah so in real time is now the hotbed of the conversation a journalist storm he's new technologies coming out with him with yarn has done it's been interesting yeah you seen the same thing yeah so so and and of course you know we have a very mature technology in that space you know InfoSphere streams for a real-time analytics has been around for a long time it was you know developed initially for the US government and so we've been you know in the space for more than anybody else and we have deployments in the telco space where you know these tens of billions of call detail records are being processed analyzed in real time and you know these telcos are using it to predict customer churn to prevent customer churn gaining all kinds of insights and extremely high you know very low latency so so it's good to see that you know other companies are recognizing the need for it and are you know bringing other offerings out in this space yes every time before somebody says oh I want to go you know low latency and I want to use spark you say okay no problem we could do that and streets is interesting because if I understand it you're basically acting on the data producing analytics prior to persisting the data on in memory it's all in memory and but yet at the same time is it of my question is is it evolving where you now can blend that sort of real-time yeah activity with maybe some some batch data and and talk about how that's evolving yeah absolutely so so streams is for for you know where as data is coming in it can be processed filtered patterns can be seen in streams of data by correlating connecting different streams of data and based on a certain events occurring actions can be taken now it is possible that you know all of this data doesn't need to be persisted but there may be some aspects or some attributes of this data that need to be persisted you could persist this data in a database that is use it as a way to populate your warehouse you could persist it in a Hadoop based offering like BigInsights where you can you know bring in other kinds of data and enrich the data it's it's like data loans from data and a 
different picture emerges Jeff Jonas's puzzle right so that's that that's very valid and so so when we look at the real time it is about taking action in real time but there is data that can be persisted from that in both the warehouse as well as on something like the insides are too I want to throw a term at you and see what what what this means to you we actually doing some crowd chats with with IBM on this topic data economy was going to SS you have no date economy what does the data economy mean to you what our customers you know doing with the data economy yes okay so so my take on this is that there are there are two aspects of this one is that the cost of storing the data and analyzing the data processing the data has gone down substantially the but the value in this data because you can now process analyze petabytes of this data you can bring in not just structured but semi-structured and unstructured data you can glean information from different types of data and a different picture emerges so the value that is in this data has gone up substantially I previously a lot of this data was probably discarded people without people knowing that there is useful information in this so to the business the value in the data has gone up what they can do with this data in terms of making business decisions in terms of you know making their customers and consumers more satisfied giving them the right products and services and how they can monetize that data has gone up but the cost of storing and analyzing and processing has gone down rich which i think is fantastic right so it's a huge win win for businesses it's a huge win win for the consumers because they are getting now products and services from you know the businesses which they were not before so that that to me is the economy of data so this is why I John I think IBM is really going to kill it in this in this business because they've got such a huge portfolio they've got if you look at where I OD has evolved data management information management data governance all the stuff on privacy these were all cost items before people looked at him on I gotta deal with all this data and now it's there's been a bit flip uh-huh IBM is just in this wonderful position to take advantage of it of course Ginny's trying to turn that you know the the battleship and try to get everybody aligned but the moons and stars are aligning and really there's a there's a tailwind yeah we have a question on domains where we have a question on Twitter from Jim Lundy analyst former Gartner analyst says own firm now shout out to Jim Jim thanks for for watching as always I know you're a cube cube alum and also avid watcher and now now a loyal member of the crowd chat community the question is blu acceleration is helps drive more data into actionable analytics and dashboards mm-hmm can I BM drive new more new deals with it I've sued so can you expound it answers yes yes yes and can you elaborate on that for Jim yeah I you know with blu acceleration you know we have had customers that have evaluated blue and against sa bihana and have found that what blue can provide is is they ahead of what SI p hana can provide so we have a number of accounts where you know people are going with the performance the throughput you know what blue provides is is very unique and it's very head of what anybody else has in the market in solving SI p including SI p and and you know it's ultimately its value to the business right and that's what we are trying to do that how do we let our 
customers the right technology so that they can deal with all of this data get their arms around it get value from this data quickly that's that's really of a sense here wonderful part of Jim's question is yes the driving new deals for sure a new product new deals me to drive new footprints is that maybe what he's asking right in other words you traditional IBM accounts are doing doing deals are you able to drive new footprints yeah yeah we you know there are there are customers that you know I'm not gonna take any names here but which have come to us which are new to IBM right so it's a it's that to us and that's happening that new business that's Nate new business and that's happening with us for all our big data offerings because you know the richness that is there in the portfolio it's not that we have like you were saying Dave it's not that we have one hammer and we are going to use it for every nail that is out there you know as people are looking at blue big insights for her to streams for real time and with all this comes the whole lifecycle management and governance right so security privacy all those things don't don't go away so all the stuff that was relevant for the relational data now we are able to bring that to big data very quickly and which is I think of huge value to customers and as people are moving very quickly in this big data space there's nobody else who can just bring all of these assets together from and and you know provide an integrated platform what use cases to Jim's point I don't you know I know you don't want to name names but can you name you how about some use cases that that these customers are using with blue like but use cases and they solving so you know I from from a use case a standpoint it is really like you know people are seeing performance which is you know 30 32 times faster than what they had seen when they were not using and in-memory columnstore you know so eight to twenty five thirty two times per men's gains is is you know something that is huge and is getting more and more people attracted to this so let's take an industry take financial services for example so the big the big ones in financial services are a risk people want to know you know are they credit risk yeah there's obviously marketing serving up serving up ads a fraud detection you would think is another one that in more real time are these these you know these will be the segments and of course you know retail where again you know there is like i was saying right that the number of transactions that are being handled is is growing phenomenally i gave one example which was around 2.5 million transactions per hour which was unheard of before and the information that has to be gleaned from it which is you know to leverage this for demand forecasting to leverage this for gaining insights in terms of giving the customers the right kind of coupons to make sure that those coupons are getting you know are being used so it was you know before the world used to be you get the coupons in your email in your mail then the world changed to that you get coupons after you've done the transaction now where we are seeing customers is that when a customer walks in the store that's where they get the coupons based on which i layer in so it's a combination of the transactional data the location data right and we are able to bring all of this together so so it's blue combined with you know what things like streams and big insights can do that makes the use cases even more powerful and unique so I 
like this new format of the crowd chatting emily is a one hour crowd chat where it's kind of like thought leaders just going to pounding away but this is more like reddit AMA but much better question coming in from grant case is one of the themes to you is one of the themes we've heard about in Makino was the lack of analytical talent what is going on to contribute more value for an organization skilling up the work for or implementing better software tools for knowledge workers so in terms so skills is definitely an issue that has been a been a challenge in the in the industry with and it got pretty compound with big data and the new technology is coming in from the standpoint of you know what we are doing for the data scientists which is you know the people who are leveraging data to to gain new insights to explore and and and discover what other attributes they should be adding to their predictive models to improve the accuracy of those models so there is there's a very rich set of tools which are used for exploration and discovery so we have which is both from you know Cognos has such such such capabilities we have such capabilities with our data Explorer absolutely basically tooling for the predictive on the modeling sister right now the efforts them on the modeling and for the predictive and descriptive analytics right I mean there's a lot of when you look at that Windows petabytes of data before people even get to predictive there's a lot of value to be gleaned from descriptive analytics and being able to do it at scale at petabytes of data was difficult before and and now that's possible with extra excellent visualization right so that it's it's taking things too that it the analytics is becoming interactive it's not just that you know you you you are able to do this in real time ask the questions get the right answers because the the models running on petabytes of data and the results coming from that is now possible so so interactive analytics is where this is going so another question is Jim was asking i was one of ibm's going around doing blue accelerator upgrades with all its existing clients loan origination is a no brainer upgrade I don't even know that was the kind of follow-up that I had asked is that new accounts is a new footprint or is it just sort of you it is spending existing it's it's boat it's boat what is the characteristic of a company that is successfully or characteristics of a company that is successfully leveraging data yeah so companies are thinking about now that you know their existing edw which is that enterprise data warehouse needs to be expanded so you know before if they were only dealing with warehouses which one handling just structure data they are augmenting that so this is from a technology standpoint right there augmenting that and building their logical data warehouse which takes care of not just the structure data but also semi-structured and unstructured data are bringing augmenting the warehouses with Hadoop based offerings like big insights with real-time offerings like streams so that from an IT standpoint they are ready to deal with all kinds of data and be able to analyze and gain information from all kinds of data now from the standpoint of you know how do you start the Big Data journey it the platform that at least you know we provide is a plug-and-play so there are different starting points for for businesses they may have started with warehouses they bring in a poly structured store with big inside / Hadoop they are building social 
profiles from social and public data which was not being done before matching that with the enterprise data which may be in CRM systems master data management systems inside the enterprise and which creates quadrants of comparisons and they are gaining more insights about the customer based on master data management based on social profiles that they are building so so this is one big trend that we are seeing you know to take this journey they have to you know take smaller smaller bites digests that get value out of it and you know eat it in chunks rather than try to you know eat the whole pie in one chunk so a lot of companies starting with exploration proof of concepts implementing certain use cases in four to six weeks getting value and then continuing to add more and more data sources and more and more applications so there are those who would say those existing edw so many people man some people would say they should be retired you would disagree with that no no I yeah I I think we very much need that experience and expertise businesses need that experience and expertise because it's not an either/or it's not that that goes away and there comes a different kind of a warehouse it's an evolution right but there's a tension there though wouldn't you say there's an organizational tension between the sort of newbies and the existing you know edw crowd i would say that maybe you know three years ago that was there was a little bit of that but there is i mean i talked to a lot of customers and there is i don't see that anymore so people are people are you know they they understand they know what's happening they are moving with the times and they know that this evolution is where the market is going where the business is going and where the technology you know they're going to be made obsolete if they don't embrace it right yeah yeah so so as we get on time I want to ask you a personal question what's going on with you these days with within IBM asli you're in a hot area you are at just in New York last week tell us what's going on in your life these days I mean things going well I mean what things you're looking at what are you paying attention to what's on your radar when you wake up and get to work before you get to work what's what are you thinking about what's the big picture so so obviously you know big data has been really fascinating right lots of lots of different kinds of applications in different industries so working with the customers in telco and healthcare banking financial sector has been very educational right so a lot of learning and that's very exciting and what's on my radar is we are obviously now seeing that we've done a lot of work in terms of helping customers develop and their Big Data Platform on-premise now we are seeing more and more a trend where people want to put this on the cloud so that's something that we have now a lot of I mean it's not like we haven't paid attention to the cloud but you know in the in the coming months you are going to see more from us are where you know how do we build cus how do we help customers build both private and and and public cloud offerings are and and you know where they can provide analytics as a service two different lines of business by setting up the clouds soso cloud is certainly on my mind software acquisition that was a hole in the portfolio and that filled it you guys got to drive that so so both software and then of course OpenStack right from an infrastructure standpoint for what's happening in the open source so we 
are you know leveraging both of those and like I said you'll hear more about that OpenStack is key as I say for you guys because you have you have street cred when it comes to open source I mean what you did in Linux and made a you know great business out of that so everybody will point it you know whether it's Oracle or IBM and HP say oh they just want to sell us our stack you've got to demonstrate and that you're open and OpenStack it's great way to do that and other initiatives as well so like I say that's a V excited about that yeah yeah okay I sure well thanks very much for coming on the cube it's always a pleasure to thank you see you yeah same here great having you back thank you very much okay we'll be right back live here inside the cube here and IV IBM information on demand hashtag IBM iod go to crouch at net / IBM iod and join the conversation where we're going to have a on the record crowd chat conversation with the folks out the who aren't here on-site or on-site Worth's we're here alive in Las Vegas I'm Java with Dave on to write back the q
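The stream-then-persist pattern described in the conversation above, where call detail records are analyzed as they arrive, certain events trigger immediate action, and only selected attributes are kept for later batch analysis, can be sketched without reference to InfoSphere Streams or BigInsights. The Python below is a simplified illustration only; the record fields, the three-dropped-calls threshold, and the idea of treating repeated dropped calls as a churn signal are assumptions for the example, not IBM's actual logic.

```python
from collections import defaultdict

def call_detail_records():
    """Stand-in for a live CDR feed; in practice this would be a message stream."""
    yield {"caller": "555-0101", "duration_sec": 4,   "dropped": True}
    yield {"caller": "555-0101", "duration_sec": 6,   "dropped": True}
    yield {"caller": "555-0199", "duration_sec": 480, "dropped": False}
    yield {"caller": "555-0101", "duration_sec": 3,   "dropped": True}

dropped_counts = defaultdict(int)
persisted = []  # attributes kept for later batch analysis

for cdr in call_detail_records():
    # Real-time part: react immediately to a pattern seen in the stream.
    if cdr["dropped"]:
        dropped_counts[cdr["caller"]] += 1
        if dropped_counts[cdr["caller"]] >= 3:
            print(f"churn-risk alert for {cdr['caller']}")

    # Persist only a subset of attributes, not the full raw record.
    persisted.append({"caller": cdr["caller"], "duration_sec": cdr["duration_sec"]})

print(len(persisted), "records persisted for batch analysis")
```

In a real deployment the generator would be replaced by a message feed and the persisted subset would land in a warehouse or a Hadoop store, which is the real-time-plus-batch split described in the interview.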
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jim | PERSON | 0.99+ |
Jeff Jonas | PERSON | 0.99+ |
Jim Lundy | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
New York City | LOCATION | 0.99+ |
one hour | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
Anjul Bhambri | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
30 | QUANTITY | 0.99+ |
HP | ORGANIZATION | 0.99+ |
Dave vellante | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
2013 | DATE | 0.99+ |
Linux | TITLE | 0.99+ |
Gartner | ORGANIZATION | 0.99+ |
last week | DATE | 0.99+ |
eight | QUANTITY | 0.99+ |
two aspects | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
three years ago | DATE | 0.98+ |
four | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
six weeks | QUANTITY | 0.98+ |
one chunk | QUANTITY | 0.98+ |
SPSS | TITLE | 0.98+ |
John furrier | PERSON | 0.97+ |
one hammer | QUANTITY | 0.97+ |
US government | ORGANIZATION | 0.97+ |
Ginny | PERSON | 0.97+ |
year and a half ago | DATE | 0.96+ |
32 times | QUANTITY | 0.96+ |
two terms | QUANTITY | 0.95+ |
telco | ORGANIZATION | 0.95+ |
today | DATE | 0.94+ |
ORGANIZATION | 0.93+ | |
Cognos | ORGANIZATION | 0.93+ |
around 2.5 million transactions per hour | QUANTITY | 0.93+ |
one example | QUANTITY | 0.93+ |
two different lines | QUANTITY | 0.93+ |
ibm | ORGANIZATION | 0.92+ |
themes | QUANTITY | 0.9+ |
one | QUANTITY | 0.9+ |
number one | QUANTITY | 0.9+ |
petabytes | QUANTITY | 0.9+ |
Jim Jim | PERSON | 0.89+ |
10 billion call data | QUANTITY | 0.89+ |
OpenStack | TITLE | 0.89+ |
Hadoop | TITLE | 0.88+ |
bhambri | PERSON | 0.88+ |
days | DATE | 0.88+ |
tens of billions of call | QUANTITY | 0.87+ |
Wikibon | ORGANIZATION | 0.85+ |
ORGANIZATION | 0.85+ | |
twenty five thirty two times | QUANTITY | 0.85+ |
2.5 million transactions per hour | QUANTITY | 0.84+ |
one big | QUANTITY | 0.83+ |
blue | ORGANIZATION | 0.83+ |
one big bats | QUANTITY | 0.82+ |
one of | QUANTITY | 0.8+ |
IBM iod | TITLE | 0.78+ |
zillion different nails | QUANTITY | 0.77+ |
ORGANIZATION | 0.74+ | |
SiliconANGLE com | OTHER | 0.74+ |
Makino | TITLE | 0.73+ |
Christian Chabot - Tableau Customer Conference 2013 - theCUBE
okay we're back this is Dave Volante with Jeff Kelly we're with Ricky bond on organ this is the cubes silicon angles flagship product we go out to the events we extract the signal from the noise we bring you the tech athletes who are really changing the industry and we have one here today christiane sabo is the CEO the leader the spiritual leader of of this conference and of Tablo Kristin welcome to the cube thanks for having me yeah it's our pleasure great keynote the other day I just got back from Italy so I'm full of superlatives right it really was magnificent I was inspired I think the whole audience was inspired by your enthusiasm and what struck me is I'm a big fan of simon Sinek who says that people don't buy what you do they buy why you do it and your whole speech was about why you're here everybody can talk about their you know differentiators they can talk about what they sell you talked about why you're here was awesome so congratulations I appreciate that yeah so um so why did you start then you and your colleagues tableau well it's how below really started with a series of breakthrough research innovations that was this seed there are three co-founders of tableau myself dr. crystal T and professor Pat Hanrahan and those two are brilliant inventors and designers and researchers and the real hero of the tableau story and the company formed when they met on entrepreneur and a customer I had spent several years as a data analyst when I first came out of college and I understood the problems making sense of data and so when I encountered the research advancements they had made I saw a vision of the future a much better world that could bring the power of data to a vastly larger number of people yeah and it's really that simple isn't it and and so you gave some fantastic examples them in the way in which penicillin you know was discovered you know happenstance and many many others so those things inspire you to to create this innovation or was it the other way around you've created this innovation and said let's look around and see what others have done well I think the thing that we're really excited about is simply put as making databases and spreadsheets easy for people to use I can talk to someone who knows nothing about business intelligence technology or databases or anything but if I say hey do you have any spreadsheets or data files or databases you you just feel like it could it could get in there and answer some questions and put it all together and see the big picture and maybe find a thing or two everyone not everyone has been in that situation if nothing else with the spreadsheet full of stuff like your readership or the linkage the look the the traffic flow on on the cube website everyone can relate to that idea of geez why can't I just have a google for databases and that's what tableau is doing right right so you've kind of got this it's really not a war it's just two front two vectors you know sometimes I did I did tweet out they have a two-front war yeah what'd you call it the traditional BI business I love how you slow down your kids and you do that and then Excel but the point I made on Twitter in 140 characters was you it will be longer here I'm a little long-winded sometimes on the cube but you've got really entrenched you know bi usage and you've got Excel which is ubiquitous so it sounds easy to compete with those it's not it's really not you have to have a 10x plus value problem solutely talked about that a little bit well I think the most important thing 
we're doing is we're bringing the power of data and analytics to a much broader population of people so the reason the answer that way is that if you look at these traditional solutions that you described they have names like and these are the product brand names forget who owns them but the product brand names people are used to hearing when it comes to enterprise bi technology our names like Business Objects and Cognos and MicroStrategy and Oracle Oh bi and big heavy complicated develop intensive platforms and surprise surprise they're not in the hands of very many people they're just too complicated and development heavy to use so when we go into the worlds even the world's biggest companies this was a shocker for us even when we go into the world's most sophisticated fortune 500 companies and the most cutting-edge industries with the top-notch people most of the people in their organization aren't using those platforms because of theirs their complication and expense and development pull and so usually what we end up doing is just bringing the power of easy analytics and dashboards and visualization and easy QA with data to people who have nothing other than maybe a spreadsheet on their desk so in that sense it's actually a little easier than it sounds well you know I have to tell you I just have a cio consultancy and back in the day and we used to go in and do application portfolio analysis and we would look at the applications and we always advise the CIOs that the value of an application is a function of its use how much is being adopted and the impact of that use you know productivity of the users right and you'd always find that this is the dss system the decision support system like you said there were maybe 3 to 15 users yeah and an organization of tens of thousands of people yeah if they were very productive so imagine if you can you can permeate the other you know hundreds of thousands of users that are out there do you see that kind of impact that productivity impact as the potential for your marketplace absolutely I you know the person who I think said it best was the CEO of Cisco John Chambers and I'll paraphrase him here but he has this great thing he said which is he said you know if I can get each of the people on my team consulting data say oh I don't know twice per day before making a decision and they do the same thing with their people and their people and so you know that's a million decisions a month you did the math better made than my competition I don't want people waiting around for top top management to consult some data before making a decision I want all of our people all the time Consulting data before making a decision and that's the real the real spirit of this new age of BI for too long it's been in the hands of a high priesthood of people who know how to operate these complicated convoluted enterprise bi systems and the revolution is here people are fed up with it they're taking power into their hands and they're driving their organizations forward with the power of data thanks to the magic of an easy-to-use suite like tableau well it's a perfect storm right because everybody wants to be a data-driven organization absolutely data-driven if you don't have the tools to be able to visualize the data absolutely so Jeff if you want to jump in well Christian so in your keynote you talked for the majority of the keynote about human intuition and the human element talk a little bit about that because when we hear about in the press these days about big data 
it's oh well the the volume of data will tell you what the answer is you don't need much of the human element talk about why you think the human element is so important to data-driven decision-making and how you incorporate that into your design philosophy when you're building the product and you're you know adding new features how does the human element play in that scenario yeah I mean it's funny dated the data driven moniker is coming these days and we're tableaus a big big believer in the power of data we use our tools internally but of course no one really wants to be data driven if you drive your company completely based on data say hello to the cliff wall you will drive it off a cliff you really want people intelligent domain experts using a combination of act and intuition and instinct to make data informed decisions to make great decisions along the way so although pure mining has some role in the scheme of analytics frankly it's a minor role what we really need to do is make analytic software that as I said yesterday is like a bicycle for our minds this was the great Steve Jobs quote about computers that their best are like bicycles for our mind effortless machines that just make us go so much faster than any other species with no more effort expended right that's the spirit of computers when they're at our best Google Google is effortless to use and makes my brain a thousand times smarter than it is right unfortunately over an analytic software we've never seen software that does tap in business intelligence software there's so much development weight and complexity and expense and slow rollout schedules that were never able to get that augmentation of the brain that can help lead to better decisions so at tableau in terms of design we value our product requirements documents say things like intuition and feel and design and instinct and user experience they're focused on the journey of working with data not just some magic algorithm that's gonna spit out some answer that tells you what to do yeah I mean I've often wondered where that bi business would be that traditional decision support business if it weren't for sarbanes-oxley I mean it gave it a new life right because you had to have a single version of the truth that was mandated by by the government here we had Bruce Boston on yesterday who works over eight for a company that shall not be named but anyway he was talking about okay Bruce in case you're watching we're sticking to our promise but he was talking about intent desire and satisfaction things those are three things intent desire and satisfaction that machines can't do like the point being you just you know it was the old bromide you can't take the humans in the last mile yeah I guess yeah do you see that ever changing no I mean I think you know I I went to a friend a friend of mine I just haven't seen in a while a friend of mine once said he was an he was an artificial intelligence expert had Emilie's PhD in a professorship in AI and once I naively asked him I said so do we have artificial intelligence do we have it or not and we've been talking about for decades like is it here and he said you're asking the wrong question the question is how smart our computers right so I just think we're analytics is going is we want to make our computers smarter and smarter and smarter there'll be no one day we're sudden when we flip a switch over and the computer now makes the decision so in that sense the answer to your question is I keep I see things going is there is it 
going now but underneath the covers of human human based decision making it are going to be fantastic advancements and the technology to support good decision making to help people do things like feel and and and chase findings and shift perspectives on a problem and actually be creative using data I think there's I think it's gonna be a great decade ahead ahead of us so I think part of the challenge Christian in doing that and making that that that evolution is we've you know in the way I come the economy and and a lot of jobs work over the last century is you know you're you're a cog in a wheel your this is how you do your job you go you do it the same way every day and it's more of that kind of almost assembly line type of thinking and now we're you know we're shifting now we're really the to get ahead in your career you've got to be as good but at an artist you've got to create B you've got to make a difference is the challenge do you see a challenge there in terms of getting people to embrace this new kind of creativity and again how do you as a company and as a you know provider of data visualization technology help change some of those attitudes and make people kind of help people make that shift to more of less of a you know a cog in a larger organization to a creative force inside that organ well mostly I feel like we support what people natively want to do so there are there are some challenges but I mostly see opportunity there in category after category of human activity we're seeing people go from consumers to makers look at publishing from 20 years ago to now self-publishing come a few blogs and Twitter's Network exactly I mean we've gone from consumers to makers everyone's now a maker and we have an ecosystem of ideas that's so positive people naturally want to go that way I mean people's best days on the job are when they feel they're creating something and have that sense of achievement of having had an idea and seeing some progress their hands made on that idea so in a sense we're just fueling the natural human desire to have more participation with data to id8 with data to be more involved with data then they've been able to in the past and again like other industries what we're seeing in this category of technology which is the one I know we're going from this very waterfall cog in a wheel type process is something that's much more agile and collaborative and real-time and so it's hard to be creative and inspired when you're just a cog stuck in a long waterfall development process so it's mostly just opportunity and really we're just fueling the fire that I think is already there yeah you talked about that yesterday in your talk you gave a great FAA example the Mayan writing system example was fantastic so I just really loved that story you in your talk yesterday basically told the audience first of all you have very you know you have clarity of vision you seem to have certainty in your vision of passion for your vision but the same time you said you know sometimes data can be confusing and you're not really certain where it's going don't worry about that it's no it's okay you know I was like all will be answered eventually what but what about uncertainty you know in your minds as the you know chief executive of this organization as a leader in a new industry what things are uncertain to you what are the what are the potential blind spots for you that you worry about do you mean for tableau as a company for people working with data general resource for tableau as a 
company oh I see well I think there's always you know I got a trip through the spirit of the question but we're growing a company we're going a disruptive technology company and we want to embrace all the tall the technologies that exist around us right we want to help to foster day to day data-driven decision-making in all of its places in forms and it seems to me that virtually every breakthrough technology company has gone through one or two major Journal technology transformations or technology shocks to the industry that they never anticipated when they founded the company okay probably the most recent example is Facebook and mobile I mean even though even though mobile the mobile revolution was well in play when when Facebook was founded it really hadn't taken off and that was a blind Facebook was found in oh seven right and look what happened to them right after and here's that here's new was the company you can get it was founded in oh seven yeah right so most companies I mean look how many companies were sort of shocked by the internet or shocked by the iPod or shocked by the emergence of a tablet right or shocked by the social graph you know I think for us in tableaus journey if this was the spirit of the thought of the question we will have our own shocks happen the first was the tablet I mean when we founded tableau like the rest of the world we never would have anticipated that that a brilliant company would finally come along and crack the tablet opportunity wide open and before in a blink of an eye hundreds of millions of people are walking around with powerful multi-touch graphic devices in their I mean who would have guessed people wouldn't have guessed it no six let alone oh three know what and so luckily that's what that's I mean so this is the good kind of uncertainty we've been able to really rally around that there are our developers love to work on this area and today we have probably the most innovative mobile analytics offering on the market but it's one we never could have anticipated so I think the biggest things in terms of big categories of uncertainty that we'll see going forward are similar shocks like that and our success will be determined by how well we're able to adapt to those so why is it and how is it that you're able to respond so quickly as an organization to some of those tectonic shifts well I think the most important thing is having a really fleet-footed R&D team we have just an exceptional group of developers who we have largely not hired from business technology companies we have something very distributed going a tableau yeah one of the amazing things about R&D key our R&D team is when we decided to build just this amazing high-wattage cutting-edge R&D team and focus them on analytics and data we decided not to hire from other business intelligence companies because we didn't think those companies made great products so we've actually been hiring from places like Google and Facebook and Stanford and MIT and computer gaming companies if you look at the R&D engineers who work on gaming companies in terms of the graphic displays and the response times and the high dimensional data there are actually hundreds of times more sophisticated in their thinking and their engineering then some engineer who was working for an enterprise bi reporting company so this incredible horsepower this unique team of inspired zealots and high wattage engineers we have in our R&D team like Apple that's the key to being able to respond to these disruptive shocks every 
once in a while and rule and really sees them as an opportunity well they're fun to I mean think of something on the stage yesterday and yeah we're in fucky hats and very comfortable there's never been an R&D team like ours assembled in analytics it's been done in other industries right Google and Facebook famously but in analytics there's never been such an amazing team of engineers and Christian what struck me one of the things that struck me yesterday during your keynote or the second half of the keynote was bringing up the developers and talking about the specific features and functions you're gonna add to the product and hearing the crowd kind of erupt at different different announcements different features that you're adding and it's clear that you're very customer focused at this at tableau of you I mean you're responding to the the needs and the requests of your customers and I that's clearly evident again in the in the passion that these customers have for your for your product for your company how do you know first I'm happy how do you maintain that or how do you get get to that point in the first place where you're so customer focused and as you go forward being a public company now you're gonna get pressure from Wall Street and quarter results and all that that you know that comes with that kind of comes with the territory how do you remain that focused on the customer kind of as your you know you're going to be under a lot of pressure to grow and and you know drive revenue yeah I keep that focus well there's two things we do it's a it's always a challenge to stay really connected to your customers as you get big but it's what we pride ourselves on doing and there's two specific things we do to foster it the first is that we really try to focus the company and we try to make a positive aspect of the culture the idea of impact what is the impact of the work we're having and in fact a great example of how we foster that is we bring our entire support and R&D team to this conference no matter where it is we take we fly I mean in this case we literally flew the entire R&D team and product management team and whatnot across country and the time they get here face to face face to face with customers and hearing the customer stories and the victories and actually seeing the feedback you just described really inspires them it gives them specific ideas literally to go back and start working on but it also just gives them a sense of who comes first in a way that if you don't leave the office and you don't focus on that really doesn't materialize and the way you want it the second thing we do is we are we are big followers of I guess what's called the dog food philosophy of eat your own dog so drink your own champagne and so one of our core company values that tableau is we use our products facility a stated value of the company we use our products and into an every group at tableau in tests in bug regressions in development in sales and marketing and planning and finance and HR every sip marketing marketing is so much data these these every group uses tableau to run our own business and make decisions and what happens Matt what's really nice about a company because you know we're getting close to a thousand people now and so it's keeping the spirit you just described alive is really important it becomes quite challenging vectors leagues for it because when that's one of your values and that's the way the culture has been built every single person in the company is a customer everyone 
understands the customer's situation and the frustrations and the feature requests and knows how to support them when they meet them and can empathize with them when they're on the phone and is a tester automatically by virtue of using the product so we just try to focus on a few very authentic things to keep our connection with the customer as close as possible I'll say christen your company is a rising star we've been talking all this week of the similarities that we were talking off about the similarities with with ServiceNow just in terms of the passion within the customer base we're tracking companies like workday you know great companies that are that are that are being built new emerging disruptive companies we put you in that in that category and we're very excited for different reasons you know different different business altogether but but there are some similar dynamics that we're watching so as observers it's independent observers what kinds of things do you want us to be focused on watching you over the next 12 18 24 months what should we be paying attention to well I think the most important thing is tableau ultimately is a product company and we view ourselves very early in our product development lifecycle I think people who don't really understand tableau think it's a visualization company or a visualization tool I don't I don't really understand that when you talk about the vision a lot but okay sure we can visualization but there's just something much bigger I mean you asked about people watching the company I think what's important to watch is that as I spoke about makino yesterday tableau believes what is called the business intelligence industry what's called the business analytics technology stack needs to be completely rewritten from scratch that's what we believe to do over it's a do-over it's based on technology from a prior hair prior era of computing there's been very little innovation the R&D investment ratios which you can look up online of the companies in this space are pathetically low and have been for decades and this industry needs a Google it needs an apple it's a Facebook an RD machine that is passionate and driven and is leveraging the most recent advances in computing to deliver products that people actually love using so that people start to enjoy doing analytics and have fun with it and make data-driven driven decision in a very in a very in a way that's just woven into their into their into their enjoyment and work style every every single day so the big series of product releases you're going to see from us over the next five years that's the thing to watch and we unveiled a few of them yesterday but trust me there's a lot more that's you a lot of applause christina is awesome you can see you know the passion that you're putting forth your great vision so congratulations in the progress you've made I know I know you're not done we'll be watching it thanks very much for coming to me I'm really a pleasure thank you all right keep right there everybody we're going wall to wall we got a break coming up next and then we'll be back this afternoon and this is Dave Volante with Jeff Kelly this is the cube we'll be right back
Dr. Amr Awadallah - Interview 2 - Hadoop World 2011 - theCUBE
Yeah, I'm here with Amr Awadallah, the co-founder, back to back. This is theCUBE, SiliconANGLE.com, SiliconANGLE.tv's production of theCUBE, our flagship telecast. We go out to the events. That was a great conversation. It was really just cool. We could have probably hit on a few more things; obviously well read. Awesome. Co-founder of Cloudera. You did a good job teaming up with that co-founder, huh? Not bad on the Cube, huh? He's not bad on the Cube, is he? >>He reads the internet. >>That's what I'm saying. >>Anything that's going on. >>He's a Cube star, you know. And >>Technology. Jeff knows it. Yeah. >>We tell you, I'm smarter just by being in Cloudera all those years, and I actually was following what he was saying. So, okay, so you're back. We were talking earlier with Mike about the relational database thing, so I kind of pick that up where we left off with you. He was really excited. It's like, hey, we saw that relational database movement happen, and he was part of that generation. And now things are happening in a similar way, still early. So I was trying to really peg with him how early we are on the curve. You know, this is 1,400 people; it's not the Javits Center yet. Maybe Hadoop World next year might be at the Javits Center, 35,000, just don't go to Vegas. So I'm trying to figure out where we are on that curve. Are we on the upward slope, you know, down here, not even hitting that yet? >>I think we're moving up quicker than previous waves. Actually, if you look at Oracle, for example, I think it took them 15, 20 years until they really became a mature company. VMware, which started about, what, 12, 13 years ago, took maybe eight years to become a big, mature company. I'm hoping we're going to do it in five, so a couple more years. >>Highly accelerated. >>Yes. But yeah, I've been surprised by the growth. I have been, right? I've been told and warned about enterprise software and that it takes a long time for production adoption to take place. >>But the consumerization trend is really changing that. I mean, it seems that the enterprises are always last. Why the shorter >>Cycle? I think the shorter cycle is coming from having the right solution for the right problem at the right time. I think that's a big part of it, so luck definitely is a big part of this. Now, in terms of why the adoption is changing compared to a couple of decades ago, I think that's coming just because of how quickly the technology itself, the underlying hardware, is evolving. Right now, the fact that you can buy a single server that has eight to 16 cores and twelve two-terabyte hard drives is something that's just pushing the limits of what you can do with the existing systems, and hence making it more likely for new systems to disrupt them. >>Yeah. We can talk about that a lot. It's very easy for people to actually start a big data >>Project. >>Yes. For >>Example. Yes. And the hardest part is, okay, what problem do I really need to solve? How am I going to monetize it? Right? Those are the hard parts. It's not the underlying >>Technology. Yes, yes, that's true. That's true.
I mean, >>You're saying, eh, you're saying >>Because I'm seeing both so much. I'm seeing both. I'm seeing cases where you're right: there are some companies that go, oh, this Hadoop thing is so cool, what problem can I solve with it? And I see other companies that have this huge problem and don't know that Hadoop exists. And once they know, they just jump on it right away. It's like when you have a headache and you're searching for the medicine and you find the aspirin. Wow, it >>Works. I was talking to Jeff Hammerbacher before he came on stage, and I didn't even get to it because we were on such a nice riff there, a bunch of musicians playing guitar together. But we talked about the IT dynamics, and he said something that I thought was right on the money, and SAP is saying the same thing: they're going to the lines of business, because IT is the gatekeeper. It's like selling minicomputers to a mainframe team, or selling client-server to a minicomputer team. Yeah. >>We're seeing both as well, but more likely the former, meaning that, yes, lines of business and departments adopt the technology, and then it comes in, and they see there are already these five different departments having it, and they think, okay, now we need to formalize this across the organization. >>So what happens then? What are you seeing out there? When that happens, does it mean people get their hands on it: hey, we've got a problem to solve, Hadoop exists, go get Hadoop, they plop it in there, and what does it do? >>So they pop it into their own installation, or on the cloud, and they show that it actually is working and solving the problem for them. And when that happens, it's a very easy adoption from there on, because they just go tell IT, we need this right now, because it's solving this problem and it's gonna make us much >>More money. Move it right in. Yes. No problems. >>Is that another reason why the cycle's compressed? I mean, you think client-server, there was a lot of resistance from IT, and now it's much the same thing with mobile. Mobile has flipped, right? So, okay, bring it in, we've gotta deal with it. I would think the same thing: we have a data problem, let's turn it into an >>Opportunity. Yeah. And it goes back to what I said earlier, the right solution for the right problem at the right time. When you have larger amounts of unstructured data, there isn't anything else out there that can even touch what Hadoop can >>Do. So Amr, I need to just change gears here a minute: the gaming stuff. We're featured on justin.tv right now on the front page. Oh wow. But the numbers aren't coming in, because there's a competing stream of a recently released Modern Warfare 3 feature. Yes. Yes. So >>I was looking for, we >>Have to compete with Modern Warfare 3. So can we talk about Modern Warfare 3 for a minute and share with the folks what you think of the current version, if you've played it? Yeah. So >>Unfortunately I'm waiting to get back home; I don't have my Xbox with me here. >>A little like, I'm talking about >>My lines of business. >>Boom. Modern Warfare's like a Christmas >>Tree here. Sorry. You know, I love it, I'm a big gamer. I'm a big video gamer at Cloudera.
We have, every Thursday at five thirty in the office, we play Call of Duty 4, which is Modern Warfare 1, actually. And I challenge people out there to come challenge our team. Just ping me on Twitter and we'll do a Cloudera versus >>Let's reframe that. Team up out there. This is Amr Awadallah's company; these are the geeks that invent the future. Jeff Hammerbacher, at Facebook, now at Cloudera, Hammerbacher leading the charge. These guys are gamers. So all the young gamers out there, Amr is saying they're gonna challenge you. At which version? >>Modern Warfare 1. >>Modern Warfare 1. Yes. How do they find you? Can you set up an >>External, we'll >>We'll figure it out. We'll figure it out. Okay. >>Yeah. Just ping me on Twitter and we'll >>We can carry it live, actually. We can stream that. Yeah. >>That'd be great. >>Great. >>Yeah. So I'll tell you, some of our best Hadoop committers and Hadoop developers pitch in. >>Picture that: Modern Warfare >>Three. Going now to Modern Warfare 3: I'm very excited about the game. I saw the trailers for it; the graphics look just amazing. Graphics are amazing. I've loved the series since the first one that came out, and I'm looking forward to getting back home and playing the game. >>I can't play; my son won't let me play. I'm such a fumbler with the gamepad; I'm a keyboard guy, I can't work the Xbox controller. I have a coordination problem at my age and I'm just a klutz, and it's like, Dad, sorry, charity's over, can I play with my friends on the Xbox? But I'm a big gamer. >>But in terms of, something I wanted to bring up is how to link up gaming with big data and analysis and so on. I'm a big gamer, I love playing games, but at the same time, whenever I play games I feel a little bit guilty, because it's kind of wasted time. I mean, yeah, it's fun, I'm getting lots of enjoyment, it makes my life much more cheerful. But still, how can we harness all of these hours that gamers spend playing a game like Modern Warfare 3? How can we collect and instrument all of the data that's coming from that and come up with, for example, something useful, something predictive? >>This is exactly the kind of application that's mainstream: gaming. Yeah. Danny at Riot Games was telling me, we saw him at Oracle OpenWorld, he was up there for JavaOne, he said that they don't really have a big data platform, and their business is about understanding user behavior: tons of data about playing time, who they're playing with, how they want to get into currency trading, you know. >>I can't mention the names, but some of the biggest gaming companies out there are using Hadoop right now, and depending on CDH, for doing exactly that kind of thing, creating >>A good user experience. >>Today they're doing it for the purpose of enhancing the user experience and improving retention. So they do track everything: every single bullet you fire, every hit you get in a baseball game, every home run you do, and, in a shooter type of game, every consecutive headshot you get. >>Everything. Everything is being tracked, yeah, every headshot you get, and so on. But, as you said, they are using that information today to sell more products and retain their users. Now what I'm suggesting is, how can you harness that energy for the good as well?
I mean, making money is good and everything, but how can you harness that for doing something useful, so that all of this entertainment time is also actually productive time? I think that would be a holy grail in this environment, if we >>Can achieve that. Yeah. It used to be that other things telegraphed the future of applications, but gaming really is it now. If you look at gaming, you get the headset on, it's a collaborative environment. Oh yeah. You've got unified communications. >>Yeah. And you see our teenage kids, how many hours they spend on these things. >>You've got play environments, very social, collaborative. Yeah. What I'm saying is that that's the future work environment, with Skype evolving; our multiplayer game is called our job, right? Yeah. So I'm big on gaming. So all the gamers out there, Amr has challenged you. Yeah. Got a big data example. What else are we seeing? So let's talk about the software. One of the things you were talking about that I really liked is that you were going down the list; on Mike's slide he had all the new features. So around the core, can you just go down the core and rattle off your version of what each piece means and what it is? You start off with, say, HBase, which we talked about already. What are the other ones that are out there? >>So the projects that we have right there >>The projects that are around, those tools that are being built. Cause >>Yeah, so the foundational ones, as we mentioned before, are HDFS for storage and MapReduce for processing. And then the immediate layer above that is about making MapReduce easier for the masses. Not everybody knows how to write MapReduce in Java, but everybody knows SQL, right? So one of the most successful projects right now, the one with the highest attach rate, meaning people usually install it when they install Hadoop, is Hive. Hive takes SQL. Jeff Hammerbacher, my co-founder, built the Hive system with his team when he was at Facebook. Essentially Hive takes SQL, so you don't have to learn a new language, you already know SQL, and converts it into MapReduce for you. That not only expands the developer base, how many people can use Hadoop, but also makes it easier to integrate Hadoop, through ODBC and JDBC, with BI tools like MicroStrategy and Tableau and Informatica, et cetera, et cetera. >>You mentioned R too, the R programming language. >>As well. Yeah, R is one of our best partnerships; we're very, very happy with them. So that's one of the very key projects. A sister project to Hive is called Pig. Pig Latin is a language that Yahoo invented. You do have to learn the language, but it's very easy to learn compared to MapReduce, and once you learn it you can specify very deep data pipelines. SQL is good for queries; it's not good for data pipelines, because it becomes very convoluted and very hard for the human brain to understand. Pig is much more natural to humans; it's more like Perl, very similar to scripting languages. So with Pig you can write very, very long data pipelines. Again, a very successful project, doing very well. Another key project is HBase, like you said. HBase allows you to do low latency, so you can do very quick lookups, and it also allows you to do transactions.
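To make that foundational layer a bit more concrete, here is a minimal, hypothetical sketch of the kind of hand-written Java MapReduce job just described, the sort of boilerplate that Hive spares you from writing by compiling a SQL query into equivalent jobs. The telemetry log format, field positions, and paths are assumptions made up for illustration, not anything shown at Hadoop World.

```java
// Count events per player from tab-separated game-telemetry logs stored in HDFS.
// Assumed input line format (hypothetical): timestamp \t playerId \t eventType
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventsPerPlayer {

  // Map: one input line -> (playerId, 1)
  public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final Text playerId = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split("\t");
      if (fields.length >= 2) {
        playerId.set(fields[1]);
        context.write(playerId, ONE);
      }
    }
  }

  // Reduce: (playerId, [1, 1, ...]) -> (playerId, total)
  public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      context.write(key, new LongWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "events-per-player");
    job.setJarByClass(EventsPerPlayer.class);
    job.setMapperClass(EventMapper.class);
    job.setCombinerClass(SumReducer.class);   // same logic works as a combiner
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

A Hive user would get roughly the same result with something like SELECT player_id, COUNT(*) FROM events GROUP BY player_id (assuming a hypothetical events table), which Hive would compile into a job much like the one above and which is also reachable from BI tools over ODBC or JDBC.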
You can do updates and inserts and deletes. One of the talks here at Hadoop World that we recommend people watch when the videos come out is the talk by Jonathan Gray from Facebook, where he talked about how they use HBase. >>Jonathan is coming on here in theCUBE later. Yeah. So >>Drill him on that. So they use HBase now for many, many things within Facebook. They have a big team committed to building and improving HBase with us and with the community at large, and they're using it for their online messaging system; the live messaging system in Facebook is powered by HBase right now. Also eBay, with the Cassini project, they gave a keynote earlier today at the conference as well, is using HBase. So HBase is definitely one of the projects that's growing very quickly right now within the Hadoop ecosystem. Another key project, which Jeff alluded to earlier when he was on here, is Flume. Flume is very instrumental, because you have this nice system, Hadoop, but Hadoop is useless unless you have data inside it. So how do you get the data inside Hadoop?
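The HBase usage just described, fast point lookups plus inserts and updates addressed by row key, looks roughly like the following with the standard HBase Java client. The table name, column family, and row key here are hypothetical, purely for illustration; the client reads its cluster settings from hbase-site.xml on the classpath.

```java
// Write one cell and read it back by row key, the millisecond-scale access
// pattern that batch MapReduce over HDFS files cannot give you.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MessageStoreExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("messages"))) {

      // Insert or update a single cell: row key = user, column = m:last_message
      Put put = new Put(Bytes.toBytes("user42"));
      put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("last_message"), Bytes.toBytes("hello"));
      table.put(put);

      // Point lookup by row key
      Get get = new Get(Bytes.toBytes("user42"));
      Result result = table.get(get);
      byte[] value = result.getValue(Bytes.toBytes("m"), Bytes.toBytes("last_message"));
      System.out.println(Bytes.toString(value));
    }
  }
}
```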
>>They have, they have machine learning in it too or Yes, yes. So that's the machine learning. >>So, So yes. Stay vector to machines and so on. >>What Scoop? >>So Scoop, you know, all of them. Thanks for feeding me all the names. >>The ones I don't understand, >>But there's so many of them, right? I can't even remember all of them. So Scoop actually is a very interesting project, is short for SQL to Hadoop, hence the name Scoop, right? So SQ from SQL and Oops from Hadoop and also means Scoop as in scooping up stuff when you scoop up ice cream. Yeah. And the idea for Scoop is to make it easy to move data between relational systems like Oracle metadata and it is a vertical and so on and Hadoop. So you can very simply say, Scoop the name of the table inside the relation system, the name of the file inside Hadoop. And the, the table will be copied over to the file and Vice and Versa can say Scoop the name of the file in Hadoop, the name of the table over there, it'll move the table over there. So it's a connectivity tool between the relational world and the Hadoop world. >>Great, great tutorial. >>And all of these are Apache projects. They're all projects built. >>It's not part of your, your unique proprietary. >>Yes. But >>These are things that you've been contributing >>To, We're contributing to the whole ecosystem. Yes. >>And you understand very well. Yes. And >>And contribute to your knowledge of the marketplace >>And Absolutely. We collaborate with the, with the community on creating these projects. We employ committers and founders for many of these projects. Like Duck Cutting, the founder of He works in Cloudera, the founder for that UIE project. He works at Calera for zookeeper works at Calera. So we have a number of them on stuff >>Work. So we had Aroon from Horton Works. Yes. And and it was really good because I tell you, I walk away from that conversation and I gotta say for the folks out there, there really isn't a war going on in Apache. There isn't. And >>Apache, there isn't. I mean isn't but would be honest. Like, and in the developer community, we are friends, we're working together. We want to achieve the, there's >>No war. It's all Kumbaya. Everyone understands the rising tide floats, all boats are all playing nice in the same box. Yes. It's just a competitive landscape in Horton. Works >>In the business, >>Business business, competitive business, PR and >>Pr. We're trying to be friendly, as friendly as we can. >>Yeah, no, I mean they're, they're, they're hying it up. But he was like, he was cool. Like, Hey, you know, we know each other. Yes. We all know each other and we're just gonna offer free Yes. And charge with support. And so are they. And that's okay. And they got other things going on. Yes. But he brought up the question. He said they're, they're launching a management console. So I said, Tyler's got a significant lead. He kind of didn't really answer the question. So the question is, that's your core bread and butter, That's your yes >>And no. Yes and no. I mean if you look at, if you look at Cloudera Enterprise, and I mentioned this earlier and when we talked in the morning, it has two main things in it. Cloudera Enterprise has the management suite, but it also has the, the the the support and maintenance that we provide to our customers and all the experience that we have in our team part That subscription. Yes. For a description. And I, I wanna stress the point that the fact that I built a sports car doesn't mean that I'm good at running that sports car. 
The driver of the car is usually much better at driving the car than the guy who built the car, right? So yes, we have many people on staff who help build Hadoop, but we have many more people on staff who help run Hadoop at large scale, in the financial industry, retail industry, telecom industry, media industry, health industry, et cetera, et cetera. That's very, very important for our customers: all the experience that we bring on how to run the system, technically, within these verticals. >>But their strategy is clear: we're gonna create an open source project within Apache for a management console, yes, and we sell support too, yes. So there'll be a free alternative to your management suite. >>So we'll have to see. But I mean, look at the products, our products >>It's gotta come down to product differentiation. >>Our product has been in the market for two years; they just started building their product. It's >>Alpha. It's just alpha. The >>Product is alpha, in alpha, right now. Yeah. Okay. >>Well, the Apache product, it is >>Apache, right? Yeah. The Apache project is out, so we'll see how it compares to ours, but I think ours is way, way ahead of anything else out there. Yeah. Essentially people can try that for themselves and >>See. Essentially, John, when I asked Arun why the world needs Hortonworks, you know, eventually the answer we got was, well, it's free, it needs to be more open, Hadoop needs to be more open. >>No, there's >>It's going to be, that's not really the reason why Horton >>Works. >>No, they want to go make money. >>Exactly. I wasn't >>Gonna say it, then you >>When I kept pushing and pushing, that's ultimately the closest we could get, cuz you >>Just listen. Not gonna >>Twelve open source projects. Yes. >>I >>Mean, yeah, yeah. You can't get much more open. Yeah. Look >>At the management >>Console. But they're not contributing to all of those. I mean, not only are we, no, no >>No, no, we absolutely >>Are. >>No, you are contributing, but those aren't all your projects. There's other people >>Involved. Yeah, we didn't start all of these projects. Yeah, that's >>True. But you're contributing heavily to all of them. >>Yes, we >>Are. >>And that's clear. Todd Lipcon said that, you know, he contributed his first patch back in 2008. Yes. So I mean, you go back through the ranks >>Of your people, and Todd now is a committer on HBase and a committer on Hadoop itself. So on a number >>Of them you're clearly the lead, and, you know, but >>There is a concern, we've heard it, and I wanna just ask you. So there's a concern that if I build processes around a proprietary management console, I'm gonna end up being locked into that proprietary management console all over again. Now, this is so far from >>Right. >>But that's a concern that some people have expressed, and I think one of the reasons why Hortonworks is getting so much attention. So, yes >>Talk about that. >>It's a very good observation to make, actually. There are two separate things here: there's the platform, where all the data sits, and then there's this management parcel beside the platform. Now, why did we make the management console, why did Cloudera make the management console? Because it makes our job of supporting the customers much more achievable. When a customer calls in and says, we have a problem, help us fix this problem.
When they go to our management console, there is a button they click that gives us a dump of the state of the cluster, and that's what allows us to very quickly debug what's going on and, within minutes, tell them, you need to do this and you need to do that. Without that, we just can't offer the support services. >>There's real value there. >>Yes. So now, but you have to keep in mind that the underlying platform is completely open source and free. CDH is completely, one hundred percent, open source, one hundred percent free, one hundred percent Apache. So a year from now, when it comes time to renew with us, if the customer is not happy with our management suite, is not happy with our support, they can go to Hortonworks >>And Hortonworks. People are afraid >>Of that, or they can go to IBM. >>The data, you can take the data >>You don't even need to take the data. You're not gonna move the data. It's the same system, the same software. Everything in CDH is Apache; we're not putting anything in CDH which is not Apache. So a year from now, if you're not happy with our service to you and the value that we're providing, you can switch. There is no lock-in. There is no lock-in. >>Your argument would be the switching costs, too. >>The only lock-in is happiness. The only lock-in is >>Happiness and customer delight. Which, by the way, we just wrote a piece about those wars, and we said the risk of lock-in is low. We made that statement; we got some heat for it. Yes. >>And >>This is sort of the at-scale argument, though. What the people throwing the tomatoes are saying is that, again in theory, at scale, the customers are so comfortable with that console that they don't switch. Now my argument was >>Yes, but that means they're happy with it. That means they're satisfied and happy >>With it. >>And it's more economical for them than going and hiring people full-time on staff. Yeah. >>So you're always kept in check, as long as the customer doesn't feel like it's Oracle. >>Yeah. See, that's different. Oracle >>Is different, right? Yeah. Here it's like Cisco routers: they get nested into the environment and provide value. That's just good competitive product strategy. Yes. If they're happy. Yeah. It's >>Called open washing, with >>Oracle. >>I mean, our number one core attribute as a company, the number one value for us, is customer satisfaction, keeping our customers happy with the service that we provide. >>So differentiate in the product. Yes. Keep the commanding lead. That's the strategy. That's what's happening. That's your goal. Yes. >>That's what's happening. >>Absolutely. Okay. Co-founder of Cloudera, always a pleasure to have you on theCUBE. We really appreciate all the hospitality over the year and a half, and I want to personally thank you for letting us sit in your office, and we'll miss you. >>And we'll miss you too. We'll >>See you at the Cube events. Swing by. Thanks for coming on theCUBE, and great to see you, and congratulations on all your success. >>Thank >>You. And thanks for the review of Modern Warfare 3. Yeah, yeah. >>Ping me again if there's any gaming stuff, you know.