Jacqueline Kuo, Dataiku | WiDS 2023
(upbeat music) >> Morning guys and girls, welcome back to theCUBE's live coverage of Women in Data Science WIDS 2023 live at Stanford University. Lisa Martin here with my co-host for this segment, Tracy Zhang. We're really excited to be talking with a great female rockstar. You're going to learn a lot from her next, Jacqueline Kuo, solutions engineer at Dataiku. Welcome, Jacqueline. Great to have you. >> Thank you so much. >> Thanks for being here. >> I'm so excited to be here. >> So one of the things I have to start out with, 'cause my mom Kathy Dahlia is watching, she's a New Yorker. You are a born and raised New Yorker and I learned from my mom and others. If you're born in New York no matter how long you've moved away, you are a New Yorker. You guys have like a secret club. (group laughs) >> I am definitely very proud of being born and raised in New York. My family immigrated to New York, New Jersey from Taiwan. So very proud Taiwanese American as well. But I absolutely love New York and I can't imagine living anywhere else. >> Yeah, yeah. >> I love it. >> So you studied, I was doing some research on you, you studied mechanical engineering at MIT. >> Yes. >> That's huge. And you discovered your passion for all things data-related. You worked at IBM as an analytics consultant. Talk to us a little bit about your career path. Were you always interested in engineering STEM-related subjects from the time you were a child? >> I feel like my interests were ranging in many different things and I ended up landing in engineering, 'cause I felt like I wanted to gain a toolkit like a toolset to make some sort of change with or use my career to make some sort of change in this world. 
And I landed on engineering and mechanical engineering specifically, because I felt like I got to, in my undergrad do a lot of hands-on projects, learn every part of the engineering and design process to build products which is super-transferable and transferable skills sort of is like the trend in my career so far. Where after undergrad I wanted to move back to New York and mechanical engineering jobs are kind of few and far between in the city. And I ended up landing at IBM doing analytics consulting, because I wanted to understand how to use data. I knew that data was really powerful and I knew that working with it could allow me to tell better stories to influence people across different industries. And that's also how I kind of landed at Dataiku to my current role, because it really does allow me to work across different industries and work on different problems that are just interesting. >> Yeah, I like the way that, how you mentioned building a toolkit when doing your studies at school. Do you think a lot of skills are still very relevant to your job at Dataiku right now? >> I think that at the core of it is just problem solving and asking questions and continuing to be curious or trying to challenge what is currently given to you. And I think in an engineering degree you get a lot of that. >> Yeah, I'm sure. >> But I think that we've actually seen that a lot in the panels today already, that you get that through all different types of work and research and that kind of thoughtfulness comes across in all different industries too. >> Talk a little bit about some of the challenges, that data science is solving, because every company these days, whether it's an enterprise in manufacturing or a small business in retail, everybody has to be data-driven, because the end user, the end customer, whoever that is whether it's a person, an individual, a company, a B2B, expects to have a personalized custom experience and that comes from data. 
But you have to be able to understand that data, treat it properly, responsibly. Talk about some of the interesting projects that you're doing at Dataiku or maybe some that you've done in the past that are really kind of transformative across things like climate change or police violence, some of the things that data science really is impacting these days. >> Yeah, absolutely. I think that what I love about coming to these conferences is that you hear about those really impactful social impact projects that I think everybody who's in data science wants to be working on. And I think at Dataiku what's great is that we do have this program called Ikig.AI where we work with nonprofits and we support them in their data and analytics projects. And so, a project I worked on was with the Clean Water, oh my goodness, the Ocean Cleanup project, Ocean Cleanup organization, which was amazing, because it was sort of outside of my day-to-day and it allowed me to work with them and help them understand better where plastic is being aggregated across the world and where it appears, whether that's on beaches or in lakes and rivers. So using data to help them better understand that. I feel like from a day-to-day though, we, in terms of our customers, they're really looking at very basic problems with data. And I say basic, not to diminish it, but really just to kind of say that it's high impact, but basic problems around how do they forecast sales better? That's a really kind of, sort of basic problem, but it's actually super-complex and really impactful for people, for companies when it comes to forecasting how much headcount they need to have in the next year or how much inventory to have if they're retail. And all of those are going to, especially for smaller companies, make a huge impact on whether they make profit or not. 
And so, what's great about working at Dataiku is you get to work on these high-impact projects and oftentimes I think from my perspective, I work as a solutions engineer on the commercial team. So it's just, we work generally with smaller customers and sometimes talking to them, me talking to them is like their first introduction to what data science is and what they can do with that data. And sort of using our platform to show them what the possibilities are and help them build a strategy around how they can implement data in their day-to-day. >> What's the difference? You were a data scientist by title and function, now you're a solutions engineer. Talk about the ascendancy into that and also some of the things that you and Tracy will talk about as those transferable, those transportable skills that probably maybe you learned in engineering, you brought data science now you're bringing to solutions engineering. >> Yeah, absolutely. So data science, I love working with data. I love getting in the weeds of things and I love, oftentimes that means debugging things or looking line by line at your code and trying to make it better. I found that in the data science role, while those things I really loved, sometimes it also meant that I didn't, couldn't see or didn't have visibility into the broader picture of well like, well why are we doing this project? And who is it impacting? And because oftentimes your day-to-day is very much in the weeds. And so, I moved into sales or solutions engineering at Dataiku to get that perspective, because what a sales engineer does is support the sale from a technical perspective. And so, you really truly understand well, what is the customer looking for and what is going to influence them to make a purchase? And how do you tell the story of the impact of data? Because oftentimes they need to quantify well, if I purchase a software like Dataiku then I'm able to build this project and make this X impact on the business. 
And that is really powerful. That's where the storytelling comes in and that I feel like a lot of what we've been hearing today about connecting data with people who can actually do something with that data. That's really the bridge that we as sales engineers are trying to connect in that sales process. >> It's all about connectivity, isn't it? >> Yeah, definitely. We were talking about this earlier that it's about making impact and it's about people who we are analyzing data is like influencing. And I saw that one of the keywords or one of the biggest thing at Dataiku is everyday AI, so I wanted to just ask, could you please talk more about how does that weave into the problem solving and then day-to-day making an impact process? >> Yes, so I started working on Dataiku around three years ago and I fell in love with the product itself. The product that we have is we allow for people with different backgrounds. If you're coming from a data analyst background, data science, data engineering, maybe you are more of like a business subject matter expert, to all work in one unified central platform, one user interface. And why that's powerful is that when you're working with data, it's not just that data scientist working on their own and their own computer coding. We've heard today that it's all about connecting the data scientists with those business people, with maybe the data engineers and IT people who are actually going to put that model into production or other folks. And so, they all use different languages. Data scientists might use Python and R, your business people are using PowerPoint and Excel, everyone's using different tools. How do we bring them all in one place so that you can have conversations faster? So the business people can understand exactly what you're building with the data and can get their hands on that data and that model prediction faster. So that's what Dataiku does. That's the product that we have. 
And I completely forgot your question, 'cause I got so invested in talking about this. Oh, everyday AI. Yeah, so the goal of Dataiku is really to allow for those maybe less technical people with less traditional data science backgrounds. Maybe they're data experts and they understand the data really well and they've been working in SQL for all their career. Maybe they're just subject matter experts and want to get more into working with data. We allow those people to do that through our no and low-code tools within our platform. Platform is very visual as well. And so, I've seen a lot of people learn data science, learn machine learning by working in the tool itself. And that's sort of, that's where everyday AI comes in, 'cause we truly believe that there are a lot of, there's a lot of unutilized expertise out there that we can bring in. And if we did give them access to data, imagine what we could do in the kind of work that they can do and become empowered basically with that. >> Yeah, we're just scratching the surface. I find data science so fascinating, especially when you talk about some of the real world applications, police violence, health inequities, climate change. Here we are in California and I don't know if you know, we're experiencing an atmospheric river again tomorrow. Californians and the rain- >> Storm is coming. >> We are not good... And I'm a native Californian, but we all know about climate change. People probably don't associate all of the data that is helping us understand it, make decisions based on what's coming, what's happened in the past. I just find that so fascinating. But I really think we're truly at the beginning of really understanding the impact that being data-driven can actually mean whether you are investigating climate change or police violence or health inequities or you're a grocery store that needs to become data-driven, because your consumer is expecting a personalized relevant experience. 
I want you to offer me up things that I know I was doing online grocery shopping, yesterday, I just got back from Europe and I was so thankful that my grocer is data-driven, because they made the process so easy for me. And but we have that expectation as consumers that it's going to be that easy, it's going to be that personalized. And what a lot of folks don't understand is the data, the democratization of data, the AI that's helping make that a possibility that makes our lives easier. >> Yeah, I love that point around data is everywhere and the more we have, the more access we actually are providing. 'cause now compute is cheaper, data is literally everywhere, you can get access to it very easily. And so, I feel like more people are just getting themselves involved and that's, I mean this whole conference around just bringing more women into this industry and more people with different backgrounds from minority groups so that we get their thoughts, their opinions into the work is so important and it's becoming a lot easier with all of the technology and tools just being open source being easier to access, being cheaper. And that I feel really hopeful about in this field. >> That's good. Hope is good, isn't it? >> Yes, that's all we need. But yeah, I'm glad to see that we're working towards that direction. I'm excited to see what lies in the future. >> We've been talking about numbers of women, percentages of women in technical roles for years and we've seen it hover around 25%. I was looking at some AnitaB.org stats from 2022, was just looking at this yesterday and the numbers are going up. I think the number was 26, 27.6% of women in technical roles. So we're seeing a growth there especially over pre-pandemic levels. Definitely the biggest challenge that still seems to be one of the biggest that remains is attrition. 
I would love to get your advice on what would you tell your younger self or the previous prior generation in terms of having the confidence and the courage to pursue engineering, pursue data science, pursue a technical role, and also stay in that role so you can be one of those females on stage that we saw today? >> Yeah, that's the goal right there one day. I think it's really about finding other people to lift and mentor and support you. And I talked to a bunch of people today who just found this conference through Googling it, and the fact that organizations like this exist really do help, because those are the people who are going to understand the struggles you're going through as a woman in this industry, which can get tough, but it gets easier when you have a community to share that with and to support you. And I do want to definitely give a plug to the WIDS@Dataiku team. >> Talk to us about that. >> Yeah, I was so fortunate to be a WIDS ambassador last year and again this year with Dataiku and I was here last year as well with Dataiku, but we have grown the WIDS effort so much over the last few years. So the first year we had two events in New York and also in London. Our Dataiku's global. So this year we additionally have one in the west coast out here in SF and another one in Singapore which is incredible to involve that team. But what I love is that everyone is really passionate about just getting more women involved in this industry. But then also what I find fortunate too at Dataiku is that we have a strong female, just a lot of women. >> Good. >> Yeah. >> A lot of women working as data scientists, solutions engineer and sales and all across the company who even if they aren't doing data work in a day-to-day, they are super-involved and excited to get more women in the technical field. And so. 
that's like our Empower group internally that hosts events and I feel like it's a really nice safe space for all of us to speak about challenges that we encounter and feel like we're not alone in that we have a support system to make it better. So I think from an attrition standpoint every organization should have a female ERG to just support one another. >> Absolutely. There's so much value in a network in the community. I was talking to somebody who I'm blanking on this may have been in Barcelona last week, talking about a stat that showed that a really high percentage, 78% of people couldn't identify a female role model in technology. Of course, Sheryl Sandberg's been one of our role models and I thought a lot of people know Sheryl who's leaving or has left. And then a whole, YouTube influencers that have no idea that the CEO of YouTube for years has been a woman, who has- >> And she came last year to speak at WIDS. >> Did she? >> Yeah. >> Oh, I missed that. It must have been, we were probably filming. But we need more, we need to be, and it sounds like Dataiku is doing a great job of this. Tracy, we've talked about this earlier today. We need to see what we can be. And it sounds like Dataiku is pioneering that with that ERG program that you talked about. And I completely agree with you. That should be a standard program everywhere and women should feel empowered to raise their hand, ask a question, or really embrace, "I'm interested in engineering, I'm interested in data science." Then maybe there's not a lot of women in classes. That's okay. Be the pioneer, be that next Sheryl Sandberg or the CTO of ChatGPT, Mira Murati, who's a female. We need more people that we can see and lean into that and embrace it. I think you're going to be one of them. >> I think so too. Just so that young girls like me, like others who are still in school, can see, can look up to you and be like, "She's my role model and I want to be like her. 
And I know that there's someone to listen to me and to support me if I have any questions in this field." So yeah. >> Yeah, I mean that's how I feel about literally everyone that I'm surrounded by here. I find that you find role models and people to look up to in every conversation whenever I'm speaking with another woman in tech, because there's a journey that has had to happen for you to get to that place. So it's incredible, this community. >> It is incredible. WIDS is a movement we're so proud of at theCUBE to have been a part of it since the very beginning, since 2015, I've been covering it since 2017. It's always one of my favorite events. It's so inspiring and it just goes to show the power that data can have, the influence, but also just that we're at the beginning of uncovering so much. Jacqueline, it's been such a pleasure having you on theCUBE. Thank you. >> Thank you. >> For sharing your story, sharing with us what Dataiku is doing and keep going. More power to you girl. We're going to see you up on that stage one of these years. >> Thank you so much. Thank you guys. >> Our pleasure. >> Our pleasure. >> For our guest and Tracy Zhang, this is Lisa Martin, you're watching theCUBE live at WIDS '23. #EmbraceEquity is this year's International Women's Day theme. Stick around, our next guest joins us in just a minute. (upbeat music)
Jed Dougherty, Dataiku | AWS re:Invent 2022
(bright music) >> Welcome back to Vegas, guys and girls. We're pleased that you're watching theCUBE. We know you've been with us. This is our fourth day. We know you've been with us since day one. Why wouldn't you be? Lisa Martin, here. As I mentioned, day four of theCUBE's coverage of AWS re:Invent. There are north of 55,000 people that have been at this event this week. We're hearing hundreds of thousands online. It really feels like old times, which is awesome. We're pleased to welcome back a gentleman from Dataiku who's actually new to theCUBE but Dataiku is not. Jed Dougherty is here, the VP of Platform Strategy. Thanks for joining me today, Jed. >> Oh, I'm so happy to be here. >> Talk a little bit, for anybody that isn't familiar with Dataiku, tell the audience a little bit about the technology, what you guys do. >> Dataiku is an end-to-end data science machine learning platform. We take everything from data ingestion, pipelining of that data, bringing it all together, something that's useful for building models, deploying those models and then managing your ML ops workflow. So, really all the way across. And we sit on top of, basically, tons of different AWS stack as well as lots of the partners that are here today. >> Okay, got it. >> Snowflake, Databricks, all that. >> Got it, so one of the things that, it was funny, I think it was Adam's keynote Tuesday morning. I didn't time it, I watched it, but one of my guests said to me earlier this week that Adam spent exactly 52 minutes talking about data. >> Yeah. >> 52 minutes. Obviously, we can't come to an event like this without talking about data. Every company these days has to be a data company. Whether it's my grocery store or a retailer, a hospital, and so- >> Jed: It is the lifeblood of every modern company. >> It is, but you have to be able to access it. 
You have to be able to harness it, access it, derive insights from it, and be able to act on that faster than the competitors that are waiting, like, right back here. One of the things Adam Selipsky talked about with our boss, John Furrier, who's the co-CEO of theCUBE, they had a sit-down about a week before re:Invent. John always gets a preview of the show and Adam said, you know, he thinks the role of data analyst is going to go away. Or at least the term, because with data democratization that needs to happen. Putting data in the hands of all the business users, that every business user, whether you're in technology or marketing or ops or finance, it's going to have to analyze data to do their jobs. >> Could not agree more. >> Are you hearing that from customers? >> 100% >> Yeah. >> I was just at the CTO Summit of Bank of America two weeks ago out in California, and they told, their CTO had a statistic, 60,000 technologists in Bank of America, all asking data-type questions. You can have the best team of data scientists in the world, and they do. They have some of the best data scientists in the world there. And this team of data scientists could answer any one of the questions that those 60,000 people might have but they can't answer all of them, right? You need those people to be able to answer their own questions. I don't know if the term data analysts are going away. I think, yeah, everybody's just going to have to become a bit more of one. Just like how Excel taught everybody how to use the spreadsheet, in the future, in the next five, 10 years, the democratization of AI means that tools like Dataiku and other data science tools are going to teach everybody how to analyze data. >> Talk about Dataiku as a facilitator of that, of that democratization. Giving, like the citizen technologist who might be in finance, the ability to do that. >> So, a lot of data science tools are aimed at your hardcore coder, right? 
Somebody who wants to be sitting at a notebook writing (indistinct) or something like that and running models on some big fancy Spark server. Dataiku is still going to be running models on some big fancy Spark server but we're really obfuscating the challenge of writing code away from the user. So we target low code, no code, and high code users all working together in a collaborative platform. So we really do, we believe that there is always going to be a place for data scientists. That role is not going away. You will always need hardcore coders to take on those moonshot very challenging topics. But for every day AI, anybody should be able to do this and it should be open to anybody. >> Right. >> Jed: Really aim to facilitate that. >> I would love to hear some feedback, you know, this is day four of the show as I was saying, and day four is packed. I mean, this is energy-level-wise, guys, it is the same as it was when we started here on Friday night. But I'd love to hear, Jed, from your perspective some of the customer conversations that you've had, what are some of the challenges? They're coming to you saying, "Jed, Dataiku, help us eradicate these challenges so we can transform our business." >> What I'm hearing from customers and partners and AWS here is, over and over, we don't want to buy tools anymore. We want to buy solutions. We want a vertical solution that's pre-built for our industry. And we want it to be, not necessarily click and run out of the box, but we want a template that we can build off of quickly. And I've heard that customers are also looking to understand how tools can be packaged together. You got how many booths are here? 1000 booths? >> Yes, easily. >> You have 1000 different products being talked about, right behind us. 
Customers need to know which of these products are friends with each other and how they fit together so that they are making sure that when they purchase a set, a suite of tools to do their jobs, it's all going to work naturally together. So, being able, I think this is a really vital concept for GSIs as well. GSIs need to understand how to package sets of tools together to deliver a full solution to clients. People don't want to be, you know, I think 10 years ago, five years ago, AWS was in the business of selling servers in the cloud. But basically what you do is, you would buy an EC2 instance and you install whatever software you wanted on it. I don't know that they're in that business still but customers don't want to buy servers from AWS anymore. They want to buy solutions. >> Right. >> Rent, whatever. >> Yeah. (chuckles) >> That is the big repeated message that I've heard here. >> So you brought up a good point that there are probably 1000 booths here. You could be here every day and not get to see everything that's going on. Plus this show was going on across the strip. We're only getting a fraction of the people that are here. But with that said, to your point, there are so many tools out there. Customers are looking for solutions. One of the things that we say about theCUBE is, we extract the signal from the noise. How does Dataiku get past the noise? How do you get up the stack to really impact customers so they understand the value that you're delivering? >> I think that Data science and ML sound like a very complicated topic but our value prop is relatively simple. And we appeal both to your end users who are excited to learn about how data science works and how they can leverage these tools in their day-to-day jobs, as well as appealing to IT. IT, right now, at major organizations they want to be able to build a full stack that makes sense. And the big choices they're making right now are around infrastructure. Where am I going to run my compute? 
So, they're choosing between Snowflake or Databricks or a native AWS compute solution, right? And so they make this big choice around compute and then they realize, "Oh, how many of our users across our organization are actually able to leverage this big compute choice?" Oh, maybe 100, maybe 200. That's not incredibly useful for what we've just decided to completely stand behind. Dataiku, all of a sudden, opens that up to 1000s of users across your organization. So it makes IT feel empowered by being able to help more people. And it makes users feel empowered by being able to use a great tool and start answering their own questions. >> And where are your customer conversations these days? As we look at AI and ML, emerging technologies, so many customers and companies, knowing we have to go in this direction. We have to have AI to speed the business. Are you seeing more of the conversations are still in IT or are they actually going up the stack? >> (chuckles) It's a great question. When you're going into large organizations, there's two sales motions, right? There's convincing the business users that this is a great thing and then convincing IT that it's not going to be too painful. You always have to go to both places. IT doesn't want to take on a boondoggler, or there's an albatross, I don't remember the word, but, something that they're going to have to deal with for the next 10 years and then eventually dismantle and pull apart. I think a lot of IT got very scared about big data platforms and solutions because of Hadoop. To be honest, Hadoop was incredibly powerful but maybe not as mature of technology as IT would've liked it to be. From a maintenance and administration standpoint. So yes, you will always have to sell to IT and help IT feel comfortable with the platform. But no, the conversations that I want to have are the use case conversations with a Chief Data Officer, Chief Revenue Officer, Chief Marketing Officer. 
That's who I really want to convince that this is going to be a worthwhile opportunity. >> And what are some of the key, sorry. What are some of the key use cases that Dataiku is tackling in the market these days? >> So we work a lot. Two of the biggest organizations, or verticals, that I work with personally are finance and pharmaceuticals. In finance, we are closely embedded with wealth management organizations. So, a lot of that is around customer entertainment, churn, relatively obvious, simple concepts but ones where it's worth a lot of money. In pharma, we work both on the supply side. So, doing supply chain optimization, ensuring the right drugs get to the right places at the right time. As well as on the business and marketing side. So, ensuring that your ad spend is correctly distributed across different advertising platforms. >> So if you're working with a financial organization, I want to understand from a consumer, from the end user's perspective, although obviously this technology impacts the end user who's trying to do a transaction. What's in it for me? And I don't know as the end user that Dataiku is under the hood. >> You'd never know. >> Which is good. I shouldn't have to worry about the technology. >> Jed: You shouldn't have to worry about that at all. >> What's in it for the end user customer? What are they gaining from this? >> So, from a very end user perspective, if you think about when you logged onto maybe your Bank of America, your Chase app, five or 10 years ago, maybe you didn't even have it on your phone five years ago. Or when you logged into your account online. We do 95% of our banking online right now, right? I go into a physical location, what? I don't know, once every six months or something? Get a cashier's check? I don't know. 
The experience that you're getting and the amount of information you're getting back about your spending habits, where your money is going, what your credit score is, all of these things are being driven by these big data organizations inside the banks. Also, any type, this is a little creepier, but any type of promotional emails or the types of things that you get feedback on when you use your credit card and the offers that you get through that, are all being personalized to you through the information that these banks are collecting about your spending habits. >> Yeah, but we want that as a consumer, we want the personalized. >> Yeah, of course. We want it to be magic slash not creepy. (laughs) >> Right, I want them to recommend the best card for me. >> Right. >> The next best thing. >> It's good for me, it's good for them. >> Don't serve me up something that I've already bought. That always bugs me when I'm like, I already bought that. >> I get that all the time. I'm like, yeah, I have that card already. It's in my wallet. Why are you telling me? >> We only have a couple of minutes left Jed, but talk to me about from a platform strategy perspective, what's next for Dataiku and AWS? >> So we are making a matrix transition right now and it's core to our platform. For a long time, the way that we've installed Dataiku is, we help our customers install it on their AWS account so it runs inside their tenant. This is very comfortable for, for example, large banking clients, pharma clients that have personally identifiable information, all that kind of thing. They own everything. However, as we were talking about before, we're really moving from providing a tool to providing solutions. And part of that is obviously a move to SaaS. So two years ago we released a SaaS offering. We've been expanding it more and more to, this year, we want to be pushing SaaS first. So Dataiku online should be the first option when new customers move on. And that is a huge platform shift. 
It means making sure that we have the right security in place. It means making sure that we have the right scaling in place, that we have 24-7 support. All this has been a big challenge. A big fascinating challenge, actually, to put together. >> Awesome. Last question for you. Say you get a brand new DeLorean, I hear they're coming back, and you want to put, you really, really want to put a bumper sticker on it, 'cause why not? And it's about Dataiku and it's like a sizzle reel kind of thing. >> A sizzle reel, alright. >> Yeah. What does it say? >> Extraordinary people, everyday AI. >> Wow. Drop the mic, Jed. That was awesome. Thank you so much for coming on the program. We really appreciate the update on Dataiku. What you guys are doing for customers, your specialization and solutions for verticals. Awesome stuff, we'll have to have you back. >> Thank you so much. >> Alright, my pleasure. >> Bye-Bye. >> For my guest, I'm Lisa Martin. You're watching theCUBE, the leader in live enterprise and emerging tech coverage. (bright music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Adam | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Jed Dougherty | PERSON | 0.99+ |
Adam Selipsky | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
95% | QUANTITY | 0.99+ |
California | LOCATION | 0.99+ |
Jed | PERSON | 0.99+ |
1000 booths | QUANTITY | 0.99+ |
Friday night | DATE | 0.99+ |
John | PERSON | 0.99+ |
100% | QUANTITY | 0.99+ |
fourth day | QUANTITY | 0.99+ |
Two | QUANTITY | 0.99+ |
first option | QUANTITY | 0.99+ |
Tuesday morning | DATE | 0.99+ |
Excel | TITLE | 0.99+ |
60,000 people | QUANTITY | 0.99+ |
Bank of America | ORGANIZATION | 0.99+ |
Databricks | ORGANIZATION | 0.99+ |
two years ago | DATE | 0.99+ |
this year | DATE | 0.99+ |
100 | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
52 minutes | QUANTITY | 0.99+ |
60,000 technologists | QUANTITY | 0.99+ |
10 years ago | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
five | DATE | 0.99+ |
Dataiku | ORGANIZATION | 0.99+ |
52 minutes | QUANTITY | 0.98+ |
five years ago | DATE | 0.98+ |
200 | QUANTITY | 0.98+ |
two sales | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
earlier this week | DATE | 0.98+ |
Snowflake | ORGANIZATION | 0.98+ |
Vegas | LOCATION | 0.98+ |
1000 different products | QUANTITY | 0.97+ |
this week | DATE | 0.97+ |
both places | QUANTITY | 0.97+ |
Hadoop | TITLE | 0.97+ |
CTO Summit | EVENT | 0.97+ |
two weeks ago | DATE | 0.96+ |
hundreds of thousands | QUANTITY | 0.96+ |
theCUBE | ORGANIZATION | 0.95+ |
Bank of America | LOCATION | 0.94+ |
Bank of America | EVENT | 0.93+ |
Dataiku | TITLE | 0.92+ |
day one | QUANTITY | 0.91+ |
Spark | TITLE | 0.9+ |
day four | QUANTITY | 0.89+ |
first | QUANTITY | 0.88+ |
EC two | TITLE | 0.88+ |
Dataiku | PERSON | 0.86+ |
a week | DATE | 0.83+ |
Chase | TITLE | 0.83+ |
one of my guests | QUANTITY | 0.83+ |
CTO | ORGANIZATION | 0.81+ |
Ahmad Khan, Snowflake & Kurt Muehmel, Dataiku | Snowflake Summit 2022
>>Hey everyone. Welcome back to theCUBE's live coverage of Snowflake Summit 22, live from Las Vegas, Caesars Forum. Lisa Martin here with Dave Vellante. We've got a couple of guests here. We're gonna be talking about everyday AI. You wanna know what that means? You're in the right spot. Kurt Muehmel joins us, the chief customer officer at Dataiku, and Ahmad Khan, the head of AI and ML strategy at Snowflake. Guys, great to have you on the program. >>It's wonderful to be here. Thank you so much. >>So we wanna understand, Kurt, what everyday AI means, but before we do that, for the audience who might not be familiar with Dataiku, could you give them a little bit of an overview of what you guys do, your mission, and maybe a little bit about the partnership? >>Yeah, great. Very happy to do so. And thanks so much for this opportunity. Well, Dataiku, we are a collaborative platform for enterprise AI. And what that means is it's software that sits on top of incredible infrastructure, notably Snowflake, that allows people from different backgrounds, data analysts, data scientists, data engineers, all to come together, to work together, to build out machine learning models and ultimately the AI that's gonna be the future of their business. And so we're very excited to be here, and very proud to be a very close partner of Snowflake. >>So Ahmad, what is Snowflake's AI strategy? Is it to partner? Where do you pick up? And Frank said today, we're not doing it all. The ecosystem by design. >>Yeah, absolutely. So we believe in best of breed. We think that we're the best data platform, and for data science and machine learning, we want our customers to really use the best tool for their use cases. Right. And Dataiku is our leading partner in that space.
And so, when you talk about machine learning and data science, people talk about training a model, but really the difficult part and the challenges are before you train the model: how do you get access to the right data? And then after you train the model, how do you then run the model? And then how do you manage the model? That's very, very important. And that's where our partnership with Dataiku comes into place. Snowflake provides the platform that can process data at scale for the pre-processing bit, and Dataiku comes in and really simplifies the process for deploying the models and managing the model. >>Got it. Thank you. >>Kurt, Dataiku talks about everyday AI. I wanna break that down. What do you mean by that? And how is this partnership with Snowflake empowering you to deliver that to companies? >>Yeah, absolutely. So everyday AI for us is kind of a future state that we are building towards, where we believe that AI will become so pervasive in all of the business processes, all the decision making that organizations have to go through, that it's no longer this special thing that we talk about. It's just the day-to-day life of our businesses. And we can't do that without partners like Snowflake, because they're bringing together all of that data and ensuring that there is the computational horsepower behind that to drive it. We heard that this morning in some of the keynote, talking about that broad democratization and, let's call it, the pressure that that's going to put on the underlying infrastructure. And so ultimately everyday AI for us is where companies own that AI capability. They're building it themselves, with very broad participation in the development of that. And all that work then is being pushed down into best-of-breed infrastructure, notably, of course, Snowflake. >>Well,
Well, >>You said push down, you, you guys, you there's a term in the industry push down optimization. What does that mean? How is it evolving? Why is it so important? >>So Amma, do you want to take a first step at that? >>Yeah, absolutely. So, I mean, when, when you're, you know, processing data, so saying data, um, before you train a, uh, a model, you have to do it at scale, that that, that data is, is coming from all different sources. It's human generated machine generated data, we're talking millions and billions of rows of data. Uh, and you have to make sense of it. You have to transform that data into the right kind of features into the right kind of signals that inform the machine learning model that you're trying to, uh, train. Uh, and so that's where, you know, any kind of large scale data processing is automatically pushed down by data IQ, into snowflakes, scalable infrastructure. Um, so you don't get into like memory issues. You don't get into, um, uh, situations where you're where your pipeline is running overnight, and it doesn't finish in time. Right? And so, uh, you can really take advantage of the scalable nature of cloud computing, uh, using Snowflake's infrastructure. So a lot of that processing is actually getting pushed down from data I could down into the scalable snowflake compute engine. How >>Does this affect the life of a data scientist? You always hear a data scientist spend 80% of the time wrangling data. Uh, I presume there's an infrastructure component around that you trying, we heard this morning, you're making infrastructure, my words, infrastructure, self serve, uh, does this directly address that problem and, and talk about that. And what else are you doing to address that 80% problem? >>It, it certainly does, right? 
That's how you solve for data scientists needing to have on-demand access to computing resources, or of course to the underlying data: by ensuring that that work doesn't have to run on their laptop, doesn't have to run on some constrained physical machines in a data center somewhere. Instead it gets pushed down into Snowflake and can be executed at scale with incredible parallelization. Now what's really important is the ongoing development between the two products, and within that technology. And so today Snowflake announced the introduction of Python within Snowpark, which is really, really exciting, because that really opens up this capability to a much wider audience. Now Dataiku provides that both through a visual interface and, since last year, through Java UDFs, but those are kind of the two extremes, right? You have people who don't code on one side, a very no-code or low-code population, and then a very high-code population on the other side. This Python integration really allows us to touch the fat center of the data science population, for whom Python really is the lingua franca that they've been learning for decades now. >>Sure. So talking about the data scientist, I wanna elevate that a little bit, because you both have enterprise customers, Dataiku and Snowflake. Kurt, as the chief customer officer, obviously you're with customers all the time. If we look at the macro environment, with all the challenges, companies have to be a data company these days; if you're not, you're not gonna be successful. It's how do we do that? Extract insights, value, take action.
But I'm just curious if your customer conversations are elevating up to the C-suite or the board in terms of being able to democratize access to data, to be competitive, new products, new services. We've seen tremendous momentum on the part of customer growth on the Snowflake side. But what are you hearing from customers as they're dealing with some of these current macro pains? >>Yeah, I think the conversation today at that C level is not only how do we leverage new infrastructure, right? Most of them now are starting to have Snowflake. I think Frank said 50% of the Fortune 500, so we can say most have that in place. But now the question is, how do we ensure that we're getting access to that data, to that computational horsepower, to a broader group of people, so that it becomes truly a transformational initiative and not just an IT initiative, not just a technology initiative, but really a core business initiative? And that really has been a pivot. I've been with my company now for almost eight years, and we've really seen a change in that discussion, going from much more niche discussions at the team or departmental level to now a truly corporate strategic level. How do we build AI into our corporate strategy? How do we really do that in practice? >>And we hear a lot about, hey, I want to inject data into apps, AI and machine intelligence into applications. And we've talked about how those are separate stacks. You've got the data stack and analytics stack over here. You've got the application development stack, the database is off in the corner. And so we see you guys bringing those worlds together. And my question is, what does that stack look like? I took a snapshot. I think it was Frank's presentation today. He had infrastructure at the lowest level, live data.
So infrastructure's cloud. Live data, that's multiple data sources coming in. Workload execution, you made some announcements there to expand that. Application development, that's the tooling that is needed. And then marketplace, that's how you bring together this ecosystem. Monetization is how you turn data into data products and make money. Is that the stack? Is that the new stack that's emerging here? Are you guys defining that? >>Absolutely. Absolutely. You talked about the 80% of the time being spent by data scientists, and part of that is actually discovering the right data, right? Being able to give the right access to the right people and being able to go and discover that data. And so you go from that angle all the way to processing, training a model. And then all those predictions and insights that are coming out of the model are being consumed downstream by data applications. And so there are two major announcements I'm super excited about today. The first is the ability to run Python, which is Snowpark, in Snowflake. You can now, as a Python developer, come and bring the processing to where the data lives rather than move the data out to where the processing lives. Right. So both SQL developers and Python developers are fully enabled. And then the predictions that are coming out of models that are being trained by Dataiku are then being used downstream by these data applications for most of our customers. And so that's where the second announcement, with Streamlit, is super exciting. I can write a complete data application without writing a single line of JavaScript, CSS or HTML. I can write it completely in Python. It makes me super excited as a Python developer myself. >>And you guys have joint customers that are headed in this direction, doing this today. Can you talk about that? >>Yeah, we do.
There's a few that we're very proud of. Well-known companies like REI or emeritus. But one that was mentioned today, this morning by Frank again, Novartis, a pharmaceutical company, they have been extremely successful in accelerating their AI and ML development by expanding access to their data. And that's a combination of both the Dataiku layer, allowing for that work to be developed in that workspace, but of course, without the underlying platform of Snowflake, they would not have been able to realize those gains. And they were talking about very, very significant increases in efficiency, everything from data access to the actual model development to the deployment. It's just really, honestly, inspiring to see. >>And it was great to see Novartis mentioned on the main stage, massive time to value there. We've actually got them on the program later this week. So that was great. Another joint customer you mentioned, REI. We'll let you go, 'cause you're off to do a session with REI, is that right? >>Yes, that's exactly right. So we're going to be doing a fireside chat, talking about, in fact, much of the same, all of the success that they've had in accelerating their analytics workflow development, the actual development of AI capabilities within, of course, that beloved brand. >>Excellent. Guys, thank you so much for joining Dave and me, talking about everyday AI, what you're doing together, Dataiku and Snowflake, to empower organizations to actually achieve that and live it. We appreciate your insights. Thank you both. >>Thank you for having us. >>For our guests and Dave Vellante, I'm Lisa Martin. You're watching theCUBE's live coverage of Snowflake Summit 22 from Las Vegas. Stick around, our next guest joins us momentarily.
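The push-down idea described in this conversation (run the heavy data processing inside the database engine instead of pulling rows into the client's memory) can be illustrated with a minimal sketch. This is not Dataiku's or Snowflake's implementation: it assumes Python's built-in sqlite3 module as a stand-in for the warehouse, and the table name, column names, and both function names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5), (2, 2.5), (3, 1.0)],
)


def totals_client_side(conn):
    """Pull every row to the client, then aggregate in local memory."""
    totals = {}
    rows = conn.execute("SELECT customer_id, amount FROM events")
    for customer_id, amount in rows:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Send the aggregation to the engine; one row per customer comes back."""
    query = "SELECT customer_id, SUM(amount) FROM events GROUP BY customer_id"
    return dict(conn.execute(query).fetchall())


client_side = totals_client_side(conn)
pushed_down = totals_pushed_down(conn)
```

Both functions produce the same per-customer totals. The difference is where the work happens: the first transfers every row to the client, while the second transfers only the aggregated result, which is the property that matters once tables reach millions or billions of rows.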
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Frank | PERSON | 0.99+ |
Dave Valante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Novartis | ORGANIZATION | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
Kurt | PERSON | 0.99+ |
80% | QUANTITY | 0.99+ |
50% | QUANTITY | 0.99+ |
Ahmad Khan | PERSON | 0.99+ |
last year | DATE | 0.99+ |
Python | TITLE | 0.99+ |
millions | QUANTITY | 0.99+ |
two products | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
two extremes | QUANTITY | 0.99+ |
Kurt Muehmel | PERSON | 0.99+ |
both | QUANTITY | 0.99+ |
Snowflake Summit 2022 | EVENT | 0.98+ |
Amma | PERSON | 0.98+ |
Kurt UL | PERSON | 0.98+ |
second announcement | QUANTITY | 0.98+ |
JavaScript | TITLE | 0.98+ |
Caesar | PERSON | 0.98+ |
billions | QUANTITY | 0.97+ |
first step | QUANTITY | 0.97+ |
REI | ORGANIZATION | 0.97+ |
HTML | TITLE | 0.97+ |
two major announcements | QUANTITY | 0.97+ |
later this week | DATE | 0.97+ |
Snowflake | ORGANIZATION | 0.96+ |
Amad | PERSON | 0.94+ |
this morning | DATE | 0.94+ |
single line | QUANTITY | 0.94+ |
Aico | ORGANIZATION | 0.93+ |
SQL | TITLE | 0.93+ |
Snowflake | TITLE | 0.93+ |
one side | QUANTITY | 0.91+ |
fortune 500 | QUANTITY | 0.91+ |
Java UDFs | TITLE | 0.9+ |
almost eight years | QUANTITY | 0.9+ |
emeritus | ORGANIZATION | 0.89+ |
snowflake summit 22 | EVENT | 0.85+ |
IKU | ORGANIZATION | 0.85+ |
Cube | ORGANIZATION | 0.85+ |
Cube | PERSON | 0.82+ |
decades | QUANTITY | 0.78+ |
IKU | TITLE | 0.74+ |
streamlet | TITLE | 0.72+ |
snowflake | ORGANIZATION | 0.7+ |
Dataiku | PERSON | 0.65+ |
couple of | QUANTITY | 0.64+ |
DataCo | ORGANIZATION | 0.63+ |
CSS | TITLE | 0.59+ |
one | QUANTITY | 0.55+ |
data ICU | ORGANIZATION | 0.51+ |
rows | QUANTITY | 0.49+ |
Conn | ORGANIZATION | 0.35+ |
Democratizing AI and Advanced Analytics with Dataiku x Snowflake
>>My name is Dave Vellante, and with me are two world-class technologists, visionaries and entrepreneurs. Benoit Dageville co-founded Snowflake, and he's now the president of the product division. And Florian Douetteau is the co-founder and CEO of Dataiku. Gentlemen, welcome to theCUBE, two first-timers. Love it. >>Great to be here. >>Now, Florian, you and Benoit have a number of customers in common. And I have said many times on theCUBE that the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data, with the cloud as really the scale platform. So is that premise relevant to you? Do you buy that? And why do you think Snowflake and Dataiku make a good match for customers? >>I think it's because our values are aligned, when it's all about, actually, today, lowering complexity for customers. So closing the gap, or democratizing access to data, access to technology. It's not only about data. Data is important, but it's also about the impact of data. How can you make the best out of data as fast as possible, as easily as possible, within an organization? And another value is about just the openness of the platform, building the future together. I think a platform that is not just about the platform but also a full ecosystem of partners around it, bringing the level of accessibility and flexibility you need for the 10 years ahead. >>Yeah, so that's key. It's not just data, it's turning data into insights. Benoit, you came out of the world of very powerful but highly complex databases. And we all know that you and the Snowflake team get very high marks for really radically simplifying customers' lives. But can you talk specifically about the types of challenges that your customers are using Snowflake to solve? >>Yeah, so really the challenge before
Snowflake, I would say, was really to put all the data in one place and run all the compute, all the workloads that you wanted to run against that data. And of course, existing legacy platforms were not able to support that level of concurrency, that many workloads. We talk about machine learning, data science, data engineering, data warehouse, big data workloads, all running in one place; that didn't make sense at all. And therefore what customers did was create silos, silos of data everywhere, with different systems having a subset of the data. And of course, now you cannot analyze this data in one place. So at Snowflake, we really solved that problem by creating a single architecture where you can put all the data in the cloud. So it's really cloud native. We really thought about how to solve that problem, how to leverage the cloud and the elasticity of the cloud to really put all the data in one place, but at the same time not run all the workloads in the same place. So each workload that runs in Snowflake has its dedicated compute resources to run, and that makes it very agile, right? Florian talked about data scientists having to run analysis, so they need a lot of compute resources, but only for a few hours. With Snowflake, they can add this new workload to the system, get the compute resources that they need to run this workload, and when it's over, they can shut down their system. It will be automatically shut down. Therefore, they would not pay for the resources that they don't use. So it's a very agile system where you can do these analyses when you need, and you have all the power to run all these workloads at the same time. >>Well, it's profound what you guys built, to me. I mean, of course, everybody's trying to copy it now.
It was like, remember the notion of bringing compute to the data in the Hadoop days? And I think that, as I say, everybody is sort of following your suit now, or trying to. Florian, I gotta say the first data scientist I ever interviewed on theCUBE was the amazing Hilary Mason, right after she started at Bitly. And she made data science sound so compelling. But data science is hard. So same question for you. What do you see as the biggest challenges that customers are facing with data science? >>The biggest challenge, from my perspective, is that once you solve the issue of the data silo with Snowflake, you don't want to bring in another silo, which would be a silo of skills. Essentially, there is the talent gap, with the talent available in the market and how hard it is to actually find, recruit, and train data scientists for what needs to be done. And so you actually need to simplify the access to technologies such that every organization can make it, whatever the talent, by bridging that gap. And to get there, there is a need for actually breaking up the silos, with a collaborative approach where technologists and business work together and actually put their hands into those data projects together. >>It makes sense. Florian, let's stay with you for a minute, if I can. Your observation space is pretty global, and so you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends, maybe differences between regions or maybe within different industries? What are you seeing? >>Yes, definitely. I do see trends that are not geographic that much, but much more in terms of the maturity of certain industries and certain sectors, which is that certain industries invested a lot in terms of data, data access, the ability to store data in the last few years, and now reach a level of maturity where they can invest more and get to the next steps.
And it really relies on the ability of certain organizations to have built this long-term strategy a few years ago and now start reaping the benefits. >>A decade ago, Florian, Hal Varian famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of changed that to data scientists, and then all the statisticians became data scientists, and they got a raise. But data science requires more than just statistics acumen. What skills do you see as critical for the next generation of data science? >>Yeah, it's a good question, because I think the first generation of statisticians became data scientists because they could do some Python quickly and be flexible. And I think that the skills of the next generation of data scientists will definitely be different. It will be first about being able to speak the language of the business, meaning how you translate data insight, predictive modeling, all of this into actionable insight or business impact. And it would be about how you collaborate with the rest of the business. It's not just how fast you can build something, how fast you can do a notebook in Python or your predictive models themselves. It's about how you actually build this bridge with the business. And obviously those things are important, but we also have to be cognizant of the fact that technology will evolve in the future. There will be new tools and technologies, and they will still need to keep this level of flexibility and get to understand quickly what the next tools are that they need to use, the new languages or whatever, to get there. >>As you look back on 2020, what are you thinking? What are you telling people as we head into next year? >>Yeah, I think it's very interesting, right? This crisis has told us that the world really can change from one day to the next.
And this has dramatic and profound aspects. For example, companies all of a sudden saw their revenue line dropping, and they had to do less with data. For some other companies it was the reverse, right? All of a sudden they were online, like Instacart, for example, and their business completely changed from one day to the other. So this agility of adjusting the resources that you have to the task, a need that can change, using a solution like Snowflake really helps with that. And we saw both in our customers: some customers, from one day to the next, were growing big time, because their business benefited from COVID, but others had to drop. And what is nice with the cloud is that it allows you to adjust compute resources to your business needs, and really adjust them in an hour. The other aspect is understanding what is happening, right? You need to analyze. We saw all our customers basically wanted to understand: what is going to be the impact on my business? How can I adapt? How can I adjust? And for that, they needed to analyze data. And of course, a lot of data, which is not necessarily data about their business, but also data from the outside. COVID data, for example: what is the state, what is the impact, the geographic impact of COVID, all the time. And access to this data is critical. So this is the promise of the Data Cloud, right? Having one single place where you can put all the data of the world.
So our customers all of a sudden started to consume the COVID data from our data marketplace, and we had literally thousands of customers looking at this data, analyzing this data to make good decisions. So this agility, adapting from one hour to the next, is really critical. And that goes with data, with cloud, with adjusting resources, and that doesn't exist on premises. So indeed, I think the lesson learned is that we are living in a world which is changing all the time, and we have to understand it, we have to adjust, and that's why cloud in some ways is great. >> Excellent. Thank you. You know, in theCUBE we like to talk about disruption, of course. Who doesn't? And also, I mean, you look at AI and the impact that it's beginning to have, and kind of pre-COVID, you look at some of the industries that were getting disrupted. We talked about digital transformation, and you had on the one end of the spectrum industries like publishing, which are highly disrupted, or taxis. And you could say, okay, well, that's bits versus atoms, the old Negroponte thing. But then the flip side of that: look at financial services, which hadn't been dramatically disrupted; certainly healthcare, which is ripe for disruption; defense. So there were a number of industries that really hadn't leaned into digital transformation. If it ain't broke, don't fix it. Not on my watch. There was this complacency. And then, of course, COVID broke everything. So, Florian, I wonder if you could comment: what industry or industries do you think are going to be most impacted by data science and what I call machine intelligence, or AI, in the coming years and decades?
>> Honestly, I think it's all of them, or at least most of them, because for some industries the impact is very visible, because we're talking about brand new products, drones, flying cars or whatever, that are very visible to us. But for others, we are talking about profound changes in the way you operate as an organization. Even if the financial industry itself doesn't seem to be so impacted when you look at it from the consumer side or the outside, in fact, internally, it's probably impacted just because of the way you use data. The flexibility you need, the kind of cost gain you can get by leveraging the latest technologies, is just enormous, and so it will actually transform that industry also. And overall, I think that 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity meaning that when you've got a crisis, you actually need data and AI more than before. You need to actually call the people from data into the room to make better decisions and look forward, not backward. And I think that's a very important learning from 2020 that will tell us things about 2021. And the resilience: data analytics today is a function transforming every industry, and it is so important that it's something that needs to work. So the infrastructure needs to work, the infrastructure needs to be super resilient, so probably not on-prem, or not fully on-prem, at some point. And it's the kind of resilience where you need to be able to plan for literally anything, where no hypothesis in terms of behaviors can be taken for granted. And that's something that is new, and which is just signaling that we're getting to the next step for data analytics. >> I wonder, Benoit, if you have anything to add to that. I mean, I often wonder, you know, when are machines going to be able to make better diagnoses than doctors? Some people say they already can. Will the traditional banks in financial services lose control of payment systems?
What's going to happen to big retail stores? I mean, maybe bring us home with some of your final thoughts. >> Yeah, I would say I don't see that as a negative, right? The human being will always be involved very closely, but the machine and the data can really help see correlations in the data that would be impossible for a human being alone to discover. So I think it's going to be a complement, not a replacement. And everything that has made us faster doesn't mean that we have less work to do. It means that we can do more. And we have so much to do that I would not be worried about the effect of being more efficient and better at our work. And indeed, I fundamentally think that data, processing of images, doing AI on these images, discovering patterns and potentially flagging disease way earlier than was possible, is going to have a huge impact in health care. And as Florian was saying, every industry is going to be impacted by that technology. So, yeah, I'm very optimistic. >> Great, guys. I wish we had more time. I've got to leave it there, but thanks so much for coming on theCUBE. It was really a pleasure having you.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Volonte | PERSON | 0.99+ |
Florian Duetto | PERSON | 0.99+ |
Hilary Mason | PERSON | 0.99+ |
Florian Hal Varian | PERSON | 0.99+ |
Florian | PERSON | 0.99+ |
Benoit | PERSON | 0.99+ |
Ryan | PERSON | 0.99+ |
Ben Wa | PERSON | 0.99+ |
Data Aiko | ORGANIZATION | 0.99+ |
2020 | DATE | 0.99+ |
10 years | QUANTITY | 0.99+ |
Lee | PERSON | 0.99+ |
Wa Dodgeville | PERSON | 0.99+ |
next year | DATE | 0.99+ |
python | TITLE | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
one place | QUANTITY | 0.99+ |
one hour | QUANTITY | 0.98+ |
a decade ago | DATE | 0.98+ |
Floyd | PERSON | 0.98+ |
2021 | DATE | 0.98+ |
one day | QUANTITY | 0.98+ |
both | QUANTITY | 0.97+ |
today | DATE | 0.97+ |
first generation | QUANTITY | 0.96+ |
Adam | PERSON | 0.93+ |
Onda | ORGANIZATION | 0.93+ |
one single place | QUANTITY | 0.93+ |
florian | PERSON | 0.93+ |
each workload | QUANTITY | 0.92+ |
one | QUANTITY | 0.91+ |
four | QUANTITY | 0.9+ |
few years ago | DATE | 0.88+ |
thousands of customers | QUANTITY | 0.88+ |
Cube | COMMERCIAL_ITEM | 0.87+ |
first data scientist | QUANTITY | 0.84+ |
single | QUANTITY | 0.83+ |
Asai | PERSON | 0.82+ |
two world | QUANTITY | 0.81+ |
first era | QUANTITY | 0.74+ |
next 10 years | DATE | 0.74+ |
Negroponte | PERSON | 0.73+ |
Zaveri | ORGANIZATION | 0.72+ |
Dataiku | ORGANIZATION | 0.7+ |
Cube | ORGANIZATION | 0.64+ |
Ajai | ORGANIZATION | 0.58+ |
years | DATE | 0.57+ |
covitz | PERSON | 0.53+ |
decades | QUANTITY | 0.52+ |
Cube | PERSON | 0.45+ |
Snowflake | TITLE | 0.45+ |
Seidel | ORGANIZATION | 0.43+ |
snowflake | EVENT | 0.35+ |
Seidel | COMMERCIAL_ITEM | 0.34+ |
Democratizing AI & Advanced Analytics with Dataiku x Snowflake | Snowflake Data Cloud Summit
>> My name is Dave Vellante, and with me are two world-class technologists, visionaries and entrepreneurs. Benoit Dageville co-founded Snowflake and is now the President of the Product Division, and Florian Douetteau is the co-founder and CEO of Dataiku. Gentlemen, welcome to theCUBE. Two first-timers, love it. >> Yup, great to be here. >> Now Florian, you and Benoit have a number of customers in common, and I've said many times on theCUBE that the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data, with the cloud as really the scale platform. So is that premise relevant to you, do you buy that? And why do you think Snowflake and Dataiku make a good match for customers? >> I think so, because our values are aligned. It's all about, actually, today, lowering complexity for our customers, so you close the gap. We need to commoditize the access to data, the access to technology; it's not only about data. Data is important, but it's also about the impact of data: how can you make the best out of data as fast as possible, as easily as possible, within an organization. And another value is just the openness of the platform, building a future together. Having a platform that is not just about the platform, but also about the ecosystem of partners around it, bringing the level of accessibility and flexibility you need for the 10 years ahead. >> Yeah, so that's key, that it's not just data. It's turning data into insights. Now Benoit, you came out of the world of very powerful, but highly complex databases. And we all know that you and the Snowflake team get very high marks for really radically simplifying customers' lives. But can you talk specifically about the types of challenges that your customers are using Snowflake to solve?
>> Yeah, so the challenge before Snowflake, I would say, was really to put all the data in one place, and run all the compute, all the workloads that you wanted to run, against that data. And of course existing legacy platforms were not able to support that level of concurrency. Many workloads, we talk about machine learning, data science, data engineering, data warehouse, big data workloads, all running in one place didn't make sense at all. And therefore what customers did was to create silos, silos of data everywhere, with different systems each having a subset of the data. And of course then you cannot analyze this data in one place. So at Snowflake, we really solved that problem by creating a single architecture where you can put all the data in the cloud. So it's really cloud native. We really thought about how to solve that problem, how to leverage cloud and the elasticity of cloud to really put all the data in one place, but at the same time not run all workloads in the same place. So each workload that runs in Snowflake has its dedicated compute resources to run. And that makes it agile, right? Florian talked about data scientists having to run analyses, so they need a lot of compute resources, but only for a few hours. And with Snowflake, they can run this new workload, add this workload to the system, and get the compute resources that they need to run it. And then when it's over, they can shut down their system; it will automatically shut down. Therefore they would not pay for the resources that they don't use. So it's a very agile system, where you can do this analysis when you need to, and you have all the power to run all these workloads at the same time. >> Well, it's profound what you guys built. I mean, of course everybody's trying to copy it now. I remember the notion of bringing compute to the data, in the Hadoop days.
And I think that, as I say, everybody is sort of following suit now, or trying to. Florian, I've got to say, the first data scientist I ever interviewed on theCUBE was the amazing Hilary Mason, right after she started at Bitly, and she made data science sound so compelling, but data science is hard. So same question for you: what do you see as the biggest challenges that customers are facing with data science? >> The biggest challenge, from my perspective, is that once you solve the issue of the data silo with Snowflake, you don't want to bring in another silo, which would be a silo of skills. And essentially, there's a talent gap between the talent available on the market, or the ability to actually find, recruit and train data scientists, and what needs to be done. And so you actually need to simplify the access to technologies such that every organization can make it, whatever the talent, by bridging that gap. And to get there, there's a need for actually breaking up the silos, having a collaborative approach where technologists and the business work together, and actually all put their hands into those data projects together. >> It makes sense. Florian, let's stay with you for a minute, if I can. Your observation space is pretty global. And so you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends, maybe differences between regions, or maybe within different industries? What are you seeing? >> Yeah, definitely I do see trends, not geographic that much, but much more in terms of the maturity of certain industries and certain sectors. Certain industries invested a lot in terms of data, data access, the ability to store data, as well as experience, and have now reached a level of maturity where they can invest more and get to the next steps.
And it's really relying on the ability of certain leaders, certain organizations, actually, to have built this long-term data strategy a few years ago, and to now start reaping the benefits. >> A decade ago, Florian, Hal Varian famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of changed that to data scientist. And then all the statisticians became data scientists, and they got a raise. But data science requires more than just statistics acumen. What skills do you see as critical for the next generation of data science? >> Yeah, it's a great question, because I think the first generation of data scientists became data scientists because they could do some Python quickly and be flexible. And I think that the skills of the next generation of data scientists will definitely be different. It will be, first of all, being able to speak the language of the business, meaning how you translate data insight, predictive modeling, all of this, into actionable insights or business impact. And it will be about how you collaborate with the rest of the business. It's not just how fast you can build something, how fast you can do a notebook in Python, or build predictive models of some sort. It's about how you actually build this bridge with the business. And obviously those things are important, but we also must be cognizant of the fact that technology will evolve in the future. There will be new tools, new technologies, and they will still need to keep this level of flexibility to understand quickly what the next tools are that they need to use, new languages or whatever, to get there. >> As you look back on 2020, what are you thinking? What are you telling people as we head into next year? >> Yeah, I think it's very interesting, right? This crisis has told us that the world really can change from one day to the next. And this has dramatic and profound aspects.
For example, companies all of a sudden saw their revenue line dropping, and they had to do less with data. And for some other companies it was the reverse, right? All of a sudden they were online, like Instacart, for example, and their business completely changed from one day to the other. So this agility of adjusting the resources that you have to the task and the need, which can change, using a solution like Snowflake really helps with that. And we saw both in our customers. Some customers, from one day to the next, were growing big time, because they benefited from COVID and their business benefited. But others had to drop. And what is nice with cloud is that it allows you to adjust compute resources to your business needs, and really adjust it in hours. The other aspect is understanding what is happening, right? You need to analyze. We saw that all our customers basically wanted to understand: what is going to be the impact on my business? How can I adapt? How can I adjust? And for that, they needed to analyze data. And of course, a lot of data which is not necessarily data about their business, but also data from the outside. For example, COVID data: where are the States, what is the geographic impact of COVID, all the time. And access to this data is critical. So this is the promise of the Data Cloud, right? Having one single place where you can put all the data of the world. So our customers all of a sudden started to consume the COVID data from our data marketplace. And we had literally thousands of customers looking at this data, analyzing this data, to make good decisions. So this agility, adapting from one hour to the next, is really critical. And that goes with data, with cloud, with adjusting resources, and that doesn't exist on premises. So indeed, I think the lesson learned is that we are living in a world which is changing all the time, and we have to understand it. We have to adjust, and that's why cloud in some ways is great.
>> Excellent, thank you. In theCUBE we like to talk about disruption, of course, who doesn't? And also, I mean, you look at AI and the impact that it's beginning to have, and kind of pre-COVID, you look at some of the industries that were getting disrupted. Everyone talks about digital transformation. And you had on the one end of the spectrum industries like publishing, which are highly disrupted, or taxis. And you can say, okay, well, that's bits versus atoms, the old Negroponte thing. But then on the flip side, look at financial services, which hadn't been dramatically disrupted; certainly healthcare, which is ripe for disruption; defense. So there were a number of industries that really hadn't leaned into digital transformation. If it ain't broke, don't fix it. Not on my watch. There was this complacency. And then of course COVID broke everything. So Florian, I wonder if you could comment: what industry or industries do you think are going to be most impacted by data science, and what I call machine intelligence, or AI, in the coming years and decades? >> Honestly, I think it's all of them, or at least most of them, because for some industries the impact is very visible, because we are talking about brand new products, drones, flying cars, or whatever, that are very visible to us. But for others, we are talking about profound changes in the way you operate as an organization. Even if the financial industry itself doesn't seem to be so impacted when you look at it from the consumer side or the outside, in fact, internally, it's probably impacted just because of the way you use data. The flexibility you need, the kind of cost gain you can get by leveraging the latest technologies, is just enormous. And so it will actually transform that industry also.
And overall, I think that 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity meaning that when you've got a crisis, you actually need data and AI more than before. You need to actually call the people from data into the room to make better decisions, and look forward, not backward. And I think that's a very important learning from 2020 that will tell us things about 2021. And the resilience: data analytics today is a function transforming every industry, and it is so important that it's something that needs to work. So the infrastructure needs to work, the infrastructure needs to be super resilient, so probably not on-prem, or not fully on-prem, at some point. And it's the kind of resilience where you need to be able to plan for literally anything, where no hypothesis in terms of behaviors can be taken for granted. And that's something that is new, and which is just signaling that we are getting to the next step for data analytics. >> I wonder, Benoit, if you have anything to add to that. I mean, I often wonder, when are machines going to be able to make better diagnoses than doctors? Some people say they already can. Will the traditional banks in financial services lose control of payment systems? What's going to happen to big retail stores? I mean, maybe bring us home with some of your final thoughts. >> Yeah, I would say I don't see that as a negative, right? The human being will always be involved very closely, but the machine and the data can really help see correlations in the data that would be impossible for a human being alone to discover. So I think it's going to be a complement, not a replacement. And everything that has made us faster doesn't mean that we have less work to do. It means that we can do more. And we have so much to do that I would not be worried about the effect of being more efficient and better at our work.
And indeed, I fundamentally think that data, processing of images, doing AI on these images, discovering patterns, and potentially flagging disease way earlier than was possible, is going to have a huge impact in health care. And as Florian was saying, every industry is going to be impacted by that technology. So, yeah, I'm very optimistic. >> Great, guys, I wish we had more time. I've got to leave it there, but thanks so much for coming on theCUBE. It was really a pleasure having you.
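Benoit's description of per-workload compute, where each workload gets dedicated resources that shut down automatically once the work is over, can be illustrated with a toy cost model. The sketch below is a simplification and not Snowflake's actual billing or API: the `Warehouse` class, the credit rates, and the 10-minute auto-suspend window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Warehouse:
    """Hypothetical stand-in for a compute cluster dedicated to one workload."""
    name: str
    credits_per_hour: float
    auto_suspend_minutes: int = 10  # assumed idle window before shutdown
    billed_minutes: int = 0

    def run(self, busy_minutes: int) -> None:
        # Bill for the time the workload runs plus the idle window
        # that passes before the cluster suspends itself.
        self.billed_minutes += busy_minutes + self.auto_suspend_minutes

    def cost(self) -> float:
        return self.credits_per_hour * self.billed_minutes / 60


# Each workload gets its own compute, so a heavy data science job
# does not compete with the reporting workload for resources.
reporting = Warehouse("reporting", credits_per_hour=1.0)
data_science = Warehouse("data_science", credits_per_hour=8.0)

reporting.run(busy_minutes=50)       # small cluster, busy for under an hour
data_science.run(busy_minutes=170)   # large cluster, busy for a few hours

# For comparison: the large cluster left running for a full day.
always_on_cost = 8.0 * 24
```

In this toy model the large cluster is billed for three hours rather than twenty-four, which is the point Benoit makes about not paying for resources that are not in use.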
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Benoit | PERSON | 0.99+ |
Florian Douetteau | PERSON | 0.99+ |
Florian | PERSON | 0.99+ |
Benoit Dageville | PERSON | 0.99+ |
Dataiku | ORGANIZATION | 0.99+ |
2020 | DATE | 0.99+ |
Hillary Mason | PERSON | 0.99+ |
Hal Varian | PERSON | 0.99+ |
10 years | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
Germany | LOCATION | 0.99+ |
one hour | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
next year | DATE | 0.99+ |
Bitly | ORGANIZATION | 0.99+ |
one day | QUANTITY | 0.98+ |
2021 | DATE | 0.98+ |
A decade ago | DATE | 0.98+ |
one place | QUANTITY | 0.97+ |
Snowflake Data Cloud Summit | EVENT | 0.97+ |
Snowflake | TITLE | 0.96+ |
each workload | QUANTITY | 0.96+ |
today | DATE | 0.96+ |
first generation | QUANTITY | 0.96+ |
Benoir | PERSON | 0.95+ |
snowflake | EVENT | 0.94+ |
first era | QUANTITY | 0.92+ |
COVID | OTHER | 0.92+ |
single architecture | QUANTITY | 0.91+ |
thousand customers | QUANTITY | 0.9+ |
first data scientist | QUANTITY | 0.9+ |
one | QUANTITY | 0.88+ |
one single place | QUANTITY | 0.87+ |
few years ago | DATE | 0.86+ |
Negroponte | PERSON | 0.85+ |
Florain | ORGANIZATION | 0.82+ |
two world | QUANTITY | 0.81+ |
first | QUANTITY | 0.8+ |
Instacart | ORGANIZATION | 0.75+ |
next 10 years | DATE | 0.7+ |
hours | QUANTITY | 0.67+ |
Snowflake | EVENT | 0.59+ |
a minute | QUANTITY | 0.58+ |
theCUBE | ORGANIZATION | 0.55+ |
Adam | PERSON | 0.49+ |
Will Nowak, Dataiku | AWS re:Invent 2019
>> Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >> Hey, welcome back to theCUBE. Lisa Martin at AWS re:Invent '19. This is day three of theCUBE's coverage. We have two sets here. Lots of CUBE content. I'm joined by Justin Warren, the founder and chief analyst at PivotNine. Justin, how's it going? >> Great. >> Right? You still have a voice, three days in? >> Just barely. I've been trying to take care of it. >> I'm impressed. And you've probably talked to at least half of the 65,000 attendees. >> I'm trying to talk to as many as I can. >> Well, we're going to talk to another guy here. Joining us from Dataiku is Will Nowak, solutions architect. Welcome to theCUBE. >> Thanks for having me. >> You have a good voice too, after three days. >> I've been doing the best I can. >> Yeah, he's good. So, Dataiku. Interesting name. Let's start off by sharing with our audience who Dataiku is and what you guys do in technology. >> Yes. So the etymology of Dataiku: it's like haikus for data. So we say we take your data and, you know, we make poetry out of it, make your data so beautiful. >> Wow. >> But for those who are unaware, Dataiku is an enterprise data science platform. So we provide a collaborative environment for, we say, coders and clickers, kind of business analysts and native data scientists, to make use of an organization's data, build reports, and build predictive machine-learning-based models and deploy them. >> And you guys have been around for eight years? >> Eight years. >> Okay, so a startup still. >> Born in the cloud, the opportunity there, that data is no longer a liability. It's an asset, or should be. >> So we've been server-based from the start, which is one of our differentiators. And so by that we see ourselves as a collaborative platform.
Users access it through a web browser, log into a shared space and share code, and can share visual recipes, as we call them, to prepare data. >> Okay, so what are customers using the platform to do? Machine learning is pretty hot at the moment. I think it might be nearing the peak of the hype cycle. >> Pretty hot, yeah. >> What are customers actually doing on the platform? >> You know, so we really focus on enabling the enterprise. So, for example, GE has been a customer for some time now, and so GE is a great prototypical example in that they have many disparate use cases, simple things like doing customer segmentation for, you know, marketing campaigns, but also stuff like IoT predictive maintenance. So use cases kind of run the gamut, and so with Dataiku, based on open source, we're enabling all of GE's users to come into a centralized platform and access their data and manipulate it for whatever the purpose may be. >> Let's talk about marketing campaigns for a second. I'm wondering, is there integration with CRM technologies? How would a customer wanting to understand customer segmentation, or how to segment for a marketing campaign, work in conjunction with a CRM and Dataiku, for example? >> It's a great question. So again, us being a platform, we sit on a single server, something like an Amazon EC2 instance, and then we make connections into an organization's data sources. So if you're using something like Salesforce, we can seamlessly pull in data from Salesforce. You can manipulate it in Dataiku, but at the same time, maybe I also have some Excel file someone emailed me; I can bring that into my Dataiku work environment. And I also have a Redshift data table. All those things would come into the same environment. I can visualize, I can analyze, and I can prepare the data. >> I see. So you said it's based on open source. I'm a longtime fan of open source; I've been involved in it for longer than I care to remember.
Actually, that's an interesting way to base your product on that. So maybe talk us through how you came to found the company based on open source. What led to that choice? What was that decision based on? >> Yeah, for sure. So you talked about the hype cycle, how hot AI is, and so I think, again, our founders astutely recognized that this is a very fast-moving place to be. And so betting on one particular technology can be risky. So instead, by being a platform, we say, like, SQL has been the data transformation language du jour for many days now. So, of course, in Dataiku you can easily write SQL, and a lot of our visual data transformations are based on the SQL language. But also something like Python: again, it's like the language du jour for machine learning model building right now, so you can easily code in Python and maintain your Python libraries in Dataiku. And so by leveraging open source, we figured we're making our clients more future-proof. As long as they're staying in Dataiku, but using Dataiku to leverage the best in breed in open source, they'll always be kind of where they want to be in the technological landscape, as opposed to locked into some tech that is now out of date. >> What's been the appetite for making data beautiful for a legacy enterprise, like a GE, that's been around for a very long time, versus a more modern one, either born in the cloud or, as our CEO says, reborn in the cloud? What are some of the differences, but also similarities, that you see in terms of, we have to be able to use emerging tech, otherwise someone's going to come in behind us and replace us? >> Yeah, I mean, I think it's complicated, in that there's still a lot of value to be had in something like a bar chart you can rely on, right? So it's maybe not sexy.
But having good reporting and analytics is something that both, you know, 200-year-old enterprise organizations and data-native organizations, startups, need. At the same time, building predictive machine learning models and deploying those as REST API endpoints that developers can use in your organization to provide a data-driven product for your consumers, that's a more advanced use case that everyone kind of wants to be a part of. Again, Dataiku is a nice tool, which says maybe you don't have developers who are very fluent in turning out Flask applications; we can give you a place to build a predictive model and deploy that predictive model, saving you the time of writing all that code on the back end. >> One of the themes of the show has been transformation, so it sounds like Dataiku would be something that you can dip your toes in and start to get used to using, even if you're not particularly familiar with machine learning model building. >> Yeah, that's exactly right. So a big part of our product, and I'd encourage watchers to go try it out themselves, go to our website and download a free trial version, is enablement. So if you're the most sophisticated applied-math PhD there is, Dataiku is a great environment for you to code and build predictive models. If you've never built a machine learning model before, you can use Dataiku to run visual machine learning recipes, as we call them, and we also give you documentation, which is, hey, this is a random forest model. What is a random forest model? We'll tell you a little bit about it. And that's another thing that some of these enterprises have really appreciated about Dataiku: it is helping upskill their user base. >> In terms of that transformation theme that Justin just mentioned, which we're hearing a lot about, not just at this show. It's a big thing, but we hear it all the time, right?
But in terms of a customer's transformation journey, whatever you want to call it, cloud is going to be an essential enabler of being able to really leverage value from AI. So I'm just wondering, from a strategic positioning standpoint, is Dataiku positioned as a facilitator or as fuel for a cloud transformation that an enterprise would undergo? >> Again, yes, great point. So for us, I can't take the credit; this credit goes to our founders. But we've thought from the start that the cloud is an exciting proposition. Not everyone is there still in 2019. Most people, if not all of them, want to get there. Also, many of our clients want multi-cloud. And Dataiku says: if you want to be on-prem, if you want to be in a single cloud subscription, if you want to be multi-cloud, again, as a platform we're just going to give you a connection to your underlying infrastructure. You can use the infrastructure that you like and just use our front end to help your analysts get value. >> I think a lot of vendors across the entire ecosystem say that customer choice is really important, and customers, particularly enterprise customers, want to be able to have lots of different options, and not all of them will be ready to go completely all-in on cloud today. It may take them years, possibly decades, to get there. So having that choice, it's something that will work with you today and will work with you tomorrow, depending on what choices you make. >> That's exactly right. Another thing we've seen a lot of, too, that Dataiku helps with, whether it's Dataiku or other tools: of course you want best in breed, but particularly for a large enterprise, you don't want people operating kind of in a Wild West, particularly in the ML and data science space. So, you know, we integrate with Jupyter notebooks, but some of our clients come to us initially.
They just have, I won't say rogue, that has a negative connotation, but maybe I will say rogue, rogue data scientists who are just tapping into some data store. They're using Jupyter notebooks to build a predictive model, but then to actually productionalize that, to get sustainable value out of it, it's too one-off. And so having a centralized platform like Dataiku, where you can say this is where we're going to use our central model repository, that's something where businesses can sleep easier at night, because they know: where is my ML development happening? It's happening in one ecosystem. What tools is it happening with? Well, best-in-breed open source. So again, you kind of get the best of both worlds with Dataiku. >> It sounds like it's more about the operations of machine learning that's really, really important, rather than just the pure technology. Yes, that's important as well, and you need to have the data scientists to build it, but having something that allows you to operationalize it so that you can just bake it into what you do every day as a business. >> Yeah, I think at a conference like this, all about tech, it's easy to forget what we firmly believe, which is that AI, and maybe tech more broadly, it's still human problems at the core, right? Once you get the tech right, the code runs correctly, the code is written correctly. But human interactions, project management, model deployment in an organization, these are really hard, human-centered problems. So having tech that enables that human-centric collaboration helps with that, we find. >> Let's talk about some of the things that we can't ever go to an event and not talk about, and that is, with respect to data quality, reliability and security, how does Dataiku facilitate those three cornerstones? >> Yeah, sure. So, again, viewers, I would encourage you to check it out. Dataiku has some nice visual indications of data quality.
So an analyst or data scientist can come in and very easily understand, you know, does this data quality conform to the standards that my organization has set? And what I mean by standards, that can be configured, right? So does this column have the appropriate schema? Does it have the appropriate cardinality? These are things that an individual might decide to use. And then for security: Dataiku has its own security mechanisms. However, to this point about incorporating best-in-breed tech, we'll also work with whatever underlying security mechanisms organizations have in place. So, for instance, if you're using AWS and you have IAM roles to manage your security, Dataiku can import those and apply those to the Dataiku environment. Or if you're on-premises using something like Hadoop, you can use something like Kerberos as the technology to, again, manage access to resources. So we're taking the best in breed that the organization has already invested time, energy and resources into and saying: we're not trying to compete with them, but rather we're trying to enable organizations to use these technologies efficiently. >> Yeah, I like that consistency of customer choice. We spoke about that just before. I'm seeing that here with the choices around, well, if you're on this particular platform, we'll integrate with whatever the tools are there. People underestimate how important that is for enterprises, that it has to be a heterogeneous environment. Playing well with others is actually quite important. >> Yeah, and on that point, the combination of heterogeneity but also uniformity, it's a hard balance to strike, and I think it's really important: giving someone a unified environment but still choice at the same time. Like at a good restaurant or something, you want to be able to pick your dish, but you want to know that the overall quality is high.
And so having that consistent ecosystem, I think, really helps. >> What are, in your opinion, some of the next industries that you see that are really ripe to start really leveraging machine learning to transform? You mentioned GE, a very old legacy business. If we think of, you know, what happened with the ride-hailing industry, Uber, for example, or fitness with Peloton, or Pinterest with visual search, what do you think is the next industry where, by taking advantage of machine learning, you guys will completely transform it and our lives? >> I mean, the easy answer that I'll give, because it's easy to say it's gonna transform but hard to operationalize, is health care, right? So there is structured data, but the data quality is so disparate and heterogeneous. So I think, you know, a lot of this, again, is a human-centered problem. If people could decide on data standards. And also data privacy is, of course, a huge issue. We talked about data security internally, but also, as a customer, what data do I want, you know, this hospital, this health care provider, to have access to? Those are human issues we have to resolve. But conditional on that being resolved, figuring out a way to anonymize data and respect data privacy but have consistent data structure, then we could say, "Hey, let's really set these AI/ML models loose and figure out things like personalized medicine," which we're starting to get to. But I feel like there's still a lot of room to go. >> That sounds like it's an exciting time to be in machine learning. People should definitely check out products such as Dataiku and see what happens. >> Last question for you. So much news has come out in the last three days, it's mind-boggling. What are some of the takeaways, some of the things that you've heard from Andy Jassy to (indistinct) this morning? >> Yeah, I think a big thing for me, which was something for me before this week.
But it's always nice to hear Amazon reassert the concept of white-box AI. We've been talking about that at Dataiku for some time. Everyone wants performant AI or ML solutions, but increasingly there's a real appetite publicly for interpretability, and so you have to be responsible. You have to have interpretable AI. And so it's nice to hear a leader like Amazon echo that. At Dataiku, that's something we've been talking about since our start. >> A little bit validating then for Dataiku, for sure, for sure. Well, thank you for joining Justin and me on theCUBE this afternoon. We appreciate it. Appreciate it. All right. For my co-host, Justin Warren, I'm Lisa Martin, and you're watching theCUBE from Vegas. It's AWS re:Invent '19.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Justin Warren | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
2019 | DATE | 0.99+ |
Justin | PERSON | 0.99+ |
Andy Jassy | PERSON | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
Will Nowak | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Eight years | QUANTITY | 0.99+ |
python | TITLE | 0.99+ |
200 year | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
Vegas | LOCATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
echo | COMMERCIAL_ITEM | 0.99+ |
Sergey | PERSON | 0.99+ |
today | DATE | 0.99+ |
tomorrow | DATE | 0.99+ |
Novak | PERSON | 0.99+ |
two sets | QUANTITY | 0.99+ |
Three days | QUANTITY | 0.99+ |
Virginia | LOCATION | 0.98+ |
Dataiku | PERSON | 0.98+ |
both | QUANTITY | 0.98+ |
Dead Rock | TITLE | 0.97+ |
single server | QUANTITY | 0.97+ |
both worlds | QUANTITY | 0.97+ |
three day | QUANTITY | 0.97+ |
Serge | PERSON | 0.96+ |
one | QUANTITY | 0.96+ |
single cloud | QUANTITY | 0.96+ |
Retek | ORGANIZATION | 0.95+ |
uber | ORGANIZATION | 0.95+ |
Salesforce | ORGANIZATION | 0.95+ |
a day | QUANTITY | 0.93+ |
Day three | QUANTITY | 0.93+ |
One | QUANTITY | 0.91+ |
65,000 attendees | QUANTITY | 0.91+ |
This'll Morning | TITLE | 0.9+ |
Coyote | ORGANIZATION | 0.89+ |
Amazon Web | ORGANIZATION | 0.89+ |
Kerberos | ORGANIZATION | 0.88+ |
decades | QUANTITY | 0.88+ |
one ecosystem | QUANTITY | 0.87+ |
ec2 | TITLE | 0.85+ |
last three days | DATE | 0.82+ |
three cornerstones | QUANTITY | 0.79+ |
G | ORGANIZATION | 0.79+ |
19 | QUANTITY | 0.78+ |
eight years | QUANTITY | 0.74+ |
Cube | ORGANIZATION | 0.74+ |
this week | DATE | 0.73+ |
Eso | ORGANIZATION | 0.72+ |
G E | ORGANIZATION | 0.7+ |
Pivot nine | ORGANIZATION | 0.69+ |
excel | TITLE | 0.67+ |
Saletan | PERSON | 0.59+ |
Cubes | ORGANIZATION | 0.57+ |
second | QUANTITY | 0.57+ |
Yuka | COMMERCIAL_ITEM | 0.53+ |
half | QUANTITY | 0.5+ |
Jupiter | ORGANIZATION | 0.48+ |
Invent 2019 | EVENT | 0.46+ |
Reinvent 19 | EVENT | 0.39+ |
invent | EVENT | 0.24+ |
Breaking Analysis: AI Goes Mainstream But ROI Remains Elusive
>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is "Breaking Analysis" with Dave Vellante. >> A decade of big data investments combined with cloud scale, the rise of much more cost-effective processing power, and the introduction of advanced tooling has catapulted machine intelligence to the forefront of technology investments. No matter what job you have, your operation will be AI powered within five years and machines may actually even be doing your job. Artificial intelligence is being infused into applications, infrastructure, equipment, and virtually every aspect of our lives. AI is proving to be extremely helpful at things like controlling vehicles, speeding up medical diagnoses, processing language, advancing science, and generally raising the stakes on what it means to apply technology for business advantage. But business value realization has been a challenge for most organizations due to lack of skills, complexity of programming models, immature technology integration, sizable upfront investments, ethical concerns, and lack of business alignment. Mastering AI technology will not be a requirement for success in our view. However, figuring out how and where to apply AI to your business will be crucial. That means understanding the business case, picking the right technology partner, experimenting in bite-sized chunks, and quickly identifying winners to double down on from an investment standpoint. Hello and welcome to this week's Wikibon CUBE Insights powered by ETR. In this breaking analysis, we update you on the state of AI and what it means for the competition. And to do so, we invite into our studios Andy Thurai of Constellation Research. Andy covers AI deeply. He knows the players, he knows the pitfalls of AI investment, and he's a collaborator. Andy, great to have you on the program. Thanks for coming into our CUBE studios. >> Thanks for having me on. >> You're very welcome.
Okay, let's set the table with a premise and a series of assertions we want to test with Andy. I'm going to lay 'em out. And then Andy, I'd love for you to comment. So, first of all, according to McKinsey, AI adoption has more than doubled since 2017, but only 10% of organizations report seeing significant ROI. That's a BCG and MIT study. And part of that challenge of AI is it requires data, it requires good data, data proficiency, which is not trivial, as you know. Firms that can master both data and AI, we believe are going to have a competitive advantage this decade. Hyperscalers, as we'll show you, dominate AI and ML. We'll show you some data on that. And having said that, there's plenty of room for specialists. They need to partner with the cloud vendors for go-to-market productivity. And finally, organizations increasingly have to put data and AI at the center of their enterprises. And to do that, most are going to rely on vendor R&D to leverage AI and ML. In other words, Andy, they're going to buy it and apply it as opposed to build it. What are your thoughts on that setup and that premise? >> Yeah, I see that a lot happening in the field, right? So first of all, the only 10% realizing a return on investment. That's so true because, as we talked about earlier, most companies are still in the innovation cycle. So they're trying to innovate and see what they can do to apply it. A lot of these times when you look at the solutions, what they come up with or the models they create, the experimentation they do, most times they don't even have a good business case to solve, right? So they just experiment and then they figure it out, "Oh my God, this model is working. Can we do something to solve it?" So it's like you found a hammer and then you're trying to find the needle kind of thing, right? That never works. >> 'Cause it's cool or whatever it is. >> It is, right?
So that's why I always advise, when they come to me and ask me things like, "Hey, what's the right way to do it? What is the secret sauce?" And we talked about this. The first thing I tell them is, "Find out what is the business case that's having the most amount of problems, that can be solved using some of the AI use cases," right? Not all of them can be solved. Even after you experiment, do the whole nine yards, spend millions of dollars on that, right? And later on you make it efficient only by saving maybe $50,000 for the company or $100,000 for the company, is it really even worth the experiment, right? So you got to start by saying, you know, where's the base for this happening? Where's the need? What's a business use case? It doesn't have to be about cost efficiency and saving money in the existing processes. It could be a new thing. You want to bring in a new revenue stream, but figure out what is a business use case, how much money potentially I can make off of that. The same way that start-ups go after it. Right? >> Yeah. Pretty straightforward. All right, let's take a look at where ML and AI fit relative to the other hot sectors of the ETR dataset. This XY graph shows net score, or spending velocity, on the vertical axis and presence in the survey, they call it sector pervasion, for the October survey; the January survey's in the field. Then that squiggly line on ML/AI represents the progression. Since the January '21 survey, you can see the downward trajectory. And we position ML and AI relative to the other big four hot sectors or big three, including, ML/AI is four. Containers, cloud and RPA. These have consistently performed above that magic 40% red dotted line for most of the past two years. Anything above 40%, we think is highly elevated. And we've just included analytics and big data for context and relevant adjacentness, if you will. Now note that green arrow moving toward, you know, the 40% mark on ML/AI.
I got a glimpse of the January survey, which is in the field. It's got more than a thousand responses already, and it's trending up for the current survey. So Andy, what do you make of this downward trajectory over the past seven quarters and the presumed uptick in the coming months? >> So one of the things you have to keep in mind is when the pandemic happened, it was about survival mode, right? So when somebody's in survival mode, what happens, the luxury and the innovations get cut. That's what happens. And this is exactly what happened in this situation. So as you can see in the last seven quarters, which is almost dating back close to the pandemic, everybody was trying to keep their operations alive, especially digital operations. How do I keep the lights on? That's the most important thing for them. So while the amount spent on AI/ML is less overall, I still think the AI/ML spend on sort of like employee experience or the IT ops, AIOps, MLOps, as we talked about, some of those areas actually went up. There are companies, we talked about it, Atlassian had a lot of platform issues till the amount of money people are spending on that is exorbitant and simply because they are offering the solution that was not available other way. So there are companies out there, you can take AIOps or incident management for that matter, right? A lot of companies have a digital insurance, they don't know how to properly manage it. How do you find an incident and solve it immediately? That's all using AI/ML, and some of those areas are actually growing unbelievably, the companies in that area. >> So this is a really good point. If you can bring up that chart again, what Andy's saying is a lot of the companies in the ETR taxonomy that are doing things with AI might not necessarily show up in a granular fashion. And I think the other point I would make is, these are still highly elevated numbers. If you put on like storage and servers, they would read way, way down the list.
And, look, in the pandemic, we had to deal with work from home, we had to re-architect the network, we had to worry about security. So those are really good points that you made there. Let's unpack this a little bit and look at the ML/AI sector and the ETR data and specifically at the players and get Andy to comment on this. This chart here shows the same XY dimensions, and it just notes some of the players that specifically have services and products that people spend money on, that CIOs and IT buyers can comment on. So the table insert shows how the companies are plotted, its net score, and then the Ns in the survey. And Andy, the hyperscalers are dominant, as you can see. You see Databricks there showing strong as a specialist, and then you got a pack of six or seven in there. And then Oracle and IBM, kind of the big whales of yesteryear, are in the mix. And to your point, companies like Salesforce that you mentioned to me offline aren't in that mix, but they do a lot in AI. But what are your takeaways from that data? >> If you could put the slide back on please. I want to make quick comments on a couple of those. So the first one is, it's surprising other hyperscalers, right? As you and I talked about this earlier, AWS is more about Lego blocks. We discussed that, right? >> Like what? Like a SageMaker as an example. >> We'll give you all the components that you need. Whether it's MLOps component or whether it's, CodeWhisperer that we talked about, or a oral platform or data or data, whatever you want. They'll give you the blocks and then you'll build things on top of it, right? But Google took a different way. Matter of fact, if we did those numbers a few years ago, Google would've been number one because they did a lot of work with their acquisition of DeepMind and other things. They were way ahead of the pack when it comes to AI for the longest time.
Now, I think Microsoft's move of partnering and taking a huge competitor out with OpenAI is unbelievable. You saw that everybody is talking about chat GPI, right? The OpenAI tool, ChatGPT rather. Remember, as Warren Buffett says, when my laundry lady comes and talks to me about the stock market, it's heated up. So that's how it's heated up. Everybody's using ChatGPT. What that means at the end of the day is they're creating, it's still in beta, keep in mind. It's not fully... >> Can you play with it a little bit? >> I have a little bit. >> I have, but it's good and it's not good. You know what I mean? >> Look, so at the end of the day, you take the massive text of all the available text in the world today, mash them all together. And then you ask a question, it's going to basically search through that and figure it out and answer that back. Yes, it's good. But again, as we discussed, if there's no business use case of what problem you're going to solve, this is building hype. But then eventually they'll figure out, for example, all your chats, online chats, could be aided by your AI chat bots, which is already there, which is not there at that level. This could help build that, right? Or the other thing we talked about is one of the areas where I'm more concerned about is that it is able to produce equal enough original text at the level that humans can produce, for example, ChatGPT or the equal enough, the large language transformer can help you write stories as if Shakespeare wrote it. Pretty close to it. It'll learn from that. So when it comes down to it, talk about creating messages, articles, blogs, especially during political seasons, not necessarily just in the US, but anywhere for that matter. If people are able to produce at the emission speed and throw it at the consumers and confuse them, the elections can be won, the governments can be toppled.
>> Because to your point about chatbots is chatbots have obviously, reduced the number of bodies that you need to support chat. But they haven't solved the problem of serving consumers. Most of the chat bots are conditioned response, which of the following best describes your problem? >> The current chatbot. >> Yeah. Hey, did we solve your problem? No. Is the answer. So that has some real potential. But if you could bring up that slide again, Ken, I mean you've got the hyperscalers that are dominant. You talked about Google and Microsoft is ubiquitous, they seem to be dominant in every ETR category. But then you have these other specialists. How do those guys compete? And maybe you could even, cite some of the guys that you know, how do they compete with the hyperscalers? What's the key there for like a C3 ai or some of the others that are on there? >> So I've spoken with at least two of the CEOs of the smaller companies that you have on the list. One of the things they're worried about is that if they continue to operate independently without being part of hyperscaler, either the hyperscalers will develop something to compete against them full scale, or they'll become irrelevant. Because at the end of the day, look, cloud is dominant. Not many companies are going to do like AI modeling and training and deployment the whole nine yards by independent by themselves. They're going to depend on one of the clouds, right? So if they're already going to be in the cloud, by taking them out to come to you, it's going to be extremely difficult issue to solve. So all these companies are going and saying, "You know what? We need to be in hyperscalers." For example, you could have looked at DataRobot recently, they made announcements, Google and AWS, and they are all over the place. So you need to go where the customers are. Right? >> All right, before we go on, I want to share some other data from ETR and why people adopt AI and get your feedback. 
So the data historically shows that feature breadth and technical capabilities were the main decision points for AI adoption, historically. Which says to me that there's too much focus on technology. In your view, is that changing? Does it have to change? Will it change? >> Yes. Simple answer is yes. So here's the thing. The data you're speaking from is from previous years. >> Yes. >> I can guarantee you, if you look at the latest data that's coming in now, those two will be secondary and tertiary points. The number one would be about ROI. And how do I achieve it? I've spent a ton of money on all of my experiments. This is the same theme I'm seeing across the board when talking to everybody who's spending money on AI. I've spent so much money on it. When can I get it live in production? How quickly can I get it? Because, you know, the board is breathing down their neck. You already spent this much money. Show me something that's valuable. So the ROI is going to become, take it from me, I'm predicting this for 2023, that's going to become number one. >> Yeah, and if people focus on it, they'll figure it out. Okay. Let's take a look at some of the top players that won, some of the names we just looked at and double click on that and break down their spending profile. So the chart here shows the net score, how net score is calculated. So pay attention to the second set of bars, that's Databricks, who was pretty prominent on the previous chart. And we've annotated the colors. The lime green is, we're bringing the platform in new. The forest green is, we're going to spend 6% or more relative to last year. And the gray is flat spending. The pinkish is our spending's going to be down on AI and ML, 6% or worse. And the red is churn. So you don't want big red. You subtract the reds from the greens and you get net score, which is shown by those blue dots that you see there. So AWS has the highest net score and very little churn. I mean, low single-digit churn.
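[Editor's note: the net score arithmetic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ETR's actual implementation. The class name, field names, and example percentages are hypothetical; only the bucket definitions, the "greens minus reds" rule, and the 40% "highly elevated" mark come from the segment.]

```python
from dataclasses import dataclass

# Mark the segment describes as "highly elevated" (40% line on the chart).
ELEVATED_THRESHOLD = 40.0


@dataclass
class SpendingProfile:
    """Percent of respondents in each bucket (hypothetical field names)."""
    new_adoption: float  # lime green: bringing the platform in new
    increase: float      # forest green: spending up 6% or more
    flat: float          # gray: flat spending
    decrease: float      # pinkish: spending down 6% or worse
    churn: float         # red: leaving the platform


def net_score(profile: SpendingProfile) -> float:
    """Greens minus reds; flat spending does not move the score."""
    greens = profile.new_adoption + profile.increase
    reds = profile.decrease + profile.churn
    return greens - reds


def is_elevated(profile: SpendingProfile) -> bool:
    """True when the net score sits above the 40% mark."""
    return net_score(profile) > ELEVATED_THRESHOLD


# Made-up numbers for illustration only, not survey data.
example = SpendingProfile(
    new_adoption=10.0, increase=45.0, flat=35.0, decrease=7.0, churn=3.0
)
print(net_score(example), is_elevated(example))
```

[With these made-up inputs the greens total 55 and the reds total 10, so the vendor would plot above the 40% line. A vendor with a large red bar, as discussed next for IBM Watson, loses score point for point.]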
But notably, you see Databricks and DataRobot are next in line with Microsoft and Google also, they've got very low churn. Andy, what are your thoughts on this data? >> So a couple of things stand out to me. Most of them are in line with my conversations with customers. A couple of them stood out to me on how bad IBM Watson is doing. >> Yeah, bring that back up if you would. Let's take a look at that. IBM Watson is the far right and the red, that bright red is churning and again, you want low red here. Why do you think that is? >> Well, so look, IBM has been in the forefront of innovating things for many, many years now, right? And over the course of years we talked about this, they moved from a product innovation centric company into more of a services company. And over the years, at one point, you know, they were making the majority of their money from services. Now things have changed. Arvind has taken over, he came from research. So he's doing a great job of trying to reinvent themselves as a company. But it's going to have a long way to catch up. IBM Watson, if you think about it, that played what, Jeopardy and chess years ago, like 15 years ago? >> It was jaw dropping when you first saw it. And then they weren't able to commercialize that. >> Yeah. >> And you're making a good point. When Gerstner took over IBM at the time, John Akers wanted to split the company up. He wanted to have a database company, he wanted to have a storage company. Because that's where the industry trend was. Gerstner said no, he came from AMEX, right? He came from American Express. He said, "No, we're going to have a single throat to choke for the customer." They bought PWC for relatively short money. I think it was $15 billion, completely transformed and I would argue saved IBM. But the trade off was, it sort of took them out of product leadership. And so from Gerstner to Palmisano to Rometty, it was really a services led company.
And I think Arvind is really bringing it back to a product company with strong consulting. I mean, that's one of the pillars. And so I think they've got a strong story in data and AI. They just got to sort of bring it together better. Bring that chart up one more time. The other point is Oracle. Oracle sort of has the dominant lock-in for mission critical database and they're sort of applying AI there. But to your point, they're really not an AI company in the sense that they're taking unstructured data and doing sort of new things. It's really about how to make Oracle better, right? >> Well, you got to remember, Oracle is about databases for structured data. So in yesterday's world, they were the dominant database. But you know, if you are to start storing like videos and texts and audio and other things, and then start doing vector search and all that, Oracle is not necessarily the database company of choice. And their strongest thing being apps and building AI into the apps? They are kind of surviving in that area. But again, I wouldn't name them as an AI company, right? But the other thing that surprised me in that list, what you showed me, is yes, AWS is number one. >> Bring that back up if you would, Ken. >> AWS is number one, as it should be. But what actually caught me by surprise is how DataRobot is holding, you know? I mean, look at that. In either net new addition or expansion, DataRobot seems to be doing equally well, even better than Microsoft and Google. That surprises me. >> DataRobot's, and again, this is a function of spending momentum. So remember from the previous chart that Microsoft and Google are much, much larger than DataRobot. DataRobot is more niche. But with spending velocity, and it has always had strong spending velocity, despite some of the recent challenges, organizational challenges. And then you see these other specialists, H2O.ai, Anaconda, Dataiku, a little bit of red showing there for C3.ai.
But these again, to stress, are the sort of specialists other than obviously the hyperscalers. These are the specialists in AI. All right, so we hit the bigger names in the sector. Now let's take a look at the emerging technology companies. And one of the gems of the ETR dataset is the emerging technology survey. It's called ETS. They used to just do it like twice a year. It's now run four times a year. I just discovered it kind of mid-2022. And it's exclusively focused on private companies that are potential disruptors, they might be M&A candidates and if they've raised enough money, they could be acquirers of companies as well. So Databricks would be an example. They've made a number of investments in companies. Snyk would be another good example. Companies that are private, but they're buyers, they hope to go IPO at some point in time. So this chart here shows the emerging companies in the ML/AI sector of the ETR dataset. So the dimensions of this are similar, they're net sentiment on the Y axis and mind share on the X axis. Basically, the ETS study measures awareness on the X axis and intent to do something with, evaluate or implement or not, on that vertical axis. So it's like net score on the vertical where negatives are subtracted from the positives. And again, mind share is vendor awareness. That's the horizontal axis. Now that inserted table shows net sentiment and the Ns in the survey, which informs the position of the dots. And you'll notice we're plotting TensorFlow as well. We know that's not a company, but it's there for reference as open source tooling is an option for customers. And ETR sometimes likes to show that as a reference point. Now we've also drawn a line for Databricks to show how relatively dominant they've become in the past 10 ETS surveys and sort of mind share going back to late 2018. And you can see a dozen or so other emerging tech vendors.
So Andy, I want you to share your thoughts on these players. Who are the ones to watch? Name some names. We'll bring that data back up as you comment. >> So Databricks, as you said, remember we talked about how Oracle is not necessarily the database of choice, you know? So Databricks is kind of trying to solve some of the issues for AI/ML workloads, right? And the problem is also that there is no one company that could solve all of the problems. For example, if you look at the names in here, some of them are database names, some of them are platform names, some of them are like MLOps companies like DataRobot (indistinct) and others. And some of them are like feature-based companies like, you know, Tecton and stuff. >> So it's a mix of those sub-sectors? >> It's a mix of those companies. >> We'll talk to ETR about that. They'd be interested in your input on how to make this more granular across these sub-sectors. You got Hugging Face in here, >> Which is NLP, yeah. >> Okay. So your take, are these companies going to get acquired? Are they going to go IPO? Are they going to merge? >> Well, most of them are going to get acquired. My prediction would be most of them will get acquired because look, at the end of the day, hyperscalers need these capabilities, right? So they're going to either create their own, and AWS is very good at doing that. They have done a lot of those things. But the other ones, particularly Azure, they're going to look at it and say, "You know what, it's going to take time for me to build this. Why don't I just go and buy you?" Right? Or even the smaller players like Oracle or IBM Cloud, they will exist. They might even take a look at them, right? So at the end of the day, a lot of these companies are going to get acquired or merged with others. >> Yeah. All right, let's wrap with some final thoughts. I'm going to make some comments, Andy, and then ask you to dig in here.
Look, despite the challenge of leveraging AI, you know, Ken, if you could bring up the next chart. We're not repeating, we're not predicting the AI winter of the 1990s. Machine intelligence. It's a superpower that's going to permeate every aspect of the technology industry. AI and data strategies have to be connected. Leveraging first party data is going to increase AI competitiveness and shorten time to value. Andy, I'd love your thoughts on that. I know you've got some thoughts on governance and AI ethics. You know, we talked about ChatGPT, deepfakes, help us unpack all these trends. >> So there's so much information packed up there, right? The AI and data strategy, that's very, very, very important. If you don't have proper data... People don't realize that your AI is the models that you built, and they're predominantly based on the data that you have. AI cannot predict something that's going to happen without knowing what it is. It needs to be trained, it needs to understand what it is you're talking about. So 99% of the time you got to have good data to train on. So this is where, as I mentioned to you, the problem is a lot of these companies can't afford to collect the real world data because it takes too long, it's too expensive. So a lot of these companies are trying to go the synthetic data way. It has its own set of issues because you can't use all... >> What's that synthetic data? Explain that. >> Synthetic data is basically not real world data, but it's created or simulated data based on real data. It looks, feels, smells, tastes like real data, but it's not exactly real data, right? This is particularly useful in the financial and healthcare industries. So at the end of the day, if you have real data about your and my medical history, even if you redact it, you can still reverse this. It's fairly easy, right? >> Yeah, yeah.
>> So by creating synthetic data, there is no correlation between the real data and the synthetic data. >> So that's part of AI ethics and privacy and, okay. >> So the synthetic data, the issue with that is that when you're trying to commingle it with real data, you can't create models based just on synthetic data because synthetic data, as I said, is artificial data. So basically you're creating artificial models, so you got to blend it in properly, and that blend is the problem. And you know, how much real data, how much synthetic data could you use? You got to use judgment between efficiency, cost, and the time duration. So that's one-- >> And risk. >> And the risk involved with that. And the secondary issue, which we talked about, is that when you're creating, okay, you take a business use case, okay, you think about investing in things, you build the whole thing out and you're trying to put it out into the market. Most companies that I talk to don't have proper governance in place. They don't have ethics standards in place. They don't worry about the biases in data, they just go on trying to solve a business case. >> It's the wild west. >> 'Cause that's where they start. It's a wild west! And then at the end of the day, when they are close to some legal litigation action or something else happens, that's when the Oh Shit! moment happens, right? And then they come in and say, "You know what, how do I fix this?" The governance, security and all of those things, ethics, bias, data bias, de-biasing, none of them can be an afterthought. It's got to start from the get-go. So you got to start at the beginning saying, "You know what, I'm going to do all of those AI programs, but before we get into this, we got to set some framework for doing all these things properly." Right? And then the-- >> Yeah. So let's go back to the key points. I want to bring up the cloud again. Because you got to get cloud right.
Getting that right matters in AI, to the points that you were making earlier. You can't just be out on an island, and the hyperscalers, they're going to obviously continue to do well. More and more data's going into the cloud and they have the native tools. To your point, in the case of AWS. Microsoft's obviously ubiquitous. Google's got great capabilities here. They've got integrated ecosystem partners that are going to continue to strengthen through the decade. What are your thoughts here? >> So a couple of things. One is the last mile ML or last mile AI that nobody's talking about. So that needs to be attended to. There are a lot of players in the market that are coming up. When I talk about last mile, I'm talking about after you're done with the experimentation of the model, how fast and quickly and efficiently can you get it to production? So that's production being-- >> Compressing that time is going to put dollars in your pocket. >> Exactly. Right. >> So once, >> If you got it right. >> If you get it right, of course. So there are a couple of issues with that. Once you figure out that the model is working, that's perfect. People don't realize, the moment the decision is made, it's like a new car. After you purchase it, the value decreases on a minute-by-minute basis. Same thing with the models. Once the model is created, you need to be in production right away because it starts losing its value on a second-by-second, minute-by-minute basis. So issue number one, how fast can I get it over there? So your deployment, your inferencing efficiently at the edge locations, your optimization, your security, all of this is at issue. But you know what is more important than that in the last mile? You keep the model up, you continue to work on it. Again, going back to the car analogy, at some point you got to figure out your car is costing more to operate than it's worth. So you got to get a new car, right? And that's the same thing with the models as well.
If your model has reached that stage, it is actually a potential risk for your operation. To give you an idea, if Uber has a model, the first time you get a car going from point A to B, it costs you $60. If the model has decayed, the next time it might give you a $40 rate. I would take it, definitely. But it's a loss for the company. The business risk associated with operating on a bad model, you should realize it immediately, pull the model out, retrain it, redeploy it. That is key. >> And that's got to be huge in security. Model recency in security, to the extent that you can get real time, is big. I mean, you see Palo Alto, CrowdStrike, a lot of other security companies are injecting AI. Again, they won't show up in the ETR ML/AI taxonomy per se as a pure play. But ServiceNow is another company that you have mentioned to me offline. AI is just getting embedded everywhere. >> Yep. >> And then I'm glad you brought up kind of real-time inferencing, 'cause a lot of the modeling, if we can go back to the last point that we're going to make, a lot of the AI today is modeling done in the cloud. The last point we wanted to make here, and I'd love to get your thoughts on this, is real-time AI inferencing, for instance at the edge, is going to become increasingly important for us. It's going to usher in new economics, new types of silicon, particularly Arm-based. We've covered that a lot on "Breaking Analysis". New tooling, new companies, and that could disrupt the sort of cloud model if new economics emerge. 'Cause cloud is obviously very centralized, and they're trying to decentralize it. But over the course of this decade we could see some real disruption there. Andy, give us your final thoughts on that. >> Yes and no. I mean at the end of the day, cloud is kind of centralized now, but a lot of these companies, including AWS, are kind of trying to decentralize that by putting in their own sub-centers and edge locations. >> Local zones, outposts. >> Yeah, exactly.
Particularly the outpost concept. And it can even become like a micro center and stuff. It won't go to the localized level of, say, a single IoT level. But again, the cloud extends itself to that level. So if there is an opportunity or need for it, the hyperscalers will figure out a way to fit that model. So I wouldn't worry too much about that, about deployment and where to have it and what to do with that. But you know, figure out the right business use case, get the right data, get the ethics and governance in place, make sure you get it to production, and make sure you pull the model out when it's not operating well. >> Excellent advice. Andy, I got to thank you for coming into the studio today, helping us with this "Breaking Analysis" segment. Outstanding collaboration and insights and input in today's episode. Hope we can do more. >> Thank you. Thanks for having me. I appreciate it. >> You're very welcome. All right. I want to thank Alex Myerson, who's on production and manages the podcast. Ken Schiffman as well. Kristen Martin and Cheryl Knight helped get the word out on social media and in our newsletters. And Rob Hof is our editor-in-chief over at SiliconANGLE. He does some great editing for us. Thank you all. Remember, all these episodes are available as podcasts. Wherever you listen, all you got to do is search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com, or you can email me at david.vellante@siliconangle.com to get in touch, or DM me at dvellante or comment on our LinkedIn posts. Please check out ETR.AI for the best survey data in the enterprise tech business. And Constellation Research: Andy publishes some awesome information there on AI and data. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching everybody and we'll see you next time on "Breaking Analysis". (gentle closing tune plays)
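The closing advice above, to pull a model out when it is no longer operating well, can be sketched as a simple check. This is only an illustration under stated assumptions: the function name `needs_retraining`, the error values, and the 20% tolerance are all hypothetical and are not described in the conversation.

```python
def needs_retraining(baseline_error, recent_errors, tolerance=0.2):
    """Return True when recent average error is worse than the baseline
    by more than `tolerance` (a relative margin, hypothetical default)."""
    if not recent_errors:
        # No recent observations, so there is no evidence of decay yet.
        return False
    recent_mean = sum(recent_errors) / len(recent_errors)
    return recent_mean > baseline_error * (1 + tolerance)


# Hypothetical pricing errors in dollars, measured at deployment and later.
stable = needs_retraining(5.0, [5.1, 4.9, 5.2])
decayed = needs_retraining(5.0, [7.0, 8.0, 9.0])
print(stable, decayed)
```

In practice the error metric, the comparison window, and the threshold would depend on the business risk involved, which is the judgment call Andy describes.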
Breaking Analysis: We Have the Data…What Private Tech Companies Don’t Tell you About Their Business
>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante. >> The reverse momentum in tech stocks caused by rising interest rates, less attractive discounted cash flow models, and more tepid forward guidance can be easily measured by public market valuations. And while there's lots of discussion about the impact on private companies and cash runway and 409A valuations, measuring the performance of non-public companies isn't as easy. IPOs have dried up, and public statements by private companies, of course, accentuate the good and kind of hide the bad. Real data, unless you're an insider, is hard to find. Hello and welcome to this week's "Wikibon Cube Insights" powered by ETR. In this "Breaking Analysis", we unlock some of the secrets that non-public, emerging tech companies may or may not be sharing. And we do this by introducing you to a capability from ETR that we've not exposed you to over the past couple of years. It's called the Emerging Technologies Survey, and it is packed with sentiment data and performance data based on surveys of more than a thousand CIOs and IT buyers covering more than 400 companies. And we've invited back our colleague, Erik Bradley of ETR, to help explain the survey and the data that we're going to cover today. Erik, this survey is something that I've not personally spent much time on, but I'm blown away at the data. It's really unique and detailed. First of all, welcome. Good to see you again. >> Great to see you too, Dave, and I'm really happy to be talking about the ETS, or the Emerging Technology Survey. Even our own clients or constituents probably don't spend as much time in here as they should. >> Yeah, because there's so much in the mainstream, but let's pull up a slide to bring out the survey composition. Tell us about the study. How often do you run it? What's the background and the methodology?
>> Yeah, you were just spot on the way you were talking about the private tech companies out there. So what we did is we decided to take all the vendors that we track that are not yet public and move 'em over to the ETS. And there isn't a lot of information out there. If you're not in Silicon (indistinct), you're not going to get this stuff. So PitchBook and TechCrunch are two out there that give some data on these guys. But what we really wanted to do was go out to our community. We have 6,000 ITDMs in our community. We wanted to ask them, "Are you aware of these companies? And if so, are you allocating any resources to them? Are you planning to evaluate them?" and really just kind of figure out what we can do. So this particular survey, as you can see, has 1,000-plus responses and over 450 vendors that we track. And essentially what we're trying to do here is talk about your evaluation and awareness of these companies and also your utilization. And if you're not utilizing 'em, then we can also figure out your sales conversion or churn. So this is interesting, not only for the ITDMs themselves to figure out what their peers are evaluating and what they should put in POCs against the big guys when contracts come up. But it's also really interesting for the tech vendors themselves to see how they're performing. >> And you can see two-thirds of the respondents are director level or above. You got 28% C-suite. There is of course a North America bias, 70, 75% is North America. But these smaller companies, you know, that's where they start doing business. So, okay. We're going to do a couple of things here today. First, we're going to give you the big picture across the sectors that ETR covers within the ETS survey. And then we're going to look at the high and low sentiment for the larger private companies. And then we're going to do the same for the smaller private companies, the ones that don't have as much mindshare.
And then I'm going to put those two groups together and we're going to look at two dimensions, actually three dimensions. First, which companies are being evaluated the most. Second, which companies are getting the most usage and adoption of their offerings. And then third, which companies are seeing the highest churn rates, which of course is a silent killer of companies. And then finally, we're going to look at the sentiment and mindshare for two key areas that we like to cover often here on "Breaking Analysis", security and data. And data comprises database, including data warehousing, and then big data analytics is the second part of data. And then machine learning and AI is the third section within data that we're going to look at. Now, one other thing before we get into it. ETR very often will include open source offerings in the mix, even though they're not companies, like TensorFlow or Kubernetes, for example. And we'll call that out during this discussion. The reason this is done is for context, because everyone is using open source. It is the heart of innovation and many business models are super glued to an open source offering. Take MariaDB, for example. There's the foundation with the open source code, and then there's, of course, the company that sells services around the offering. Okay, so let's first look at the highest and lowest sentiment among these private firms, the ones that have the highest mindshare. So they're naturally going to be somewhat larger. And we do this on two dimensions, sentiment on the vertical axis and mindshare on the horizontal axis. And note the open source tools: see Kubernetes, Postgres, Kafka, TensorFlow, Jenkins, Grafana, et cetera. So Erik, please explain what we're looking at here, how it's derived and what the data tells us. >> Certainly, so there is a lot here, so we're going to break it down first of all by explaining just what mindshare and net sentiment are. You explained the axes.
We have so many evaluation metrics, but we need to aggregate them into one so that we can rank vendors against each other. Net sentiment is really the aggregation of all the positives, subtracting out the negatives. So net sentiment is a very quick way of looking at where these companies stand versus their peers in their sectors and sub-sectors. Mindshare is basically the awareness of them, which is good for very early stage companies. And you'll see some names on here that have obviously been around for a very long time. And they're clearly going to be further out on the axis. Kubernetes, for instance, as you mentioned, is open source. It's the de facto standard for all container orchestration, and it should be that far up and to the right, because that's what everyone's using. In fact, the open source leaders are so prevalent in the emerging technology survey that we break them out later in our analysis, 'cause it's really not fair to include them and compare them to the actual companies that are providing the support and the security around that open source technology. But no survey, no analysis, no research would be complete without including this open source tech. So what we're looking at here, if I can just get away from the open source names, we see other things like Databricks and OneTrust. They're repeating as top net sentiment performers here. And then also the design vendors. People don't spend a lot of time on 'em, but Miro and Figma. This is their third survey in a row where they're just dominating that sentiment overall. And Adobe should probably take note of that because they're really coming after them. But Databricks, we all know, probably would've been a public company by now if the market hadn't turned, but you can see just how dominant they are in a survey of nothing but private companies. And we'll see that again when we talk about the database later.
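The two axes Erik describes can be sketched in a few lines. This is a minimal illustration, not ETR's actual methodology: the response labels, the grouping into positive and negative buckets, and the choice to divide by the number of aware respondents are all assumptions made for the example.

```python
# Hypothetical response labels; the real survey options are not listed in the transcript.
POSITIVE = {"plan_to_evaluate", "evaluating", "plan_to_utilize", "utilizing"}
NEGATIVE = {"evaluated_no_plan", "churned"}
UNAWARE = "unaware"


def mindshare(responses):
    """Share of all respondents who are aware of the vendor."""
    if not responses:
        return 0.0
    aware = sum(1 for r in responses if r != UNAWARE)
    return aware / len(responses)


def net_sentiment(responses):
    """Positive citations minus negative citations, as a share of aware respondents."""
    aware = [r for r in responses if r != UNAWARE]
    if not aware:
        return 0.0
    positives = sum(1 for r in aware if r in POSITIVE)
    negatives = sum(1 for r in aware if r in NEGATIVE)
    return (positives - negatives) / len(aware)


# Ten made-up responses about one hypothetical vendor.
sample = (["utilizing"] * 4 + ["churned"] * 1
          + ["aware_no_plan"] * 3 + ["unaware"] * 2)
print(mindshare(sample), net_sentiment(sample))
```

A vendor plotted up and to the right would score high on both numbers: widely known, and with far more positive than negative citations.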
>> And I'll just add, so you see Automation Anywhere on there, the big UiPath competitor, a company that was not able to get to the public markets. They've been trying. Snyk, Peter McKay's company, they've raised a bunch of money, big security player. They're doing some really interesting things in developer security, helping developers secure the data flow. H2O.ai, Dataiku, AI companies. We saw them at the Snowflake Summit. Redis Labs, Netskope in security. So a lot of names that we know that ultimately we think are probably going to be hitting the public market. Okay, here's the same view for private companies with less mindshare, Erik. Take us through this one. >> On the previous slide too, real quickly, I wanted to point out SecurityScorecard, and we'll get back into it. But this is a newcomer, and I couldn't believe how strong their data was, but we'll bring that up in a second. Now, when we go to the ones of lower mindshare, it's interesting to talk about open source, right? Kubernetes was all the way on the top right. Everyone uses containers. Here we see Istio up there. Not everyone is using service mesh as much. And that's why Istio is in the smaller breakout. But still, when you talk about net sentiment, it's the leader, it's the highest one there is. So really interesting to point out. Then we see other names like Collibra on the data side really performing well. And again, as always, security is very well represented here. We have Aqua, Wiz, Armis, which is a standout in this survey this time around. They do IoT security. I hadn't even heard of them until I started digging into the data here. And I couldn't believe how well they were doing. And then of course you have AnyScale, which is doing second best in this, and the best name in the survey, Hugging Face, which is a machine learning AI tool. Also doing really well on net sentiment, but they're not as far along on that axis of mindshare just yet.
So these are, again, emerging companies that might not be as well represented in the enterprise as they will be in a couple of years. >> Hugging Face sounds like something you do with your two year old. Like you said, you see high performers. AnyScale does machine learning, and you mentioned them. They came out of Berkeley. Collibra, governance. InfluxData is on there. InfluxDB's a time series database. And yeah, of course, Alex, if you bring that back up, you get a big group of red dots, right? That's the bad zone, I guess. Sisense does vis, Yellowbrick Data is an MPP database. How should we interpret the red dots, Erik? I mean, is it necessarily a bad thing? Could it be misinterpreted? What's your take on that? >> Sure, well, let me just explain the definition of it first from a data science perspective, right? We're a data company first. So the gray dots that you're seeing that aren't named, that's the mean, that's the average. So in order for you to be on this chart, you have to be at least one standard deviation above or below that average. So that gray is where we're saying, "Hey, this is where the lump of average comes in. This is where everyone normally stands." So you either have to be an outperformer or an underperformer to even show up in this analysis. So by definition, yes, the red dots are bad. You're at least one standard deviation below the average of your peers. It's not where you want to be. And if you're on the lower left, not only are you not performing well from a utilization or an actual usage rate, but people don't even know who you are. So that's a problem, obviously. And the VCs and the PEs out there that are backing these companies, they're the ones who mostly are interested in this data. >> Yeah. Oh, that's a great explanation. Thank you for that. Nice benchmarking there and yeah, you don't want to be in the red. All right, let's get into the next segment here. Here we're going to look at evaluation rates, adoption and the all-important churn.
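The one-standard-deviation rule Erik describes for the green, gray, and red dots can be sketched as follows. This is a hedged illustration only: the vendor names and scores are made up, and whether the underlying analysis uses a population or sample standard deviation is not stated, so the population form (`pstdev`) is an assumption here.

```python
from statistics import mean, pstdev


def classify(scores):
    """Label each vendor green, red, or gray relative to the peer average.

    green: at least one standard deviation above the mean
    red:   at least one standard deviation below the mean
    gray:  within one standard deviation (the unnamed dots)
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    labels = {}
    for vendor, score in scores.items():
        if sigma > 0 and score >= mu + sigma:
            labels[vendor] = "green"
        elif sigma > 0 and score <= mu - sigma:
            labels[vendor] = "red"
        else:
            labels[vendor] = "gray"
    return labels


# Hypothetical net sentiment scores (percent) for five made-up vendors.
example = {"VendorA": 60, "VendorB": 30, "VendorC": 32, "VendorD": 28, "VendorE": 0}
print(classify(example))
```

With these made-up numbers the mean is 30, so only the clear outperformer and the clear underperformer would be named on the chart, and the three vendors near the average stay gray.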
First, new evaluations. Let's bring up that slide. And Erik, take us through this. >> So essentially I just want to explain what evaluation means: people will cite that they either plan to evaluate the company or they're currently evaluating. So that means we're aware of 'em and we are choosing to do a POC of them. And then we'll see later how that turns into utilization, which is what a company wants to see: awareness, evaluation, and then actually utilizing them. That's sort of the life cycle for these emerging companies. So what we're seeing here, again, are very high evaluation rates. H2O, we mentioned. SecurityScorecard jumped up again. Chargebee, Snyk, Salt Security, Armis. A lot of security names are up here, Aqua, Netskope, which, God, has been around forever. I still can't believe it's in an Emerging Technology Survey. But so many of these names fall in data and security again, which is why we decided to pick those out, Dave. And on the lower side, Vena, Acton, those unfortunately took the dubious award of the lowest evaluations in our survey, but I prefer to focus on the positive. So SecurityScorecard, again, a real standout in this one. They're in the security assessment space, basically. They'll come in and assess for you how your security hygiene is. And it's an area of real interest right now amongst our ITDM community. >> Yeah, I mean, I think those, and then Arctic Wolf is up there too. They're doing managed services. You had mentioned Netskope. Yeah, okay. All right, let's look now at adoption. These are the companies whose offerings are being used the most and are above that standard deviation in the green. Take us through this, Erik. >> Sure, yet again, what we're looking at is, okay, we went from awareness, we went to evaluation. Now it's about utilization, which means a survey respondent's going to state "Yes, we evaluated and we plan to utilize it" or "It's already in our enterprise and we're actually allocating further resources to it."
Not surprising, again, a lot of open source, the reason why, it's free. So it's really easy to grow your utilization on something that's free. But as you and I both know, as Red Hat proved, there's a lot of money to be made once the open source is adopted, right? You need the governance, you need the security, you need the support wrapped around it. So here we're seeing Kubernetes, Postgres, Apache Kafka, Jenkins, Grafana. These are all open source based names. But if we're looking at names that are non open source, we're going to see Databricks, Automation Anywhere, Rubrik all have the highest mindshare. So these are the names, not surprisingly, all names that probably should have been public by now. Everyone's expecting an IPO imminently. These are the names that have the highest mindshare. If we talk about the highest utilization rates, again, Miro and Figma pop up, and I know they're not household names, but they are just dominant in this survey. These are applications that are meant for design software and, again, they're going after an Autodesk or a CAD or Adobe type of thing. It is just dominant how high the utilization rates are here, which again is something Adobe should be paying attention to. And then you'll see a little bit lower, but also interesting, we see Collibra again, we see Hugging Face again. And these are names that are obviously in the data governance, ML, AI side. So we're seeing a ton of data, a ton of security and Rubrik was interesting in this one, too, high utilization and high mindshare. We know how pervasive they are in the enterprise already. >> Erik, Alex, keep that up for a second, if you would. So yeah, you mentioned Rubrik. Cohesity's not on there. They're sort of the big one. We're going to talk about them in a moment. Puppet is interesting to me because you remember the early days of that sort of space, you had Puppet and Chef and then you had Ansible. Red Hat bought Ansible and then Ansible really took off. 
So it's interesting to see Puppet on there as well. Okay. So now let's look at the churn, because this one is where you don't want to be. It's, of course, all red 'cause churn is bad. Take us through this, Erik. >> Yeah, definitely don't want to be here and I don't love to dwell on the negative. So we won't spend as much time. But to your point, there's one thing I want to point out that I think is important. So you see Rubrik in the same spot, but Rubrik has so many citations in our survey that it actually would make sense that they're high on both utilization and churn, just because they're so well represented. They have such a high overall representation in our survey. And the reason I call that out is Cohesity. Cohesity has an extremely high churn rate here, about 17%, and unlike Rubrik, they were not on the utilization side. So Rubrik is seeing both, Cohesity is not. It's not being utilized, but it's seeing high churn. So that's the way you can look at this data and say, "Hm." Same thing with Puppet. You noticed that it was on the other slide. It's also on this one. So basically what it means is a lot of people are giving Puppet a shot, but it's starting to churn, which means it's not as sticky as we would like. One that was surprising on here for me was Tanium. It's kind of jumbled in there. It's hard to see in the middle, but Tanium, I was very surprised to see as high a churn, because what I do hear from our end user community is that people that use it, like it. It really kind of spreads into not only vulnerability management, but also that endpoint detection and response side. So I was surprised by that one, mostly to see Tanium in here. Mural, again, was another one of those application design software vendors that's seeing very high churn as well. >> So you're saying if you're in both... Alex, bring that back up if you would. So if you're in both, like MariaDB is, for example, I think, yeah, they're in both.
They're both green in the previous one and red here, that's not as bad. You mentioned Rubrik is going to be in both. Cohesity is a bit of a concern. Cohesity just brought on Sanjay Poonen. So this could be a go to market issue, right? I mean, 'cause Cohesity has got a great product and they got really happy customers. So they're just maybe having to figure out, okay, what's the right ideal customer profile and Sanjay Poonen, I guarantee, is going to have that company cranking. I mean they had been doing very well on the surveys and had fallen off of a bit. The other interesting things wondering the previous survey I saw Cvent, which is an event platform. My only reason I pay attention to that is 'cause we actually have an event platform. We don't sell it separately. We bundle it as part of our offerings. And you see Hopin on here. Hopin raised a billion dollars during the pandemic. And we were like, "Wow, that's going to blow up." And so you see Hopin on the churn and you didn't see 'em in the previous chart, but that's sort of interesting. Like you said, let's not kind of dwell on the negative, but you really don't. You know, churn is a real big concern. Okay, now we're going to drill down into two sectors, security and data. Where data comprises three areas, database and data warehousing, machine learning and AI and big data analytics. So first let's take a look at the security sector. Now this is interesting because not only is it a sector drill down, but also gives an indicator of how much money the firm has raised, which is the size of that bubble. And to tell us if a company is punching above its weight and efficiently using its venture capital. Erik, take us through this slide. Explain the dots, the size of the dots. Set this up please. >> Yeah. 
So again, the axis is still the same, net sentiment and mindshare, but what we've done this time is we've taken publicly available information on how much capital company is raised and that'll be the size of the circle you see around the name. And then whether it's green or red is basically saying relative to the amount of money they've raised, how are they doing in our data? So when you see a Netskope, which has been around forever, raised a lot of money, that's why you're going to see them more leading towards red, 'cause it's just been around forever and kind of would expect it. Versus a name like SecurityScorecard, which is only raised a little bit of money and it's actually performing just as well, if not better than a name, like a Netskope. OneTrust doing absolutely incredible right now. BeyondTrust. We've seen the issues with Okta, right. So those are two names that play in that space that obviously are probably getting some looks about what's going on right now. Wiz, we've all heard about right? So raised a ton of money. It's doing well on net sentiment, but the mindshare isn't as well as you'd want, which is why you're going to see a little bit of that red versus a name like Aqua, which is doing container and application security. And hasn't raised as much money, but is really neck and neck with a name like Wiz. So that is why on a relative basis, you'll see that more green. As we all know, information security is never going away. But as we'll get to later in the program, Dave, I'm not sure in this current market environment, if people are as willing to do POCs and switch away from their security provider, right. There's a little bit of tepidness out there, a little trepidation. So right now we're seeing overall a slight pause, a slight cooling in overall evaluations on the security side versus historical levels a year ago. >> Now let's stay on here for a second. So a couple things I want to point out. So it's interesting. 
Now Snyk has raised over, I think $800 million but you can see them, they're high on the vertical and the horizontal, but now compare that to Lacework. It's hard to see, but they're kind of buried in the middle there. That's the biggest dot in this whole thing. I think I'm interpreting this correctly. They've raised over a billion dollars. It's a Mike Speiser company. He was the founding investor in Snowflake. So people watch that very closely, but that's an example of where they're not punching above their weight. They recently had a layoff and they got to fine tune things, but I'm still confident they they're going to do well. 'Cause they're approaching security as a data problem, which is probably people having trouble getting their arms around that. And then again, I see Arctic Wolf. They're not red, they're not green, but they've raised fair amount of money, but it's showing up to the right and decent level there. And a couple of the other ones that you mentioned, Netskope. Yeah, they've raised a lot of money, but they're actually performing where you want. What you don't want is where Lacework is, right. They've got some work to do to really take advantage of the money that they raised last November and prior to that. >> Yeah, if you're seeing that more neutral color, like you're calling out with an Arctic Wolf, like that means relative to their peers, this is where they should be. It's when you're seeing that red on a Lacework where we all know, wow, you raised a ton of money and your mindshare isn't where it should be. Your net sentiment is not where it should be comparatively. And then you see these great standouts, like Salt Security and SecurityScorecard and Abnormal. You know they haven't raised that much money yet, but their net sentiment's higher and their mindshare's doing well. So those basically in a nutshell, if you're a PE or a VC and you see a small green circle, then you're doing well, then it means you made a good investment. 
>> Some of these guys, I don't know, but you see these small green circles. Those are the ones you want to start digging into and maybe help them catch a wave. Okay, let's get into the data discussion. And again, three areas, database slash data warehousing, big data analytics and ML AI. First, we're going to look at the database sector. So Alex, thank you for bringing that up. Alright, take us through this, Erik. Actually, let me just say Postgres SQL. I got to ask you about this. It shows some funding, but that actually could be a mix of EDB, the company that commercializes Postgres and Postgres the open source database, which is a transaction system and kind of an open source Oracle. You see MariaDB is a database, but open source database. But the companies they've raised over $200 million and they filed an S-4. So Erik looks like this might be a little bit of mashup of companies and open source products. Help us understand this. >> Yeah, it's tough when you start dealing with the open source side and I'll be honest with you, there is a little bit of a mashup here. There are certain names here that are a hundred percent for profit companies. And then there are others that are obviously open source based like Redis is open source, but Redis Labs is the one trying to monetize the support around it. So you're a hundred percent accurate on this slide. I think one of the things here that's important to note though, is just how important open source is to data. If you're going to be going to any of these areas, it's going to be open source based to begin with. And Neo4j is one I want to call out here. It's not one everyone's familiar with, but it's basically geographical charting database, which is a name that we're seeing on a net sentiment side actually really, really high. When you think about it's the third overall net sentiment for a niche database play. 
It's not as big on the mindshare 'cause it's use cases aren't as often, but third biggest play on net sentiment. I found really interesting on this slide. >> And again, so MariaDB, as I said, they filed an S-4 I think $50 million in revenue, that might even be ARR. So they're not huge, but they're getting there. And by the way, MariaDB, if you don't know, was the company that was formed the day that Oracle bought Sun in which they got MySQL and MariaDB has done a really good job of replacing a lot of MySQL instances. Oracle has responded with MySQL HeatWave, which was kind of the Oracle version of MySQL. So there's some interesting battles going on there. If you think about the LAMP stack, the M in the LAMP stack was MySQL. And so now it's all MariaDB replacing that MySQL for a large part. And then you see again, the red, you know, you got to have some concerns about there. Aerospike's been around for a long time. SingleStore changed their name a couple years ago, last year. Yellowbrick Data, Fire Bolt was kind of going after Snowflake for a while, but yeah, you want to get out of that red zone. So they got some work to do. >> And Dave, real quick for the people that aren't aware, I just want to let them know that we can cut this data with the public company data as well. So we can cross over this with that because some of these names are competing with the larger public company names as well. So we can go ahead and cross reference like a MariaDB with a Mongo, for instance, or of something of that nature. So it's not in this slide, but at another point we can certainly explain on a relative basis how these private names are doing compared to the other ones as well. >> All right, let's take a quick look at analytics. Alex, bring that up if you would. Go ahead, Erik. >> Yeah, I mean, essentially here, I can't see it on my screen, my apologies. I just kind of went to blank on that. So gimme one second to catch up. >> So I could set it up while you're doing that. 
You got Grafana up and to the right. I mean, this is huge right. >> Got it thank you. I lost my screen there for a second. Yep. Again, open source name Grafana, absolutely up and to the right. But as we know, Grafana Labs is actually picking up a lot of speed based on Grafana, of course. And I think we might actually hear some noise from them coming this year. The names that are actually a little bit more disappointing than I want to call out are names like ThoughtSpot. It's been around forever. Their mindshare of course is second best here but based on the amount of time they've been around and the amount of money they've raised, it's not actually outperforming the way it should be. We're seeing Moogsoft obviously make some waves. That's very high net sentiment for that company. It's, you know, what, third, fourth position overall in this entire area, Another name like Fivetran, Matillion is doing well. Fivetran, even though it's got a high net sentiment, again, it's raised so much money that we would've expected a little bit more at this point. I know you know this space extremely well, but basically what we're looking at here and to the bottom left, you're going to see some names with a lot of red, large circles that really just aren't performing that well. InfluxData, however, second highest net sentiment. And it's really pretty early on in this stage and the feedback we're getting on this name is the use cases are great, the efficacy's great. And I think it's one to watch out for. >> InfluxData, time series database. The other interesting things I just noticed here, you got Tamer on here, which is that little small green. Those are the ones we were saying before, look for those guys. They might be some of the interesting companies out there and then observe Jeremy Burton's company. They do observability on top of Snowflake, not green, but kind of in that gray. So that's kind of cool. Monte Carlo is another one, they're sort of slightly green. 
They are doing some really interesting things in data and data mesh. So yeah, okay. So I can spend all day on this stuff, Erik, phenomenal data. I got to get back and really dig in. Let's end with machine learning and AI. Now this chart, it's similar in its dimensions, of course, except for the money raised. We're not showing that size of the bubble, but AI is so hot, we wanted to cover that here. Erik, explain this please. Why is TensorFlow highlighted? And walk us through this chart. >> Yeah, it's funny yet again, right? Another open source name, TensorFlow being up there. And I just want to explain, we do break out machine learning, AI as its own sector. A lot of this of course really is intertwined with the data side, but it is its own area. And one of the things I think that's most important here to break out is Databricks. We started to cover Databricks in machine learning, AI. That company has grown into much, much more than that. So I do want to state to you Dave, and also the audience out there that moving forward, we're going to be moving Databricks out of only the ML/AI into other sectors. So we can kind of value them against their peers a little bit better. But in this instance, you could just see how dominant they are in this area. And one thing that's not here, but I do want to point out is that we have the ability to break this down by industry vertical, organization size. And when I break this down into Fortune 500 and Fortune 1000, both Databricks and TensorFlow are even better than you see here. So it's quite interesting to see that the names that are succeeding are also succeeding with the largest organizations in the world. And as we know, large organizations means large budgets. So this is one area that I just thought was really interesting to point out that as we break it down, the data by vertical, these two names still are the outstanding players. >> I just also want to call out H2O.ai.
They're getting a lot of buzz in the marketplace and I'm seeing them a lot more. Anaconda, another one. Dataiku consistently popping up. DataRobot is also interesting because of all the kerfuffle that's going on there. The Cube guy, Cube alum, Chris Lynch stepped down as executive chairman. All this stuff came out about how the executives were taking money off the table and didn't allow the employees to participate in that money raising deal. So that's pissed a lot of people off. And so they're now going through some kind of uncomfortable things, which is unfortunate because DataRobot, I noticed, we haven't covered them that much in "Breaking Analysis", but I've noticed them oftentimes, Erik, in the surveys doing really well. So you would think that company has a lot of potential. But yeah, it's an important space that we're going to continue to watch. Let me ask you Erik, can you contextualize this from a time series standpoint? I mean, how has this changed over time? >> Yeah, again, not shown here, but in the data. I'm sorry, go ahead. >> No, I'm sorry. What I meant, I should have interjected. In other words, you would think in a downturn that these emerging companies would be less interesting to buyers 'cause they're more risky. What have you seen? >> Yeah, and it was interesting before we went live, you and I were having this conversation about "Is the downturn stopping people from evaluating these private companies or not," right. In a larger sense, that's really what we're doing here. How are these private companies doing when it comes down to the actual practitioners? The people with the budget, the people with the decision making. And so what I did is, we have historical data as you know, I went back to the Emerging Technology Survey we did in November of '21, right at the crest right before the market started to really fall and everything kind of started to fall apart there.
And what I noticed is on the security side, very much so, we're seeing less evaluations than we were in November 21. So I broke it down. On cloud security, net sentiment went from 21% to 16% from November '21. That's a pretty big drop. And again, that sentiment is our one aggregate metric for overall positivity, meaning utilization and actual evaluation of the name. Again in database, we saw it drop a little bit from 19% to 13%. However, in analytics we actually saw it stay steady. So it's pretty interesting that yes, cloud security and security in general is always going to be important. But right now we're seeing less overall net sentiment in that space. But within analytics, we're seeing steady with growing mindshare. And also to your point earlier in machine learning, AI, we're seeing steady net sentiment and mindshare has grown a whopping 25% to 30%. So despite the downturn, we're seeing more awareness of these companies in analytics and machine learning and a steady, actual utilization of them. I can't say the same in security and database. They're actually shrinking a little bit since the end of last year. >> You know it's interesting, we were on a round table, Erik does these round tables with CISOs and CIOs, and I remember one time you had asked the question, "How do you think about some of these emerging tech companies?" And one of the executives said, "I always include somebody in the bottom left of the Gartner Magic Quadrant in my RFPs. I think he said, "That's how I found," I don't know, it was Zscaler or something like that years before anybody ever knew of them "Because they're going to help me get to the next level." So it's interesting to see Erik in these sectors, how they're holding up in many cases. >> Yeah. It's a very important part for the actual IT practitioners themselves. There's always contracts coming up and you always have to worry about your next round of negotiations. And that's one of the roles these guys play. 
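The sector shifts quoted in this part of the discussion can be restated as percentage-point and relative changes, which helps with phrases like "25% to 30%". This is a minimal sketch using only the three pairs of figures mentioned here; the function name `change` and the dictionary labels are illustrative and are not ETR terminology.

```python
def change(before, after):
    """Return (percentage-point change, relative change in percent)."""
    points = after - before
    relative = points / before * 100
    return points, round(relative, 1)


# Figures as quoted in the discussion (November '21 survey vs. this survey).
shifts = {
    "cloud security net sentiment": (21, 16),
    "database net sentiment": (19, 13),
    "ML/AI mindshare": (25, 30),
}

for name, (before, after) in shifts.items():
    points, relative = change(before, after)
    print(f"{name}: {points:+d} points, {relative:+.1f}% relative")
# cloud security net sentiment: -5 points, -23.8% relative
# database net sentiment: -6 points, -31.6% relative
# ML/AI mindshare: +5 points, +20.0% relative
```

Read this way, the ML/AI mindshare move is five percentage points, or a 20% relative increase, while the database drop is the largest of the three in relative terms.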
You have to do a POC when contracts come up, but it's also their job to stay on top of the new technology. You can't fall behind. Like everyone's a software company. Now everyone's a tech company, no matter what you're doing. So these guys have to stay in on top of it. And that's what this ETS can do. You can go in here and look and say, "All right, I'm going to evaluate their technology," and it could be twofold. It might be that you're ready to upgrade your technology and they're actually pushing the envelope or it simply might be I'm using them as a negotiation ploy. So when I go back to the big guy who I have full intentions of writing that contract to, at least I have some negotiation leverage. >> Erik, we got to leave it there. I could spend all day. I'm going to definitely dig into this on my own time. Thank you for introducing this, really appreciate your time today. >> I always enjoy it, Dave and I hope everyone out there has a great holiday weekend. Enjoy the rest of the summer. And, you know, I love to talk data. So anytime you want, just point the camera on me and I'll start talking data. >> You got it. I also want to thank the team at ETR, not only Erik, but Darren Bramen who's a data scientist, really helped prepare this data, the entire team over at ETR. I cannot tell you how much additional data there is. We are just scratching the surface in this "Breaking Analysis". So great job guys. I want to thank Alex Myerson. Who's on production and he manages the podcast. Ken Shifman as well, who's just coming back from VMware Explore. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor in chief over at SiliconANGLE. Does some great editing for us. Thank you. All of you guys. Remember these episodes, they're all available as podcast, wherever you listen. All you got to do is just search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com. 
Or you can email me to get in touch david.vellante@siliconangle.com. You can DM me at dvellante or comment on my LinkedIn posts and please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for Erik Bradley and The Cube Insights powered by ETR. Thanks for watching. Be well. And we'll see you next time on "Breaking Analysis". (upbeat music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Erik | PERSON | 0.99+ |
Alex Myerson | PERSON | 0.99+ |
Ken Shifman | PERSON | 0.99+ |
Sanjay Poonen | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Erik Bradley | PERSON | 0.99+ |
November 21 | DATE | 0.99+ |
Darren Bramen | PERSON | 0.99+ |
Alex | PERSON | 0.99+ |
Cheryl Knight | PERSON | 0.99+ |
Postgres | ORGANIZATION | 0.99+ |
Databricks | ORGANIZATION | 0.99+ |
Netskope | ORGANIZATION | 0.99+ |
Adobe | ORGANIZATION | 0.99+ |
Rob Hof | PERSON | 0.99+ |
Fivetran | ORGANIZATION | 0.99+ |
$50 million | QUANTITY | 0.99+ |
21% | QUANTITY | 0.99+ |
Chris Lynch | PERSON | 0.99+ |
19% | QUANTITY | 0.99+ |
Jeremy Burton | PERSON | 0.99+ |
$800 million | QUANTITY | 0.99+ |
6,000 | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Redis Labs | ORGANIZATION | 0.99+ |
November '21 | DATE | 0.99+ |
ETR | ORGANIZATION | 0.99+ |
First | QUANTITY | 0.99+ |
25% | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
OneTrust | ORGANIZATION | 0.99+ |
two dimensions | QUANTITY | 0.99+ |
two groups | QUANTITY | 0.99+ |
November of 21 | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
more than 400 companies | QUANTITY | 0.99+ |
Kristen Martin | PERSON | 0.99+ |
MySQL | TITLE | 0.99+ |
Moogsoft | ORGANIZATION | 0.99+ |
The Cube | ORGANIZATION | 0.99+ |
third | QUANTITY | 0.99+ |
Grafana | ORGANIZATION | 0.99+ |
H2O | ORGANIZATION | 0.99+ |
Mike Speiser | PERSON | 0.99+ |
david.vellante@siliconangle.com | OTHER | 0.99+ |
second | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
28% | QUANTITY | 0.99+ |
16% | QUANTITY | 0.99+ |
Second | QUANTITY | 0.99+ |
Breaking Analysis: How JPMC is Implementing a Data Mesh Architecture on the AWS Cloud
>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> A new era of data is upon us, and we're in a state of transition. You know, even our language reflects that. We rarely use the phrase big data anymore, rather we talk about digital transformation or digital business, or data-driven companies. Many have come to the realization that data is not the new oil, because unlike oil, the same data can be used over and over for different purposes. We still use terms like data as an asset. However, that same narrative, when it's put forth by the vendor and practitioner communities, includes further discussions about democratizing and sharing data. Let me ask you this, when was the last time you wanted to share your financial assets with your coworkers or your partners or your customers? Hello everyone, and welcome to this week's Wikibon Cube Insights powered by ETR. In this Breaking Analysis, we want to share our assessment of the state of the data business. We'll do so by looking at the data mesh concept and how a leading financial institution, JP Morgan Chase, is practically applying these relatively new ideas to transform its data architecture. Let's start by looking at what the data mesh is. As we've previously reported many times, data mesh is a concept and set of principles that was introduced in 2018 by Zhamak Dehghani, who's director of technology at ThoughtWorks, a global consultancy and software development company. And she created this movement because her clients, who were some of the leading firms in the world, had invested heavily in predominantly monolithic data architectures that had failed to deliver desired outcomes in ROI. So her work went deep into trying to understand that problem.
And her main conclusion that came out of this effort was the world of data is distributed and shoving all the data into a single monolithic architecture is an approach that fundamentally limits agility and scale. Now a profound concept of data mesh is the idea that data architectures should be organized around business lines with domain context. That the highly technical and hyper specialized roles of a centralized cross functional team are a key blocker to achieving our data aspirations. This is the first of four high level principles of data mesh. So first again, that the business domain should own the data end-to-end, rather than have it go through a centralized big data technical team. Second, a self-service platform is fundamental to a successful architectural approach where data is discoverable and shareable across an organization and an ecosystem. Third, product thinking is central to the idea of data mesh. In other words, data products will power the next era of data success. And fourth, data products must be built with governance and compliance that is automated and federated. Now there's a lot more to this concept and there are tons of resources on the web to learn more, including an entire community that has formed around data mesh. But this should give you a basic idea. Now, the other point is that, in observing Zhamak Dehghani's work, she has deliberately avoided discussions around specific tooling, which I think has frustrated some folks because we all like to have references that tie to products and tools and companies. So this has been a two-edged sword in that, on the one hand it's good, because data mesh is designed to be tool agnostic and technology agnostic. On the other hand, it's led some folks to take liberties with the term data mesh and claim mission accomplished when their solution, you know, may be more marketing than reality. So let's look at JP Morgan Chase in their data mesh journey.
This is why I got really excited when I saw that, this past week, a team from JPMC held a meetup to discuss what they called data lake strategy via data mesh architecture. I saw that title, I thought, well, that's a weird title. And I wondered, are they just taking their legacy data lakes and claiming they're now transformed into a data mesh? But in listening to the presentation, which was over an hour long, the answer is a definitive no, not at all in my opinion. A gentleman named Scott Hollerman organized the session that comprised these three speakers here, James Reid, who's a divisional CIO at JPMC, Arup Nanda who is a technologist and architect and Serita Bakst who is an information architect, again, all from JPMC. This was the most detailed and practical discussion that I've seen to date about implementing a data mesh. And this is JP Morgan's approach, and we know they're extremely savvy and technically sound. And they've invested, it has to be billions in the past decade on data architecture across their massive company. And rather than dwell on the downsides of their big data past, I was really pleased to see how they're evolving their approach and embracing new thinking around data mesh. So today, we're going to share some of the slides that they use and comment on how it dovetails into the concept of data mesh that Zhamak Dehghani has been promoting, at least as we understand it. And dig a bit into some of the tooling that is being used by JP Morgan, particularly around its AWS cloud. So the first point is it's all about business value, JPMC, they're in the money business, and in that world, business value is everything. So James Reid, the CIO, showed this slide and talked about their overall goals, which centered on a cloud first strategy to modernize the JPMC platform. I think it's simple and sensible, but there's three factors on which he focused. Cut costs, always, sure, you got to do that.
Number two was about unlocking new opportunities, or accelerating time to value. But I was really happy to see number three, data reuse. That's a fundamental value ingredient in the slide that he's presenting here. And his commentary was all about aligning with the domains and maximizing data reuse, i.e. data is not like oil, and making sure there's appropriate governance around that. Now don't get caught up in the term data lake, I think it's just how JP Morgan communicates internally. It's invested in the data lake concept, so they use water analogies. They use things like data puddles, for example, which are single project data marts, or data ponds, which comprise multiple data puddles. And these can feed into data lakes. And as we'll see, JPMC doesn't strive to have a single version of the truth from a data standpoint that resides in a monolithic data lake, rather it enables the business lines to create and own their own data lakes that comprise fit for purpose data products. And they do have a single truth of metadata. Okay, we'll get to that. But generally speaking, each of the domains will own end-to-end their own data and be responsible for those data products, we'll talk about that more. Now the genesis of this was sort of a cloud first platform, JPMC is leaning into public cloud, which is ironic since in the early days of cloud, all the financial institutions were like never. Anyway, JPMC is going hard after it, they're adopting agile methods and microservices architectures, and it sees cloud as a fundamental enabler, but it recognizes that on-prem data must be part of the data mesh equation. Here's a slide that starts to get into some of that generic tooling, and then we'll go deeper. And I want to make a couple of points here that tie back to Zhamak Dehghani's original concept. The first is that unlike many data architectures, this puts data as products right in the fat middle of the chart.
The data products live in the business domains and are at the heart of the architecture. The databases, the Hadoop clusters, the files and APIs on the left-hand side, they serve the data product builders. The specialized roles on the right hand side, the DBAs, the data engineers, the data scientists, the data analysts, we could have put in quality engineers, et cetera, they serve the data products. Because the data products are owned by the business, they inherently have the context that is the middle of this diagram. And you can see at the bottom of the slide, the key principles include domain thinking and end-to-end ownership of the data products. They build it, they own it, they run it, they manage it. At the same time, the goal is to democratize data with a self-service as a platform. One of the biggest points of contention of data mesh is governance. And as Serita Bakst said on the Meetup, metadata is your friend, and she kind of made a joke, she said, "This sounds kind of geeky, but it's important to have a metadata catalog to understand where data resides and the data lineage in overall change management." So to me, this really passed the data mesh stink test pretty well. Let's look at data as products. CIO Reid said the most difficult thing for JPMC was getting their heads around data product, and they spent a lot of time getting this concept to work. Here's the slide they use to describe their data products as it related to their specific industry. They said a common language and taxonomy is very important, and you can imagine how difficult that was. He said, for example, it took a lot of discussion and debate to define what a transaction was. But you can see at a high level, these three product groups around wholesale, credit risk, party, and trade and position data as products, and each of these can have sub products, like party will have know your customer, KYC, for example.
So a key for JPMC was to start at a high level and iterate to get more granular over time. So lots of decisions had to be made around who owns the products and the sub-products. The product owners interestingly had to defend why that product should even exist, what boundaries should be in place and what data sets do and don't belong in the various products. And this was a collaborative discussion, I'm sure there was contention around that between the lines of business. And which sub-products should be part of these circles? They didn't say this, but tying it back to data mesh, each of these products, whether in a data lake or a data hub or a data pond or data warehouse, data puddle, each of these is a node in the global data mesh that is discoverable and governed. And supporting this notion, Serita said that, "This should not be infrastructure-bound, logically, any of these data products, whether on-prem or in the cloud can connect via the data mesh." So again, I felt like this really stayed true to the data mesh concept. Well, let's look at some of the key technical considerations that JPM discussed in quite some detail. This chart here shows a diagram of how JP Morgan thinks about the problem, and some of the challenges they had to consider were how to write to various data stores, and how can you move data from one data store to another? How can data be transformed? Where's the data located? Can the data be trusted? How can it be easily accessed? Who has the right to access that data? These are all problems that technology can help solve. And to address these issues, Arup Nanda explained that the heart of this slide is the data ingestor instead of ETL. All data producers and contributors, they send their data to the ingestor and the ingestor then registers the data so it's in the data catalog. It does a data quality check and it tracks the lineage.
Then, data is sent to the router, which persists the data in the data store based on the best destination as informed by the registration. This is designed to be a flexible system. In other words, the data store for a data product is not fixed, it's determined at the point of inventory, and that allows changes to be easily made in one place. The router simply reads that optimal location and sends it to the appropriate data store. Now, you see the schema inferrer there is used when there is no clear schema on write. In this case, the data product is not allowed to be consumed until the schema is inferred. The data goes into a raw area, and the inferrer determines the schema and then updates the inventory system so that the data can be routed to the proper location and properly tracked. So that's some of the detail of how the sausage factory works in this particular use case, it was very interesting and informative. Now let's take a look at the specific implementation on AWS and dig into some of the tooling. As described in some detail by Arup Nanda, this diagram shows the reference architecture used by this group within JP Morgan, and it shows all the various AWS services and components that support their data mesh approach. So start with the authorization block right there underneath Kinesis. Lake Formation is the single point of entitlement and has a number of buckets including, you can see there, the raw area that we just talked about, a trusted bucket, a refined bucket, et cetera. Depending on the data characteristics at the data catalog registration block, where you see the Glue catalog, that determines in which bucket the router puts the data.
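The ingestor, router and schema inferrer flow described above can be sketched in a few lines of Python. This is a toy in-memory illustration under stated assumptions, not JPMC's implementation: the class and function names (`Ingestor`, `Entry`, `infer_schema`), the quality rule and the two stores are all hypothetical, and only the "raw" and "trusted" area names come from the talk.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """Catalog entry for one data product (hypothetical structure)."""
    schema: Optional[dict]       # None means no clear schema on write
    destination: str             # which data store the router should use
    lineage: list = field(default_factory=list)


def infer_schema(record):
    """Toy schema inferrer: map each field name to its Python type name."""
    return {key: type(value).__name__ for key, value in record.items()}


class Ingestor:
    def __init__(self):
        self.catalog = {}                          # product name -> Entry
        self.stores = {"raw": [], "trusted": []}   # stand-ins for data stores

    def ingest(self, product, record, source, schema=None):
        # 1. Data quality check (assumed rule: reject empty records).
        if not record:
            raise ValueError("empty record failed the quality check")
        # 2. Register the product in the catalog and track lineage.
        entry = self.catalog.setdefault(
            product,
            Entry(schema=schema, destination="trusted" if schema else "raw"),
        )
        entry.lineage.append(source)
        # 3. Hand off to the router.
        self.route(product, record)
        return entry

    def route(self, product, record):
        # The router only reads the destination recorded at registration.
        destination = self.catalog[product].destination
        self.stores[destination].append((product, record))

    def run_inferrer(self, product):
        """Infer a schema for parked raw data, update the catalog, re-route."""
        entry = self.catalog[product]
        parked = [(p, r) for p, r in self.stores["raw"] if p == product]
        if entry.schema is None and parked:
            entry.schema = infer_schema(parked[0][1])
            entry.destination = "trusted"
            self.stores["raw"] = [
                (p, r) for p, r in self.stores["raw"] if p != product
            ]
            for _, record in parked:
                self.route(product, record)

    def can_consume(self, product):
        # A product is not consumable until its schema is known.
        return self.catalog[product].schema is not None
```

Because the destination lives only in the catalog entry, changing where a product is stored means changing one field, which is the "changes in one place" property the speaker highlights.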
And you can see the many AWS services in use here, identity, the EMR, the Elastic MapReduce cluster from the legacy Hadoop work done over the years, the Redshift Spectrum and Athena. JPMC uses Athena for single threaded workloads and Redshift Spectrum for nested types so they can be queried independent of each other. Now remember very importantly, in this use case, there is not a single Lake Formation, rather multiple lines of business will be authorized to create their own lakes, and that creates a challenge. So how can that be done in a flexible and automated manner? And that's where the data mesh comes into play. So JPMC came up with this federated Lake Formation accounts idea, and each line of business can create as many data producer or consumer accounts as they desire and roll them up into their master line of business Lake Formation account. And they cross-connect these data products in a federated model. And these all roll up into a master Glue catalog so that any authorized user can find out where a specific data element is located. So this is like a superset catalog that comprises multiple sources and syncs up across the data mesh. So again to me, this was a very well thought out and practical application of data mesh. Yes, it includes some notion of centralized management, but much of that responsibility has been passed down to the lines of business. It does roll up to a master catalog, but that's a metadata management effort that seems compulsory to ensure federated and automated governance. As well, at JPMC, the office of the chief data officer is responsible for ensuring governance and compliance throughout the federation. All right, so let's take a look at some of the suspects in this world of data mesh and bring in the ETR data. Now, of course, ETR doesn't have a data mesh category, there's no such thing as a data mesh vendor, you build a data mesh, you don't buy it.
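The federated model described above, where each line of business owns its lake and a master catalog holds only metadata, can be illustrated with a small in-memory sketch. It makes no AWS calls; `LakeCatalog`, `MasterCatalog`, the `grants` set and the example location string are hypothetical names chosen for illustration, not the Lake Formation or Glue API.

```python
class LakeCatalog:
    """Catalog owned by one line of business (hypothetical structure)."""

    def __init__(self, line_of_business):
        self.line_of_business = line_of_business
        self.elements = {}    # data element name -> storage location
        self.grants = set()   # users authorized on this lake

    def register(self, element, location):
        self.elements[element] = location


class MasterCatalog:
    """Superset catalog that syncs metadata, not data, from each lake."""

    def __init__(self):
        self.lakes = {}   # line of business -> LakeCatalog
        self.index = {}   # data element -> (line of business, location)

    def sync(self, lake):
        self.lakes[lake.line_of_business] = lake
        for element, location in lake.elements.items():
            self.index[element] = (lake.line_of_business, location)

    def locate(self, user, element):
        # Entitlement stays with the owning line of business.
        owner, location = self.index[element]
        if user not in self.lakes[owner].grants:
            raise PermissionError(f"{user} is not authorized on {owner}")
        return owner, location


cards = LakeCatalog("cards")
cards.register("card_txn", "example-cards-lake/refined/card_txn")
cards.grants.add("analyst1")

master = MasterCatalog()
master.sync(cards)
print(master.locate("analyst1", "card_txn"))
```

The design point is that the master catalog answers "where is this element and who owns it" while the authorization decision is still made against the owning lake's grants.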
So, what we did is we used the ETR dataset to select and filter on some of the culprits that we thought might contribute to the data mesh to see how they're performing. This chart depicts a popular view that we often like to share. It's a two dimensional graphic with net score or spending momentum on the vertical axis and market share or pervasiveness in the data set on the horizontal axis. And we filtered the data on sectors such as analytics, data warehouse, and the adjacencies to things that might fit into data mesh. And we think that these pretty well reflect participation in data mesh, though it's certainly not all-encompassing. And it's a subset obviously, of all the vendors who could play in the space. Let's make a few observations. Now as is often the case, Azure and AWS, they're almost literally off the charts with very high spending velocity and large presence in the market. Oracle you can see also stands out because much of the world's data lives inside of Oracle databases. It doesn't have the spending momentum or growth, but the company remains prominent. And you can see Google Cloud doesn't have nearly the presence in the dataset, but its momentum is highly elevated. Remember that red dotted line there, that 40% line, anything over that indicates elevated spending momentum. Let's go to Snowflake. Snowflake is consistently shown to be the gold standard in net score in the ETR dataset. It continues to maintain highly elevated spending velocity in the data. And in many ways, Snowflake with its data marketplace and its data cloud vision and data sharing approach, fits nicely into the data mesh concept. Now, a caution, Snowflake has used the term data mesh in its marketing, but in our view, it lacks clarity, and we feel like they're still trying to figure out how to communicate what that really is. But there is really, we think, a lot of potential there to that vision.
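The way the chart is read, with anything above the 40% line counting as elevated spending momentum, can be written as a simple filter. This is only a sketch: the vendor names and numbers below are placeholders, not real ETR survey data, and the field names are assumptions.

```python
ELEVATED_NET_SCORE = 40.0  # the red dotted line on the chart, in percent


def elevated(vendors, threshold=ELEVATED_NET_SCORE):
    """Return names of vendors above the threshold, highest net score first."""
    ranked = sorted(vendors, key=lambda v: v["net_score"], reverse=True)
    return [v["name"] for v in ranked if v["net_score"] > threshold]


# Placeholder values for illustration only, NOT real ETR data.
sample = [
    {"name": "VendorA", "net_score": 62.0, "market_share": 0.30},
    {"name": "VendorB", "net_score": 38.5, "market_share": 0.45},
    {"name": "VendorC", "net_score": 40.0, "market_share": 0.05},
]

print(elevated(sample))  # ['VendorA']
```

Note that a vendor sitting exactly on the line is not counted, since the speaker says "anything over that" indicates elevated momentum.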
Databricks is also interesting because the firm has momentum and we expect further elevated levels in the vertical axis in upcoming surveys, especially as it readies for its IPO. The firm has a strong product and managed service, and is really one to watch. Now we included a number of other database companies for obvious reasons like Redis and Mongo, MariaDB, Couchbase and Teradata. SAP as well is in there, but that's not all database, but SAP is prominent so we included them. As is IBM, more of a traditional database player, also with a big presence. Cloudera includes Hortonworks and HPE Ezmeral comprises the MapR business that HPE acquired. So these guys got the big data movement started, between Cloudera, Hortonworks, which was born out of Yahoo, which was the early big data, sorry, early Hadoop innovator, and MapR, which kind of went its own course, and now that's all kind of come together in various forms. And of course, Talend and Informatica are there, they are two data integration companies that are worth noting. We also included some of the AI and ML specialists and data science players in the mix like DataRobot, who just did a monster $250 million round, Dataiku, H2O.ai and ThoughtSpot, which is all about democratizing data and injecting AI, and I think fits well into the data mesh concept. And you know we put VMware Cloud in there for reference because it really is the predominant on-prem infrastructure platform. All right, let's wrap with some final thoughts here, first, thanks a lot to the JP Morgan team for sharing this data. I really want to encourage practitioners and technologists, go watch the YouTube video of that meetup, we'll include the link in this session. And thank you to Zhamak Deghani and the entire data mesh community for the outstanding work that you're doing, challenging the established conventions of monolithic data architectures.
The JPM presentation, it gives you real credibility, it takes Data Mesh well beyond concept, it demonstrates how it can be and is being done. And you know, this is not a perfect world, you're going to start somewhere and there's going to be some failures, the key is to recognize that shoving everything into a monolithic data architecture won't support the massive scale and agility that you're after. It's maybe fine for smaller use cases in smaller firms, but if you're building a global platform in a data business, it's time to rethink data architecture. Now much of this is enabled by the cloud, but cloud first doesn't mean cloud only, doesn't mean you'll leave your on-prem data behind, on the contrary, you have to include non-public cloud data in your Data Mesh vision just as JPMC has done. You've got to get some quick wins, that's crucial so you can gain credibility within the organization and grow. And one of the key takeaways from the JP Morgan team is, there is a place for dogma, like organizing around data products and domains and getting that right. On the other hand, you have to remain flexible because technology is going to come, technology is going to go, so you've got to be flexible in that regard. And look, if you're going to embrace the metaphor of water like puddles and ponds and lakes, we suggest maybe a little tongue in cheek, but still we believe in this, that you expand your scope to include data ocean, something John Furrier and I have talked about and laughed about extensively in theCUBE. Data oceans, it's huge. It's the new data lake, go transcend data lake, think oceans. And think about this, just as we're evolving our language, we should be evolving our metrics. Much of the last decade of big data was around just getting the stuff to work, getting it up and running, standing up infrastructure and managing massive, how much data you got? Massive amounts of data.
And there were many KPIs built around, again, standing up that infrastructure, ingesting data, a lot of technical KPIs. This decade is not just about enabling better insights, it's more than that. Data mesh points us to a new era of data value, and that requires new metrics around monetizing data products, like how long does it take to go from data product conception to monetization? And how does that compare to what it is today? And what is the time to quality? If the business owns the data, and the business has the context, the quality that comes out of them, out of the chute, should be at a basic level pretty good, and at a higher mark than out of a big data team with no business context. Automation, AI, and very importantly, organizational restructuring of our data teams will heavily contribute to success in the coming years. So we encourage you, learn, lean in and create your data future. Okay, that's it for now, remember these episodes, they're all available as podcasts wherever you listen, all you've got to do is search Breaking Analysis podcast, and please subscribe. Check out ETR's website at etr.plus for all the data and all the survey information. We publish a full report every week on wikibon.com and siliconangle.com. And you can get in touch with us, email me david.vellante@siliconangle.com, you can DM me @dvellante, or you can comment on my LinkedIn posts. This is Dave Vellante for theCUBE insights powered by ETR. Have a great week everybody, stay safe, be well, and we'll see you next time. (upbeat music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
JPMC | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
2018 | DATE | 0.99+ |
Zhamak Deghani | PERSON | 0.99+ |
James Reid | PERSON | 0.99+ |
JP Morgan | ORGANIZATION | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
Serita Bakst | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Scott Hollerman | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
40% | QUANTITY | 0.99+ |
JP Morgan Chase | ORGANIZATION | 0.99+ |
Serita | PERSON | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
Arup Nanda | PERSON | 0.99+ |
each | QUANTITY | 0.99+ |
ThoughtWorks | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
david.vellante@siliconangle.com | OTHER | 0.99+ |
each line | QUANTITY | 0.99+ |
Teradata | ORGANIZATION | 0.99+ |
Redis | ORGANIZATION | 0.99+ |
$250 million | QUANTITY | 0.99+ |
first point | QUANTITY | 0.99+ |
three factors | QUANTITY | 0.99+ |
Second | QUANTITY | 0.99+ |
MapR | ORGANIZATION | 0.99+ |
today | DATE | 0.99+ |
Informatica | ORGANIZATION | 0.99+ |
Talend | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
first platform | QUANTITY | 0.98+ |
YouTube | ORGANIZATION | 0.98+ |
fourth | QUANTITY | 0.98+ |
single | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
Third | QUANTITY | 0.97+ |
Couchbase | ORGANIZATION | 0.97+ |
three speakers | QUANTITY | 0.97+ |
two data | QUANTITY | 0.97+ |
first strategy | QUANTITY | 0.96+ |
one | QUANTITY | 0.96+ |
one place | QUANTITY | 0.96+ |
Jr Reid | PERSON | 0.96+ |
single lake | QUANTITY | 0.95+ |
SAP | ORGANIZATION | 0.95+ |
wikibon.com | OTHER | 0.95+ |
siliconangle.com | OTHER | 0.94+ |
Azure | ORGANIZATION | 0.93+ |
Breaking Analysis: Moore's Law is Accelerating and AI is Ready to Explode
>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is breaking analysis with Dave Vellante. >> Moore's Law is dead, right? Think again. Massive improvements in processing power combined with data and AI will completely change the way we think about designing hardware, writing software and applying technology to businesses. Every industry will be disrupted. You hear that all the time. Well, it's absolutely true and we're going to explain why and what it all means. Hello everyone, and welcome to this week's Wikibon Cube Insights powered by ETR. In this breaking analysis, we're going to unveil some new data that suggests we're entering a new era of innovation that will be powered by cheap processing capabilities that AI will exploit. We'll also tell you where the new bottlenecks will emerge and what this means for system architectures and industry transformations in the coming decade. Moore's Law is dead, you say? We must have heard that hundreds, if not, thousands of times in the past decade. EE Times has written about it, MIT Technology Review, CNET, and even industry associations that have lived by Moore's Law. But our friend Patrick Moorhead got it right when he said, "Moore's Law, by the strictest definition of doubling chip densities every two years, isn't happening anymore." And you know what, that's true. He's absolutely correct. And he couched that statement by saying by the strict definition. And he did that for a reason, because he's smart enough to know that the chip industry are masters at doing work arounds. Here's proof that the death of Moore's Law by its strictest definition is largely irrelevant. My colleague, David Foyer and I were hard at work this week and here's the result. The fact is that the historical outcome of Moore's Law is actually accelerating and in quite dramatically. 
This graphic digs into the progression of Apple's SoC, system on chip developments from the A9 and culminating with the A14, the 5 nanometer Bionic system on a chip. The vertical axis shows operations per second and the horizontal axis shows time for three processor types. The CPU which we measure here in terahertz, that's the blue line which you can hardly even see, the GPU which is the orange that's measured in trillions of floating point operations per second and then the NPU, the neural processing unit and that's measured in trillions of operations per second which is that exploding gray area. Now, historically, we always rushed out to buy the latest and greatest PC, because the newer models had faster cycles or more gigahertz. Moore's Law would double that performance every 24 months. Now that equates to about 40% annually. CPU performance has now moderated. That growth is now down to roughly 30% annual improvements. So technically speaking, Moore's Law as we know it is dead. But combined, if you look at the improvements in Apple's SoC since 2015, they've been on a pace that's higher than 118% annually. And it's even higher than that, because in the actual figure for these three processor types we're not even counting the impact of DSPs and accelerator components of Apple's system on a chip. It would push this even higher. Apple's A14 which is shown in the right hand side here is quite amazing. It's got a 64 bit architecture, it's got many, many cores. It's got a number of alternative processor types. But the important thing is what you can do with all this processing power. In an iPhone, the types of AI that we show here that continue to evolve, facial recognition, speech, natural language processing, rendering videos, helping the hearing impaired and eventually bringing augmented reality to the palm of your hand. It's quite incredible. So what does this mean for other parts of the IT stack?
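The growth figures quoted above can be checked with compound-growth arithmetic. The sketch below assumes smooth compounding, which is a simplification; the function names are introduced here for illustration.

```python
import math


def annual_growth(months_per_doubling):
    """Annual growth rate implied by doubling every given number of months."""
    return 2 ** (12 / months_per_doubling) - 1


def doubling_period_months(annual_rate):
    """Months needed to double at a given compound annual growth rate."""
    return 12 * math.log(2) / math.log(1 + annual_rate)


# Doubling every 24 months is about 41% a year, the "about 40%" in the text.
print(f"{annual_growth(24):.1%}")              # 41.4%
# 30% annual improvement means doubling takes roughly 32 months.
print(f"{doubling_period_months(0.30):.1f}")   # 31.7
# 118% annual improvement means doubling in under 11 months.
print(f"{doubling_period_months(1.18):.1f}")   # 10.7
```

Put another way, the combined SoC pace cited in the transcript doubles more than twice as fast as the classic 24-month cadence.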
Well, we recently reported Satya Nadella's epic quote that "We've now reached peak centralization." So this graphic paints a picture that was quite telling. We just shared that processing power is exploding. The costs consequently are dropping like a rock. Apple's A14 costs the company approximately 50 bucks per chip. Arm at its v9 announcement said that it will have chips that can go into refrigerators. These chips are going to optimize energy usage and save 10% annually on your power consumption. They said this chip will cost a buck, a dollar to shave 10% off your refrigerator electricity bill. It's just astounding. But look at where the expensive bottlenecks are, it's networks and it's storage. So what does this mean? Well, it means the processing is going to get pushed to the edge, i.e., wherever the data is born. Storage and networking are going to become increasingly distributed and decentralized. Now with custom silicon and all that processing power placed throughout the system, AI is going to be embedded into software, into hardware and it's going to optimize workloads for latency, performance, bandwidth, and security. And remember, most of that data, 99% is going to stay at the edge. And we love to use Tesla as an example. The vast majority of data that a Tesla car creates is never going to go back to the cloud. Most of it doesn't even get persisted. I think Tesla saves like five minutes of data. But some data will connect occasionally back to the cloud to train AI models and we're going to come back to that. But this picture says if you're a hardware company, you'd better start thinking about how to take advantage of that blue line that's exploding. Cisco is already designing its own chips. But Dell, HPE, which maybe used to do a lot of its own custom silicon, Pure Storage, NetApp, I mean, the list goes on and on and on. Either you're going to start designing custom silicon or you're going to get disrupted, in our view.
AWS, Google and Microsoft are all doing it for a reason, as is IBM, and as Sarbjeet Johal said recently, this is not your grandfather's semiconductor business. And if you're a software engineer, you're going to be writing applications that take advantage of all the data being collected and bringing to bear this processing power that we're talking about to create new capabilities like we've never seen before. So let's get into that a little bit and dig into AI. You can think of AI as the superset. Just as an aside, interestingly in his book, "Seeing Digital", author David Moschella says, there's nothing artificial about this. He uses the term machine intelligence, instead of artificial intelligence and says that there's nothing artificial about machine intelligence just like there's nothing artificial about the strength of a tractor. It's a nuance, but it's kind of interesting, nonetheless, words matter. We hear a lot about machine learning and deep learning and think of them as subsets of AI. Machine learning applies algorithms and code to data to get "smarter", make better models, for example, that can lead to augmented intelligence and help humans make better decisions. These models improve as they get more data and are iterated over time. Now deep learning is a more advanced type of machine learning. It uses more complex math. But the point that we want to make here is that today much of the activity in AI is around building and training models. And this is mostly happening in the cloud. But we think AI inference will bring the most exciting innovations in the coming years. Inference is the deployment of that model that we were just talking about, taking real time data from sensors, processing that data locally and then applying that training that has been developed in the cloud and making micro adjustments in real time. So let's take an example. Again, we love Tesla examples.
Think about an algorithm that optimizes the performance and safety of a car on a turn. The model takes data on friction, road condition, angles of the tires, the tire wear, the tire pressure, all this data, and it keeps testing and iterating, testing and iterating, testing and iterating that model until it's ready to be deployed. And then the intelligence, all this intelligence goes into an inference engine which is a chip that goes into a car and gets data from sensors and makes these micro adjustments in real time on steering and braking and the like. Now, as we said before, Tesla persists the data for a very short time, because there's so much of it. It just can't push it back to the cloud. But it can, however, selectively store certain data if it needs to, and then send back that data to the cloud to further train the model. Let's say for instance, an animal runs into the road during slick conditions, Tesla wants to grab that data, because they notice that there's a lot of accidents in New England in certain months. And maybe Tesla takes that snapshot and sends it back to the cloud and combines it with other data and maybe other parts of the country or other regions of New England and it perfects that model further to improve safety. This is just one example of thousands and thousands that are going to further develop in the coming decade. I want to talk about how we see this evolving over time. Inference is where we think the value is. That's where the rubber meets the road, so to speak, based on the previous example. Now this conceptual chart shows the percent of spend over time on modeling versus inference. And you can see some of the applications that get attention today and how these applications will mature over time. As inference becomes more and more mainstream, the opportunities for AI inference at the edge and in IoT are enormous. And we think that over time, 95% of that spending is going to go to inference where it's probably only 5% today.
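The edge inference pattern in the car example earlier in this passage, where a model trained in the cloud runs locally, keeps only a short window of sensor data and selectively saves snapshots for retraining, can be sketched as follows. Everything here is a hypothetical illustration: the `EdgeInferenceEngine` class, the `slip` sensor field, the toy model and the threshold are assumptions, not how any vendor actually builds this.

```python
from collections import deque


class EdgeInferenceEngine:
    """Runs a pre-trained model locally and keeps only a short data window."""

    def __init__(self, model, buffer_size, is_notable):
        self.model = model                        # trained elsewhere, e.g. in the cloud
        self.buffer = deque(maxlen=buffer_size)   # old readings fall off automatically
        self.is_notable = is_notable              # rule for what is worth sending back
        self.upload_queue = []                    # snapshots held for later retraining

    def step(self, reading):
        self.buffer.append(reading)
        adjustment = self.model(reading)          # local, real-time inference
        if self.is_notable(reading):
            # Keep the recent window around the event; everything else is discarded.
            self.upload_queue.append(list(self.buffer))
        return adjustment


# Toy stand-ins: a "model" that eases off in proportion to wheel slip,
# and a rule that flags unusually slippery moments.
engine = EdgeInferenceEngine(
    model=lambda reading: -0.5 * reading["slip"],
    buffer_size=3,
    is_notable=lambda reading: reading["slip"] > 0.8,
)

for slip in (0.1, 0.2, 0.3, 0.9):
    engine.step({"slip": slip})

print(len(engine.buffer), len(engine.upload_queue))
```

The bounded `deque` plays the role of the short persistence window, and the upload queue is the small fraction of data that ever travels back to the cloud.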
Now today's modeling workloads are pretty prevalent in things like fraud, adtech, weather, pricing, recommendation engines, and those kinds of things, and now those will keep getting better and better and better over time. Now in the middle here, we show the industries which are all going to be transformed by these trends. Now, one of the points that Moschella made in his book, he kind of explains why historically vertical industries are pretty stovepiped, they have their own stack, sales and marketing and engineering and supply chains, et cetera, and experts within those industries tend to stay within those industries and they're largely insulated from disruption from other industries, maybe unless they were part of a supply chain. But today, you see all kinds of cross industry activity. Amazon entering grocery, entering media. Apple in finance and potentially getting into EV. Tesla, eyeing insurance. There are many, many, many examples of tech giants who are crossing traditional industry boundaries. And the reason is because of data. They have the data. And they're applying machine intelligence to that data and improving. Auto manufacturers, for example, over time they're going to have better data than insurance companies. DeFi, decentralized finance platforms, are going to use the blockchain and they're continuing to improve. Blockchain today does not have great performance, it's very overhead intensive, all that encryption. But as they take advantage of this new processing power and better software and AI, it could very well disrupt traditional payment systems. And again, so many examples here. But what I want to do now is dig into enterprise AI a bit. And just a quick reminder, we showed this last week in our Armv9 post. This is data from ETR. The vertical axis is net score. That's a measure of spending momentum. The horizontal axis is market share or pervasiveness in the dataset. The red line at 40% is like a subjective anchor that we use.
Anything above 40% we think is really good. Machine learning and AI is the number one area of spending velocity and has been for a while. RPA is right there. Frankly, it's an adjacency to AI, you could even argue. So is cloud, where all the ML action is taking place today. But that will change, we think, as we just described, because data's going to get pushed to the edge. And this chart will show you some of the vendors in that space. These are the companies that CIOs and IT buyers associate with their AI and machine learning spend. So it's the same XY graph, spending velocity by market share on the horizontal axis. Microsoft, AWS, Google, of course, the big cloud guys, they dominate AI and machine learning. Facebook's not on here. Facebook's got great AI as well, but it's not enterprise tech spending. These cloud companies, they have the tooling, they have the data, they have the scale and as we said, lots of modeling is going on today, but this is going to increasingly be pushed into remote AI inference engines that will have massive processing capabilities collectively. So we're moving away from that peak centralization as Satya Nadella described. You see Databricks on here. They're seen as an AI leader. SparkCognition, they're off the charts, literally, in the upper left. They have extremely high net score albeit with a small sample. They apply machine learning to massive data sets. DataRobot does automated AI. They're super high in the y-axis. Dataiku, they help create machine learning based apps. C3.ai, you're hearing a lot more about them. Tom Siebel's involved in that company. It's an enterprise AI firm, you hear a lot of ads now about doing AI in a responsible way, really kind of enterprise AI. That's sort of always been IBM Watson's calling card. There's SAP with Leonardo. Salesforce with Einstein. Again, IBM Watson is right there just at the 40% line. You see Oracle is there as well.
They're embedding automated intelligence, or machine intelligence, with their self-driving database, they call it, that sort of machine intelligence in the database. You see Adobe there. So a lot of typical enterprise company names. And the point is that these software companies, they're all embedding AI into their offerings. So if you're an incumbent company and you're trying not to get disrupted, the good news is you can buy AI from these software companies. You don't have to build it. You don't have to be an expert at AI. The hard part is going to be how and where to apply AI. And the simplest answer there is follow the data. There's so much more to the story, but we just have to leave it there for now and I want to summarize. We have been pounding the table that the post x86 era is here. It's a function of volume. Arm wafer volumes are 10X those of x86. Pat Gelsinger understands this. That's why he made that big announcement. He's trying to transform the company. The importance of volume in terms of lowering the cost of semiconductors, it can't be overstated. And today, we've quantified something that we haven't really seen much of and really haven't seen before. And that's that the actual performance improvements that we're seeing in processing today are far outstripping anything we've seen before, forget Moore's Law being dead, that's irrelevant. The original finding is being blown away this decade and who knows with quantum computing what the future holds. This is a fundamental enabler of AI applications. And as is most often the case, the innovation is coming from the consumer use cases first. Apple continues to lead the way. And Apple's integrated hardware and software model we think increasingly is going to move into the enterprise mindset. Clearly the cloud vendors are moving in this direction, building their own custom silicon and doing really that deep integration.
You see this with Oracle, which is really kind of a good example of the iPhone for the enterprise, if you will. It just makes sense that optimizing hardware and software together is going to gain momentum, because there's so much opportunity for customization in chips as we discussed last week with Arm's announcement, especially with the diversity of edge use cases. And it's the direction that Pat Gelsinger is taking Intel, trying to provide more flexibility. One aside, Pat Gelsinger may face massive challenges that we laid out a couple of posts ago with our Intel breaking analysis, but he is right on in our view that semiconductor demand is increasing. There's no end in sight. We don't think we're going to see these ebbs and flows as we've seen in the past, these boom and bust cycles for semiconductors. We just think that prices are coming down. The market's elastic and the market is absolutely exploding with huge demand for fab capacity. Now, if you're an enterprise, you should not stress about trying to invent AI, rather you should put your focus on understanding what data gives you competitive advantage and how to apply machine intelligence and AI to win. You're going to be buying, not building AI and you're going to be applying it. Now data, as John Furrier has said in the past, is becoming the new development kit. He said that 10 years ago and he seems right. Finally, if you're an enterprise hardware player, you're going to be designing your own chips and writing more software to exploit AI. You'll be embedding custom silicon and AI throughout your product portfolio and storage and networking and you'll be increasingly bringing compute to the data. And that data will mostly stay where it's created. Again, systems and storage and networking stacks, they're all being completely re-imagined. If you're a software developer, you now have processing capabilities in the palm of your hand that are incredible.
And you're going to be writing new applications to take advantage of this and use AI to change the world, literally. You'll have to figure out how to get access to the most relevant data. You have to figure out how to secure your platforms and innovate. And if you're a services company, your opportunities to help customers that are trying not to get disrupted are many. You have the deep industry expertise and horizontal technology chops to help customers survive and thrive. Privacy? AI for good? Yeah well, that's a whole other topic. I think for now, we have to get a better understanding of how far AI can go before we determine how far it should go. Look, protecting our personal data and privacy should definitely be something that we're concerned about and we should protect. But generally, I'd rather not stifle innovation at this point. I'd be interested in what you think about that. Okay. That's it for today. Thanks to David Floyer, who helped me with this segment again and did a lot of the charts and the data behind this. He's done some great work there. Remember these episodes are all available as podcasts wherever you listen, just search Breaking Analysis podcast and please subscribe to the series. We'd appreciate that. Check out ETR's website at ETR.plus. We also publish a full report with more detail every week on Wikibon.com and siliconangle.com, so check that out. You can get in touch with me. I'm dave.vellante@siliconangle.com. You can DM me on Twitter @dvellante or comment on our LinkedIn posts. I always appreciate that. This is Dave Vellante for theCUBE Insights powered by ETR. Stay safe, be well. And we'll see you next time. (bright music)
SUMMARY :
This is breaking analysis and did a lot of the charts
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David Floyer | PERSON | 0.99+ |
David Moschella | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Patrick Moorhead | PERSON | 0.99+ |
Tom Siebel | PERSON | 0.99+ |
New England | LOCATION | 0.99+ |
Pat Gelsinger | PERSON | 0.99+ |
CNET | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Cisco | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
MIT Technology Review | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
10% | QUANTITY | 0.99+ |
five minutes | QUANTITY | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
hundreds | QUANTITY | 0.99+ |
Satya Nadella | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
95% | QUANTITY | 0.99+ |
40% | QUANTITY | 0.99+ |
iPhone | COMMERCIAL_ITEM | 0.99+ |
Adobe | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
last week | DATE | 0.99+ |
99% | QUANTITY | 0.99+ |
ETR | ORGANIZATION | 0.99+ |
dave.vellante@siliconangle.com | OTHER | 0.99+ |
John Furrier | PERSON | 0.99+ |
EE Times | ORGANIZATION | 0.99+ |
Sarbjeet Johal | PERSON | 0.99+ |
10X | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
Moschella | PERSON | 0.99+ |
theCUBE | ORGANIZATION | 0.98+ |
Intel | ORGANIZATION | 0.98+ |
15 nanometer | QUANTITY | 0.98+ |
2015 | DATE | 0.98+ |
today | DATE | 0.98+ |
Seeing Digital | TITLE | 0.98+ |
30% | QUANTITY | 0.98+ |
HPE | ORGANIZATION | 0.98+ |
this week | DATE | 0.98+ |
A14 | COMMERCIAL_ITEM | 0.98+ |
higher than 118% | QUANTITY | 0.98+ |
5% | QUANTITY | 0.97+ |
10 years ago | DATE | 0.97+ |
Ein | ORGANIZATION | 0.97+ |
a buck | QUANTITY | 0.97+ |
64 bit | QUANTITY | 0.97+ |
C3.ai | TITLE | 0.97+ |
Databricks | ORGANIZATION | 0.97+ |
about 40% | QUANTITY | 0.96+ |
theCUBE Studios | ORGANIZATION | 0.96+ |
Dataiku | ORGANIZATION | 0.95+ |
siliconangle.com | OTHER | 0.94+ |
Boost Your Solutions with the HPE Ezmeral Ecosystem Program | HPE Ezmeral Day 2021
>> Hello. My name is Ron Kafka, and I'm the senior director for Partner Scale Initiatives for HPE Ezmeral. Thanks for joining us today at Analytics Unleashed. By now, you've heard a lot about the Ezmeral portfolio and how it can help you accomplish objectives around big data analytics and containerization. I want to shift gears a bit and discuss our Ezmeral Technology Partner Program. I've got two great guest speakers here with me today. And together, we're going to discuss how jointly we are solving data analytic challenges for our customers. Before I introduce them, I want to take a minute to provide a little bit more insight into our ecosystem program. We've created a program with a realization, based on customer feedback, that even the most mature organizations are struggling with their data-driven transformation efforts. It turns out this is largely due to the pace of innovation with application vendors, or ISVs, supporting data science and advanced analytic workloads. Their advancements are simply outpacing organizations' ability to move workloads into production rapidly. Bottom line, organizations want a unified experience across environments, where their entire application portfolio in essence provides a comprehensive application stack and not piece parts. So, let's talk about how our ecosystem program helps solve for this. For starters, we are leveraging HPE's long track record of forging technology partnerships and have created a best-in-class ISV partner program specific to the Ezmeral portfolio. We are doing this by developing an open concept marketplace where customers and partners can explore, learn, engage and collaborate with our strategic technology partners. This enables our customers to adopt and deploy validated applications from industry-leading software vendors on HPE Ezmeral with a high degree of confidence. Also, it provides a very deep bench of leading ISVs for other groups inside of HPE to leverage for their solutioning efforts. 
Speaking of industry-leading ISVs, it's about time I introduce you to two of those industry leaders right now. Let me welcome Daniel Hladky from Dataiku, and Omri Geller from Run:AI. So I'd like to introduce Daniel Hladky. Daniel is with Dataiku. He's a great partner for HPE. Daniel, welcome. >> Thank you for having me here. >> That's great. Hey, would you mind just talking a bit about how your partnership journey has been with HPE? >> Yes, pleasure. So the journey started about five years ago and in 2018 we signed a worldwide reseller agreement with HPE. And in 2020, we actually started to work jointly on integrating the Dataiku Data Science Studio, called DSS, with the Ezmeral Container Platform, and it was a great success. And it was on behalf of some clear customer projects. >> It's been a long partnership journey with you for sure with HPE. And we welcome your partnership extremely well. Just a brief question about the Container Platform and really what that's meant for Dataiku. >> Yes, Ron. Thanks. So, basically I'd like to quote here Florian Douetteau, who is the CEO of Dataiku, who said that the combination of Dataiku with the HPE Ezmeral Container Platform will help the customers to successfully scale and put machine learning projects into production. And this basically is going to deliver real impact for their business. So, the combination of the two of us is a great success. >> That's great. Can you talk about what Dataiku is doing and how HPE Ezmeral Container Platform fits in a solution offering a bit more? >> Great. So basically Dataiku DSS is our product, which is an end-to-end data science platform, and basically brings value to the projects of customers on their path to enterprise AI. In simple ways, we can say it could be as simple as building data pipelines, but it could be also very complex by having machine and deep learning models at scale. 
So the fast track to value is by having collaboration, orchestration of underlying technologies and the models in production. So, all of that is part of the Data Science Studio, and Ezmeral fits perfectly into the part where we design and then basically put those projects at scale and put them into production. >> That's perfect. Can you be a bit more specific about how you see HPE and Dataiku really tightening up a customer outcome and value proposition? >> Yes. So what we see is also the challenge of the market, that probably about 80% of the use cases really never make it to production. And this is of course a big challenge and we need to change that. And I think the combination of the two of us is actually addressing exactly this need. What we can say is, as part of the MLOps approach, Dataiku and the Ezmeral Container Platform will provide a frictionless approach, which means without scripting and coding, customers can put all those projects into the production environment and don't have to worry any more and can be more business oriented. >> That's great. So you mentioned you're seeing customers be a lot more mature with their AI workloads and deployment. What do you suggest for the other customers out there that are just starting this journey or just thinking about how to get started? >> Yeah. That's a very good question, Ron. So what we see there is actually the challenge that people need to go on a path of maturity. And this starts with simple data pipelines, et cetera, and then basically they move up the ladder and build large complex projects. And here I see a very interesting offer coming now from HPE which is called D3S, which is the data science startup pack. That's something I discussed together with HPE back in early 2020. And basically, it solves the three stages, which are explore, experiment and evolve, and quickly builds MVPs for the customers. 
By doing so, basically you address business objectives, lay out the proper architecture and also set up the proper organization around it. So, this is a great combination by HPE and Dataiku through the D3S. >> And it's a perfect example of what I mentioned earlier about leveraging the ecosystem program that we built to do deeper solutioning efforts inside of HPE, in this case with our AI business unit. So, congratulations on that and thanks for joining us today. I'm going to shift gears. I'm going to bring in Omri Geller from Run:AI. Omri, welcome. It's great to have you. You guys are killing it out there in the market today. And I just thought we could spend a few minutes talking about what is so unique and differentiated about your offerings. >> Thank you, Ron. It's a pleasure to be here. Run:AI creates a virtualization and orchestration layer for AI infrastructure. We help organizations to gain visibility and control over their GPU resources and help them deliver AI solutions to market faster. And we do that by managing granular scheduling, prioritization and allocation of compute power, together with the HPE Ezmeral Container Platform. >> That's great. And your partnership with HPE is a bit newer than Daniel's, right? Maybe about the last year or so we've been working together a lot more closely. Can you just talk about the HPE partnership, what it's meant for you and how do you see it impacting your business? >> Sure. First of all, Run:AI is excited to partner with HPE Ezmeral Container Platform and help customers manage GPUs for their AI workloads. We chose HPE since HPE has years of experience partnering on AI use cases and outcomes with vendors who have a strong footprint in this market. HPE works with many partners that are complementary for our use case, such as Nvidia, and HPE Container Platform together with Run:AI and Nvidia deliver a world-class solution for AI accelerated workloads. And as you can understand, for AI, speed is critical. 
Companies want to get their important AI initiatives into production as soon as they can. And the HPE Ezmeral Container Platform, with the Run:AI GPU orchestration solution, enables that by enabling dynamic provisioning of GPUs so that resources can be easily shared, efficiently orchestrated and optimally used. >> That's great. And you talked a lot about the efficiency of the solution. What about from a customer perspective? What is the real benefit that our customers are going to be able to gain from an HPE and Run:AI offering? >> So first, it is important to understand how data scientists and AI researchers actually build solutions. They do it by running experiments. And if a data scientist is able to run more experiments per given time, they will get to the solution faster. With HPE Ezmeral Container Platform, Run:AI end users such as data scientists can actually do that and seamlessly and efficiently consume large amounts of GPU resources, run more experiments per given time and therefore accelerate their research. Together, we actually saw a customer that is running almost 7,000 jobs in parallel over GPUs with efficient utilization of those GPUs. And by running more experiments, those customers can be much more effective and efficient when it comes to bringing solutions to market. >> Couldn't agree more. And I think we're starting to see a lot of joint success together as we go out and tell the story. Hey, I want to thank you both one last time for being here with me today. It was very enlightening for our team to have you as part of the program. And I'm excited to extend this customer value proposition out to the rest of our communities. With that, I'd like to close today's session. I appreciate everyone's time. And keep an eye out on our ISV marketplace for Ezmeral. We're continuing to expand and add new capabilities and new partners to our marketplace. We're excited to do a lot of great things and help you guys all be successful. Thanks for joining. >> Thank you, Ron. 
>> What a great panel discussion. And these partners they really do have a good understanding of the possibilities, working on the platform, and I hope and expect we'll see this ecosystem continue to grow. That concludes the main program, which means you can now pick one of three live demos to attend and chat live with experts. Now those three include day in the life of IT Admin, day in the life of a data scientist, and even a day in the life of the HPE Ezmeral Data Fabric, where you can see the many ways the data fabric is used in your life today. Wish you could attend all three, no worries. The recordings will be available on demand for you and your teams. Moreover, the show doesn't stop here, HPE has a growing and thriving tech community, you should check it out. It's really a solid starting point for learning more, talking to smart people about great ideas and seeing how Ezmeral can be part of your own data journey. Again, thanks very much to all of you for joining, until next time, keep unleashing the power of your data.
SUMMARY :
and how it can help you Hey, would you mind just talking a bit and integrated that with the and really what that's meant for Dataiku. So, basically I'd like the quote here Florian Douetteau, and how HPE Ezmeral Container Platform and the models in production. about how you see HPE and and the Ezmeral Container Platform or just thinking about how to get started? and builds quickly MVPs for the customers. and differentiated from your offerings. and control over their GPO resources and how do you see it and HPE Container Platform together with Run:AI efficiency of the solution. So first, it is important to understand for our team to have you and even a day in the life of
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Daniel | PERSON | 0.99+ |
Ron Kafka | PERSON | 0.99+ |
Ron | PERSON | 0.99+ |
Omri Geller | PERSON | 0.99+ |
Florian Douetteau | PERSON | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
Daniel Hladky | PERSON | 0.99+ |
Dataiku | ORGANIZATION | 0.99+ |
two | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
Nvidia | ORGANIZATION | 0.99+ |
2018 | DATE | 0.99+ |
DSS | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
today | DATE | 0.99+ |
three | QUANTITY | 0.99+ |
early 2020 | DATE | 0.99+ |
first | QUANTITY | 0.98+ |
Data Science Studio | ORGANIZATION | 0.98+ |
Ezmeral | PERSON | 0.98+ |
Ezmeral | ORGANIZATION | 0.98+ |
Dataiku Data Science Studio | ORGANIZATION | 0.97+ |
three live demos | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
about 80% | QUANTITY | 0.96+ |
HPEs | ORGANIZATION | 0.95+ |
three stages | QUANTITY | 0.94+ |
two great guest speakers | QUANTITY | 0.93+ |
Omri | PERSON | 0.91+ |
Analytics Unleashed | ORGANIZATION | 0.91+ |
D3S | TITLE | 0.87+ |
almost 7,000 jobs | QUANTITY | 0.87+ |
HPE Container Platform | TITLE | 0.86+ |
HPE Ezmeral Container Platform | TITLE | 0.83+ |
HBE Ezmeral | ORGANIZATION | 0.83+ |
Run | ORGANIZATION | 0.82+ |
Ezmeral Container Platform | TITLE | 0.81+ |
about five years ago | DATE | 0.8+ |
Platform | TITLE | 0.71+ |
Ezmeral | TITLE | 0.7+ |
Run:AI | ORGANIZATION | 0.7+ |
Ezmeral Data | ORGANIZATION | 0.69+ |
2021 | DATE | 0.68+ |
Ezmeral Ecosystem Program | TITLE | 0.68+ |
ICS | ORGANIZATION | 0.67+ |
Run | TITLE | 0.66+ |
Partner Scale Initiatives | ORGANIZATION | 0.66+ |
Boost Your Solutions with the HPE Ezmeral Ecosystem Program | HPE Ezmeral Day 2021
>> Hello. My name is Ron Kafka, and I'm the senior director for Partner Scale Initiatives for HPE Ezmeral. Thanks for joining us today at Analytics Unleashed. By now, you've heard a lot about the Ezmeral portfolio and how it can help you accomplish objectives around big data analytics and containerization. I want to shift gears a bit and discuss our Ezmeral Technology Partner Program. I've got two great guest speakers here with me today. And together, we're going to discuss how jointly we are solving data analytic challenges for our customers. Before I introduce them, I want to take a minute to provide a little bit more insight into our ecosystem program. We've created a program with a realization, based on customer feedback, that even the most mature organizations are struggling with their data-driven transformation efforts. It turns out this is largely due to the pace of innovation with application vendors, or ISVs, supporting data science and advanced analytic workloads. Their advancements are simply outpacing organizations' ability to move workloads into production rapidly. Bottom line, organizations want a unified experience across environments, where their entire application portfolio in essence provides a comprehensive application stack and not piece parts. So, let's talk about how our ecosystem program helps solve for this. For starters, we are leveraging HPE's long track record of forging technology partnerships and have created a best-in-class ISV partner program specific to the Ezmeral portfolio. We are doing this by developing an open concept marketplace where customers and partners can explore, learn, engage and collaborate with our strategic technology partners. This enables our customers to adopt and deploy validated applications from industry-leading software vendors on HPE Ezmeral with a high degree of confidence. Also, it provides a very deep bench of leading ISVs for other groups inside of HPE to leverage for their solutioning efforts. 
Speaking of industry-leading ISVs, it's about time I introduce you to two of those industry leaders right now. Let me welcome Daniel Hladky from Dataiku, and Omri Geller from Run:AI. So I'd like to introduce Daniel Hladky. Daniel is with Dataiku. He's a great partner for HPE. Daniel, welcome. >> Thank you for having me here. >> That's great. Hey, would you mind just talking a bit about how your partnership journey has been with HPE? >> Yes, pleasure. So the journey started about five years ago and in 2018 we signed a worldwide reseller agreement with HPE. And in 2020, we actually started to work jointly on integrating the Dataiku Data Science Studio, called DSS, with the Ezmeral Container Platform, and it was a great success. And it was on behalf of some clear customer projects. >> It's been a long partnership journey with you for sure with HPE. And we welcome your partnership extremely well. Just a brief question about the Container Platform and really what that's meant for Dataiku. >> Yes, Ron. Thanks. So, basically I'd like to quote here Florian Douetteau, who is the CEO of Dataiku, who said that the combination of Dataiku with the HPE Ezmeral Container Platform will help the customers to successfully scale and put machine learning projects into production. And this basically is going to deliver real impact for their business. So, the combination of the two of us is a great success. >> That's great. Can you talk about what Dataiku is doing and how HPE Ezmeral Container Platform fits in a solution offering a bit more? >> Great. So basically Dataiku DSS is our product, which is an end-to-end data science platform, and basically brings value to the projects of customers on their path to enterprise AI. In simple ways, we can say it could be as simple as building data pipelines, but it could be also very complex by having machine and deep learning models at scale. 
So the fast track to value is by having collaboration, orchestration of underlying technologies and the models in production. So, all of that is part of the Data Science Studio, and Ezmeral fits perfectly into the part where we design and then basically put those projects at scale and put them into production. >> That's perfect. Can you be a bit more specific about how you see HPE and Dataiku really tightening up a customer outcome and value proposition? >> Yes. So what we see is also the challenge of the market, that probably about 80% of the use cases really never make it to production. And this is of course a big challenge and we need to change that. And I think the combination of the two of us is actually addressing exactly this need. What we can say is, as part of the MLOps approach, Dataiku and the Ezmeral Container Platform will provide a frictionless approach, which means without scripting and coding, customers can put all those projects into the production environment and don't have to worry any more and can be more business oriented. >> That's great. So you mentioned you're seeing customers be a lot more mature with their AI workloads and deployment. What do you suggest for the other customers out there that are just starting this journey or just thinking about how to get started? >> Yeah. That's a very good question, Ron. So what we see there is actually the challenge that people need to go on a path of maturity. And this starts with simple data pipelines, et cetera, and then basically they move up the ladder and build large complex projects. And here I see a very interesting offer coming now from HPE which is called D3S, which is the data science startup pack. That's something I discussed together with HPE back in early 2020. And basically, it solves the three stages, which are explore, experiment and evolve, and quickly builds MVPs for the customers. 
By doing so, basically you address business objectives, lay out the proper architecture and also set up the proper organization around it. So, this is a great combination by HPE and Dataiku through the D3S. >> And it's a perfect example of what I mentioned earlier about leveraging the ecosystem program that we built to do deeper solutioning efforts inside of HPE, in this case with our AI business unit. So, congratulations on that and thanks for joining us today. I'm going to shift gears. I'm going to bring in Omri Geller from Run:AI. Omri, welcome. It's great to have you. You guys are killing it out there in the market today. And I just thought we could spend a few minutes talking about what is so unique and differentiated about your offerings. >> Thank you, Ron. It's a pleasure to be here. Run:AI creates a virtualization and orchestration layer for AI infrastructure. We help organizations to gain visibility and control over their GPU resources and help them deliver AI solutions to market faster. And we do that by managing granular scheduling, prioritization and allocation of compute power, together with the HPE Ezmeral Container Platform. >> That's great. And your partnership with HPE is a bit newer than Daniel's, right? Maybe about the last year or so we've been working together a lot more closely. Can you just talk about the HPE partnership, what it's meant for you and how do you see it impacting your business? >> Sure. First of all, Run:AI is excited to partner with HPE Ezmeral Container Platform and help customers manage GPUs for their AI workloads. We chose HPE since HPE has years of experience partnering on AI use cases and outcomes with vendors who have a strong footprint in this market. HPE works with many partners that are complementary for our use case, such as Nvidia, and HPE Ezmeral Container Platform together with Run:AI and Nvidia deliver a world-class solution for AI accelerated workloads. And as you can understand, for AI, speed is critical. 
Companies want to get their important AI initiatives into production as soon as they can. And the HPE Ezmeral Container Platform, with the Run:AI GPU orchestration solution, enables that by enabling dynamic provisioning of GPUs so that resources can be easily shared, efficiently orchestrated and optimally used. >> That's great. And you talked a lot about the efficiency of the solution. What about from a customer perspective? What is the real benefit that our customers are going to be able to gain from an HPE and Run:AI offering? >> So first, it is important to understand how data scientists and AI researchers actually build solutions. They do it by running experiments. And if a data scientist is able to run more experiments per given time, they will get to the solution faster. With HPE Ezmeral Container Platform, Run:AI end users such as data scientists can actually do that and seamlessly and efficiently consume large amounts of GPU resources, run more experiments per given time and therefore accelerate their research. Together, we actually saw a customer that is running almost 7,000 jobs in parallel over GPUs with efficient utilization of those GPUs. And by running more experiments, those customers can be much more effective and efficient when it comes to bringing solutions to market. >> Couldn't agree more. And I think we're starting to see a lot of joint success together as we go out and tell the story. Hey, I want to thank you both one last time for being here with me today. It was very enlightening for our team to have you as part of the program. And I'm excited to extend this customer value proposition out to the rest of our communities. With that, I'd like to close today's session. I appreciate everyone's time. And keep an eye out on our ISV marketplace for Ezmeral. We're continuing to expand and add new capabilities and new partners to our marketplace. We're excited to do a lot of great things and help you guys all be successful. Thanks for joining. >> Thank you, Ron. 
(bright upbeat music)
SUMMARY :
and how it can help you journey has been with HPE? and integrated that with the and really what that's meant for Dataiku. and put machine learning and how HPE Ezmeral Container Platform and the models in production. about how you see HPE and and the Ezmeral Container Platform or just thinking about how to get started? and builds quickly MVPs for the customers. and differentiated from your offerings. and control over their GPO resources and how do you see it and outcomes with vendors efficiency of the solution. So first, it is important to understand and new partners to our marketplace. Thank you, Ron.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Daniel | PERSON | 0.99+ |
Ron Kafka | PERSON | 0.99+ |
Florian Douetteau | PERSON | 0.99+ |
Ron | PERSON | 0.99+ |
Omri Geller | PERSON | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
Daniel Hladky | PERSON | 0.99+ |
Nvidia | ORGANIZATION | 0.99+ |
two | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
2018 | DATE | 0.99+ |
Dataiku | ORGANIZATION | 0.99+ |
DSS | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
today | DATE | 0.99+ |
Omri | PERSON | 0.99+ |
Data Science Studio | ORGANIZATION | 0.98+ |
early 2020 | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
Ezmeral | ORGANIZATION | 0.98+ |
Dataiku Data Science Studio | ORGANIZATION | 0.97+ |
about 80% | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
HPEs | ORGANIZATION | 0.95+ |
three stages | QUANTITY | 0.94+ |
two great guest speakers | QUANTITY | 0.93+ |
one | QUANTITY | 0.93+ |
almost 7,000 jobs | QUANTITY | 0.92+ |
Analytics Unleashed | ORGANIZATION | 0.91+ |
HPE Ezmeral Container Platform | TITLE | 0.84+ |
HBE Ezmeral | ORGANIZATION | 0.83+ |
Run | ORGANIZATION | 0.83+ |
Ezmeral Container Platform | TITLE | 0.82+ |
D3S | TITLE | 0.81+ |
about five years ago | DATE | 0.8+ |
HPE Ezmeral Container Platform | TITLE | 0.79+ |
2021 | DATE | 0.76+ |
Run:AI | ORGANIZATION | 0.72+ |
Ezmeral | TITLE | 0.7+ |
Platform | TITLE | 0.69+ |
Ezmeral Container Platform | TITLE | 0.68+ |
ICS | ORGANIZATION | 0.67+ |
Partner Scale Initiatives | ORGANIZATION | 0.66+ |
HPE | TITLE | 0.62+ |
DSS | TITLE | 0.6+ |
Ezmeral Container | TITLE | 0.59+ |
Container | TITLE | 0.56+ |
HPE Ezmeral | EVENT | 0.55+ |
First | QUANTITY | 0.52+ |
Run | TITLE | 0.51+ |
Day | EVENT | 0.51+ |
Benoit Dageville and Florian Douetteau V1
>> Hello everyone, welcome back to theCUBE's wall to wall coverage of the Snowflake Data Cloud Summit. My name is Dave Vellante and with me are two world-class technologists, visionaries, and entrepreneurs. Benoit Dageville co-founded Snowflake, and he's now the president of the Product division, and Florian Douetteau is the co-founder and CEO of Dataiku. Gentlemen, welcome to theCUBE, two first timers, love it. >> Great time to be here. >> Now Florian, you and Benoit, you have a number of customers in common. And I've said many times on theCUBE that the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data, with the cloud as really the scale platform. So is that premise relevant to you, do you buy that? And why do you think Snowflake and Dataiku make a good match for customers? >> I think so, because our values align. It's all about lowering complexity for customers and closing the gap: we need to commoditize the access to data and the access to technology. It's not only about data. Data is important, but it's also about the impact of data. How can you make the best out of data as fast as possible, as easily as possible within an organization? And another value is about just the openness of the platform, building a future together. I think of a platform that is not just about the platform but also about the ecosystem of partners around it, bringing the accessibility and flexibility you need for the next 10 years. >> Yes, so that's key, but it's not just data. It's turning data into insights. Now Benoit, you came out of the world of very powerful, but highly complex databases. And we all know that you and the Snowflake team get very high marks for really radically simplifying customers' lives. 
But can you talk specifically about the types of challenges that your customers are using Snowflake to solve? >> Yeah, so really the challenge before Snowflake, I would say, was really to put all the data in one place and run all the computes, all the workloads that you wanted to run, against that data. And of course, existing legacy platforms were not able to support that level of concurrency, that many workloads. We talk about machine learning, data science, data engineering, data warehouse, big data workloads, all running in one place; that didn't make sense at all. And therefore, what customers did was to create silos, silos of data everywhere, with different systems having a subset of the data. And of course now you cannot analyze this data in one place. So with Snowflake, we really solved that problem by creating a single architecture where you can put all the data in the cloud. So it's really cloud native. We really thought about how to solve that problem, how to leverage cloud and the elasticity of cloud to really put all the data in one place. But at the same time, not run all workloads in the same place. So each workload that runs in Snowflake has its dedicated compute resources to run. And that makes it very agile, right. Florian talked about data scientists having to run analyses. So they need a lot of compute resources, but only for a few hours, and with Snowflake, they can run this new workload, add this workload to the system, get the compute resources that they need to run this workload. And then when it's over, they can shut down their system. It will automatically shut down. Therefore they would not pay for the resources that they don't use. So it's a very agile system, where you can do these analyses when you need, and you have all the power to run all these workloads at the same time. >> Well, it's profound what you guys built. To me, I mean, because everybody's trying to copy it now. 
It's like, I remember the notion of bringing compute to the data in the Hadoop days. And I think that, as I say, everybody is sort of following your suit now or trying to. Florian, I got to say, the first data scientist I ever interviewed on theCUBE was the amazing Hilary Mason, right after she started at Bitly. And she made data science sound so compelling, but data science is hard. So same question for you. What do you see as the biggest challenges that customers are facing with data science? >> The biggest challenge from my perspective is that once you solve the issue of the data silo with Snowflake, you don't want to bring in another silo, which would be a silo of skills. And essentially, there is that talent gap in the labor market, in how hard it is to actually find, recruit and train data scientists compared with what needs to be done. And so you actually need to simplify the access to technology so that every organization can make it, whatever its talent, by bridging that gap. And to get there, there is a need to actually break up the silos. I think of a collaborative approach, where technology and business work together and all actually put their hands into those data projects together. >> Yeah, it makes sense. So Florian, let's stay with you for a minute, if I can. Your observation space is pretty, pretty global. And so, you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends, maybe differences between regions or maybe within different industries? What are you seeing? >> Yep. Yeah, definitely, I do see trends that are not so much geographic, but much more in terms of the maturity of certain industries and certain sectors: certain industries invested a lot in data, data access, and the ability to store data over the last few years, and now reach a level of maturity where they can invest more and get to the next steps.
And it's really relevant for certain industries, certain organizations, to have actually built this long-term data strategy a few years ago, and now start reaping the benefits. >> You know, a decade ago, Florian, Hal Varian famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of changed that to data scientists. And then all the statisticians became data scientists and they got a raise. But data science requires more than just statistics acumen. What skills do you see as critical for the next generation of data science? >> Yeah, it's a good question, because I think the first generation of data scientists became data scientists because they could learn some Python quickly and be flexible. And I think the skills of the next generation of data scientists will definitely be different. It will be first about being able to speak the language of the business, meaning how you translate data insight, predictive modeling, all of this into actionable insights or business impact. And it will be about how you collaborate with the rest of the business. It's not just how fast you can build something, how fast you can do a notebook in Python or do quantitative models of some sort. It's about how you actually build this bridge with the business. And obviously those things are important, but we also must be cognizant of the fact that technology will evolve in the future. There will be new tools and technologies, and data scientists will still need this level of flexibility to understand quickly what the next tools, new languages or whatever are that they need to use to get there. >> Thank you for that. Benoit, let's come back to you. This year has been tumultuous to say the least for everyone, but it's a good time to be in tech, ironically. And if you're in cloud, it's even better. But you look at Snowflake and Dataiku, you guys have done well, despite the economic uncertainty and the challenges of the pandemic.
As you look back on 2020, what are you thinking? What are you telling people as we head into next year? >> Yeah, I think it's very interesting, right? This crisis has told us that the world really can change from one day to the next. And this has dramatic and profound aspects. For example, some companies all of a sudden saw their revenue line dropping and they had to do less with data. And for some other companies it was the reverse, right? All of a sudden they were online, like Instacart, for example, and their business completely changed from one day to the next. So this agility of adjusting the resources that you have to the task, to a need that can change, using a solution like Snowflake, really helps with that. And we saw both among our customers. Some customers, from one day to the next, were growing big time, because their business benefited from COVID, but others, as you know, had to drop. And what is nice with cloud is that it allows you to adjust compute resources to your business needs and really address it in-house. The other aspect is understanding what is happening, right? You need to analyze. So we saw that all our customers basically wanted to understand: what is going to be the impact on my business? How can I adapt? How can I adjust? And for that, they needed to analyze data. And of course, a lot of data, which is not necessarily data about their business, but also data from the outside. For example, COVID data: where is the state, what is the impact, the geographic impact of COVID, all the time. And access to this data is critical. So this is the promise of the data cloud, right? Having one single place where you can put all the data of the world. So our customers, all of a sudden, started to consume the COVID data from our data marketplace. And we already have thousands of customers looking at this data, analyzing this data to make good decisions.
So this agility and this adapting from one hour to the next is really critical, and that goes with data, with cloud, with adjusting resources, and that doesn't exist on premise. So, indeed I think the lesson learned is, we are living in a world which is changing all the time, and we have to understand it. We have to adjust, and that's why cloud, in some way, is great. >> Excellent, thank you. You know, in theCUBE, we like to talk about disruption, of course, who doesn't. And also, I mean, you look at AI and the impact that it's beginning to have, and kind of pre-COVID, you look at some of the industries that were getting disrupted. Everybody talks about digital transformation, and you had on the one end of the spectrum industries like publishing, which are highly disrupted, or taxis, and you can say, "Okay well, that's bits versus atoms, the old Negroponte thing." But then the flip side of this says, "Look at financial services that hadn't been dramatically disrupted, certainly healthcare, which is ripe for disruption, defense." So there were a number of industries that really hadn't leaned into digital transformation: if it ain't broke, don't fix it. Not on my watch. There was this complacency. And then of course COVID broke everything. So Florian, I wonder if you could comment, what industry or industries do you think are going to be most impacted by data science and what I call machine intelligence or AI in the coming years and decades? >> Honestly, I think it's all of them, or at least most of them. Because for some industries, the impact is very visible, because we are talking about brand new products, drones, flying cars, or whatever, that are very visible to us. But for others, we are talking about spectrum changes in the way you operate as an organization. Even if the financial industry itself doesn't seem to be so impacted when you look at it from the consumer side or the outside.
In fact, internally it's probably heavily impacted, just because the way you use data to develop the flexibility you need, and the kind of cost gain you can get by leveraging the latest technologies, is just enormous. And so the change will actually come from inside the industry also. And overall, I think that 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity, meaning that when you've got a crisis, you actually need data and AI more than before; you need to actually call the people from data into the room to take better decisions and look forward and not backward. And I think that's a very important learning from 2020 that will tell things about 2021. And resilience, it's like, yeah, data analytics today is a function concerning every industry, and is so important that it's something that needs to work. So the infrastructure needs to work, the infrastructure needs to be super resilient. So probably not on-prem, or not fully on-prem, at some point, and the kind of resilience where you need to be able to plan for literally anything. Like, no hypothesis in terms of behaviors can be taken for granted. And that's something that is new and which is just signaling that we are just getting into a next step for all data analytics. >> I wonder, Benoit, if you have anything to add to that. I mean, I often wonder, you know, when are machines going to be able to make better diagnoses than doctors? Some people say already. Will the financial services, traditional banks lose control of payment systems? You know, what's going to happen to big retail stores? I mean, maybe bring us home with some of your final thoughts. >> Yeah, I would say, I don't see that as a negative, right? The human being will always be involved very closely, but then the machine and the data can really help see correlations in the data that would be impossible for a human being alone to discover.
So, I think it's going to be a complement, not a replacement, and everything that has made us faster doesn't mean that we have less work to do. It means that we can do more. And we have so much to do that I would not be worried about the effect of being more efficient and better at our work. And indeed, I fundamentally think that data, processing of images and doing AI on these images and discovering patterns and potentially flagging disease way earlier than was possible, is going to have a huge impact in health care. And as Florian was saying, every industry is going to be impacted by that technology. So, yeah, I'm very optimistic. >> Great. Guys, I wish we had more time. We've got to leave it there, but thanks so much for coming on theCUBE. It was really a pleasure having you. >> [Benoit & Florian] Thank you. >> You're welcome, but keep it right there, everybody. We'll be back with our next guest, right after this short break. You're watching theCUBE.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Florian | PERSON | 0.99+ |
Benoit | PERSON | 0.99+ |
Florian Douetteau | PERSON | 0.99+ |
Benoit Dageville | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
10 years | QUANTITY | 0.99+ |
Dataiku | ORGANIZATION | 0.99+ |
Hilary Mason | PERSON | 0.99+ |
Python | TITLE | 0.99+ |
Hal Varian | PERSON | 0.99+ |
next year | DATE | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
one place | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
one hour | QUANTITY | 0.99+ |
Bitly | ORGANIZATION | 0.99+ |
Snowflake Data Cloud Summit | EVENT | 0.99+ |
a decade ago | DATE | 0.98+ |
one day | QUANTITY | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
first | QUANTITY | 0.98+ |
each level | QUANTITY | 0.98+ |
Snowflake | TITLE | 0.98+ |
2021 | DATE | 0.97+ |
today | DATE | 0.97+ |
first generation | QUANTITY | 0.97+ |
pandemic | EVENT | 0.97+ |
few years ago | DATE | 0.93+ |
thousands of customers | QUANTITY | 0.93+ |
single architecture | QUANTITY | 0.92+ |
first era | QUANTITY | 0.88+ |
Negroponte | PERSON | 0.87+ |
first data scientist | QUANTITY | 0.87+ |
Instacart | ORGANIZATION | 0.87+ |
This year | DATE | 0.86+ |
one single place | QUANTITY | 0.86+ |
two | QUANTITY | 0.83+ |
two world- | QUANTITY | 0.78+ |
each workload | QUANTITY | 0.78+ |
one | QUANTITY | 0.76+ |
Adam | PERSON | 0.74+ |
next 10 years | DATE | 0.69+ |
first timers | QUANTITY | 0.52+ |
COVID | OTHER | 0.51+ |
COVID | ORGANIZATION | 0.43+ |
COVID | EVENT | 0.37+ |
decades | DATE | 0.29+ |