Steven Hillion & Jeff Fletcher, Astronomer | AWS Startup Showcase S3E1
(upbeat music) >> Welcome everyone to theCUBE's presentation of the AWS Startup Showcase: AI/ML Top Startups Building Foundation Model Infrastructure. This is season three, episode one of our ongoing series covering exciting startups from the AWS ecosystem to talk about data and analytics. I'm your host, Lisa Martin, and today we're excited to be joined by two guests from Astronomer. Steven Hillion joins us, its Chief Data Officer, and Jeff Fletcher, its Director of ML. They're here to talk about machine learning and data orchestration. Guys, thank you so much for joining us today. >> Thank you. >> It's great to be here. >> Before we get into machine learning, let's give the audience an overview of Astronomer. Talk about what that is, Steven. Talk about what you mean by data orchestration. >> Yeah, let's start with Astronomer. We're the Airflow company, basically. The commercial developer behind the open-source project, Apache Airflow. I don't know if you've heard of Airflow. It's sort of the de facto standard these days for orchestrating data pipelines, data engineering pipelines, and as we'll talk about later, machine learning pipelines. It really is the de facto standard. I think we're up to about 12 million downloads a month as an open-source project. I think at this point it's more popular by some measures than Slack. Airflow was created by Airbnb some years ago to manage all of their data pipelines and manage all of their workflows, and now it powers the data ecosystem for organizations as diverse as Electronic Arts, Conde Nast is one of our big customers, a big user of Airflow. And also, not to mention, the biggest banks on Wall Street use Airflow and Astronomer to power the flow of data throughout their organizations. >> Talk about that a little bit more, Steven, in terms of the business impact. You mentioned some great customer names there. What is the business impact or outcomes that a data orchestration strategy enables businesses to achieve?
>> Yeah, I mean, at the heart of it is quite simply scheduling and managing data pipelines. And so if you have some enormous retailer who's managing the flow of information throughout their organization, they may literally have thousands or even tens of thousands of data pipelines that need to execute every day, to do things as simple as delivering metrics for the executives to consume at the end of the day, to producing on a weekly basis new machine learning models that can be used to drive product recommendations. One of our customers, for example, is a British food delivery service. And you get those recommendations in your application that says, "Well, maybe you want to have samosas with your curry." That sort of thing is powered by machine learning models that they train on a regular basis to reflect changing conditions in the market. And those are produced through Airflow and through the Astronomer platform, which is essentially a managed platform for running Airflow. So at its simplest it really is just scheduling and managing those workflows. But that's easier said than done, of course. I mean, if you have tens of thousands of those things, then you need to make sure that they all run, that they all have sufficient compute resources. If things fail, how do you track those down across those 10,000 workflows? How easy is it for an average data scientist or data engineer to contribute their code, their Python notebooks or their SQL code, into a production environment? And then you've got reproducibility, governance, auditing. Managing data flows across an organization, which we think of as orchestrating them, is much more than just scheduling. It becomes really complicated pretty quickly. >> I imagine there's a fair amount of complexity there. Jeff, let's bring you into the conversation. Talk a little bit about Astronomer through your lens, data orchestration and how it applies to MLOps.
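Steven's description of scheduling and dependency management comes down to executing a graph of tasks in dependency order. Here is a minimal sketch of that core idea in plain Python; the task names are invented for illustration, and this is not Airflow's API (a real orchestrator layers scheduling, retries, and distributed compute on top):

```python
# Illustrative sketch of what an orchestrator does at its core: run a
# pipeline's tasks in dependency order. Task names are hypothetical;
# this is not Airflow's API.
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on.
pipeline = {
    "extract_orders": set(),
    "publish_metrics": {"extract_orders"},
    "train_recommender": {"extract_orders"},
    "refresh_dashboard": {"publish_metrics", "train_recommender"},
}

def run_order(dag):
    """Return one valid execution order for the pipeline's tasks."""
    return list(TopologicalSorter(dag).static_order())

print(run_order(pipeline))
```

At the scale Steven describes, tens of thousands of pipelines, it is exactly this ordering, plus resource allocation and failure tracking, that a managed platform takes off the user's hands.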
>> So I come from a machine learning background, and for me the interesting part is that machine learning requires the expansion into orchestration. A lot of the same things that you're using to go and develop and build pipelines in a standard data orchestration space apply equally well in a machine learning orchestration space. What you're doing is you're moving data between different locations, between different tools, and then tasking different types of tools to act on that data. So extending it made logical sense from an implementation perspective. And a lot of my focus at Astronomer is really to explain how Airflow can be used well in a machine learning context. It is being used well, it is being used a lot, by the customers that we have and also by users of the open source version. But it's really being able to explain to people why it's a natural extension for it and how well it fits into that. And a lot of it is also extending some of the infrastructure capabilities that Astronomer provides to those customers, for them to be able to run some of the more platform-specific requirements that come with doing machine learning pipelines. >> Let's get into some of the things that make Astronomer unique. Jeff, sticking with you, when you're in customer conversations, what are some of the key differentiators that you articulate to customers? >> So a lot of it is that we are not specific to one cloud provider. So we have the ability to operate across all of the big cloud providers. And I'm certain we have the best developers, who understand how best-practice implementations for data orchestration work. So we spend a lot of time talking not just to the business users of the product about business outcomes, but also to the technical people, helping them better implement things that they may have come across in a Stack Overflow article, or that haven't necessarily kept pace with how the product has evolved.
So it's the ability to run it wherever you need to run it, and also our ability to help you, the customer, better implement and understand those workflows, that I think are two of the primary differentiators that we have. >> Lisa: Got it. >> I'll add another one if you don't mind. >> You can go ahead, Steven. >> It's lineage and dependencies between workflows. One thing we've done is to augment core Airflow with lineage services. So we use the OpenLineage framework, another open source framework, for tracking datasets as they move from one workflow to another, one team to another, one data source to another. It's a really key component of what we do, and we bundle that within the service, so that as a developer or as a production engineer you really don't have to worry about lineage, it just happens. Jeff may show us some of this later: you can actually see, as data flows from source through to a data warehouse and out through a Python notebook to produce a predictive model or a dashboard, how those data products relate to each other. And when something goes wrong, figure out what upstream maybe caused the problem, or if you're about to change something, figure out what the impact is going to be on the rest of the organization. So lineage is a big deal for us. >> Got it. >> And just to add on to that, the other thing to think about is that traditional Airflow is actually a complicated implementation. It required quite a lot of time spent understanding what was almost a bespoke language that you needed to be able to develop in to write these DAGs, which are the fundamental pipelines. So part of what we are focusing on is tooling that makes it more accessible to, say, a data analyst or a data scientist who doesn't have, and doesn't really want to gain, the necessary background in how the semantics of Airflow DAGs work, to still be able to get the benefit of what Airflow can do.
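Steven's point about upstream causes and downstream impact boils down to graph traversal over dataset-level lineage. Here is a hedged sketch of the idea; the dataset names are made up, and real deployments build this graph from OpenLineage run events rather than hand-written dicts:

```python
# Illustrative sketch of a dataset-lineage query: given a lineage graph,
# find everything downstream of a dataset. Names are hypothetical; real
# systems derive this graph from OpenLineage events.
edges = {
    "source.orders": ["warehouse.orders"],
    "warehouse.orders": ["model.recommendations", "dashboard.sales"],
    "model.recommendations": [],
    "dashboard.sales": [],
}

def downstream(dataset):
    """All datasets impacted if `dataset` changes or breaks."""
    seen, stack = set(), [dataset]
    while stack:
        for child in edges.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

print(sorted(downstream("source.orders")))
```

The same traversal run over reversed edges answers the other question Steven raises: what upstream dataset may have caused a downstream failure.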
So there are new features and capabilities built into the Astronomer cloud platform that effectively abstract away and remove the need to understand some of the deep work that goes on. But you can still do it, you still have that capability, but we are expanding it to be able to have orchestrated and repeatable processes accessible to more teams within the business. >> In terms of accessibility to more teams in the business, you talked about data scientists, data analysts, developers. Steven, I want to talk to you, as the chief data officer: are you having more and more conversations with that role, and how is it emerging and evolving within your customer base? >> Hmm. That's a good question, and it is evolving, because I think if you look historically at the way that Airflow has been used, it's often from the ground up. You have individual data engineers, or maybe single data engineering teams, who adopt Airflow 'cause it's very popular. Lots of people know how to use it, and they bring it into an organization and say, "Hey, let's use this to run our data pipelines." But then increasingly, as you turn from pure workflow management and job scheduling to the larger topic of orchestration, you realize it gets pretty complicated. You want to have coordination across teams, and you want to have standardization for the way that you manage your data pipelines. And so having a managed service for Airflow that exists in the cloud is easy to spin up as you expand usage across the organization. And thinking long term about that in the context of orchestration, that's where I think the chief data officer or the head of analytics tends to get involved, because they really want to think of this as a strategic investment that they're making. Not just per-team individual Airflow deployments, but a network of data orchestrators. >> That network is key. Every company these days has to be a data company. We talk about companies being data driven. It's a common word, but it's true.
Whether it is a grocer or a bank or a hospital, they've got to be data companies. So talk to me a little bit about Astronomer's business model. How is this available? How do customers get their hands on it? >> Jeff, go ahead. >> Yeah, yeah. So we have a managed cloud service, and we have two modes of operation. One, you can bring your own cloud infrastructure. So you can say, here is an account in, say, AWS or Azure, and we can go and deploy the necessary infrastructure into that. Or alternatively we can host everything for you, so it becomes a full SaaS offering. But we then provide a platform that connects at the backend to your internal IdP process. So however you are authenticating users, we make sure that the correct people are accessing the services that they need, with role-based access control. From there we are deploying, through Kubernetes, the different services and capabilities into either your cloud account or into an account that we host. And from there Airflow does what Airflow does, which is its ability to then reach out to different data systems and data platforms and to then run the orchestration. We make sure we do it securely, we have all the necessary compliance certifications required for GDPR in Europe and HIPAA in the US, and a whole host of others. So it is a secure platform that can run in a place that you need it to run, but it is a managed Airflow that includes a lot of the extra capabilities, like the cloud developer environment and the OpenLineage services, to enhance the overall Airflow experience. >> Enhance the overall experience. So Steven, going back to you, if I'm a Conde Nast or another organization, what are some of the key business outcomes that I can expect? One of the things I think we've learned during the pandemic is that access to real-time data is no longer a nice-to-have for organizations. It's really an imperative.
It's that demanding consumer that wants to have that personalized, customized, instant access to a product or a service. So if I'm a Conde Nast or I'm one of your customers, what can I expect my business to be able to achieve as a result of data orchestration? >> Yeah, I think in a nutshell it's about providing a reliable, scalable, and easy to use service for developing and running data workflows. And talking of demanding customers, I'm actually a customer myself. As you mentioned, I'm the head of data for Astronomer, and you won't be surprised to hear that we actually use Astronomer and Airflow to run all of our data pipelines. And so I can actually talk about my experience. When I started I was of course familiar with Airflow, but it always seemed a little bit unapproachable to me if I was introducing it to a new team of data scientists. They don't necessarily want to have to think about learning something new. But I think because of the layers that Astronomer has provided with our Astro service around Airflow, it was pretty easy for me to get up and running. Of course I've got an incentive for doing that, I work for the Airflow company, but we went from about 500 data tasks that we were running on a daily basis at the beginning of last year to about 15,000 every day. We run something like a million data operations every month within my team. And so as one outcome, just the ability to spin up new production workflows, essentially in a single day, you go from an idea in the morning to a new dashboard or a new model in the afternoon. That's really the business outcome: just removing that friction to operationalizing your machine learning and data workflows. >> And I imagine too, oh, go ahead, Jeff. >> Yeah, I think to add to that, one of the things that becomes part of the business cycle is repeatable capabilities for things like reporting, for things like new machine learning models.
And the impediment that has existed is that it's difficult to take that from an analyst team, or a data science team, who provide it to the data engineering team, who then have to work the workflow all the way through. What we're trying to unlock is the ability for those teams to directly get access to scheduling and orchestrating capabilities, so that a business analyst can have a new report for C-suite execs that needs to be done once a week, but the time to repeatability for that report is much shorter. So it is then immediately in the hands of the person that needs to see it. It doesn't have to go into a long list of to-dos for a data engineering team that's already overworked, such that they eventually get to it in a month's time. So that is also a part of it: orchestration, I think, is fairly well understood, and a lot of people get the benefit of being able to orchestrate things within a business, but having more people be able to do it, and shortening the time to that repeatability, is one of the main benefits of good managed orchestration. >> So a lot of workforce productivity improvements in what you're doing to simplify things, giving more people access to data to be able to make those faster decisions, which ultimately helps the end user on the other end to get that product or the service that they're expecting. Jeff, I understand you have a demo that you can share, so we can kind of dig into this. >> Yeah, let me take you through a quick look of how the whole thing works. So our starting point is our cloud infrastructure. This is the login. You go to the portal. You can see there's a bunch of workspaces that are available. Workspaces are like individual places for people to operate in.
I'm not going to delve into all the deep technical details here, but the starting point for a lot of our data science customers is what we call our Cloud IDE, which is a web-based development environment for writing and building out DAGs without actually having to know how the underpinnings of Airflow work. This is an internal one, something that we use. You have a notebook-like interface that lets you write Python code and SQL code, and a bunch of specific bespoke types of blocks if you want. They all get pulled together to create a workflow. So this is a workflow, which gets compiled to something that looks like a complicated set of Python code, which is the DAG. I then have a CI/CD pipeline where I commit this through to my GitHub repo. So this comes to a repo here, which is where the DAGs that I created in the previous step exist. I can then go and say, all right, I want to see how those particular DAGs have been running. We then get to the actual Airflow part. So this is the managed Airflow component. We add the ability for teams to fairly easily bring up an Airflow instance and write code inside our notebook-like environment to get it into that instance. So you can see it's been running. That same process that we built here, that graph, ends up here inside this, but you don't need to know how the fundamentals of Airflow work in order to get this going. Then we can run one of these; it runs in the background and we can manage how it goes. And every time this runs, it's emitting to a process underneath, which is the OpenLineage service, the lineage integration that allows me to come in here and have a look and see that this was that same graph that we built, but now it's the historic version. So I know where things started, where things are going, and how it ran. And then I can also do a comparison.
So if I want to see how this particular run worked compared to one historically, I can grab one from a previous date and it will show me the comparison between the two. So that combination of managed Airflow, getting Airflow up and running very quickly, with the Cloud IDE that lets you write code without having to know how to get something into a repeatable format, get that into Airflow, and have it attached to the lineage process, adds up to a complete end-to-end orchestration process for any business looking to get the benefit from orchestration. >> Outstanding. Thank you so much, Jeff, for digging into that. So one of my last questions, Steven, is for you. This is exciting. There's a lot that you guys are enabling organizations to achieve here to really become data-driven companies. So where can folks go to get their hands on this? >> Yeah, just go to astronomer.io and we have plenty of resources. If you're new to Airflow, you can read our documentation, our guides to getting started. We have a CLI that you can download that is really, I think, the easiest way to get started with Airflow. But you can actually sign up for a trial. You can sign up for a guided trial where our teams, we have a team of experts, really the world experts on getting Airflow up and running, will take you through that trial and allow you to actually kick the tires and see how this works with your data. And I think you'll see pretty quickly that it's very easy to get started with Airflow, whether you're doing that from the command line or doing that in our cloud service. And all of that is available on our website. >> astronomer.io. Jeff, last question for you. What are you excited about? There's so much going on here. What are some of the things, maybe you can give us a sneak peek, coming down the road here that prospects and existing customers should be excited about?
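The run comparison Jeff demonstrates is, at its core, a diff over run metadata. Here is a hedged sketch of the idea, with invented task names and durations (the actual product compares full lineage graphs and run history, not just timings):

```python
# Illustrative sketch of comparing a pipeline run against a historic one,
# as in the demo. Task names and durations are hypothetical.
def compare_runs(baseline, current):
    """Per-task duration change in seconds (current minus baseline),
    for tasks present in both runs."""
    return {task: current[task] - baseline[task]
            for task in baseline.keys() & current.keys()}

last_week = {"extract": 42, "transform": 120, "load": 30}
today = {"extract": 45, "transform": 95, "load": 30}
print(compare_runs(last_week, today))
```

A negative delta flags a task that sped up, a positive one a task that slowed down; surfacing this automatically per run is what turns historic lineage into something an engineer can act on.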
>> I think a lot of the development around the data awareness components. So one of the things that's traditionally been complicated with orchestration is you leave your data in the place that you're operating on, and we're starting to have more data processing capability being built into Airflow. And from an Astronomer perspective, we are adding more capabilities around working with larger datasets, doing bigger data manipulation inside the Airflow process itself. And that lends itself to better machine learning implementation. So as we start to grow and as we start to get better in the machine learning context, well, in the data awareness context, it unlocks a lot more capability to implement proper machine learning pipelines. >> Awesome, guys. Exciting stuff. Thank you so much for talking to me about Astronomer, machine learning, data orchestration, and really the value in it for your customers. Steve and Jeff, we appreciate your time. >> Thank you. >> My pleasure, thanks. >> And we thank you for watching. This is season three, episode one of our ongoing series covering exciting startups from the AWS ecosystem. I'm your host, Lisa Martin. You're watching theCUBE, the leader in live tech coverage. (upbeat music)
Accelerating Automated Analytics in the Cloud with Alteryx
>>Alteryx is a company with a long history that goes all the way back to the late 1990s. Now, the one consistent theme over 20-plus years has been that Alteryx has always been a data company. Early in the big data and Hadoop cycle, it saw the need to combine and prep different data types so that organizations could analyze data and take action. Alteryx and similar companies played a critical role in helping companies become data-driven. The problem was the decade of big data brought a lot of complexities and required immense skills just to get the technology to work as advertised. This in turn limited the pace of adoption and the number of companies that could really lean in and take advantage. The cloud began to change all that and set the foundation for today's theme, the era of digital transformation. We hear that phrase a ton, digital transformation. >> People used to think it was a buzzword, but of course we learned from the pandemic that if you're not a digital business, you're out of business. And a key tenet of digital transformation is democratizing data, meaning enabling not just hyper-specialized experts, but anyone, business users, to put data to work. Now back to Alteryx. The company has embarked on a major transformation of its own. Over the past couple of years, it brought in new management, changed the way in which it engages with customers with a new subscription model, and it's top-graded its talent pool. 2021 was even more significant because of two acquisitions that Alteryx made: Hyper Anna and Trifacta. Why are these acquisitions important? Well, traditionally Alteryx sold to business analysts that were part of the data pipeline. These were fairly technical people who had certain skills and were trained in things like writing Python code. With Hyper Anna, Alteryx has added a new persona, the business user, anyone in the business who wanted to gain insights from data and, or let's say, use AI without having to be a deep technical expert.
>>And then Trifacta, a company started in the early days of big data by CUBE alum Joe Hellerstein and his colleagues at Berkeley. They knocked down the data engineering persona, and this gives Alteryx a complementary extension into IT, where things like governance and security are paramount. So as we enter 2022, the post-isolation economy is here, and we do so with a digital foundation built on the confluence of cloud-native technologies, data democratization, and machine intelligence, or AI, if you prefer. And Alteryx is entering that new era with an expanded portfolio, new go-to-market vectors, a recurring revenue business model, and a brand-new outlook on how to solve customer problems and scale a company. My name is Dave Vellante with theCUBE and I'll be your host today. Over the next hour, we're going to explore the opportunities in this new data market, and we have three segments where we dig into these trends and themes. First we'll talk to Jay Henderson, vice president of product management at Alteryx, about cloud acceleration and simplifying complex data operations. Then we'll bring in Suresh Vittal, who's the chief product officer at Alteryx, and Adam Wilson, the CEO of Trifacta, which of course is now part of Alteryx. And finally, we'll hear about how Alteryx is partnering with Snowflake and the ecosystem, how they're integrating with data platforms like Snowflake, and what this means for customers. And we may have a few surprises sprinkled into the conversation as well. Let's get started. >>We're kicking off the program with our first segment. Jay Henderson is the vice president of product management at Alteryx, and we're going to talk about the trends in data, where we came from, how we got here, where we're going. We get some launch news. Well, Jay, welcome to theCUBE. >>Great to be here, really excited to share some of the things we're working on. >>Yeah. Thank you.
So look, you have a deep product background, product management, product marketing, you've done strategy work. You've been around software and data your entire career, and we're seeing the collision of software, data, cloud, and machine intelligence. Let's start with the customer and maybe we can work back from there. So if you're an analytics or data executive in an organization, Jay, what's your north star? Where are you trying to take your company from a data and analytics point of view? >> Yeah, I mean, you know, look, I think all organizations are really struggling to get insights out of their data. I think one of the things that we see is you've got digital exhaust creating large volumes of data, and storage is really cheap, so it doesn't cost them much to keep it. And that results in a situation where the organization is, you know, drowning in data but somehow still starving for insights. And so I think, uh, you know, when I talk to customers, they're really excited to figure out how they can put analytics in the hands of every single person in their organization, and really start to democratize the analytics, um, and, you know, let the business users and the whole organization get value out of all that data they have. >> And we're going to dig into that throughout this program. Data, I like to say, is plentiful; insights, not always so much. Tell us about your launch today, Jay, and thinking about the trends that you just highlighted, the direction that your customers want to go and the problems that you're solving, what role does the cloud play in what you're launching? How does that fit in? >> Yeah, we're really excited today. We're launching the Alteryx Analytics Cloud. That's really a portfolio of cloud-based solutions that have all been built from the ground up to be cloud native, um, and to take advantage of things like browser-based access, so that it's really easy to give anyone access, including folks on a Mac.
Um, it, you know, it also lets you take advantage of elastic compute so that you can do, you know, in-database processing and cloud-native, um, solutions that are gonna scale to solve the most complex problems. So we've got a portfolio of solutions, things like Designer Cloud, which is our flagship Designer product in a browser and on the cloud, but we've got Alteryx Machine Learning, which helps up-skill regular old analysts with advanced machine learning capabilities. We've got Auto Insights, which brings business users into the fold and automatically unearths insights using AI and machine learning. And we've got our latest addition, which is Trifacta, that helps data engineers do data pipelining and really, um, you know, create a lot of the underlying data sets that are used in some of this, uh, downstream analytics. >> Let's dig into some of those roles if we could a little bit. I mean, traditionally Alteryx has served the business analysts, and that's what Designer Cloud is fit for, I believe. And you've explained, you know, you've expanded that scope into the business user with Hyper Anna. And in a moment we're going to talk to Adam Wilson and Suresh, uh, about Trifacta, and that recent acquisition takes you, as you said, into the data engineering space and IT. But in thinking about the business analyst role, what's unique about Designer Cloud, and how does it help these individuals? >> Yeah, I mean, you know, really, I go back to some of the feedback we've had from our customers, which is, um, you know, they oftentimes have dozens or hundreds of seats of our Designer desktop product. You know, really, as they look to take the next step, they're trying to figure out, how do I give access to those types of analytics to thousands of people within the organization? And Designer Cloud is really great for that. You've got the browser-based interface.
So if folks are on a Mac, they can really easily just pop open the browser and get access to all of those, uh, prep and blend capabilities, to a lot of the analysis we're doing. Um, it's a great way to scale up access to the analytics and then start to put it in the hands of really anyone in the organization, not just those highly skilled power users. >> Okay, great. So now then you add in the Hyper Anna acquisition. So now you're targeting the business user. Trifacta comes into the mix, that deeper IT angle that we talked about. How does this all fit together? How should we be thinking about the new Alteryx portfolio? >> Yeah, I mean, I think it's pretty exciting. Um, you know, when you think about democratizing analytics and providing access to all these different groups of people, um, you've not been able to do it through one platform before. Um, you know, it's not going to be one interface that meets the needs of all these different groups within the organization. You really do need purpose-built, specialized capabilities for each group. And finally, today, with the announcement of the Alteryx Analytics Cloud, we brought together all of those different capabilities, all of those different interfaces, into a single end-to-end application. So really finally delivering on the promise of providing analytics to all. >> How much of this have you been able to share with your customers and maybe your partners? I mean, I know it's fairly new, but if you've been able to get any feedback from them, what are they saying about it? >> Uh, I mean, it's pretty amazing. Um, we ran an early access, limited availability program that let us put a lot of this technology in the hands of over 600 customers, um, over the last few months. So we have gotten a lot of feedback. I tell you, um, it's been overwhelmingly positive. I think organizations are really excited to unlock the insights that have been hidden in all this data.
They're excited to be able to use analytics in every decision they're making, so that the decisions they make are more informed and produce better business outcomes. And this idea that they're going to move from dozens to hundreds or thousands of people who have access to these kinds of capabilities, I think, has been a really exciting thing that is going to accelerate the transformation these customers are on. >> Yeah, those are good numbers for preview mode. Let's talk a little bit about vision. So democratizing data is the ultimate goal, which frankly has been elusive for most organizations over time. How is your cloud going to address the challenges of putting data to work across the entire enterprise? >> Yeah, I tend to think about the future and the investments we're making in our products and our roadmap across four big themes, and these are really enduring themes that you're going to see us making investments in over the next few years. The first is cloud centricity. Data gravity has been moving to the cloud. We need to be able to provide access, to be able to ingest and manipulate that data, to be able to write back to it, to provide cloud solutions. So the first one is really around cloud centricity. The second is around big data fluency. Once you have all of the data, you need to be able to manipulate it in a performant manner, so having the elastic cloud infrastructure and in-database processing is so important. The third is around making AI a strategic advantage: getting everyone involved in accessing AI and machine learning to unlock those insights, getting it out of the hands of the small group of data scientists and putting it in the hands of analysts and business users. And then the fourth thing is really providing access across the entire organization.
IT and data engineers, as well as business owners and analysts. So cloud centricity, big data fluency, AI as a strategic advantage, and personas across the organization are really the four big themes you're going to see us working on over the next few months and the coming year. >> That's good, thank you for that. So, on a related question, how do you see data organizations evolving? Traditionally you've had monolithic organizations, with very specialized, I might even say hyper-specialized, roles, and your mission of course is the customer. You and your customers want to democratize the data. And so it seems logical that domain leaders are going to take more responsibility for data life cycles and data ownership, low code becomes more important, and perhaps this challenges the historically highly centralized and really specialized roles that I just talked about. How do you see that evolving, and what role will Alteryx play? >> Yeah, I think we'll see a more federated system start to emerge. Those centralized groups are going to continue to exist, but they're going to start to empower, in a much more decentralized way, the people who are closer to the business problems and have better business understanding. I think that's going to let the centralized, highly skilled teams work on problems that are of higher value to the organization, the kinds of problems where a one or two percent lift in the model results in millions of dollars a day for the business. And then by pushing some of the analytics out closer to the edge and closer to the business, you'll be able to apply those analytics in every single decision. So I think you're going to see both the decentralized and centralized models start to work in harmony, in almost a federated sort of way.
And the exciting thing for us at Alteryx is that we want to facilitate that. We want to give analytic capabilities and solutions to both groups and types of people. We want to help them collaborate better and drive business outcomes with the analytics they're using. >> Yeah. My take, and I wonder if you could comment, is that to me the technology should be an operational detail, and it has been the dog that wags the tail, or maybe the other way around. You mentioned digital exhaust before. Essentially it's digital exhaust coming out of operational systems that then somehow, eventually, ends up in the hands of the domain users. And I wonder if increasingly we're going to see those domain users, those line-of-business experts, get more access. That's your goal. And then even go beyond analytics and start to build data products that could be monetized. Maybe it's going to take a decade to play out, but that is sort of a new era of data. Do you see it that way? >> Absolutely. We're actually making big investments in our products and capabilities to be able to create analytic applications, and to enable somebody who's an analyst or business user to create an application on top of the data and analytics layers that they have, really to help democratize the analytics and to help prepackage some of the analytics that can drive more insights. So I think that's definitely a trend we're going to see more of. >> Yeah. And to your point, if you can federate the governance and automate that, then that can happen. That's a key part of it, obviously. So, all right, Jay, we have to leave it there. Up next, we take a deep dive into the Alteryx acquisition of Trifacta with Adam Wilson, who led Trifacta for more than seven years, and Suresh, the chief product officer at Alteryx, to explain the rationale behind the acquisition and how it's going to impact customers. Keep it right there.
You're watching theCUBE, your leader in enterprise tech coverage. >> It's go time. Get ready to accelerate your data analytics journey with a unified, cloud-native platform that's accessible for everyone, on the go from home to office and everywhere in between. Effortless analytics to help you go from ideas to outcomes in no time. It's your time to shine. It's Alteryx Analytics Cloud time. >> Okay, we're here with Suresh, who's the chief product officer at Alteryx, and Adam Wilson, the CEO of Trifacta, which of course became part of Alteryx in a deal that just closed this quarter. Gentlemen, welcome. >> Great to be here. >> Okay, Suresh, let me start with you. In my opening remarks, I talked about Alteryx's traditional position serving business analysts and how the Hyper Anna acquisition brought you deeper into the business user space. What does Trifacta bring to your portfolio? Why'd you buy the company? >> Yeah, thank you for the question. We see a massive opportunity in helping brands democratize the use of analytics across their business. Every knowledge worker, every individual in the company, should have access to analytics. It's no longer optional as they navigate their businesses. With that in mind, we know Designer and the products Alteryx has been selling for the past decade or so do a really great job addressing the business analyst. With Hyper Anna, now renamed Alteryx Auto Insights, we even speak to the business owner, the line-of-business owner who's looking for insights that aren't revealed in traditional dashboards and so on. But we see this opportunity of really helping the data engineering teams and IT organizations to also make better use of analytics, and that's where Trifacta comes in for us. Trifacta has the best data engineering cloud on the planet.
They have an established track record of working across multiple cloud platforms and helping data engineers do better data pipelining and work better with this massive cloud transformation that's happening in every business. And so Trifacta made so much sense for us. >> Yeah, thank you for that. I mean, look, you could have built it yourself; it would have taken who knows how long. So definitely a great time-to-market move. Adam, I wonder if we could dig into Trifacta some more. I remember interviewing Joe Hellerstein in the early days. You've talked about this as well on theCUBE: coming at the problem of taking data from raw to refined to an experience point of view. And Joe, in the early days, talked about flipping the model and starting with data visualization, something Jeffrey Heer was expert at. So maybe explain how we got here. We used to have this cumbersome process of ETL, and you and maybe some others changed that model to EL and then T. Explain how Trifacta really changed the data engineering game. >> Yeah, that's exactly right, Dave. It's been a really interesting journey for us, because I think the original hypothesis coming out of the campus research at Berkeley and Stanford that really birthed Trifacta was: why is it that the people who know the data best can't do the work? Why has this become the exclusive purview of the highly technical? And can we rethink this and make this a user experience problem, powered by machine learning, that will take some of the more complicated things people want to do with data and really help to automate those, so a broader set of users can really see for themselves and help themselves?
And I think there was a lot of pent-up frustration out there, because people have been told for a decade now to be more data-driven, and the whole time they're saying: well, then give me the data, in the shape that I can use it, with the right level of quality, and I'm happy to be. But don't tell me to be more data-driven and then not empower me to get in there and actually start to work with the data in meaningful ways. And so that was really the origin story of the company. And as we saw over the course of the last five, six, seven years, there was real excitement to embrace this idea of trying to think about data engineering differently, trying to democratize the ETL process, and also to leverage all these exciting new engines and platforms out there that allow for processing ever more diverse data sets, ever larger data sets, in new and interesting ways. And that's where a lot of the push-down, or ELT, approaches have really won the day. And that for us was a hallmark of the solution from the very beginning. >> Yeah, this is a huge point you're making. First of all, it's a large business, probably about a hundred-billion-dollar TAM. And the point you're making is that we've contextualized most of our operational systems, but the big data pipeline hasn't gotten there. Maybe we could talk about that a little bit, because democratizing data is nirvana, but it's been historically very difficult. You've got a number of companies, it's very fragmented, and they're all trying to attack their little piece of the problem to achieve an outcome, but it's been hard.
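Adam's ETL-versus-ELT distinction can be sketched in a few lines of Python. This is a minimal illustration, not anything from Trifacta: the standard library's sqlite3 stands in for a cloud warehouse, and the table and column names are invented for the example.

```python
# ETL vs. ELT sketch. sqlite3 stands in for a cloud warehouse;
# the raw_orders table and its columns are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)",
                 [("west", 120.0), ("west", 80.0), ("east", 50.0)])

# ETL style: pull the rows out and transform in application code.
rows = conn.execute("SELECT region, amount FROM raw_orders").fetchall()
etl_totals = {}
for region, amount in rows:
    etl_totals[region] = etl_totals.get(region, 0.0) + amount

# ELT style: load first, then push the transform down to the engine,
# so the (elastic) warehouse does the heavy lifting.
elt_totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM raw_orders GROUP BY region"))

assert etl_totals == elt_totals  # same answer; only the work moved
```

The point of pushdown is the last query: the aggregation runs where the data lives, so the result set that crosses the wire is small even when the raw table is enormous.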
And so what's going to be different about Alteryx as you bring these puzzle pieces together? How is this going to impact your customers? Who would like to take that one? >> Yeah, maybe I'll take a crack at it, and Adam can add on. There hasn't been a single platform for analytics automation in the enterprise, right? People have relied on different products to solve smaller problems across this analytics-automation and data-transformation domain. And I think Alteryx uniquely has that opportunity. We've got 7,000-plus customers who rely on Alteryx for data management, for analytics, for AI and ML, for transformations, for reporting and visualization, for automated insights, and so on. And so by bringing in Trifacta, we have the opportunity to scale this even further, solve for more use cases, and expand the scenarios where it's applied, across multiple personas. We just talked about the data engineers; they are really a growing stakeholder in this transformation of data and analytics. >> Yeah, good. Maybe we can stay on this for a minute, because you're right, you bring it together. Now you've got at least three personas: the business analyst; the end user, slash business user; and now the data engineer, which is really an IT role in a lot of companies. And you've used this term, the data engineering cloud. What is that? How is it going to integrate with, or support, these other personas? And how is it going to integrate into the broader ecosystem of clouds and cloud data warehouses or any other data stores? >> Yeah, that's great. I think for us, we really looked at this and said: we want to build an open and interactive cloud platform for data engineers to collaboratively profile, pipeline, and prepare data for analysis. And that really meant collaborating with the analysts in the line of business.
And so that's a big reason why this combination is so magic, because ultimately, if we can get the data engineers who are creating the data products together with the analysts in the line of business who are driving a lot of the decision-making, and allow for what I would describe as collaborative curation of the data, you start to see increasing returns to scale as this rolls out. I just think that is an incredibly powerful combination and, frankly, something the market has not cracked the code on yet. And so when I sat down with Suresh and with Mark and the team at Alteryx, that was really part of the big idea, the big vision, that was painted and got us really energized about the acquisition and about the potential of the combination. >> And you're obviously riding the cloud and the cloud-native wave. But specifically, you know, I almost don't even want to call it a data warehouse anymore, because when you look at what, for instance, Snowflake is doing, of course their marketing is around the data cloud, but I actually think there's real justification for that, because it's not like the traditional data warehouse, right? It's simplified: get there fast, and you don't necessarily have to go through the central organization to share data. It's really all about simplification, right? Isn't that really what the democratization comes down to? >> Yeah, it's simplification and collaboration. What Adam said resonates with me deeply. Analytics is one of those massive disciplines inside an enterprise that's really had the weakest of tools.
We just haven't had the interfaces to collaborate with, and truly this was always Alteryx's superpower: helping the analysts get more out of their data, get more out of the analytics. Imagine a world where these people are collaborating and sharing insights in real time, sharing workflows, getting access to new data sources, understanding data models better, curating those insights, to borrow Adam's phrase again. I think that creates real value inside the organization, because frankly, in scaling analytics and democratizing analytics and data, we're still in such early phases of this journey. >> So how should we think about Designer Cloud, which from Alteryx has really been the on-prem, server, and desktop offering, while of course Trifacta works with cloud data warehouses? How should we think about those two products? >> Yeah, I think you should think about them as very complementary. Designer Cloud really shares a lot of DNA and heritage with Designer desktop: the low-code tooling and that interface that really appeals to the business analyst, and a lot of the things they do well. We've also built it with interoperability in mind. So if you started building your workflows in Designer desktop and you want to share that with Designer Cloud, we want to make it super easy for you to do that. And over time, and we're only a week into this alliance with Trifacta, we have to get deeper insight into what the data engineer really needs and what the business analyst really needs, and how Designer Cloud and Trifacta really support both of those requirements, while continuing to build on the amazing Trifacta cloud platform.
>> I think that's one of the things that creates a lot of opportunity as we go forward, because ultimately Trifacta took a platform-first mentality to everything that we built: thinking about openness and extensibility, and how over time people could build things on top of Trifacta, whether a variety of analytic tool chains or analytic applications. And so when you think about Alteryx now starting to move some of its capabilities, or to provide additional capabilities, in the cloud, Trifacta becomes a platform that can accelerate all of that work and create a cohesive set of cloud-based services that share a common platform. And that maintains independence, because both companies have been fiercely independent, really giving people choice. So whether you're picking one cloud platform or another, whether you're running things on the desktop, whether you're running in hybrid environments, no matter what your decision, you're always in a position to be able to get at your data. You're always in a position to be able to cleanse, transform, shape, and structure that data, and ultimately to deliver the analytics that you need. And so in that sense, this again is another reason why the combination fits so well together, giving people the choice as they think about their analytics strategy and their platform strategy going forward. >> Yeah, I have to chuckle, but one of the reasons I always liked Alteryx is because you kind of did the little end run on IT. IT can be a blocker sometimes, but that created problems, right? Because the organization said: wow, this big data stuff has taken off, but we need security, we need governance.
And it's interesting, because ETL has been complex, whereas the visualization tools really weren't great at governance and security; it took some time there, and that's not their heritage. You're bringing those worlds together, and I'm interested: you guys just had your sales kickoff. What was the reaction like? Maybe Suresh, you could start off, and Adam, you could bring us home. >> Thanks for asking about our sales kickoff. We met in person for the first time in two years, as it has been for many of us, which I think was a real breakthrough as Alteryx has been on its transformation journey. We added Trifacta to the party, and getting all of our sales teams and product organizations to meet in person in one location, I thought that was very powerful for the company. But I tell you, the reception for Trifacta was beyond anything I could have imagined. Adam and I were working on the deal and the core hypotheses and so on, and then you step back and share the vision with the field organization, and it blows you away, the energy that it creates among our sellers and our partners. And I'm sure Adam and his team were mobbed every single day with questions and opportunities to bring them in. But Adam, maybe you should share. >> Yeah, no, it was through the roof. I mean, the amount of energy, and certainly how welcoming everybody was. I think the story makes so much sense together, and culturally the companies are very aligned. It was a real capstone moment to be able to complete the acquisition and to close and announce it at the kickoff event.
And for us, when we really thought about it, the story that we told was: you have this opportunity to really cater to what the end users care about, which is a lot about interactivity and self-service. And at the same time, that's a lot of the goodness that Alteryx has brought through years and years of building a very vibrant community of hundreds of thousands of users. And on the other side, Trifacta brings in this data engineering focus that's really about the governance things you mentioned, and the openness that it cares deeply about. All of a sudden, now you have a chance to put that together into a complete story, where the data engineering cloud and analytics automation come together. And I just think the lights went on for people instantaneously. This is a story that I think the market is really hungry for, and certainly the reception we got from the broader team at kickoff was a great indication. >> Well, I think the story hangs together really well, one of the better ones I've seen in this space, and you guys are coming off a really strong quarter. So congratulations on that, gents. We have to leave it there; I really appreciate your time today. Take a look at this short video, and when we come back, we're going to dig into the ecosystem and the integration into cloud data warehouses, and how leading organizations are creating modern data teams and accelerating their digital businesses. You're watching theCUBE, your leader in enterprise tech coverage. >> This is your data, housed neatly and securely in the Snowflake Data Cloud.
And all of it has potential: the potential to solve complex business problems, deliver personalized financial offerings, protect supply chains from disruption, cut costs, forecast, grow, and innovate. All you need to do is put your data in the hands of the right people and give it an opportunity. Luckily for you, that's the easy part, because Snowflake works with Alteryx, and Alteryx turns data into breakthroughs. With just a click, your organization can automate analytics with drag-and-drop building blocks, easily access Snowflake data with both SQL and no-code options, share insights powered by Alteryx data science, and push processing to Snowflake for lightning-fast performance. You get answers you can put to work, and your teams get repeatable processes they can share. And that's exciting, because not only is your data no longer sitting around in silos, it's also mobilized for the next opportunity. Turn your data into a breakthrough: Alteryx and Snowflake. >> Okay, we're back here on theCUBE, focusing on the business promise of the cloud: democratizing data, making it accessible, and enabling everyone to get value from analytics, insights, and data. We're now moving into the ecosystem segment, the power of many versus the resources of one. And we're pleased to welcome Barb Huelskamp, senior vice president of partners and alliances at Alteryx, and a special guest, Tarik Dwiek, head of technology alliances at Snowflake. Folks, welcome. Good to see you. >> Thank you. Thanks for having me. Good to see you, Dave. >> Great to see you guys. So cloud migration is one of the hottest topics; it's one of the top initiatives of senior technology leaders. We have survey data with our partner ETR: it's number two behind security, and just ahead of analytics. So we're hovering around all the hot topics here. Barb, what are you seeing with respect to customer cloud migration momentum, and how does the Alteryx partner strategy fit? >> Yeah, sure.
Partners are central to the company's strategy; they always have been. We recognize that our partners have deep customer relationships, and when you connect that with their domain expertise, they're really helping customers on their cloud and business transformation journeys. We've been helping customers achieve their desired outcomes with our partner community for quite some time, and our partner base has been growing an average of 30% year over year. That partner community and strategy now addresses several kinds of partners, spanning solution providers to global SIs and technology partners such as Snowflake, and together we help our customers realize the business promise of their journey to the cloud. Snowflake provides a scalable storage system; Alteryx provides the business-user-friendly front end. So, for example, IT departments depend on Snowflake to consolidate data across systems into one data cloud, and with Alteryx, business users can easily unlock that data in Snowflake, solving real business problems. Our GSI and solution-provider partners are instrumental in providing that end-to-end benefit of a modern analytics stack in the cloud, providing platform guidance, deployment support, and other professional services. >> Great. Let's get a little bit more into the relationship between Alteryx and Snowflake, the partnership, maybe a little bit about the history. What are the critical aspects that we should really focus on? Barb, maybe you could start, and Tarik, please weigh in as well. >> Yeah, so the relationship started in 2020, and Alteryx made a big bet on Snowflake, co-innovating and optimizing cloud use cases together. We are supporting customers who are looking for that modern analytics stack, whether to replace an old one or to implement their first analytics strategy. And our joint customers want to self-serve with data-driven analytics, leveraging all the benefits of the cloud: scalability, accessibility, governance, and optimizing their costs.
Alteryx proudly achieved Snowflake's highest, Elite, tier in their partner program last year, and to do that, we completed a rigorous third-party testing process, which also helped us make some recommended improvements to our joint stack. We wanted customers to have confidence they would benefit from high quality and performance in their investment with us. Then, to help customers get the most value out of the joint solution, we developed two great assets. One is the Alteryx starter kit for Snowflake, and we co-authored a joint best-practices guide. The starter kit contains documentation, business workflows, and videos, helping customers get going more easily with an Alteryx and Snowflake solution. And the best-practices guide is more of a technical document, bringing together experiences and guidance on how Alteryx and Snowflake can be deployed together. Internally, we also built a full catalog of enablement resources; we wanted to tell our account executives more about the value of the Snowflake relationship, how to engage, and some best practices. And now we have hundreds of joint customers, such as Juniper and Sainsbury, who are actively using our joint solution, solving big business problems much faster. >> Cool. Tarik, can you give us your perspective on the partnership? >> Yeah, definitely, Dave. So as Barb mentioned, we've got this long-standing, very successful partnership going back years, with hundreds of happy joint customers. From the beginning, Alteryx helped pioneer the concept of self-service analytics, especially with the use cases we worked on together for data prep for BI users, with tools like Tableau. And as Alteryx has evolved from data prep to a full end-to-end data science platform, it's really opened up a lot more opportunities for our partnership.
Alteryx has invested heavily over the last two years in areas of deep integration, so customers can fully expand their investment in both technologies. Those investments include things like in-database pushdown, so customers can leverage that elastic platform, the Snowflake Data Cloud, with Alteryx orchestrating the end-to-end machine learning workflows. Alteryx also invested heavily in Snowpark, a feature we released last year around this concept of data programmability, so all users, whether they're business analysts or data scientists, can use their tools of choice to consume and get at data. And now, with Alteryx cloud, we think it's going to open up even more opportunities. It's going to be a big year for the partnership. >> Yeah. So, Tarik, we've covered Snowflake pretty extensively, and you initially solved what I used to call, and still call, the snake-swallowing-the-basketball problem, and the cloud data warehouse changed all that because you had virtually infinite resources. So that's obviously one of the problems you guys solved early on. But what are some of the common challenges or patterns or trends that you see with Snowflake customers, and where does Alteryx come in? >> Sure, Dave, there's a handful I can come up with today, the big challenges or trends for us, and Alteryx really helps us across all of them. There are three particular ones I'm going to talk about, the first one being self-service analytics. If we think about it, every organization is trying to democratize data. Every organization wants to empower all their users: the technology users, certainly, but the business users too, right?
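The in-database pushdown Tarik mentions is, at heart, an API that builds SQL for the warehouse to execute instead of fetching rows into the client. A toy sketch of that pattern follows; this is not the real Snowpark API, and every class, method, and table name here is invented for illustration.

```python
# Toy illustration of the pushdown pattern (NOT the actual Snowpark
# API): DataFrame-style calls accumulate lazily, then compile into a
# single SQL statement that the warehouse would run where the data
# lives, rather than pulling rows out to the client.
class PushdownFrame:
    def __init__(self, table, where=None, group=None, agg=None):
        self.table, self.where, self.group, self.agg = table, where, group, agg

    def filter(self, cond):
        # No data is touched; we just remember the predicate.
        return PushdownFrame(self.table, cond, self.group, self.agg)

    def group_by(self, col):
        return PushdownFrame(self.table, self.where, col, self.agg)

    def sum(self, col):
        return PushdownFrame(self.table, self.where, self.group, f"SUM({col})")

    def to_sql(self):
        # Compile the recorded operations into one pushed-down query.
        sql = f"SELECT {self.group}, {self.agg} FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        if self.group:
            sql += f" GROUP BY {self.group}"
        return sql

sql = (PushdownFrame("orders")
       .filter("amount > 100")
       .group_by("region")
       .sum("amount")
       .to_sql())
```

The resulting string is `SELECT region, SUM(amount) FROM orders WHERE amount > 100 GROUP BY region`; in a real integration, that is what gets shipped to the warehouse, so compute scales with the warehouse rather than the client tool.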
I think every organization has realized that if everyone has access to data and everyone can do something with data, it's going to give them a competitive advantage. With Alteryx, we share that vision of putting that power in the hands of everyday users, regardless of their skill sets. With Alteryx Designer, they started out with self-service analytics at the forefront, and we're just scratching the surface. I think there was an analyst report showing that less than 20% of organizations are truly getting self-service analytics to their end users. Now, with Alteryx going to the cloud, we think that's going to be a huge opportunity for us. And that opens up the second challenge, which is machine learning and AI. Every organization is trying to get predictive analytics into every application they have in order to be competitive. And Alteryx has created a platform that caters to both the everyday business user, the quote-unquote citizen data scientist, and, by making it code-friendly, the data scientists who want to get at their notebooks and all the different tools they want to use. They've fully integrated with our Snowpark platform, which I talked about before, so that now we get an end-to-end solution catering to all lines of business. And then finally, there's this concept of data marketplaces. We created Snowflake from the ground up to be able to solve the big data problem and the data sharing problem. And if we look at mobilizing your data, getting access to third-party data sets to enrich your own data sets, to enrich with your suppliers' and your partners' data sets, that's what all customers are trying to do in order to get a more comprehensive 360-degree view within their data applications.
And so with Alteryx, we've been working on third-party data sets and marketplaces for quite some time. Now we're working on how we integrate what Alteryx provides with the Snowflake Data Marketplace, so that we can enrich these great workflows that Alteryx provides. Now we can add third-party data into that workflow. So that opens up a ton of opportunities, Dave. Those are three areas where I can easily see us solving a lot of customer challenges. >> So thank you for that, Tarik. Let's stay on cloud a little bit. I mean, Alteryx is undergoing a major transformation, a big focus on the cloud. How does this cloud launch impact the partnership, Tarik, from Snowflake's perspective? And then Barb, maybe, please add some color. >> Yeah, sure, Dave. Snowflake started as a cloud data platform. Our founders really saw the challenges that customers were having with becoming data-driven, and the biggest challenge was the complexity of managing infrastructure to even be able to get applications off the ground. And so we created something cloud-native. We created it to be a SaaS managed service. So now that Alteryx is moving to the same model, right, a cloud platform, a SaaS managed service, we're just removing more of the friction. So we're going to be able to start to package these end-to-end solutions that are SaaS-based, that are fully managed, so customers can go faster and don't have to worry about all of the underlying complexities of stitching things together, right? So that's what's exciting from my viewpoint. >> And I'll follow up. So as you said, we're investing heavily in the cloud. A year ago, we had two desktop products, and today we have four cloud products. With cloud, we can provide our users with more flexibility.
We want to make it easier for users to leverage their Snowflake data in the Alteryx platform, whether they're using our beloved on-premise solution or the new cloud products. We're committed to that continued investment in the cloud, enabling our joint partner solutions to meet customer requirements wherever they store their data, and in working with Snowflake, we're doing just that. So as customers look for a modern analytics stack, they expect that data to be easily accessible within a fast, secure, and scalable platform. And the launch of our cloud strategy is a huge leap forward in making Alteryx more widely accessible to all users in all types of roles. Our GSI and solution provider partners have asked for these cloud capabilities at scale, and they're excited to better support our customers' cloud and analytics needs. >> How about your go-to-market strategy? How would you describe your joint go-to-market strategy with Snowflake? >> Sure. It's simple. We work backwards from our customers' challenges, right? Driving transformation to solve problems, gain efficiencies, or help them save money. So whether it's with Snowflake or other partner types like GSIs, we've outlined a joint journey together, from recruiting, solution development, activation, and enablement, to strengthening our go-to-market strategies to optimize our results together. We launched an updated partner program, and within that framework we've created new benefits for our partners around opportunity registration and new role-based enablement and training, basically extending everything we do internally for our own go-to-market teams to our partners. We're offering partner marketing resources and funding to reach new customers together. And as a matter of fact, we recently launched a fantastic video with Snowflake. I love this video. It very simply describes the path to insights, starting with your Snowflake data, right? We do joint customer webinars.
We're working on joint hands-on labs and have a wonderful landing page with a lot of assets for our customers. Once we have an interested customer, we engage our respective account managers, collaborating through discovery questions and proofs of concept, really showcasing the desired outcome. And when you combine that with our partners' technology or domain expertise, it's quite powerful. >> Tarik, how do you see it, your go-to-market strategy? >> Yeah, Dave. So we initially sold Snowflake as technology, right? Positioning the architectural differentiators and the scale and concurrency. And as we got up into the larger enterprise customers, we started to see how they solve their business problems using the technology, as well as them coming to us and saying, look, we also want to know how you continue to map back to the specific, prescriptive business problems we're having. And so we shifted to an industry focus last year, and this is an area where Alteryx has been mature probably since their inception, selling to the line of business, right? Having prescriptive use cases that are particular to an industry like financial services, like retail, like healthcare and life sciences. And so, Barb talked about these starter kits, where it's prescriptive, you've got a demo, and a way that customers can get off the ground and running, right? Because we want to be able to shrink that time to market, the time to value, so customers can launch these applications. And we want to be able to tell them specifically how we can map back to their business initiatives. So I see a huge opportunity to align on these industry solutions. As Barb mentioned, we're already doing that, where we've released a few around financial services, and we're working in healthcare and retail as well.
So that is going to be a way for us to allow customers to go even faster and start to map to lines of business with Alteryx. >> Great. Thanks, Tarik. Barb, what can we expect if we're observing this relationship? What should we look for in the coming year? >> A lot. Specifically with Snowflake, we'll continue to invest in the partnership. We're co-innovators in this journey, including Snowpark extensibility efforts, which Tarik will tell you more about shortly. We're also launching these great new strategic solution blueprints and extending them at no charge to our partners. With Snowflake, we're already collaborating with their retail and CPG team on industry blueprints. We're working with their data marketplace team to highlight solutions working with the data in their marketplace. More broadly, as I mentioned, we're relaunching the Alteryx partner program, designed to better support the unique partner types in our global ecosystem, introducing new benefits so that with every partner achievement or investment with Alteryx, we're providing our partners with earlier access to benefits. I could talk about our program for 30 minutes. I know we don't have time. The key message here: Alteryx is investing in our partner community across the business, recognizing the incredible value they bring to our customers every day. >> Tarik, I'll give you the last word. What should we be looking for from you? >> Yeah, thanks, Dave. As Barb mentioned, Alteryx has been at the forefront of innovating with us. They've been integrating to make sure, again, that customers get the full investment out of Snowflake, with things like the in-database pushdown I talked about before. That extensibility is really what we're excited about.
The ability for Alteryx to plug into this extensibility framework that we call Snowpark, and to extend the ways that end users can consume Snowflake, through SQL, which has traditionally been the way you consume Snowflake, as well as Java and Scala, and now Python. So we're excited about those capabilities. And then we're also excited about the ability to plug into the data marketplace to provide third-party data sets, whether that's third-party data sets in financial services or in retail. So now customers can build their data applications from end to end using Alteryx and Snowflake, with a comprehensive 360 view of their customers, of their partners, of even their employees, right? I think it's exciting to see what we're going to be able to do together with these upcoming innovations. >> Great. Barb, Tarik, thanks so much for coming on the program. We've got to leave it right there. In a moment, I'll be back with some closing thoughts and a summary. Don't go away. >> 1,200 hours of wind tunnel testing. 30 million race simulations. 2.4-second pit stops, make that 2.3. Sector times out the wazoo. Velocities, pressures, temperatures. 80,000 components generating 11.8 billion data points, and one analytics platform to make sense of it all. When McLaren needs to turn complex data into insights, they turn to Alteryx. Analytics automation. >> Okay, let's summarize and wrap up the session. We can pretty much agree that data is plentiful, but organizations continue to struggle to get maximum value out of their data investments. The ROI has been elusive. There are many reasons for that: complexity, data trust, silos, lack of talent, and the like. But the opportunity to transform data operations and drive tangible value is immense. Collaboration across various roles and disciplines is part of the answer, as is democratizing data.
This means putting data in the hands of those domain experts who are closest to the customer and really understand where the opportunities exist and how best to address them. We heard from Jay Henderson that we have all this data exhaust, and cheap storage allows us to keep it for a long time. It's true, but as he pointed out, that doesn't solve the fundamental problem. Data is spewing out from our operational systems, but much of it lacks business context for the data teams chartered with analyzing that data. So we heard about the trend toward low-code development and federating data access. The reason this is important is that the business lines have the context, and the more responsibility they take for data, the more quickly and effectively organizations are going to be able to put data to work. We also talked about the harmonization between centralized teams and decentralized data flows. I mean, after all, data by its very nature is distributed. And importantly, as we heard from Adam Wilson and Suresh Vittal, to support this model you have to have strong governance and serve the needs of IT and engineering teams. And that's where the Trifacta acquisition fits into the equation. Finally, we heard about a key partnership between Alteryx and Snowflake and how the migration to cloud data warehouses is evolving into a global data cloud. This enables data sharing across teams and ecosystems and vertical markets at massive scale, all while maintaining the governance required to protect organizations and individuals alike. This is a new and emerging business model that is very exciting and points the way to the next generation of data innovation in the coming decade, where decentralized domain teams get more facile, self-service access to data and take more responsibility for data quality, value, and innovation.
While at the same time, the governance, security, and privacy edicts of an organization are centralized and programmatically enforced throughout the enterprise and its external ecosystem. This is Dave Vellante. All these videos are available on demand at thecube.net and alteryx.com. Thanks for watching Accelerating Automated Analytics in the Cloud, made possible by Alteryx. And thanks for watching theCUBE, your leader in enterprise tech coverage. We'll see you next time.
Alteryx Intro
>> Alteryx is a company with a long history that goes all the way back to the late 1990s. Now, the one consistent theme over 20-plus years has been that Alteryx has always been a data company. Early in the big data and Hadoop cycle, it saw the need to combine and prep different data types so that organizations could analyze data and take action. Alteryx and similar companies played a critical role in helping companies become data-driven. The problem was, the decade of big data brought a lot of complexity and required immense skills just to get the technology to work as advertised. This in turn limited the pace of adoption and the number of companies that could really lean in and take advantage. Now, the cloud began to change all that and set the foundation for today's theme du jour of digital transformation. We hear that phrase a ton, digital transformation. People used to think it was a buzzword, but of course we learned from the pandemic that if you're not a digital business, you're out of business. And a key tenet of digital transformation is democratizing data, meaning enabling not just hyper-specialized experts but anyone, business users, to put data to work. Now, back to Alteryx. The company has embarked on a major transformation of its own over the past couple of years. It brought in new management, changed the way it engages with customers with a new subscription model, and top-graded its talent pool.
2021 was even more significant because of two acquisitions that Alteryx made: Hyper Anna and Trifacta. Why are these acquisitions important? Traditionally, Alteryx sold to business analysts who were part of the data pipeline. These were fairly technical people who had certain skills and were trained in things like writing Python code. With Hyper Anna, Alteryx has added a new persona, the business user: anyone in the business who wants to gain insights from data and, or let's say, use AI without having to be a deep technical expert. And then Trifacta, a company started in the early days of big data by CUBE alum Joe Hellerstein and his colleagues at Berkeley, knocks down the data engineering persona, and this gives Alteryx a complementary extension into IT, where things like governance and security are paramount. So as we enter 2022, the post-isolation economy is here, and we do so with a digital foundation built on the confluence of cloud-native technologies, data democratization, and machine intelligence, or AI if you prefer. And Alteryx is entering that new era with an expanded portfolio, new go-to-market vectors, a recurring revenue business model, and a brand-new outlook on how to solve customer problems and scale a company. My name is Dave Vellante with theCUBE, and I'll be your host today. In the next hour, we're going to explore the opportunities in this new data market, and we have three segments where we dig into these trends and themes. First, we'll talk to Jay Henderson, vice president of product management at Alteryx, about the cloud and accelerating and simplifying complex data operations. Then we'll bring in Suresh Vittal, who's the chief product officer at Alteryx, and Adam Wilson, the CEO of Trifacta, which of course is now part of Alteryx. And finally, we'll hear about how Alteryx is partnering with Snowflake and the ecosystem, how they're integrating with data platforms like Snowflake, and what this means for customers.
And we may have a few surprises sprinkled into the conversation as well. Let's get started.
Extending Vertica with the Latest Vertica Ecosystem and Open Source Initiatives
>> Sue: Hello, everybody. Thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled Extending Vertica with the Latest Vertica Ecosystem and Open Source Initiatives. My name is Sue LeClaire, Director of Marketing at Vertica, and I'll be your host for this webinar. Joining me is Tom Wall, a member of the Vertica engineering team. But before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait. Just type your question or comment in the question box below the slides and click submit. There will be a Q&A session at the end of the presentation, and we'll answer as many questions as we're able to during that time. Any questions that we don't get to, we'll do our best to answer offline. Alternatively, you can visit the Vertica forums to post your questions after the session. Our engineering team is planning to join the forums to keep the conversation going. Also, a reminder that you can maximize your screen by clicking the double-arrow button in the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to view on demand later this week. We'll send you a notification as soon as it's ready. So let's get started. Tom, over to you.
Whether you use these open source projects or not, this is a very exciting new effort that will really help to grow the developer community and enable lots of exciting new use cases. So, every developer out there has probably had to deal with the problem like this. You have some business requirements, to maybe build some new Vertica-powered application. Maybe you have to build some new system to visualize some data that's that's managed by Vertica. The various circumstances, lots of choices will might be made for you that constrain your approach to solving a particular problem. These requirements can come from all different places. Maybe your solution has to work with a specific visualization tool, or web framework, because the business has already invested in the licensing and the tooling to use it. Maybe it has to be implemented in a specific programming language, since that's what all the developers on the team know how to write code with. While Vertica has many different integrations with lots of different programming language and systems, there's a lot of them out there, and we don't have integrations for all of them. So how do you make ends meet when you don't have all the tools you need? All you have to get creative, using tools like PyODBC, for example, to bridge between programming languages and frameworks to solve the problems you need to solve. Most languages do have an ODBC-based database interface. ODBC is our C-Library and most programming languages know how to call C code, somehow. So that's doable, but it often requires lots of configuration and troubleshooting to make all those moving parts work well together. So that's enough to get the job done but native integrations are usually a lot smoother and easier. So rather than, for example, in Python trying to fight with PyODBC, to configure things and get Unicode working, and to compile all the different pieces, the right way is to make it all work smoothly. 
It would be much better if you could just pip install a library and get to work. And with vertica-python, a new Python client library, you can actually do that. So that story probably sounds pretty familiar to a lot of the audience here, because we're all using Vertica. And our challenge as big data practitioners is to make sense of all this stuff, despite those technical and non-technical hurdles. Vertica powers lots of different businesses and use cases across all kinds of different industries and verticals. While there's a lot different about us, we're all here together right now for this talk because we do have some things in common. We're all using Vertica, and we're probably also using Vertica with other systems and tools too, because it's important to use the right tool for the right job. That's a founding principle of Vertica, and it's true today too. In this constantly changing technology landscape, we need lots of good tools and well-established patterns, approaches, and advice on how to combine them so that we can be successful doing our jobs. Luckily for us, Vertica has been designed to be easy to build with and extend in this fashion. Databases as a whole have had this goal from the very beginning. They solve the hard problems of managing data so that you don't have to worry about it. Instead of worrying about those hard problems, you can focus on what matters most to you and your domain: implementing that business logic, solving that problem, without having to worry about all of the intense details of what it takes to manage a database at scale. With the declarative syntax of SQL, you tell Vertica what the answer is that you want. You don't tell Vertica how to get it. Vertica will figure out the right way to do it for you, so that you don't have to worry about it.
So this SQL abstraction is very nice because it's a well-defined boundary: lots of developers know SQL, and it allows you to express what you need without having to worry about those details. So we can be the experts in data management while you worry about your problems. This goes beyond, though, what's accessible through SQL to Vertica. We've got well-defined extension and integration points across the product that allow you to customize this experience even further. So if you want to do things like write your own SQL functions, or extend the database with UDxs, you can do so. If you have a custom data format, maybe a proprietary format or some source system that Vertica doesn't natively support, we have extension points that allow you to use those, making it very easy to do massive, parallel data movement: loading into Vertica, but also exporting from Vertica to send data to other systems. And with new features over time, we can also do the same kinds of things with machine learning models, importing and exporting to tools like TensorFlow. And it's these integration points that have enabled Vertica to build out this open architecture and a rich ecosystem of tools, both open source and closed source, of different varieties, that solve all the different problems that are common in this big data processing world. Whether it's open source streaming systems like Kafka or Spark, or more traditional ETL tools on the loading side, but also BI tools and visualizers and things like that to view and use the data that you keep in your database on the other side. And then of course, Vertica needs to be flexible enough to be able to run anywhere. So you can really take Vertica and use it the way you want, to solve the problems that you need to solve.
What we're really excited to talk about now is that we are taking our new integration projects and making those open source too. In particular, we've got two new open source client libraries that allow you to build Vertica applications for Python and Go. These libraries act as a foundation for all kinds of interesting applications and tools. Upon those libraries, we've also built some integrations ourselves, and we're using these new libraries to power some new integrations with some third-party products. Finally, we've got lots of new examples and reference implementations out on our GitHub page that can show you how to combine all these moving parts in exciting ways to solve new problems. And the code for all these things is available now on our GitHub page, so you can use it however you like, and even help us make it better too. So the first such project that we have is called vertica-python. vertica-python began at our customer, Uber. And then in late 2018, we collaborated with them, took it over, and made vertica-python the first official open source client for Vertica. You can use this to build your own Python applications, or you can use it via tools that were written in Python. Python has grown a lot in recent years, and it's a very common language for solving lots of different problems and use cases in the big data space, from things like DevOps administration and data science or machine learning, to just homegrown applications. We use Python a lot internally for our own QA testing and automation needs. And with the Python 2 end of life that happened at the end of 2019, it was important that we had a robust Python solution to help migrate our internal stuff off of Python 2, and also to provide a nice migration path for all of you, our users, who might be worried about the same problems with your own Python code. So vertica-python is used already by lots of different tools, including Vertica's admintools, now starting with 9.3.1.
It was also used by DataDog to build a Vertica-DataDog integration that allows you to monitor your Vertica infrastructure within DataDog. So here's a little example of how you might use the Python client to do some work. So here we open a connection, we run a query to find out what node we've connected to, and then we do a little data load by running a COPY statement. And this is designed to have a familiar look and feel if you've ever used a Python database client before. So we implement the DB API 2.0 standard, and it feels like a Python package. So that includes things like, it's part of the centralized package manager, so you can just pip install this right now and go start using it. We also have our client for Golang. So this is called vertica-sql-go. And this is a very similar story, just in a different context, or a different programming language. So vertica-sql-go began as a collaboration with the Micro Focus SecOps group, who build Micro Focus's security products, some of which use Vertica internally to provide some of those analytics. So you can use this to build your own apps in the Go programming language, but you can also use it via tools that are written in Go. So most notably, we have our Grafana integration, which we'll talk a little bit more about later, that leverages this new client to provide Grafana visualizations for Vertica data. And Go is another programming language rising in popularity, 'cause it offers an interesting balance of different programming design trade-offs. So it's got good performance, good concurrency and memory safety. And we liked all those things, and we're using it to power some internal monitoring stuff of our own. And here's an example of the code you can write with this client. So this is Go code that does a similar thing. It opens a connection, it runs a little test query, and then it iterates over those rows, processing them using Go data types.
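To give a rough feel for that DB API 2.0 (PEP 249) shape, here's a minimal sketch. It uses the stdlib sqlite3 module as a stand-in, since a live Vertica cluster isn't assumed here; with vertica-python you'd swap in vertica_python.connect(host=..., port=5433, user=..., password=..., database=...) and the cursor code would read the same way.

```python
import sqlite3  # stands in for vertica_python; both follow the PEP 249 shape

# With vertica-python this would instead be something like:
#   conn = vertica_python.connect(host="vertica.example.com", port=5433,
#                                 user="dbadmin", password="...", database="vmart")
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Run a query and fetch a result: the same cursor API either way.
cur.execute("SELECT 1 + 1")
row = cur.fetchone()

# A little data load; with Vertica you'd run a COPY statement here instead.
cur.execute("CREATE TABLE t (x INTEGER)")
cur.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (3,)])
cur.execute("SELECT COUNT(*) FROM t")
count = cur.fetchone()[0]
conn.close()
```

The hostnames and credentials above are placeholders, not real endpoints.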
You get that native look and feel just like you do in Python, except this time in the Go language. And you can go get it the way you usually package things with Go, by running that command there to acquire this package. And it's important to note here, for these projects, we're really doing open source development. We're not just putting code out on our GitHub page. So if you go out there and look, you can see that you can ask questions, you can report bugs, you can submit pull requests yourselves, and you can collaborate directly with our engineering team and the other Vertica users out on our GitHub page. Because it's out on our GitHub page, it allows us to be a little bit faster with the way we ship and deliver functionality compared to the core Vertica release cycle. So in 2019, for example, as we were building features to prepare for the Python 3 migration, we shipped 11 different releases with 40 customer-reported issues filed on GitHub. That was done over 78 different pull requests, and with lots of community engagement as we did so. So lots of people are using this already; as our GitHub badge stats show, there are about 5,000 downloads of this a day by people using it in their software. And again, we want to make this easy, not just to use, but also to contribute to and understand and collaborate with us on. So all these projects are built using the Apache 2.0 license. The master branch is always available and stable with the latest functionality. And you can always build it and test it the way we do, so that it's easy for you to understand how it works and to submit contributions or bug fixes or even features. It uses automated testing, both locally and with pull requests. And for vertica-python, it's fully automated with Travis CI. So we're really excited about doing this, and we're really excited about where it can go in the future, 'cause this offers some exciting opportunities for us to collaborate with you more directly than we ever have before.
You can contribute improvements and help us guide the direction of these projects, but you can also work with each other to share knowledge and implementation details and various best practices. And so maybe you think, "Well, I don't use Python, I don't use Go, so maybe it doesn't matter to me." But I would argue it really does matter. Because even if you don't use these tools and languages, there's lots of amazing Vertica developers out there who do. And these clients do act as low-level building blocks for all kinds of different interesting tools, both in these Python and Go worlds, but also well beyond that. Because these implementations and examples really generalize to lots of different use cases. And we're going to do a deeper dive now into some of these to understand exactly how that's the case and what you can do with these things. So let's take a deeper look at some of the details of what it takes to build one of these open source client libraries. So these database client interfaces, what are they exactly? Well, we all know SQL, but if you look at what SQL specifies, it really only talks about how to manipulate the data within the database. So once you're connected and in, you can run commands with SQL. But these database client interfaces address the rest of those needs. So what does the programmer need to do to actually process those SQL queries? So these interfaces are specific to a particular language or a technology stack, but the use cases and the architectures and design patterns are largely the same between different languages. They all have a need to do some networking and connect and authenticate and create a session. They all need to be able to run queries and load some data and deal with problems and errors. And then they also have a lot of metadata and type mapping, because you want to use these clients the way you use those programming languages, which might be different than the way that Vertica's data types and Vertica's semantics work.
So some of these client interfaces are truly standards, and they are robust enough in terms of what they design and call for to support a truly pluggable driver model, where you might write an application that codes directly against the standard interface, and you can then plug in a different database driver, like a JDBC driver, to have that application work with any database that has a JDBC driver. Most of these interfaces aren't as robust as JDBC or ODBC, but that's okay. 'Cause as good as a standard is, every database is unique for a reason, and so you can't really expose all of those unique properties of a database through these standard interfaces. So Vertica's unique in that it can scale to the petabytes and beyond, and you can run it anywhere in any environment, whether it's on-prem or in the cloud. So surely there's something about Vertica that's unique, and we want to be able to take advantage of that fact in our solutions. So even though these standards might not cover everything, there's often a need, and common patterns arise to solve these problems in similar ways. When there isn't enough of a standard to define those common semantics that different databases might share, what you often see is that tools will invent plugin layers or glue code to compensate, by defining an application-wide standard to cover some of these same semantics. Later on, we'll get into some of those details and show off what exactly that means. So if you connect to a Vertica database, what's actually happening under the covers? You have an application, you have a need to run some queries, so what does that actually look like? Well, probably as you would imagine, your application is going to invoke some API calls in some client library or tool. This library takes those API calls and implements them, usually by issuing some networking protocol operations, communicating over the network to ask Vertica to do the heavy lifting required for that particular API call.
And so these APIs usually do the same kinds of things, although some of the details might differ between these different interfaces. But you do things like establish a connection, run a query, iterate over your rows, manage your transactions, that sort of thing. Here's an example from vertica-python, which just goes into some of the details of what actually happens during the Connect API call. And you can see all these details in our GitHub implementation of this. There's actually a lot of moving parts in what happens during a connection, so let's walk through some of that and see what actually goes on. I might have my API call like this, where I say connect and I give it a DNS name, which is my entire cluster, and I give it my connection details, my username and password. And I tell the Python client to get me a session, give me a connection so I can start doing some work. Well, in order to implement this, what needs to happen? First, we need to do some TCP networking to establish our connection. So we need to understand what the request is, where you're going to connect to and why, by parsing the connection string. And Vertica being a distributed system, we want to provide high availability, so we might need to do some DNS lookups to resolve that DNS name, which might be an entire cluster and not just a single machine, so that you don't have to change your connection string every time you add or remove nodes from the database. So we do some high availability and DNS lookup stuff. And then once we connect, we might do load balancing too, to balance the connections across the different initiator nodes in the cluster, or in a subcluster, as needed. Once we land on the node we want to be at, we might do some TLS to secure our connections. And Vertica supports the industry-standard TLS protocols, so this looks pretty familiar for everyone who's used TLS anywhere before.
So you're going to do a certificate exchange, and the client might send a certificate too, and then you're going to verify that the server is who it says it is, so that you can know that you trust it. Once you've established that connection and secured it, then you can start actually beginning to request a session within Vertica. So you're going to send over your user information like, "Here's my username, here's the database I want to connect to." You might send some information about your application, like a session label, so that you can differentiate on the database, with monitoring queries, what the different connections are and what their purpose is. And then you might also send over some session settings to do things like auto-commit, to change the state of your session for the duration of this connection, so that you don't have to remember to do that with every query that you have. Once you've asked Vertica for a session, before Vertica will give you one, it has to authenticate you. And Vertica has lots of different authentication mechanisms, so there's a negotiation that happens there to decide how to authenticate you. Vertica decides based on who you are and where you're coming from on the network. And then you'll do an auth-specific exchange, depending on what the auth mechanism calls for, until you are authenticated. Finally, Vertica trusts you and lets you in, so you're going to establish a session in Vertica, and you might do some bookkeeping on the client side just to know what happened. So you might log some information, you might record what the version of the database is, you might do some protocol feature negotiation. So if you connect to a version of the database that doesn't support all these protocols, you might decide to turn some functionality off, and that sort of thing. But finally, after all that, you can return from this API call, and then your connection is good to go. So that connection is just one example of many different APIs.
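The DNS lookup and client-side load-balancing steps in that walkthrough can be sketched in a few lines. This is a toy illustration only, not the actual vertica-python implementation, which also handles server-side load balancing, TLS, and the auth negotiation described above; the port 5433 is just Vertica's conventional default.

```python
import random
import socket

def resolve_candidates(host, port):
    # One connection-string DNS name may resolve to many initiator nodes,
    # so the connection string doesn't change as nodes come and go.
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

def pick_initiator(candidates):
    # Naive client-side load balancing: spread sessions across the
    # resolved nodes by picking a random initiator from the set.
    return random.choice(candidates)
```

For example, resolve_candidates("my-cluster.example.com", 5433) would return every node address behind that name, and pick_initiator chooses one to dial.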
And we're excited here because with vertica-python we're really opening up the Vertica client wire protocol for the first time. And so if you're a low-level Vertica developer and you might have used Postgres before, you might know that some of Vertica's client protocol is derived from Postgres. But they do differ in many significant ways, and this is the first time we've ever revealed those details about how it works and why. So not all Postgres protocol features work with Vertica, because Vertica doesn't support all the features that Postgres does. Postgres, for example, has a large object interface that allows you to stream very wide data values over, whereas Vertica doesn't really have very wide data values; you have varchars, you have long varchars, but that's about as wide as you can get. Similarly, the Vertica protocol supports lots of features not present in Postgres. So load balancing, for example, which we just went through an example of: Postgres is a single-node system, so it doesn't really make sense for Postgres to have load balancing. But load balancing is really important for Vertica, because it is a distributed system. Vertica-python serves as an open reference implementation of this protocol, with all kinds of new details and extension points that we haven't revealed before. So if you look at these boxes below, all these different things are new protocol features that we've implemented since August 2019, out in the open on our GitHub page for Python. Now, the vertica-sql-go implementation of these things is still in progress, but the core protocols are there for basic query operations. There's more to do there, but we'll get there soon. So this is really cool, 'cause not only do you now have a Python client implementation and a Go client implementation of this, but you can use this protocol reference to do lots of other things, too. The obvious thing you could do is build more clients for other languages.
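For a flavor of what a wire protocol in this family looks like, here is a hypothetical sketch of Postgres-style message framing, the general shape the talk says Vertica's protocol is derived from: a one-byte message type code followed by a four-byte big-endian length that counts itself plus the payload. The real Vertica message types and formats differ in the ways described above; consult the vertica-python source for those details.

```python
import struct

def frame_message(msg_type: bytes, payload: bytes) -> bytes:
    # Postgres-style framing: a 1-byte message type code, then a 4-byte
    # big-endian length covering the length field itself plus the payload.
    return msg_type + struct.pack("!I", 4 + len(payload)) + payload

# e.g. a simple-query style message carrying a NUL-terminated SQL string
query_msg = frame_message(b"Q", b"SELECT 1\x00")
```

With framing this regular, a proxy can peek at the type byte and length of each message without understanding its contents, which is what makes the query-routing tricks discussed later possible.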
So if you have a need for a client in some other language that Vertica doesn't support yet, now you have everything available to solve that problem and to go about doing so if you need to. But beyond clients, it's also useful for other things. So you might use it for mocking and testing things. So rather than connecting to a real Vertica database, you can simulate some of that. You can also use it to do things like query routing and proxies. So Uber, for example: this blog here in this link tells a great story of how they route different queries to different Vertica clusters by intercepting these protocol messages, parsing the queries in them, and deciding which clusters to send them to. So a lot of these things are just ideas today, but now that you have the source code, there's no limit in sight to what you can do with this thing. And so we're very interested in hearing your ideas and requests, and we're happy to offer advice and collaborate on building some of these things together. So let's take a look now at some of the things we've already built that do these things. So here's a picture of Vertica's Grafana connector, with some data powered from an example that we have in this blog link here. So this has an Internet of Things use case to it, where we have lots of different sensors recording flight data, feeding into Kafka, which then gets loaded into Vertica. And then finally, it gets visualized nicely here with Grafana. And Grafana's visualizations make it really easy to analyze the data with your eyes and see when something happens. So in these highlighted sections here, you notice a drop in some of the activity; that's probably a problem worth looking into. It might be a lot harder to see that just by staring at a large table yourself. So how does a picture like that get generated with a tool like Grafana? Well, Grafana specializes in visualizing time series data. And time can be really tricky for computers to do correctly.
You've got time zones, daylight savings, leap seconds, negative infinity timestamps (please don't ever use those). And as if it wasn't hard enough just with those problems, what makes it harder is that every system does it slightly differently. So if you're querying some time data, how do we deal with these semantic differences as we cross these domain boundaries, from Vertica to Grafana's back end, which is implemented in Go, and on to its front end, which is implemented with JavaScript? Well, you read this from the bottom up in terms of the processing. First, you select the timestamp, and Vertica's timestamp has to be converted to a Go time object. And we have to reconcile the differences that there might be as we translate it. So Go time has a different time zone specifier format, and it also supports nanosecond precision, while Vertica only supports microsecond precision. So that's not too big of a deal when you're querying data, because you just see some extra zeros in the fractional seconds. But on the way in, if we're loading data, we have to find a way to resolve those things. Once it's in the Go process, it has to be converted further to render in the JavaScript UI. So there, the Go time object has to be converted to a JavaScript AngularJS Date object, and there too, we have to reconcile those differences. So a lot of these differences might just be presentation, and not so much the actual data changing, but you might want to choose to render the date in a more human-readable format, like we've done in this example here. Here's another picture. This is another picture of some time series data, and this one shows you can actually write your own queries with Grafana to provide answers. So if you look closely here, you can see there are actually some functions that might not look too familiar to you if you know Vertica's functions. Vertica doesn't have a dollar-underscore-underscore-time function or a time filter function.
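The precision reconciliation on the load path can be sketched in one line; a toy illustration of the idea, not Grafana's actual conversion code:

```python
def truncate_ns_to_us(ns_epoch: int) -> int:
    # Go times carry nanosecond precision; Vertica timestamps only carry
    # microseconds, so on the way in the sub-microsecond digits must drop.
    return ns_epoch // 1000 * 1000

# 2020-01-01 00:00:00.123456789 UTC as a nanosecond epoch...
ns = 1_577_836_800_123_456_789
# ...loses its trailing 789 nanoseconds at microsecond precision.
us_rounded = truncate_ns_to_us(ns)
```

On the query path the conversion is lossless, which is why you only see those extra zeros in the fractional seconds.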
So what's actually happening there? How does this actually provide an answer if it's not really real Vertica syntax? Well, it's not sufficient to just know how to manipulate data; it's also really important that you know how to operate with metadata. So, information about how the data works in the data source, Vertica in this case. So Grafana needs to know how time works in detail for each data source, beyond doing that basic I/O that we just saw in the previous example. So it needs to know, how do you connect to the data source to get some time data? How do you know what time data types and functions there are and how they behave? How do you generate a query that references a time literal? And finally, once you've figured out how to do all that, how do you find the time in the database? How do you know which tables have time columns that might be worth rendering in this kind of UI? So Go's database standard doesn't actually offer many metadata interfaces. Nevertheless, Grafana needs to know those answers, and so it has its own plugin layer that provides a standardizing layer, whereby every data source can implement the hints and metadata customization needed to have an extensible data source back end. So we have another open source project, the Vertica-Grafana data source, which is a plugin that uses Grafana's extension points, with JavaScript in the front end plugins and also with Go in the back end plugins, to provide Vertica connectivity inside Grafana. So the way this works is that the plugin framework defines those standardizing functions, like time and time filter, and it's our plugin that rewrites them in terms of Vertica syntax. So in this example, time gets rewritten to a Vertica cast, and time filter becomes a BETWEEN predicate. So that's one example of how you can use Grafana, but also how you might build any arbitrary visualization tool that works with data in Vertica.
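That macro-rewriting step might look roughly like this. The exact macro names and the SQL the real Vertica plugin emits differ; this is a hypothetical sketch of the rewrite idea only, with the dashboard time range passed in as plain strings:

```python
import re

def expand_time_macros(sql: str, start: str, end: str) -> str:
    # $__time(col): rewrite to a Vertica cast so the result is a time column.
    sql = re.sub(r"\$__time\((\w+)\)", r"\1::TIMESTAMP AS time", sql)
    # $__timeFilter(col): rewrite to a BETWEEN predicate over the
    # dashboard's currently selected time range.
    sql = re.sub(r"\$__timeFilter\((\w+)\)",
                 rf"\1 BETWEEN '{start}' AND '{end}'", sql)
    return sql

query = expand_time_macros(
    "SELECT $__time(ts), value FROM metrics WHERE $__timeFilter(ts)",
    "2020-01-01 00:00:00", "2020-01-02 00:00:00")
```

The plugin, not the user, owns this translation, which is what lets one dashboard query work across data sources with different time semantics.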
So let's now look at some other examples and reference architectures that we have out on our GitHub page. For some advanced integrations, there's clearly a need to go beyond these standards. So SQL and the surrounding standards, like JDBC and ODBC, were really critical in the early days of Vertica, because they really enabled a lot of generic database tools. And those will always continue to play a really important role, but the big data technology space moves a lot faster than these old database standards can keep up with. So there's all kinds of new advanced analytics and query pushdown logic that were never possible 10 or 20 years ago that Vertica can do natively. There's also all kinds of data-oriented application workflows doing things like streaming data, or parallel loading, or machine learning. And all of these things we need to build software with, but we don't really have standards to go by. So what do we do there? Well, open source implementations make for easier integrations and applications all over the place. So even if you're not using Grafana, for example, other tools have similar challenges that you need to overcome, and it helps to have an example there to show you how to do it. Take machine learning, for example. There have been many excellent machine learning tools that have arisen over the years to make data science and the task of machine learning a lot easier. And a lot of those have basic database connectivity, but they generally only treat the database as a source of data. So they do lots of data I/O to extract data from a database like Vertica for processing in some other engine. We all know that's not the most efficient way to do it. It's much better if you can leverage Vertica's scale and bring the processing to the data. But a lot of these tools don't take full advantage of Vertica, because there's not really a uniform way to go do so with these standards. So instead, we have a project called vertica-ml-python.
And this serves as a reference architecture of how you can do scalable machine learning with Vertica. So this project establishes a familiar machine learning workflow that scales with Vertica. So it feels similar to, like, a scikit-learn project, except all the processing and aggregation and heavy lifting and data processing happen in Vertica. So this makes for a much more lightweight, scalable approach than you might otherwise be used to. So with vertica-ml-python, you can probably use this yourself. But you can also see how it works, so if it doesn't meet all your needs, you can still see the code and customize it to build your own approach. We've also got lots of examples of our UDX framework. And so this is an older GitHub project (we've actually had this for a couple of years), but it is really useful and important, so I wanted to plug it here. With our User-Defined eXtensions framework, or UDXs, this allows you to extend the operators that Vertica executes when it does a database load or a database query. So with UDXs, you can write your own domain logic in C++, Java, Python or R, and you can call it within the context of a SQL query. And Vertica brings your logic to that data, and makes it fast and scalable and fault tolerant and correct for you, so you don't have to worry about all those hard problems. So our UDX examples demonstrate how you can use our SDK to solve interesting problems. And some of these examples might be complete, totally usable packages or libraries. So for example, we have a curl source that allows you to extract data from any curlable endpoint and load it into Vertica. We've got things like an ODBC connector that allows you to access data in an external database via an ODBC driver within the context of a Vertica query, all kinds of parsers and string processors and things like that.
We also have more exciting and interesting things, where you might not really think of Vertica being able to do that, like a heat map generator, which takes some XY coordinates and renders them on top of an image to show you the hotspots in it. So the image on the right was actually generated from one of our intern gaming sessions a few years back. So all these things are great examples that show you not just how you can solve problems, but also how you can use this SDK to solve neat things that maybe no one else has solved, or maybe that are unique to your business and your needs. Another exciting benefit is with testing. So the test automation strategy that we have in vertica-python and these clients really generalizes well beyond the needs of a database client. Anyone that's ever built a Vertica integration or an application probably has a need to write some integration tests. And that can be hard to do with all the moving parts in a big data solution. But with our code being open source, you can see in vertica-python, in particular, how we've structured our tests to facilitate smooth testing that's fast, deterministic and easy to use. So we've automated the download process and the installation and deployment process of a Vertica Community Edition. And with a single click, you can run through the tests locally, and as part of the PR workflow via Travis CI. We also do this for multiple different Python environments. So for all Python versions from 2.7 up to 3.8, for different Python interpreters, and for different Linux distros, we're running through all of them very quickly with ease, thanks to all this automation. So today you can see how we do it in vertica-python; in the future, we might want to spin that out into its own stand-alone testbed starter project, so that if you're starting any new Vertica integration, this might be a good starting point for you to get going quickly.
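To give a flavor of the kind of logic a heat map generator like that contains, here's a tiny standalone sketch of just the core binning step: bucketing XY coordinates into grid cells and counting hits per cell. The real example in the UDX repo runs inside Vertica via the SDK and also renders the result onto an image; this sketch covers only the counting idea.

```python
from collections import Counter

def heatmap_bins(points, cell_size=10):
    # Bucket each (x, y) coordinate into a grid cell and count how many
    # points land in each cell; the hottest cells are the hotspots to render.
    return Counter((x // cell_size, y // cell_size) for x, y in points)

# Two points share cell (0, 0); one lands in cell (9, 9).
bins = heatmap_bins([(1, 2), (3, 4), (95, 95)], cell_size=10)
```

Expressed as a UDX, this per-row bucketing is exactly the sort of logic Vertica can parallelize across nodes for you.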
So that brings us to some of the future work we want to do here in the open source space. Well, there's a lot of it. So in terms of the client stuff, for Python, we are marching towards our 1.0 release, which is when we aim to be protocol complete, to support all of Vertica's unique protocols, including COPY LOCAL and some new protocols invented to support complex types, which is our new feature in Vertica 10. We have some cursor enhancements to do things like better streaming and improved performance. Beyond that, we want to take it where you want to bring it, so send us your requests. On the Go client front, it's just about a year behind Python in terms of its protocol implementation, but the basic operations are there. We still have more work to do to implement things like load balancing, some of the advanced auth mechanisms and other things. But there too, we want to work with you, and we want to focus on what's important to you, so that we can continue to grow and be more useful and more powerful over time. Finally, there's this question of, "Well, what about beyond database clients? What else might we want to do with open source?" If you're building a very deep or robust Vertica integration, you probably need to do a lot more exciting things than just run SQL queries and process the answers. Especially if you're an OEM, or you're a vendor that resells Vertica packaged as a black-box piece of a larger solution, you might have to manage the whole operational lifecycle of Vertica. There are even fewer standards for doing all these different things compared to the SQL clients. So we started with the SQL clients 'cause that's a well-established pattern, and there's lots of downstream work that it can enable. But there's also clearly a need for lots of other open source protocols, architectures and examples to show you how to do these things and to have real standards.
So we talked a little bit about how you can do UDXs or testing or machine learning, but there are all sorts of other use cases too. That's why we're excited to announce here our awesome-vertica page, which is a new collection of open source resources available on our GitHub page. So if you haven't heard of this awesome manifesto before, I highly recommend you check out this GitHub page on the right. We're not unique here; there are lots of awesome projects for all kinds of different tools and systems out there. And it's a great way to establish a community and share different resources, whether they're open source projects, blogs, examples, references, community resources, and all that. And this tool is an open source project. So it's an open source wiki, and you can contribute to it by submitting a PR yourself. So we've seeded it with some of our favorite tools and projects out there, but there's plenty more out there, and we hope to see it grow over time. So definitely check this out and help us make it better. So with that, I'm going to wrap up. I wanted to thank you all. Special thanks to Siting Ren and Roger Huebner, who are the project leads for the Python and Go clients respectively. And also, thanks to all the customers out there who've already been contributing stuff. This has already been going on for a long time, and we hope to keep it going and keep it growing with your help. So if you want to talk to us, you can find us at this email address here. But of course, you can also find us on the Vertica forums, or you can talk to us on GitHub too. And there you can find links to all the different projects I talked about today. And so with that, I think we're going to wrap up, and now we're going to hand it off for some Q&A.
Holden Karau, IBM - #BigDataNYC 2016 - #theCUBE
>> Narrator: Live from New York, it's the CUBE, from Big Data New York City 2016. Brought to you by headline sponsors Cisco, IBM, Nvidia, and our ecosystem sponsors. Now, here are your hosts: Dave Vellante and Peter Burris. >> Welcome back to New York City, everybody. This is the CUBE, the worldwide leader in live tech coverage. Holden Karau is here, principal software engineer with IBM. Welcome to the CUBE. >> Thank you for having me. It's nice to be back. >> So, what's with Boo? >> So, Boo is my stuffed dog that I bring-- >> You've got to hold Boo up. >> Okay, yeah. >> Can't see Boo. >> So, this is Boo. Boo comes with me to all of my conferences in case I get stressed out. And she also hangs out normally on the podium while I'm giving the talk as well, just in case people get bored. You know, they can look at Boo. >> So, Boo is not some new open source project. >> No, no, Boo is not an open source project. But Boo is really cute. So, that counts for something. >> All right, so, what's new in your world of Spark and machine learning? >> So, there's a lot of really exciting things, right. Spark 2.0.0 came out, and that's really exciting because we finally got to get rid of some of the chunkier APIs. And data sets are just becoming sort of the core base of everything going forward in Spark. This is bringing the Spark SQL engine to all sorts of places, right. So, the machine learning APIs are built on top of the data set API now. The streaming APIs are being built on top of the data set APIs. And this is starting to actually make it a lot easier for people to work together, I think. And that's one of the things that I really enjoy, is when we can have people from different sort of profiles or roles work together. And so this support of data sets being everywhere in Spark now lets people with more of, like, a SQL background still write stuff that's going to be used directly in sort of a production pipeline.
And the engineers can build whatever, you know, production-ready stuff they need on top of the SQL expressions from the analysts and do some really cool stuff there. >> So, chunky API, what does that mean to a layperson? >> Sure, um, it means, like, for example, there's this thing in Spark where one of the things you want to do is shuffle a whole bunch of data around and then look at all of the records associated with a given key, right? But, you know, when the APIs were first made, right, it was made by university students. Very smart university students, but, you know, it started out as like a grad school project, right? And, um, so finally with 2.0, we were able to get rid of things like places where we use traits like iterables rather than iterators. And because of these minor little chunky things, we had to keep supporting this old API, because you can't break people's code in a minor release. But when you do a big release like Spark 2.0, you can actually go, okay, you need to change your stuff now to start using Spark 2.0. And as a result of changing that in this one place, we're actually able to better support spilling to disk. And this is for people who have too much data to fit in memory even on the individual executors. So, being able to spill to disk more effectively is really important from a performance point of view. So, there's a lot of cleanup, of getting rid of things which were sort of holding us back performance-wise. >> So, the value is there. Enough value to break the-- >> Yeah, enough value to break the APIs. And 1.6 will continue to be updated for people that are not ready to migrate right today. But for the people that are looking at it, it's definitely worth it, right? You get a bunch of real cool optimizations. >> One of the themes of this event of the last couple of years has been complexity.
You guys wrote an article recently in SiliconANGLE about some of the broken promises of open source, really the root of it being complexity. So, Spark addresses that to a large degree. >> I think so. >> Maybe you could talk about that and explain to us sort of how, and what the impact could be for businesses. >> So, I think Spark does a really good job of being really user-friendly, right? It has a SQL engine for people that aren't comfortable with writing, you know, Scala or Java or Python code. But then on top of that, right, there's a lot of analysts that are really familiar with Python. And Spark actually exposes Python APIs and is working on exposing R APIs. And this is making it so that if you're working on Spark, you don't have to understand the internals in a lot of depth, right? There's some other streaming systems where, to make them perform really well, you have to have a really deep mental model of what you're doing. But with Spark, it's much simpler and the APIs are cleaner, and they're exposed in the ways that people are already used to working with their data. And because it's exposed in ways that people are used to working with their data, they don't have to relearn large amounts of complexity. They just have to learn it in the few cases where they run into problems, right? Because it will work most of the time just with the sort of techniques that they're used to doing. So, I think that it's really cool. Especially structured streaming, which is new in Spark 2.0. And structured streaming makes it so that you can write sort of arbitrary SQL expressions on streaming data, which is really awesome. Like, you can do aggregations without having to sit around and think about how to effectively do an aggregation over different microbatches. That's not a problem for you to worry about. That's a problem for the Spark developers to worry about. Which, unfortunately, is sometimes a problem for me to worry about, but, you know, not too often.
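(A toy illustration, not Spark code: the incremental bookkeeping that structured streaming hides from users can be sketched in plain Python, where each arriving microbatch updates a running aggregate. The batch contents here are invented.)

```python
from collections import Counter

def aggregate_stream(microbatches):
    """Keep a running word count across microbatches: the kind of
    incremental aggregation the structured streaming engine manages
    for you, so you never reason about batch boundaries yourself."""
    totals = Counter()
    for batch in microbatches:
        totals.update(batch)   # fold the new microbatch into the running state
        yield dict(totals)     # emit the updated aggregate after every batch

# Two microbatches arriving from a stream (made-up data):
batches = [["spark", "sql"], ["spark", "streaming"]]
results = list(aggregate_stream(batches))
print(results[-1])  # -> {'spark': 2, 'sql': 1, 'streaming': 1}
```

The point of the sketch is what it leaves out: in Spark, the analyst writes only the aggregation expression, and this state-carrying loop lives inside the engine.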
Boo helps out whenever it gets too stressful. >> First of all, a lot to learn. But there's been some great research done in places like Cornell and Penn and others about how the open source community collaborates and works together. And I'm wondering, is the open source community that's building things like Spark, especially in a domain like big data, where the use cases themselves are so complex and so important, starting to take some of the knowledge the contributors are developing on how to collaborate and how to work together, and is that starting to find its way into the tools, so that the whole thing starts to collaborate better? >> Yeah, I think, actually, if you look at Spark, you can see that there's a lot of sort of tools that are being built on top of Spark, which are also being built in similar models. I mean, the Apache Software Foundation is a really good tool for managing projects of a certain scale. You can see a lot of Spark-related projects that have also decided that becoming part of the Apache Foundation is a good way to manage their governance and collaborate with different people. But then there's people that look at Spark and go, like, wow, there's a lot of overhead here. I don't think I'm going to have 500 people working on this project. I'm going to go and model my project after something a bit simpler, right? And I think that both of those are really valid ways of building open source tools on Spark. But it's really interesting seeing there's a Spark components page, essentially, a Spark packages list, for the community to publish the work that they're doing on top of Spark. And it's really interesting to see all of the collaborations that are happening there. Especially even between vendors sometimes. You'll see people make tools which help everyone's data access go faster. And it's open source, so you'll see it start to get contributed into other people's data access layers as well.
>> So, the pedagogy of how the open source community works is starting to find its way into the tools, so people who aren't in the community, but are focused on the outcomes, are now able to not only gain the experience of how the big data works, but also of how people working on complex outcomes need to work. >> I think that's definitely happening. And you can see that a lot with, like, the collaboration layers that different people are building on top of Spark, like the different notebook solutions, which are all very focused on enabling collaboration, right? Because if you're an analyst and you're writing some Python code on your local machine, you're probably not going to, like, set up a GitHub repo to share that with everyone, right? But if you have a notebook and you can just send the link to your friends and be like, hey, what's up, can you take a look at this? You can share your results more easily and you can also work together a lot more, more collaboratively. And so Databricks is doing some great things. IBM as well. I'm sure there's other companies building great notebook solutions who I'm forgetting. But the notebooks, I think, are really empowering people to collaborate in ways that we haven't traditionally seen in the big data space before. >> So, collaboration, to stay on that theme. So, we had eight data scientists on a panel the other night, and collaboration came up, and the question is specifically from an application developer standpoint. As data becomes, you know, the new development kit, how much of a data scientist do you have to become, or are you becoming, as a developer? >> Right, so, my role is very different, right? Because I focus just on tools, mostly. So, my data science is mostly to make sure that what I'm doing is actually useful to other people. Because a lot of the people that consume my stuff are data scientists. So, for me, personally, like, the answer is not a whole lot.
But for a lot of my friends that are working in more traditional sort of data engineering roles, where they're empowering specific use cases, they find themselves working really closely with data scientists, often to be like, okay, what are your requirements? What data do I need to be able to get to you so you can do your job? And, you know, sometimes if they find themselves blocking on the data scientists, they're like, how hard could it be? And it turns out, you know, statistics is actually pretty complicated. But sometimes, you know, they go ahead and pick up some of the tools on their own. And we get to see really cool things with really, really ugly graphs. 'Cause they do not know how to use graphing libraries. But, you know, it's really exciting. >> Machine learning is another big theme in this conference. Maybe you could share with us your perspectives on ML and what's happening there. >> So, I really think machine learning is very powerful. And I think machine learning in Spark is also super powerful. And the traditional thing is, you down-sample your data. And you train a bunch of your models. And then, eventually, you're like, okay, I think this is the model that I want to build for real. And then you go and you get your engineer to help you train it on your giant data set. But Spark, and the notebooks that are built on top of it, actually mean that it's entirely reasonable for data scientists to take the tools which are traditionally used by the data engineering roles, and just start directly applying them during their exploration phase. And so we're seeing a lot of really more interesting models come to life, right? Because if you're always working with down-sampled data, it's okay, right? Like, you can do reasonable exploration on down-sampled data. But once you're working with your full data set, you can find some really cool sort of features that you wouldn't normally find, right?
'Cause you're just not going to have that show up in your down-sampled data. And I think also streaming machine learning is a really interesting thing, right? Because we see there's a lot of IoT devices and stuff like that. And the traditional machine learning thing is, I'm going to build a model and then I'm going to deploy it. And then, like, a week later, I'll maybe consider building a new model. And then I'll deploy it. And so very much it looks like the old software release processes, as opposed to the more agile software release processes. And I think that streaming machine learning can look a lot more like the agile software development processes, where it's like, cool, I've got a bunch of labeled data from our contractors. I'm going to integrate that right away. And if I don't see any regression on my cross-validation set, we're just going to go ahead and deploy that today. And I think it's really exciting. I'm obviously a little biased, because some of my work right now is on enabling machine learning with structured streaming in Spark. So, I obviously think my work is useful. Otherwise I would be doing something else. But it's entirely possible, you know, everyone will be like, Holden, your work is terrible. But I hope not. I hope people find it useful. >> Talking about sampling. At our first Hadoop World in 2010, Abhi Mehta, who stopped by again today, of course, made the statement then: Sampling's dead. It's dead. Is sampling dead? >> Sampling didn't quite die. I think we're getting really close to killing sampling. Sampling will only be dead once all of the data scientists in the organization have access to the same tools that the data engineers have been using, right? 'Cause otherwise you'll still be sampling. You'll still be implicitly doing your model selection on down-sampled data. And we'll still probably always find an excuse to sample data, because I'm lazy and sometimes I just want to develop on my laptop.
But, you know, I think we're getting close to killing a lot more of sampling. >> Do you see an opportunity to start utilizing many of these tools to actually improve the process of building models, finding data sources, identifying individuals that need access to the data? Are we going to start turning big data on the problem of big data? >> Oh, that's really exciting. And so, okay, so this is something that I find really enjoyable. So, traditionally, when everyone's doing their development on their laptop, right, you don't get to collect a lot of metrics about what they're doing, right? But once you start moving everyone into a sort of more integrated notebook environment, you can be like, okay, these are the data sets that these different people are accessing. Like, these are the things that I know about them. And you can actually train a recommendation algorithm on the data sets to recommend other data sets to people. And there are people that are starting to do this. And I think it's really powerful, right? Because in small companies, it's maybe not super important, right? Because I'll just go and ask my coworker, like, hey, what data sets do I want to use? But if you're at a company at Google or IBM scale, or even, like, a 500 person company, you're not going to know all of the data sets that are available for you to work with. And the machine will actually be able to make some really interesting recommendations there. >> All right, we have to leave it there. We're out of time. Holden, thanks very much. >> Thank you so much for having me and having Boo. >> Pleasure. All right, any time. Keep right there, everybody. We'll be back with our next guest. This is theCUBE. We're live from New York City. We'll be right back.
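(Holden's closing idea, recommending data sets from access logs, can be sketched as a bare-bones co-occurrence recommender in plain Python. This is a hypothetical toy, not any specific production system; the user names and data set names are invented.)

```python
def recommend_datasets(access_log, user, top_n=3):
    """Score each data set the user hasn't touched by how much its
    users' access history overlaps with this user's: a minimal
    co-occurrence recommender over access logs."""
    mine = access_log[user]
    scores = {}
    for other, theirs in access_log.items():
        if other == user:
            continue
        overlap = len(mine & theirs)   # how similar is this coworker's usage?
        if overlap == 0:
            continue
        for ds in theirs - mine:       # only recommend data sets new to this user
            scores[ds] = scores.get(ds, 0) + overlap
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Invented access log: which data sets each person has queried.
log = {
    "ana":  {"clicks", "sales"},
    "ben":  {"clicks", "sales", "returns"},
    "cara": {"sales", "inventory"},
}
print(recommend_datasets(log, "ana"))  # -> ['returns', 'inventory']
```

In a notebook environment the access log falls out of the metrics collection Holden describes; a real system would likely use collaborative filtering at scale, but the shape of the problem is the same.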
Stella Low & Amy Posey - EMC World 2015 - theCUBE - #EMCWorld
>>Live from Las Vegas, Nevada. It's theCUBE, covering EMC World 2015. Brought to you by EMC, Brocade and VCE. >>Okay. Welcome back, everyone. We are live here in Las Vegas with theCUBE at EMC World 2015. I'm John Furrier, the founder of SiliconANGLE. I'm joined with two special guests: Stella Low, who runs global communications at EMC, and Amy Posey, neuro facilitator at Peak Teams. Welcome to theCUBE. >>So >>You had a session, Women of the World. We did it last year, but great CUBE session last year. Um, so I want to get a couple of quick questions. What's going on with Women of the World, which you guys just came from, and you guys were on the panel, and then, what is a neuro facilitator? And then let's get into it. Let's talk about men and women, how we work together. >>Okay, great. So let's start with Women of the World. So, um, so last year we talked about the challenges that we face and how we reframe them into opportunities, and we had some fantastic panelists. But this year I was really interested in the science behind men and women. So it's clear that we're different, and we're all built for success, but, but we're wired differently. And we kind of knew that already. I know we talked about it before, John, but we now have the science behind it. We can look at brain scans and we can see that, oh, we have different brain patterns. We think differently. Uh, different parts of the brain fire up in times of motivation and stress. And people like Amy here, who've done lots of work into this and have the studies behind it, it was great to have her on the panel to discuss it. >>I'm going to give you a plug, because EMC does all kinds of things with Formula One cars, motorcycles, getting the data and understanding the race. But now you're dealing with people. So what is going on? Tell us, what's up with neuro facilitation? And let's >>So a neuro facilitator is maybe the best made-up job title in the world that I gave myself.
So essentially what I do is I look at information about the brain and I curate the research that's out there. So there's a lot of new technology to actually read and look inside our heads. We all have a brain, but we don't necessarily all know how it works. So there's a lot more research, and tools, to read our brains and take a look inside. So what I do is I take that research and work with, um, neuroscientists and neurobiologists at Stanford, Columbia, UCLA, and reach out and figure out, how do we take that information and make it easier to take in? And I do it in the scope of leadership, at organizations like EMC and other technology companies, to figure out, how do we work better? What information is out there? You know, soft skills and sort of relationship skills have always been sort of squishy, right? So now there's a lot more science and information about our brains that are informing it. The data's out there. What I do, and what my job is, is to pull the data and figure out how do we make it into practical, useful applications for us at work, at home, wherever we are. So that's essentially what I'm doing. >>So you guys discussed how men and women are different, and actually looked at the data. We have a lot of qualitative data, I mean, it keeps counselors in business. You know, the gender balance in the workforce, uh, is important, but we have a lot of that data. What are the numbers? What are your findings? >>What's interesting is looking at men and women's brains. What's fascinating is that we are more alike than dissimilar. In looking at a brain, if you looked at a brain scan, one of a man and one of a woman, you wouldn't be able to tell the difference between the two. But they're now finding, in looking at different parts of the brain and different functions, that, for instance, men have approximately 6% more gray matter than women. So in terms of the gray matter, that's the thinking brain, essentially. And women have more white matter than gray matter.
It's about 9% more than men. And the white matter is what connects the brain and communicates both front and back and side to side. And so you can make some extrapolation of that information and say, you know, men may focus more on issues, solutions, problems, whereas women sort of think more broadly or wider. So, I mean, there are generalities, but a lot of the science is fascinating. There's also some interesting science about the hippocampus, which is, um, sort of deep, if this is your brain, it's deep inside the brain, and the hippocampus is the memory center. And what they're finding is that women tend to store emotional memories more effectively. So happy, sad, fearful, those types of emotions get stored more effectively in the hippocampus. Whereas for men, oftentimes during stress, the hippocampus actually has a challenge in making connections. So that's where, again, some of the focus and determination and siloed view that men sometimes have in situations or problems comes into play. Um, there's one other piece, the anterior cingulate cortex, which is sort of within the brain, and that's the brain's error detector. And it turns out it's a little bit bigger in women. So women sort of tend to look for, uh, issues, you know, problems, um, maybe less solution focused, especially under times of stress. And a lot of this data's interesting.
>>Always great to talk about because in the workforce, people are different. And so differences is a term that we use, like, you know, with kids learn differently, some have evolved differently and men and women have had differences. So the data shows that that's clear. Um, I want to share a quote that my wife shared on Facebook. It says mother, um, well, a worried mother does better research than the FBI. So, um, I bring that up, you know, it's instinctual. So a lot of it's also biological and also environmental talk about the dynamics around that, that wiring, because you're wired by your upbringing too, that affects you. And what's the, what's the data show in the biology. >>So it's interesting because the, the key piece is that it's not just the biological brain differences. It's, it's a whole host of factors that leave a footprint on us, in our behavior. So it's our education, it's our, uh, you know, where we, where we grew up, our culture is part of that. It's also gender stereotypes that play a role in how we operate. And I think all of those things leave a footprint on a, an and lead us to different behaviors. And so you can't just say it's the, the, the information that's on our brains. It's a whole host of factors that influence. So my study of looking at how the brains are a little bit different and what the research is coming, it's, it's blended in with research around leadership and things like confidence and motivation in the workplace bias in the workplace. And they're, they're showing very different things. >>So for instance, if you think about confidence, we did an interesting exercise in the event at women of world. And I asked, you know, there's, there's a lot about confidence and confidence is essentially the will or motivation to act. 
So, how many women in the room, uh, would raise their hand, you know, and go up for a job that they were really interested in and fascinated by, but maybe weren't a hundred percent qualified for? Like, how many of you have maybe turned down that job, or decided not to apply because it wasn't the right time? Like, you're pretty competent, but not a hundred percent confident in it. And it was funny, because the majority, all the women's hands went up in the room. So then I flipped the question and asked the men in the room. I said, okay, if you were only about 50% confident for a job that you were going up for, would you go for it? >>Of course, right. Like, yes. >>Fabricate some stuff on the resume and make them look bigger. >>So, exactly. So what's interesting is testosterone plays a role in confidence and motivation at work. And it turns out men have 10 times the amount of testosterone as women do. So part of that is that aggression. We both have it, but that aggressive factor, that idea to go after something, to be more confident, um, women are behind the curve in that, from the research that I've seen. So it takes more effort to be able to have the confidence to go for it, and to sort of break down those barriers that exist for women, to go after those jobs that they want, even if it's not a hundred percent. And so we did an exercise in boosting confidence and testosterone called power posing. And Amy Cuddy out of Harvard does a whole TED talk on it, which is fascinating.
And it's just a small sort of brain hack that you can do to give yourself an upper hand, knowing that knowing the science behind it. So it's a behavior changing type of research that's coming out, which I think is really, >>That's really interesting, but now it translates into leadership and execution in the workforce. So people are different than men and women are different that changes the dynamic around what good is, because if your point about women not asking for that job or having confidence to the field, like I'm not going to go for it, like a man bravado, whatever testosterone that's what mean that that's the benchmark of what drive means. So this came up with Microsoft CEO at the Anita board conferences, which we had a cube there. And, and this is a big issue. So how do HR, how do the managers, how do people recognize the differences and what does the data show, and, and can you share your thoughts on that? >>Yeah, so I think a lot of it comes down to bias and bias is essentially a shortcut that we use in our brains to take less energy. And it's not a bad thing. It's, it's something we all do. And it's conscious and it's unconscious. So bias, I think is a key piece of that. And the research on bias is fascinating. It's very, it's, it's very popular topic these days, because I think being able to do a couple of things, be aware that there are hundreds of biases and they're both conscious and unconscious, uh, acknowledge that it exists, but not legitimize it not make that. Okay. The third piece is to, to counter it and, and being able to counter bias by making sure that people have opportunities. And even though you may have re removed hypothetical barriers explicitly stating that you want people, men, or women to apply for promotions, be this type of leader, not just assume that because there are no barriers that it's okay, but really be explicit in how you give people opportunities and let them know that they're out there. I think that's really key. 
>>Yeah. That brings up the point around work-life balance, because, you know, I have a family of four, four kids. It's stressful just in and of itself to have four kids, but then I go to the workforce, and it's the same with women too. So there's also a home dynamic with leadership and biases and roles. Um, what's your take, and any data, on how that shifting set of personal realities, if you will, shapes the data? >>It's interesting, because it's something that we even talked about in the session, that it's a struggle. And, um, Bev Crair from Intel was talking about that. There's a period of time that actually is really tough for keeping women in the workforce. And it's that time where you're growing your family and you're growing your career, and oftentimes things sort of struggle. And I read something recently around women in STEM careers: over a 10-year period, 42% of women drop out of the workforce, in comparison to 17% of men. And so I think there's a ways to go in terms of being able to set up environments where working life is integrated, because it's not even balance anymore, it's integration. And how do you set up structures so that people can do that, through how they work, through how they connect with others? And to me, that's a big piece: how do you keep people in the workforce, and still contributing, at that critical point in time? And, you know, Intel hasn't figured it out. It's a tough challenge. >>Stella, Amy, we're big fans of women in tech, obviously, because we love tech athletes. We'd love to promote people who are rock stars in technology, whether it's developers or leaders. And I also have a daughter, two daughters. And so, two questions. One is women in tech: anything you could share that the data can talk to, to either inspire or give some insight? And two, for the young women out there that might not have the cultural baggage that my generation, and at least those older than me, have from previous bias.
So, motivating young daughters out there, and then how you deal with the career advice for existing women. >>So, on motivating young women to get into tech, um, Bev shared an absolutely fascinating statistic: that between the ages of 12 and 18, it's incredibly important to have a male support model for young girls to get into STEM careers, that it was absolutely critical for their success. And it's funny, because the question came up, like, why can't that be a woman too? And what's interesting, and what we find, is oftentimes we give men the short shrift when they try and support women, and we don't want to do that. We want to support men supporting women, because when that happens, we all win. Um, and so I think that's a big piece of it: starting young, and starting with male support as well as female support. So many women cite men as mentors in their career, you know, or in their daily life. And it's pretty important that they can feel that they can do that. >>And this goes back to the wiring data, that you have the data on how we're wired. It's okay, guys, to understand that it's not apples to apples, so to speak. Men are from Mars, women are from Venus, whatever that phrase is, but that's really what the data is. >>And being explicit to men, to say, we want you to support women, instead of having men take a back seat, feeling like maybe this isn't my battle to fight. It's really important to then encourage men to speak up too in those situations, to think about sort of women in tech. One really interesting piece of research that I've seen is about team intelligence, and what happens on teams. Anita Woolley from Carnegie Mellon produced this really fascinating piece of research on the three things that a team needs to be more intelligent. It's not just getting the smartest people in the room with the highest IQ. That's a part of it.
You want table stakes, you want to start with smart people. But she found that having women, more women, on a team actually improved the team's overall intelligence, the collective intelligence and success of a team. So more women was the first one. The second was, there's this ability, and women tend to be better at it, the ability to read someone's thoughts and emotions just by looking at their eyes. It's called reading the mind in the eyes. So just taking a look and being able to sense behavior, um, and what someone's thinking and feeling, and then being able to adjust to that and pivot on that. Not just focusing on the task at hand; the cohesion of a team with that skill made a difference. >>It's like it's a total team sport now, that's what you're saying, to use a sports analogy. And women, now you see women's sports is booming. This brings up your awesome research that you just did, for the folks out there. Stella was leading this Information Generation study. And the diversity of use cases now with tech, which is why we love tech so much. It's not just the geeky programmer, the traditional male role. You mentioned team. You've got UX design. You have, um, real-time agile. So you have more of a, whether it's a rowing analogy or whatever sport, or music, collaboration. Collaboration is key. And there's so many new disciplines. I mean, I'll share data that I have on theCUBE, looking at it over the six years, and even with women and men, the pattern that's coming up is women love the visualization. It's weird, I don't know if that's just us, but it's in the data: like, data scientists that render into reporting and visualization, not just making slides, it's in the data. Yeah. So, but they're not writing, maybe, the Python code. So what do you guys see, similar patterns in terms of, uh, information generation? It's sexy to have an iWatch. It's >>Cool.
Our panelist from Intel gave a great statistic: it's actually women who are more likely to make the decision on consumer tech, not men. And yet a lot of the focus is on building tech for men. If consumer tech companies want to get this right, they need to start thinking about what women are looking for, because women are the ones out there making the majority of those decisions. >> Yeah. It's an old thing. Back in the day, right out of college and doing my first startup, we had the wife test. Everything went by the wife, because you want collaborative decision-making. That's sometimes been seen as a negative or a reinforcement of bias, but I think what guys mean is that they want to get their partner involved. So how do we change the biases? I've talked to a guy who said the word geek, or nerd, is reinforcing a bias, and I use those terms all the time with science. And we had the lawsuit with Kleiner Perkins around gender discrimination, where she felt she wasn't included. What's your take on all of this? How does someone practically take the data and put it into practice? >> I think the big thing is, like I said, acknowledging that it exists. It's out there. I feel like our brains haven't necessarily adapted to the modern workplace and its challenges, because the modern workplace was invented in the 1960s and our brains evolved over a much longer time. So being able to handle some of the challenges we have, especially in how men and women operate differently in the workplace, is key. Calling it out and making it okay to acknowledge it, but then countering it where it needs to be countered, where it's not right. Being explicit and having the conversations is the big piece.
And that's what struck me with the Kleiner Perkins case: let's have the conversation, it's out there. A lot of times people are reticent to have the conversation because it's awkward and they worry about being PC. It's the elephant in the room. But dialogue is far better than leaving it alone. >> People are afraid. Guys are afraid, women are afraid, so it's a negative cycle if it's not out in the open. That's what I'm saying. >> And the idea is, what can we do collectively to do better, to frame it more positively? Because I think that makes a bigger difference: talking about how we're different, how we're the same, how we can work together. What is the connection point? We all bring different skills and talents to the table. It's really about taking a look at that, talking about it, and calling it out: I'm not great at this, you're great at this, let's work together on what we can do more effectively. >> Okay, team sports is great, and the diversity of the workforce in tech is an issue. That's awesome. So I'd ask a different question for both of you. What's the biggest surprise in the data? It could be something that reinforced a belief, or an insight into something new. It could be pleasant or creepy, but share it. >> The surprise to me is intuition. We always talk about women having intuition. I've had men say, my wife is so intuitive, she kind of gets it, and I've had that in the workplace as well. I think the biggest surprise for me was that we can now see it, we've now proved it: intuition is a thing that women have, and it's about this kind of web thinking and connecting the dots. We store these memories deep inside, and when we see something similar, we make that connection.
We call it intuition, but it's actually a kind of super-recall, if you like, replaying that situation. That, I think, was the biggest surprise to me. Amy? >> The thing that always astonishes me is the workplace environment and how we sometimes set up environments to shoot ourselves in the foot. So often we'll set up a competitive environment, internal competition. Well, it turns out that the way brain chemistry works in women, competition throws us into a stress or threat cycle much more easily than it does men, while men need it to get to optimal arousal. There's a lot of interesting research from Amy Arnsten at Yale on this, and that piece, how you can shape your environment so people are more successful together, to me is absolutely key: being able to pull out elements of competition but also elements of collaboration. You kind of knew it, but the science validates it, and you go, this is why we need a balance between the two, so everyone's successful. So to me, that's the aha. I could listen to Amy all day, and how we apply it to the workplace is the next big step. >> Yeah. You guys are awesome, and thanks so much for sharing. I wish we could go longer, but we're getting the hook here on time. Are there any links or websites people can go to for more information on the studies and the science? >> I spend a lot of my day curating and looking for more research. So peakteams.com/blog is where I do a lot of my writing and suggestions. That's Peak Teams, P-E-A-K teams dot com. I run our blog and put my musings up there every once in a while so people can see what I'm working on, but they can reach out at any time. And I'm on Twitter at @peakteamsgeek. Speaking of geeks, I embrace the geek mentality, right?
>> Well, I think geek's a compliment, personally. But final point, and I'll give you the last word, Amy: if you could have a magic wand to take the science and change the future with respect to men and women working cohesively together, understanding that we're different, as the science now shows, what would you want to see for the work environment, the workforce, life balance? What would you change with that magic wand? >> I think making women more confident by helping reduce bias for everybody. Being more keyed in to the biases we have, those automatic shortcuts we take, being more aware of them, working on them together, and not seeing them as bad but seeing them as human. So that's my big takeaway: remove more bias. >> Fantastic. Stella Low and Amy Posey here inside theCUBE. Thanks so much, and congratulations on your great work. Great panel. Of course, we have a special channel for women in tech on siliconangle.tv; we've got a lot of CUBE alumni there, and we had another one here today with Amy. Thank you for joining us. This is theCUBE, live in Las Vegas, bringing day three to a close. I'm John Furrier. We'll be right back after this short break.