Paul Barth, Podium Data | The Podium Data Marketplace
(light techno music) >> Narrator: From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now here's your host, Stu Miniman. >> Hi, I'm Stu Miniman and welcome to theCUBE conversation here in our Boston area studio. Happy to welcome back to the program, Paul Barth, who's the CEO of Podium Data, also a Boston area company. Paul, great to see you. >> Great to see you, Stu. >> Alright, so we last caught up with you, it was a fun event that we do at MIT talking about information, data quality, kind of understand why your company would be there. For our audience that doesn't know, just give us a quick summary, your background, what was kind of the why of Podium Data back when it was founded in 2014. >> Oh that's great Stu, thank you. I've spent most of my career in helping large companies with their data and analytic strategies, next generation architectures, new technologies, et cetera, and in doing this work, we kept stumbling across the complexity of adopting new technologies. And around the time that big data and Hadoop was getting popular and lots of hype in the marketplace, we realized that traditional large businesses couldn't manage data on this because the technology was so new and different. So we decided to form a software company that would automate a lot of the processing, manage a catalog of the data, and make it easy for nontechnical users to access their data. >> Yeah, that's great. You know when I think back to when we were trying to help people understand this whole big data wave, one of the pithy things we did, it was turning all this glut of data from a problem to an opportunity, how do we put this in to the users. But a lot of things kind of, we hit bumps in the road as an industry. Did studies it was more than 50 percent of these projects fail. You brought up a great point, tooling is tough, changing processes is really challenging. But that focus on data is core to our research, what we talk about all the time. 
But now it's automation and AI/ML, choose your favorite acronym of the day. This is going to solve all the ills that the big data wave didn't do right. Right, Paul? So maybe you can help us connect the dots a little bit because I hear a lot in to the foundation that trend from the big data to kind of the automation and AI thing. So you're maybe just a little ahead of your time. >> Well thanks, I saw an opportunity before there was anything in the marketplace that could help companies really corral their data, get some of the benefits of consolidation, some oversight in management through an automated catalog and the like. As AI has started to emerge as the next hype wave, what we're seeing consistently from our partners like DataRobot and others who have great AI technology is they're starved for good information. You can't learn automatically or even human learning if you're given inconsistent information, data that's not conformed or ready or consistent, which you can look at a lot of different events and start to build correlations. So we believe that we're still a central part of large companies building out their analytics infrastructure. >> Okay, help us kind of look at how your users and how you fit into this changing ecosystem. We all know things are just changing so fast. From 2014 to today, Cloud is so much bigger, the big waves of IoT keep talking. Everybody's got some kind of machine learning initiative. So what're the customers looking for, how do you fit in some of those different environments? >> I think when we formed the company we recognized that the cost performance differential between the open-sourced data management platforms like Hadoop and now Spark, were so dramatically better than the traditional databases and data warehouses, that we could transform the business process of how do you get data from raw to ready.
And that's a consistent problem for large companies: they have data in legacy formats, on mainframes, they have them in relational databases, they have them in flat files, in the Cloud, behind the firewall, and these silos continue to grow. This view of a consistent, or consistent view of your business, your customers, your processes, your operations, is central to optimizing and automating the business today. So our business users are looking for a couple of things. One thing they are looking for is some manageability and a consistent view of their data no matter where it lives, and our catalog can create that automatically in days or weeks depending on how big we go or how broadly we go. They're looking for that visibility but also they're looking for productivity enhancements, which means that they can start leveraging that data without a big IT project. And finally they're looking for agility which means there's self-service, there's an ability to access data that you know is trusted and secured and safe for the end users to use without having to call IT and have a program spin something up. So they're really looking for a totally new paradigm of data delivery. >> I tell you that hits on so many things that we've been seeing and a challenge that we've seen in the marketplace. In my world, talk about people they had their data centers and if I look at my data and I look at my applications, it's this heterogeneous nightmare. We call it hybrid or multi cloud these days, and it shows the promise of making me faster and all this stuff. But as you said, my data is all over the place, my applications are getting spun up and maybe I'm moving them and federating things and all that. But, my data is one of the most critical components of my business. Maybe explain a little bit how that works. Where do the customers come in and say oh my gosh, I've got a challenge and Podium Data's helping and the marketplace and all that.
>> Sure, first of all we targeted from the start large regulated businesses, financial services, pharmaceutical healthcare, and we've broadened since then. But these companies' data issues were really pressure from both ends. One was a compliance pressure. They needed to develop regulatory reports that could be audited and proven correct. If your data is in many silos and it's compiled manually using spreadsheets, that's not only incredibly expensive and nonreproducible, it's really not auditable. So a lot of these folks were pressured to prove that the data they were reporting was accurate. On the other side, it's the opportunity cost. Fintech companies are coming into their space offering loans and financial products, without any human interaction, without any branches. They knew that data was the center to that. The only way you can make an offer to someone for financial product is if you know enough about them that you understand the risk. So the use and leverage of data was a very critical mass. There was good money to invest in it and they also saw that the old ways of doing this just weren't working. >> Paul, does your company help with the incoming GDPR challenges that are being faced? >> Sure, last year we introduced a PII detector and protection scheme. That may not sound like such a big deal but in the Hadoop open-source world it is. At the end of the day this technology while cheap and powerful is incredibly immature. So when you land data, for example, into these open data platforms like S3 out in the Cloud, Podium takes the time to analyze that data and tell you what the structures of the data are, where you might have issues with sensitive data, and has the tooling like obfuscation and encryption to protect the data so you can create safe to use data. I'd say our customers right now, they started out behind the firewall. Again, these regulated businesses were very nervous about breaches. 
They're looking and realizing they need to get to the Cloud 'cause frankly not only is it a better platform for them from a cost basis and scalability, it's actually where the data comes from these days, their data suppliers are in the Cloud. So we're helping them catalog their data and identify the sensitive data and prepare data sets to move to the Cloud and then migrate it to the Cloud and manage it there. >> Such a critical piece. I lived in the storage world for about a decade. There was a little acquisition that they made of a company called Pi, P-I. It was Paul Maritz who a lot of people know, Paul had a great career at Microsoft went on to run VMware for a bunch. But it was, the vision you talk about reminds me of what I heard Paul Maritz talking to. Gosh, that was a decade ago. Information, so much sensitivity. Expand a little bit on the security aspect there, when I looked through your website, you're not a security company per se, but are there partnerships? How do you help customers with I want to leverage data but I need to be secure, all the GRC and security things that's super challenging. >> In this space, to achieve agility and scale on a new technology, you have to be enterprise ready. So in version one of our product, we had security features that included field level encryption and protection, but also integration with LDAP and Kerberos and other enterprise standard mechanisms and systems that would protect data. We can interoperate with Protegrity's and other kinds of encryption and protection algorithms with our open architecture. But it's kind of table stakes to get your data in a secured, monitorable infrastructure if you're going to enable this agility and self-service. Otherwise you restrict the use of the new data technologies to sandboxes. The failures you hear about are not in the sandboxes in the exploration, they're in getting those to production.
I had one of my customers talk about how before Podium they had 50 different projects on Hadoop and all of them were in code red and none of them could go to production. >> Paul you mentioned catalogs, give us the update. What's the newest from Podium Data? Help explain that a little bit more. >> So we believe that the catalog has to help operationalize the data delivery process. So one of the things we did from the very start was say let's use the analytical power of big data technologies, Spark, Hadoop, and others, to analyze the data on its way into the platform and build a metadata catalog out of that. So we have over 100 profiling statistics that we automatically calculate and maintain for every field of every file we ever load. It's not something you do as an afterthought or selectively. We knew from our experience that we needed to do that, data validation, and then bring in inferences such as this field looks like PII data and tag that in the metadata. That process of taking in data, and this even applies to legacy mainframe data coming in a VSAM format: it gets converted and landed to a usable format automatically. But the most important part is the catalog gets enriched with all this statistical profiling information, validation, all of the technical information and we interoperate as well as have a GUI to help with business tagging, business definitions and the like. >> Paul, just a little bit of a broader industry question, we talked about the value of data, I think everybody understands how important it is. How are we doing in understanding the value of that data though, is that a monetization thing? You've got academia in your background, there's debates, we've talked to some people at MIT about this. How do you look at data value as an industry in general, is there anything from Podium Data that you help people identify, are we leveraging it, are we doing the most, what are your thoughts around that?
>> So I'd say someone who's looking for a good framework to think about this I'd recommend Doug Laney's book on Infonomics, we've collaborated for a while, he's doing a great job there. But there's also just a blocking and tackling which is what data is getting used or a common one for our customers is where do I have data that's duplicate or it comes from the same source but it's not exactly the same. That often causes reconciliation issues in finance, or in forecasting, in sales analysis. So what we've done with our data catalog with all these profiling statistics is start to build some analytics that identify similar data sets that don't have to be exactly the same to say you may have a version of the data that you're trying to load here already available. Why don't you look at that data set and see if that one is preferred and the data governance community really likes this. For one of our customers there were literally millions of dollars in savings of eliminating duplication but the more important thing is the inconsistency, when people are using similar but not the same data sets. So we're seeing that as a real driver. >> I want to give you the final word. Just what are you seeing out in the industry these days, biggest opportunities, biggest challenges from users you're talking to? >> Well, what I'd say is when we started this it was very difficult for traditional businesses to use Hadoop in production and they needed an army of programmers and I think we solved that. Last year we started on our work to move to a post-Hadoop world so the first thing we've done is open up our cataloging tools so we can catalog any data set in any source and allow the data to be brought into an analytical environment or production environment more on demand than the idea that you're going to build a giant data lake with everything in it and replicate everything.
That's become really interesting because you can build the catalog in a few weeks and then actually use the analysis and all the contents to drive the strategy. What do I prioritize, where do I put things? The other big initiative is of course, Cloud. As I mentioned earlier you have to protect and make Cloud ready data behind your firewall and then you have to know where it's used and how it's used externally. We automate a lot of that process and make that transition something that you can manage over time, and that is now going to be extended into multi cloud, multi lake type of technologies. >> Multi cloud, multi lake, alright. Well Paul Barth, I appreciate getting the update everything happening with Podium Data. Well, theCUBE had so many events this year, be sure to check out thecube.net for all the upcoming events and all the existing interviews. I'm Stu Miniman, thanks for watching theCUBE. (light techno music)
Ken Barth, Catalogic Software & Eric Herzog, IBM - #VMworld - #theCUBE
>> Narrator: Live from the Mandalay Bay Convention Center in Las Vegas, it's theCUBE, covering VMworld 2016. Brought to you by VMware and its ecosystem sponsors. >> And welcome back here on theCUBE as we continue our coverage of VMworld from Mandalay Bay. Along with Peter Burris, I'm John Walls. It's a pleasure to welcome two fellows who know all about being on theCUBE, one of them very recently. Ken Barth is back with us, CEO and co-founder of Catalogic Software. Ken, good to see you. >> Oh, it's great to see you. >> And Eric Herzog, I mean the Hawaiian shirt, we know, is your signature. Vice president of product marketing and management at IBM, but you're an original Cubist, you said. >> I think the first year that theCUBE happened I was on with Dave, eons ago. Must have been either 2010 or 2011, the first CUBE ever. >> We've got to make you like an emeritus member of the alumni association or something. And let's be careful when we say Cubist, let's be very clear about it right now. I've got to mince words here. >> Yeah, Cuber. >> All right, so if you would, let's talk about your relationship, Catalogic and IBM. I know you have a long-standing partnership, you might call it, that's evolving and getting a little bit stronger. Ken, if you would, maybe paint that picture a little bit. >> Oh look, I mean these guys are just fantastic to work with. We've been working with IBM for a couple of years now. We're excited because we're going to continue to move the relationship forward, and we've got some exciting new announcements about supporting even more of their storage coming out later this year. What we're really excited about is the way that they've jumped in and they have a complete line of flash products. And as you know from our conversation the other day, flash is just taking the market absolutely by storm, particularly around the primary applications. >> So what we've done at IBM is dramatically extend our portfolio this year. We've been a market leader for years in all-flash and we see flash as ubiquitous
across all primary data sets. So whether that be the high-performance databases, VMware environments or virtualized environments, cloud configurations, big data, Linux, it doesn't matter what the workload is. And we have all sorts of price points, all sorts of performance. Flash does have different performance characteristics depending on how you configure it and how you use it. Now of course any flash configuration is substantially faster than a traditional storage array or any hybrid array, 10x to as much as 100x in real-world application spaces. So we've expanded it down from our high end into very cost-effective entry products, as low as nineteen thousand dollars street price, not list, right there at the point of attack for the end user, a RAID 5 configuration for nineteen thousand. We have big data analytics all-flash configurations. We have mainframe and the upper end of the Linux community and what's left of the UNIX world that's still out there, the Solaris and AIX business; we have a lot of products in that space, again all going flash. And it doesn't matter what the workload is: virtualized workloads, database workloads, virtual server workloads, virtual desktop workloads, cloud workloads, new-world databases, Splunk, Spark, Mongo, Hadoop, Cassandra, all of those types of workloads now can be all flash. And we have the right solution at the right price point, and you pick the right price point and the right solution you need for the right workload and application. >> And it seems to me that when you talk about performance, obviously a key factor there, speed, you know, off the charts. But cost is the one that, once that's been solved, as you said, is that the big igniter, is that what's going to light the... >> What you're seeing is flash is essentially at the same price as disk was. So there's a number of storage efficiency technologies on the primary side, which is what we do. Catalogic adds efficiency technologies on the copy side, because so many copies of data are made, not only
for disaster protection but for test and dev, snapshotting that's then used for backup. So they track all that to get efficiency on the secondary side of the equation. We do things like real-time compression, block-level dedupe. We have all kinds of technologies designed to cut the cost of flash, and so when you factor that in, flash is way less expensive, actually, than disk. And when you look at how it impacts your data center: for example, if you were running certain workloads, we have a real-world public reference where to run their workload, which is a database workload, took 80 servers because the storage was so slow. So you over-provision your servers because of what's called storage latency. That customer just swapped out the storage for flash and went from 80 physical servers to 10 to run the exact same workload. So the impact of flash is not just performance oriented, it's actually very cost oriented. Not just what does it cost per gigabyte for the storage, but if you can take out 70 servers, you just cut not only the capex on the server farm, right, but all the operational expenditures around it. And then what Catalogic does: people make copies of the primary data sets, and they make everything efficient on the copy side, if you will, the secondary side of storage. And so they complement each other, what we do on primary, what they do on secondary. >> So let's talk about that a little bit. If you think about it, you know, productivity is a function of the amount of work that you can do divided by the amount of cost or resources consumed to perform that work. So flash has significant benefits, as you just said, on that cost side. But when we start talking about a lot more copies that can be made available to developers or decision-makers in a lot of different forms, now we're accelerating the speed by which digital assets get created, and we're improving productivity not just through efficiency and the cost but by accelerating the value that IT's able to deliver to the business. >> That's exactly right, you're
hitting the nail on the head, because as Eric over here said, it saves capex and opex with just flash. But if you add a copy data management product, particularly one like ours, it's really a combination: copy data management, we have a workflow engine, and we have full access to REST APIs so that the customer can begin to tailor it to their environment and solve a lot of pain points, like around test/dev, database copies, snap copies, things like that. You know, they did some studies, IDC actually did some studies earlier this year, where at any given time a customer would have 50 copies of different data floating around, in the neighborhood of 50 snaps. And the reason this is a complex issue is because you have many different storage types taking many different snaps, you have application snaps. And so if you think about it, this all starts by organizing the snaps, putting them in a searchable database, if you will, then offering a workflow engine where you can automate the process, even make it self-service, right? And at the end of the day what can happen is they can move, delete, so you really have control over your environment. But what they can do is they can begin to really save huge money. So with flash you're going to have good capex and opex, but if you put our ECX product in, which is what a lot of our customers call copy data management on steroids, you can see geometric savings of that opex and capex. >> But you're also accelerating development time. >> Absolutely. It's all about efficiencies; all those things are absolutely improved. Absolutely right. And then if you start, like we have a series of REST APIs, you can begin to really tailor it to that customer's environment. So if you're doing, again I go back to the test/dev example, in test/dev we can tie that directly into things like Puppet, Chef, Bluemix, right? These are all development tools that make it totally efficient for the software developer, right? That's just one use case. >> Well, we... go ahead. >> No,
so Eric, as IBM introduces more of these products, IBM's been in the storage business for a long time. >> Forever. >> Yeah. In many respects IBM created the whole concept of storage administration, whatever it was, 30 years ago now. But as IBM does this, is storage increasingly being elevated as customers see their data volumes going up and the need to track where this data is, who's using it, the number of copies in place? How is that impacting the way IBM thinks about the concept of an overall system? >> Well, we look at it from the application space. It's all about the applications, workloads and use cases, and customers want to optimize the business value of that data. So as it's growing exponentially, you need to be able to access that data quickly, and most importantly it needs to be always there. So everyone talks about speed, speed, speed for flash. It's not just about the speed of flash. Your flash array needs to be reliable, available and serviceable, just like a hard drive array had to be. And so you're looking at different characteristics in performance, different characteristics in price, different characteristics in the RAS capability, the reliability, availability and serviceability, and you tie that to what you need for your workloads. If you've got the highest-end Oracle database in a company, let's say that company is all Oracle, you need something like our FlashSystem A9000 or FlashSystem 900. But if you've got the Oracle database that tracks their asset management, which would mean things like chairs, tables and whiteboards, that's not high performance. That could go on our Storwize 5030F, which is way more cost effective, and it's incredibly fast compared to a hard drive array but not as fast as our FlashSystems. So it's very important, A, that you have the performance, but B, if you don't have the reliability, it doesn't matter how fast you are. If the thing fails, then your cloud goes down, your virtual environment goes down, your VMware doesn't work, you can't access that Oracle or their SAP or that Hadoop.
And so it's really about how to optimize those workloads, those applications and those use cases, and storage is the rock-solid foundation underneath that allows you to do that. >> Absolutely. >> And when you're going into a world that's all about cloud, which means real-time access and self-service, and the self-service aspect, by the way, means that you don't always have a storage admin accessing it. So if the thing fails and the guy's a VMware admin or a developer in Oracle or in any other environment, he doesn't know what to do. So you can't have the storage fail. And in cognitive workloads and big data analytics workloads, where you're running petabytes and petabytes and petabytes of information as fast as you possibly can, you're trying to make business decisions in real time. You need the speed, but so what if it's super fast and then it fails? So if you put it on a block trading, you know, database for block trading for example, or some other financial application, if it's really fast and then it fails, that didn't help, it hurts you. So it's all about how to manage those workloads, applications and use cases, not just for performance, which everyone knows flash is, but all that reliability, availability and serviceability. And then Catalogic manages on the back side all the copies that people create, which is critical to make sure that those get managed appropriately. You may really need 50 copies but you don't want 150. It is completely inefficient on the storage side, and then the developer doesn't know what to use, so you just made it worse for yourself. >> So you just raised an interesting point related to data governance. I know that obviously Catalogic has some ideas about how data governance is likely to evolve, partly in response to the need to manage multiple snapshots and understand where they are. Talk to us a little bit about how data governance, which is fundamentally about how a business brings policy, roles and responsibilities to assets, as data becomes more of an
asset, how is governance changing? >> Oh, I think governance is huge, because, you know, data is exploding, and particularly when you start moving, you have numbers of copies like Eric was saying. How do you track that? How do you know where it is? You know, if you're in a compliance-based business, you could be in a lot of trouble. So you've got to make sure you can audit and know where it goes. And again, one of the ways to do that is to keep it under control and not have so many copies floating around. In his example, you might make 10 to 15 copies of that database. Why do that if you only need one, right? That's one of our big advantages that we have versus some of our competitors. We do what's called in-place copy data management, which means we simply leverage Eric's great storage out there. A lot of our competitors will actually make a copy on Eric's storage, move it to their storage, and then you've kind of exacerbated the problem a little bit, right? >> It's like hoarding, right? >> Exactly right. >> And I mean, kind of to Peter's point, somewhat, what you're saying is that because we can, we do, right? And so we make all these copies, and it's not exactly need, you know, but because I can, and it's cheaper, and storage is going down. It's like cleaning out that closet. We all have that closet at the house that we just keep putting stuff in, and one of these days we think we're going to clean it out, and the thing just grows and grows and you have to buy another house to get another closet. So again, how does all this curb that behavior and allow me to monitor, through some governance policy, when somebody is going over the line, so we bring it back to the line and we get a little more regular, more restrictive? >> Again, because of the workflow engine that we have in the product, you can set thresholds, you can automate the process. So for example, when a DBA or somebody gets a copy of the database, you can put a time limit on when it's
going to wipe it out. They're going to stay in sync across the board, so again you're not replicating this thing time and time again. They're getting timely data when they need it, and then it can automatically be removed. >> But in the meantime, one of the biggest problems within an IT organization is making data available to the disparate groups that need it. >> Absolutely. >> The administrative costs of "I need data." "Well, we'll get around to giving you that, sorry, in September," right? Being able to do this much faster and utilize flash technologies to facilitate that process has an impact on cost, has an impact on the benefits, which increases productivity, has an impact on governance, but also has an impact on the healthy, friendly relations between IT and the business. >> Yes. Well, what's happening is you're undergoing a revolution in the data center. Cloud, obviously. It started with virtualization, now it's extending to the cloud. Now you have the line of business that's more involved in IT than it's ever been before. So the last thing you want is to worry about your storage; you just want it to be the foundation. Okay, I'm from Silicon Valley, we have earthquakes. Buildings really fall down in earthquakes if they have a bad foundation. If you have a rock-solid foundation, your cloud, your cognitive, your database workloads will always be fine. You want to make sure that as you're doing that, you're doing it cost effectively, so both the high performance that you need, but high performance has a whole bunch of different price points, because the entire world's got high performance. The other thing, from an IT perspective and a business owner perspective: flash storage is actually the evolution; the revolution is the rest of the data center. Right, I'm old enough where when I took my first computer class at the University of California, it was on a punch card. Then it all went tape. Anyone who's seen a 1985 Schwarzenegger spy movie, it's all tape. Then you see a 1995 Schwarzenegger spy movie and it's all
hard drive arrays. Now it's all flash arrays. So it's just an evolution from a storage perspective, and it coincides with a revolution in the data center of cloud, cognitive, big data analytics, and real-time evaluation of data sets. And so flash is coming at the perfect time, as you have this revolutionary confluence in the data center, in the cloud, and in the web application workload use-case space. The fact that flash is only an evolution is actually great, because you don't have to worry about it. It's just an evolution of storage, and it allows you to take advantage of the revolution in your data center, your application, or your workload space. That's what flash brings: it's not a revolution, it helps the revolution. >> It does, because, as Eric was saying, you want to modernize your data center; that's what you're out to do, and flash is a good step towards that. And then if you have a copy data management tool like our product ECX on top of it, it gives you the flexibility to move to the cloud, to move data up to the cloud and back. It allows you to start offering self-service to your people, so it doesn't take weeks or days to get that copy of the data; they can start doing it themselves. So it's a step in the right direction, as he said, from an evolution to the revolution of the data center. >> Yeah, I'll bet out there somewhere right now there are a couple of Millennials watching, saying, "What did he just say about punch cards? What's a punch card?" It's all about data at the right place at the right time for the right people, and you guys are a great example of getting that job done. Thanks for being with us and sharing your story, and we wish you continued success. >> That's right. I'd like to say one thing before we finish, real quick: if anybody out there has SVC, or if they have any of the flash from IBM, please come see us. We've got a great product that will greatly increase the CapEx savings. It's Catalogic Software. >> Eric, Ken Barth, thank you, gentlemen, for being with us here on theCUBE. We continue our coverage from VMworld after this. Thank you.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Eric Herzog | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
50 copies | QUANTITY | 0.99+ |
10 | QUANTITY | 0.99+ |
Kim Barth | PERSON | 0.99+ |
Ken Barth | PERSON | 0.99+ |
2010 | DATE | 0.99+ |
2011 | DATE | 0.99+ |
Eric | PERSON | 0.99+ |
80 servers | QUANTITY | 0.99+ |
70 servers | QUANTITY | 0.99+ |
john woloson | PERSON | 0.99+ |
September | DATE | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
nineteen thousand dollars | QUANTITY | 0.99+ |
nineteen thousand | QUANTITY | 0.99+ |
University of California | ORGANIZATION | 0.99+ |
peter burrows | PERSON | 0.99+ |
Ken | PERSON | 0.99+ |
50 snaps | QUANTITY | 0.99+ |
vmworld | ORGANIZATION | 0.99+ |
first computer | QUANTITY | 0.99+ |
oracle | ORGANIZATION | 0.99+ |
vmware | ORGANIZATION | 0.98+ |
1995 | DATE | 0.98+ |
15 copies | QUANTITY | 0.98+ |
Catalogic Software | ORGANIZATION | 0.98+ |
las vegas | LOCATION | 0.97+ |
Dave | PERSON | 0.97+ |
1985 | DATE | 0.97+ |
UNIX | TITLE | 0.97+ |
150 | QUANTITY | 0.97+ |
linux | TITLE | 0.97+ |
this year | DATE | 0.97+ |
mandalay bay | ORGANIZATION | 0.96+ |
Oracle | ORGANIZATION | 0.96+ |
Kenta | PERSON | 0.96+ |
one | QUANTITY | 0.95+ |
10x | QUANTITY | 0.95+ |
earlier this year | DATE | 0.95+ |
Linux | TITLE | 0.95+ |
later this year | DATE | 0.95+ |
Hawaiian | OTHER | 0.94+ |
two fellows | QUANTITY | 0.94+ |
first cube | QUANTITY | 0.94+ |
2016 | DATE | 0.94+ |
both | QUANTITY | 0.94+ |
fifth | QUANTITY | 0.94+ |
30 years ago | DATE | 0.93+ |
Solaris | ORGANIZATION | 0.92+ |
Alumni Association | ORGANIZATION | 0.92+ |
ECX | TITLE | 0.91+ |
first year | QUANTITY | 0.91+ |
80 physical servers | QUANTITY | 0.9+ |
AIX | ORGANIZATION | 0.9+ |
one use case | QUANTITY | 0.9+ |
50 30 f | OTHER | 0.88+ |
#VMworld | ORGANIZATION | 0.87+ |
second | QUANTITY | 0.87+ |
IDC | ORGANIZATION | 0.84+ |
Millennials | PERSON | 0.84+ |
one thing | QUANTITY | 0.82+ |
VMware | TITLE | 0.81+ |
a lot more copies | QUANTITY | 0.81+ |
a couple of years | QUANTITY | 0.81+ |
Schwarzenegger | PERSON | 0.8+ |
9000 | COMMERCIAL_ITEM | 0.8+ |
hundred x | QUANTITY | 0.76+ |
lot | QUANTITY | 0.75+ |
Peters | PERSON | 0.73+ |
gigabyte | QUANTITY | 0.69+ |
Hadoop | TITLE | 0.69+ |
capex | TITLE | 0.66+ |
Vaughn Stewart, Pure Storage & Ken Barth, Catalogic - #VMworld - #theCUBE
Live from the Mandalay Bay Convention Center in Las Vegas, it's theCUBE, covering VMworld 2016. Brought to you by VMware and its ecosystem sponsors. >> It's legal. Yeah, everything's legal. Welcome inside, John Walls here on theCUBE, as we continue our coverage here at VMworld. Once again, we're back for what is going to be an exciting three days here in Mandalay Bay, and I'm joined by my partner in crime, you might say, Marc Farley, the producer and host of Vulcancast. Tell us about Vulcancast real quick, Marc. >> Well, you've seen comedy in cars, you've seen singing in cars with Carpool Karaoke; this is discussions about technology in cars. It's Tech Talk in Cars. You can see it on vulcancast.com. >> What a novel name for a website. I figure you took all day coming up with that one, didn't you? >> Yeah, but it's cool. You know what it's like to look for a name. >> Absolutely. But it's a neat, neat concept, Tech Talk in Cars. You're kind of like the James Corden of tech. >> There you go, except we don't sing about it. I'm more like the Jerry Seinfeld. >> Maybe that's the next time. We're joined by a couple of guests who have become partners, more or less, here in the business. Absolutely, with Vaughn Stewart, who is the enterprise architect and chief evangelist, I love that by the way, Vaughn, of Pure Storage. And that evangelist look, you do have it; you've got the whole thing going today. And Ken Barth is the CEO of Catalogic Software. Gentlemen, thank you for being here; we appreciate that. So if you would, start off by telling us a little bit about your individual companies, what you do, and then the marriage. You two have partnered up here for the past four months; it came together pretty quickly, and what's that all about? And if you would, Vaughn, why don't you go first. >> Sure. So Pure Storage is recognized widely as being the number one independent all-flash storage vendor. We've been recognized for three years as being the leader in Gartner's Solid-State Array Magic Quadrant. We've
really allowed flash to be consumed by the masses by making it more affordable than traditional disk-based storage arrays, and delivered all the promise of the performance of flash. >> Ken? >> In a nutshell, Catalogic Software is a spin-out, three years ago, from the Syncsort company, and we've got about twenty-nine patents. We're working hard. What we did is we evolved our technology into this whole copy data management space, which is very exciting. And when you marry copy data management to flash technology, you drive some really serious OpEx and CapEx savings for customers. >> So it's kind of peanut butter and chocolate here, right? >> Together, it really, really is. >> Right. So let's talk about your relationship, then. This has only been four months in the making. You've known each other for a long time, but you put together your business venture here very quickly. What brought it together so fast, and how did it make that kind of sense that, boom, it just happened almost overnight like that? Let's start with you, Ken. >> Listen, we were lucky enough that these guys actually found us at a trade show. It was a VMUG event in Austin, Texas. They have been absolutely brilliant to work with. In the business that we're in, we're what's called in-place copy data management, and why that's important is because we get to pick our partners, and it's a lot easier to build a technology if you have a partner that cooperates. And these guys have been so cooperative; that's what made this thing tick. They saw a gap that we could fill. They were kind enough to send us a box to work with. The teams have been culturally aligned; I mean, we kind of do things all up and down the stack the same way. On pricing, I think we're very similar. Channel-driven, we're similar. The way we look at working together is very similar. It's just been brilliant. And that's kind of what it is at the end of the day: to try to squeeze the OpEx and CapEx savings
out for customers. That's kind of the goal. >> Yeah, and we're also seeing a lot of requests from our customer base. We have a large number of joint customers, as well as customers that were interested in purchasing the other technology but were waiting for a point of integration. And so as we're seeing this shift in the mid-market and the enterprise to a more DevOps-centric model, with more infrastructure teams converging their server and compute management, or application owners owning the entire stack, there was this need for taking the data management constructs that we had and allowing an end-to-end ecosystem enablement, so that dev teams could, at the push of a button, refresh their data sets, move their development efforts forward, and get rid of all the old legacy, time-centric provisioning models. >> Yeah, I mean, CDM has kind of become one of these hot buzzwords, right, all of a sudden. As our data storage has become more capable and has become cheaper, we tend to hoard more stuff, right? We're hanging on to things a lot longer. So what is the gap exactly that you're talking about filling, and what's the need that you're addressing specifically? You have all this data at your disposal, I guess, with flash. >> Great question, John. First of all, let's talk about what's driving the flash technology, why flash is so popular right now. Everybody that we've talked to is either moving to flash or thinking about moving to flash, simply for their primary applications. Those are things like databases, virtualization, filers, SharePoint. And as you start to move, you get really good benefits around OpEx from using flash, because of the speed and the performance, particularly with what they do; they've got some compression stuff that's unbelievable. And then what we do is we overlay that. So if you take CDM, which was your
question, if you look at CDM, what copy data management does is allow you to deal with all of these copies. In the world today, you've got so many of the vendors taking different snapshots at different times. I think IDC did a study; what was it, like 50 versions of an email that you've got floating around at any given time in your organization? So, what Vaughn was referring to: let's take one example, in a test/dev environment. We could drive home on that, although they do a lot more than that. If you take test/dev, let's say you're a developer and you have an Oracle database, and you really want to test with the latest data. Right now, without flash, without CDM, what happens is you make a copy of that database and you move it to the developers. And if you're a developer, getting that copy away from the internal IT infrastructure department can take you hours, can take days. Go ahead. >> We've got customers whose current copy data management process is fulfilled by either a full-time employee or a staff that runs around doing RMAN restores or restores from tape, and development teams have to try to anticipate weeks in advance when they'll need a new copy of the data. That model has been the de facto standard in the industry for a decade or more. And what you're seeing from all conversations around DevOps is agility: how can I increase the rate at which we innovate? Part of it is by bringing agility into your development process. And so this is a really nice pairing of technologies. The performance capabilities within a flash array allow you to scale a large number of instances. The instant ability to clone the data set gives you the agility. But it's just an engine; I still have to take care of the rest of the stack. I've got role-based access: which users get to see which data? Do I need to data-mask the data, or do they get direct access? Are they having a
virtual copy or a physical one? And, best part, can I make it a portal, or can I put it right into their native workflow, so they never hit the storage team or even the infrastructure team? >> So let's talk about how customers are going to use this. Pure has been a big leader not just in flash but also in digital efficiency, capacity efficiency, and you've had to be that right from the get-go. People are saying, "Well, how am I going to be able to get the effective cost of this flash down?" Well, you have dedupe and you have compression, and now you're adding this application layer, or higher layer if you will, another layer of the stack, towards data density. Have you run the numbers on what kind of percentage, or anything like that, customers will see? >> Absolutely. Ken? >> So I'm actually presenting in the solutions booth, I think at 4:30 tomorrow, at the VMworld booth. We've got a customer, Six Flags, the theme park operator, that's doing this test/dev use case. We saved ninety percent OpEx efficiency for these guys, so there are some really solid numbers. >> Ninety percent is such a big number. >> It's a huge number. But it's what Vaughn was trying to say: if you start marrying the workflow, if you take their ability to make the storage and the moving of the data more efficient, and you leverage their tool, and then you overlay it with our APIs, we have REST APIs that you can tie into a customer environment, and then we've got this workflow engine that we call full-stack automation, the customer can start automating a lot of the stuff that they're trying to do, and it's a home run. >> Yeah, let's get into a little greater depth here, but not too deep. These capabilities have existed in the market for a long time, but the customers had to assemble and build their own scripts and tools. And again, we're not talking just about copying the data. We're giving you an efficiency in the
copy data engine with it running on the flash array. What Catalogic is doing is giving you a single interface, either via portal or API, for the orchestration of the entire stack: the test network, the virtual machines, the physical servers, the volume managers, all the way down to the copy of the data. >> Absolutely. >> So I'm going to dive even deeper, Vaughn. >> Be careful. >> What kind of skill set does a customer need to have to take advantage of this solution? >> That's a beautiful question, because it goes back to the synergy between our two companies. We're known for being able to set up storage in under an hour that requires no administrative skill set, nothing to tune, very much like an iPhone, right? Out of the box, there's no manual. Catalogic's in the same boat. You download an OVA, you're up and running in 30 minutes. You're connected to the Pure array in 40 minutes. You're connected to AD in 50, and you're running; you're off to the races. We don't have any boxes, no appliance, versus our competitors out there. We don't have any agents to install, no appliances. It's the perfect match: simplistic, and we're running through APIs. We're getting application-consistent copies of the data sets, and we're orchestrating through the built-in infrastructure that already exists, whether we're looking at vSphere or the rest of the ecosystem. >> So say a customer does their own development, and they've got people that know how to use APIs and program for APIs. Will they be any faster, be able to do more with it, or does it really not matter? >> It does. This gets back to the OpEx issue. With our REST API they can tie it in, and we've already got a lot of things that are tied in, like some of the development tools out there: Chef, Puppet, Bluemix from IBM. I mean, these are all things that we can kind
of work with to complete the environment and allow them to leverage this amazing platform. Does that answer your question? >> I think so, yeah. So what about the market for this? Copy data management took a while to take off, right? Data management has always been a tough thing, and it takes a while for customers to sort of get, what I'm going to say, a groupthink and a critical mass of people thinking about it. It looks like you've had some help in the last year with other vendors getting in and popularizing it. EMC has theirs, and Commvault, I think, is doing something, and more are talking about it now than, you know, 18 months ago. >> But what started it, Marc, and that's a great question, is what I was alluding to earlier: once flash comes on the scene, and particularly flash vendors that can do what they do, that have got a huge CapEx saving or OpEx saving for the customer, then you can start working in their workflow, in their processes, and saving them even more money. So copy data management with flash storage actually becomes almost a have-to-have, versus the other things that we were doing a year ago, where it was what I call a nice-to-have. Because if you start looking at how to save yourself money from an OpEx perspective, you might as well look at how to go all the way, and sometimes you can triple to 10 times your savings, geometrically, by adding the right CDM, what I call enhanced CDM. What our customers sometimes call us is CDM on steroids, copy data management on steroids. >> Synergy is a big thing. If you've looked at the industry historically, what you've seen is storage vendors put out their own homogeneous automation tools, right? And then you've seen a number of heterogeneous vendors deploy their tools, but they don't want to have any correlation with any hardware vendor. Right? And so, as a storage provider, right, and
customers are looking to say, "Well, look, I don't want to get locked into a particular storage provider." So that's one aspect. As a storage vendor, we're sitting there saying we'd like to have greater integration with the ecosystem so we can bubble up our value. Catalogic has kind of hit that sweet spot and said, "We're going to be heterogeneous, we're going to be multi-platform, and we're going to leverage the channel," a hundred percent channel-driven, "and we're going to leverage the API and the data management ecosystem of the storage vendors." So they've kind of got a perfect storm going on in terms of technology and market momentum, if you like. >> OK, so let's talk about how the solution is going to be delivered. Do you sell it? Do you sell into Pure accounts? You talked about channel. >> We're going to meet in the channel. We're also talking about doing some more creative things possibly, but for right now it's a meet-in-the-channel. We think there's enough good networking; the teams are in touch with each other. The value proposition proves itself. >> When's it going to be available? >> In another month or so. There are demonstrations available both in the Catalogic and in the Pure Storage booth here at VMworld, so we would encourage those who are interested in seeing the power of this solution to stop by either booth at any time. We're doing speaking sessions in each other's as well this week. >> Absolutely. >> And we are currently targeting somewhere between mid- and end-September for a GA release. >> Right. And I need to say one other thing. Going back to this, the reason this works is because these guys have but one care: they are customer-driven. They don't have an ego. They are driving to the customer and fulfilling the needs. Because, as he said, it's sometimes hard for a heterogeneous vendor that controls a lot to be welcomed as much as we've been welcomed with this group. It's because they know they
want to drive it through the customer, to get the best solution in the world for the customer. >> So on the customer side, you've talked about the perfect storm of services and products. Who's the perfect customer? Who's the optimal customer for something like this? >> I think the low-hanging fruit is any development team that has some requirement where they are taking copies of their current data set and are developing off of that platform. I think that's the low-hanging fruit. At a more macro level, any organization that says they have a DevOps initiative, and particularly one that wants a turnkey DevOps platform to be running with and launch ahead, versus trying to acquire talent to build their own: this is right within your wheelhouse. >> Good deal. No-brainer. And if people aren't looking at that right now, they're not in this century, because everybody's moving to flash for the primary. All the projections going forward are off the charts in terms of the growth of flash. >> What's changed with flash? Four years ago, Pure had to get over the hurdle of the price barrier for flash. We did that with industry-leading data reduction that's still 2x better than the rest of the industry. But as flash prices keep coming down, what you're seeing is a pivot around value, around making multiple data sets. I mean, if you get into a dev use case and I'm making ten copies of a data footprint that's already reduced by 5x, you're getting to a price point that you just can't meet with disk, because you couldn't drive enough performance either. It's actually not possible. >> Yeah. Well, before I let you go, I want to tell you it's just disappointing to us that you're not more enthusiastic. Really impressed today. >> We had a long night; maybe tomorrow things will pick up. >> But congratulations on the business venture, and we wish you the best of luck
down the road. Thanks for being here. >> Well, thank you. Thank you guys for having us on; we really enjoyed it. Appreciate it. >> Thank you. Back with more from VMworld right after this, here on theCUBE.
ENTITIES
Entity | Category | Confidence |
---|---|---|
two companies | QUANTITY | 0.99+ |
ninety percent | QUANTITY | 0.99+ |
three years | QUANTITY | 0.99+ |
Vaughn | PERSON | 0.99+ |
mark farley | PERSON | 0.99+ |
30 minutes | QUANTITY | 0.99+ |
four months | QUANTITY | 0.99+ |
40 minutes | QUANTITY | 0.99+ |
Vaughn Stewart | PERSON | 0.99+ |
Mandalay Bay | LOCATION | 0.99+ |
vmware | ORGANIZATION | 0.99+ |
Austin Texas | LOCATION | 0.99+ |
ten copies | QUANTITY | 0.99+ |
vmworld | ORGANIZATION | 0.99+ |
18 months ago | DATE | 0.99+ |
tomorrow | DATE | 0.99+ |
a year ago | DATE | 0.99+ |
SharePoint | TITLE | 0.99+ |
two | QUANTITY | 0.99+ |
Jerry Seinfeld | PERSON | 0.98+ |
Ken Barth | PERSON | 0.98+ |
50 | QUANTITY | 0.98+ |
IDC | ORGANIZATION | 0.98+ |
kimbark | PERSON | 0.98+ |
iphone | COMMERCIAL_ITEM | 0.98+ |
IBM | ORGANIZATION | 0.97+ |
today | DATE | 0.97+ |
10 times | QUANTITY | 0.97+ |
four years ago | DATE | 0.97+ |
one aspect | QUANTITY | 0.97+ |
last year | DATE | 0.97+ |
three days | QUANTITY | 0.97+ |
vSphere | TITLE | 0.97+ |
DevOps | TITLE | 0.97+ |
Oracle | ORGANIZATION | 0.96+ |
three years ago | DATE | 0.96+ |
mid | DATE | 0.96+ |
EMC | ORGANIZATION | 0.96+ |
this week | DATE | 0.95+ |
Pure Storage | ORGANIZATION | 0.95+ |
hundred percent | QUANTITY | 0.95+ |
Catalogic | ORGANIZATION | 0.95+ |
2016 | DATE | 0.95+ |
james corbett | PERSON | 0.94+ |
one | QUANTITY | 0.94+ |
las vegas | LOCATION | 0.93+ |
a decade | QUANTITY | 0.93+ |
gartner | ORGANIZATION | 0.92+ |
cata logic software | ORGANIZATION | 0.92+ |
under an hour | QUANTITY | 0.91+ |
about twenty nine patents | QUANTITY | 0.91+ |
John | PERSON | 0.9+ |
both | QUANTITY | 0.9+ |
#VMworld | ORGANIZATION | 0.9+ |
triple | QUANTITY | 0.87+ |
CDM | ORGANIZATION | 0.86+ |
single interface | QUANTITY | 0.85+ |
first | QUANTITY | 0.85+ |
x | QUANTITY | 0.83+ |
flags | ORGANIZATION | 0.82+ |
50 versions | QUANTITY | 0.82+ |
90 | OTHER | 0.79+ |
end-september | DATE | 0.79+ |
Vaughn Stewart | ORGANIZATION | 0.79+ |
Vulcan | ORGANIZATION | 0.76+ |
mandalay bay convention center | LOCATION | 0.74+ |
couple of guests | QUANTITY | 0.74+ |
vOps | TITLE | 0.73+ |
Vulcan | TITLE | 0.7+ |
number one | QUANTITY | 0.7+ |
carpool karaoke | TITLE | 0.68+ |
past four months | DATE | 0.66+ |
one care | QUANTITY | 0.65+ |
430 | OTHER | 0.62+ |
5x | QUANTITY | 0.59+ |
six | QUANTITY | 0.49+ |
lot | QUANTITY | 0.48+ |
Kent | LOCATION | 0.44+ |
90 90 | OTHER | 0.35+ |
Bharath Chari, Confluent & Sam Kassoumeh, SecurityScorecard | AWS Startup Showcase S2 E4
>>Hey everyone. Welcome to theCUBE's presentation of the AWS Startup Showcase. This is season two, episode four of our ongoing series featuring exciting startups within the AWS ecosystem. This theme: cybersecurity, protect and detect against threats. I'm your host, Lisa Martin. I've got two guests here with me. Please welcome back to the program Sam Kassoumeh, COO and co-founder of SecurityScorecard, and Bharath Chari, team lead, solutions marketing at Confluent. Guys, it's great to have you on the program talking about cybersecurity. >>Thanks for having us, Lisa. >>Sam, let's go ahead and kick off with you. You've been on theCUBE before, but give the audience just a little bit of context about SecurityScorecard, or SSC as they're going to hear it referred to. >>Yeah, absolutely. Thank you for that. Well, the easiest way to put it is: when people want to know about their credit risk, they consult one of the major credit scoring companies. And when companies want to know about their cybersecurity risk, they turn to SecurityScorecard to get that holistic view of the security posture. And the way it works is SSC is continuously, 24/7, collecting signals from across the entire internet, the entire IPv4 space, and they're doing it to identify vulnerable and misconfigured digital assets. And we were just looking back over a three-year period, from 2019 to 2022. We assessed, through our techniques, over a million and a half organizations and found that over half of them had at least one open critical vulnerability exposed to the internet. What was even more shocking was that 20% of those organizations had amassed over a thousand vulnerabilities each. >>So at SSC we're in the business of really building solutions for customers. We mine the data from dozens of digital sources and help discover the risks and the flaws that are inherent to their business.
And that becomes increasingly important as companies grow and find new sources of risk and new threat vectors that emerge on the internet for themselves and for their vendor and business partner ecosystem. The last thing I'll mention is the platform that we provide. It relies on data collection and processing being done in an extremely accurate and real-time way. That's a key that's allowed us to scale. And in order to accomplish this, SecurityScorecard's engineering teams used a really novel combination of Confluent Cloud and Confluent Platform to build really, really robust data streaming pipelines. The data streaming pipelines enabled by Confluent allow us at SecurityScorecard to collect the data from a lot of various sources for risk analysis. Then it gets further analyzed and provided to customers as an easy-to-understand summary of analytics. >>Bharath, let's bring you into the conversation. Talk about Confluent, give the audience that overview, and then talk about what you're doing together with SSC. >>Yeah, and I wanted to say Sam did a great job of setting up the context about what Confluent is, so I appreciate that. But a really simple way to think about it, Lisa, is that Confluent is a data streaming platform that is pioneering a fundamentally new category of data infrastructure, and that is at the core of what SSC does. Like Sam said, the key is really to collect data accurately, at scale, and in real time. And that's where our cloud-native offering really empowers organizations like SSC to build great customer experiences for their customers. The other thing we do is help organizations build sophisticated real-time backend operations. And so at a high level, that's the best way to think about Confluent. >>Got it. Bharath, talk about data streaming, how it's being used in cybersecurity, and what the data streaming pipelines enabled by Confluent allow SSC to do for its customers.
>>Yeah, I think Sam can definitely share his thoughts on this, but one of the things I know we are all sort of experiencing is the rise of cyber threats, whether it's online from a B2B perspective or as consumers, with our data, the data being generated, and the companies that have access to it. So as the need to protect the data really grows, companies and organizations really need to effectively detect, respond, and protect their environments. And the best way to do this is through three things: scale, speed, and cost. And so, going back to the points I brought up earlier, with Confluent you can really gain real-time data ingestion and enable those analytics that Sam talked about previously, while optimizing for cost and scale. So doing all of this at the same time, as you can imagine, is not easy, and that's where we excel. >>And so the entire premise of data streaming is built on the concept that data is not static, but constantly moving across your organization. And that's why we call it data streams. At its core, we've leveraged that open source foundation of Apache Kafka, but we have rearchitected it for the cloud with a totally new cloud-native experience. And ultimately, for customers like SSC, we have taken away the need to manage a lot of those operational tasks when it comes to Apache Kafka. The other thing we've done is add a ton of proprietary IP, including security features like role-based access control, and that really allows you to securely connect to any data, no matter where it resides, at scale and at speed. And it, >>Can you talk about,
And it, >>Can you talk about bar sticking with you, but some of the improvements, and maybe this is a actually question for Sam, some of the improvements that have been achieved on the SSC side as a result of the confluent partnership, things are much faster and you're able to do much more understand, >>Can I, can Sam take it away? I can maybe kick us off and then breath feel, feel free to chime in Lisa. The, the, the, the problem that we're talking about has been for us, it was a longstanding challenge. We're about a nine year old company. We're a high growth startup and data collection has always been in, in our DNA. It's at it's at the core of what we do and getting, getting the insights, the, and analytics that we synthesize from that data into customer's hands as quickly as possible is the, is the name of the game because they're trying to make decisions and we're empowering them to make those decisions faster. We always had challenges in, in the arena because we, well partners like confluent didn't didn't exist when we started scorecard when, when we we're a customer. But we, we, we think of it as a partnership when we found confluent technology and you can hear it from Barth's description. >>Like we, we shared a common vision and they understood some of the pain points that we were experiencing on a very like visceral and intimate level. And for us, that was really exciting, right? Just to have partners that are there saying, we understand your problem. This is exactly the problem that we're solving. We're, we're here to help what the technology has done for us since then is it's not only allowed us to process the data faster and get the analytics to the customer, but it's also allowed us to create more value for customers, which, which I'll talk about in a bit, including new products and new modules that we didn't have the capabilities to deliver before. >>And we'll talk about those new products in a second exciting stuff coming out there from SSC, bro. 
Talk about the partnership from Confluent's perspective. How has it enabled Confluent to probably enhance its technology as a result of seeing and learning what SSC is able to do with the technology? >>Yeah, first of all, I completely agree with Sam. It's more of a partnership because, like Sam said, we shared the same vision, and that is to really make sure that organizations have access to the data, like I said earlier, no matter where it resides, so that you can scan and identify the potential security threats. I think, from our perspective, what's really helped us in partnering with SSC is just looking at the data volumes that they're working with. So I know a stat that we talked about recently was around scanning billions of records and thousands of ports on a daily basis. And so that's where, like I mentioned earlier, our technology really excels, because you can ingest and amplify the volumes of data that you're processing so that you can scan and detect those threats in real time. >>Because, I mean, especially with the data volume that's increasing on a year-over-year basis, being able to respond quickly is paramount. And so what's really helped us is just seeing what SSC is doing in terms of scanning the web ports or the data systems that are at potential risk, and being able to support their use cases, whether it's data sharing between their different teams internally or being able to empower customers to detect and scan their data systems. And so the learning for us is really seeing how those millions and billions of records get processed. >>Got it. Sounds like a really synergistic partnership that you guys have had there for the last year or so. Sam, let's go back over to you. You mentioned some new products. I see SSC just released an attack surface intelligence product.
That's detecting thousands of vulnerabilities per minute. Talk to us about that, the importance of that, and another release that you're making. >>There are some really exciting products that we have released recently and are releasing at SecurityScorecard. When we think about ratings and risk, we think about it not just for our companies or our third parties, but in a broader sense of an ecosystem, because it's important to have data on third parties, but we also want to have the data on their third parties as well. Nobody's operating in a vacuum. Everybody's operating in this hyperconnected ecosystem, and the risk can live not just in the third parties, but they might be storing and processing data in a myriad of other technological solutions, which we want to understand. But it's really hard to get that visibility, because today the way it's done is companies ask their third parties, "Hey, send me a list of your third parties where my data is stored." >>It's very manual, it's very labor intensive, and it's a trust-based exercise that makes it really difficult to validate. What we've done is we've developed a technology called AVD, automatic vendor detection. And what AVD does is it goes out and, for any company, your own company or another business partner that you work with, it will detect all of the third-party connections that we see that have a live network connection or data connection to an organization. So that's like an awareness and discovery tool, because now we can pull the veil back and see what the bigger ecosystem and connectivity looks like, thus allowing the customers to go hold accountable not just the third parties, but their fourth parties, fifth parties, really nth parties. And they can only do that by using Scorecard.
The attack surface intelligence tool is really exciting for us because, well, before SecurityScorecard, people thought what we were doing was fairly impossible. >>It was really hard to get instant visibility on any company and any business partner. And at the same time, it was of critical importance to have that instant visibility into the risk, because companies are trying to make faster decisions and they need the risk data to steer those decisions. So when I think about that problem of managing this evolving landscape, what it requires is insightful and actionable real-time security data. And that relies on a couple of things: talent and tech. On the talent side, it starts with people. We have an amazing R&D team. We invest heavily. It's the heartbeat of what we do. That team really excels in areas of data collection, analysis, and scaling large data sets. And then on the tech side, well, we figured out some breakthrough techniques, and it also requires partners like Confluent to help with the real-time streaming. >>What we realized was those capabilities are very desired in the market. And we created a new product from it called Attack Surface Intelligence. Attack Surface Intelligence focuses less on the rating. There's a persona of users that really value the rating. It's easy to understand. It's a bridge language between technical and non-technical stakeholders. That's on one end of the spectrum. On the other end of the spectrum, there are customers and users, very technical customers and users, that may not have as much interest in a layman's rating, but really want a deep dive into the strong threat intel data and capabilities and insights that we're producing.
So we produced ASI, which stands for Attack Surface Intelligence, which allows customers to look at the surface area of attack, all of the digital assets for any organization, and see all of the threats, vulnerabilities, and bad actors, including sometimes discoveries of zero-day vulnerabilities that are out in the wild and being exploited by bad guys. So we have a really strong pulse on what's happening on the internet, good and bad. And we created that product to help service a market that was interested in going deep into the data. >>So it's... >>So critical. Go ahead. >>To jump in there real quick, because I think, on the points that Sam brought up, we had a great discussion recently, while we were building the case study, that I think brings this to life. Going back to the AVD product that Sam talked about, and Sam can probably do a better job of walking through the story, but the way I understand it, one of SecurityScorecard's customers approached them and told them that they had an issue to resolve. So this customer was using the AVD product at the time. And so they told SSC, "Hey, your product shows that we're using HubSpot, but we stopped using that ages ago." And so I think when SSC investigated, they did find a very recent HubSpot ping being used by the marketing team in this instance. And as someone who comes from that marketing background, I can raise my hand and say, I've been there, done that. So yeah, I mean, Sam can probably share his thoughts on this, but that's, I think, the great story that brings this all to life in terms of how customers actually go about using SSC's products. >>And Sam, go ahead on that. It sounds like one of the things I'm hearing that is a benefit is a reduction in shadow IT.
I'm sure that happens so frequently with your customers. Bharath, like the great example that you gave of the IT folks saying, "We don't use HubSpot, haven't in years," and marketing initiates an instance. Talk about that as one of the benefits for customers, reducing shadow IT. There's gotta be many more benefits from a security perspective. >>Yeah, there's a big challenge today because the market moved to the cloud, and that makes it really easy for anybody in an organization to go sign up, put in a credit card, or get a free trial to any product. And that product can very easily connect into the corporate system and access the data. And because of the nature of how cloud products work and how easy they are to sign up for, a byproduct of that is they circumvent the traditional risk assessment process that organizations go through, and organizations invest a lot of money, right? So there's a lot of time and money and energy invested in having good procurement and risk management life cycles, and making sure that contracts are buttoned up. So on one side, you have companies investing loads of energy. And then on the other side, any employee can circumvent that process by just going and, with a few clicks, signing up and purchasing a product. >>And then that causes a disparity, a delta, between what the technology and security team's understanding of the landscape is and what reality is. And we're trying to close that gap, right? We want to close and reduce any windows of time or opportunity where a hacker can go discover some misconfigured cloud asset that somebody signed up for and maybe forgot to turn off. I mean, a lot of it is just human error, and it happens. The example that Bharath gave, and this is why understanding the third parties is so important: a customer contacted us and said, hey, your AVD detection product has an error. It's showing we're using a product.
I think it was HubSpot, but we stopped using that, right? And we don't understand why you're still showing it. It has to be a false positive. >>So we investigated and found that there was a very recent live HubSpot connection ping being made. Sure enough, when we went back to the customer and said we're very confident the data's accurate, they looked into it. They found that the marketing team had started experimenting with another instance of HubSpot on the side. They were putting real customer data in that instance. And, you know, it triggered a security assessment. So we see all sorts of permutations of it. Large multinational companies spin up a satellite office, and a contractor setting up the network equipment misconfigures it and inadvertently leaves an administrator portal to the Cisco router exposed on the public internet. And they forget to turn off the administrative default credentials. So if a hacker stumbles on that, they have direct access to the network. We're trying to catch those things and surface them to the client before the hackers find them. >>So we're giving them this hacker's-eye view. And without the continuous data analysis, without the stream processing, the customer wouldn't have known about those risks. But if you can automatically know about the risks as they happen, what that does is prevent a million shoulder taps, because the customer doesn't have to go tap on the marketing team's shoulder and go tap on employees and manually interview them. They have the data already, and that can be for their company. That can be for any company they're doing business with where they're storing and processing data. That's a huge time savings and a huge risk reduction. >>Huge risk reduction. Like you're taking blinders off that they didn't even know were there.
And I can imagine, Sam, too, in the last couple of years, as SaaS skyrocketed, along with the use of collaboration tools just to keep the lights on for organizations to be able to communicate, there's probably a lot of opportunity in your customer base and prospective customer base to engage with you and get that really full 360-degree view of their entire organization: third parties, fourth parties, et cetera. >>Absolutely. Absolutely. Customers are more engaged than they've ever been, because that challenge of the market moving to the cloud hasn't stopped. We've been talking about it for a long time, but there are still a lot of big organizations that are starting to dip their toe in the pool and starting to cut over from what was traditionally an in-house data center in the basement of the headquarters. They're moving over to the cloud. And then, on top of that, cloud providers like Azure and AWS especially make it so easy for any company to go sign up, get access, build a product, and launch that product to the market. We see more and more organizations sitting on AWS, launching products and software. The barrier to entry is very, very low. And the value in those products is very, very high. So that's drawing the attention of organizations to go sign up and engage. >>The challenge then becomes, we don't know who has control over this data, right? We don't know who has control and visibility of our data. We're bringing that to the surface. And for vendors themselves, especially companies that sit in AWS, what we see them doing, and I think, Lisa, this is what you're alluding to, is when companies engage in their own scorecard, there's a bit of a social aspect to it. When they look good in our platform, other companies are following them, right? So now, all of a sudden, they can make one motion to go look good, make their scorecard buttoned up. And everybody who's looking at them now sees that they're doing the right things.
We actually have a lot of vendors who are customers, and they're winning more competitive bake-offs and deals because they're proving to their clients faster that they can trust them to store the data. >>So, you know, we're in a two-sided kind of market. You have folks that are assessing other folks. It's fun to look at others and see how they're doing and hold them accountable. But if you're on the receiving end, that can be stressful. So what we've done is we've taken that situation and turned it into a really positive and productive environment where companies, whether they're looking at someone else or they're looking at themselves to prove to their clients, to prove to the board, it turns into a very productive experience for them. >>One... oh... >>Yeah, that validation. Go ahead, Bharath. >>I was gonna ask Sam his thoughts on one particular aspect. So in terms of the industry, Sam, that you're seeing really moving to the cloud, and this need for secure data, making sure that the data can be trusted, are there specific verticals that are doing that better than the others? Or do you see that across the board? >>I think some industries have it easier and some industries have it harder. Definitely in industries like, I think, healthcare and financial services, absolutely, we see heavier activity there on both sides, right? They're certainly becoming more and more proactive in their investments, but the attacks are not stopping against those, especially healthcare, because the data is so valuable and historically healthcare was an underinvested space, right? Hospitals were always strapped for IT folks. Now they're starting to wake up and pay very close attention and make heavier investments. >>That's pretty interesting. >>Tremendous opportunity there, guys. I'm sorry, we are out of time, but this is such an interesting conversation.
You see, we could keep going. I want to ask you both: where can prospective, interested customers go to learn more on the SSC side, on the Confluent side, through the AWS Marketplace? >>I'll let Sam go first. >>Sure. Oh, thank you. On the SecurityScorecard side, well, look, SecurityScorecard, with the help of Confluent, has made it possible to instantly rate the security posture of any company in the world. We have 12 million organizations rated today, and that's going up every day. We invite any company in the world to try SecurityScorecard for free and experience how easy it is to get your rating and see the security rating of any company, and any company can claim their score. There's no charge. They can go to securityscorecard.com, and we have, actually, a special URL: securityscorecard.com/free-account/aws marketplace. And even better, if someone's already on AWS, you can view our security posture with the AWS Marketplace Vendor Insights plugin to quickly and securely procure your products. >>Awesome. Guys, this has been fantastic information. I'm sorry, Bharath, did you wanna add one more thing? >>Yeah, I just wanted to give a quick call-out, Lisa. So anyone who wants to learn more about data streaming can go to www.confluent.io. There's also an upcoming event, which has a separate URL, that's coming up in October, where you can learn all about data streaming, and that URL is currentevent.io. So those are the two URLs I just wanted to quickly call out. >>Awesome, guys. Thanks again so much for partnering with theCUBE on season two, episode four of our AWS Startup Showcase. We appreciate your insights and your time. And for those of you watching, thank you so much. Keep it right here for more action on theCUBE. For my guests, I am Lisa Martin. We'll see you next time.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Sam | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Sam Kam | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
Sam Kassoumeh | PERSON | 0.99+ |
October | DATE | 0.99+ |
20% | QUANTITY | 0.99+ |
2019 | DATE | 0.99+ |
SSE | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
millions | QUANTITY | 0.99+ |
two guests | QUANTITY | 0.99+ |
SSC | ORGANIZATION | 0.99+ |
360 degree | QUANTITY | 0.99+ |
Rob | PERSON | 0.99+ |
HubSpot | ORGANIZATION | 0.99+ |
Excel | TITLE | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Delta | ORGANIZATION | 0.99+ |
2022 | DATE | 0.99+ |
last year | DATE | 0.99+ |
fifth parties | QUANTITY | 0.99+ |
Bharath Chari | PERSON | 0.99+ |
both sides | QUANTITY | 0.99+ |
SAS | ORGANIZATION | 0.99+ |
thousands | QUANTITY | 0.98+ |
over a million and a half organizations | QUANTITY | 0.98+ |
three year | QUANTITY | 0.98+ |
APA | TITLE | 0.98+ |
today | DATE | 0.98+ |
billions of records | QUANTITY | 0.98+ |
thousands of ports | QUANTITY | 0.97+ |
second | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
Colu | ORGANIZATION | 0.97+ |
fourth parties | QUANTITY | 0.96+ |
two URLs | QUANTITY | 0.96+ |
over a thousand vulnerabilities | QUANTITY | 0.96+ |
www confluent IO | OTHER | 0.95+ |
zero day | QUANTITY | 0.95+ |
Barth | PERSON | 0.95+ |
Intel | ORGANIZATION | 0.93+ |
scorecard.com | OTHER | 0.93+ |
one more thing | QUANTITY | 0.91+ |
SSE | TITLE | 0.89+ |
first | QUANTITY | 0.89+ |
Barra | ORGANIZATION | 0.88+ |
24 7 | QUANTITY | 0.87+ |
12 million organizations | QUANTITY | 0.85+ |