UNLIST TILL 4/2 - Keep Data Private
>> Paige: Hello everybody and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled Keep Data Private Prepare and Analyze Without Unencrypting With Voltage SecureData for Vertica. I'm Paige Roberts, Open Source Relations Manager at Vertica, and I'll be your host for this session. Joining me is Rich Gaston, Global Solutions Architect, Security, Risk, and Government at Voltage. And before we begin, I encourage you to submit your questions or comments during the virtual session, you don't have to wait till the end. Just type your question as it occurs to you, or comment, in the question box below the slide and then click Submit. There'll be a Q&A session at the end of the presentation where we'll try to answer as many of your questions as we're able to get to during the time. Any questions that we don't address we'll do our best to answer offline. Now, if you want, you can visit the Vertica Forum to post your questions there after the session. Now, that's going to take the place of the Developer Lounge, and our engineering team is planning to join the Forum, to keep the conversation going. So as a reminder, you can also maximize your screen by clicking the double arrow button, in the lower-right corner of the slides. That'll allow you to see the slides better. And before you ask, yes, this virtual session is being recorded and it will be available to view on-demand this week. We'll send you a notification as soon as it's ready. All right, let's get started. Over to you, Rich. >> Rich: Hey, thank you very much, Paige, and appreciate the opportunity to discuss this topic with the audience. My name is Rich Gaston and I'm a Global Solutions Architect, within the Micro Focus team, and I work on global Data privacy and protection efforts, for many different organizations, looking to take that journey toward breach defense and regulatory compliance, from platforms ranging from mobile to mainframe, everything in between, cloud, you name it, we're there in terms of our solution sets. Vertica is one of our major partners in this space, and I'm very excited to talk with you today about our solutions on the Vertica platform. First, let's talk a little bit about what you're not going to learn today, and that is, on screen you'll see, just part of the mathematics that goes into, the format-preserving encryption algorithm. We are the originators and authors and patent holders on that algorithm. Came out of research from Stanford University, back in the '90s, and we are very proud, to take that out into the market through the NIST standard process, and license that to others. So we are the originators and maintainers, of both standards and athureader in the industry. We try to make this easy and you don't have to learn any of this tough math. Behind this there are also many other layers of technology. They are part of the security, the platform, such as stateless key management. That's a really complex area, and we make it very simple for you. We have very mature and powerful products in that space, that really make your job quite easy, when you want to implement our technology within Vertica. So today, our goal is to make Data protection easy for you, to be able to understand the basics of Voltage Secure Data, you're going to be learning how the Vertica UDx, can help you get started quickly, and we're going to see some examples of how Vertica plus Voltage Secure Data, are going to be working together, in our customer cases out in the field. First, let's take you through a quick introduction to Voltage Secure Data. The business drivers and what's this all about. First of all, we started off with Breach Defense. We see that despite continued investments, in personal perimeter and platform security, Data breaches continue to occur. Voltage Secure Data plus Vertica, provides defense in depth for sensitive Data, and that's a key concept that we're going to be referring to. in the security field defense in depth, is a standard approach to be able to provide, more layers of protection around sensitive assets, such as your Data, and that's exactly what Secure Data is designed to do. Now that we've come through many of these breach examples, and big ticket items, getting the news around breaches and their impact, the business regulators have stepped up, and regulatory compliance, is now a hot topic in Data privacy. Regulations such as GDPR came online in 2018 for the EU. CCPA came online just this year, a couple months ago for California, and is the de-facto standard for the United States now, as organizations are trying to look at, the best practices for providing, regulatory compliance around Data privacy and protection. These gives massive new rights to consumers, but also obligations to organizations, to protect that personal Data. Secure Data Plus Vertica provides, fine grained authorization around sensitive Data, And we're going to show you exactly how that works, within the Vertica platform. At the bottom, you'll see some of the snippets there, of the news articles that just keep racking up, and our goal is to keep you off the news, to keep your company safe, so that you can have the assurance, that even if there is an unintentional, or intentional breach of Data out of the corporation, if it is protected by voltage Secure Data, it will be of no value to those hackers, and then you have no impact, in terms of risk to the organization. What do we mean by defense in depth? Let's take a look first at the encryption types, and the benefits that they provide, and we see our customers implementing, all kinds of different protection mechanisms, within the organization. You could be looking at disk level protection, file system protection, protection on the files themselves. You could protect the entire Database, you could protect our transmissions, as they go from the client to the server via TLS, or other protected tunnels. And then we look at Field-level Encryption, and that's what we're talking about today. That's all the above protections, at the perimeter level at the platform level. Plus, we're giving you granular access control, to your sensitive Data. Our main message is, keep the Data protected for at the earliest possible point, and only access it, when you have a valid business need to do so. That's a really critical aspect as we see Vertica customers, loading terabytes, petabytes of Data, into clusters of Vertica console, Vertica Database being able to give access to that Data, out to a wide variety of end users. We started off with organizations having, four people in an office doing Data science, or analytics, or Data warehousing, or whatever it's called within an organization, and that's now ballooned out, to a new customer coming in and telling us, we're going to have 1000 people accessing it, plus service accounts accessing Vertica, we need to be able to provide fine level access control, and be able to understand what are folks doing with that sensitive Data? And how can we Secure it, the best practices possible. In very simple state, voltage protect Data at rest and in motion. The encryption of Data facilitates compliance, and it reduces your risk of breach. So if you take a look at what we mean by feel level, we could take a name, that name might not just be in US ASCII. Here we have a sort of Latin one extended, example of Harold Potter, and we could take a look at the example protected Data. Notice that we're taking a character set approach, to protecting it, meaning, I've got an alphanumeric option here for the format, that I'm applying to that name. That gives me a mix of alpha and numeric, and plus, I've got some of that Latin one extended alphabet in there as well, and that's really controllable by the end customer. They can have this be just US ASCII, they can have it be numbers for numbers, you can have a wide variety, of different protection mechanisms, including ignoring some characters in the alphabet, in case you want to maintain formatting. We've got all the bells and whistles, that you would ever want, to put on top of format preserving encryption, and we continue to add more to that platform, as we go forward. Taking a look at tax ID, there's an example of numbers for numbers, pretty basic, but it gives us the sort of idea, that we can very quickly and easily keep the Data protected, while maintaining the format. No schema changes are going to be required, when you want to protect that Data. If you look at credit card number, really popular example, and the same concept can be applied to tax ID, often the last four digits will be used in a tax ID, to verify someone's identity. That could be on an automated telephone system, it could be a customer service representative, just trying to validate the security of the customer, and we can keep that Data in the clear for that purpose, while protecting the entire string from breach. Dates are another critical area of concern, for a lot of medical use cases. But we're seeing Date of Birth, being included in a lot of Data privacy conversations, and we can protect dates with dates, they're going to be a valid date, and we have some really nifty tools, to maintain offsets between dates. So again, we've got the real depth of capability, within our encryption, that's not just saying, here's a one size fits all approach, GPS location, customer ID, IP address, all of those kinds of Data strings, can be protected by voltage Secure Data within Vertica. Let's take a look at the UDx basics. So what are we doing, when we add Voltage to Vertica? Vertica stays as is in the center. In fact, if you get the Vertical distribution, you're getting the Secure Data UDx onboard, you just need to enable it, and have Secure Data virtual appliance, that's the box there on the middle right. That's what we come in and add to the mix, as we start to be able to add those capabilities to Vertica. On the left hand side, you'll see that your users, your service accounts, your analytics, are still typically doing Select, Update, Insert, Delete, type of functionality within Vertica. And they're going to come into Vertica's access control layer, they're going to also access those services via SQL, and we simply extend SQL for Vertica. So when you add the UDx, you get additional syntax that we can provide, and we're going to show you examples of that. You can also integrate that with concepts, like Views within Vertica. So that we can say, let's give a view of Data, that gives the Data in the clear, using the UDx to decrypt that Data, and let's give everybody else, access to the raw Data which is protected. Third parties could be brought in, folks like contractors or folks that aren't vetted, as closely as a security team might do, for internal sensitive Data access, could be given access to the Vertical cluster, without risk of them breaching and going into some area, they're not supposed to take a look at. Vertica has excellent control for access, down even to the column level, which is phenomenal, and really provides you with world class security, around the Vertical solution itself. Secure Data adds another layer of protection, like we're mentioning, so that we can have Data protected in use, Data protected at rest, and then we can have the ability, to share that protected Data throughout the organization. And that's really where Secure Data shines, is the ability to protect that Data on mainframe, on mobile, and open systems, in the cloud, everywhere you want to have that Data move to and from Vertica, then you can have Secure Data, integrated with those endpoints as well. That's an additional solution on top, the Secure Data Plus Vertica solution, that is bundled together today for a sales purpose. But we can also have that conversation with you, about those wider Secure Data use cases, we'd be happy to talk to you about that. Security to the virtual appliance, is a lightweight appliance, sits on something like eight cores, 16 gigs of RAM, 100 gig of disk or 200 gig of disk, really a lightweight appliance, you can have one or many. Most customers have four in production, just for redundancy, they don't need them for scale. But we have some customers with 16 or more in production, because they're running such high volumes of transaction load. They're running a lot of web service transactions, and they're running Vertica as well. So we're going to have those virtual appliances, as co-located around the globe, hooked up to all kinds of systems, like Syslog, LDAP, load balancers, we've got a lot of capability within the appliance, to fit into your enterprise IP landscape. So let me get you directly into the neat, of what does the UDx do. If you're technical and you know SQL, this is probably going to be pretty straightforward to you, you'll see the copy command, used widely in Vertica to get Data into Vertica. So let's try to protect that Data when we're ingesting it. Let's grab it from maybe a CSV file, and put it straight into Vertica, but protected on the way and that's what the UDx does. We have Voltage Secure protectors, an added syntax, like I mentioned, to the Vertica SQL. And that allows us to say, we're going to protect the customer first name, using the parameters of hyper alphanumeric. That's our internal lingo of a format, within Secure Data, this part of our API, the API is require very few inputs. The format is the one, that you as a developer will be supplying, and you'll have different ones for maybe SSN, you'll have different formats for street address, but you can reuse a lot of your formats, across a lot of your PII, PHI Data types. Protecting after ingest is also common. So I've got some Data, that's already been put into a staging area, perhaps I've got a landing zone, a sandbox of some sort, now I want to be able to move that, into a different zone in Vertica, different area of the schema, and I want to have that Data protected. We can do that with the update command, and simply again, you'll notice Voltage Secure protect, nothing too wild there, basically the same syntax. We're going to query unprotected Data. How do we search once I've encrypted all my Data? Well, actually, there's a pretty nifty trick to do so. If you want to be able to query unprotected Data, and we have the search string, like a phone number there in this example, simply call Voltage Secure protect on that, now you'll have the cipher text, and you'll be able to search the stored cipher text. Again, we're just format preserving encrypting the Data, and it's just a string, and we can always compare those strings, using standard syntax and SQL. Using views to decrypt Data, again a powerful concept, in terms of how to make this work, within the Vertica Landscape, when you have a lot of different groups of users. Views are very powerful, to be able to point a BI tool, for instance, business intelligence tools, Cognos, Tableau, etc, might be accessing Data from Vertica with simple queries. Well, let's point them to a view that does the hard work, and uses the Vertical nodes, and its horsepower of CPU and RAM, to actually run that Udx, and do the decryption of the Data in use, temporarily in memory, and then throw that away, so that it can't be breached. That's a nice way to keep your users active and working and going forward, with their Data access and Data analytics, while also keeping the Data Secure in the process. And then we might want to export some Data, and push it out to someone in a clear text manner. We've got a third party, needs to take the tax ID along with some Data, to do some processing, all we need to do is call Voltage Secure Access, again, very similar to the protect call, and you're writing the parameter again, and boom, we have decrypted the Data and used again, the Vertical resources of RAM and CPU and horsepower, to do the work. All we're doing with Voltage Secure Data Appliance, is a real simple little key fetch, across a protected tunnel, that's a tiny atomic transaction, gets done very quick, and you're good to go. This is it in terms of the UDx, you have a couple of calls, and one parameter to pass, everything else is config driven, and really, you're up and running very quickly. We can even do demos and samples of this Vertical Udx, using hosted appliances, that we put up for pre sales purposes. So folks want to get up and get a demo going. We could take that Udx, configure it to point to our, appliance sitting on the internet, and within a couple of minutes, we're up and running with some simple use cases. Of course, for on-prem deployment, or deployment in the cloud, you'll want your own appliance in your own crypto district, you have your own security, but it just shows, that we can easily connect to any appliance, and get this working in a matter of minutes. Let's take a look deeper at the voltage plus Vertica solution, and we'll describe some of the use cases and path to success. First of all your steps to, implementing Data-centric security and Vertica. Want to note there on the left hand side, identify sensitive Data. How do we do this? I have one customer, where they look at me and say, Rich, we know exactly what our sensitive Data is, we develop the schema, it's our own App, we have a customer table, we don't need any help in this. We've got other customers that say, Rich, we have a very complex Database environment, with multiple Databases, multiple schemas, thousands of tables, hundreds of thousands of columns, it's really, really complex help, and we don't know what people have been doing exactly, with some of that Data, We've got various teams that share this resource. There, we do have additional tools, I wanted to give a shout out to another microfocus product, which is called Structured Data Manager. It's a great tool that helps you identify sensitive Data, with some really amazing technology under the hood, that can go into a Vertica repository, scan those tables, take a sample of rows or a full table scan, and give you back some really good reports on, we think this is sensitive, let's go confirm it, and move forward with Data protection. So if you need help on that, we've got the tools to do it. Once you identify that sensitive Data, you're going to want to understand, your Data flows and your use cases. Take a look at what analytics you're doing today. What analytics do you want to do, on sensitive Data in the future? Let's start designing our analytics, to work with sensitive Data, and there's some tips and tricks that we can provide, to help you mitigate, any kind of concerns around performance, or any kind of concerns around rewriting your SQL. As you've noted, you can just simply insert our SQL additions, into your code and you're off and running. You want to install and configure the Udx, and secure Data software plants. Well, the UDx is pretty darn simple. The documentation on Vertica is publicly available, you could see how that works, and what you need to configure it, one file here, and you're ready to go. So that's pretty straightforward to process, either grant some access to the Udx, and that's really up to the customer, because there are many different ways, to handle access control in Vertica, we're going to be flexible to fit within your model, of access control and adding the UDx to your mix. Each customer is a little different there, so you might want to talk with us a little bit about, the best practices for your use cases. But in general, that's going to be up and running in just a minute. The security software plants, hardened Linux appliance today, sits on-prem or in the cloud. And you can deploy that. I've seen it done in 15 minutes, but that's what the real tech you had, access to being able to generate a search, and do all this so that, your being able to set the firewall and all the DNS entries, the basically blocking and tackling of a software appliance, you get that done, corporations can take care of that, in just a couple of weeks, they get it all done, because they have wait waiting on other teams, but the software plants are really fast to get stood up, and they're very simple to administer, with our web based GUI. Then finally, you're going to implement your UDx use cases. Once the software appliance is up and running, we can set authentication methods, we could set up the format that you're going to use in Vertica, and then those two start talking together. And it should be going in dev and test in about half a day, and then you're running toward production, in just a matter of days, in most cases. We've got other customers that say, Hey, this is going to be a bigger migration project for us. We might want to split this up into chunks. Let's do the real sensitive and scary Data, like tax ID first, as our sort of toe in the water approach, and then we'll come back and protect other Data elements. That's one way to slice and dice, and implement your solution in a planned manner. Another way is schema based. Let's take a look at this section of the schema, and implement protection on these Data elements. Now let's take a look at the different schema, and we'll repeat the process, so you can iteratively move forward with your deployment. So what's the added value? When you add full Vertica plus voltage? I want to highlight this distinction because, Vertica contains world class security controls, around their Database. I'm an old time DBA from a different product, competing against Vertica in the past, and I'm really aware of the granular access controls, that are provided within various platforms. Vertica would rank at the very top of the list, in terms of being able to give me very tight control, and a lot of different AWS methods, being able to protect the Data, in a lot of different use cases. So Vertica can handle a lot of your Data protection needs, right out of the box. Voltage Secure Data, as we keep mentioning, adds that defense in-Depth, and it's going to enable those, enterprise wide use cases as well. So first off, I mentioned this, the standard of FF1, that is format preserving encryption, we're the authors of it, we continue to maintain that, and we want to emphasize that customers, really ought to be very, very careful, in terms of choosing a NIST standard, when implementing any kind of encryption, within the organization. So 8 ES was one of the first, and Hallmark, benchmark encryption algorithms, and in 2016, we were added to that mix, as FF1 with CS online. If you search NIST, and Voltage Security, you'll see us right there as the author of the standard, and all the processes that went along with that approval. We have centralized policy for key management, authentication, audit and compliance. We can now see that Vertica selected or fetch the key, to be able to protect some Data at this date and time. We can track that and be able to give you audit, and compliance reporting against that Data. You can move protected Data into and out of Vertica. So if we ingest via Kafka, and just via NiFi and Kafka, ingest on stream sets. There are a variety of different ingestion methods, and streaming methods, that can get Data into Vertica. We can integrate secure Data with all of those components. We're very well suited to integrate, with any Hadoop technology or any big Data technology, as we have API's in a variety of languages, bitness and platforms. So we've got that all out of the box, ready to go for you, if you need it. When you're moving Data out of Vertica, you might move it into an open systems platform, you might move it to the cloud, we can also operate and do the decryption there, you're going to get the same plaintext back, and if you protect Data over in the cloud, and move it into Vertica, you're going to be able to decrypt it in Vertica. That's our cross platform promise. We've been delivering on that for many, many years, and we now have many, many endpoints that do that, in production for the world's largest organization. We're going to preserve your Data format, and referential integrity. So if I protect my social security number today, I can protect another batch of Data tomorrow, and that same ciphertext will be generated, when I put that into Vertica, I can have absolute referential integrity on that Data, to be able to allow for analytics to occur, without even decrypting Data in many cases. And we have decrypt access for authorized users only, with the ability to add LDAP authentication authorization, for UDx users. So you can really have a number of different approaches, and flavors of how you implement voltage within Vertica, but what you're getting is the additional ability, to have that confidence, that we've got the Data protected at rest, even if I have a DBA that's not vetted or someone new, or I don't know where this person is from a third party, and being provided access as a DBA level privilege. They could select star from all day long, and they're going to get ciphertext, they're going to have nothing of any value, and if they want to use the UDF to decrypt it, they're going to be tracked and traced, as to their utilization of that. So it allows us to have that control, and additional layer of security on your sensitive Data. This may be required by regulatory agencies, and it's seeming that we're seeing compliance audits, get more and more strict every year. GDPR was kind of funny, because they said in 2016, hey, this is coming, they said in 2018, it's here, and now they're saying in 2020, hey, we're serious about this, and the fines are mounting. And let's give you some examples to kind of, help you understand, that these regulations are real, the fines are real, and your reputational damage can be significant, if you were to be in breach, of a regulatory compliance requirements. We're finding so many different use cases now, popping up around regional protection of Data. I need to protect this Data so that it cannot go offshore. I need to protect this Data, so that people from another region cannot see it. That's all the kind of capability that we have, within secure Data that we can add to Vertica. We have that broad platform support, and I mentioned NiFi and Kafka, those would be on the left hand side, as we start to ingest Data from applications into Vertica. We can have landing zone approaches, where we provide some automated scripting at an OS level, to be able to protect ETL batch transactions coming in. We could protect within the Vertica UDx, as I mentioned, with the copy command, directly using Vertica. Everything inside that dot dash line, is the Vertical Plus Voltage Secure Data combo, that's sold together as a single package. Additionally, we'd love to talk with you, about the stuff that's outside the dash box, because we have dozens and dozens of endpoints, that could protect and access Data, on many different platforms. And this is where you really start to leverage, some of the extensive power of secure Data, to go across platform to handle your web based apps, to handle apps in the cloud, and to handle all of this at scale, with hundreds of thousands of transactions per second, of format preserving encryption. That may not sound like much, but when you take a look at the algorithm, what we're doing on the mathematics side, when you look at everything that goes into that transaction, to me, that's an amazing accomplishment, that we're trying to reach those kinds of levels of scale, and with Vertica, it scales horizontally. So the more nodes you add, the more power you get, the more throughput you're going to get, from voltage secure Data. I want to highlight the next steps, on how we can continue to move forward. Our secure Data team is available to you, to talk about the landscape, your use cases, your Data. We really love the concept that, we've got so many different organizations out there, using secure Data in so many different and unique ways. We have vehicle manufacturers, who are protecting not just the VIN, not just their customer Data, but in fact they're protecting sensor Data from the vehicles, which is sent over the network, down to the home base every 15 minutes, for every vehicle that's on the road, and every vehicle of this customer of ours, since 2017, has included that capability. So now we're talking about, an additional millions and millions of units coming online, as those cars are sold and distributed, and used by customers. That sensor Data is critical to the customer, and they cannot let that be ex-filled in the clear. So they protect that Data with secure Data, and we have a great track record of being able to meet, a variety of different unique requirements, whether it's IoT, whether it's web based Apps, E-commerce, healthcare, all kinds of different industries, we would love to help move the conversations forward, and we do find that it's really a three party discussion, the customer, secure Data experts in some cases, and the Vertica team. We have great enablement within Vertica team, to be able to explain and present, our secure Data solution to you. But we also have that other ability to add other experts in, to keep that conversation going into a broader perspective, of how can I protect my Data across all my platforms, not just in Vertica. I want to give a shout out to our friends at Vertica Academy. They're building out a great demo and training facilities, to be able to help you learn more about these UDx's, and how they're implemented. The Academy, is a terrific reference and resource for your teams, to be able to learn more, about the solution in a self guided way, and then we'd love to have your feedback on that. How can we help you more? What are the topics you'd like to learn more about? How can we look to the future, in protecting unstructured Data? How can we look to the future, of being able to protect Data at scale? What are the requirements that we need to be meeting? Help us through the learning processes, and through feedback to the team, get better, and then we'll help you deliver more solutions, out to those endpoints and protect that Data, so that we're not having Data breach, we're not having regulatory compliance concerns. And then lastly, learn more about the Udx. I mentioned, that all of our content there, is online and available to the public. So vertica.com/secureData , you're going to be able to walk through the basics of the UDX. You're going to see how simple it is to set up, what the UDx syntax looks like, how to grant access to it, and then you'll start to be able to figure out, hey, how can I start to put this, into a PLC in my own environment? Like I mentioned before, we have publicly available hosted appliance, for demo purposes, that we can make available to you, if you want to PLC this. Reach out to us. Let's get a conversation going, and we'll get you the address and get you some instructions, we can have a quick enablement session. We really want to make this accessible to you, and help demystify the concept of encryption, because when you see it as a developer, and you start to get your hands on it and put it to use, you can very quickly see, huh, I could use this in a variety of different cases, and I could use this to protect my Data, without impacting my analytics. Those are some of the really big concerns that folks have, and once we start to get through that learning process, and playing around with it in a PLC way, that we can start to really put it to practice into production, to say, with confidence, we're going to move forward toward Data encryption, and have a very good result, at the end of the day. This is one of the things I find with customers, that's really interesting. Their biggest stress, is not around the timeframe or the resource, it's really around, this is my Data, I have been working on collecting this Data, and making it available in a very high quality way, for many years. This is my job and I'm responsible for this Data, and now you're telling me, you're going to encrypt that Data? It makes me nervous, and that's common, everybody feels that. So we want to have that conversation, and that sort of trial and error process to say, hey, let's get your feet wet with it, and see how you like it in a sandbox environment. Let's now take that into analytics, and take a look at how we can make this, go for a quick 1.0 release, and let's then take a look at, future expansions to that, where we start adding Kafka on the ingest side. We start sending Data off, into other machine learning and analytics platforms, that we might want to utilize outside of Vertica, for certain purposes, in certain industries. Let's take a look at those use cases together, and through that journey, we can really chart a path toward the future, where we can really help you protect that Data, at rest, in use, and keep you safe, from both the hackers and the regulators, and that I think at the end of the day, is really what it's all about, in terms of protecting our Data within Vertica. We're going to have a little couple minutes for Q&A, and we would encourage you to have any questions here, and we'd love to follow up with you more, about any questions you might have, about Vertica Plus Voltage Secure Data. They you very much for your time today.
SUMMARY :
and our engineering team is planning to join the Forum, and our goal is to keep you off the news,
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Vertica | ORGANIZATION | 0.99+ |
100 gig | QUANTITY | 0.99+ |
16 | QUANTITY | 0.99+ |
16 gigs | QUANTITY | 0.99+ |
200 gig | QUANTITY | 0.99+ |
Paige Roberts | PERSON | 0.99+ |
2016 | DATE | 0.99+ |
Paige | PERSON | 0.99+ |
Rich Gaston | PERSON | 0.99+ |
dozens | QUANTITY | 0.99+ |
2018 | DATE | 0.99+ |
Vertica Academy | ORGANIZATION | 0.99+ |
2020 | DATE | 0.99+ |
SQL | TITLE | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
First | QUANTITY | 0.99+ |
1000 people | QUANTITY | 0.99+ |
Hallmark | ORGANIZATION | 0.99+ |
today | DATE | 0.99+ |
Harold Potter | PERSON | 0.99+ |
Rich | PERSON | 0.99+ |
millions | QUANTITY | 0.99+ |
Stanford University | ORGANIZATION | 0.99+ |
15 minutes | QUANTITY | 0.99+ |
Today | DATE | 0.99+ |
Each customer | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
California | LOCATION | 0.99+ |
Kafka | TITLE | 0.99+ |
Vertica | TITLE | 0.99+ |
Latin | OTHER | 0.99+ |
tomorrow | DATE | 0.99+ |
2017 | DATE | 0.99+ |
eight cores | QUANTITY | 0.99+ |
two | QUANTITY | 0.98+ |
GDPR | TITLE | 0.98+ |
first | QUANTITY | 0.98+ |
one customer | QUANTITY | 0.98+ |
Tableau | TITLE | 0.98+ |
United States | LOCATION | 0.97+ |
this week | DATE | 0.97+ |
Vertica | LOCATION | 0.97+ |
4/2 | DATE | 0.97+ |
Linux | TITLE | 0.97+ |
one file | QUANTITY | 0.96+ |
vertica.com/secureData | OTHER | 0.96+ |
four | QUANTITY | 0.95+ |
about half a day | QUANTITY | 0.95+ |
Cognos | TITLE | 0.95+ |
four people | QUANTITY | 0.94+ |
Udx | ORGANIZATION | 0.94+ |
one way | QUANTITY | 0.94+ |
UNLIST TILL 4/2 - Vertica Big Data Conference Keynote
>> Joy: Welcome to the Virtual Big Data Conference. Vertica is so excited to host this event. I'm Joy King, and I'll be your host for today's Big Data Conference Keynote Session. It's my honor and my genuine pleasure to lead Vertica's product and go-to-market strategy. And I'm so lucky to have a passionate and committed team who turned our Vertica BDC event, into a virtual event in a very short amount of time. I want to thank the thousands of people, and yes, that's our true number who have registered to attend this virtual event. We were determined to balance your health, safety and your peace of mind with the excitement of the Vertica BDC. This is a very unique event. Because as I hope you all know, we focus on engineering and architecture, best practice sharing and customer stories that will educate and inspire everyone. I also want to thank our top sponsors for the virtual BDC, Arrow, and Pure Storage. Our partnerships are so important to us and to everyone in the audience. Because together, we get things done faster and better. Now for today's keynote, you'll hear from three very important and energizing speakers. First, Colin Mahony, our SVP and General Manager for Vertica, will talk about the market trends that Vertica is betting on to win for our customers. And he'll share the exciting news about our Vertica 10 announcement and how this will benefit our customers. Then you'll hear from Amy Fowler, VP of strategy and solutions for FlashBlade at Pure Storage. Our partnership with Pure Storage is truly unique in the industry, because together modern infrastructure from Pure powers modern analytics from Vertica. And then you'll hear from John Yovanovich, Director of IT at AT&T, who will tell you about the Pure Vertica Symphony that plays live every day at AT&T. Here we go, Colin, over to you. >> Colin: Well, thanks a lot joy. And, I want to echo Joy's thanks to our sponsors, and so many of you who have helped make this happen. This is not an easy time for anyone. We were certainly looking forward to getting together in person in Boston during the Vertica Big Data Conference and Winning with Data. But I think all of you and our team have done a great job, scrambling and putting together a terrific virtual event. So really appreciate your time. I also want to remind people that we will make both the slides and the full recording available after this. So for any of those who weren't able to join live, that is still going to be available. Well, things have been pretty exciting here. And in the analytic space in general, certainly for Vertica, there's a lot happening. There are a lot of problems to solve, a lot of opportunities to make things better, and a lot of data that can really make every business stronger, more efficient, and frankly, more differentiated. For Vertica, though, we know that focusing on the challenges that we can directly address with our platform, and our people, and where we can actually make the biggest difference is where we ought to be putting our energy and our resources. I think one of the things that has made Vertica so strong over the years is our ability to focus on those areas where we can make a great difference. So for us as we look at the market, and we look at where we play, there are really three recent and some not so recent, but certainly picking up a lot of the market trends that have become critical for every industry that wants to Win Big With Data. We've heard this loud and clear from our customers and from the analysts that cover the market. If I were to summarize these three areas, this really is the core focus for us right now. We know that there's massive data growth. And if we can unify the data silos so that people can really take advantage of that data, we can make a huge difference. We know that public clouds offer tremendous advantages, but we also know that balance and flexibility is critical. And we all need the benefit that machine learning for all the types up to the end data science. We all need the benefits that they can bring to every single use case, but only if it can really be operationalized at scale, accurate and in real time. And the power of Vertica is, of course, how we're able to bring so many of these things together. Let me talk a little bit more about some of these trends. So one of the first industry trends that we've all been following probably now for over the last decade, is Hadoop and specifically HDFS. So many companies have invested, time, money, more importantly, people in leveraging the opportunity that HDFS brought to the market. HDFS is really part of a much broader storage disruption that we'll talk a little bit more about, more broadly than HDFS. But HDFS itself was really designed for petabytes of data, leveraging low cost commodity hardware and the ability to capture a wide variety of data formats, from a wide variety of data sources and applications. And I think what people really wanted, was to store that data before having to define exactly what structures they should go into. So over the last decade or so, the focus for most organizations is figuring out how to capture, store and frankly manage that data. And as a platform to do that, I think, Hadoop was pretty good. It certainly changed the way that a lot of enterprises think about their data and where it's locked up. In parallel with Hadoop, particularly over the last five years, Cloud Object Storage has also given every organization another option for collecting, storing and managing even more data. That has led to a huge growth in data storage, obviously, up on public clouds like Amazon and their S3, Google Cloud Storage and Azure Blob Storage just to name a few. And then when you consider regional and local object storage offered by cloud vendors all over the world, the explosion of that data, in leveraging this type of object storage is very real. And I think, as I mentioned, it's just part of this broader storage disruption that's been going on. But with all this growth in the data, in all these new places to put this data, every organization we talk to is facing even more challenges now around the data silo. Sure the data silos certainly getting bigger. And hopefully they're getting cheaper per bit. But as I said, the focus has really been on collecting, storing and managing the data. But between the new data lakes and many different cloud object storage combined with all sorts of data types from the complexity of managing all this, getting that business value has been very limited. This actually takes me to big bet number one for Team Vertica, which is to unify the data. Our goal, and some of the announcements we have made today plus roadmap announcements I'll share with you throughout this presentation. Our goal is to ensure that all the time, money and effort that has gone into storing that data, all the data turns into business value. So how are we going to do that? With a unified analytics platform that analyzes the data wherever it is HDFS, Cloud Object Storage, External tables in an any format ORC, Parquet, JSON, and of course, our own Native Roth Vertica format. Analyze the data in the right place in the right format, using a single unified tool. This is something that Vertica has always been committed to, and you'll see in some of our announcements today, we're just doubling down on that commitment. Let's talk a little bit more about the public cloud. This is certainly the second trend. It's the second wave maybe of data disruption with object storage. And there's a lot of advantages when it comes to public cloud. There's no question that the public clouds give rapid access to compute storage with the added benefit of eliminating data center maintenance that so many companies, want to get out of themselves. But maybe the biggest advantage that I see is the architectural innovation. The public clouds have introduced so many methodologies around how to provision quickly, separating compute and storage and really dialing-in the exact needs on demand, as you change workloads. When public clouds began, it made a lot of sense for the cloud providers and their customers to charge and pay for compute and storage in the ratio that each use case demanded. And I think you're seeing that trend, proliferate all over the place, not just up in public cloud. That architecture itself is really becoming the next generation architecture for on-premise data centers, as well. But there are a lot of concerns. I think we're all aware of them. They're out there many times for different workloads, there are higher costs. Especially if some of the workloads that are being run through analytics, which tend to run all the time. Just like some of the silo challenges that companies are facing with HDFS, data lakes and cloud storage, the public clouds have similar types of siloed challenges as well. Initially, there was a belief that they were cheaper than data centers, and when you added in all the costs, it looked that way. And again, for certain elastic workloads, that is the case. I don't think that's true across the board overall. Even to the point where a lot of the cloud vendors aren't just charging lower costs anymore. We hear from a lot of customers that they don't really want to tether themselves to any one cloud because of some of those uncertainties. Of course, security and privacy are a concern. We hear a lot of concerns with regards to cloud and even some SaaS vendors around shared data catalogs, across all the customers and not enough separation. But security concerns are out there, you can read about them. I'm not going to jump into that bandwagon. But we hear about them. And then, of course, I think one of the things we hear the most from our customers, is that each cloud stack is starting to feel even a lot more locked in than the traditional data warehouse appliance. And as everybody knows, the industry has been running away from appliances as fast as it can. And so they're not eager to get locked into another, quote, unquote, virtual appliance, if you will, up in the cloud. They really want to make sure they have flexibility in which clouds, they're going to today, tomorrow and in the future. And frankly, we hear from a lot of our customers that they're very interested in eventually mixing and matching, compute from one cloud with, say storage from another cloud, which I think is something that we'll hear a lot more about. And so for us, that's why we've got our big bet number two. we love the cloud. We love the public cloud. We love the private clouds on-premise, and other hosting providers. But our passion and commitment is for Vertica to be able to run in any of the clouds that our customers choose, and make it portable across those clouds. We have supported on-premises and all public clouds for years. And today, we have announced even more support for Vertica in Eon Mode, the deployment option that leverages the separation of compute from storage, with even more deployment choices, which I'm going to also touch more on as we go. So super excited about our big bet number two. And finally as I mentioned, for all the hype that there is around machine learning, I actually think that most importantly, this third trend that team Vertica is determined to address is the need to bring business critical, analytics, machine learning, data science projects into production. For so many years, there just wasn't enough data available to justify the investment in machine learning. Also, processing power was expensive, and storage was prohibitively expensive. But to train and score and evaluate all the different models to unlock the full power of predictive analytics was tough. Today you have those massive data volumes. You have the relatively cheap processing power and storage to make that dream a reality. And if you think about this, I mean with all the data that's available to every company, the real need is to operationalize the speed and the scale of machine learning so that these organizations can actually take advantage of it where they need to. I mean, we've seen this for years with Vertica, going back to some of the most advanced gaming companies in the early days, they were incorporating this with live data directly into their gaming experiences. Well, every organization wants to do that now. And the accuracy for clickability and real time actions are all key to separating the leaders from the rest of the pack in every industry when it comes to machine learning. But if you look at a lot of these projects, the reality is that there's a ton of buzz, there's a ton of hype spanning every acronym that you can imagine. But most companies are struggling, do the separate teams, different tools, silos and the limitation that many platforms are facing, driving, down sampling to get a small subset of the data, to try to create a model that then doesn't apply, or compromising accuracy and making it virtually impossible to replicate models, and understand decisions. And if there's one thing that we've learned when it comes to data, prescriptive data at the atomic level, being able to show end of one as we refer to it, meaning individually tailored data. No matter what it is healthcare, entertainment experiences, like gaming or other, being able to get at the granular data and make these decisions, make that scoring applies to machine learning just as much as it applies to giving somebody a next-best-offer. But the opportunity has never been greater. The need to integrate this end-to-end workflow and support the right tools without compromising on that accuracy. Think about it as no downsampling, using all the data, it really is key to machine learning success. Which should be no surprise then why the third big bet from Vertica is one that we've actually been working on for years. And we're so proud to be where we are today, helping the data disruptors across the world operationalize machine learning. This big bet has the potential to truly unlock, really the potential of machine learning. And today, we're announcing some very important new capabilities specifically focused on unifying the work being done by the data science community, with their preferred tools and platforms, and the volume of data and performance at scale, available in Vertica. Our strategy has been very consistent over the last several years. As I said in the beginning, we haven't deviated from our strategy. Of course, there's always things that we add. Most of the time, it's customer driven, it's based on what our customers are asking us to do. But I think we've also done a great job, not trying to be all things to all people. Especially as these hype cycles flare up around us, we absolutely love participating in these different areas without getting completely distracted. I mean, there's a variety of query tools and data warehouses and analytics platforms in the market. We all know that. There are tools and platforms that are offered by the public cloud vendors, by other vendors that support one or two specific clouds. There are appliance vendors, who I was referring to earlier who can deliver package data warehouse offerings for private data centers. And there's a ton of popular machine learning tools, languages and other kits. But Vertica is the only advanced analytic platform that can do all this, that can bring it together. We can analyze the data wherever it is, in HDFS, S3 Object Storage, or Vertica itself. Natively we support multiple clouds on-premise deployments, And maybe most importantly, we offer that choice of deployment modes to allow our customers to choose the architecture that works for them right now. It still also gives them the option to change move, evolve over time. And Vertica is the only analytics database with end-to-end machine learning that can truly operationalize ML at scale. And I know it's a mouthful. But it is not easy to do all these things. It is one of the things that highly differentiates Vertica from the rest of the pack. It is also why our customers, all of you continue to bet on us and see the value that we are delivering and we will continue to deliver. Here's a couple of examples of some of our customers who are powered by Vertica. It's the scale of data. It's the millisecond response times. Performance and scale have always been a huge part of what we have been about, not the only thing. I think the functionality all the capabilities that we add to the platform, the ease of use, the flexibility, obviously with the deployment. But if you look at some of the numbers they are under these customers on this slide. And I've shared a lot of different stories about these customers. Which, by the way, it still amaze me every time I talk to one and I get the updates, you can see the power and the difference that Vertica is making. Equally important, if you look at a lot of these customers, they are the epitome of being able to deploy Vertica in a lot of different environments. Many of the customers on this slide are not using Vertica just on-premise or just in the cloud. They're using it in a hybrid way. They're using it in multiple different clouds. And again, we've been with them on that journey throughout, which is what has made this product and frankly, our roadmap and our vision exactly what it is. It's been quite a journey. And that journey continues now with the Vertica 10 release. The Vertica 10 release is obviously a massive release for us. But if you look back, you can see that building on that native columnar architecture that started a long time ago, obviously, with the C-Store paper. We built it to leverage that commodity hardware, because it was an architecture that was never tightly integrated with any specific underlying infrastructure. I still remember hearing the initial pitch from Mike Stonebreaker, about the vision of Vertica as a software only solution and the importance of separating the company from hardware innovation. And at the time, Mike basically said to me, "there's so much R&D in innovation that's going to happen in hardware, we shouldn't bake hardware into our solution. We should do it in software, and we'll be able to take advantage of that hardware." And that is exactly what has happened. But one of the most recent innovations that we embraced with hardware is certainly that separation of compute and storage. As I said previously, the public cloud providers offered this next generation architecture, really to ensure that they can provide the customers exactly what they needed, more compute or more storage and charge for each, respectively. The separation of compute and storage, compute from storage is a major milestone in data center architectures. If you think about it, it's really not only a public cloud innovation, though. It fundamentally redefines the next generation data architecture for on-premise and for pretty much every way people are thinking about computing today. And that goes for software too. Object storage is an example of the cost effective means for storing data. And even more importantly, separating compute from storage for analytic workloads has a lot of advantages. Including the opportunity to manage much more dynamic, flexible workloads. And more importantly, truly isolate those workloads from others. And by the way, once you start having something that can truly isolate workloads, then you can have the conversations around autonomic computing, around setting up some nodes, some compute resources on the data that won't affect any of the other data to do some things on their own, maybe some self analytics, by the system, etc. A lot of things that many of you know we've already been exploring in terms of our own system data in the product. But it was May 2018, believe it or not, it seems like a long time ago where we first announced Eon Mode and I want to make something very clear, actually about Eon mode. It's a mode, it's a deployment option for Vertica customers. And I think this is another huge benefit that we don't talk about enough. But unlike a lot of vendors in the market who will dig you and charge you for every single add-on like hit-buy, you name it. You get this with the Vertica product. If you continue to pay support and maintenance, this comes with the upgrade. This comes as part of the new release. So any customer who owns or buys Vertica has the ability to set up either an Enterprise Mode or Eon Mode, which is a question I know that comes up sometimes. Our first announcement of Eon was obviously AWS customers, including the trade desk, AT&T. Most of whom will be speaking here later at the Virtual Big Data Conference. They saw a huge opportunity. Eon Mode, not only allowed Vertica to scale elastically with that specific compute and storage that was needed, but it really dramatically simplified database operations including things like workload balancing, node recovery, compute provisioning, etc. So one of the most popular functions is that ability to isolate the workloads and really allocate those resources without negatively affecting others. And even though traditional data warehouses, including Vertica Enterprise Mode have been able to do lots of different workload isolation, it's never been as strong as Eon Mode. Well, it certainly didn't take long for our customers to see that value across the board with Eon Mode. Not just up in the cloud, in partnership with one of our most valued partners and a platinum sponsor here. Joy mentioned at the beginning. We announced Vertica Eon Mode for Pure Storage FlashBlade in September 2019. And again, just to be clear, this is not a new product, it's one Vertica with yet more deployment options. With Pure Storage, Vertica in Eon mode is not limited in any way by variable cloud, network latency. The performance is actually amazing when you take the benefits of separate and compute from storage and you run it with a Pure environment on-premise. Vertica in Eon Mode has a super smart cache layer that we call the depot. It's a big part of our secret sauce around Eon mode. And combined with the power and performance of Pure's FlashBlade, Vertica became the industry's first advanced analytics platform that actually separates compute and storage for on-premises data centers. Something that a lot of our customers are already benefiting from, and we're super excited about it. But as I said, this is a journey. We don't stop, we're not going to stop. Our customers need the flexibility of multiple public clouds. So today with Vertica 10, we're super proud and excited to announce support for Vertica in Eon Mode on Google Cloud. This gives our customers the ability to use their Vertica licenses on Amazon AWS, on-premise with Pure Storage and on Google Cloud. Now, we were talking about HDFS and a lot of our customers who have invested quite a bit in HDFS as a place, especially to store data have been pushing us to support Eon Mode with HDFS. So as part of Vertica 10, we are also announcing support for Vertica in Eon Mode using HDFS as the communal storage. Vertica's own Roth format data can be stored in HDFS, and actually the full functionality of Vertica is complete analytics, geospatial pattern matching, time series, machine learning, everything that we have in there can be applied to this data. And on the same HDFS nodes, Vertica can actually also analyze data in ORC or Parquet format, using External tables. We can also execute joins between the Roth data the External table holds, which powers a much more comprehensive view. So again, it's that flexibility to be able to support our customers, wherever they need us to support them on whatever platform, they have. Vertica 10 gives us a lot more ways that we can deploy Eon Mode in various environments for our customers. It allows them to take advantage of Vertica in Eon Mode and the power that it brings with that separation, with that workload isolation, to whichever platform they are most comfortable with. Now, there's a lot that has come in Vertica 10. I'm definitely not going to be able to cover everything. But we also introduced complex types as an example. And complex data types fit very well into Eon as well in this separation. They significantly reduce the data pipeline, the cost of moving data between those, a much better support for unstructured data, which a lot of our customers have mixed with structured data, of course, and they leverage a lot of columnar execution that Vertica provides. So you get complex data types in Vertica now, a lot more data, stronger performance. It goes great with the announcement that we made with the broader Eon Mode. Let's talk a little bit more about machine learning. We've been actually doing work in and around machine learning with various extra regressions and a whole bunch of other algorithms for several years. We saw the huge advantage that MPP offered, not just as a sequel engine as a database, but for ML as well. Didn't take as long to realize that there's a lot more to operationalizing machine learning than just those algorithms. It's data preparation, it's that model trade training. It's the scoring, the shaping, the evaluation. That is so much of what machine learning and frankly, data science is about. You do know, everybody always wants to jump to the sexy algorithm and we handle those tasks very, very well. It makes Vertica a terrific platform to do that. A lot of work in data science and machine learning is done in other tools. I had mentioned that there's just so many tools out there. We want people to be able to take advantage of all that. We never believed we were going to be the best algorithm company or come up with the best models for people to use. So with Vertica 10, we support PMML. We can import now and export PMML models. It's a huge step for us around that operationalizing machine learning projects for our customers. Allowing the models to get built outside of Vertica yet be imported in and then applying to that full scale of data with all the performance that you would expect from Vertica. We also are more tightly integrating with Python. As many of you know, we've been doing a lot of open source projects with the community driven by many of our customers, like Uber. And so now with Python we've integrated with TensorFlow, allowing data scientists to build models in their preferred language, to take advantage of TensorFlow. But again, to store and deploy those models at scale with Vertica. I think both these announcements are proof of our big bet number three, and really our commitment to supporting innovation throughout the community by operationalizing ML with that accuracy, performance and scale of Vertica for our customers. Again, there's a lot of steps when it comes to the workflow of machine learning. These are some of them that you can see on the slide, and it's definitely not linear either. We see this as a circle. And companies that do it, well just continue to learn, they continue to rescore, they continue to redeploy and they want to operationalize all that within a single platform that can take advantage of all those capabilities. And that is the platform, with a very robust ecosystem that Vertica has always been committed to as an organization and will continue to be. This graphic, many of you have seen it evolve over the years. Frankly, if we put everything and everyone on here wouldn't fit on a slide. But it will absolutely continue to evolve and grow as we support our customers, where they need the support most. So, again, being able to deploy everywhere, being able to take advantage of Vertica, not just as a business analyst or a business user, but as a data scientists or as an operational or BI person. We want Vertica to be leveraged and used by the broader organization. So I think it's fair to say and I encourage everybody to learn more about Vertica 10, because I'm just highlighting some of the bigger aspects of it. But we talked about those three market trends. The need to unify the silos, the need for hybrid multiple cloud deployment options, the need to operationalize business critical machine learning projects. Vertica 10 has absolutely delivered on those. But again, we are not going to stop. It is our job not to, and this is how Team Vertica thrives. I always joke that the next release is the best release. And, of course, even after Vertica 10, that is also true, although Vertica 10 is pretty awesome. But, you know, from the first line of code, we've always been focused on performance and scale, right. And like any really strong data platform, the execution engine, the optimizer and the execution engine are the two core pieces of that. Beyond Vertica 10, some of the big things that we're already working on, next generation execution engine. We're already actually seeing incredible early performance from this. And this is just one example, of how important it is for an organization like Vertica to constantly go back and re-innovate. Every single release, we do the sit ups and crunches, our performance and scale. How do we improve? And there's so many parts of the core server, there's so many parts of our broader ecosystem. We are constantly looking at coverages of how we can go back to all the code lines that we have, and make them better in the current environment. And it's not an easy thing to do when you're doing that, and you're also expanding in the environment that we are expanding into to take advantage of the different deployments, which is a great segue to this slide. Because if you think about today, we're obviously already available with Eon Mode and Amazon, AWS and Pure and actually MinIO as well. As I talked about in Vertica 10 we're adding Google and HDFS. And coming next, obviously, Microsoft Azure, Alibaba cloud. So being able to expand into more of these environments is really important for the Vertica team and how we go forward. And it's not just running in these clouds, for us, we want it to be a SaaS like experience in all these clouds. We want you to be able to deploy Vertica in 15 minutes or less on these clouds. You can also consume Vertica, in a lot of different ways, on these clouds. As an example, in Amazon Vertica by the Hour. So for us, it's not just about running, it's about taking advantage of the ecosystems that all these cloud providers offer, and really optimizing the Vertica experience as part of them. Optimization, around automation, around self service capabilities, extending our management console, we now have products that like the Vertica Advisor Tool that our Customer Success Team has created to actually use our own smarts in Vertica. To take data from customers that give it to us and help them tune automatically their environment. You can imagine that we're taking that to the next level, in a lot of different endeavors that we're doing around how Vertica as a product can actually be smarter because we all know that simplicity is key. There just aren't enough people in the world who are good at managing data and taking it to the next level. And of course, other things that we all hear about, whether it's Kubernetes and containerization. You can imagine that that probably works very well with the Eon Mode and separating compute and storage. But innovation happens everywhere. We innovate around our community documentation. Many of you have taken advantage of the Vertica Academy. The numbers there are through the roof in terms of the number of people coming in and certifying on it. So there's a lot of things that are within the core products. There's a lot of activity and action beyond the core products that we're taking advantage of. And let's not forget why we're here, right? It's easy to talk about a platform, a data platform, it's easy to jump into all the functionality, the analytics, the flexibility, how we can offer it. But at the end of the day, somebody, a person, she's got to take advantage of this data, she's got to be able to take this data and use this information to make a critical business decision. And that doesn't happen unless we explore lots of different and frankly, new ways to get that predictive analytics UI and interface beyond just the standard BI tools in front of her at the right time. And so there's a lot of activity, I'll tease you with that going on in this organization right now about how we can do that and deliver that for our customers. We're in a great position to be able to see exactly how this data is consumed and used and start with this core platform that we have to go out. Look, I know, the plan wasn't to do this as a virtual BDC. But I really appreciate you tuning in. Really appreciate your support. I think if there's any silver lining to us, maybe not being able to do this in person, it's the fact that the reach has actually gone significantly higher than what we would have been able to do in person in Boston. We're certainly looking forward to doing a Big Data Conference in the future. But if I could leave you with anything, know this, since that first release for Vertica, and our very first customers, we have been very consistent. We respect all the innovation around us, whether it's open source or not. We understand the market trends. We embrace those new ideas and technologies and for us true north, and the most important thing is what does our customer need to do? What problem are they trying to solve? And how do we use the advantages that we have without disrupting our customers? But knowing that you depend on us to deliver that unified analytics strategy, it will deliver that performance of scale, not only today, but tomorrow and for years to come. We've added a lot of great features to Vertica. I think we've said no to a lot of things, frankly, that we just knew we wouldn't be the best company to deliver. When we say we're going to do things we do them. Vertica 10 is a perfect example of so many of those things that we from you, our customers have heard loud and clear, and we have delivered. I am incredibly proud of this team across the board. I think the culture of Vertica, a customer first culture, jumping in to help our customers win no matter what is also something that sets us massively apart. I hear horror stories about support experiences with other organizations. And people always seem to be amazed at Team Vertica's willingness to jump in or their aptitude for certain technical capabilities or understanding the business. And I think sometimes we take that for granted. But that is the team that we have as Team Vertica. We are incredibly excited about Vertica 10. I think you're going to love the Virtual Big Data Conference this year. I encourage you to tune in. Maybe one other benefit is I know some people were worried about not being able to see different sessions because they were going to overlap with each other well now, even if you can't do it live, you'll be able to do those sessions on demand. Please enjoy the Vertica Big Data Conference here in 2020. Please you and your families and your co-workers be safe during these times. I know we will get through it. And analytics is probably going to help with a lot of that and we already know it is helping in many different ways. So believe in the data, believe in data's ability to change the world for the better. And thank you for your time. And with that, I am delighted to now introduce Micro Focus CEO Stephen Murdoch to the Vertica Big Data Virtual Conference. Thank you Stephen. >> Stephen: Hi, everyone, my name is Stephen Murdoch. I have the pleasure and privilege of being the Chief Executive Officer here at Micro Focus. Please let me add my welcome to the Big Data Conference. And also my thanks for your support, as we've had to pivot to this being virtual rather than a physical conference. Its amazing how quickly we all reset to a new normal. I certainly didn't expect to be addressing you from my study. Vertica is an incredibly important part of Micro Focus family. Is key to our goal of trying to enable and help customers become much more data driven across all of their IT operations. Vertica 10 is a huge step forward, we believe. It allows for multi-cloud innovation, genuinely hybrid deployments, begin to leverage machine learning properly in the enterprise, and also allows the opportunity to unify currently siloed lakes of information. We operate in a very noisy, very competitive market, and there are people, who are in that market who can do some of those things. The reason we are so excited about Vertica is we genuinely believe that we are the best at doing all of those things. And that's why we've announced publicly, you're under executing internally, incremental investment into Vertica. That investments targeted at accelerating the roadmaps that already exist. And getting that innovation into your hands faster. This idea is speed is key. It's not a question of if companies have to become data driven organizations, it's a question of when. So that speed now is really important. And that's why we believe that the Big Data Conference gives a great opportunity for you to accelerate your own plans. You will have the opportunity to talk to some of our best architects, some of the best development brains that we have. But more importantly, you'll also get to hear from some of our phenomenal Roth Data customers. You'll hear from Uber, from the Trade Desk, from Philips, and from AT&T, as well as many many others. And just hearing how those customers are using the power of Vertica to accelerate their own, I think is the highlight. And I encourage you to use this opportunity to its full. Let me close by, again saying thank you, we genuinely hope that you get as much from this virtual conference as you could have from a physical conference. And we look forward to your engagement, and we look forward to hearing your feedback. With that, thank you very much. >> Joy: Thank you so much, Stephen, for joining us for the Vertica Big Data Conference. Your support and enthusiasm for Vertica is so clear, and it makes a big difference. Now, I'm delighted to introduce Amy Fowler, the VP of Strategy and Solutions for FlashBlade at Pure Storage, who was one of our BDC Platinum Sponsors, and one of our most valued partners. It was a proud moment for me, when we announced Vertica in Eon mode for Pure Storage FlashBlade and we became the first analytics data warehouse that separates compute from storage for on-premise data centers. Thank you so much, Amy, for joining us. Let's get started. >> Amy: Well, thank you, Joy so much for having us. And thank you all for joining us today, virtually, as we may all be. So, as we just heard from Colin Mahony, there are some really interesting trends that are happening right now in the big data analytics market. From the end of the Hadoop hype cycle, to the new cloud reality, and even the opportunity to help the many data science and machine learning projects move from labs to production. So let's talk about these trends in the context of infrastructure. And in particular, look at why a modern storage platform is relevant as organizations take on the challenges and opportunities associated with these trends. The answer is the Hadoop hype cycles left a lot of data in HDFS data lakes, or reservoirs or swamps depending upon the level of the data hygiene. But without the ability to get the value that was promised from Hadoop as a platform rather than a distributed file store. And when we combine that data with the massive volume of data in Cloud Object Storage, we find ourselves with a lot of data and a lot of silos, but without a way to unify that data and find value in it. Now when you look at the infrastructure data lakes are traditionally built on, it is often direct attached storage or data. The approach that Hadoop took when it entered the market was primarily bound by the limits of networking and storage technologies. One gig ethernet and slower spinning disk. But today, those barriers do not exist. And all FlashStorage has fundamentally transformed how data is accessed, managed and leveraged. The need for local data storage for significant volumes of data has been largely mitigated by the performance increases afforded by all Flash. At the same time, organizations can achieve superior economies of scale with that segregation of compute and storage. With compute and storage, you don't always scale in lockstep. Would you want to add an engine to the train every time you add another boxcar? Probably not. But from a Pure Storage perspective, FlashBlade is uniquely architected to allow customers to achieve better resource utilization for compute and storage, while at the same time, reducing complexity that has arisen from the siloed nature of the original big data solutions. The second and equally important recent trend we see is something I'll call cloud reality. The public clouds made a lot of promises and some of those promises were delivered. But cloud economics, especially usage based and elastic scaling, without the control that many companies need to manage the financial impact is causing a lot of issues. In addition, the risk of vendor lock-in from data egress, charges, to integrated software stacks that can't be moved or deployed on-premise is causing a lot of organizations to back off the all the way non-cloud strategy, and move toward hybrid deployments. Which is kind of funny in a way because it wasn't that long ago that there was a lot of talk about no more data centers. And for example, one large retailer, I won't name them, but I'll admit they are my favorites. They several years ago told us they were completely done with on-prem storage infrastructure, because they were going 100% to the cloud. But they just deployed FlashBlade for their data pipelines, because they need predictable performance at scale. And the all cloud TCO just didn't add up. Now, that being said, well, there are certainly challenges with the public cloud. It has also brought some things to the table that we see most organizations wanting. First of all, in a lot of cases applications have been built to leverage object storage platforms like S3. So they need that object protocol, but they may also need it to be fast. And the said object may be oxymoron only a few years ago, and this is an area of the market where Pure and FlashBlade have really taken a leadership position. Second, regardless of where the data is physically stored, organizations want the best elements of a cloud experience. And for us, that means two main things. Number one is simplicity and ease of use. If you need a bunch of storage experts to run the system, that should be considered a bug. The other big one is the consumption model. The ability to pay for what you need when you need it, and seamlessly grow your environment over time totally nondestructively. This is actually pretty huge and something that a lot of vendors try to solve for with finance programs. But no finance program can address the pain of a forklift upgrade, when you need to move to next gen hardware. To scale nondestructively over long periods of time, five to 10 years plus is a crucial architectural decisions need to be made at the outset. Plus, you need the ability to pay as you use it. And we offer something for FlashBlade called Pure as a Service, which delivers exactly that. The third cloud characteristic that many organizations want is the option for hybrid. Even if that is just a DR site in the cloud. In our case, that means supporting appplication of S3, at the AWS. And the final trend, which to me represents the biggest opportunity for all of us, is the need to help the many data science and machine learning projects move from labs to production. This means bringing all the machine learning functions and model training to the data, rather than moving samples or segments of data to separate platforms. As we all know, machine learning needs a ton of data for accuracy. And there is just too much data to retrieve from the cloud for every training job. At the same time, predictive analytics without accuracy is not going to deliver the business advantage that everyone is seeking. You can kind of visualize data analytics as it is traditionally deployed as being on a continuum. With that thing, we've been doing the longest, data warehousing on one end, and AI on the other end. But the way this manifests in most environments is a series of silos that get built up. So data is duplicated across all kinds of bespoke analytics and AI, environments and infrastructure. This creates an expensive and complex environment. So historically, there was no other way to do it because some level of performance is always table stakes. And each of these parts of the data pipeline has a different workload profile. A single platform to deliver on the multi dimensional performances, diverse set of applications required, that didn't exist three years ago. And that's why the application vendors pointed you towards bespoke things like DAS environments that we talked about earlier. And the fact that better options exists today is why we're seeing them move towards supporting this disaggregation of compute and storage. And when it comes to a platform that is a better option, one with a modern architecture that can address the diverse performance requirements of this continuum, and allow organizations to bring a model to the data instead of creating separate silos. That's exactly what FlashBlade is built for. Small files, large files, high throughput, low latency and scale to petabytes in a single namespace. And this is importantly a single rapid space is what we're focused on delivering for our customers. At Pure, we talk about it in the context of modern data experience because at the end of the day, that's what it's really all about. The experience for your teams in your organization. And together Pure Storage and Vertica have delivered that experience to a wide range of customers. From a SaaS analytics company, which uses Vertica on FlashBlade to authenticate the quality of digital media in real time, to a multinational car company, which uses Vertica on FlashBlade to make thousands of decisions per second for autonomous cars, or a healthcare organization, which uses Vertica on FlashBlade to enable healthcare providers to make real time decisions that impact lives. And I'm sure you're all looking forward to hearing from John Yavanovich from AT&T. To hear how he's been doing this with Vertica and FlashBlade as well. He's coming up soon. We have been really excited to build this partnership with Vertica. And we're proud to provide the only on-premise storage platform validated with Vertica Eon Mode. And deliver this modern data experience to our customers together. Thank you all so much for joining us today. >> Joy: Amy, thank you so much for your time and your insights. Modern infrastructure is key to modern analytics, especially as organizations leverage next generation data center architectures, and object storage for their on-premise data centers. Now, I'm delighted to introduce our last speaker in our Vertica Big Data Conference Keynote, John Yovanovich, Director of IT for AT&T. Vertica is so proud to serve AT&T, and especially proud of the harmonious impact we are having in partnership with Pure Storage. John, welcome to the Virtual Vertica BDC. >> John: Thank you joy. It's a pleasure to be here. And I'm excited to go through this presentation today. And in a unique fashion today 'cause as I was thinking through how I wanted to present the partnership that we have formed together between Pure Storage, Vertica and AT&T, I want to emphasize how well we all work together and how these three components have really driven home, my desire for a harmonious to use your word relationship. So, I'm going to move forward here and with. So here, what I'm going to do the theme of today's presentation is the Pure Vertica Symphony live at AT&T. And if anybody is a Westworld fan, you can appreciate the sheet music on the right hand side. What we're going to what I'm going to highlight here is in a musical fashion, is how we at AT&T leverage these technologies to save money to deliver a more efficient platform, and to actually just to make our customers happier overall. So as we look back, and back as early as just maybe a few years ago here at AT&T, I realized that we had many musicians to help the company. Or maybe you might want to call them data scientists, or data analysts. For the theme we'll stay with musicians. None of them were singing or playing from the same hymn book or sheet music. And so what we had was many organizations chasing a similar dream, but not exactly the same dream. And, best way to describe that is and I think with a lot of people this might resonate in your organizations. How many organizations are chasing a customer 360 view in your company? Well, I can tell you that I have at least four in my company. And I'm sure there are many that I don't know of. That is our problem because what we see is a repetitive sourcing of data. We see a repetitive copying of data. And there's just so much money to be spent. This is where I asked Pure Storage and Vertica to help me solve that problem with their technologies. What I also noticed was that there was no coordination between these departments. In fact, if you look here, nobody really wants to play with finance. Sales, marketing and care, sure that you all copied each other's data. But they actually didn't communicate with each other as they were copying the data. So the data became replicated and out of sync. This is a challenge throughout, not just my company, but all companies across the world. And that is, the more we replicate the data, the more problems we have at chasing or conquering the goal of single version of truth. In fact, I kid that I think that AT&T, we actually have adopted the multiple versions of truth, techno theory, which is not where we want to be, but this is where we are. But we are conquering that with the synergies between Pure Storage and Vertica. This is what it leaves us with. And this is where we are challenged and that if each one of our siloed business units had their own stories, their own dedicated stories, and some of them had more money than others so they bought more storage. Some of them anticipating storing more data, and then they really did. Others are running out of space, but can't put anymore because their bodies aren't been replenished. So if you look at it from this side view here, we have a limited amount of compute or fixed compute dedicated to each one of these silos. And that's because of the, wanting to own your own. And the other part is that you are limited or wasting space, depending on where you are in the organization. So there were the synergies aren't just about the data, but actually the compute and the storage. And I wanted to tackle that challenge as well. So I was tackling the data. I was tackling the storage, and I was tackling the compute all at the same time. So my ask across the company was can we just please play together okay. And to do that, I knew that I wasn't going to tackle this by getting everybody in the same room and getting them to agree that we needed one account table, because they will argue about whose account table is the best account table. But I knew that if I brought the account tables together, they would soon see that they had so much redundancy that I can now start retiring data sources. I also knew that if I brought all the compute together, that they would all be happy. But I didn't want them to tackle across tackle each other. And in fact that was one of the things that all business units really enjoy. Is they enjoy the silo of having their own compute, and more or less being able to control their own destiny. Well, Vertica's subclustering allows just that. And this is exactly what I was hoping for, and I'm glad they've brought through. And finally, how did I solve the problem of the single account table? Well when you don't have dedicated storage, and you can separate compute and storage as Vertica in Eon Mode does. And we store the data on FlashBlades, which you see on the left and right hand side, of our container, which I can describe in a moment. Okay, so what we have here, is we have a container full of compute with all the Vertica nodes sitting in the middle. Two loader, we'll call them loader subclusters, sitting on the sides, which are dedicated to just putting data onto the FlashBlades, which is sitting on both ends of the container. Now today, I have two dedicated storage or common dedicated might not be the right word, but two storage racks one on the left one on the right. And I treat them as separate storage racks. They could be one, but i created them separately for disaster recovery purposes, lashing work in case that rack were to go down. But that being said, there's no reason why I'm probably going to add a couple of them here in the future. So I can just have a, say five to 10, petabyte storage, setup, and I'll have my DR in another 'cause the DR shouldn't be in the same container. Okay, but I'll DR outside of this container. So I got them all together, I leveraged subclustering, I leveraged separate and compute. I was able to convince many of my clients that they didn't need their own account table, that they were better off having one. I eliminated, I reduced latency, I reduced our ticketing I reduce our data quality issues AKA ticketing okay. I was able to expand. What is this? As work. I was able to leverage elasticity within this cluster. As you can see, there are racks and racks of compute. We set up what we'll call the fixed capacity that each of the business units needed. And then I'm able to ramp up and release the compute that's necessary for each one of my clients based on their workloads throughout the day. And so while they compute to the right before you see that the instruments have already like, more or less, dedicated themselves towards all those are free for anybody to use. So in essence, what I have, is I have a concert hall with a lot of seats available. So if I want to run a 10 chair Symphony or 80, chairs, Symphony, I'm able to do that. And all the while, I can also do the same with my loader nodes. I can expand my loader nodes, to actually have their own Symphony or write all to themselves and not compete with any other workloads of the other clusters. What does that change for our organization? Well, it really changes the way our database administrators actually do their jobs. This has been a big transformation for them. They have actually become data conductors. Maybe you might even call them composers, which is interesting, because what I've asked them to do is morph into less technology and more workload analysis. And in doing so we're able to write auto-detect scripts, that watch the queues, watch the workloads so that we can help ramp up and trim down the cluster and subclusters as necessary. There has been an exciting transformation for our DBAs, who I need to now classify as something maybe like DCAs. I don't know, I have to work with HR on that. But I think it's an exciting future for their careers. And if we bring it all together, If we bring it all together, and then our clusters, start looking like this. Where everything is moving in harmonious, we have lots of seats open for extra musicians. And we are able to emulate a cloud experience on-prem. And so, I want you to sit back and enjoy the Pure Vertica Symphony live at AT&T. (soft music) >> Joy: Thank you so much, John, for an informative and very creative look at the benefits that AT&T is getting from its Pure Vertica symphony. I do really like the idea of engaging HR to change the title to Data Conductor. That's fantastic. I've always believed that music brings people together. And now it's clear that analytics at AT&T is part of that musical advantage. So, now it's time for a short break. And we'll be back for our breakout sessions, beginning at 12 pm Eastern Daylight Time. We have some really exciting sessions planned later today. And then again, as you can see on Wednesday. Now because all of you are already logged in and listening to this keynote, you already know the steps to continue to participate in the sessions that are listed here and on the previous slide. In addition, everyone received an email yesterday, today, and you'll get another one tomorrow, outlining the simple steps to register, login and choose your session. If you have any questions, check out the emails or go to www.vertica.com/bdc2020 for the logistics information. There are a lot of choices and that's always a good thing. Don't worry if you want to attend one or more or can't listen to these live sessions due to your timezone. All the sessions, including the Q&A sections will be available on demand and everyone will have access to the recordings as well as even more pre-recorded sessions that we'll post to the BDC website. Now I do want to leave you with two other important sites. First, our Vertica Academy. Vertica Academy is available to everyone. And there's a variety of very technical, self-paced, on-demand training, virtual instructor-led workshops, and Vertica Essentials Certification. And it's all free. Because we believe that Vertica expertise, helps everyone accelerate their Vertica projects and the advantage that those projects deliver. Now, if you have questions or want to engage with our Vertica engineering team now, we're waiting for you on the Vertica forum. We'll answer any questions or discuss any ideas that you might have. Thank you again for joining the Vertica Big Data Conference Keynote Session. Enjoy the rest of the BDC because there's a lot more to come
SUMMARY :
And he'll share the exciting news And that is the platform, with a very robust ecosystem some of the best development brains that we have. the VP of Strategy and Solutions is causing a lot of organizations to back off the and especially proud of the harmonious impact And that is, the more we replicate the data, Enjoy the rest of the BDC because there's a lot more to come
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Stephen | PERSON | 0.99+ |
Amy Fowler | PERSON | 0.99+ |
Mike | PERSON | 0.99+ |
John Yavanovich | PERSON | 0.99+ |
Amy | PERSON | 0.99+ |
Colin Mahony | PERSON | 0.99+ |
AT&T | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
John Yovanovich | PERSON | 0.99+ |
Vertica | ORGANIZATION | 0.99+ |
Joy King | PERSON | 0.99+ |
Mike Stonebreaker | PERSON | 0.99+ |
John | PERSON | 0.99+ |
May 2018 | DATE | 0.99+ |
100% | QUANTITY | 0.99+ |
Wednesday | DATE | 0.99+ |
Colin | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Vertica Academy | ORGANIZATION | 0.99+ |
five | QUANTITY | 0.99+ |
Joy | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
two | QUANTITY | 0.99+ |
Uber | ORGANIZATION | 0.99+ |
Stephen Murdoch | PERSON | 0.99+ |
Vertica 10 | TITLE | 0.99+ |
Pure Storage | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Philips | ORGANIZATION | 0.99+ |
tomorrow | DATE | 0.99+ |
AT&T. | ORGANIZATION | 0.99+ |
September 2019 | DATE | 0.99+ |
Python | TITLE | 0.99+ |
www.vertica.com/bdc2020 | OTHER | 0.99+ |
One gig | QUANTITY | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Second | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
15 minutes | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |