Joe Gonzalez, MassMutual | Virtual Vertica BDC 2020


 

(bright music)
>> Announcer: It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica.
>> Hello everybody, welcome back to theCUBE's coverage of the Vertica Big Data Conference, the Virtual BDC. My name is Dave Vellante, and you're watching theCUBE. And we're here with Joe Gonzalez, who is a Vertica DBA at MassMutual Financial. Joe, thanks so much for coming on theCUBE. I'm sorry that we can't be face to face in Boston, but at least we're being responsible. So thank you for coming on.
>> (laughs) Thank you for having me. It's nice to be here.
>> Yeah, so let's set it up. We'll talk, you know, a little bit about MassMutual. Everybody knows it's a big financial firm, but what's your role there and kind of your mission?
>> So my role is Vertica DBA. I was hired in January of last year to come on and manage their Vertica cluster. They'd been on Vertica for probably about a year and a half before that. They started out on an on-prem cluster and then moved to AWS Enterprise in the cloud, and brought me on just as they were considering transitioning over to Vertica's EON mode. And they didn't really have anybody dedicated to Vertica, nobody who really knew and understood the product. And I'd been working with Vertica for probably about six, seven years at that point. I was looking for something new and landed a really good opportunity here with a great company.
>> Yeah, you have a lot of experience in Vertica. You had a role in market research, so you're a data guy, right? I mean, that's really what you've been doing your entire career.
>> I am. I've worked with Pitney Bowes in the postage industry, and I worked in healthcare auditing, after seven years in market research. And then I've been with MassMutual for a little over a year now. So yeah, quite a lot.
>> So tell us a little bit about what your objectives are at MassMutual, what you're doing with the platform, what applications it's supporting. Paint a picture for us, if you would.
>> Certainly. So MassMutual just decided to make Vertica its enterprise data warehouse. So they've really bought into Vertica. And we're moving all of our data there. Probably a good 80, 90% of MassMutual's data is going to be on the Vertica platform, in EON mode. And we have wide usage of that data across the corporation. Right now we're at about 50 terabytes and growing quickly, with a wide variety of users. So there's a lot of ETLs coming in overnight, loading a lot of data, transforming a lot of data. And a lot of reporting tools are using it, currently Tableau and MicroStrategy. We have Alteryx using it, and we also have APIs running against it throughout the day, 24/7, with people coming in, especially these days with, you know, some financial uncertainty going on. A lot of people are coming in and checking their 401(k)s, checking their insurance status and whatnot. So we have to handle a lot of concurrent traffic on top of the normal big queries. So it's quite a diverse cluster. And I'm glad they're really investing in using Vertica as their overall solution for this.
>> Yeah, I mean, these days your 401(k) is like this, right? (laughing) Afraid to look. So I wonder, Joe, if you could share with our audience, for those who might not be as familiar with the history of Vertica, and specifically of MPP. Historically you had, you know, the traditional RDBMS, whether it's Db2 or Oracle, and then you had a spate of companies that came out with this notion of MPP. Vertica is, I think, probably one of the few, if not the only, brands that has survived. But what did that bring to the industry, and why is that important for people to understand, just in terms of whatever it is: scale, performance, cost? Can you explain that?
>> To me, it actually brought scale at good cost. And that's why I've been a big proponent of Vertica ever since I started using it. There's a number, like you said, of different platforms where you can load big data and store and house big data. But the purpose of having that big data is not just for it to sit there, but to be used, and used in a variety of ways. And that ranges from, you know, something small, like the first installation I was on, which was about 10 terabytes. And, you know, I've worked with data warehouses up to 100 terabytes, and, you know, there are Vertica installations with hundreds of petabytes on them. You want to be able to use that data, so you need a platform that's going to be able to access that data and get it to the clients, get it to the customers as quickly as possible, and not pay an arm and a leg for the privilege of doing so. And Vertica allows companies to do that: not only get their data to clients and, you know, in-company users quickly, but save money while doing so.
>> So why couldn't I just use a traditional RDBMS? Why not just throw it all into Oracle?
>> One, cost. Oracle is very expensive, while Vertica is a lot more affordable than that. But the column-store structure of Vertica also allows for a lot more optimized queries. Some of the queries that you can run in Vertica in 2, 3, 4 seconds will take minutes and sometimes hours in an RDBMS like Oracle, like SQL Server. They have the capability to store that amount of data, no question, but the usability really lacks when you start querying tables that are 180 billion rows, or tables in Vertica that are over 1000 columns. Those will take hours to run on a traditional RDBMS, and then running them in Vertica, I get my queries back in a sec.
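(Editor's illustration of the column-store point above. This is a minimal sketch in Python, not Vertica's implementation: the function names, the toy table, and the "values read" counter are all assumptions made for illustration. It only shows why a query that needs one column out of many touches far less data when each column is stored on its own.)

```python
from typing import Dict, List, Tuple


def make_rows(n_rows: int, n_cols: int) -> List[Tuple[int, ...]]:
    """Row layout (hypothetical toy table): each record keeps all its values together."""
    return [tuple(r * n_cols + c for c in range(n_cols)) for r in range(n_rows)]


def to_columns(rows: List[Tuple[int, ...]]) -> Dict[int, List[int]]:
    """Column layout: the values of each column are stored together."""
    n_cols = len(rows[0])
    return {c: [row[c] for row in rows] for c in range(n_cols)}


def sum_from_rows(rows: List[Tuple[int, ...]], col: int) -> Tuple[int, int]:
    """Sum one column from the row layout.

    Returns (total, values_read). The counter models a row store that has to
    read the whole record to reach one field.
    """
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)
        total += row[col]
    return total, values_read


def sum_from_columns(columns: Dict[int, List[int]], col: int) -> Tuple[int, int]:
    """Sum one column from the column layout; only that column is read."""
    values = columns[col]
    return sum(values), len(values)


if __name__ == "__main__":
    rows = make_rows(n_rows=1000, n_cols=50)
    columns = to_columns(rows)
    print("row layout   :", sum_from_rows(rows, 3))
    print("column layout:", sum_from_columns(columns, 3))
```

Both functions return the same sum; in this toy model the row layout reads 50,000 values and the column layout reads 1,000. Real engines add compression, sorting, and parallelism on top, which this sketch does not attempt to model.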
>> You know what's interesting to me, Joe, and I wonder if you could comment: it seems that Vertica has done a good job of, you know, riding the waves, whether it was HDFS in the early part of the big data era, or machine learning and machine intelligence, whether it's, you know, TensorFlow and other data science tools. And the cloud is the other one, right? A lot of times cloud is super disruptive, particularly to companies that started on-prem. It seems like Vertica somehow has been able to adopt and embrace some of these trends. First of all, from your standpoint as a customer, is that true? And why do you think that is? Is it architectural? Is it the mindset, the engineering? I wonder if you could comment on that.
>> It's absolutely true. I started out, again, on an on-prem Vertica data warehouse, and we kind of, you know, rolled along with them. You know, more and more people have been using data, and they want to make it accessible to people on the web now. And, you know, having the option to provide that data from an on-prem solution or from AWS is key, and now Vertica is offering even a hybrid solution, if you want to keep some of your data behind a firewall, on-prem, and put some in the cloud as well. So Vertica has absolutely evolved along with the industry in ways that no other company really has that I've seen. And I think the reason for it, and the reason I've stayed with Vertica, and specifically have remained a Vertica DBA for the last seven years, is the way Vertica stays in touch with its customers. I've been working with the same people for the seven, eight years I've been using Vertica. They're family. I'm part of their family, and, you know, I'm good friends with some of these people. And they really are in tune not only with the customer but with what they're doing.
They really sit down with you and have those conversations about, you know, what are your needs? How can we make Vertica better? And they listen to their clients. You know, just having access to the engineers who develop Vertica, arranged on a phone call or whatnot, I've never had that with any other company. Vertica makes that available to their customers when they need it. So the personal touch is huge for them.
>> That's good. It's always good to get the confirmation from the practitioners, and not just hear from the vendor. I want to ask you about the EON transition. You mentioned that MassMutual brought you in to help with that. What were some of the challenges that you faced? And how did you get over them? And why EON? You know, what was the goal, the outcome, and some of the challenges, maybe, that you had to overcome?
>> Right. So MassMutual had an interesting setup when I first came in. They had three different Vertica clusters to accommodate three different portions of their business. There was one for the data scientists, who use the data quite extensively in very large, very intense queries for their work with predictive analytics and whatnot. There was a separate one for the APIs, which needed, you know, sub-second query response times. And on the Enterprise solution, they weren't always able to get the performance they needed, because the fast queries were being overrun by the larger queries that needed more resources. And then they had a third for starting to develop this enterprise data platform and, you know, looking into their future. The first challenge was bringing all those three together, back into a single cluster, and allowing our users to have both the heavy queries and the API queries running at the same time, on the same platform, without having to completely separate them out onto different clusters.
EON really helps with that because it allows you to store that data in the S3 communal storage and have the main cluster set up to run the heavy queries. And then you can set up subclusters that still point to that S3 data but separate out the compute, so that the APIs really have their own resources to run and are not interfered with by other processes.
>> Okay, so I'm hearing a couple of things. One is you're sort of busting down data silos. So you're able to have a much more coherent view of your data, which I would imagine is critical, certainly. Companies like MassMutual have been around for 100 years, and so you've got all kinds of data dispersed. So to the extent that you can break down those silos, that's important. But also being able to, I guess, have granular increments of compute and storage is what I'm hearing. What does that do for you? Does it make that more efficient? Are there other business benefits? Maybe you could elucidate.
>> Well, one, cost is, again, a huge benefit. The cost of running three different clusters, even in AWS, in the Enterprise solution was a little costly. You know, you had to have your dedicated servers here and there. So you're paying for, like, you know, 12, 15 different servers, for example. Whereas when we bring them all back into EON, I can run everything on a six-node production cluster. And, you know, when things are busy, I can spin up the three-node subcluster for the APIs, only paid for when I need it, and then bring them back into the main cluster when things have slowed down a bit, and they can get the performance that they need. So that saves a ton on resource costs. You know, you're not paying for the storage, you're paying for one S3 bucket, and you're only paying for the nodes, the EC2 instances, that are up and running when you need them, and that is huge. And again, like you said, it gives us the ability to silo our data without having to completely separate our data into different storage areas.
Which is a big benefit. It gives us the ability to query everything from one single cluster without having to synchronize it across, you know, three different ones. So this one is going to have theirs, and this one is going to have theirs, but everyone's still looking at the same data. And we replicate that in QA and Dev so that people can work outside of production and do some testing as well.
>> So EON is obviously a very important innovation. And of course, Vertica touts the difference between it and others who separate compute from storage, and you know, they're not the only one that does that, but they are really, I think, the only one that does it for on-prem, and virtually across clouds. So my question is, and I think you're doing a breakout session on the Virtual BDC. We were going to be in Boston, now we're doing it online. If I'm in the audience, I'm imagining I'm a junior DBA at an organization that maybe doesn't have a Joe. I haven't been an expert for seven years. What do I need to do to get up to speed on EON? It sounds great, I want it. I'm going to save my company money, but I'm nervous 'cause I've only been a Vertica DBA for, you know, a year, and I'm sort of, you know, not as experienced as you. What are the things that I should be thinking about? Do I need to hire somebody? Do I need to bring in a consultant? Can I learn it myself? What would you advise?
>> It's definitely easy enough that if you have at least a little bit of work experience, you can learn it yourself, okay? 'Cause the concepts are still there. There are some, you know, little nuances where you do need to be aware of certain changes between the Enterprise and EON editions.
But I would also say consult with your Vertica Account Manager. You know, let them bring in the right people from Vertica to help you get up to speed, and if you need to, there are also resources available as far as consultants go that will help you get up to speed very quickly. And we did work together with Vertica and with one of their partners, Clarity, in helping us to understand EON better and set it up the right way. You know, how do we pick the number of shards for our data warehouse? They helped us evaluate all that and pick the right number of shards and the right number of nodes to get set up and going. And, you know, they helped us figure out the best ways to get our data over from the Enterprise Edition into EON very quickly and very efficiently. So different with yourself.
>> I wanted to ask you about organizational, you know, issues, because, you know, guys like you, practitioners, always tell me, "Look, the technology comes and goes, that's kind of the easy part, we're good at that. It's the people, it's the processes, the skill sets." What does your, you know, team regime look like? And do you have any sort of ideal team makeup or, you know, ideal advice? Is it two-pizza teams? What kind of skills? What kind of interaction and communication with senior leadership? I wonder if you could just give us some color on that.
>> One of the things that makes me extremely proud to be working for MassMutual right now is that they do what a lot of companies have not been doing, and that is investing in IT. They have put a lot of thought, a lot of money, and a lot of support into setting up their enterprise data platform and putting Vertica at the center. And not only did they put the money into getting the software that they needed, like Vertica, you know, MicroStrategy, and all the other tools that we were using to use that, they put the money into the people. Our managers are extremely supportive of us.
We hired about 40 to 45 different people within a four-month time frame: data engineers, data analysts, data modelers, a nice mix of people who can help shape your data, bring the data in, and help the users use the data properly, and who allow me as the database administrator to make sure that they're doing what they're doing most efficiently and to focus on my job. So you have to have that diversity among the different data skills in order to make your team successful.
>> That's awesome. Kind of a side question, and it's really not Vertica's wheelhouse, but I'm curious. You know, in the early days of the big data, you know, movement, a lot of the data scientists would complain, and they still do, that "80% of my time is spent wrangling data." The tools for the data engineers, the data scientists, the database, you know, experts, they're all different. Is that changing? And to what degree is that changing? Kind of what inning are we in, just in terms of a more facile environment for all those roles?
>> Again, I think it depends, company to company, you know, on what resources they make available to the data scientists. And the data scientists, we have a lot of them at MassMutual. And they're very much into doing a lot of machine learning, model training, predictive analytics. And they are, you know, used to doing it outside of Vertica too, you know, pulling that data out into Python and Scala, Spark, and tools like that. And they're also now just getting into using Vertica's in-database analytics and machine learning, which is a skill that, you know, definitely nobody else out there has. So being able to have somebody who understands Vertica, like myself, and being able to train other people to use Vertica in the way that is most efficient for them is key.
But also just having people who understand not only the tools that you're using, but how to model data, how to architect your tables, your schemas, the interaction between your tables and schemas and whatnot. You need to have that diversity in order to make this work. And our data scientists have benefited immensely from the structure that MassMutual put in place through our data management delivery team.
>> That's great. I think I saw, somewhere in your background, that you've trained about 100 people in Vertica. Did I get that right?
>> Yes. Since I started here, I've gone to our Boston location, our Springfield location, and our New York City location and trained, probably at this point, about 120, 140 of our Vertica users. And I'm trying to do, you know, a couple of follow-up sessions per year.
>> So adoption, obviously, is a big goal of yours. Getting people to adopt the platform, but then, more importantly, I guess, deliver business value and outcomes.
>> Absolutely.
>> Yeah, I wanted to ask you about encryption. You know, in the perfect world, everything would be encrypted, but there are trade-offs. Are you using encryption? What are you doing in that regard?
>> We are actually just getting into that now, due to the New York and the CCPA regulations that are now in place. We do have a lot of Personally Identifiable Information in our data store that does require encryption. So we are going through a months-long process that started in December, I think, actually a bit earlier than that, to start identifying all the columns, not only in our Vertica database, but in, you know, the other databases that we do use. You know, we have a Postgres database, SQL Server, and Teradata for the time being, until that moves into Vertica. And we identify where that data sits and what downstream applications pull that data from the data sources and store it locally as well, and start encrypting that data.
And because of the tight relationship between Voltage and Vertica, we settled on Voltage as the major platform to start doing that encryption. So we're going to be implementing that in Vertica probably within the next month or two, and rolling it out to all the teams that have data that requires encryption. We're going to start rolling it out to the downstream application owners to make sure that they are encrypting the data as they get it pulled over. And we're also using another product for several other applications that don't mesh as well with Voltage.
>> Voltage being Micro Focus's encryption solution, correct?
>> Right, yes.
>> Yes, of course. Micro Focus, for the audience, owns Vertica, and Vertica is a separate brand. So I want to ask you, to kind of close, what success looks like. You've been at this for a number of years, coming into MassMutual, which was great to hear. I've had some past experience with MassMutual. It's an awesome company. I've been to the Springfield facility and the one in Boston as well, and I have great respect for them, and they've really always been a leader. So it's great to hear that they're investing in technology as a differentiator. What does success look like for you? Let's say you're at MassMutual for a few years and you're looking back. What does success look like? Go.
>> A good question. It's changing every day, you know, with more and more applications coming onboard, more and more data being pulled in, more uses being found for the data that we have. I think success for me is making sure that Vertica, first of all, is always up and is always running at its most optimal to keep our users happy. I think when I started, you know, we had a lot of processes that were running, you know, six, seven hours. Some of them were taking, you know, almost a day, because they were so complicated. We've got those running in under an hour now, some of them running in a matter of minutes.
I want to keep that optimization going for all of our processes. Like I said, there are a lot of users using this data. And it's been hard over my first year here to get to all of them. And thankfully, you know, I'm getting a bit of help now. I have a couple of system DBAs that I'm training up to help out with these optimizations, you know, fixing queries, fixing projections to make sure that queries do run as quickly as possible. So getting that to its optimal stage is one. Two, getting our data encrypted and protected so that even if, for whatever reason, somebody somehow breaks into our data, they're not going to be able to get anything at all, because our data is 100% protected. And I think more companies need to be focusing on that as well. And third, I want to see our data science teams using more and more of Vertica's in-database predictive analytics and in-database machine learning products, and really making their jobs more efficient by doing so.
>> Joe, you're an awesome guest. I mean, like I said, we always love having the practitioners on and getting the straight skinny. You're welcome back anytime, and as I say, I wish we could have met in Boston. Maybe next year at the BDC. But it's great to have you online, and thanks for coming on theCUBE.
>> And thank you for having me, and hopefully we'll meet next year.
>> Yeah, I hope so. And thank you, everybody, for watching. Remember, theCUBE is running concurrent with the Vertica Virtual BDC. It's vertica.com/bdc2020 if you want to check out all the keynotes and all the breakout sessions. I'm Dave Vellante for theCUBE. We'll be doing more interviews, so stay right there. Thanks for watching. (bright music)

Published Date : Mar 31 2020

