UNLIST TILL 4/2 - A Deep Dive into the Vertica Management Console Enhancements and Roadmap

>> Jeff: Hello, everybody, and thank you for joining us today for the virtual Vertica BDC 2020. Today's breakout session is entitled "A Deep Dive "into the Vertica Mangement Console Enhancements and Roadmap." I'm Jeff Healey of Vertica Marketing. I'll be your host for this breakout session. Joining me are Bhavik Gandhi and Natalia Stavisky from Vertica engineering. But before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait, just type your question or comment in the question box below the slides and click submit. There will be a Q and A session at the end of the presentation. We'll answer as many questions as we're able to during that time. Any questions we don't address, we'll do our best to answer them offline. Alternatively visit Vertica Forums at forum.vertica.com. Post your question there after the session. Our engineering team is planning to join the forums to keep the conversation going well after the event. Also, a reminder that you can maximize the screen by clicking the double arrow button in the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to you on demand this week. We'll send you a notification as soon as it's ready. Now let's get started. Over to you, Bhavik. >> Bhavik: All right. So hello, and welcome, everybody doing this presentation of "Deep Dive into the Vertica Management Console Enhancements and Roadmap." Myself, Bhavik, and my team member, Natalia Stavisky, will go over a few useful announcements on Vertica Management Console, discussing a few real scenarios. All right. So today we will go forward with the brief introduction about the Management Console, then we will discuss the benefits of using Management Console by going over a couple of user scenarios for the query taking too long to run and receiving email alerts from Management Console. Then we will go over a few MC features for what we call Eon Mode databases, like provisioning and reviving the Eon Mode databases from MC, managing the subcluster and understanding the Depot. Then we will go over some of the future announcements on MC that we are planning. All right, so let's get started. All right. So, do you want to know about how to provision a new Vertica cluster from MC? How to analyze and understand a database workload by monitoring the queries on the database? How do you balance the resource pools and use alerts and thresholds on MC? So, the Management Console is basically our answer and we'll talk about its capabilities and new announcements in this presentation. So just to give a brief overview of the Management Console, who uses Management Console, it's generally used by IT administrators and DB admins. Management Console can be used to monitor both Eon Mode and Enterprise Mode databases. Why to use Management Console? You can use Management Console for provisioning Vertica databases and cluster. You can manage the already existing Vertica databases and cluster you have, and you can use various tools on Management Console like query execution, Database Designer, Workload Analyzer, and set up alerts and thresholds to get notified by some of your activities on the MC. So let's go over a few benefits of using Management Console. Okay. So using Management Console, you can view and optimize resource pool usage. Management Console helps you to identify some critical conditions on your Vertica cluster. Additionally, you can set up various thresholds thresholds in MC and get other data if those thresholds are triggered on the database. So now let's dig into the couple of scenarios. So for the first scenario, we will discuss about queries taking too long and using workload analyzer to possibly help to solve the problem. In the second scenario, we will go over alert email that you received from your Management Console and analyzing the problem and taking required actions to solve the problem. So let's go over the scenario where queries are taking too long to run. So in this example, we have this one query that we are running using the query execution on MC. And for some reason we notice that it's taking about 14.8 seconds seconds to execute this query, which is higher than the expected run time of the query. The query that we are running happens to be the query used by MC during the extended monitoring. Notice that the table name and the schema name which is ds_requests_issued, and, is the schema used for extended monitoring. Now in 10.0 MC we have redesigned the Workload Analyzer and Recommendations feature to show the recommendations and allow you to execute those recommendations. In our example, we have taken the table name and figured the tuning descriptions to see if there are any tuning recommendations related to this table. As we see over here, there are three tuning recommendations available for that table. So now in 10.0 MC, you can select those recommendations and then run them. So let's run the recommendations. All right. So once recommendations are run successfully, you can go and see all the processed recommendations that you have run previously. Over here we see that there are three recommendations that we had selected earlier have successfully processed. Now we take the same query and run it on the query execution on MC and hey, it's running really faster and we see that it takes only 0.3 seconds to run the query and, which is about like 98% decrease in original runtime of the query. So in this example we saw that using a Workload Analyzer tool on MC you can possibly triage and solve issue for your queries which are taking to long to execute. All right. So now let's go over another user scenario where DB admin's received some alert email messages from MC and would like to understand and analyze the problem. So to know more about what's going on on the database and proactively react to the problems, DB admins using the Management Console can create set of thresholds and get alerted about the conditions on the database if the threshold values is reached and then respond to the problem thereafter. Now as a DB admin, I see some email message notifications from MC and upon checking the emails, I see that there are a couple of email alerts received from MC on my email. So one of the messages that I received was for Query Resource Rejections greater than 5, pool, midpool7. And then around the same time, I received another email from the MC for the Failed Queries greater than 5, and in this case I see there are 80 failed queries. So now let's go on the MC and investigate the problem. So before going into the deep investigation about failures, let's review the threshold settings on MC. So as we see, we have set up the thresholds under the database settings page for failed queries in the last 10 minutes greater than 5 and MC should send an email to the individual if the threshold is triggered. And also we have a threshold set up for queries and resource rejections in the last five minutes for midpool7 set to greater than 5. There are various other thresholds on this page that you can set if you desire to. Now let's go and triage those email alerts about the failed queries and resource rejections that we had received. To analyze the failed queries, let's take a look at the query statistics page on the database Overview page on MC. Let's take a look at the Resource Pools graph and especially for the failed queries for each resource pools. And over to the right under the failed query section, I see about like, in the last 24 hours, there are about 6,000 failed queries for midpool7. And now I switch to view to see the statistics for each user and on this page I see for User MaryLee on the right hand side there are a high number of failed queries in last 24 hours. And to know more about the failed queries for this user, I can click on the graph for this user and get the reasons behind it. So let's click on the graph and see what's going on. And so clicking on this graph, it takes me to the failed queries view on the Query Monitoring page for database, on Database activities tab. And over here, I see there are a high number of failed queries for this user, MaryLee, with the reasons stated as, exceeding high limit. To drill down more and to know more reasons behind it, I can click on the plus icon on the left hand side for each failed queries to get the failure reason for each node on the database. So let's do that. And clicking the plus icon, I see for the two nodes that are listed, over here it says there are insufficient resources like memory and file handles for midpool7. Now let's go and analyze the midpool7 configurations and activities on it. So to do so, I will go over to the Resource Pool Monitoring view and select midpool7. I see the resource allocations for this resource pool is very low. For example, the max memory is just 1MB and the max concurrency is set to 0. Hmm, that's very odd configuration for this resource pool. Also in the bottom right graph for the resource rejections for midpool7, the graph shows very high values for resource rejection. All right. So since we saw some odd configurations and odd resource allocations for midpool7, I would like to see when this resource, when the settings were changed on the resource pools. So to do this, I can preview the audit logs on, are available on the Management Console. So I can go onto the Vertica Audit Logs and see the logs for the resource pool. So I just (mumbles) for the logs and figuring the logs for midpool7. I see on February 17th, the memory and other attributes for midpool7 were modified. So now let's analyze the resource activity for midpool7 around the time when the configurations were changed. So in our case we are using extended monitoring on MC for this database, so we can go back in time and see the statistics over the larger time range for midpool7. So viewing the activities for midpool7 around February 17th, around the time when these configurations were changed, we see a decrease in resource pool usage. Also, on the bottom right, we see the resource rejections for this midpool7 have an increase, linear increase, after the configurations were changed. I can select a point on the graph to get the more details about the resource rejections. Now to analyze the effects of the modifications on midpool7. Let's go over to the Query Monitoring page. All right, I will adjust the time range around the time when the configurations were changed for midpool7 and completed activities queries for user MaryLee. And I see there are no completed queries for this user. Now I'm taking a look at the Failed Queries tab and adjusting the time range around the time when the configurations were changed. I can do so because we are using extended monitoring. So again, adjusting the time, I can see there are high number of failed queries for this user. There about about like 10,000 failed queries for this user after the configurations were changed on this resource pool. So now let's go and modify the settings since we know after the configurations were changed, this user was not able to run the queries. So you can change the resource pool settings of using Management Console's database settings page and under the Resource Pools tab. So selecting the midpool7, I see the same odd configurations for this resource pool that we saw earlier. So now let's go and modify it, the settings. So I will increase the max memory and modify the settings for midpool7 so that it has adequate resources to run the queries for the user. Hit apply on the right hand top to see the settings. Now let's do the validation after we change the resource pool attributes. So let's go over to the same query monitoring page and see if MaryLee user is able to run the queries for midpool7. We see that now, after the configuration, after the change, after we changed the configuration for midpool7, the user can run the queries successfully and the count for Completed Queries has increased after we modified the settings for this midpool7 resource pool. And also viewing the resource pool monitoring page, we can validate that after the new configurations for midpool7 has been applied and also the resource pool usage after the configuration change has increased. And also on the bottom right graph, we can see that the resource rejections for midpool7 has decreased over the time after we modified the settings. And since we are using extended monitoring for this database, I can see that the trend in data for these resource pools, the before and after effects of modifying the settings. So initially when the settings were changed, there were high resource rejections and after we again modified the settings, the resource rejections went down. Right. So now let's go work with the provisioning and reviving the Eon Mode Vertica database cluster using the Management Console on different platform. So Management Console supports provisioning and reviving of Eon Mode databases on various cloud environments like AWS, the Google Cloud Platform, and Pure Storage. So for Google, for provisioning the Vertica Management Console on Google Cloud Platform you can use launch a template. Or on AWS environment you can use the cloud formation templates available for different OS's. Once you have provisioned Vertica Management Console, you can provision the Vertica cluster and databases from MC itself. So you can provision a Vertica cluster, you can select the Create new database button available on the homepage. This will open up the wizard to create a new database and cluster. In this example, we are using we are using the Google Cloud Platform. So the wizard will ask me for varius authentication parameters for the Google Cloud Platform. And if you're on AWS, it'll ask you for the authentication parameters for the AWS environment. And going forward on the Wizard, it'll ask me to select the instance Type. I will select for the new Vertica cluster. And also provide the communal location url for my Eon Mode database and all the other preferences related to the new cluster. Once I have selected all the preferences for my new cluster I can preview the settings and I can hit, if I am, I can hit Create if all looks okay. So if I hit Create, this will create a new, MC will create a new GCP instances because we are on the GCP environment in this example. It will create a cluster on this instance, it'll create a Vertica Eon Mode Database on this cluster. And it will, additionally, you can load the test data on it if you like to. Now let's go over and revive the existing Eon Mode database from the communal location. So you can do it the same using the Management Console by selecting the Revive Eon Mode database button on the homepage. This will again open up the wizard for reviving the Eon Mode database. Again, in this example, since we are using GCP Platform, it will ask me for the Google Cloud storage authentication attributes. And for reviving, it will ask me for the communal location so I can enter the Google Storage bucket and my folder and it will discover all the Eon Mode databases located under this folder. And I can select one of the databases that I would like to revive. And it will ask me for other Vertica preferences and for this video, for this database reviving. And once I enter all the preferences and review all the preferences I can hit Revive the database button on the Wizard. So after I hit Revive database it will create the GCP instances. The number of GCP instances that I created would be seen as the number of hosts on the original Vertica cluster. It will install the Vertica cluster on this data, on this instances and it will revive the database and it will start the database. And after starting the database, it will be imported on the MC so you can start monitoring on it. So in this example, we saw you can provision and revive the Vertica database on the GCP Platform. Additionally, you can use AWS environment to provision and revive. So now since we have the Eon Mode database on MC, Natalia will go over some Eon Mode features on MC like managing subcluster and Depot activity monitoring. Over to you, Natalia. >> Natalia: Okay, thank you. Hello, my name is Natalia Stavisky. I am also a member of Vertica Management Console Team. And I will talk today about the work I did to allow users to manage subclusters using the Management Console, and also the work I did to help users understand what's going on in their Depot in the Vertica Eon Mode database. So let's look at the picture of the subclusters. On the Manage page of Vertica Management Console, you can see here is a page that has blue tabs, and the tab that's active is Subclusters. You can see that there are two subclusters are available in this database. And for each of the subclusters, you can see subcluster properties, whether this is the primary subcluster or secondary. In this case, primary is the default subcluster. It's indicated by a star. You can see what nodes belong to each subcluster. You can see the node state and node statistics. You can also easily add a new subcluster. And we're quickly going to do this. So once you click on the button, you'll launch the wizard that'll take you through the steps. You'll enter the name of the subcluster, indicate whether this is secondary or primary subcluster. I should mention that Vertica recommends having only one primary subcluster. But we have both options here available. You will enter the number of nodes for your subcluster. And once the subcluster has been created, you can manage the subcluster. What other options for managing subcluster we have here? You can scale up an existing subcluster and that's a similar approach, you launch the wizard and (mumbles) nodes. You want to add to your existing subcluster. You can scale down a subcluster. And MC validates requirements for maintaining minimal number of nodes to prevent database shutdown. So if you can not remove any nodes from a subcluster, this option will not be available. You can stop a subcluster. And depending on whether this is a primary subcluster or secondary subcluster, this option may be available or not available. Like in this picture, we can see that for the default subcluster this option is not available. And this is because shutting down the default subcluster will cause the database to shut down as well. You can terminate a subcluster. And again, the MC warns you not to terminate the primary subcluster and validates requirements for maintaining minimal number of nodes to prevent database shutdown. So now we are going to talk a little more about how the MC helps you to understand what's going on in your Depot. So Depot is one of the core of Eon Mode database. And what are the frequently asked questions about the Depot? Is the Depot size sufficient? Are a subset of users putting a high load on the database? What tables are fetched and evicted repeatedly, we call it "re-fetched," in Depot? So here in the Depot Activity Monitoring page, we now have four tabs that allow you to answer those questions. And we'll go a little more in detail through each of them, but I'll just mention what they are for now. At a Glance shows you basic Depot configuration and also shows you query executing. Depot Efficiency, we'll talk more about that and other tabs. Depot Content, that shows you what tables are currently in your Depot. And Depot Pinning allows you to see what pinning policies have been created and to create new pinning policies. Now let's go through a scenario. Monitoring performance of workloads on one subcluster. As you know, Eon Mode database allows you to have multiple subclusters and we'll explore how this feature is useful and how we can use the Management Console to make decisions regarding whether you would like to have multiple subclusters. So here we have, in my setup, a single subcluster called default_subcluster. It has two users that are running queries that are accessing tables, mostly in schema public. So the query started executing and we can see that after fetching tables from Communal, which is the red line, the rest of the time the queries are executing in Depot. The green line is indicating queries running in Depot. The all nodes Depot is about 88% full, a steady flow, and the depot size seems to be sufficient for query executions from Depot only. That's the good case scenario. Now at around 17 :15, user Sherry got an urgent request to generate a report. And at, she started running her queries. We can see that picture is quite different now. The tables Sherry is querying are in a different schema and are much larger. Now we can see multiple lines in different colors. We can see a bunch of fetches and evictions which are indicated by blue and purple bars, and a lot of queries are now spilling into Communal. This is the red and orange lines. Orange line is an indicator of a query running partially in Depot and partially getting fetched from Communal. And the red line is data fetched from Communal storage. Let's click on the, one of the lines. Each data point, each point on the line, it'll take you to the Query Details page where you can see more about what's going on. So this is the page that shows us what queries have been run in this particular time interval which is on top of this page in orange color. So that's about one minute time interval and now we can see user Sherry among the users that are running queries. Sherry's queries involve large tables and are running against a different schema. We can see the clickstream schema in the name of the, in part of the query request. So what is happening, there is not enough Depot space for both the schema that's already in use and the one Sherry needs. As a result, evictions and fetches have started occurring. What other questions we can ask ourself to help us understand what's going on? So how about, what tables are most frequently re-fetched? So for that, we will go to the Depot Efficiency page and look at the middle, the middle chart here. We can see the larger version of this chart if we expand it. So now we have 10 tables listed that are most frequently being re-fetched. We can see that there is a clickstream schema and there are other schemas so all of those tables are being used in the queries, fetched, and then there is not enough space in the Depot, they getting evicted and they get re-fetched again. So what can be done to enable all queries to run in Depot? Option one can be increase the Depot size. So we can do this by running the following queries, which (mumbles) which nodes and storage location and the new Depot size. And I should mention that we can run this query from the Management Console from the query execution page. So this would have helped us to increase the Depot size. What other options do we have, for example, when increasing Depot size is not an option? We can also provision a second subcluster to isolate workloads like Sherry's. So we are going to do this now and we will provision a second subcluster using the Manage page. Here we're creating subcluster for Sherry or for workloads like hers. And we're going to create a (mumbles). So Sherry's subcluster has been created. We can see it here, added to the list of the subclusters. It's a secondary subcluster. Sherry has been instructed to use the new SherrySubcluster for her work. Now let's see what happened. We'll go again at Depot Activity page and we'll look at the At a Glance tab. We can see that around >> 18: 07, Sherry switched to running her queries on SherrySubcluster. On top of this page, you can see subcluster selected. So we currently have two subclusters and I'm looking, what happened to SherrySubcluster once it has been provisioned? So Sherry started using it and the lines after initial fetching from Depot, which was from Communal, which was the red line, after that, all Sherry's queries fit in Depot, which is indicated by green line. Also the Depot is pretty full on those nodes, about 90% full. But the queries are processed efficiently, there is no spilling into Communal. So that's a good case scenario. Let's now go back and take a look at the original subcluster, default subcluster. So on the left portion of the chart we can see multiple lines, that was activity before Sherry switched to her own designated subcluster. At around 18:07, after Sherry switched from the subcluster to using her designated subcluster, there is no, she is no longer using the subcluster, she is not putting a load in it. So the lines after that are turning a green color, which means the queries that are still running in default subcluster are all running in Depot. We can also see that Depot fetches and evictions bars, those purple and blue bars, are no longer showing significant numbers. Also we can check the second chart that shows Communal Storage Access. And we can see that the bars have also dropped, so there is no significant access for Communal Storage. So this problem has been solved. Each of the subclusters are serving queries from Depot and that's our most efficient scenario. Let's also look at the other tabs that we have for Depot monitoring. Let's look at Depot Efficiency tab. It has six charts and I'll go through each one of them quickly. Files Reads by Location gives an indicator of where the majority of query execution took place in Depot or in Communal. Top 10 Re-Fetches into Depot, and imagine the charts earlier in our user case, it shows tables that are most frequently fetched and evicted and then fetched again. These are good candidates to get pinned if increasing Depot size is not an option. Note that both of these charts have an option to select time interval using calendar widget. So you can get the information about the activity that happened during that time interval. Depot Pinning shows what portion of your Depot is pinned, both by byte count and by table count. And the three tables at the bottom show Depot structure. How long tables stay in Depot, we would like tables to be fetched in Depot and stay there for a long time, how often they are accessed, again, the tables in Depot, we would like to see them accessed frequently, and what the size range of tables in Depot. Depot Content. This tab allows us to search for tables that are currently in Depot and also to see stats like table size in Depot. How often tables are accessed and when were they last accessed. And the same information that's available for tables in Depot is also available on projections and partition levels for those tables. Depot Pinning. This tab allows users to see what policies are currently existing and so you can do this by clicking on the first little button and click search. This'll show you all existing policies that are already created. The second option allows you to search for a table and create a policy. You can also use the action column to modify existing policies or delete them. And the third option provides details about most frequently re-fetched tables, including fetch count, total access count, and number of re-fetched bytes. So all this information can help to make decisions regarding pinning specific tables. So that's about it about the Depot. And I should mention that the server team also has a very good presentation on the, webinar, on the Eon Mode database Depot management and subcluster management. that strongly recommend it to attend or download the slide presentation. Let's talk quickly about the Management Console Roadmap, what we are planning to do in the future. So we are going to continue focusing on subcluster management, there is still a lot of things we can do here. Promoting/demoting subclusters. Load balancing across subclusters, scheduling subcluster actions, support for large cluster mode. We'll continue working on Workload Analyzer enhancement recommendation, on backup and restore from the MC. Building custom thresholds, and Eon on HDFS support. Okay, so we are ready now to take any questions you may have now. Thank you.

Published Date : Mar 30 2020

SUMMARY :

for the virtual Vertica BDC 2020. and all the other preferences related to the new cluster. and the depot size seems to be sufficient So on the left portion of the chart

ENTITIES

Entity	Category	Confidence
Natalia Stavisky	PERSON	0.99+
Sherry	PERSON	0.99+
MaryLee	PERSON	0.99+
Jeff Healey	PERSON	0.99+
Natalia	PERSON	0.99+
Jeff	PERSON	0.99+
February 17th	DATE	0.99+
second scenario	QUANTITY	0.99+
10 tables	QUANTITY	0.99+
forum.vertica.com	OTHER	0.99+
AWS	ORGANIZATION	0.99+
1MB	QUANTITY	0.99+
two users	QUANTITY	0.99+
first scenario	QUANTITY	0.99+
second option	QUANTITY	0.99+
Vertica	ORGANIZATION	0.99+
Bhavik	PERSON	0.99+
80 failed queries	QUANTITY	0.99+
today	DATE	0.99+
Depot	ORGANIZATION	0.99+
third	QUANTITY	0.99+
Each	QUANTITY	0.99+
six charts	QUANTITY	0.99+
both	QUANTITY	0.99+
each point	QUANTITY	0.99+
three recommendations	QUANTITY	0.99+
Today	DATE	0.99+
each	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Bhavik Gandhi	PERSON	0.99+
midpool7	TITLE	0.99+
two nodes	QUANTITY	0.99+
second chart	QUANTITY	0.99+
two subclusters	QUANTITY	0.98+
second subcluster	QUANTITY	0.98+
Each data point	QUANTITY	0.98+
each user	QUANTITY	0.98+
both options	QUANTITY	0.98+
4/2	DATE	0.98+
Eon	ORGANIZATION	0.97+
this week	DATE	0.97+
each subcluster	QUANTITY	0.97+
about 90%	QUANTITY	0.97+
three tables	QUANTITY	0.96+
0	QUANTITY	0.96+
about 14.8 seconds seconds	QUANTITY	0.96+
one subcluster	QUANTITY	0.95+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for midpool7: