Vertica Database Designer - Today and Tomorrow


 

>> Jeff: Hello everybody, and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is titled "Vertica Database Designer - Today and Tomorrow." I'm Jeff Healey from Vertica Product Marketing, and I'll be your host for this breakout session. Joining me today is Yuanzhe Bei, Senior Technical Manager from Vertica Engineering. Before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait; just type your question or comment in the question box below the slides and click Submit. As always, there will be a Q&A session at the end of the presentation. We'll answer as many questions as we're able to during that time; any questions we don't address, we'll do our best to answer offline. Alternatively, visit the Vertica forums at forum.vertica.com to post your questions there after the session. Our engineering team is planning to join the forums to keep the conversation going. Also, a reminder that you can maximize your screen by clicking the double-arrow button at the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to view on demand this week. We will send you a notification as soon as it's ready. Now let's get started. Over to you, Yuanzhe.

>> Yuanzhe: Thanks, Jeff. Hi everyone, my name is Yuanzhe Bei. I'm a Senior Technical Manager in the Vertica Server R&D Group, where I run the query optimizer, catalog, and disaggregated engine teams. I'm very glad to be here today to talk about "Vertica Database Designer - Today and Tomorrow."

This presentation is organized as follows. I will first refresh some knowledge about Vertica fundamentals, such as tables and projections, which leads to the questions "What is Database Designer?" and "Why do we need this tool?" Then I will take you through a deep dive into Database Designer, or DBD as we call it, and see how DBD's internals work. After that, I'll show you some exciting DBD improvements we have planned for the 10.0 release, and lastly I will share some of the DBD roadmap we have planned for what comes next.

As most of you already know, Vertica is built on a columnar architecture, which means data is stored column-wise. Here we can see a very simple example of a table with four columns. Many of you may also know that a table in Vertica is a virtual concept: it's just a logical representation of data, which means users can write SQL queries that reference table names and columns just like in any other relational database management system. The actual physical storage of the data is called a projection. A projection can reference a subset, or all, of the columns of its anchor table, and must be sorted by at least one column. Each table needs at least one superprojection, which references all the columns of the table. If you load data into a table that has no projections, an auto-projection will be created automatically, sorted arbitrarily by the first couple of columns in the table. As you can imagine, even though such an auto-projection can be used to answer any query, the performance is not optimal in most cases. A common practice in Vertica is to create multiple projections on the same table, containing different subsets of columns and sorted in different ways. When a query is sent to the server, the optimizer will pick the projection that can answer the query in the most efficient way.
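To make that concrete, here is roughly what such a design might look like in SQL. This is a hypothetical sketch with made-up table and projection names; it is not the exact design from the slides.

```sql
-- Hypothetical anchor table with four columns.
CREATE TABLE t (a INT, b INT, c INT, d INT);

-- Superprojection: references all columns, sorted by the leading columns
-- (similar in spirit to an auto-projection).
CREATE PROJECTION t_super AS SELECT a, b, c, d FROM t ORDER BY a, b;

-- A second projection with a different column subset and sort order.
CREATE PROJECTION t_ac AS SELECT a, c, d FROM t ORDER BY a, c;

-- A third projection containing b, d, c and sorted by b, d.
CREATE PROJECTION t_bd AS SELECT b, d, c FROM t ORDER BY b, d;
```

Segmentation and encoding clauses are omitted here for brevity; the optimizer would choose among these projections per query.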
For example, let's say you have a query that selects columns b, d, and c and is sorted by b and d. The third projection would be ideal, because the data is already sorted the right way, so you save the sorting cost while executing the query.

Basically, when you choose the design of a projection, you need to consider four things. First and foremost, of course, is the sort order. Data that is already sorted the right way can benefit a lot of query operations: Order By, Group By, analytics, merge joins, predicates, and so on. The selected column group is also important, because the projection must contain all the columns referenced by your workload queries; if even one column is missing from the projection, that projection cannot be used for that particular query. In addition, Vertica is a distributed database and allows projections to be segmented based on the hash of a set of columns, which is beneficial if the segmentation matches the join keys or group keys. And finally, the encoding of each column is also part of the design, because data sorted in a different way may completely change the optimal encoding for each column. This example only shows the benefit of the first two, but you can imagine that the rest are important too.

Even so, it doesn't sound that hard, right? Well, I hope you change your mind when you see this; at least I did. These machine-generated queries really beat me. It would probably take an experienced DBA hours to figure out which projections could benefit these queries, not even mentioning that there could be hundreds of such queries in a regular workload in the real world. So what can we do? That's why we need DBD. DBD is a tool integrated into the Vertica server that helps DBAs perform an analysis of their workload queries, table schemas, and data, and then automatically figures out the most optimized projection design for their workload. In addition, DBD is a sophisticated tool that can be customized by the user through a lot of parameters, objectives, and so on. And lastly, DBD has access to the optimizer, so DBD knows what kinds of attributes a projection needs to have in order for the optimizer to benefit from it.

DBD has been around for years, and I'm sure there are plenty of materials available online that show how DBD can be used in different scenarios: whether to optimize for queries or for load, whether it's a comprehensive design or an incremental design, whether you dump a deployment script and deploy manually later or let DBD do the deployment for you, and many other options. I'm not planning to talk about all of that today. Instead, I will take the opportunity to open this black box called DBD and show you exactly what hides inside.

DBD is a complex tool, and I have tried my best to summarize the DBD design process into seven steps: Extract, Permute, Prune, Build, Score, Identify, and Encode. What do they mean? Don't worry, I will show you step by step.

The first step is Extract: extract interesting columns. In this step, DBD parses the design queries, figures out the operations that could benefit from a potential projection design, and extracts the corresponding columns as interesting columns. Predicates, Group By, Order By, join conditions, and analytics all produce interesting columns for DBD. As you can see from these three simple sample queries, DBD extracts the interesting column sets shown on the right. Some of these column sets are unordered.
For example, take the Group By on a1 and b1: DBD extracts the interesting column set and puts it in an unordered set, because data sorted either by a1 first or by b1 first can benefit this Group By operation. Some of the other sets are ordered, and the best example is the Order By clause on a2 and b2: obviously you cannot sort by b2 first and then a2. These interesting column sets are then used as seeds to extend into actual projection sort order candidates.

The next step is Permute. Once DBD has extracted all the seeds, it enumerates sort orders using them. How does DBD do that? Let's start with a very simple example. Here DBD can enumerate two sort orders by extending d1 with the unordered set {a1, b1}, deriving two sort order candidates: (d1, a1, b1) and (d1, b1, a1). These sort orders can benefit queries with a predicate on d1, and also benefit queries that group by a1 and b1 when d1 is constant. With the same idea, DBD will try to extend other seeds with each other and populate more sort order permutations. You can imagine how many candidates there could be, depending on how many queries you have in the design; there could be hundreds of sort order candidates.

That brings us to the third step, which is Prune. This step limits the candidate sort orders so that the design won't run forever. DBD uses a very simple capping mechanism: it ranks all the candidates by length, and only a certain number of the longest sort orders move forward to the next step.

Now we have all the sort order candidates we want to try, but whether a sort order candidate will actually benefit the optimizer is something DBD has to ask the optimizer. So before that can happen, the Build step builds those projection candidates in the catalog. This step generates the projection DDLs around the sort orders and creates the projections in the catalog. These projections won't be loaded with real data, because that would take a lot of time; instead, DBD copies over the statistics from existing projections to these projection candidates so that the optimizer can use them.

The next step is Score: scoring with the optimizer. Now that the projection candidates are built in the catalog, DBD can send the workload queries to the optimizer to generate query plans. The optimizer returns the query plan, and DBD goes through it and investigates whether certain benefits are being achieved. The list of benefits has been growing over time as the optimizer adds more optimizations. Let's say in this case, because the projection candidate is sorted by b1 and a1, it is eligible for the Group By Pipelined benefit. Each benefit has a preset score. The overall benefit score across all design queries is aggregated and recorded for each projection candidate.

We are almost there. Now we have the total benefit score for each projection candidate derived from the workload queries, and the job is easy: just pick the sort order with the highest score as the winner. Here the winner is (d1, b1, a1). Sometimes you need to find more winners, because the chosen winner may only benefit a subset of the workload queries you provided to DBD. In order for the rest of the queries to benefit as well, you need more projections.
So in this case, DBD goes to the next iteration and, let's say, finds another winner, (d1, c1), to benefit the workload queries that cannot be served by (d1, b1, a1). The number of iterations, and thus the number of winners, really depends on the design objective the user sets. It can be load-optimized, which means only one superprojection winner is selected; or query-optimized, where DBD tries to create as many projections as needed to cover most of the workload queries; or a balanced objective somewhere in the middle.

The last step is to decide the encoding for each projection column of the projection winners. Because the data is sorted differently, the encoding benefits can be very different from the existing projections, so choosing the right encoding design can shrink the disk footprint by a significant factor. It's worth the effort to find the best encoding. DBD picks the encoding by actually sampling the data and measuring the storage footprint. For example, in this case the projection winner has three columns, and each column has a few encoding options. DBD writes the sample data in the way this projection is sorted, and then you can see that with different encodings the disk footprint is different. DBD then compares the disk footprint of the different options for each column and picks the encoding with the smallest storage footprint. Nothing magical here, but it works pretty well.

And that is basically how DBD works internally. Of course, I skipped over quite a lot; for example, I didn't mention how DBD handles segmentation, but the idea is similar to how it analyzes sort orders. I hope this section gave you some basic idea about DBD as it is today.

So now let's talk about tomorrow, and here comes the exciting part. In version 10.0, we significantly improved DBD in many ways. In this talk I will highlight four issues in the old DBD and describe how the new DBD in 10.0 addresses them.

The first issue is that the DBD API is too complex. In most situations, what the user really wants is very simple: "My queries were slow yesterday; would a new or different projection help speed them up?" However, to answer a simple question like this using DBD, the user will very likely have the documentation open on the side, because they have to go through the whole complex flow: create a design, run the design, get the output, and then deploy the design at the end. And it doesn't stop there: for each step, there are several functions the user needs to call in order. Adding these up, the user has to write quite a long script with dozens of function calls. It's just too complicated, and most users find it annoying. They either manually tune the projections themselves, or simply live with the performance and come back when it gets really slow again, and of course in most cases they never come back to use DBD.

In 10.0, Vertica supports a new simplified API to run DBD easily. There is just one function, designer_single_run, with one argument: the interval over which you think your queries were slow. In this case, the user complained about yesterday, so all the user needs to do is specify one day as the argument and run it. The user doesn't need to provide anything else, because DBD will look up the query history within that time window and automatically populate the design, run the design, export the projection design, and clean up, with no user intervention needed.
No need to have the documentation open on the side and carefully write and debug a script; just one function call. That's it. Very simple.

That must be pretty impressive, right? So now here comes another issue. To fully utilize this single-run function, users are encouraged to run DBD on the production cluster. However, Vertica historically did not recommend running a design on a production cluster. One of the reasons is that DBD takes massive locks, both table locks and catalog locks, which badly interfere with the running workload on a production cluster. As of 10.0, we eliminated all the table and catalog locks from DBD. Yes, we eliminated 100% of them: a simple improvement, a clear win.

The third issue, which users may not be aware of, is that DBD writes intermediate results into real Vertica tables. The reason DBD had to do that is that DBD runs as a background task, so the intermediate results are what let users monitor the progress of DBD from a concurrent session. For a complex design, the intermediate results can be quite massive, and as a result a lot of files are created and written to disk, which stresses both the catalog and the disk and can slow down the design. In Eon Mode it's even worse, because tables are shared on communal storage, so writing to a regular table means the data has to be uploaded to communal storage, which is even more expensive and disruptive. In 10.0, we significantly restructured the intermediate results buffer and made it a shared in-memory data structure. Monitoring queries now look up this in-memory data structure directly, through a system table, and return the results. No intermediate result files are written anymore.

Another expensive use of local disk by DBD is encoding design. As I mentioned earlier in the deep dive, to determine which encoding works best for a new projection design there is no magic way: DBD has to actually write the sample data to disk using the different encoding options, find out which one has the smallest footprint, and pick that as the best choice. This written sample data is useless afterwards and is wiped out right away, and you can imagine this is a huge waste of system resources. In 10.0 we improved this process: instead of writing the differently encoded data to disk and then reading the file sizes, DBD aggregates the data block sizes on the fly. The data blocks are never written to disk, so the overall encoding design is more efficient and non-disruptive.

Of course, this is just the start. The reason we put a significant amount of resources into improving DBD in 10.0 is that DBD is an essential component of Vertica's out-of-the-box performance campaign. To illustrate the timeline simply: we are now on the second step, where we significantly reduced the running overhead of DBD so that users no longer need to fear running DBD on their production cluster. Please note that as of 10.0 we haven't really started changing how the DBD design algorithm works, so everything we discussed in the deep dive today still holds. In the next phase of DBD, we will focus on making the design process smarter. This will include a better enumeration mechanism, so that the pruning is more intelligent rather than brute force, which will result in better design quality and also faster designs. The longer-term goal is for DBD to achieve automation.
What does automation entail? What I really mean is that, instead of having the user decide when to use DBD, after their queries are already slow, Vertica should detect such events and run DBD automatically for users, suggesting better projection designs if the existing projections are not good enough. Of course, there is a lot of work to be done before we can fully achieve that automation, but we are working on it. At the end of the day, what the user really wants is a fast database, right? Thank you for listening to my presentation. I hope you found it useful. Now let's get ready for the Q&A.

Published Date : Mar 30 2020


A Technical Overview of Vertica Architecture


 

>> Paige: Hello, everybody, and thank you for joining us today on the Virtual Vertica BDC 2020. Today's breakout session is entitled "A Technical Overview of the Vertica Architecture." I'm Paige Roberts, Open Source Relations Manager at Vertica, and I'll be your host for this webinar. Now joining me is Ryan Role-kuh? Did I say that right? (laughs) He's a Vertica Senior Software Engineer.

>> Ryan: It's Roelke. (laughs)

>> Paige: Roelke, okay, I got it, all right. Ryan Roelke. And before we begin, I want to encourage you to submit your questions or comments during the virtual session while Ryan is talking, as you think of them as you go along. You don't have to wait until the end; just type your question or comment in the question box below the slides and click Submit. There will be a Q&A at the end of the presentation, and we'll answer as many questions as we're able to during that time. Any questions we don't address, we'll do our best to get back to you offline. Alternatively, you can visit the Vertica forums to post your question there after the session as well. Our engineering team is planning to join the forums to keep the conversation going, so you can have a chat afterwards with the engineer, just like at any other conference. Also, you can maximize your screen by clicking the double-arrow button in the lower right corner of the slides, and before you ask, yes, this virtual session is being recorded and will be available to view on demand this week. We'll send you a notification as soon as it's ready. Now, let's get started. Over to you, Ryan.

>> Ryan: Thanks, Paige. Good afternoon, everybody. My name is Ryan and I'm a Senior Software Engineer on Vertica's development team. I primarily work on improving Vertica's query execution engine, so usually in the space of making things faster. Today, I'm here to talk about something more general than that: we're going to go through a technical overview of the Vertica architecture.

The intent of this talk, essentially, is to explain some of the basic aspects of how Vertica works and what makes it such great database software. To explain what makes a query execute so fast in Vertica, we'll provide some background on why other databases can't keep up. We'll use that as a starting point to discuss an academic database that paved the way for Vertica, and then we'll explain how Vertica's design builds upon that academic database to become the great software it is today.

I want to start by sharing somebody's approximation of an internet minute at some point in 2019. All of the data on this slide is generated by thousands or even millions of users, and that's a huge amount of activity. Most of the applications depicted here are backed by one or more databases, and most of this activity will eventually result in changes to those databases. For the most part, we can categorize the ways these databases are used into one of two paradigms. First up, we have online transaction processing, or OLTP. OLTP workloads usually operate on single entries in a database, so an update to a retail inventory or a change in a bank account balance are both great examples of OLTP operations. Updates to these data sets must be visible immediately, and there can be many transactions occurring concurrently from many different users. OLTP queries are usually key-value queries: the key uniquely identifies the single entry in a database for reading or writing.
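A minimal sketch of such a key-value style OLTP update, using a hypothetical accounts table (the slide's exact example isn't reproduced in the transcript):

```sql
-- Deposit $10 into a single account, located by its unique key.
UPDATE accounts
SET balance = balance + 10
WHERE account_id = 42;
```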
Early databases and applications were probably designed for OLTP workloads. The example on the slide, much like the sketch above, is typical of an OLTP workload: we have a table, accounts, such as for a bank, which tracks information for each of the bank's clients, and an update query like the one depicted here might be run whenever a user deposits $10 into their bank account.

Our second category is online analytical processing, or OLAP, which is more about using your data for decision making. If you have a hardware device which periodically records how it's doing, you could analyze trends across all your devices over time to observe which data patterns are likely to lead to failure. Or, if you're Google, you might log user search activity to identify which links helped your users find the answer. Analytical processing has always been around, but with the advent of the internet it happens at scales that were unimaginable even just 20 years ago. This SQL example is something you might see in an OLAP workload: we have a table, searches, logging user activity, and we will eventually see one row in this table for each query submitted by users. If we want to find out what time of day our users are most active, we could write a query like the one on the slide, which counts the number of unique users running searches for each hour of the day.

So now let's rewind to 2005. We don't have a picture of an internet minute in 2005; we don't have the data for that. We also don't have the data for a lot of other things: the term Big Data is not quite on anyone's radar yet, and the Cloud is not quite there either, or it's just starting to be. So if you have a database serving your application, it's probably optimized for OLTP workloads. OLAP workloads just aren't mainstream yet, and database engineers probably don't have them in mind.

So let's innovate. It's still 2005 and we want to try something new with our database. Let's take a look at what happens when we run an analytic workload in 2005, using as a motivating example a table of stock prices over time. In our table, the symbol column identifies the stock that was traded, the price column identifies the new price, and the timestamp column indicates when the price changed. We have several other columns which we should know are there, but we're not going to use them in any example queries. This table is designed for analytic queries: we're probably not going to make any updates or look at individual rows, since we're logging historical data and want to analyze changes in stock price over time.

Our database system is built to serve OLTP use cases, so it's probably going to store the table on disk in a single file like this one. Notice that each row contains all of the columns of our data in row-major order. There's probably an index somewhere in the memory of the system which helps us with point lookups. Maybe our system expects that we will use the stock symbol and the trade time as lookup keys, so an index provides quick lookups from those columns to the position of the whole row in the file. If we did have an update to a single row, this representation would work great: we would seek to the row we're interested in, finding it would probably be very fast using the in-memory index, and then we would update the file in place with our new value. On the other hand, if we run an analytic query like the one we want to, the data access pattern is very different. The index is not helpful, because we're looking up a whole range of rows, not just a single row.
As a result, the only way to find the rows we actually need for this query is to scan the entire file. We're going to end up scanning a lot of data that we don't need, and it won't just be the rows we don't need: there are many other columns in this table, with information about who made the transaction, and we'll be scanning through those columns for every single row as well. That can be a very serious problem once we consider the scale of this file. Stocks change a lot; we probably have thousands or millions or maybe even billions of rows stored in this file, and we're going to scan all of these extra columns for every single row. If we tried out our stocks use case behind the desk of a Fortune 500 company, we would probably be pretty disappointed. Our queries will eventually finish, but it might take so long that we don't even care about the answer anymore by the time they do. Our database is not built for the task we want to use it for.

Around the same time, a team of researchers in the Northeast became aware of this problem and decided to dedicate their time and research to it. These researchers weren't just anybody. The fruit of their labor, which we now like to call the C-Store paper, was published by eventual Turing Award winner Mike Stonebraker, along with several other researchers from elite universities. The paper presents the design of a read-optimized relational DBMS that contrasts sharply with most current systems, which are write-optimized. That sounds exactly like what we want for our stocks use case.

Reasoning about what makes our query execution so slow brought the researchers to the memory hierarchy, which essentially is a visualization of the relative speeds of different parts of a computer. At the top of the hierarchy we have the fastest data units, which are, of course, also the most expensive to produce. As we move down the hierarchy, components get slower but also much cheaper, and thus you can have more of them. Our OLTP database's data is stored in a file on the hard disk. We scanned the entirety of that file even though we didn't need most of the data, and it turns out that is just about the slowest thing our query could possibly be doing, by over two orders of magnitude. It should be clear, based on that, that the best thing we can do to optimize our query's execution is to avoid reading unnecessary data from the disk, and that's what the C-Store researchers decided to look at.

The key innovation of the C-Store paper does exactly that. Instead of storing data in row-major order in a large file on disk, they transposed the data and stored each column in its own file. Now, if we run the same select query, we read only the relevant columns. The unnamed columns don't factor into the table scan at all, since we don't even open those files. Zooming out to an internet-scale data set, we can appreciate the savings here a lot more. But we still have to read a lot of data that we don't need to answer this particular query. Remember, we had two predicates, one on the symbol column and one on the timestamp column. Our query is only interested in AAPL stock, but we're still reading rows for all of the other stocks.

So what can we do to optimize our disk reads even more? Let's first partition our data set into different files based on the timestamp date. This means that we will keep separate files for each date.
When we query the stocks table, the database knows all of the files it may have to open. If we have a simple predicate on the timestamp column, as our sample query does, the database can use it to figure out which files it doesn't have to look at at all. So now all of the disk reads we do to answer our query will produce rows that pass the timestamp predicate. This eliminates a lot of wasteful disk reads, but not all of them. We have another predicate on the symbol column, where symbol equals AAPL, and we'd like to avoid disk reads of rows that don't satisfy that predicate either.

We can avoid those disk reads by clustering all the rows that match the symbol predicate together. If all of the AAPL rows are adjacent, then as soon as we see something different, we can stop reading the file; we won't see any more rows that can pass the predicate. We can then use the positions of the rows we did find to identify which pieces of the other columns we need to read. One technique we can use to cluster the rows is sorting, so we'll use the symbol column as a sort key for all of the columns. That way we can reconstruct a whole row by seeking to the same row position in each file.

It turns out that, having sorted all of the rows, we can do a bit more. We don't have any more wasted disk reads, but we can still be more efficient with how we're using the disk. We've clustered all of the rows with the same symbol together, so we don't really need to bother repeating the symbol so many times in the same file. Let's just write the value once and say how many rows we have. This run-length encoding technique can compress large numbers of rows into a small amount of space. In this example we only de-duplicate a few rows, but you can imagine de-duplicating many thousands of rows instead. This encoding is great for reducing the amount of disk we need to read at query time, but it also has the additional benefit of reducing the total size of our stored data.

Now our query requires substantially fewer disk reads than it did when we started. Let's recap what the C-Store paper did to achieve that. First, we transposed our data to store each column in its own file; now queries only have to read the columns used in the query. Second, we partitioned the data into multiple file sets so that all rows in a file set have the same value for the partition column; now a predicate on the partition column can skip non-matching file sets entirely. Third, we selected a column of our data to use as a sort key; now rows with the same value for that column are clustered together, which allows our query to stop reading data once it finds non-matching rows. Finally, sorting the data this way enables high compression ratios, using run-length encoding, which minimizes the size of the data stored on disk.

The C-Store system combined each of these innovative ideas to produce an academically significant result, and if you had used it behind the desk of a Fortune 500 company in 2005, you probably would have been pretty pleased. But it's not 2005 anymore, and the requirements of a modern database system are much stricter. So let's take a look at how C-Store fares in 2020. First of all, we have designed the storage layer of our database to optimize a single query in a single application. Our design optimizes the heck out of that query, and probably some similar ones, but if we want to do anything else with our data, we might be in a bit of trouble. What if we just decide we want to ask a different question?
For example, in our stock example, what if we want to plot all the trades made by a single trader over a large window of time? How do our optimizations for the previous query measure up here? Well, our data is partitioned on the trade date; that could still be useful, depending on our new query. If we want to look at a trader's activity over a long period of time, we would have to open a lot of files, but if we're still interested in just a day's worth of data, then this optimization is still an optimization. Within each file, our data is ordered on the stock symbol. That's probably not too useful anymore; the rows for a single trader aren't going to be clustered together, so we will have to scan all of the rows in order to figure out which ones match. You could imagine a worse design, but as it becomes crucial to optimize this new type of query, we might have to go as far as reconfiguring the whole database.

The next problem is one of scale. One server is probably not good enough to serve a database in 2020. C-Store, as described, runs on a single server and stores lots of files. What if the data overwhelms this small system? We could imagine exhausting the file system's inode limit with lots of small files due to our partitioning scheme, or we could imagine something simpler: just filling up the disk with huge volumes of data. But there's an even simpler problem than that. What if something goes wrong and C-Store crashes? Then our data is no longer available to us until the single server is brought back up.

A third concern, another one of scalability, is that one deployment does not really suit all the possible use cases we could imagine. We haven't really said anything about being flexible. A contemporary database system has to integrate with many other applications, which might themselves have pretty restricted deployment options. Or the demands imposed by our workloads may have changed, and the setup you had before no longer suits what you need now. C-Store doesn't do anything to address these concerns.

What the C-Store paper did do was lead very quickly to the founding of Vertica. Vertica's architecture and design are essentially all about bringing the C-Store design into an enterprise software system. The C-Store paper was just an academic exercise, so it didn't really need to address any of the hard problems we just talked about, but Vertica, the first commercial database built upon the ideas of the C-Store paper, definitely would have to.

This brings us back to the present, to look at how an analytic query runs in 2020 on the Vertica Analytic Database. Vertica takes the key idea from the paper, that we can significantly improve query performance by changing the way our data is stored, and gives its users the tools to customize their storage layer in order to heavily optimize really important or commonly run queries. On top of that, Vertica is a distributed system, which allows it to scale up to internet-sized data sets as well as have better reliability and uptime. We'll now take a brief look at what Vertica does to address the three inadequacies of the C-Store system that we mentioned. To avoid locking users into a single database design, Vertica provides tools for the database user to customize the way their data is stored. To address the shortcomings of a single-node system, Vertica coordinates processing among multiple nodes.
To acknowledge the large variety of desirable deployments, Vertica does not require any specialized hardware and has many features which smoothly integrate it with a Cloud computing environment.

First, we'll look at the database design problem. We're a SQL database, so our users are writing SQL and describing their data in a SQL way, the Create Table statement. Create Table is a logical description of what your data looks like, but it doesn't specify the way it has to be stored; for a single Create Table, we could imagine a lot of different storage layouts. Vertica adds some extensions to SQL so that users can go further than Create Table and describe the way they want the data to be stored. Using terminology from the C-Store paper, we provide the Create Projection statement. Create Projection specifies how table data should be laid out, including column encoding and sort order. A table can have multiple projections, each of which can be ordered on different columns. When you query a table, Vertica will answer the query using the projection it determines to be the best match.

Referring back to our stock example, here's a sample Create Table and Create Projection statement (a reconstructed sketch of such statements appears after this paragraph). Let's focus on our heavily optimized example query, which had predicates on the stock symbol and date. We specify that the table data is to be partitioned by date. The Create Projection statement here is excellent for this query: we specify, using the Order By clause, that the data should be ordered according to our predicates, with the timestamp as a secondary sort key. Each projection stores a copy of the table data. If you don't expect to need a particular column in a projection, you can leave it out; our average price query didn't care about who did the trading, so maybe our projection design for this query can leave the trader column out entirely. If the question we want to ask ever does change, maybe we already have a suitable projection, but if we don't, then we can create another one. This example shows another projection which would be much better at identifying trends for traders, rather than trends for a particular stock.

Next, how should you decide what design is best for your queries? Well, you could spend a lot of time figuring it out on your own, or you could use Vertica's Database Designer tool, which will help you by automatically analyzing your queries and spitting out a design it thinks is going to work really well. If you want to learn more about the Database Designer tool, you should attend the session "Vertica Database Designer - Today and Tomorrow," which will tell you a lot about what the Database Designer does and some recent improvements we have made.

Okay, now we'll move to our next problem: the challenge that one server does not fit all. In 2020, we have several orders of magnitude more data than we had in 2005, and you need a lot more hardware to crunch it. It's not tractable to keep multiple petabytes of data in a system with a single server, so Vertica doesn't try. Vertica is a distributed system, so we deploy multiple servers which work together to maintain such a high data volume. In a traditional Vertica deployment, each node keeps some of the data in its own locally attached storage. Data is replicated so that there is a redundant copy somewhere else in the system; if any one node goes down, the data it served is still available on a different node.
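Here is a rough reconstruction of the kind of statements described above. The exact DDL was only on the slide, so the column names, types, and projection names are assumptions:

```sql
-- Hypothetical stocks table, partitioned by trade date.
CREATE TABLE stocks (
    symbol   VARCHAR(16),
    price    NUMERIC(18,4),
    trade_ts TIMESTAMP,
    trader   VARCHAR(64)
)
PARTITION BY trade_ts::DATE;

-- Projection optimized for the average-price query:
-- ordered on the predicate columns, trader column left out,
-- segmented on the symbol hash with one redundant copy (KSAFE 1).
CREATE PROJECTION stocks_by_symbol (symbol, trade_ts, price)
AS SELECT symbol, trade_ts, price FROM stocks
ORDER BY symbol, trade_ts
SEGMENTED BY HASH(symbol) ALL NODES KSAFE 1;

-- A second projection better suited to analyzing a single trader's activity.
CREATE PROJECTION stocks_by_trader (trader, trade_ts, symbol, price)
AS SELECT trader, trade_ts, symbol, price FROM stocks
ORDER BY trader, trade_ts
SEGMENTED BY HASH(trader) ALL NODES KSAFE 1;
```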
We'll also make sure that in the system there is no special node with extra duties; all nodes are created equal. This ensures that there is no single point of failure. Rather than replicate all of your data on every node, Vertica divvies it up amongst all of the nodes in your system. We call this segmentation. The way data is segmented is another parameter of storage customization, and it can definitely have an impact on query performance. A common way to segment data is by using a hash expression, which essentially randomizes the node that a row of data belongs to, but with a guarantee that the same data will always end up in the same place. Describing the way data is segmented is another part of the Create Projection statement, as seen in this example. Here we segment on the hash of the symbol column, so all rows with the same symbol will end up on the same node. For each row that we load into the system, we apply our segmentation expression; the result determines which segment the row belongs to, and then we send the row to each node which holds a copy of that segment. In this example, our projection is marked KSAFE 1, so we keep one redundant copy of each segment. When we load a row, we might find that its segment has copies on Node One and Node Three, so we'll send a copy of the row to each of those nodes. If Node One is temporarily disconnected from the network, then Node Three can serve the other copy of the segment so that the whole system remains available.

The last challenge we brought up from the C-Store design was that one deployment does not fit all. Vertica's cluster design neatly addresses many of our concerns here. Our use of segmentation to distribute data means that a Vertica system can scale to any size of deployment, and since we don't require any special hardware or nodes with special purposes, Vertica servers can run anywhere, on premise or in the Cloud. But let's suppose you need to scale out your cluster to rise to the demands of a higher workload; suppose you want to add another node. This changes the division of the segmentation space: we have to re-segment every row in the database to find its new home, and then we have to move around any data that belongs to a different segment. This is a very expensive operation, not something you want to be doing all that often. Traditional Vertica doesn't solve that problem especially well, but Vertica Eon Mode definitely does. Vertica's Eon Mode is a large set of features designed with a Cloud computing environment in mind. One feature of this design is elastic throughput scaling, which is the idea that you can smoothly change your cluster size without having to pay the expense of shuffling your entire database. Vertica Eon Mode had an entire session dedicated to it this morning; I won't say any more about it here, but maybe you already attended that session, or if you haven't, then I definitely encourage you to listen to the recording.

If you'd like to learn more about the Vertica architecture, you'll find on this slide links to several of the academic conference publications: these four papers here, as well as the "Vertica Seven Years Later" paper, which describes some of the Vertica designs seven years after the founding, and also a paper about the innovations of Eon Mode. And of course, the Vertica documentation is an excellent resource for learning more about what's going on in a Vertica system. I hope you enjoyed learning about the Vertica architecture. I would be very happy to take all of your questions now.
Thank you for attending this session.

Published Date : Mar 30 2020
