All of the learning examples in my books Supercharge Power BI and Supercharge Excel are built on an Adventure Works database that has a single data table. But Power Pivot is a very capable tool for building reports that combine multiple data tables. Although it is very capable, the logic of how to combine multiple tables is different in Power Pivot than most Excel users will be familiar with. This post covers the correct way to do it.
The Classic Power Pivot Data Model
As you can see in the image below, the classic Power Pivot data model consists of 2 different types of tables – lookup tables (shown as #1 below) and data table(s) shown as 2.
The difference between these 2 types of tables is very important.
The lookup tables (also called dimension tables by BI professionals) always consist of 1 (and only 1) record for every item in the table. Eg.
- In the calendar lookup table, there is 1 row for every day – no duplicates. The uniquely identifying key is the date itself.
- In the customers lookup table, there is 1 row for every customer – no duplicates. The uniquely identifying key is the customer number.
And so on for the other lookup tables.
The data tables (also called fact tables) contain as many records as you like (or as exist in your data) – duplicates are fine. There is a “key” column that matches with each of the lookup tables. Eg there is a Sales[date] column in the sales table that matches to the Calendar[date] column, a Sales[Customer Key] column that matches to the Customers[Customer key] column and so on. Each date can have many sales, each customer can have many sales etc.
Joining the Data Tables to Lookup Tables
Data tables are always joined to the Lookup Tables not the other way around. The easiest way to think about this is to think how you would work with tables like this in Excel. If you had a table of sales by product code in Excel and you wanted to find out the product description, you would probably write a VLOOKUP in your sales table to go and fetch the product description from another table that contained the full list of products. This is one of the reasons I recommend you lay out your tables like I have shown in the image above. I call this the Collie Layout methodology as it was invented by Rob Collie. This layout makes no difference to the way the data model works, but laying it out in this way gives you a visual clue that the tables at the top are lookup tables (ie look up, VLOOKUP).
To join the tables (in the diagram view) you click and drag the common column from the data table to the lookup table (not the other way around). Note if you do it the other way around, Power Pivot will MOSTLY detect the error and flip the relationship for you. But in some instances this will not happen potentially giving you undesired and incorrect results. For this reason I recommend you create the joins the correct way as described above.
Every join in Power Pivot (Excel 2010, 2013) is of the type One to Many – there are no exceptions. The one side of the relationship has the arrow pointing to it, and the many side of the relationship has the circle.
Power BI Desktop/Excel 2016
Several things changed in the latest version of the Power Pivot engine that comes with Power BI Desktop and Excel 2016. In these newer versions, it is also possible to have one to one relationships. Just double click the relationship and then you can change the cardinality (shown as 1 below).
Note you can only change this if your data truly is 1 to 1. You can also force the data tables to filter the lookup tables (shown in 2 above). Finally, there are a number of visualisation improvements including the cardinality of the relationship is easier to understand. The highlighted relationship is currently 1 to many and has a 1 (shown #3 above) and a * (shown as #4 above). You can also now see an arrow (shown as #5) indicating the direction of the automatic filter propagation.
Now for that Second Data Table
When you bring a second data table into Power Pivot, it is common for people to think that they should join the new data table to the original data table, but this is wrong. The correct way to join a second data table to a data model is to treat the new data table exactly the same as the first data table.
- Make sure it has 1 column for every lookup table (lookup tables that are relevant anyway)
- Join the data table columns to each lookup table (that is relevant)
In the following example, I have imported a Stock on Hand (SOH) data table for Adventure Works. This table contains the stock counts completed in each branch at the end of each trading month.
A couple of things to note:
- I have placed the data table at the bottom, just like the other data table
- I have joined each of the “key” columns to the lookup tables. In this case there is a date column, ProductKey, TerritoryKey but no CustomerKey.
How to use the Model Inside Pivot Tables
Once you have your data model set up in this way, it is important that you always use your Lookup Table columns to filter your pivot tables. The 4 tables (highlighted below) are the lookup tables from above. You should only use columns from these tables in your Pivot Table Rows, Columns, Filters and Slicers. The filters always automatically propagate from the lookup tables to the data tables and hence doing it this way will ensure that BOTH of your data tables are filtered. Note of course that there is no join between the customer table and the SOH table.
If you don’t use the lookup tables, but instead you use (for example) the Territory Key from one of the data tables, it will correctly filter that 1 data table but it will not filter the second data table. This is because the natural filter propagation in Power Pivot ALWAYS flows from the 1 side of the relationship to the many side of the relationship. Filters cannot automatically flow in the other direction (it can be done with more advanced DAX, but it is not automatic).
This is the second reason I recommend using the Collie Layout Methodology for your data models. As you can “see” in the image above, the filter propagation always flows “down hill”. Filters do not automatically flow “up hill”. Laying out your tables this way gives you a second visual cue that “filters only flow downhill”.
A Working Pivot Table over Multiple Data Tables
Here is an example of a working pivot table over multiple data tables.
Note that the Calendar Year filter (shown as #1 above) and the row labels (Product Sub Category #2) both come from the lookup tables. The measure [Total Sales] (#3) comes from the Sales data table = SUM(Sales[Extended Amount] while the measure [Total SOH] (#4) comes from the SOH data table. The formula for Total SOH is a bit complex for reasons outside the scope of this post, but here it is in case you are interested.
Total SOH = CALCULATE( SUM(SOH[SOH]), LASTNONBLANK(Calendar[Date], CALCULATE(COUNTROWS(SOH))) )
Only Create Pivot Tables that make Sense
In this data model, you will recall that there is no join between the SOH data table and the Customer Table. From a business sense this makes perfect sense – the stock is counted at a store level and there is no relevance to customers. However the implication is that it will make no sense to create a pivot table that contains customers in a filter and also SOH in values. If you do this, the SOH numbers will not make any sense. The message is that you should only create pivot tables that make sense based on your data model.