Many to Many Relationships in DAX Explained

Level: Advanced

(Level: Advanced, but explained in detail so everyone can understand)

There is a lot to learn in DAX if you want to be a ninja.  A couple of the more complex and important things to learn and understand are filter propagation and context transition.  It is not so much that you need to be a rocket scientist to understand these concepts, it is more that they are not naturally intuitive.  You simply have to learn how filter propagation and context transition work.  What’s more, if you have an Excel background, there are some fundamental differences between the way Power Pivot works vs regular Excel, and you have to learn these things too.  All these things are learnable – but indeed you will need to learn. You need an understanding of filter propagation and context transition to understand how to solve the Many to Many problem below – but don’t worry – I will explain it in detail in this post.

The Many to Many Problem – Bill of Materials

The problem I am going to cover today is the DAX Many to Many problem.  All relationships between tables in DAX are of the type 1 to Many* – there is no support for Many to Many relationships in DAX (*Note: in Power BI there is also a 1 to 1 relationship type).

Now for a lot of people this won’t make any difference as there is no need for a Many to Many relationship.  But sometimes there is a DAX data model that has 2 or more tables that have multiple records in each table, and these tables have a logical business relationship to each other.   An example will make it easier to understand.

In the example below, each product exists in the Product table (orange) only once, but each product can be sold many times, and hence it can appear in the Sales table (blue) many times.  This is a standard 1 to many relationship that DAX was built to handle really well.

image

Some of the products in the Sales table are actually “multi product” products.  eg Product ABC is a box of fruit that contains 3 different products (2 x Apples, 3 x Bananas and 1 x Cantaloupe – 6 items in total) .  If you want to see the individual products that were sold, you would need to create a many to many relationship between the Sales table and the Bill of Materials table (green).   The product ID in the sales table has many instances of each product ID (1 for each sale), and the Bill of Materials table also has many instances of the Product ID (1 for each sub product). This is allowed in a traditional relational database tool like Access or SQL Server but it is not allowed in Power Pivot.

Using ONLY 1 to Many Relationships to Solve the Problem

Given that you simply must use 1 to Many relationships in DAX*, the only workable setup of the relationships between the tables is shown below.  There is a 1 to many relationship from the Products table to the Sales table.  There is also a 1 to many relationship from the Products table to the Bill of Materials (BOM) table.  But there is no relationship between the Calendar table and the BOM table because the BOM table doesn’t record the date of the sale; it only records the quantity of each sub product.

image

Now for Some Simple Measures

I have created 3 measures here.

Total Sales Qty=SUM(Sales[Qty])
Total BOM Qty=SUM(BOM[Qty])
Total BOM Sales = SUMX(Products,[Total Sales Qty] * [Total BOM Qty])

When I put these 3 measures in a pivot table with Calendar Date in Rows, this is what I get (shown below).  See the problem?  The measure [Total BOM Qty] is giving the wrong answer – I get the same value for every row in the Pivot Table.  What is going on here?

image

To understand the problem in DAX, you need to have a very clear understanding of automatic filter propagation between relationships.

The Many to Many Problem Explained

Below is the Power Pivot data model again.  I always layout my tables using the Collie Layout Methodology – that is, the lookup tables are at placed at the top and the data tables are placed at the bottom.  This has no effect on the behaviour of the model but it makes it much easier to visualise how filter propagation works.  Lookup tables always have 1 and only 1 row for every object in the table, and there must be a unique identifying key (date in the case of the Calendar table and Product ID in the case of the Products table).  Data tables also must have the same key (date and Product ID – otherwise you can’t join the tables) but data tables are allowed to have as many duplicates as needed (many sales are made on the same date, and the same product is sold many times).

image

Filter propagation automatically flows from the 1 side of the relationship to the many side of the relationship but it does not automatically flow in the other direction.  When you use the Collie Layout Methodology like I have here, we say that filters always automatically flow downhill – they can’t automatically flow uphill. So in the image above,

  • Any filter on the Calendar table will automatically flow through the relationship to the Sales table (shown as 1 to 2 in the image above).
  • Any filter on the Products table will automatically flow through the relationship to the Sales table (3 to 4) and it will also flow to the BOM table (5 to 6).
  • But very importantly, filters will not automatically flow from the Sales table to the Products table (7 to 8), nor uphill through (9 to 10) nor (11 to 12).  The implication is that when you set up a pivot table like the one shown earlier, the Calendar table will filter the Sales table, and hence the Total Sales Qty will be correct.  But there is no automatic way for the Calendar table to filter the BOM table because filters don’t automatically flow up hill. Hence the BOM table is completely unfiltered by the Calendar table.  The result of Total BOM Qty will therefore always be the quantity of the entire BOM table – completely unfiltered by the Calendar table (it is of course filtered by the Product table).

This is an Easy Problem to Solve in Power BI Desktop

In Power BI Desktop this is an easy problem to solve – I will explain how now before going back and solving it for Excel 2010/2013.  There is a feature in Power BI Desktop called Bi-Directional Cross Filtering that will allow you to change the natural “down hill only” filtering of the relationship between tables so that it flows in both directions. As you can see in the data model below (from Power BI Desktop), I have swapped the filtering direction of relationship 2 (Products to Sales) to be bi-directional (these arrows indicate the direction of filter propagation in Power Bi Desktop – which is very helpful indeed.  We can thank Rob Collie for lobbying for this UI improvement, and Microsoft for listening).

image

When you make this change, the Products table will be automatically filtered based on the entries in the Sales table – reverse “up hill” filtering.  Once the Products table is filtered by the Sales table (due to the bi-directional cross filtering behaviour), then the new filter on the Products table will automatically propagate downhill through relationship 3 shown above.  As a result of the end to end flow of cross filtering:

  1. The Calendar table filters the Sales table,
  2. The Sales table filters the Products table,
  3. The Products table filters the BOM table

The net result is that the Calendar table is now filtering the BOM table even though there is no direct relationship between these 2 tables.

When I create a Matrix in Power BI Desktop (a matrix is similar to a Pivot Table in Excel), I get the correct answers as shown below).

image

But there is another problem. Note that Total Sales x Total BOM Qty doesn’t automatically equal Total BOM Sales at this level (eg on 8th Jan). I need to bring the BOM ID column into my Matrix so I can see exactly which BOM items sold each day.  When I do this I get a similar problem to before.  See in the Matrix below that the BOM Qty is correct for each BOM ID, but the Sales Quantity is the same for each of the BOM IDs in the Matrix – this is not correct.

image

This is an almost identical problem as the first one.  Let’s look at the data model again (below).  The Matrix above has the BOM ID column on Rows in the matrix.  This column comes from the BOM Table (shown as 1 below) and because it is on rows in the matrix, it is normal to expect that this will filter the measure [Total Sales Qty].  But remember filters automatically flow down hill, not uphill.  So the BOM ID column is not filtering the Products table (2) and hence the Products table is not filtering the Sales table based on the BOM ID.  The net result, you get the same Total Sales Quantity regardless of the BOM ID because the BOM ID is not filtering the Sales Table.  The simple answer to this problem (in Power BI Desktop) is to change the cross filtering behaviour of the relationship (4) from single to Bi-Directional – just like before.

image

Once you make this change, you will get a fully working report that makes sense to anyone reading it.

image

How to use DAX to force the Calendar Table to filter the BOM Table

OK, now that you understand how to solve this with Bi-Directional cross filtering, hopefully you will realise what needs to be done to solve the problem in Excel 2010/2013.  Here is the data model again (shown below).  I need to force the Sales Table to filter the Products table (shown as 1 below) and I also need to force the BOM table also to filter the Products table (shown as 2 below).  If I can force the 2 Data tables to filter the common Products table, then the Products table will do its job and pass those filters through to the other data tables automatically, hence solving the problem.  Stated another way, I want the Sales table to filter the Products table, then the products table will filter the BOM table automatically.  I also want to force the BOM table to filter the Products table as well – then the products table will automatically filter the Sales table.  I am trying to get the 2 data tables to filter the common lookup table so the common lookup table will then pass the filters on to the other table

image

If I were to write these formulas using “Pseudo DAX”, the 2 formulas would read like this:

Total Sales Qty=
     CALCULATE(
         SUM(Sales[Qty]),
         'Filter the Products table based on the rows in the BOM table first'
     )

Total BOM Qty=
     CALCULATE(
         SUM(BOM[Qty]),
        'Filter the Products table based on the rows in the Sales table first after applying filters from the Calendar table'
     )

So now all I need to do is find a suitable filter to replace the “Filter the Products table…  ” portion of each formula and I will achieve the outcome.  There are many ways to do this, but first I am going to show you a method using the FILTER function, and then I will show (and explain) another method using Black Magic from The Italians.

Total Sales Qty

Let’s start with this formula.

Total Sales Qty=
     CALCULATE(
        sum(Sales[Qty]),
        'Filter the Products table based on the rows in the BOM table first'
     )

How can I write a filter statement to put inside CALCULATE that will filter the Products table based on the values in the BOM?  Let me show the formula and then explain what it does.

=FILTER(
   Products,
   CALCULATE(COUNTROWS(BOM)) > 0
 )

The FILTER function is an iterator and hence it has a Row Context.  The above FILTER formula iterates over the Products table and returns a filtered table of all rows in the Products table that pass the given test.  At each iteration (ie each product) in the Products table, the CALCULATE function forces context transition (turns the row context into a filter context) and hence the BOM table is filtered for the current row in the Products table iteration.

Then the FILTER formula asks the question “Now that it’s clear that we are only talking about this one single product for this single step of the iteration process and we have filtered the BOM table to reflect this, are there currently any rows visible in the BOM table?”.  If the answer is yes, then FILTER keeps that product, if the answer is no, then FILTER discards the product.  FILTER then goes to the second product in the Products table, then CALCULATE again forces context transition for this second iteration and the BOM table is filtered so that only rows of this specific second product are visible in the BOM table, and then FILTER completes the COUNTROWS check again.  This process goes on for every product in the Product table (all those in the current filter context anyway) and then FILTER is left with a new Filtered Table of Products that contains only products that also exist in the BOM table in the current filter context.

What if you leave out the CALCULATE?

It is worth pointing out here that the following filter formula will not work.

The problem with this second formula is that there is no CALCULATE wrapped around COUNTROWS(BOM).  FILTER is an iterator and has a Row Context.  But a Row Context does not automatically create a Filter Context.  So when the FILTER function steps through its iteration process and gets to the first Product, there is no Filter Context and hence the BOM table is not filtered by the new iteration process.  COUNTROWS(BOM) will therefore be the total number of rows in the original table in the original filter context, every product will therefore always pass the test (or always fail – depending on the initial filter context) and there will be no change to the new Filtered Products table.  The net result is the new Filtered Products table is actually identical to the original Products table – no change in filtering at all.  The formula simply doesn’t work.

Bringing the Correct Formulas Together

So putting the correct filter formula inside the CALCULATE from earlier, I end up with this formula.

Total BOM Qty

Now I can just apply the same pattern to the other formula, switching out the table names.

And here is the working Pivot Table, same as in Power BI Desktop earlier.  The BOM table is filtering the Sales table, and the Sales table is filtering the BOM table – Many to Many using DAX formulas in action!

image

Now for the Italian Black Magic

There is another way you can write these formulas that is simpler to write and easy to read – unfortunately it is difficult to understand how it works – Marco Russo calls it Black Magic.  Here are the 2 formulas.

When you compare these Black Magic formulas against the Pseudo DAX formula I wrote earlier, you will see that I am using the BOM table as the filter expression in the first formula, and the Sales table as the filter expression in the second formula.  This doesn’t make any sense on first sight.  If filters always propagate from the one side of the relationship to the many side of the relationship, how can these formulas possibly work?  This can be explained with “Expanded Tables”.

Expanded Tables

Power Pivot is built on top of some more traditional database technologies and hence what happens inside Power Pivot can be converted (or thought of) in more traditional database patterns and structures behind the scenes.  In SQL terms, the relationships between the Sales table (shown as 1 below), the Calendar table (2) and the Products table (3) are:

Sales Left Outer Join Calendar ( 1 to 2)

Sales Left Outer Join Products ( 1 to 3)

image

If I had these tables in a relational database, I could materialise the sales table into an Expanded Table that contains all the original Sales columns, plus the Calendar table columns and the Product table columns.  To do tihs I could write the following SQL Query:

Select * 

from Sales 

Left Join Calendar on Sales.Date = Calendar.Date 
Left Join Products on Sales.[Product ID] = Products.[Product ID]

The above query will return the following Expanded Table with columns of data coming from 3 different tables.

image

Technically speaking when I place the Sales table as a filter argument inside the following formula…

I am actually placing the Expanded Table – the Sales table plus all the relevant records from the other tables on the one side of the relationships.

The Expanded table will still be filtered by the current filter context.  So if there is a filter on the Calendar table (say for 3rd Jan), then the Calendar table will filter the Sales table AND the Expanded Sales Table.  If I re-run the SQL code with a filter on Calendar[Date] = ‘3 Jan 2016’ I get this new Expanded Table.  The Calendar table is filtering the Sales table, and the Sales table is filtering the Products table.

expanded

So when Sales table is used as the filter portion of the CALCULATE function, you can only “see” the Sales table, but it is actually the entire Expanded Sales table (including the Calendar and Products tables and any filters from all of these 3 tables) that is doing the filtering, not just the single Sales table.  Filters from all 3 tables are therefore effectively filtering the BOM table and that is why it works.

Here are copies of my workbook files if  you want to take a look.

Where to learn more

I have learnt most of my advanced DAX knowledge from The Italians (Marco Russo and Alberto Ferrari).  There is an excellent video available here where Alberto explains Many to Many relationships.  I also recommend the book “The Definitive Guide to DAX” for anyone that wants to develop a deep mastery of the DAX language.

For people that are earlier in the learning stages of DAX, you really must master filter propagation and context transition before you can move forward.  If you haven’t mastered these techniques, I recommend you invest some time going back over the basics and make sure you have a solid understanding of the pre-requisites.  The fastest, cheapest and most effective way you can do this is to read my book “Learn to Write DAX”.
L2WD banner ad

Share?

Comments

  1. A beautiful article about not-so-easy-to-grasp problem. The episode has been masterfully set out with great simplification, description and clarification of all problem areas. I particularly loved the explanation of Black Magic formulas with the help of Expanded Tables concepts.

    Truly enjoyed to read and learn. Thank you Matt.

  2. Great time and effort put in here to explain a tricky concept. I didn’t know about the new feature in 2016 and wondered how I can find out about stuff like this more readily. The black magic formula is great and if you spend too much time trying to understand it your head begins to hurt!

  3. Knowing that it’s forcing a Left Outer join for the purposes of the calculation makes it all click. Thank you! I’ve seen this formula before but never seen the explanation on how that was working, and without the proper context remembering how to use it was just not happening.

  4. @Matt
    Very nicely explained – I first learnt about Magic behind the Logic of a DAX Cross Filter – here http://mdxdax.blogspot.in/2011/03/logic-behind-magic-of-dax-cross-table.html

    Also a important thing to learn about Calculate is the Filter Parameter is always a Table
    The best way to visualize this is to think of the Filters being happening using the Advanced Filter method of the Range object of Excel (which expects a range as the criteria)

    So even when you Say Calculate([mSales], PRODUCTS[PRODUCTS_TYP]=”Single”) – behind the scenes it is passing a Table of Products filtered for Type single as the “Criteria” Range for Sales and Sales is filtered for that Criteria
    Once you understand that a Table can is passed as the Filter Criteria then the Shock of
    CALCULATE ( SUM ( BOM[Qty] ), Sales ) – reduces !!

    Also a good practice is not to Create the measures on Individual tables but on a separate table called M having a column called MEASURES
    This way it becomes easy to maintain and also you dont have to re-create the measures if you decide to change the Data source of the Fact / Dim Tables or for some reason delete the Fact / Dim Tables

    • Referring to place where measures should be kept. I remember the article from Rob Collie saying that best practice is to write measures with tables which they are concerning with. You are giving me a smart hint in case deleting/changing structure ..

  5. Very good read. But I have a question. Does Bi-Directional Cross Filtering exist in Excel 2016 (I can’t find it in Powerpivot in Excel2016)? I can only see it in Power BI Desktop though.

    • Min Li, apparently you are correct – it is not there. I don’t have Excel 2016 but my understanding was that it was included. I just confirmed it was left out of the final version due to backward compatibility issues with Excel 2013. Thanks for alerting me to this.

  6. How to solve the case when you have a sub-assembly within the assembly
    For example:
    Add to tables:
    Sales:
    14-01-2016, ABPK, 2

    Products:
    ABPK, BoxToBox, Multi

    BOM:
    ABPK, ABC, 1
    ABPK, P, 2
    ABPK,K,1

    • I don’t see any issue connecting a sub assembly table to the BOM table. This would be a 1 (BOM) to many (Sub Assembly) relationship. Any filters one the BOM table would propagate to the Sub Assembly table. Alternatively you could merge the data in the BOM and Sub Assembly tables into the single BOM table. eg if the BOM has 2 products – 1 is a single item and the second is a sub assembly containing 3 items, you could just load the BOM with the 4 single items.

      • Dear Matt,

        Thank you for the informative article about a topic which I’ve been searching for ages but could not have found any satisfying answer; up to now.

        I downloaded your workbook. Changed your products table with mine. Adapted my BOM to yours and put into table BOM. Changed Sales table contents with mine. All worked perfect. I definitely will use it in my calculations.

        What I could not have achieved is that showing my BOM in multilevel. How can I make Level1, Level2, Level3 and apply BOMQty and SalesQTY to these levels in one measure? I first thought of using a parent-child relationship but as one sub assembly is used in more than one assembly it failed.

        Can you please guide me how I can show my BOM in multilevel and calculate BOMQty and SalesQty for each of my levels separately together with their subtotal corresponding the individual level?

        This is an illustration how a table I want to make:

        Sales:
        XYZ 2

        BOM:
        PRODUCT ID BOM ID Qty
        XYZ ABC 2
        XYZ D 1
        ABC A 1
        ABC B 1
        B C 8
        B H 5
        D EF 4
        EF E 2
        EF F 6

        BOM Qty Sales Qty
        Product ID Level1 Item Level2 Item Level3 Item Level1 Level2 Level3 Level1 Level2 Level3
        XYZ ABC 2 4
        —————————————————————————————————————————————————————————–
        A 1 2
        —————————————————————————————————————————————————————————–
        B 1 2
        —————————————————————————————————————————————————————————–
        C 8 16
        —————————————————————————————————————————————————————————– H 5 10
        —————————————————————————————————————————————————————————– D 1 2
        —————————————————————————————————————————————————————————– EF 4 8
        —————————————————————————————————————————————————————————– E 2 4
        —————————————————————————————————————————————————————————– F 6 12
        —————————————————————————————————————————————————————————– TOTAL Level3 21 42
        —————————————————————————————————————————————————————————– TOTAL Level2 6 12
        —————————————————————————————————————————————————————————– TOTAL Level1 3 6

        • I am happy to try to help you, but can you pleas do a couple of things to make it easier for me help. Can you please take my sample workbook and extend it to show the scenario you have. Then please post a question at http://powerpivotforum.com.au and provide the detail and also what the outcome you are expecting. I monitor he forum and will take a look when I get a few minutes.

  7. Hi Matt,

    Thanks for this awesome article.
    But I found myself confused about the pseudo dax under section: ‘How to use DAX to force the Calendar Table to filter the BOM Table’.

    For the [Total Sales Qty], why the product table should be filtered by BOM table, not other tables? also for the [Total BOM Qty], why the sales table?

    How do you translate this problem into those pseudo dax?

    • Hi Spence. The sales table is already filtered by the calendar table and the product table. This happens automatically due to automatic filter propagation. Therefore there is only 1 table in this data model that is not filtering the Sales table, and that is the BOM table. If you can get the BOM table to filter the Product table, then this filter will automatically propagate to the sales table (from the Product table). So the task is to force the BOM table to filter the Product table. If you can do this, then everything else will just work.

      Likewise, the BOM table is already filtered by the Product table, but it is not filtered by the Sales table. If you want to slice the BOM table by Sales, you need to get the sales table to filter the product table – then the product table will automatically filter the BOM table for you. There is no need to worry about the Calendar table as this table already filters the Sales table. So as long as you can get the sales table to filter the Product table, everything else will just work. Hope that makes sense.

  8. In Power BI, it seems the crossfilter not working on the measure [Total BOM Qty] when I put [Product ID] in together with [BOM ID]. It does not filter out the product with no sales for the [Total BOM Qty] measure.
    Same case when using Crossfilter function in DAX.

    However, it works well with the “black magic”, and the Calculate filter above.

    What could be the explanation for that? And I notice that the measure [Total Sales Qty] measure is ok in both cases.

Leave a Reply to Anna Cancel reply

Your email address will not be published. Required fields are marked *

Ten Things I Wish I Knew When I Started With Power Pivot.   Learn  Without The Pain!

x