Archive for SUMX

Measures on Rows – Here is How I did it

Level: Intermediate

You may or may not be aware that it is not possible to put Measures on rows in a Matrix in Power BI. But I came up with a trick that makes it possible, so read on to find out how.

Measures Can Only be Placed on Columns

First the problem. The only way that you can use the Power BI Matrix visualisation (at this writing) is to place the measures on the columns as shown below.  You can then take any column of data from your data model (typically from a Dimension/Lookup table) and place that on rows in the Matrix.  So you end up with this.

image

This limitation does not exist in an Excel Pivot Table.  As you can see below, it is possible to flip values between columns on rows from the Pivot Table Fields List.

measures on rows

Old tricks for New Purposes

When I was thinking through this problem, my first intuition was to use a Switch statement somehow.  Switch statements can be used to create a Switch Measure (like the ones I used in this blog post here back in 2014) and I figured this could be part of the solution.  And then I remembered another trick I learnt from Rob Collie using disconnected tables to feed a measure – I new I had a solution.

So in summary, I wrote a Switch measure that will morph into different measures when filtered, and then used a disconnected table to pass a filter to that measure.

Write a Switch Measure

The Switch measure is pretty easy to write as follows:

The way this measure works is that it takes an input in the form of an integer, and then depending on that number, it will return the value of the appropriate measure as the result.

Then Create a Helper Table

The next step is the secret sauce to this trick.  I created a disconnected table with the sole purpose of passing the correct filter values (the integers) to the Switch measure.

I used the “Enter Data” feature in Power BI to create a list of the measures I want to put on rows in the Matrix.

image

Note that the ID column is also the sort order of the measures, and the ID column also matches with the Switch measure numbering scheme.  Once the table is created, I added it to the data model as a disconnected table – no need to join it to any other tables in the data model.

I then set the sort order of the Measure column as follows:

image

Write a Harvester Measure

The next step is to write a harvester measure to extract the selected filtered value of the table.  This measure is simply as follows:


The measure above will return the largest integer in the current filter context.  If there is a filter (eg a slicer) on the Measure column and the user as selected “Total Margin” in the slicer, then there is only a single row visible in the filter context, and hence the MAX of the visible rows is 3.  I would of course get the same result if I used MIN, AVERAGE, SUM or indeed even VALUES in my harvester measure.

Filter context is one of the hardest things for new users to understand properly.  I explain filter context in detail (and everything else you need to know to be great at using Power Pivot and Power BI) in my book “Learn to Write DAX”.
L2WD banner ad

You can see the harvester measure in action below.  I have placed the column from the MeasureTable into a slicer and the harvester measure into a Matrix.  When I click on the slicer, the harvester measure updates to show the ID of the selected filter.

slicer

Add the Measure Column to the Matrix

There is more than 1 way to filter a table of course.  In the image above I am using a slicer, but I can also use the rows on the Matrix to provide filter context.  When I place the Measure column from the MeasureTable onto rows in the Matrix, you can see below that the harvester measure is correctly returning the value of the ID column to the visualisation.

image

Pulling it All Together

In order to pull this all together, I replaced the harvester measure with the Switch measure I wrote earlier, and put the Categories column from the Products table onto columns in the Matrix.

image

But there are now 2 problems.  The first problem is that the Total Invoices are shown in currency format.  One limitation of this trick is that all the measures must have the same number formatting.  If you would like to see an optional “Alternative Format” for the SWITCH measure, please vote for that idea here.

The second issue is that the total row is actually showing the total invoices and not the total of all the measures.  This makes sense of course because the grand total row is not filtered.  As a result, the MAX of ID is 4 and hence the SWITCH measure is returning the value of the measure that maps to ID 4 (Total Invoices).

To fix these 2 problems, I removed Total Invoices from my table and rewrote the SWITCH statement to correctly handle the grand total row.

And there you have it, Measures on Rows in a Power BI Matrix.

image

A Double CALCULATE Solves a SUMX Problem

Level: Intermediate

I helped a member at powerpivotforum.com.au with a problem last week that ended with an interesting solution.  The explanation of how it worked was a bit complicated and worthy of sharing, hence this is the topic of today’s post.

Count the Working Days Between Two Dates

The requirement was to count the working days between the arrival date (say of a request) and the approval date of that request.

I set up a simple data model as follows

Data Table

The data table has a unique identifier (ID), a received date and a approved date.

image

Calendar Table

I created a simple calendar table that has just 2 columns; a Dates column and a flag to indicate if each date is a working day.  I used 0s and 1s in this second column so that it would be easy to add up this column to calculate the total working days.

image

Table Structure

I then created a single join between the calendar date and the received date.  Technically I could have created the join on the Date Approved column as an alternative, however it is possible (even likely) that some requests will be “open” and hence will not have an Approved Date at some point in time.  For this reason I have used the Date Received column to make the join.

image

Calculated Column or Measure?

There are many ways to write a formula that will work.  Of course it is possible to write a calculated column in the data table that contains the total business days for each request.  This would not be a complete disaster in this case as the cardinality of this column is likely to be relatively low.  However this is not the best approach for a number of other reasons, the main one being that “Measures” are the turbo charged super tools of Power Pivot (read all the reasons here).  You should invest time to learn how to write Measures, even (especially) when it is hard. In doing so you will build your skills and depth of understanding of how to use and leverage Power Pivot.

If you laze about in calculated columns world, you will never develop your skills enough to be really good at DAX.

Part Solution:  A Simple SUM Measure

The first part of the Measure I wrote to solve this problem is as follows:

image

When reading a CALCULATE formula, you always start with the second parameter – in this case the FILTER function (lines 4 through 8).  Filter is an iterator, and in this case it is iterating over the CALENDAR table AFTER any existing filters on the Calendar table are first removed (line 5).

FILTER works through each row in the Calendar table and checks to see if the date of each row in the Calendar table is greater than the maximum Date Received (in the current filter context) and also if that date is less than the maximum Date Approved (also in the current filter context).  The FILTER function keeps the rows in the Calendar table that pass this test, and then CALCULATE (line 2) applies this FILTER prior to adding up the week days (line 3).  Actually, This FILTER function is not very efficient – a better solution is provided below.

It is easier to understand how the formula works with an example, so read on.
L2WD banner ad

Worked Through Example in a Pivot Table

Look at Request ID 2 in the pivot table below (the second row in the pivot). Pivot tables naturally provide initial filter context to the data model – that’s what they do; that’s why they are so great. 

image

When looking at Request ID 2 in the above pivot table, the data model has an initial filter context so that only a single row in the Data table is “visible”.  It is possible to simulate this initial filter context into the Data table by physically applying a filter to the source table (like shown below ID = 2).  You can do this in the Power Pivot window, or in this case I have used the linked table in Excel to simulate the filter as shown below.

image

When you simulate the initial filter context like this, it is much easier to understand how the formula works.  Referring back to the formula (lines 6 and 7), you need to ask yourself “What is the MAX value of Date Received in the current filter context?”.  The answer is clearly 4th Jan and the maximum of Date Approved in the current filter context is 9th Jan.  Once you know the answer to the MAX parts of this FILTER function (lines 6 & 7), it is then a lot easier to understand what the FILTER function is actually doing.  Remember line 5 of the formula first removes all filters on the Calendar[Date] column (in this example there actually are no filters on the calendar table as there are no columns from Calendar in the pivot).  Then FILTER checks each row of the Calendar table to see if the date in each row is >= 4th Jan and also <= 9th Jan.  FILTER keeps all rows in the Calendar table that pass this test.  Note I could have used MAX or MIN or even VALUES to “harvest” the value in the current filter context.  In some cases you can even use SUM or AVERAGE – it really depends on your data and what you are trying to do.  In short, the Aggregation functions and VALUES when used this way will “harvest” the value in the current filter context – very useful.

It is possible to simulate the results of this FILTER function on the Calendar table the same way as I showed with the Data table above.  In the image below, note how I have used the standard Excel table filter feature to simulate the FILTER function in DAX.  After applying the filters in Excel, there are 6 rows left in the table, and 5 of those rows have an “IS Weekday” value of 1.

image

So when SUM in the formula kicks in, it will return the value 5 – exactly the value shown in the pivot table above.

Note how much easier it is to understand and learn how these functions work when you simulate what is happening in the actual tables? This is a great way to learn to “think in tables”.

The First Problem with this Partial Formula

You may have noticed that there is a problem with the partial formula shown above.  Referring back to the pivot table (shown here again), see how the Grand Total row is blank?

image.png

What is really needed is for the Grand Total row to be the sum of the values shown in the rows in the pivot table ie the Grand Total should be 10.  To understand why the Grand Total it is blank, you have to go through the same process I walked through above.  In short, the initial filter context of the Grand Total in the pivot table is “completely un-filtered”.  So MAX of Date Received is 15th Jan and MAX of Date Approved is 11th Jan.  If you go ahead and simulate this by applying these filters onto the Excel Calendar table (like before), there are no rows in the Calendar table that pass the test.  The Calendar table is therefore completely filtered of all rows – none is visible, and hence the SUM formula returns BLANK (note it is BLANK, not 0).

A More Complete SUMX Solution (Still Not Optimal)

Whenever you get this “Grand Totals don’t add up” problem, you should think of SUMX as the solution. I like to think of SUMX as a way to “simulate the natural filtering behaviour provided by the rows of the pivot table”.  Each row of the pivot table is providing initial filter context for one record in the Data table and hence the formula works on row level.  But no such Data table filtering is provided by the Grand Total row and hence the formula doesn’t work on Grand Total level.  I cover this SUM vs SUMX behaviour in more depth in a blog post here.

Here is a formula that you may think should work, but it actually doesn’t work at all.

image

In the above formula, SUMX iterates over the Data table (line 3).  For each row in the Data table, the formula lines 4 through 10 are executed.  Intermediate DAX users can be forgiven for thinking that Context Transition occurs because of the CALCULATE function on line 4 – but in this case it does not happen.   Remember earlier in the post I explained that the second parameter of CALCULATE is the first part of the formula to be executed.  So in the above formula, the FILTER portion (lines 6 through 9) is executed in the initial filter context without any context transition. The CALCULATE isn’t executed until after the FILTER portion is complete.  That is why CALCULATE in this case does not force Context Transition and that is why this formula doesn’t work.

Here is a formula that does work (Still Not Optimal)

image

Note above in this new correct formula the addition of a second CALCULATE on line 4.  This CALCULATE function (ending on line 12) only has a single parameter (which is the second CALCULATE from line 5 through 12).  Because the inner CALCULATE function is the first parameter of the outer CALCULATE, the outer CALCULATE is executed first (note if the inner CALCULATE was the second parameter of the outer CALCULATE, then the inner CALCULATE would be executed first – confusing, hey!).  The outer CALCULATE forces context transition onto the Data table for every iteration step in the SUMX function before the inner CALCULATE is executed.  This context transition is in effect “simulating” the row by row behaviour of the original pivot table regardless if there is an ID used in the filter context or not.  This is why it works at the Grand Total level too.

works

The Optimal SUMX Formula

I mentioned above that the previous formulas are not optimal. The issue is that the Vertipaq engine in Power Pivot is a Columnar Data Store.  That means data is compressed and stored in Columns – not Tables.  The most efficient way to access the compressed data is one column at a time.  In the formulas above, the FILTER function is operating over 2 separate columns in the same table – this is not optimal – in fact it can be very inefficient on large tables.  In order for FILTER to do the comparison on 2 separate columns in the same table, Power Pivot needs to decompress the data – this is slow and inefficient.

So a better formula is this.

best

In this final formula, lines 7 and 8 are 2 independent FILTER functions each operating over a single column in the Data table. With this structure, the Vertipaq engine can complete each operation (independently) really fast, and without decompressing the data.

This final formula is fine over a relatively small data table, but would still not be efficient over a very large data table.  What is “large” depends on your computer.

You can download the sample workbook here.

I will cover a more detailed description of context transition in a future post.

If you like the way I explained this problem and you are a starting out on your formal DAX learning, you may like to consider reading my book “Learn to Write DAX” for a complete and structured way to build your DAX skills.  If you already have a solid understanding of DAX and want to go to the next level, then you should consider the websites, books and training courses from sqlbi.com   I have learnt most of my advanced understanding of DAX from Marco and Alberto using these methods.