Archive for SUMX

Uses for DAX as a Query Language

This is the final article in my series about DAX as a Query Language.  In my first article I introduced DAX as a Query Language and then in my second article I went in deeper to the specifics on how to use some of the more useful DAX query functions.  In this final article today I am going to cover a few of the many uses for DAX Queries and also introduce one last DAX function – the ROW function.

Create Summary Tables

As you have may have seen in my previous articles, you can use DAX Queries to create summary tables like the one below.

This may be useful if you want to pre-calculate this summary table and have it as part of your data model (more on that covered in the “Materialisation” section below).  If you want to create a summary table for any reason you can do it using DAX Queries.

Create Tables to Use As Filters Inside  CALCULATE

Another major use of queries is as a table input inside a CALCULATE function.  Because the virtual tables retain lineage to the data model they behave as if they were physically part of the data model.   The following measure returns the total sales for all unique combinations of product category and customer gender where the margin to those unique combinations is greater than 45%.

This measures doesn’t have any real value however it does demonstrate the concept well.

Create Tables to Use in Iterators

You can also create table to use as the table parameter in an iterator like SUMX or AVERAGEX.  The following measure returns the straight line average Margin % across all the actual combinations of product category and customer gender.

Here is an article I wrote some time ago that explains how to use DAX Query tables instead of a more complex nested SUMX approach.  This article will be much more useful with your new understanding of DAX Queries.

To Debug DAX Formulas (the Table Bit)

One thing I really like about DAX Queries is they allow you to “Inspect” tables that are used inside DAX Measures and Calculated Columns.  Excel users come from a world where everything thing they do – every formula and every number – is materialised in front of their eyes in the spreadsheet.  When you begin to write more complex DAX Measures or Calculated Columns, the formulas will often be in the following format:

Your Measure = CALCULATE([Some Base Measure], <A complex table function used to filter the data model>)

The issue of course is you can’t actually “see” the result of “the complex table function used to filter the data model”.  If you can’t see it, it can be hard to conceptualise and also hard to debug the measure particularly if it gives an answer different to your expectations.  This is where the concept of materialising tables comes in.  A great way to move forward when you are stuck is to write a DAX Query to return just the table portion of your measure first.  Once you have that working correctly you can insert the table portion into the measure.

Materialise Tables Into Excel

It is possible to materialise a DAX Query into Excel by using a trick that I learnt from The Italians.  For this next example I have an Excel Workbook that contains an Adventure Works Power Pivot data model.  The first thing to do is insert a Table* (not a Pivot Table) connected to the data model.  To do this, go to a blank area of the spreadsheet (shown as 1 below) and then select Data\Existing Connections\Tables, then select any table (I have selected Products shown as 5 below) and then click Open.  *At this point you can select any table as it is going to change shortly anyway.

image_thumb[38]

You will be given an option to insert a Pivot Table however you also have an option to insert a Table as shown below.  Go ahead and click OK to insert this table.

image_thumb[40]

Now that you have a table inserted into a spreadsheet from the data model, it is possible to change what is returned in this table.  To change what is returned in the table, right click anywhere inside the table (shown as 1 below) and then select Table\Edit DAX.

image_thumb[42]

Now you have the opportunity to change this table from being a “table” to being a DAX Query as shown below.

image_thumb[43]

I have now taken a DAX Query I wrote earlier and pasted this into the expression box (shown below) to return a summary table into the spreadsheet.

image_thumb[44]

You can manually refresh this table by right clicking in the table and clicking “Refresh”.  Unfortunately there is no easy way to pass filters to this type of table in Excel although I believe it can be done with VBA. That may be a topic for another day.

Materialise Tables Into Power BI

You can also materialise DAX Query Tables into Power BI using the “New Table” function.  This can be very useful to help you debug the formulas you are writing (as I covered in this article).  The only thing different to note here is that you need to omit the EVALUATE statement inside Power BI New Tables.  The example below materialises the table (shown as 5) from the (formula shown as 4).

image_thumb[45]

Check the Performance of your Measures using ROW

DAX Studio has a really cool feature that allows you to understand how Power Pivot is completing its task of calculating a formula under the hood.  To demonstrate the point I need to introduce an new DAX Query function called ROW.

You can see the syntax for ROW below from the DAX Studio Intellisense.

image_thumb

ROW returns a single row table.  Because a DAX Query can ONLY return a table, it follows that if you want to see the result of a measure inside DAX Studio then you must first convert the scalar result of the measure into a table – this is exactly what ROW does.

Below is a simple example where I return the value of the [Total Sales] measure as a new table.

image_thumb5

My ROW function (shown as 1 above) returns a single row table with 1 column (called “my result”) and the row has the value 29.3m (shown as 2 above) which is the total for the entire database.  Note the Server Timings button shown in 3 above.  If you click this button it will enable the capability for DAX Studio to keep track of how long your query takes to execute and also how the query is being executed.

With the server timings enabled, a new tab will appear in DAX Studio (shown as 1 below).

image_thumb4

After executing the query, the server timings tab will show the total time taken to execute the query in milliseconds (shown as 2), how much time each engine in Power Pivot completed the tasks (Formula Engine and Storage Engine shown in 3), and if the Storage Engine Cache was used (shown in 4).

SE or FE?

This is a huge topic in its own right and way beyond the scope of this article. What I will say here is that in general you want to leverage the Storage Engine in preference to the Formula Engine.  Take a look at some of the key points below.

Storage Engine Formula Engine
Really Fast Really Smart
Can retrieve records without decompressing in the right conditions Iterates in a row context and hence it can be slow
Multi threaded Single threaded
Cached Not cached

I will cover the Storage Engine and Formula Engine in another post sometime.

CROSSJOIN Revisited using ROW

Earlier in this series I warned that CROSSJOIN can return very large tables, but this is only really an issue if you try to materialise the tables.  If I write the following query in DAX Studio, it takes forever to execute and I ended up canceling the query (remember there are more than 7 million rows in this table).

image_thumb[46]

The issue is not the creation of the above table in Power Pivot – the issue is the materialisation of this table in DAX Studio.  Consider the following query that uses ROW to return a single row table containing the count of the rows in the CROSSJOIN query.

image_thumb[47]

The above query executed in 400 milliseconds on my PC confirming that Power Pivot can create these large tables really quickly.  If you want to materialise the results, well that is a completely different thing.

DAX as a Query Language Part 2

In my last article I introduced DAX as a Query Language and explained why learning to query your data model can be useful.  I covered the basics about using a tool like DAX Studio to query a database and return a table of data as a result.  In this article I am going to go deeper and introduce the more common and useful query functions including CROSSJOIN, SUMMARIZE and ADDCOLUMNS (there are more functions that I don’t cover in this article).

Here is the link to the PBIX data model again if you want to follow along and do the examples yourself.  I am using DAX Studio again to connect to Power BI Desktop running on my location machine.  If you take the time to complete these examples you will significantly increase your learning and knowledge retention.

Lineage

Before I move on, an interesting and important fact about Power Pivot (when compared to traditional database tools) is the concept of lineage (pronounced LIN-E-AGE).  When a new virtual table is created in a query or a formula in Power Pivot, the new table will include an automatic one to many relationship to the table from where it was created.  Consider the following simple table from last time.

image

You can see above that this query produces a single column table of all of the unique product categories.  This new virtual table retains lineage to the data model.  In this case the new virtual table comes from the Products table and hence the new virtual table has lineage to the Products table.  You can “imagine” this as shown below with the new table having a 1 to many relationship to the Products table.

.image

Note the image above is just a visual simulation of what happens.  The virtual table is not materialised and you cannot actually see it in the relationship view. But the virtual table does exist (virtually) and the lineage to the Products table also exists – it is just that you can’t actually see it.  I recommend that you learn to “imagine” this happening in the data model in your mind as it will help you understand how the new virtual table interacts with the rest of the data model especially as it relates to context transition.

All virtual tables have lineage to the tables from where they came from.

CROSSJOIN

CROSSJOIN is a function that can create a new table from 2 or more source tables. For this example I am going to join some virtual tables. The first virtual table is VALUES(Product[Category]) which of course returns a list of all the unique product categories.

image

The second virtual table is a list of all possible Customer genders

image

Next I am going to use CROSSJOIN to create a new table containing all the unique combinations of both tables.

image

In the table above there are 4 rows x 2 rows giving a total of 8 rows of all of the unique values.  Continuing on the concept of “imagining” the way these new virtual tables are related in the data model, it would look something like this:

image

Remember this is just a simulation of what it looks like.  These tables at the top are not materialised and you cannot see them in the data model.  But you can “imagine” them as looking like this and they behave in exactly the same way as they would if they were physical tables.

M x N Can Mean a Large Table

You need to be careful with CROSSJOIN as by definition the resulting table will be m x n rows long where m is the number of rows in table 1 and n is the number of rows in table 2.  If I were to CROSSJOIN the customer table (18,484 rows) with the Products table (397 rows) I would end up with more than 7 million rows.  This in itself is not a problem for Power Pivot to create such a large table in memory, but it can definitely be a problem if you try to materialise the table.  More on that next week.

SUMMARIZE

SUMMARIZE is by far my favourite DAX Query function.  SUMMARIZE can do similar things to CROSSJOIN however CROSSJOIN can join tables that do not have relationships whereas SUMMARIZE can only join tables that are related with a many to 1 relationship.

SUMMARIZE first takes a table and then one or more columns (that can be reached via a many to 1 relationship) that you want to include in the new summarised table.

SUMMARIZE(<table>, table[column], table2[column],….)

Here is an example.

image

The query results above are similar to the CROSSJOIN query from before but there is one important difference.    SUMMARIZE will only return rows that actually exist in the data itself (Note there are only 6 rows above compared with 8 rows in the CROSSJOIN example).

Consider the relevant tables from the data model below.

image

Here is the SUMMARIZE formula written earlier.

This query starts with the Sales table and then adds the Products[Category] column from the Products table and the Customers[Gender] column from the Customers table.  The 2 columns specified inside the SUMMARIZE formula come from tables on the 1 side of the many to 1 relationships – this is allowed.

The following is not allowed and will not work.

EVALUATE
SUMMARIZE(Products,Sales[CustomerKey])

It doesn’t work because the column Sales[CustomerKey] cannot be reached from the Products table via a many to 1 relationship.

It is also possible to write a SUMMARIZE statement over any single table.  In the example below, the SUMMARIZE statement returns a list of all possible combinations of product category and colour.

image

You could also achieve the same result with the ALL function (which would be an easier solution if you are just using a single table)

Adding Summary Sales to the Summary Table

So far the SUMMARIZE queries above are just lists of valid combinations.  It is time to do something more interesting and add the [Total Sales] to these summary tables.  Before moving on please note that the following formulas are not best practice – there is a better way which I will cover later.

Consider the following formula

Note specifically that the table parameter in this formula is “Products”.  Also note below that this formula returns blank rows (shown below).

image

This summarize statement correctly summarises all the combinations of Product[Category] and Product[Color] in the products table and then for those products where there are sales, those sales are shown next to the unique combination.  But in some cases the unique combination doesn’t actually have any sales, hence the blank rows.

Using Sales as the Table Parameter

If I change the formula above and swap out the Products table with the Sales table, then the blank rows are no longer visible (see below).

image

SUMMARIZE will always find the unique combinations that actually exist in the selected data.  Because this new formula starts from the Sales table, only combinations of Product[Category] and Product[Color] where there are actual sales are returned.

Context Transition or No Context Transition?

Those of you that are familiar with the concept of context transition may be thinking that context transition is occurring here.  That is a valid thing to assume but this is not what is happening here.  Consider the following formula.

image

Note how I have swapped out the measure [Total Sales] with SUM(Sales[ExtendedAmount]).  With this new formula above there is no CALCULATE forcing context transition yet despite this the table still returns the same result.  This implies that SUMMARIZE does not operate in a Row Context.  In fact SUMMARIZE is a Vertipaq Storage Engine operation.  The part that produces the valid combinations of columns is very efficient however the calculation of the total sales figures are very inefficient.  For this reason it is better to use ADDCOLUMNS to add the sales totals (see below).

ADDCOLUMNS

ADDCOLUMNS does exactly what it suggests – it adds new columns to a table in a query.  The general syntax is as follows:

ADDCOLUMNS(<table>,”Column Name”, <formula or measure>,….)

To demonstrate how this works, let me start with a formula from earlier that produces the following table.

The SUMMARIZE function returns a 2 column table with 15 rows – all the possible combinations that contains sales values.  This table can be used as the table parameter in the ADDCOLUMNS formula as follows.

image

You can see above that this new table returns the Total Sales for each of the 15 possible combinations.

And it is possible to add as many new columns as you need to the summary table.  See below.

image

The Important Differences between ADDCOLUMNS and SUMMARIZE

Now you have seen that it is possible to add columns (such as summary total sales) to a table using SUMMARIZE and also with ADDCOLUMNS.  But there are some important differences between these 2 approaches.

ADDCOLUMNS has a Row Context

Unlike I showed with SUMMARIZE earlier in this article, ADDCOLUMNS does have a row context.  Consider the following query.

image

When I swap out the measure [Total Sales] with SUM(Sales[ExtendedAmount]) the results are wrong.  This shows that ADDCOLUMNS operates in a row context.

Efficiency

When given the choice, you should choose to use ADDCOLUMNS in favour of SUMMARIZE to add these additional columns of data.  ADDCOLUMNS is a lot more efficient in the way it adds the values to the SUMMARIZE table. SUMMARIZE uses a Vertipaq Storage Engine operation to produce the base table and then ADDCOLUMNS leverages lineage and context transition to add the value columns – this approach leverages the special capabilities of Power Pivot to do the job in the most efficient way.  For a more detailed coverage of this topic you should read this article from The Italians.

Other DAX Query Functions

I realise I have not covered all DAX Query functions in this series of articles.  There are others, some of which are only available in the newer versions of Power Pivot (eg Power BI Desktop, Excel 2016).  If you are interested to find out more you can do some research online.  I will be covering 1 final function next week – the ROW function.

Uses for DAX Queries

Next week I will share my final article in this series where I explain a few ways that you can use DAX Queries in the real world.  Be sure to check back next week, or better still sign up to my weekly newsletter to be notified when there are new articles.

Measures on Rows – Here is How I did it

Level: Intermediate

You may or may not be aware that it is not possible to put Measures on rows in a Matrix in Power BI. But I came up with a trick that makes it possible, so read on to find out how.

Measures Can Only be Placed on Columns

First the problem. The only way that you can use the Power BI Matrix visualisation (at this writing) is to place the measures on the columns as shown below.  You can then take any column of data from your data model (typically from a Dimension/Lookup table) and place that on rows in the Matrix.  So you end up with this.

image

This limitation does not exist in an Excel Pivot Table.  As you can see below, it is possible to flip values between columns on rows from the Pivot Table Fields List.

measures on rows

Old tricks for New Purposes

When I was thinking through this problem, my first intuition was to use a Switch statement somehow.  Switch statements can be used to create a Switch Measure (like the ones I used in this blog post here back in 2014) and I figured this could be part of the solution.  And then I remembered another trick I learnt from Rob Collie using disconnected tables to feed a measure – I new I had a solution.

So in summary, I wrote a Switch measure that will morph into different measures when filtered, and then used a disconnected table to pass a filter to that measure.

Write a Switch Measure

The Switch measure is pretty easy to write as follows:

The way this measure works is that it takes an input in the form of an integer, and then depending on that number, it will return the value of the appropriate measure as the result.

Then Create a Helper Table

The next step is the secret sauce to this trick.  I created a disconnected table with the sole purpose of passing the correct filter values (the integers) to the Switch measure.

I used the “Enter Data” feature in Power BI to create a list of the measures I want to put on rows in the Matrix.

image

Note that the ID column is also the sort order of the measures, and the ID column also matches with the Switch measure numbering scheme.  Once the table is created, I added it to the data model as a disconnected table – no need to join it to any other tables in the data model.

I then set the sort order of the Measure column as follows:

image

Write a Harvester Measure

The next step is to write a harvester measure to extract the selected filtered value of the table.  This measure is simply as follows:


The measure above will return the largest integer in the current filter context.  If there is a filter (eg a slicer) on the Measure column and the user as selected “Total Margin” in the slicer, then there is only a single row visible in the filter context, and hence the MAX of the visible rows is 3.  I would of course get the same result if I used MIN, AVERAGE, SUM or indeed even VALUES in my harvester measure.

Filter context is one of the hardest things for new users to understand properly.  I explain filter context in detail (and everything else you need to know to be great at using Power Pivot and Power BI) in my book “Learn to Write DAX”.
L2WD banner ad

You can see the harvester measure in action below.  I have placed the column from the MeasureTable into a slicer and the harvester measure into a Matrix.  When I click on the slicer, the harvester measure updates to show the ID of the selected filter.

slicer

Add the Measure Column to the Matrix

There is more than 1 way to filter a table of course.  In the image above I am using a slicer, but I can also use the rows on the Matrix to provide filter context.  When I place the Measure column from the MeasureTable onto rows in the Matrix, you can see below that the harvester measure is correctly returning the value of the ID column to the visualisation.

image

Pulling it All Together

In order to pull this all together, I replaced the harvester measure with the Switch measure I wrote earlier, and put the Categories column from the Products table onto columns in the Matrix.

image

But there are now 2 problems.  The first problem is that the Total Invoices are shown in currency format.  One limitation of this trick is that all the measures must have the same number formatting.  If you would like to see an optional “Alternative Format” for the SWITCH measure, please vote for that idea here.

The second issue is that the total row is actually showing the total invoices and not the total of all the measures.  This makes sense of course because the grand total row is not filtered.  As a result, the MAX of ID is 4 and hence the SWITCH measure is returning the value of the measure that maps to ID 4 (Total Invoices).

To fix these 2 problems, I removed Total Invoices from my table and rewrote the SWITCH statement to correctly handle the grand total row.

And there you have it, Measures on Rows in a Power BI Matrix.

image

A Double CALCULATE Solves a SUMX Problem

Level: Intermediate

I helped a member at powerpivotforum.com.au with a problem last week that ended with an interesting solution.  The explanation of how it worked was a bit complicated and worthy of sharing, hence this is the topic of today’s post.

Count the Working Days Between Two Dates

The requirement was to count the working days between the arrival date (say of a request) and the approval date of that request.

I set up a simple data model as follows

Data Table

The data table has a unique identifier (ID), a received date and a approved date.

image

Calendar Table

I created a simple calendar table that has just 2 columns; a Dates column and a flag to indicate if each date is a working day.  I used 0s and 1s in this second column so that it would be easy to add up this column to calculate the total working days.

image

Table Structure

I then created a single join between the calendar date and the received date.  Technically I could have created the join on the Date Approved column as an alternative, however it is possible (even likely) that some requests will be “open” and hence will not have an Approved Date at some point in time.  For this reason I have used the Date Received column to make the join.

image

Calculated Column or Measure?

There are many ways to write a formula that will work.  Of course it is possible to write a calculated column in the data table that contains the total business days for each request.  This would not be a complete disaster in this case as the cardinality of this column is likely to be relatively low.  However this is not the best approach for a number of other reasons, the main one being that “Measures” are the turbo charged super tools of Power Pivot (read all the reasons here).  You should invest time to learn how to write Measures, even (especially) when it is hard. In doing so you will build your skills and depth of understanding of how to use and leverage Power Pivot.

If you laze about in calculated columns world, you will never develop your skills enough to be really good at DAX.

Part Solution:  A Simple SUM Measure

The first part of the Measure I wrote to solve this problem is as follows:

image

When reading a CALCULATE formula, you always start with the second parameter – in this case the FILTER function (lines 4 through 8).  Filter is an iterator, and in this case it is iterating over the CALENDAR table AFTER any existing filters on the Calendar table are first removed (line 5).

FILTER works through each row in the Calendar table and checks to see if the date of each row in the Calendar table is greater than the maximum Date Received (in the current filter context) and also if that date is less than the maximum Date Approved (also in the current filter context).  The FILTER function keeps the rows in the Calendar table that pass this test, and then CALCULATE (line 2) applies this FILTER prior to adding up the week days (line 3).  Actually, This FILTER function is not very efficient – a better solution is provided below.

It is easier to understand how the formula works with an example, so read on.
L2WD banner ad

Worked Through Example in a Pivot Table

Look at Request ID 2 in the pivot table below (the second row in the pivot). Pivot tables naturally provide initial filter context to the data model – that’s what they do; that’s why they are so great. 

image

When looking at Request ID 2 in the above pivot table, the data model has an initial filter context so that only a single row in the Data table is “visible”.  It is possible to simulate this initial filter context into the Data table by physically applying a filter to the source table (like shown below ID = 2).  You can do this in the Power Pivot window, or in this case I have used the linked table in Excel to simulate the filter as shown below.

image

When you simulate the initial filter context like this, it is much easier to understand how the formula works.  Referring back to the formula (lines 6 and 7), you need to ask yourself “What is the MAX value of Date Received in the current filter context?”.  The answer is clearly 4th Jan and the maximum of Date Approved in the current filter context is 9th Jan.  Once you know the answer to the MAX parts of this FILTER function (lines 6 & 7), it is then a lot easier to understand what the FILTER function is actually doing.  Remember line 5 of the formula first removes all filters on the Calendar[Date] column (in this example there actually are no filters on the calendar table as there are no columns from Calendar in the pivot).  Then FILTER checks each row of the Calendar table to see if the date in each row is >= 4th Jan and also <= 9th Jan.  FILTER keeps all rows in the Calendar table that pass this test.  Note I could have used MAX or MIN or even VALUES to “harvest” the value in the current filter context.  In some cases you can even use SUM or AVERAGE – it really depends on your data and what you are trying to do.  In short, the Aggregation functions and VALUES when used this way will “harvest” the value in the current filter context – very useful.

It is possible to simulate the results of this FILTER function on the Calendar table the same way as I showed with the Data table above.  In the image below, note how I have used the standard Excel table filter feature to simulate the FILTER function in DAX.  After applying the filters in Excel, there are 6 rows left in the table, and 5 of those rows have an “IS Weekday” value of 1.

image

So when SUM in the formula kicks in, it will return the value 5 – exactly the value shown in the pivot table above.

Note how much easier it is to understand and learn how these functions work when you simulate what is happening in the actual tables? This is a great way to learn to “think in tables”.

The First Problem with this Partial Formula

You may have noticed that there is a problem with the partial formula shown above.  Referring back to the pivot table (shown here again), see how the Grand Total row is blank?

image.png

What is really needed is for the Grand Total row to be the sum of the values shown in the rows in the pivot table ie the Grand Total should be 10.  To understand why the Grand Total it is blank, you have to go through the same process I walked through above.  In short, the initial filter context of the Grand Total in the pivot table is “completely un-filtered”.  So MAX of Date Received is 15th Jan and MAX of Date Approved is 11th Jan.  If you go ahead and simulate this by applying these filters onto the Excel Calendar table (like before), there are no rows in the Calendar table that pass the test.  The Calendar table is therefore completely filtered of all rows – none is visible, and hence the SUM formula returns BLANK (note it is BLANK, not 0).

A More Complete SUMX Solution (Still Not Optimal)

Whenever you get this “Grand Totals don’t add up” problem, you should think of SUMX as the solution. I like to think of SUMX as a way to “simulate the natural filtering behaviour provided by the rows of the pivot table”.  Each row of the pivot table is providing initial filter context for one record in the Data table and hence the formula works on row level.  But no such Data table filtering is provided by the Grand Total row and hence the formula doesn’t work on Grand Total level.  I cover this SUM vs SUMX behaviour in more depth in a blog post here.

Here is a formula that you may think should work, but it actually doesn’t work at all.

image

In the above formula, SUMX iterates over the Data table (line 3).  For each row in the Data table, the formula lines 4 through 10 are executed.  Intermediate DAX users can be forgiven for thinking that Context Transition occurs because of the CALCULATE function on line 4 – but in this case it does not happen.   Remember earlier in the post I explained that the second parameter of CALCULATE is the first part of the formula to be executed.  So in the above formula, the FILTER portion (lines 6 through 9) is executed in the initial filter context without any context transition. The CALCULATE isn’t executed until after the FILTER portion is complete.  That is why CALCULATE in this case does not force Context Transition and that is why this formula doesn’t work.

Here is a formula that does work (Still Not Optimal)

image

Note above in this new correct formula the addition of a second CALCULATE on line 4.  This CALCULATE function (ending on line 12) only has a single parameter (which is the second CALCULATE from line 5 through 12).  Because the inner CALCULATE function is the first parameter of the outer CALCULATE, the outer CALCULATE is executed first (note if the inner CALCULATE was the second parameter of the outer CALCULATE, then the inner CALCULATE would be executed first – confusing, hey!).  The outer CALCULATE forces context transition onto the Data table for every iteration step in the SUMX function before the inner CALCULATE is executed.  This context transition is in effect “simulating” the row by row behaviour of the original pivot table regardless if there is an ID used in the filter context or not.  This is why it works at the Grand Total level too.

works

The Optimal SUMX Formula

I mentioned above that the previous formulas are not optimal. The issue is that the Vertipaq engine in Power Pivot is a Columnar Data Store.  That means data is compressed and stored in Columns – not Tables.  The most efficient way to access the compressed data is one column at a time.  In the formulas above, the FILTER function is operating over 2 separate columns in the same table – this is not optimal – in fact it can be very inefficient on large tables.  In order for FILTER to do the comparison on 2 separate columns in the same table, Power Pivot needs to decompress the data – this is slow and inefficient.

So a better formula is this.

best

In this final formula, lines 7 and 8 are 2 independent FILTER functions each operating over a single column in the Data table. With this structure, the Vertipaq engine can complete each operation (independently) really fast, and without decompressing the data.

This final formula is fine over a relatively small data table, but would still not be efficient over a very large data table.  What is “large” depends on your computer.

You can download the sample workbook here.

I will cover a more detailed description of context transition in a future post.

If you like the way I explained this problem and you are a starting out on your formal DAX learning, you may like to consider reading my book “Learn to Write DAX” for a complete and structured way to build your DAX skills.  If you already have a solid understanding of DAX and want to go to the next level, then you should consider the websites, books and training courses from sqlbi.com   I have learnt most of my advanced understanding of DAX from Marco and Alberto using these methods.