Part 2 - Filtering Out BLANKS in a MEDIANX Measure Using DAX Studio - Excelerator BI

Part 2 – Filtering Out BLANKS in a MEDIANX Measure Using DAX Studio

This article is a follow on from last week. I recommend you go back and read the article first if you missed it, but in summary, I want to write a measure (not a calculated column) that will return the median sales of products while excluding the products with blanks (no sales). As I showed last week, this is relatively easy with a calculated column. Here it is again.  Remember writing calculated columns first is a great way to visualise the problem you want to solve.  It is not a great way to solve most problems (some yes, most, no).

Sample Data – Simplified Adventure Works Star Schema

I am using my standard Adventure Works simple star schema for this example.

This model has a products table and a sales table, but there is no information stored anywhere that tells me the total sales for each product.

Calculating Median With a Calculated Column

Calculating the median is pretty easy with a calculated column. I first created a calculated column in the products table of Adventure Works as shown below by simply referring to the [Total Sales] measure in the new column.

I added this column as a card to my report, and set the aggregation to MEDIAN, this is what I got.

When I wrote the equivalent test measure using MEDIANX, this is what I got.

It turns out that when you use a column in a card in Power BI and find the median value, the visual auto filters out the BLANKS for you, whether you want that or not. I couldn’t see any way to change this behaviour.

How to Solve This With a Measure and DAX Studio

The timing of this article was perfect given the recent integration of DAX Studio with Power BI Desktop via the External Tools Toolbar. I have given lots of examples of how to get value from DAX Studio in the past, and this is another such article.

I realised last week that I would have to do a few things to solve this problem in a Measures. I would have to

  • Create a virtual table
  • Generate the column of sales
  • Filter out the blanks
  • Then do the calculation.

At first, this can seem overwhelming, but if you take a step back and do each step, one step at a time, then you can do it. Here is how I did it.

Open DAX Studio

I fired up DAX Studio by going to the External Tools bar and launching it from there (read about how to install DAX Studio in last week’s article). One of the key differences between using DAX Studio and writing DAX measures is that DAX studio MUST return a table and a measure MUST return a single scalar value. This is the exact reason I am using DAX Studio to solve this problem. Steps 1, 2 and 3 above are all Table function steps. I want to SEE what I am doing, THEN I will come back and to the final calculation. Trying to do it conceptually (when you first start out, anyway) is near impossible. Do yourself a favour and fire up DAX Studio to step through the problem.

I wrote my first DAX Query in the pane at the top (1 below) and clicked Run. The results were shown below (2). Note the results are a table.

Every DAX query must start with the EVALUATE statement when using DAX Studio. This is not the case if you are using the New Table button in DAX or if you are embedding the DAX Query inside a DAX measure. If you want to learn more about DAX Queries, I have a series of 4 articles starting here.

As you can see in 2 above, this query returned the entire table. Now for this exercise, I am trying to get a table containing total sales for each product that has any sales at all. One principle when you are writing queries like this is to not return any data that you don’t need. The products table is a dimension table, and that means that the ProductKey column is a primary key. I can get the same final outcome by simply using the ProductKey instead of the entire Products table. I modified the query and ran it again. This is what I got. A single column containing all the product keys.

Add a “Calculated Column”

In last week’s article, I showed you how to create a calculated column first so that you could see what you were doing, then move to a measure. When you write DAX queries, you use a slightly different approach. There is a special function called ADDCOLUMNS that can add a new column to a table.

As you can see above, I have wrapped the original query (Line 3) inside an ADDCOLUMNS function. Line 4 specified the name of the new column and also what value would be added to the column. Do you see the similarities to writing a calculated column and the ADDCOLUMNS function?

Filter the Blanks

See how much easier this is to do when you can SEE what is going on? I can SEE all those blanks, and I know I have to get rid of them. This is where DAX as a query language is really powerful. The code above from line 2 to 5 returns a TABLE. This table can be used as the first parameter of a FILTER function, as follows.

Notice that the FILTER function above takes a table as the first parameter (lines 3 through 6) and then the second parameter (line 7) filters out the zeros. Zeros, blanks, nulls, it’s all the same to DAX. Also notice here that line 7 refers to the column created in line 5.  Now if you are familiar with best practice, you will know it is not good practice to refer to a column unless you specify the table, ie Table[Column].  But when you create a column using ADDCOLUMNS in a DAX query, you cannot refer to the table name (because the table doesn’t have a name).  That is why line 7 above looks like I am filtering on the measure [Sales], but it is actually the column [Sales].

So now the table in the results window above contains the total sales for each product that has sales > 0.

Grab the Median Result

If I now try to wrap the above table in a MEDIANX, this is what I get.


Note I get errors. Why? Do you remember me saying at the beginning. DAX Studio MUST return a table. MEDIANX returns a value, so you can’t use it in DAX Studio like this.

Curly braces to the rescue! { }

As I mentioned last week, you can use the curly braces to convert a value to a table. So then I wrote this.

And DAX studio returned a single column, single row table containing the median answer I am after. All that was left to do was to copy the formula (lines 2 to 11) and paste them into a measure in Power BI. Here is the final measure

Product Sales Median =
MEDIANX (
    FILTER (
        ADDCOLUMNS ( ALL ( Products[ProductKey] ), "Sales", [Total Sales] ),
        [Sales] > 0
    ),
    [Sales]
)

Now here’s my point.  Try writing the above formula in your first 2 years of writing DAX without using DAX Studio. I’m not saying there are not any super humans that can do this – I’m sure there are. But do yourself a favour. Break the problem into pieces and visualise each step along the journey so you can do it too.

15 thoughts on “Part 2 – Filtering Out BLANKS in a MEDIANX Measure Using DAX Studio”

  1. Thank you so much for this simple, straightforward explanation!!! I was having a problem with median for grouping and thanks to your article, got it working in Dax Studio. However, when I paste the very same measure into PowerBI, it still returns blank. Any idea why that might be happening??

    Not sure if you’ll answer this since it’s been so long, but trying anyway.

    Thank you!

  2. Worth noting that sometimes 0 is desired for MEDIANX, like ages, where 0 days means immediate/same day.
    In such cases, instead of [Column] > 0
    one might prefer NOT( ISBLANK( [Column] ) )

  3. Hi Matt, long time no write.

    “So I have to say, I was very surprised that my calculated column and my measure gave me a different result.”

    Me too!

    I have discussed this a couple of times with other DAX developers. The real culprit here is the fact that the MEDIAN and MEDIANX functions operate under two different semantic interpretations. Probably because Microsoft believes there is a difference between calculating the median of a column and a potential expression – in your case the [total sales] measure. Well, there is probably more than one opinion on this.

    That aside. If no referential integrity violation between the sales and product tables exists and every sales row contain an amount (as opposed to a blank value) you could alternatively use this measure:

    Product Sales Median =
    MEDIANX (
    SUMMARIZE ( Sales, Products[ProductKey] ),
    [Total Sales]
    )

    Your measure is much easier to understand and probably faster (in most scenarios) so my measure is only meant as an example of another way to attack the problem.

    PS:
    Will said to me “don’t mention the word ‘iterator’ as it is too confusing.”. (from part 1)
    Something has to be done 🙂

    Best regards Jes.

  4. You have the non-common ability to breakdown an issue and explain it in logical, incremental steps that make it all fit together!!! Thank you very much!!

    You have made me excited about starting to use DAX Studio!!

  5. Nice post Matt… I really need to start leveraging DAX Studio instead of visualizing it in my head. Looks much easier!

  6. Frank Arendt-Theilen

    Hi Matt,
    This article is again an added value. To visualize the individual steps in DAX-Studio are simply great.
    Thank you very much!

    Regards FrankAT

    1. Definitely Frank. This is one of my all time best tips. Solve it visually first, then, if needed, turn it into a measure

  7. Hi Matt. Thank you for the excellent content. I have just started my DAX journey and your blog is helping me so much.
    A question about the first step when you suggest “I am trying to get a table containing total sales for each product that has any sales at all”. The first thing that popped into my head is ‘Sales'[Product Key]. It seems like the remainder of the formula would work if you start here. However, what I am most interested in is the way we should think about the problem. It doesn’t seem like there would be any downsides from starting at the ‘Sales’ table?

    1. I’m not sure I understand your point, but I would like to. To solve this problem, I need a summary of all sales by product. I could easily create this as a new calculated table, or I could easily create it by adding a new column to the product table (the product table is at the same grain as I need). The sales table has all the detail, yes, but it is at a finer grain – it is at the product and date level, hence has many rows for each product. So while you definitely need this table to solve the problem, The result can not be stored in this table – the grain is wrong. Does that clarify? Please share your thoughts.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top