The variable feature in DAX has been around for more than a year and I have simply never got around to writing an article about it – I am changing that today. Let me start off with some version information and syntax and then I will show you are real life example where I have used variables to improve my formula.
Variables in DAX is a relatively new feature and is available in
- Power BI Desktop
- Excel 2016
- SSAS Tabular 2016
Variables are not available in Excel 2013 or Excel 2010.
I normally refer to variables as “the VAR syntax”. In short there are 2 new keywords that are used together to make your formulas more readable. The keywords are VAR and RETURN. This is how you would write a typical formula.
myFormula = VAR = VariableName1 = <some valid DAX formula> VAR = VariableName2 = <some other valid DAX formula> RETURN <another valid DAX formula that can use VariableName1 and VariableName2 as part of the expression>
Notes on the Syntax
It is possible to set scalar values as variables but it is also possible to assign tables as variables. See the following valid examples for assigning a variable within a formula. Note that the following are just examples of the VAR line. You must always use VAR and RETURN to write a valid complete DAX formula.
Scalar Value Example
VAR myScalarValue = SUM(Sales[Extended Amount])
VAR myTable = FILTER(Customer,Customer[Post Code] = 50210)
The above variable myTable returns a virtual table that can be used anywhere that a table is used in a DAX formula, such as a filter inside a CALCULATE function.
Other Important Points
- Each of the variables are evaluated once prior to the commencement of the RETURN portion of the formula.
- Variables can refer to other variables.
- The variables are executed in the initial filter and row contexts.
- The results of the variable calculations are cached for reuse inside the formula.
- Once the variable has been assigned a value, that value cannot change during the execution of the RETURN portion of the formula. From that perspective, the variables act more like constants than regular variables in a traditional programming language context.
There are many benefits of using the VAR syntax, particularly when you also use in line comments.
- The formulas can be easier to write. It may seem to take more time to write the formula at first, but you may just end up saving that time by getting to a working version of the formula faster, with less rework.
- The formulas can be easier to read. A reader can understand the logic of the formula step by step and hence it can be easier to work out what the formula is designed to do. This is particularly true when the author uses comments, new lines and spacing as well.
- The formulas can be easier to maintain when changes are needed – see my example below.
- Formulas that use repetitive sub strings of DAX can be written once and then used many times. This is good for formula writing but also for performance (although DAX already had some good caching features prior to this syntax).
- Given the variables are executed in the initial filter context, this syntax can be an easier alternative to using the EARLIER function (when you have nested row contexts inside a complex formula).
Real Life Example
I was with a client last week building a sales force incentive tracking system. It is very common for sales management to want to measure performance across a range of different business drivers in order to focus the sales force on what is important. As the saying goes, “what gets measured gets done”. Power BI, Power Pivot and Power Query are excellent tools for this type of reporting. Reasons include:
- Often the data sources are disparate and need to be brought together into a single report.
- The time horizon for the tracking is normally short and means that the cost and effort of a full blown IT BI solution often makes it hard to justify.
- The effort to manually produce such a report on a regular cycle (say monthly) can be high and hence the work tends to get done less regularly (say quarterly or even half yearly). Irregular updates of such tracking tools defeats the purpose of the entire exercise.
In short my client has an annual “team of the year” award that is based on a number of different criteria across many different data sources including sales performance, execution effectiveness, new product launch effectiveness etc – pretty standard sales stuff. Teams are awarded a number of points in each area reflecting based on performance; the points available are relative to the importance of each business driver. Teams are awarded points across all areas based on performance and then the points can be added up to come up with an overall score and hence the winning team.
Assigning Points for Sales Growth
I am going to share the formula I wrote for assigning points for just one of the areas – sales growth. The business rules for assigning points were as follows:
- If you decline on last year, the team gets no points.
- For each 0.1% growth between 0 and 1% over last year, the team receives 100 points.
- For each 0.1% growth above 1% over last year, the team receives 200 points.
First a Design Decision
The first thing I had to decide was how to maintain these business rules. I could think of 3 options.
- Hard code the business rules into the formula itself.
- Create a parameter table that contained all of the key numbers (eg how many points to assign, upper and lower trigger points to get those points etc).
- Write the formula using the VAR syntax.
I pretty quickly came to a conclusion that option 3 was the best in this case. The effort to load a parameter table and then extract the values so they could be used is relatively high compared to using the VAR syntax. Given the use of the business rules were limited to this one formula, I felt the VAR syntax was the best option.
Here is the final formula I ended up writing, but as often is the case, I think the process of writing the formula is much more interesting and useful than the final formula.
Writing the Formula
Writing formulas often take some iterations and rework – it is seldom a case of just typing in the full correct formula from start to finish. To start with I wrote variables for each of the business rules I was doing to need to apply (lines 2 to 6 above). But it definitely wasn’t as simple as just typing each line one after the other. In fact I ended up using Excel to build the formula first to break the problem into its component pieces. Once I had the Excel version working, it was much easier to write the DAX.
The RETURN Portion
Note that the actual DAX function commences from line 7. On line 9 you can see the first reference to one of the variables as an input into the final formula (GrowthRounded in this case). As it turned out, I changed the logic of the way the points were assigned. It seemed easiest to assign 100 points to all sales above 0 and then assign a further 100 points for all sales above 1% (as opposed to assigning 100 point between 0 and 1% and then assigning 200 points for sales above 1%). But of course all of this is clearly documented in the formula so it is clear what was done and why.
Hopefully you can see from this example that the VAR syntax makes it easy to write formulas that can be read, maintained and understood by others.
How to Avoid the EARLIER Function
As I mentioned up front, The EARLIER function can be a bit confusing when you are learning DAX. EARLIER is used to refer to a value in a previous row context – let me explain. Consider the following 2 calculated columns in the customer table of Adventure Works. (note I am not saying you should write columns like this, I am just using them to demonstrate the use of EARLIER).
Calculated Column 1
Cust Sales = [Total Sales] // this adds the total lifetime sales for each customer to the customer table.
Calculated Column 2
Note the use of EARLIER in line 4 above. This formula is a calculated column and all calculated columns have a row context. The FILTER function also has a row context so there are actually 2 nested row contexts in this formula.
What line 4 is actually saying is this.
“Filter the customer table and only keep customers that have sales greater than the sales for this one customer I am currently considering for this row of the calculated column”. The EARLIER function is required to be able to “escape” the inner row context of the FILTER function and refer back to the outer row context created by the calculated column.
The formula can be considered like this.
Start Calculated Column (row context 1 - outer) For each row in row context 1, do the following…. Start FILTER function (row context 2 - inner) For each row in row context 2, do the following…. Check if this customer has more sales than me Next row for row context 2 (ie go to the next customer) End FILTER function Next row for row context 1 End Calculated Column
So you can think of a row context as an iteration through each row of the table to complete the same task for every row. And you can nest as many row contexts as needed within other row contexts.
As I mentioned earlier in this article, it is possible to use the VAR syntax as an alternative to the EARLIER function because the VAR statements are evaluated in the initial filter and row contexts. So here is the alternative syntax for the Rank Customer calculated column.
This new formula returns the same result but I think you will agree it is easier to read and understand. The reason it works is because the VAR statement on line 2 is evaluated before the second row context is created by the FILTER function.