Level: Intermediate
The variable feature in DAX has been around for more than a year and I have simply never got around to writing an article about it – I am changing that today. Let me start off with some version information and syntax and then I will show you are real life example where I have used variables to improve my formula.
Versions
Variables in DAX is a relatively new feature and is available in
- Power BI Desktop
- Excel 2016
- SSAS Tabular 2016
Variables are not available in Excel 2013 or Excel 2010.
Syntax
I normally refer to variables as “the VAR syntax”. In short there are 2 new keywords that are used together to make your formulas more readable. The keywords are VAR and RETURN. This is how you would write a typical formula.
myFormula = VAR VariableName1 = <some valid DAX formula> VAR VariableName2 = <some other valid DAX formula> RETURN <another valid DAX formula that can use VariableName1 and VariableName2 as part of the expression>
Notes on the Syntax
It is possible to set scalar values as variables but it is also possible to assign tables as variables. See the following valid examples for assigning a variable within a formula. Note that the following are just examples of the VAR line. You must always use VAR and RETURN to write a valid complete DAX formula.
Scalar Value Example
VAR myScalarValue = SUM(Sales[Extended Amount])
Table Example
VAR myTable = FILTER(Customer, Customer[Post Code] = 50210)
The above variable myTable returns a virtual table that can be used anywhere that a table is used in a DAX formula, such as a filter inside a CALCULATE function.
Other Important Points
- Each of the variables are evaluated once prior to the commencement of the RETURN portion of the formula.
- Variables can refer to other variables.
- The variables are executed in the initial filter and row contexts.
- The results of the variable calculations are cached for reuse inside the formula.
- Once the variable has been assigned a value, that value cannot change during the execution of the RETURN portion of the formula. From that perspective, the variables act more like constants than regular variables in a traditional programming language context.
Benefits
There are many benefits of using the VAR syntax, particularly when you also use in line comments.
- The formulas can be easier to write. It may seem to take more time to write the formula at first, but you may just end up saving that time by getting to a working version of the formula faster, with less rework.
- The formulas can be easier to read. A reader can understand the logic of the formula step by step and hence it can be easier to work out what the formula is designed to do. This is particularly true when the author uses comments, new lines and spacing as well.
- The formulas can be easier to maintain when changes are needed – see my example below.
- Formulas that use repetitive sub strings of DAX can be written once and then used many times. This is good for formula writing but also for performance (although DAX already had some good caching features prior to this syntax).
- Given the variables are executed in the initial filter context, this syntax can be an easier alternative to using the EARLIER function (when you have nested row contexts inside a complex formula).
Real Life Example
I was with a client last week building a sales force incentive tracking system. It is very common for sales management to want to measure performance across a range of different business drivers in order to focus the sales force on what is important. As the saying goes, “what gets measured gets done”. Power BI, Power Pivot and Power Query are excellent tools for this type of reporting. Reasons include:
- Often the data sources are disparate and need to be brought together into a single report.
- The time horizon for the tracking is normally short and means that the cost and effort of a full blown IT BI solution often makes it hard to justify.
- The effort to manually produce such a report on a regular cycle (say monthly) can be high and hence the work tends to get done less regularly (say quarterly or even half yearly). Irregular updates of such tracking tools defeats the purpose of the entire exercise.
In short my client has an annual “team of the year” award that is based on a number of different criteria across many different data sources including sales performance, execution effectiveness, new product launch effectiveness etc – pretty standard sales stuff. Teams are awarded a number of points in each area reflecting based on performance; the points available are relative to the importance of each business driver. Teams are awarded points across all areas based on performance and then the points can be added up to come up with an overall score and hence the winning team.
Assigning Points for Sales Growth
I am going to share the formula I wrote for assigning points for just one of the areas – sales growth. The business rules for assigning points were as follows:
- If you decline on last year, the team gets no points.
- For each 0.1% growth between 0 and 1% over last year, the team receives 100 points.
- For each 0.1% growth above 1% over last year, the team receives 200 points.
First a Design Decision
The first thing I had to decide was how to maintain these business rules. I could think of 3 options.
- Hard code the business rules into the formula itself.
- Create a parameter table that contained all of the key numbers (eg how many points to assign, upper and lower trigger points to get those points etc).
- Write the formula using the VAR syntax.
I pretty quickly came to a conclusion that option 3 was the best in this case. The effort to load a parameter table and then extract the values so they could be used is relatively high compared to using the VAR syntax. Given the use of the business rules were limited to this one formula, I felt the VAR syntax was the best option.
Final Formula
Here is the final formula I ended up writing, but as often is the case, I think the process of writing the formula is much more interesting and useful than the final formula.
Growth Points = VAR Points = 100 //Points VAR DoubleBonusPointsLevel = 0.01 //extra points cut in above 1% growth VAR Growth = 0.001 //0.1% growth for applying each 100 points VAR GrowthRounded = ROUNDDOWN([% Growth FYTD], 3) //rounddown to nearest 0.1% RETURN IF( GrowthRounded > 0, //check to make sure the team is in growth GrowthRounded * Points //assign points for growth above 0 /Growth + /* The next portion assigns a further 100 points for evrything above 1% growth making a total of 200 points for all sales above 1% */ MAX( (GrowthRounded - Double BonusPointsLevel), 0 ) //max ensures the formula doesn't remove points if growth is less than 1% * Points / Growth )
Writing the Formula
The Variables
Writing formulas often take some iterations and rework – it is seldom a case of just typing in the full correct formula from start to finish. To start with I wrote variables for each of the business rules I was doing to need to apply (lines 2 to 6 above). But it definitely wasn’t as simple as just typing each line one after the other. In fact I ended up using Excel to build the formula first to break the problem into its component pieces. Once I had the Excel version working, it was much easier to write the DAX.
The RETURN Portion
Note that the actual DAX function commences from line 7. On line 9 you can see the first reference to one of the variables as an input into the final formula (GrowthRounded in this case). As it turned out, I changed the logic of the way the points were assigned. It seemed easiest to assign 100 points to all sales above 0 and then assign a further 100 points for all sales above 1% (as opposed to assigning 100 point between 0 and 1% and then assigning 200 points for sales above 1%). But of course all of this is clearly documented in the formula so it is clear what was done and why.
Hopefully you can see from this example that the VAR syntax makes it easy to write formulas that can be read, maintained and understood by others.
How to Avoid the EARLIER Function
As I mentioned up front, the EARLIER function can be a bit confusing when you are learning DAX. EARLIER is used to refer to a value in a previous row context – let me explain. Consider the following 2 calculated columns in the customer table of Adventure Works. (note I am not saying you should write columns like this, I am just using them to demonstrate the use of EARLIER).
Calculated Column 1
Cust Sales = [Total Sales] // this adds the total lifetime sales for each customer to the customer table.
Calculated Column 2
Customer Lifetime Rank = CALCULATE( COUNTROWS(Customers), FILTER(Customers, Customers[Cust Sales] >= Earlier(Customers[Cust Sales])) )
Note the use of EARLIER in line 4 above. This formula is a calculated column and all calculated columns have a row context. The FILTER function also has a row context so there are actually 2 nested row contexts in this formula.
What line 4 is actually saying is this.
“Filter the customer table and only keep customers that have sales greater than the sales for this one customer I am currently considering for this row of the calculated column”. The EARLIER function is required to be able to “escape” the inner row context of the FILTER function and refer back to the outer row context created by the calculated column.
The formula can be considered like this.
Start Calculated Column (row context 1 - outer) For each row in row context 1, do the following…. Start FILTER function (row context 2 - inner) For each row in row context 2, do the following…. Check if this customer has more sales than me Next row for row context 2 (ie go to the next customer) End FILTER function Next row for row context 1 End Calculated Column
So you can think of a row context as an iteration through each row of the table to complete the same task for every row. And you can nest as many row contexts as needed within other row contexts.
As I mentioned earlier in this article, it is possible to use the VAR syntax as an alternative to the EARLIER function because the VAR statements are evaluated in the initial filter and row contexts. So here is the alternative syntax for the Rank Customer calculated column.
Customer lifetime rank = VAR ThisCustomerSales = Customers[Cust Sales] RETURN CALCULATE( COUNTROWS(Customers), FILTER(Customers, Customers[Cust Sales] >= ThisCustomerSales) )
This new formula returns the same result but I think you will agree it is easier to read and understand. The reason it works is because the VAR statement on line 2 is evaluated before the second row context is created by the FILTER function.
Hi Matt,
Thank you for the great article. I really learned a lot from it.
As I’m new to Power BI, especially to its syntax convention, I found it a bit difficult to follow the formula for [Growth Points] within the first 5 minutes of reading it.
I feel if we grouped the GrowthRounded or MAX( (GrowthRounded – DoubleBonusPointsLevel), 0 ) with VAR Growth together (that is, these variables were not separated by VAR Points), to indicate the level of growth; as well as realigning the comments, this would make it easier for readers to follow.
What do you think of the following arrangement?
Growth Points =
VAR Points = 100 //Points
VAR DoubleBonusPointsLevel = 0.01 //extra points cut in above 1% growth
VAR Growth = 0.001 //0.1% growth for applying each 100 points
VAR GrowthRounded =
ROUNDDOWN([% Growth FYTD], 3) //rounddown to nearest 0.1%
RETURN
IF(
GrowthRounded > 0, //check to make sure the team is in growth
Points * (GrowthRounded / Growth) //assign points for growth above 0
+
/* The next portion assigns a further 100 points
for everything above 1% growth
making a total of 200 points for all sales above 1%
*/
Points * (MAX(
(GrowthRounded – DoubleBonusPointsLevel),
0
) //max ensures the formula doesn’t remove points if growth is less than 1%
/ Growth
)
)
Best regards,
Jason
This is an old one – I don’t even remember writing it 🙂
There are many ways up a mountain. Mountains can be treacherous and hard to climb, but they can also be an enjoyable challenge. I recommend taking the easiest path you are comfortable with. What I like about your formula is that it is well commented/documented, it takes the reader one a journey, and presumably it gives you the answer you need. Nothing bad can come from improving the readability of formulas like this, so good job.
Hi Matt,
how can I avoid a crossfiltering combined using variables?
I’m using PowerBI and have a column “Sales” with different salestypes, some with positive, some with negative amounts.
I want to sum up the amounts per salestype, ignore the +/- and then sum up that table.
I have a pagefilter on the masterdata = Profit.
I do not want the measure to interact with a crossfilter on the Dashboard.
My solution would be:
All_sales = VAR TableSales = SUMMARIZE(Table,Table[Sales],”Sum Sales”,CALCULATE(abs(sum(Table[Amount])),ALLEXCEPT(MasterdataTable,MasterdataTable[Profit])))
return CALCULATE(sumx(TableSales,[Sum Sales]),all())
It works perfectly fine EXCEPT the crossfiltering. If I do filter on the Dashboard the sum is filtered too, although I’m using allexcept and all…
Would you kindly give me a hint, what the Problem is?
Hi Matt,
How to use / refer to a column that is created within a table variable?
Interesting question. What is your use case? You can refer to a new column when you are within a row context, but not otherwise.
In your assigning points for sales growth topic, the DAX return statement included the MAX function so that a growth rate of just 0.96% didn’t cause the result to have points deducted. I have replicated the problem on Excel like you suggested and can see why this is needed but can you explain how that bit of DAX worked, I thought the MAX function just returned the Max value, if Growth rounded is 0.9% – DoublePointsLevel is 1% then isn’t the MAX of that -0.1% ?
Its OK I just checked the syntax of MAX and I figured this out 🙂
That will certainly do as well Matt. I just read an article on the PowerPivotPro website (https://powerpivotpro.com/2018/10/sourcing-candidates-and-training-a-team-on-the-microsoft-data-platform/) where your article was recommended and thought I would add a little extra 🙂
Or my “human understandable” equivelant word is un-mutate-able (ie cannot be mutated)
“Once the variable has been assigned a value, that value cannot change during the execution of the RETURN portion of the formula.”
A fancy technical term for this is: DAX variables are IMMUTABLE. So if you want to impress your fellow … 🙂
Once the variable has been assigned a value, that value cannot change during the execution of the RETURN portion of the formula. From that perspective, the variables act more like constants than regular variables in a traditional programming language context.
This hit me. I didn’t know about this as I usually call an external base measure and do not assign it to a variable until the VAR SUM(‘Table'[Sales]) always remained the total sum of sales even if I wrapped it in CALCULATE and apply a filter expression. That’s when I realized that I should have called an external measure or explicitly used SUM(‘Table'[Sales]) within CALCULATE().
Great explenation, thank you very much.
Doesn’t the VAR needs aggregation function as SUM or MIN or a measure?
Your code example reference a table column which didn’t work.
No, a VAR does not need an aggregation function. It can be a scalar value, a function or a table. Which code example doesn’t work?
When i’m using the following code, as in your example:
VAR ThisCustomerSales = Customers[Cust Sales] ….
The pop-up window doesn’t present any suggested tables just a list of functions and measures so i have to write the name of the table/column (Which is grayed out). The Error i get:
A single value for column ‘Cust Sales’ in table ‘Table’ cannot be determined…
But when I’m adding aggregation function like this:
VAR ThisCustomerSales = MAX ( Customers[Cust Sales] )….
it works perfectly.
Very glad that your solution works and it save me a lot of trouble and preformance issues with EARLIER function. But i don’t understand how it is possible not to use aggregation function.
Thanks a lot. Great blog.
OK, now I understand what you are saying. In your example, you are trying to assign a column to a variable – this is not allowed. You can assign a scalar value, a function or a table.
VAR myConstantValue = 5
VAR myFunction = MAX(Customer[Sales])
VAR myTable = FILTER(Sales,Customer[Sales] >50)
Thanks, now it is clear to me.
Are the processing benefits by using variables also extended to existing measures that will be used multiple times within a formula?
Something like:
Proposed Cost =
VAR CostChange = [Cost Change Value]
RETURN
IF (
CostChange = BLANK (),
0,
[Cost] * ( 1 + CostChange )
)
Variables IMO is the single biggest improvement to DAX since inception. Functions are now much cleaner and therefore easier to read and maintain. I can’t imagine going back to life without them. I don’t know if this has happened to you, but with variables I find myself using fewer nested “portable measures” as I don’t always recall where all they are used and therefore how even a subtle change will impact calculations. Instead because I have variables more and more in complicated measures I encapsulate all function steps in a single measure. A refactoring trade off to be sure, but just the direction I’m gravitating right now.
Tried to replicate your example with the AdventureWorks DB and I found the circular dependency issue “Customers[Cust Sales], Customers[CUSTOMER LIFETIME RANK], Customers[Cust Sales]”
You will need to help me to help you here. Can you give me some more information?
I can share the Pbix file I´ve created, please advise an email to send it. Thanks in advance for your help
Hi Matt
You mentioned the benefits of using variables, but still I am not sure I see a real significant advantage in using variables over using EARLIER functions (calling simple measures in a complicated one). One sure advantage of calling EARLIER functions is that it is supported by older versions (Excel 2013 for one).
Can you (or any one reading my comment) point out the advantages?
Yes I covered the limitations of which versions support the new syntax above. Of course if you version doesn’t support it, then it is not a benefit. In my experience when teaching new Excel users, they find it easier to understand the VAR syntax. I believe this is because they don’t have to conceptualise nested row contexts. They don’t even need to really understand that there is a nested row context. They can simply define the variable they want to compare with first, then execute the rest of the function. If you don’t see a benefit in doing it this way because you already understand earlier, then there is probably no benefit to you.
ThX
I will test it
What I REALLY like about the above is it has a VBA feel about it, declaring variables, good commentary, clear logic etc. I can relate to that very well and have shared this article further
Great one – very detailed as usual !
I love variables, the fact they can store either scalar values or tables is so useful… hopefully we can get global variables and not just local to the measure.
Hi Tristan, I’m curious as to in which use cases a global variable will be of benefit.
a. The measure cache is already a set of global variables!
b. What filter context will be applied to a global variable?
I’ve heard/seen/read your statement before but am still a bit confused about it. Is it a language feature (DAX) or a client tool feature (like in Power BI desktop) that is requested?
Best regards Jes
I’ve got a good case for a global variable wherein I need it to populate a part of a calculation that is otherwise not populated by data.
In order for the required field to not show as null, a global variable could be used to “plug the gap” in the calculation, and display a subsequent value.
I.e., Globalvar maxweekcounter – counting the weeks on one table, and allowing them to be split for users on a user table that don’t have any respective datetime data on the timesheet table.
I too want global variables…they would allow me to work out what the values in other visuals are, and then scale all visuals to share the same dynamically generated axis max across all the visuals. I have a very complicated workaround, but global variables would make this trivial.
Hi Jeff,
I’m really interested in having a shared scaled axis for multiple visuals. Could you share how your workaround works? I tried a lot but did not get it to work unfortunately.
Global variables would be awesome for that, but until that, your workaround would be of much help!
Best, Christina