Level: Beginners
Original: Feb 2016
It has been many years since Matt wrote his first take on Calculated Columns vs Measures, back in 2014 when he wrote the blog post 5 common mistakes of self taught DAX students at PowerPivotPro.com. Since then Matt expanded his ideas with the original release of this article back in Feb 2016, yet this topic continues to be relevant today. Here, at Excelerator BI, we train a lot of people in Power BI and Fabric at our training courses, including Matt’s Supercharge Power BI online training course, and also help many more understand DAX and Power BI on various forums. The number 1 mistake we see self-taught DAX students with an Excel background make is the inappropriate use and over use of Calculated Columns when they should be using Measures! That is why I am updating this article for the current audience.
Moths to a Flame
Through no real fault of their own, Power BI developers coming from an Excel background are typically attracted to Calculated Columns like “moths to a flame” with sometimes equally dangerous side effects.
This is likely because Excel users feel very comfortable working in the table structure that exists in an Excel Spreadsheet. In addition, self-taught DAX authors normally don’t know any better.
Take a look at the Table View window below (Adventure Works Sales Table). When an untrained DAX user who is used to working in Excel looks at this table (which looks a lot like an Excel spreadsheet), the “natural thing to do” is to think “I’ve got Sales (1 below), I’ve got Cost (2), but I need Margin $ (Sales minus Cost)”. So then they jump in and write a simple Calculated Column for Margin (3 below).
But this is 100% the wrong thing to do. Please, please don’t do this! Use Measures instead.
Calculated Columns are Evil
Matt was very critical of Calculated Columns in the past, to the extent that he suggested they are evil (as implied in the graphic below), but since then his (and my) opinion has mellowed. Calculated Columns are not evil, in fact there are modeling challenges that are best solved using Calculated Columns (more on that later), but they certainly can be bad in the hands of the uninitiated, which is why we warn new DAX authors to steer clear of them unless they are absolutely certain it is the right approach.
Here is a list of reasons that Calculated Columns can be bad.
- The results of your formula are calculated and stored in the table for every row in the table. This takes up space on disk when you save your report and also uses space in memory when your report is open, and this will make your report less performant. If your table is 100 rows then it probably doesn’t matter. If your table is 100 million rows, then it can be a big problem!
- The compression of Calculated Columns may not be as good as compression of imported columns. This is because Calculated Columns are not able to be calculated until the imported columns’ data is available to use, as a result they are the last columns to be stored, and thus the last to compress. The VertiPaq engine (Power BI’s column store database) therefore doesn’t give Calculated Columns the same prioritisation consideration it gives imported columns.
- Calculated Columns are recalculated every time the workbook is refreshed against the data source. This will slow down the refresh process (increasingly so for every additional Calculated Column).
- You can unwittingly create circular reference conflicts in your tables (particularly data tables) for reasons that are very difficult for learners to understand (and beyond the scope of this article).
Write Measures Instead
The most practical reason you shouldn’t write Calculated Columns is that you simply don’t need to. Power BI was built for Measures. If Calculated Columns are Evil, then Measures are Angelic. Here are the benefits of using Measures instead of Calculated Columns.
- Your workbooks will be smaller and faster. You are not “materialising” the results of every possible calculation in your table and storing it in memory and on disk (like with a Calculated Column). Instead Measures are calculated on demand, and Measure calculations are normally lightening fast. The only results that are materialised are the results that need to be displayed inside the report visual.
- Power BI visuals naturally filter your data (semantic model) before calculations are completed. This means that Measures only need to be calculated over the filtered subset of data, further reducing the number of calculation iterations required.
Take the Calculated Column mentioned earlier (Total Margin). You can get the exact same outcome much more efficiently by writing the following measures. Note the calculations below are Measures, not Calculated Columns.
Total Sales = SUM(Sales[SalesAmount]) Total Cost = SUM(Sales[TotalProductCost]) Total Margin = [Total Sales] - [Total Cost]
When Calculated Columns are OK
I still believe Calculated Columns are extremely dangerous in the wrong hands, however as I have built depth of knowledge I have learnt more about the specific conditions when it is okay to use Calculated Columns. Here are the general rules for when it is okay to use Calculated Columns. For Excel users that are still learning; if you don’t know which to use, then it is recommend that you assume you should use a Measure at least until you are sure in your mind why you must use a Calculated Column. With that in mind, it is generally OK to use Calculated Columns in the following circumstances.
- When you need to use the results of your formula to filter your data model. Measures can only ever be placed on a Values field well. You can’t add a Measure to a Categorical field well (i.e. Rows and Columns of a Matrix, Slicers or the Filters Pane, or X-Axis of a Column Chart). If you want or need to do this, then you need a Column and not a Measure. Most of the time, it is preferable to get the column added in your data source rather than use a Calculated Column. The benefits of loading it from the source include:
- that you can re-use the source column in other workbooks without re-writing the calculated column
- there is the potential for improved compression on import
- the refresh of your workbook will be faster.
But for times when that is not possible to import from the source, and it doesn’t make sense to build it using Power Query, then a Calculated Column is your friend.
- When it doesn’t make sense to build it using Power Query. The most common reason you should choose to add a Calculated Column (using DAX) is if you need to leverage existing parts of the model to create the new data. Again an Adventure Works example here should help. Imagine you want to classify your customers in bands, of high sales, medium sales, or low sales. You need a new column in your Customer table (lookup table) so you can use that column on a slicer in one of your reports. If you tried to do this task inside Power Query, it would be quite some additional work. You would have to work out the total sales for each customer in Power Query, and that would require you to
- Create a join between the Customer table and the Sales table
- Pre-aggregate the sales data for each customer
- Group the customers into the size bands using the business logic you need.
- Add the column (high, medium, low).
The important point here is that items 1 and 2 will most likely already exist in the data model itself. In the case of Adventure Works, the model already has a Customer table, a relationship to the Sales table, and a measure that aggregates the sales. These features of the model can be used to easily add the new Calculated Column using DAX. So in short, you should prefer a Calculated Column when it leverages the logic of your model (measures and relationships) so that you don’t have to repeat this logic inside Power Query.
- When you are enhancing a Lookup (Dimension) table. Lookup tables are normally smaller (less rows) and hence the negative impacts associated with Calculated Columns are not as significant.
- When the results of the Calculated Column have a low cardinality. Uniqueness of values in a column is the enemy of compression in Power BI. So if your Calculated Column returns 2 possible unique values (eg Yes or No) and this column helps you with writing a complex Measure, then there are no problems with the Calculated Column. Calculated Columns with a low cardinality compress really well.
- When you have a very complicated formula that is very taxing on your report to the point where run time performance of a Measure calculation is so bad that it is preferable to “pre-calculate” and store the result in a column. It is unlikely that a self taught Excel user will be writing such a complex formula in the early days of learning DAX, but it is worth noting that this is an acceptable and valid use case.
Where to learn more
- Subscribe to Blogs and read all you can.
- Read a book. Matt’s book Supercharge Power BI covers the right way to use Calculated Columns as well as all the other topics that you need to learn to be good at Power BI, and any other tool that utilises the VertiPaq engine, such as Power Pivot for Excel. Matt’s book focuses on giving you the theory as well as hands on practice. We also keep a curated list of other great Power BI and Power Query books in our Knowledge Base.
- Do some live training. Students from our live training courses in Australia walk away with depth of knowledge that is difficult to replicate in other learning environments. If you really want to gain deep understanding, then consider attending one of our live training courses.
- If you prefer self-paced online training, you can enroll in Matt’s Supercharge Power BI Online Training for guided learning.
Hey There. I discovered your weblog the use of msn. This is a very well written article. I’ll make sure to bookmark it and come back to read more of your useful information. Thank you for the post. I will certainly comeback.
Стоимость мостовидных несъемных зубных протезов.
Несъемные мостовидные протезы http://belfamilydent.ru/services/mostovidnoe-protezirovanie/ .
Utterly indited content material, Really enjoyed reading.
555
555
555
555
555
555
555
Hi Matt,
Thanks for the informative contents.
When I tried to write a measure using formula function like “CALCULATE” only other Measures pop up in the dropdown list to be selected and inserted into the formula, other data table columns do not. However when I need to just do a simple “SUM” formula then Measure works.
Can you share why this is?
what you are describing is how Intellisense works. Intellisense will only present you with a list of valid options. So if you type
=CALCULATE( and then type something, you will only see a list of measures. It is not legal to put a column as the first parameter in CALCUALTE.
However, if you type
=CALCULATE(SUM( and then type something, then you will see a list of the columns and not the measures.
Hi Matt.
All day I have been trying to figure out how to use measures.
I have 2 tables, Table 1 consist of customer names, and other information, Table 2 consist of multiple dates, values, customer names and other information. I am trying to calculate the total value (monthly/yearly) for each date column for any of the customers that appear on table 1.
I have a current relationship between the 2 tables by customer name and the cardinality relationship is many to many and the cross filter direction is single from Table 1 filters Table 2.
I suggest you post a question at community.powerbi.com
HI Your Contain Is Outstanding
Sorry to be a pest. This post well explained why Calculated Columns (I assume from different tables) are “Bad” … but it did not discuss the correct solution!! I did apologize at the beginning. I come from the world of access where calculated columns are in queries and need to be re-run every time the query is run.
Any hint (basic sample) would be appreciated. Please remember that I’m extremely new to all this and I got both books but not sure what to look for. B-)
Calculate columns are bad when you can use Measures instead. It is as simple as that. It does cover the correct solution – use Measures instead. I have added an example as per your request.
Thanks, Matt, very passionate post! 🙂
It is really interesting, when to use PQ to make columns and when PP. I think that in cases you mentioned for use columns in PP (like filtering etc.), it is better to use PQ-generated columns, right? Because we at least do not recalculate them later
I think both Power Query and Power Pivot calculated columns are both fine for the types of columns I referred to in the post. It really depends on the scenario. If you have loaded your data from SQL Server directly into Excel 2013 (ie not through Power Query), then I would definitely use Power Pivot for the “appropriate” use cases. If my table was already loaded via Power Query, I would definitely use Power Query (you have to anyway). If the table is already loaded directly into Power Pivot, I would only switch to Power Query if there were a very good reason.
Thanks for detailed answer. If in PowerBI Desktop, you’don’t prefer DAX or M for this reasons?
If I were in Power BI Desktop, I would probably put it into Power Query just for simplicity sake – so all data preparation is in the same place. The only exception would be if there were some complex calc that uses context transition (and hence the rest of the logic built in the data model) to return the required value. But we are touching on the edges here – I have a saying – if you have all the information and you still can’t decide, then it probably doesn’t matter etiher way.
Thanks Matt – it is useful and insightful as always! I might possibly be the only Excel user who “forgets” to use columns and tries to write measures instead ( with varying results!) 🙂
That’s so funny ?
Hmmm…I agree. Column compression being seemingly independent did blow my mind. I think Scott tried to disprove that but drew up blank, so I didn’t even bother to try. As I’ve heard the Italians often say – “It depends”. So it’s best to test things out with your own data I suppose.
As for doing Calc Column in PP versus PQ…I have settled into the habit of ALWAYS using PQ to bring in data. So that’s moot for me as well.
My Friends, good post as always! Help me out on this one
“…The compression on Calculated Columns may not be as good as compression on imported columns…”
In this article
http://www.powerpivotpro.com/2015/12/compression-with-power-pivot/
…you indicated that Columns seem to be compressed independent of each other. If that’s the case, then it should not matter if I add a calculated column in Power Pivot or bring it in via Power Query. Should it?
I’ll try to run a test, but if you already have, let me know.
Hey Avi. This is still mysterious to me. I attended “Mastering DAX” and “Optimising DAX” with Marco Russo recently. That confirmed that there IS a dependency between columns with compression, however I can only go on my personal experience and testing. In my experience, there is little incremental cost of having 1 extra column, certainly not “orders of magnitude” extra cost. I have also tested imported columns vs calculated columns and found little difference. The only issue (as I understand) is that Calculate Columns are not considered along with other columns to determine the optimum compression order. So it is at least possible that this will make the workbook larger – possible but not guaranteed.
Great guidelines, thanks Matt. i am interested to hear similar besr practice tips on striking a balance between creating calculated columns in DAX versus in the ETL phase with say, power query. Thats another area of overlap with no obvious answer.
This is a very helpful post, I have some re work to do now, Thank you for sharing your knowledge.
I’m glad it is helpful. Let me assure you, we are all learning incrementally building knowledge on top of experience 🙂