I have been working with a client here in Sydney were we are producing some Human Resource (HR) reporting tools to analyse what is happening across the business. HR data is quite often very sensitive because it can contain personal information as well as salary information. A lot of the calculations that are in the workbook we have been building do not require all this sensitive information to work – eg Total New Employees, Total Employees leaving the business etc. The reports are going to work if the sensitive data is in or out, but some people are allowed to have the sensitive data and others are not. So….what to do?!
There are a few approaches you can take to this problem. The most obvious is to create a second copy of the workbook and remove the sensitive data. But the problem that will then occur is you have 2 versions that need to be maintained over time. As you build out new reports or make changes to your DAX, you have to do it in 2 places. So that is a bit of a pain. Power Query to the rescue!
Use Power Query to Create a Second Version
The solution I came up with is to use a couple of tricks in Power Query to create an easily repeatable process of quickly creating a second version from scratch when ever required. So in short there is 1 version of the workbook, but you can quickly create a second ‘desensitised’ version in less than a minute. This reduces the rework that you would otherwise need to do if you had 2 completely separate copies. Here is the process.
Create the workbook that contains all of the data you need.
Use Power Query to shape the data import the way you want it. I have created a simple table below to demonstrate the point.
I import this data using Power Query and do all the shaping first. The very last step after you are done with the table shaping is to select all of the columns (1 below) and then select “Remove Other Columns”.
When you do this, Power Query will add a new line of code that contains all of the columns of data in your table. You can see this code by opening the Advanced Editor.
The next step is to duplicate this row of code and comment out one of the rows. As you can see below, the original row (shown as 1) contains all the sensitive data. The copy of this line of code (shown as 2 below) has a double slash – which indicates the line is “commented out” and is not being used. You can see in 3 below that I have manually removed the sensitive columns from this line of code.
Now save the workbook. As is, this workbook is you new master copy and can be refreshed with all the data at any time. When you want to create a ‘desensitised version’ of the workbook, you simply open the Advanced Editor, remove the double slash from the line of code (shown as 2 above) and place a double slash in front of the line of code shown as 1 above. Save and refresh the workbook and now you have an identical copy of the workbook that doesn’t contain any sensitive data. Down the track you can do all the enhancements and improvements over your main workbook, and still quickly create a new desensitised version with just a few seconds of your time.
A Final Word
There are a few different techniques you can use to do this. Another way is to simply add the final line of code manually each time you want to re-create a desensitised version. But if there is more than 1 table or lots of columns then this can be a bit ‘hit and miss’. You could just write the last step and then comment it out, but the problem with this is that you also then need to make further manual changes to rename the steps in the Advanced Editor and manually add and remove commas etc – it is just more complex.
You can download the sample workbooks here if you want to take a look.
If you want a comprehensive lesson on how to use Power Query, checkout my training course here https://exceleratorbi.com.au/power-query-online-training/