Level: Intermediate Power Query.
I spent a lot of last week helping to configure Power BI in preparation for go live for a client. One of the important things to do when designing a Power BI solution is to make sure you have a good design for your user security access. Today I am going to share the approach I used to track user security configuration with Active Directory Groups.
This post is more about Power Query than it is about Active Directory, so even if you don’t have this specific scenario, it is still worth reading this blog post to see how I dealt with some tricky issues in cleaning data that was extracted into a text file using Power Query.
Using Active Directory Groups
I don’t want to go into a lot of detail here, but in short, we set up dashboards and reports targeted at different groups of users. Rather than assign users access directly to the dashboards in Power BI (by adding their email addresses one by one), a much better solution is to create an Active Directly Group, assign the users to the active directory group using the normal business process, and then assign the AD group to the Dashboard in Power BI.
Note that you must assign an email address to the Active Directory group to be able to do this.
The benefit of this approach is that you can have a single corporate process for assigning user access, and this is normally Active Directory (for medium to large companies).
The Problem – How to Keep Track of Which Users Have Access
If you are an Active Directory Administrator, then I am sure you have all sorts of tools to work out who has what access. But let me tell you – I don’t have those tools (or don’t know about them anyway), and I quickly found out it was a pain to try to check user access when you don’t have the right tools.
Command Line Tricks to Check Users in an AD Group
A quick question around the office, and Lester told me how to run a command line request to find out which users are in a group. To do this:
- Open a new command prompt on your PC (that is connected to the company network of course).
- type the following command
net group /domain "GROUP NAME IN INVERTED COMMAS"
The above command line string will send a list of all users assigned to the group to the screen like shown below
Checking One Group at a Time is a Pain too
This was great, but soon I realised I wanted to get ALL the groups I wanted, and then look at the data in Excel. I didn’t want to have to go to the command line each time, so I developed the following process to create a simple to use Excel file. This is a summary of the steps I took to achieve this outcome.
- Created a batch file that ran the command line code for all the groups I needed to interrogate (one after the other), and then sent the results of each of these commands to a single text file.
- Imported and cleansed the data in the file to Excel using Power Query
- Set up some slicers on top of my final Excel table to make it easy to access the information I needed.
This is what I ended up with
Below I step through the details on how to complete each step to achieve this final report.
Step 1. Move the Manual Code into a Batch File
This is easy, even if you haven’t done it before
- Open notepad.exe and paste the first line of code that I showed above.
- I appended some more code at the end of the line to tell the command line to send the results to a file (instead of sending the results to the screen).
Note I added a single > symbol pointing to the name of the output file in the first line of code below.
- Then I duplicated the first line of code multiple times (cut and paste), once for each group I needed – then changed the group name of course. Also note that the > symbol is a >> for lines 2+ in the batch file (> means send the data to a new file, >> means append the data to an existing file). I had many more groups in real life.
net group /domain "FINCUBE GROUP" > c:\users\allingtm\Documents\list.txt net group /domain "FINCUBE ALL" >> c:\users\allingtm\Documents\list.txt net group /domain "FINCUBE ADHOC" >> c:\users\allingtm\Documents\list.txt
Now save the text file with a *.bat extension in an easy to find location (note, first change the file type filter in Notepad to be *.* instead of *.txt). From the command line, execute the batch file by typing the name of the file (including the full path and excluding the bat extension). The command line will then chug away and produce the output file with all the Active Directory information needed. This is what the final file output looks like in a txt file (sample of test data only of course). In my real scenario there were many more names and many more groups which is of course exponentially harder to manage manually than this simple test data.
Step 2. Import into Excel Using Power Query
I am not going to go through every step I used in Power Query, one at a time here as this is an intermediate post. I have however attached the workbook at the bottom of the page – download it and take a look if you want more details about each step in the transformation. I am just going to call out some of the more interesting things I had to do to get the final output.
The Default File Output is not Very Excel Friendly
The first thing to note is that this output is great for a command line screen, but not very good for Excel. So you have to deal with these issues in Power Query the best you can. It is easy to deal with blank lines and dotted lines etc with a simple filter. Just deselect anything that is not of value for the task at hand as shown below.
All of the lines I have deselected (shown in the image above) are repetitive lines that are created for each of the AD Group commands in my batch file.
Once I had done this, I had a single column of text data as shown below. Note the name of the groups appear first, followed by one or more rows of user names. This pattern is then repeated down the page. Each row of names has 1 or more user names per row – tricky!!
I then added a new custom column in Power Query by hand writing an IF formula. An If formula in Power Query Language (PQL) is in the format
if something1 = something2 then something3 else something4
Note PQL is case sensitive for the formulas and the text it is looking at.
See my formula below along with the output in a new custom column.
Basically I am checking the first column to see if the row is a Group Name or a list of User Names. If it is a Group Name, then return that Group Name. If it is a User Name, then return null. The reason null is required is important – it is then easy to copy the group name down the list as shown below
Now that every row has a group name as part of the row record, it is now OK to go ahead and delete every row in the first column that starts with the text “Group name”. I did this with a filter rather than manually selecting each row – that way it is future proof if I add new group names down the track.
Remember everything is case sensitive, so typing “group name” would fail here as my data uses “Group name”.
Split the User Names into Columns
This is what I have now (after some reorg). Time to split the users into columns.
This looked easy, but there was a surprise waiting for me. When I split on “space” repeatedly, I discovered the number of spaces between each name varied by line – anywhere from around 17 spaces up to about 21 spaces in total. This stumped me – how was I going to iterate over this file and remove a seemingly random number of spaces to parse out the names? I gave up and went to lunch.
During a nice kebab with my family, I had a brain wave – I wondered if the file may have been a fixed width file format, adding more or less spaces depending on the number of characters in each user name. I then remembered an article (Chris Webb I think) talking about the new monospaced font feature in Power Query – just what I needed. As soon as I got home, I found the setting for the mono spaced font and turned it on – BAM! My hunch was correct.
It wasn’t until later that I realised that I should have known this from looking at the original text file – oh well.
I then selected one of the cells of user names (shown as 1 below) and then selected the results pane (shown as 2 below). Once the cursor is in the results pane, I used my arrow keys to move the cursor left and right to count how many characters made up each “column” of user names (25 as it turned out).
So I applied a “split column by number of characters” step as shown below.
I then un-pivoted the resulting user name columns, trimmed out the trailing spaces, removed the columns I didn’t need and did a general clean up. This is what I then sent to a table in Excel
The last thing I did was add a slicer for each of the columns. This gave me a very usable tool where I could select either a group or a user name, and see the relationship between the two in the resulting filtered table. See below.
And given I had set this up using a batch file and a Power Query report, the process is infinitely repeatable with very little effort. 20 seconds to re-run the batch file, 40 seconds to refresh the Power Query workbook, and in as little as 1 minute I can have a refreshed set of data.
You can download the sample text extract, the sample batch file and my Power Query workbook here if you would like to look in more detail at the process.