Archive for Tools

How to Document DAX Measures in Excel

I often get asked if there is an easy way to create documentation for DAX measures when using Power Pivot for Excel.  I am not a big fan of documentation for the sake of it, but I do see value in having “some” appropriate level of documentation.  I think a good balance of the right amount of documentation vs the effort to write and maintain the documentation is to use the Power Pivot UI itself.  I explain how I do it below.

Writing DAX Measures From Within Excel

I always teach students at my Power BI training courses to write their measures from within Excel via the “new measures” dialog box (and specifically not to write them from within the Power Pivot window).  The process is always as shown below.

  1. Set up a pivot table that has some relevance to the measure you are writing, then click in the pivot table (shown as 1 below)
  2. Select Power Pivot Menu \ Measures* \ New Measure* as shown below (*called Calculated Field in Excel 2013)

image

The reasons I teach students to do it this way are simple.

  1. You get a dialog box/wizard that shows you every step that is required to complete the task.  Just complete each step one at a time and you can’t go wrong.
  2. You get to check the measure for errors before completing it (shown as 2 below).
  3. Once you add the measure to your pivot table (shown as 3), you get to see immediately if it is working as you expected.

image

image

And there is a fourth reason to use the wizard that is less obvious.  The Description box (shown as 4 above) is a great place where you can make notes about your DAX formulas (aka documentation).    In Power BI Desktop you can add comments into your DAX expressions, but this is not possible in Excel, so the description box is the best option you have.

Create Your Documentation

If you diligently add comments into the Description box above, you will then be able to extract this information into a documentation page using the Power Pivot Utilities Toolbar.  I have previously reviewed this great little tool here, so you can go back and read about it if you missed it.

After installing both tools, you will then see the PP Utilities Toolbar in Excel (shown as 1 below)

image

Click on List Measures (shown as 2 above).  After a few seconds, you will see a new page in your Workbook called “Measures” like shown below.  Note the “Description” column on the right hand side. This column is sourced directly from the information entered into the Description box shown further up the page.

image

What’s more, if you select the measures on your page (Shown as 1 below) and then click Format DAX Expressions (2 below), the measures will be nicely formatted so they are easy to read.

image

This page is then your documentation and you can refer to it at any time you need to see the big picture.

Always Up to Date

One important feature of this type of documentation is that it is very easy to maintain.  If you make changes to any of the measures or comments in your data model, you can simply re-run the extract process so you have an up-to-date list of all the measures.  You can also do the same for Calculated Columns and also relationships.

What About Excel 2010?

Unfortunately the Power Pivot Utilities Addin only works for Excel 2013+  But all is not lost.  It is possible to extract the Measures from your Data Model manually using DAX Studio.  You can read my blog post about how to do that (and other cool things) with DAX Studio here

Import Tabular Data from PDF using Power Query

Today I am sharing a process I developed that allows you to import tabular data from a PDF document into Excel (or Power BI) using Power Query.  I didn’t want to purchase software to do this task so I started experimenting on how I could do it with the tools I already have, and I am sharing my solution here today.

pdf-to-pq

Note:  This will only work for tabular data in a PDF – exactly the same as downloading from a Web Page must be in tabular form.

I also demo some neat Power Query tricks further down in this post, so make sure you read through to the end.  These tricks are not hard, and you may be surprised at what you can do.

Process Overview

Here are the steps I use to grab the table from the PDF.

  • Open the PDF in Microsoft Word.
  • Save the file as a single file web page.
  • Import the single file web page into Power Query as HTML.
  • Go through the normal cleansing and transformation process you would normally do (plus of course the cool tricks I use below).

Worked Through Example

I did a quick Google to find a PDF containing something I could use in my demo from the Web.  I found this PDF that contains a list of US States with their abbreviations

www.siue.edu/postal/pdf/Abbreviation-List.pdf    Note the data I want from this file is in tabular format.

image

I saved the PDF to a known location on my PC.

Convert the PDF to a Web Page File

I opened Microsoft Word and then I opened the PDF file from within Word.  I got this warning message below.  I just clicked “don’t show this message again” and clicked OK.

image

I then selected File\Save As and saved the file as a Single File Web Page (shown below).

2016-11-18_121653

 

Import into Power Query

There are a couple of ways you can do the next step. I found the following to be the easiest.

Create a new query that connects to a text file (example shown below – Excel 2016, but it is similar in all other versions and Power BI)

image

In the dialog, change the file filter so it can find all file types (as shown below), then browsed until I found my file.

image

Then select the MHT file created earlier and click “import”

image

Power Query will not correctly identify the file type, so right click on the file (shown below) and the select HTML.

image

Power Query now correctly identifies the table in the document.  I expanded the one identified as “3DTableGrid”.

image

In your own PDF documents, you may need to use some trial and error to find the right table.

I then clicked on the Table I wanted (shown as 2 above).

Cleansing the Data

The data is starting to look good at this stage.  But note there is a bit of “noise” in the file (see below). Power Query is a great tool for fixing that.

image

First I got rid of the first row (Remove Top Rows, 1).

Next I copied the text <= /span> by right clicking on one of the cells containing this text, and selecting “copy”.  Then I selected all 4 columns and did a Transform\Replace Values and replaced <= /span> with nothing.

I didn’t need the bottom 11 rows, so I did Remove Rows\Remove Bottom Rows\11

image

Now for Some Power Query Tricks

Power Query is a formula language. If you analyse each of the steps in the Applied Steps window, you will notice that each step is a formula, and the formula takes the output of the previous formula as the input step to the next formula.  The UI assumes each step will take the previous step as an input to the new step, but it doesn’t have to be that way.  You can override that if you want – I show you how below.

Turn on the formula bar before proceeding.

formula

At this point, I have my data in 4 columns, but I really want all the data in just 2 columns.

image

This is easy to fix with some simple Power Query tricks, with very little PQL coding at all.

First I renamed this step to be called All4Columns.  I like to rename significant steps in my Applied Steps window to make it easier to find the ones I need later.

image

Then I removed the last 2 columns using the UI.  Select the columns to remove, right click and then remove columns.

image

Before moving on I renamed the 2 columns to be called “State” and “Abbreviation”, and renamed the step to be called First2Columns using the same renaming approach as before.

image

Now the trick. At any point in your query, you can add a custom query step by clicking the button shown in 1 below.

image

Power Query assumes you want to add a new step that starts from the previous step. So Power Query automatically adds the previous step as the starting point (as shown below).

image

But you don’t have to keep the previous step.  In this case I want to refer to the step “All4Columns” instead.  So I just typed in the name of the step I wanted (as shown below) to replace what Power Query added for me.

= All4Columns

Now that I had all 4 columns back, I went through the following steps.

  • removed the first 2 columns
  • renamed the columns to be “State” and “Abbreviation” as before
  • renamed the step to be called Second2Columns

This then left me with 2 non-sequential steps (First2Columns, Second2Columns) that each contained half of the data.

Append The 2 Tables into a Single Table

I didn’t know how to write the code to do this, so I just used the UI again.  I selected “Append Queries” and appended the query to itself.  That forced the UI to write the code for me as shown below.

image

Not only did the UI write the code for me, but I learnt a single PQL function that I think I can probably remember Table.Combine( ) and hence I may be able to do the same step by hand next time around (this is how you can learn too).  Then all I needed to do was change the code above so that instead of appending to itself, it would append the First2Columns to the Second2Columns.  You don’t need to be a rocket scientist to work out how to do this :-).   It looks like this after I made the change.

= Table.Combine({First2Columns, Second2Columns})

Wrap Up

I hope you found this article useful, and you have learnt a few things.

  1. How to import data from a PDF
  2. How to refer to non sequential rows in a query
  3. How to use the UI in Power Query to help you write more advanced queries.
  4. How to refer to the Formula Bar to start to build your knowledge of the Power Query Formula Language.

If you are into Power Pivot and you haven’t already done so, you may like to download my paper “Ten Things I Wish I Knew When I Started With Power Pivot” from the link below.

Edit: 24th Nov 2016
A friend of mine Dave Marriott was using my approach above but had a stack of PDFs that he needed to access. He decided to write some Word VBA code that automates the task of converting the PDFs to MHT files. Dave is a professional VBA programmer based in Adelaide, Australia if you would like to hire him to help you be more efficient in your business. You can contact him at LinkedIn

Copy this VBA code into Microsoft Word.  Why not add it in your Normal Template so it is always available?  If you haven’t done this before, you can follow my instructions on how to do this in Excel (it is exactly the same process in Word).  Run the code to launch a dialog. You can multi-select as many PDFs as you like and the code will convert them all for you.

Sub ConvertToMHT()

Dim f_dialog As FileDialog
Dim FileChosen As Integer
Dim FileName As String
Dim i As Integer
Set f_dialog = Application.FileDialog(msoFileDialogFilePicker)
f_dialog.InitialView = msoFileDialogViewList
f_dialog.AllowMultiSelect = True

FileChosen = f_dialog.Show
If FileChosen = -1 Then
     For i = 1 To f_dialog.SelectedItems.Count
          Documents.Open FileName:=f_dialog.SelectedItems(i)
          ActiveDocument.SaveAs2 FileName:=Mid(f_dialog.SelectedItems(i), 1, Len(f_dialog.SelectedItems(i)) - 4) & ".mht", FileFormat:=wdFormatWebArchive
          ActiveWindow.Close
     Next i
End If

End Sub

Excel Workbook Connection to Local Power BI Desktop

Today I am sharing an easy way to use Power BI Desktop as a SSAS tabular server on your local PC.  I find increasingly that I am building data models in Power BI Desktop and yet I still want to use Excel to analyse the data some of the time.  If I load the PBIX workbook to the Power BI Service, it is easy to use “Analyze in Excel” to query the data model.  But there is currently no standard supported method to do the same thing using Power BI Desktop as a server (see my suggested idea to Microsoft at the bottom of this page for more about that).  Until then, we need to use a “hack” to complete this task.

I have previously shared a way to:

  • Open DAX Studio on your PC
  • Use DAX Studio to extract the Port Number of a running instance of Power BI Desktop
  • Manually enter this information into Excel to connect to Power BI Desktop and use it as a local SSAS Server instance
  • Query the data model that exists in the open Power BI Desktop file with a Pivot Table in Excel.

The problem with the above approach is that once the Power BI Desktop file is closed, the connection string becomes invalid.  Next time you want to use Power BI Desktop as a server you must go through the entire process again plus you need to alter the connection string that was previously created – hardly a scalable solution.

An Excel VBA Workbook Template to Solve This Problem

Local Host Workbook

Today I am sharing a tool that I built recently that completes the entire process automatically in Excel using VBA and Power Query.

Disclaimer:  I am providing this tool free of charge, without support, and do not warrant that it will work for you.  This is not supported by Microsoft and could stop working anytime.  If it works for you, then great – if not then it is unfortunate :-(.  Maybe you can deconstruct what I have done and make it work for you.  I built this using Excel 2016 and I have tested it on 2010 and 2013 and it works (Please make sure you have the latest version of Power Query installed).

You create a new blank Excel Workbook using my template (from personal templates), then click the “Refresh SSAS Connection” button.  The tool will then:

  • Detect an instance of Power BI Desktop running on your PC (there must be 1 and only 1)
  • It finds the port number and database name of the running instance
  • it modifies the existing connection string in the Excel Workbook to repoint it to the current running instance of Power BI Desktop.
  • If the workbook already contains pivot tables that were previously built using the same data source, they will simply reconnect and work as you expect.
  • If this is the first time you have used the template, you will have a blank Pivot Table connected to the Power BI Desktop data model.
  • You should only have 1 Excel Workbook and 1 Power BI Desktop file open at any one time (as they all use the same connection string).

I am not planning on covering in detail how the tool works.  I am making it available to use and modify as you see fit.  If you want to look into the VBA code and the Power Query to see what I did then of course this is fine by me.  If you want to improve it and give me back the improvements under the same “share and share alike” rules, then please do so as long as you leave my name comments in the file untouched.

How to use the Template

You can download the template here.  I have put it in a zip file so that I can keep the same link but change the name of the actual file over time (to manage version numbers).

The file is a macro enabled Excel Template workbook.  You need to place it in your Personal Templates location.  To find this location, in Excel go to File\Options, then navigate to save (1 below) then check the location (shown in 2).  If you don’t have a current template location, you can set your own.  Note Excel 2010 has a different file location (Google it).

image

To create a new workbook that uses this template, in Excel go to File\New, select the personal templates option (shown as 2 below) and then select the template.

image

I would love to hear what you think and find out if this is a useful tool for you.

Let’s Ask Microsoft to Make this a Standard Feature

I was talking to Avi Singh and he suggested that I request this as a standard feature.  I have set up an idea that you can find and vote for here.  If we get enough votes for this feature, Microsoft may build out this capability into the core product(s) so it is easier for everyone and fully maintained for future releases.

 

Power BI Personal Gateway Explained

One of the many excellent sessions I attend this week at the PASS Business Analytics Conference in San Jose was a session titled “Get Latest Insights by connecting your data using Power BI Content Packs and PBI Gateways”.  The title was interesting but the content presented by Dimah Zaidalkilani and Theresa Palmer-Boroski (both Program Managers on the Power BI team at Microsoft) was truly excellent.  I am going to share here what I learnt about the Power BI Personal Gateway, to help you understand if you need it, and if so how to get it installed and running.

There are actually 2 gateways that Microsoft currently has available for Power BI, the other being the Enterprise Gateway (for companies and the like).  I will not be covering the Enterprise Gateway in this post.  You would generally look to use the Enterprise Gateway if you refresh from work from a corporate network and need to connect through the firewall.

What is Power BI Personal Gateway?

personal gatewayAs the name suggests, this gateway is for “personal” use. To use it you simply install it on your own laptop, desktop computer or even a server if you want.  “Personal” does not mean you can’t use it for business if you want to, it is just a term to distinguish from the other gateway.  What the Personal Gateway does is create a “gateway” or connection between your computer (that has the software installed) and the Power BI Service in the cloud.  It is kind of like a dedicated VPN designed just to allow Power BI to talk to the data on your computer.

Why do I need it?

Well you probably don’t “need” it, but you certainly might “want” it.  Once you have installed and configured the Personal Gateway on your computer, you are then able to refresh your Power BI datasets directly in the cloud.  You have the option to either set up an automated schedule (up to 8 times per day) or you can trigger the refresh manually on demand.

When you refresh a Power BI Desktop file on your PC, Power BI Desktop connects to all the data sources it needs to refresh the report. Data sources can include things such as SQL Server, local Excel files, CSV files, or what ever you have set as your data sources for your reports.  The Power BI Personal Gateway simply allows PowerBI.com to access those same data sources that are visible on your PC transparently over the Internet.

The Regular Refresh Process

If you are anything like me, your standard refresh process for Power BI workbooks will go like this.

  1. Open the Power BI Workbook on your PC via Power BI Desktop
  2. Click on the refresh buttonimage
  3. Wait for 1 to 10 minutes for the workbook to refresh while doing something else.  30 minutes later you realise the refresh finished 20 minutes ago.
  4. Save and then “Publish” the workbook to PowerBI.com via the menu shown below.image
  5. You then get 1 or more prompts about where to load the workbook and also “do you want to replace the existing version” (seriously – what else would I want to do?).image
  6. Repeat process steps 1 through 5 over and over again for each Workbook you have developed.

The Power BI Personal Gateway Process

Option 1:

  1. Configure a daily or weekly refresh once and then do nothing but sit back and relax.

Option 2:

  1. Log on to the Power BI Service
  2. Click on the ellipsis next to each data source and click “refresh” for each workbook you need to refresh.  There is no waiting and the refreshes can be triggered in parallel (whether you should or not will depend on your computer and Internet connection.

Things You Should Know

There are a couple of things you should be aware of.

  1. Gateways are part of the paid subscription to PowerBI.com.  If you currently don’t have a paid subscription then you can’t use it.  Everyone has to decided for themselves if it is worth what it costs.  Keep in mind there are other benefits of the paid subscription, not least of which are “content packs”.  I will come back and talk about that another time.  Also keep in mind that Microsoft is currently not invoicing personal users for the professional version unless they are part of some broader corporate licencing agreement.  In my experience you will be given a 60 day free trial, and after 60 days you will be given a further 60 days free.  I guess this will end one day, but I haven’t seen the end yet. If it still isn’t right for you, consider purchasing Power Update as a local refresh tool.
  2. Your computer needs to be turned on and connected to the Internet at the time of the refresh if you want it to work.  Computer servers are “by design” always on and hence it is not a problem.  But your laptop or personal computer may not be on all the time and hence you need to keep this in mind.

How to Install the Power BI Personal Gateway

Installation couldn’t be easier.  Go to powerbi.microsoft.com/en-us/downloads/ and find the Personal Gateway download.  There is a link to the downloads page on the PowerBI.com  home page before you log in.

image

Launch the software then follow the steps in the install wizard and close then you are done.

Set your Data Source Credentials

The only thing that wasn’t obvious to me when I started using the Personal Gateway was that you need to manually set the data source credentials for each dataset you want to refresh through the gateway.  The first time I realised this needed to be done was when I got the following error message.

image

Clicking through the error took me to the settings section.  You can get to this session at any time from the regular “cog” settings menu.

image

For each data source in your datasets, you will have to “edit credentials” and then tell Power BI what the connection authentication method will be.

image

Here is an example of the dialog from one of my workbooks that has a SQL connection.

image

The dialog for regular PC files is very similar

image

Once it is done, you should see a confirmation message something like this for each connection.

image

Time to Refresh the Datasets

In PowerBI.com, click on the ellipsis next to one of your datasets, and then you can either set up your Scheduled Refresh settings (2 below) or manually refresh the report on demand (3 below).

image

If you select Schedule Refresh above, you will be taken to the settings panel (also clickable from the cog in the top right of the page).  From there you can set up the refresh schedule for all of your loaded datasets.  You may choose to stagger the refresh times across the data sets so that they don’t all hit your computer at once.  You can have up to 8 refresh events each day per dataset.

image

I hope you have found this article interesting.  I would to hear from different readers as to why they have decided to use or “not use” the personal gateway, so please share your story in the comments section below.

I you are looking for more detailed information, you can read about it at the official Power BI Site.

A Fabulous new Excel Add-In for Power Pivot

I’m really exited to share with you a new Excel Add-in that was authored by Bertrand d’Arbonneau and has recently been made available via SQLBI.com The Add-in called Power Pivot Utilities combines a number of existing tools into a single new tool bar in Excel as shown below (Excel 2013 and 2016 only). Everything could already be done using various bespoke existing tools and procedures, but there is huge value by bringing them all together into a simple to use UI like this. This new Add-In goes straight into my “must have tools” for Power Pivot in Excel.

image

There are currently 8 icons on the tool bar (edit: plus there are some hidden gems in the context menus). Let me explain what each item does. Read More