I have written a couple of posts here and here about the problems and dangers of using Power Query with Power Pivot. Well I am very pleased to say that I now fully understand the problem, how it is caused and how it can be avoided.
Edit: Nov ’15. Microsoft has released a fix to prevent this problem occurring.
First a Clear Statement of the Problem
You use Power Query to create a table and load it into Power Pivot. You only load a subset of columns to start with.
All is fine for a while and you create a whole lot of DAX formulae – life is sweet. You add new rows to your Power Query Table and refresh the data into Power Pivot – not a care in the world.
Then one day you need to make a change in Power Query (eg to bring in another column of data that you left out originally). You go into Power Query and make a change to your table to bring in a new column.
You save, close and load and then BAM – you get the following dreaded error message. You didn’t even see it coming before it hit you right between the eyes.
Exception occurred. (Exception from HRESULT: 0x80020009 (DISP_E_EXCEPTION))
The only option now is to remove the table from Power Pivot and then reload it. When you do this you lose all your relationships, all your measures in the table, all your calculated columns you created in the table (not that you should have any of those – right?!)
What causes this?
It is very easy to prevent this happening if you know what causes it. This is one of the many high value things I learnt at the PASS BA Conference in San Jose in April 2015 (in 2 separate sessions – one by Chris Webb and one by Miguel Llopis) . So what causes the problem? The issue is caused when you make structural changes to the table in BOTH Power Query and Power Pivot.
Let’s work through the problem.
When you first load data from Power Query into Power Pivot, a ‘connection’ between to the 2 applications is created. You can see this connection from inside Excel on the Data Tab. See example below. Note how the connection string can be selected with the mouse.
Then you innocently make a change in Power Pivot without realising it is a problem
This is where the real problem is caused. If you make ANY CHANGES inside of Power Pivot to the structure of the original Power Query table, then Power Pivot will change the connection from being “editable” to being “read only”. If you then try to make any structural changes to Power Query AFTER the connection is set to read only, this error is then triggered. Once the connection is read only, the only thing you can do is “refresh” the model to add new rows of data. No further structural changes can be made to the table in Power Query after the connection is set to ‘read only’.
All of the following changes in Power Pivot will trigger the problem (careful, there may be more).
- Changing the table name in Power Pivot
- Changing an imported Column Name in Power Pivot
- Deleting a column in Power Pivot
Note that it is quite OK to create a calculated column inside of Power Pivot (although I wouldn’t recommended it – that is what Power Query is for).
And a big part of the problem is that you make the change inside of Power Pivot and you don’t even realise there is a problem – there is nothing to warn you or tell you that the connection has been changed to read only. It could be weeks or months later that you make the change to Power Query and trigger the dreaded message. By this time you have completely forgotten about the change you innocently made in Power Pivot.
So how do you know if you have a problem?
So by the time you get to this point, you probably realise that you may already have a problem that you don’t know about. To check, you need to take a look at the connection string between Power Query and Power Pivot. The easiest way to do this is is to goto ExcelDataConnectionsPropertiesDefinition and check the connection string (just like the example in the previous image). If you see that the connection string is ‘greyed out’ – ie not selectable and also there is a message stating properties can’t be changed, then this confirms you have a problem and it is already too late.
Time to move on with confidence
I have to say that since I have learnt the problem, I have re-engaged with Power Query like never before. I now use Power Query with confidence as a data provider to Power Pivot and I have not had 1 single instance of a problem since. The simple solution is to never ever make any changes inside Power Pivot to a table you have imported from Power Query. Take this precaution and you will never have a problem again. You can make all the changes you need directly in Power Query as long as you only use Power Query and never make structural table changes in Power Pivot. There is nothing you can’t do in Power Query anyway, so this is not a limitation in any way.
I hope this post will give you the confidence to do the same.