Edit 16 May 2015: I now know the exact cause and cure for this problem. See this post here
PhilC at PowerPivotForum.com.au asked me for advice on the best way to use Power Query with Power Pivot. If you have come to this post via Google, it is probably because you have had issues and are looking for answers. If you are reading this post for any other reason, then hopefully it will save you some pain by following my advice on how to minimise these Power Query issues with Power Pivot.
Quickly – what is Power Query
Power Query is the best thing to happen to Excel since Power Pivot! Basically it is a lite extract, transform and load (ETL) tool for Excel users. It allows us Excel users to grab data from almost anywhere, shape it into a form that can be used in Excel (increasingly Power Pivot), load it into your workbook and then away we go.
So what’s the problem? Let me explain
Here is the scenario. You grab some data using Power Query and then load it into Power Pivot. You start building out your measures and adding value to your Power Pivot data model. After investing many hours, you realise that you really need 1 more column of data from your Power Query table. You go back to Power Query, add the new column of data, save the workbook and “BAM” you get the following error message.
We couldn’t refresh the table YourTableName from connection YourPQConnection. Here’s the error message we got: OLE DB or ODBC error: The query YourPowerQuery or one of its inputs was modified in Power Query after this connection was added.
When you first see this message, you can feel your stomach sink through the floor. You have put hours into your data model and how you have to “start again”.
How to fix this problem after it happens
- First go to each measure in your affected table and move each measure to another table (any table will do). You can’t cut and paste from the calculation area – not if you want to keep your measure formatting anyway. So instead you need to go into ExcelManage Calculated Fields and change the table name for all the measures affected.
- Go into Power Query, remove the check from the checkbox “load to data model” and save the Power Query.
- Then edit the Power Query again, re-check “load to data model” and save.
- Go back into Power Pivot and move all your measures back into the Power Query table.
- Reconnect the tables in Power Pivot.
Now the more measures you have in Power Pivot in your table, the more work this becomes. I can tell you it is a real hassle.
As an extra precaution, I always take hourly snapshots of my workbooks when working on the data model. Just take a copy of the current workbook and save it in a location with a unique name (date and time stamp added to the name works best for me). Then if disaster ever strikes, you never lose more than 1 hour’s work.
Prevention is better than the cure
I have seen various posts on how to fix this problem, but frankly I haven’t seen any that seem to work perfectly. So I think there are basically 2 choices:
- Don’t use Power Query
- Find a way of managing the risk to an acceptable level.
Well option 1 is not a good option because Power Query is a great tool. This is process I have adopted that minimises the effort of rework if/when something goes wrong. And the key is to never put any measures/calculate fields in tables that come from Power Query. Here is my process.
- Add your table from Power Query (as normal) into the data model.
- Create a dummy table to store your measures that would normally go into the Power Query table.
- Add the table to Power Pivot. Don’t connect the table to anything, just keep it as a disconnected table.
- When you need a measure that would normally be placed in your Power Query table, just place it in this dummy table instead.
Now there are only 2 problems with this approach, and they are minor in the big scheme of things.
- If a problem occurs, you still need unload your Power Query workbook, re-load it again and reconnect the tables. But that takes 30 seconds tops.
- You will get “false positive” warnings from Excel that your data model may need connections. But frankly you get a lot of those anyway, particularly once you start to write more complex DAX, so it is not really an issue.
Microsoft is working on it.
I know a guy who is connected with Microsoft, and he assures me that Microsoft knows this is a big issue and is working on it with priority. Having said that, he told me this last October 2014 and it is still not fixed.
I hope my suggested approach saves you future pain.
Edit 16 May 2015: I now know the exact cause and cure for this problem. See this post here
If you want a comprehensive lesson on how to use Power Query, checkout my training course here https://exceleratorbi.com.au/power-query-online-training/
Thanks for an interesting idea, but what do you do with calculated columns?
I assume you are asking “how do I recover my calf columns”, right? Well a couple of comments. Firstly, based on my experience, most calc columns that self taught Power Pivot users creat should actually be Calculated Fields instead. I don’t know if this applies to you or not. Have a read of this. http://www.powerpivotpro.com/2014/10/5-common-mistakes-made-by-self-taught-dax-students/
Secondly, I have just returned from PASS BA Vinference in Santa Clara, CA http://passbaconference.com/2015/Live.aspx. At one of the sessions I finally learnt exactly what the problem is. I will update this blog post to explain the problem later, but so you know, the issue is that when you make any changes to a table that comes from Power Query from within Power Pivot, the connection is set to “read only”. Examples are changing table name, column name, adding calc column etc. Then if you change Power Query again, the error occurs. I’ve both applications have changed the table, and hence you are forced to remove and reload the table. It makes sense really.
So my advice is 1, consider changing your calc column to a calc field, and 2, if you really need the calc column, create it in Power Query and not Power Pivot and 3, never change a table in Power Pivot if it was created in Power Query.
“As an extra precaution, I always take hourly snapshots of my workbooks when working on the data”
Amen to that. I’ve learnt the hard way and now have a command button on the home page of any work in progress. It’s caption reads ‘Increment File Version’
When you click it, it saves the current workbook, then increments a counter and resaves the workbook as v1 + counter number.
Any serious slog and I’m clicking that button regularly I can tell you
Ok, I’ll have loads of versions racked up but I can delete them when I feel comfortable and know I always have something to fall back on
When I started off and wasn’t accustomed to the 32 bit / 64 bit dimension to Office 2013 I had some serious crashes to recover from. And also some gotchas vis-à-vis changing the names of tables loaded into PP using PQ
Anthony, a button to save a new version – great idea. Care to post the VBA code?
No problem, just adapt accordingly changing the locations where you store your counter, filename etc
Public Sub IncrementFileVersionSave()
Dim sFilename As String
Dim sFolder As String
Dim sFullFilename As String
Dim lVersionNo As Long
On Error GoTo ErrorHandler
Application.StatusBar = UCase(“Incrementing file version, please wait…”)
ActiveWorkbook.Save ‘save current workbook
lVersionNo = wsConfig.Range(“rVersionNo”).Value + 1
wsConfig.Range(“rVersionNo”).Value = lVersionNo
wsConfig.Range(“rFilename”).Calculate
sFilename = wsConfig.Range(“rFilename”).Value
sFolder = ActiveWorkbook.Path & “”
sFullFilename = sFolder & sFilename
ActiveWorkbook.SaveAs sFolder & sFilename ‘save as new version, incrementing version number
ErrorExit:
Application.StatusBar = False
Exit Sub
ErrorHandler:
MsgBox (“Cannot save file, check folder / drive is available”)
Resume ErrorExit
End Sub