Power Pivot and Power Query are the best things to happen to Excel since the invention of Excel itself. Once you start using these tools there is no turning back. But someone asked me the other day “Is it possible to use Power Query with a traditional pivot table?” (ie not with Power Pivot). Indeed it is, let me show you how.
Assume you have some data in a source spreadsheet (actually it could be any other source that Power Query can point to). I created a sample worksheet that contains 1 million rows of random made up transactional data that looks like this to demonstrate the process.
There are no formulas in this workbook, just data – just as if it were exported from a transactional sales system. This sample workbook is almost 27 MB in size as you can see below.
I then created a new blank workbook and connected this new Workbook to the above workbook using Power Query. When prompted in Power Query, I selected Close and Load to and selected “Only Create Connection”. Doing it in this way creates the connection to the source workbook but does not actually bring any data into the new workbook.
After I saved this new Workbook, I could see that the new workbook is very small indeed, confirming there is no data loaded.
The next thing I did was to create a new pivot table that uses this new Power Query as the connection. Just start the “Insert Pivot Table” process as you normally would for any other regular Pivot Table (InsertPivot Table). But when you get to the Pivot Table wizard, select “Use and External Data Source” and then select “Choose Connection” as shown below.
The wizard shows the Power Query created earlier. Select this query (1 below), and then click Open (2 below).
Finally click OK to add the Pivot Table. For this demo data, I then went ahead and set up a Pivot Table as follows.
Interestingly, when I saved this workbook with the query loaded and the Pivot Table added, the workbook was still much smaller than the original workbook.
The standard Pivot Table technology compresses the data (into the Pivot Cache) and only materialises the data it needs to populate the Pivot Table. This keeps the file size of the new workbook small. The result is this new workbook is actually compressed almost 5 fold over the original workbook.
The Normal Way to Create a Pivot Table
Out of interest, I then copied the source data into a new workbook and added a traditional Pivot Table the normal way into the same workbook as the copy of the source data – InsertPivot Table pointing to the underlying data. Interestingly, this time the file is almost 9 MB larger than the source file alone, even though the Power Query workbook that just contained the Pivot Table was only 5.3MB.
And it Still Works Without Access to the Source Data
The last thing I did was move the original source file so that I could be sure that the Power Query Workbook wasn’t actually reading the data from this file when I was interacting with the Pivot Table. I was able to fully interact with the Pivot Table in the Power Query workbook without having access to the underlying source data – exactly the way it works with Power Pivot. The only thing I couldn’t do is refresh the data source (as you would expect).
Edit 10 Dec. As Jos points out below, this can be done without Power Query. The advantage of Power Query is you get all the data shaping, cleaning and transforming capabilities bundled in as well.
So if you have not quite got your Power Pivot Skills up to standard as yet, then at least you can get started using Power Query to shrink the size of your Pivot Table workbooks. But seriously, do yourself a favour and get started with Power Pivot so you can really start to leverage the super abilities of that tool.
If you want a comprehensive lesson on how to use Power Query, checkout my training course here https://exceleratorbi.com.au/product/power-query-excel-power-bi-online-training/
When creating pivot table from connection only query, it creates a duplicate of the query. Any way to avoid this?
Yes, when you create a pivot table from connection only query, it creates a duplicate query with the connection mode as pivot table. But the data is not loaded into data model. You can delete the previous connection only query.
Try save the files as .xlsb. It can reduce the .xlsx file size with up to 80% – if you are lucky.
@Matt
Now take it one step further try the same thing but user the Power BI Desktop’s PQ to get the Data
And Build a “Normal” Power Pivot in Excel connecting the PBIX file – you will see that the Excel File is a Few KB !!! – Thats right – KB
The method to connect to the PBIX file is described in the comments of the below post – on which you have commented as well !!
http://www.powerpivotpro.com/2015/08/power-bi-desktop-now-converts-power-pivot-workbooks/
I like the idea of PBD as a server, but I don’t think it is practical given the server port changes every time you start it
That’s the only thing MS has to fix or give a simple straight forward way to connect to Data model in the PBIX file
We will then finally have an alternate to Access as a Desktop DB – so all the Data and the transformations and the DAX can happen in PBI Desktop and Excel can be a the ultimate lightweight reporting front end
Good story Matt. I am into Power Query. Note that you can get to the same result without Power Query. Simply, insert a pivot table and connect to the data source / table in Excel with transactions. The file with only the pivot table is about 6 – 7 times smaller than the fiole with the transaction table. Indeed a great compression ratio.
Thanks Jos. Yes I probably over simplified my demo example. The big advantage of Power Query of course is the data transformation you can do in the process. In my demo, I didn’t actually do any transformation – but I could :-).
Great, succinct, useful blog post Matt, thanks for sharing your continued learnings.