There are quite a few different common business scenarios where you get data in the form of a file on a regular basis. The 3 common patterns I use most often are.
- Each day/month/year you get a new file containing incremental data. All the data in all the files need to be combined and loaded. I have previously shared this article explaining how to combine all files in a folder with Power Query.
- Each day/month/year you get a new file and it replaces the old file. I wrote about this pattern in Load the Latest Version of a File with Power Query.
- You have a workbook with multiple sheets within a single workbook that need to be combined. This could be in addition to pattern 1 or 2 above. This is what today’s post is about.
Combine All Sheets in a Workbook with Power Query
There are a couple of tricky issues to overcome with this pattern.
- Write the query in such a way that new sheets are automatically loaded
- Handle the errors with repeating header rows coming from the multiple sheets
- Handle the renaming of the Sheet Name column so it works regardless if the sheet order changes.
I cover these tricks in my video below.
Note also the comment from Tristian Malherbe in the YouTube comment section. What Tristian is talking about is manually making a change to the code in the source step. My source step as created by the UI is as follows
= Excel.Workbook( File.Contents("C:\Users\Matt\Consolidate-Worksheets.xlsx"), null, true )
If you look at the documentation for this function, you find the following
The UI set useHeaders as null and delayTypes as true. By simply changing the first optional parameter to true (manually in the formula bar), Power Query will treat each sheet as having headers when they are combined removing the repeating header issue. Nice tip – thanks @Datatouille.
Want to Learn Power Query in a Structured Way?
My Power Query Online training course consists of 7.5 hours of online video training material that will teach you everything you need to know about using this fabulous tool for both Power BI and Excel. You can watch some free sample videos and find out more about registration following the link above.
Hi Matt, would this also work with Excel Online sheets on OneDrive?
I see no reason why not
Pingback: Power BI Secure Embed, DAX Variables, MS Flow and more... (January 14, 2018) | Guy in a Cube
Yes, you can skip the first X rows, as long as X is constant. If I were doing this as part of a consolidation of worksheets, I would probably use a custom function approach rather than the combine approach covered here. I would use the approach above to create a list of sheets. I would then process one sheet and remove the rows, then turn that into a function and pass the list of sheet names to the function. You can see a blog about something similar here https://exceleratorbi.com.au/combine-excel-workbooks-power-query-method-1/
The process is essentially the same just that it is combining workbooks rather than worksheets
Hi Matt – thanks for this great tutorial. Is there a way to get Power Query to ignore the first few rows? I have a regular report downloaded from our ERP into Excel where the first few rows are report parameters used to generate the report. The header row starts at row 5 and the data from row 6.
Nice. Great tutorial/walk through Matt. I’m sharing this with colleagues. I’m frequently coming across spreadsheets setup like this. I may have to watch the video a few more times to get it committed to memory!
Excellent tip Matt. I’ve struggled with this in the past as spreadsheets were updated over time and I had to keep fixing the import.
Thanks for the post