There was an interesting question this week on http://powerpivotforum.com.au asking if there was a smarter way to user Power Query over multiple identical web pages to scrape the data in a single query. I have been meaning to blog about this for a while, so it is a great opportunity for a mini-Friday post.
The process of typing all the steps on how to do this is a lot longer than simply recording a video. So instead of a long post, I have embedded a video below from my YouTube channel.
First the Problem
Take a look at this web page http://www.calendar-australia.com/holidays/2016/
It contains all the public holidays for the calendar year 2016. But there are 2 similar web pages with the same structure but different results
How to do it in Power Query
I’m sure I first learnt this technique watching Miguel Llopis from Microsoft, probably at PASS SQL BA Conference in 2015. The technique can be reused in many different scenarios like this one. The trick is virtually identical to the technique I showed in this post about combining multiple Excel workbooks into a single query. The solution works equally well in Power Query for Excel as well as in Power BI (Get and Transform). The steps are
- Build a query for one of the sets of data
- Turn the query into a function using the simple parameter step
- Modify the query to use the parameter instead of the hard coded set of data
- Create a table with a column of the parameters you want to use (in this case the data must be text, not numbers).
- Add a new column that takes the text from the first column and processes the function for each row.
- Expand the column into a single table.
Here is a video showing how easy it is to do.
Seems there is a special birthday this weekend. Should be fun!
If you want a comprehensive lesson on how to use Power Query, checkout my training course here http://xbi.com.au/pqt