Power Query to Combine Web Pages

Level: Intermediate

There was an interesting question this week on http://powerpivotforum.com.au asking if there was a smarter way to user Power Query over multiple identical web pages to scrape the data in a single query.  I have been meaning to blog about this for a while, so it is a great opportunity for a mini-Friday post.

The process of typing all the steps on how to do this is a lot longer than simply recording a video.  So instead of a long post, I have embedded a video below from my YouTube channel.

First the Problem

Take a look at this web page  http://www.calendar-australia.com/holidays/2016/

It contains all the public holidays for the calendar year 2016.  But there are 2 similar web pages with the same structure but different results

http://www.calendar-australia.com/holidays/2017/

http://www.calendar-australia.com/holidays/2018/

How to do it in Power Query

I’m sure I first learnt this technique watching Miguel Llopis from Microsoft, probably at PASS SQL BA Conference in 2015.  The technique can be reused in many different scenarios like this one. The trick is virtually identical to the technique I showed in this post about combining multiple Excel workbooks into a single query. The solution works equally well in Power Query for Excel as well as in Power BI (Get and Transform). The steps are

  1. Build a query for one of the sets of data
  2. Turn the query into a function using the simple parameter step
  3. Modify the query to use the parameter instead of the hard coded set of data
  4. Create a table with a column of the parameters you want to use (in this case the data must be text, not numbers).
  5. Add a new column that takes the text from the first column and processes the function for each row.
  6. Expand the column into a single table.

Here is a video showing how easy it is to do.

Seems there is a special birthday this weekend.  Should be fun!

 

If you want a comprehensive lesson on how to use Power Query, checkout my training course here https://exceleratorbi.com.au/product/power-query-excel-power-bi-online-training/

Share?

Comments

  1. Very helpful post, I had some VBA code that I was using to get Store details from Websites, Now I have the store details directly in PQ. Thank you for the post.

  2. That’s funny I was looking for a solution like this this morning. I will try your suggestion.

    Thanks

    Patrick

    • You don’t need one query per page. You need 1 query to turn into a function, and a second query to pass the page details to the function. With these 2 queries, you can extract any number of pages into a single table.

  3. Hi Matt,
    Thanks for sharing. This use case is way too popular to not have a good solution. But the interesting thing for me is the statement: “The solution works equally well in Power Query for Excel as well as in Power BI (Get and Transform)”. I predominantly use Excel PQ/PP, 365 proplus with 2016 apps. Should I be concerned once I port to PBID? Many thanks.

    • No. You should not be concerned. Generally Power BI is more advanced in capabilities than Power Query. You should expect migration to be painless and features to be better.

  4. Great Post!!!
    can you please upload the excel file?

    how did you extract the year from the website? as it not mention but only the day and month?

    thanks.

  5. Hi Matt. Thanks for sharing . I was looking something like that so many … weeks. Greets from Argentina

Leave a Reply

Your email address will not be published. Required fields are marked *

Ten Things I Wish I Knew When I Started With Power Pivot.   Learn  Without The Pain!

x