Cross Join with Power Query - Excelerator BI

Cross Join with Power Query

Before showing you how to do cross join in power query, let me tell you a story. I had a situation this week where I need to create a budget table in Power Pivot.  I had a choice of either loading a weekly budget as a single record per store and using DAX to calculate the Year To Date budgets, or load a set of records (one for each week) in a budget table.  I chose the latter in this case as it gave me a chance to try out some new skills in Power Query.  I have recreated the scenario below.

Weekly Budgets by Store

I have a spreadsheet that contains the budget per store per week.  The budget is the same for each week for each store for the next 15 weeks.  The budget table looks like this – easy to enter the data but not optimal for Power Pivot.  In my real life example, there were many more columns for Shrinkage, GP and some other metrics across the page.

image

Long narrow tables are normally better for Power Pivot, so I used Power Query to reshape this table so it looks like the one below – using a simple un-pivot columns transformation from Power Query.

image

Once this table was created in Power Query (import from spreadsheet then transformed), I named the Query “Budget” and set it to “only create connection”.

image

This makes it available to use in other queries without actually storing the data anywhere in Excel.  Each time you ‘materialise’ a query by adding it as a table in Excel or Power Pivot, it takes up space.  Only do this if you need the materialised view of the table.

Power Query Online Training

Week ID Numbers

I needed to set the budget each week for the coming 15 weeks.  I was using a weekly calendar in this example (not a daily calendar) so I had week IDs as shown below in the format YYWW.  Note the non contiguous jump at the end of the calendar year into 2016.

image

Creating the List of Week IDs in Power Query

Here was my first opportunity to learn a new skill.  I knew it was possible to create a native list in Power Query rather than import the list from Excel ( I read this somewhere – probably something from Chris Webb) but wasn’t quite sure how.  I Googled and tried List.Generate() but that was wrong. Then a flashback – all I needed to do was use the following syntax using braces.

={1..5}

The above generates a list of numbers from 1 to 5.  With this in mind I did the following:

  • I created a new query by executing the following menu steps:
    • Power Query
    • From Other Sources
    • Blank Query
  • Renamed the Query “Weeks”
  • I wrote a line of code like this in the formula bar = {1545..1552,1601..1607} and it gave me exactly what I needed.  I didn’t know that the comma syntax would work for creating non contiguous ranges when I wrote the formula, but it did work.  After you get some practice and experience with a new language like this, you start to see the patterns in the syntax and you get free kicks like this one.  (Note:  If you can’t see the formula bar, turn it on from the View menu)

image

  • Note that at this point it is “just” a list, not a table.  To convert the list to a table, I clicked TransformTo Table, then gave the new table a column name.

image

  • I saved the query as “only create connection” as well so it didn’t materialise the table, just created the instructions on how to create the table in a new Query.

At this point I was very happy and impressed with my new skills, but there were still problems ahead.  The next step was to Cross Join the tables.

Cross Join the Two Tables

The weeks table above has 15 rows and the Budget Table has 10 rows.  When I say “cross join”, what I mean is I need to duplicate the entire Budget table for every week, adding the week column into the budget table.  So I need something like shown below, but for every possible combination.  ie 15 x 10 = 150 rows in my new table.

cross join

I did some quick Googling on Cross Join Power Query and found some references to Full Outer Joins, but this is not what I needed.  Then I had an idea – perhaps I could write a dummy function to simulate the Cross Join process in power query.  I tested this and it worked well.

Edit 31/10/15

After posting, I got some good suggestions from Imke Feldman; it seems the function is not needed at all – makes sense actually but I hadn’t realised it.  Chris Webb posted a more performant approach too (which is important if you have large tables), so here is the updated approach.

Add a Custom Column to create a Cross Join

  1. Open the Weeks table and add a new custom column (shown below).  The formula for the new column is simply the name of the Budget Query.
    new add column
  2. After you click OK, you get a new column that “contains” the entire Budget table as an object in each row of the Weeks table (as shown below).
    image
  3. All I needed to do then is expand the new column by clicking the expand button (shown above) to create all the possible combinations (shown below)

image

I set the data types of the numeric columns and loaded the table directly into my data model so I could use it in Power Pivot.

You can also read Chris Webb’s suggestion in the comments below)

If you want a comprehensive lesson on how to use Power Query, checkout my Power Query Online training.

26 thoughts on “Cross Join with Power Query”

  1. Nathaniel Feuerstine

    Our scenario is nearly the same as yours. Except I cannot have my users go into Power Query and manually enter the {1..15} to generate our matrix. They will have to just open the workbook and go. But the numeric sequence values (like {1..15} or {1..10,12..15}) are stored in a database column in our “Jobs” table. I’m trying to build a matrix of those numbers times the tests in our “Tests” table.

  2. Nathaniel Feuerstine

    Can you, for example, have the {1545..1552,1601..1607} as a value in a database and then get PowerQuery to generate the list from the value? Or is that only available to convert to a list if it’s typed in?

      1. Nathaniel Feuerstine

        I can’t seem to figure out how to do it. I don’t see a way to get PQ to recognize I’m trying to pass a value from a query result (or a table or ???) and have the list generated. What I’d like to be able to do is have values like “1..15” or even “{1..15}” in my database and then have PQ build a list from that, but doesn’t work any way I’ve tried.

        1. I haven’t tried it. At least conceptually, I would have thought if you were trying to extract a list of products from SQL, with product codes 1 through 15, you could do the following.
          Generate the list of codes ={1..15}
          Convert it to a table
          Connect to the product table in SQL
          Join the list of codes table to the source table using “Combine” using an inner join
          That should do it.

          But I’m not sure if that is what you are trying to do. Maybe you could ask for help at community.powerbi.com and provide some more context and sample data, possibly

          1. Nathaniel Feuerstine

            That’s not exactly what I’m trying to do. I am trying to find a way to have that value in a database column (i.e., store “1..15” or “{1..15}” in a database) and have PQ read that result and generate the list 1 through 15.

            I will try to post it over there. I posted to another forum and they didn’t think it could be done. Disappointing, would be a powerful feature for us. Thanks!

  3. I had huge performance issues doing the approach suggested here.

    The approach recommended by Chris Webb above of adding a match column to each table being joined with a value of 1 was about 3 x faster for me (which meant query took 5-7 seconds instead of 20 seconds with the approach mentioned in this article.

  4. Hi there. And how would you cross join a table with itself? When I try to do it, I get the “cyclic reference” error message.

  5. Hi Matt,
    In Power Query it show the right results for example (Table A ID = 1,2,3,4,4 and Table B ID = 2,4,4,5)
    so 5×4 = 20 but when you use the table visual it show only 15 records ? now showing duplicate from table B
    I wonder way? – How do I show all records in my table

  6. I just wanted to say thank you for your blog!! Really helpful for those of us still learning the tricks of Power Query. I’ll take you up on your book recommendation too 🙂

  7. Hi Matt,

    I think there is a simpler way to do that. The only you have to do is adding the column with your list of weeks to the first table and expand its. That is all 🙂

    Regards

      1. But Matt, i’m not telling about adding the column containing first table in each row.
        I’m telling about adding the column to the first table (column containing your list in each row).
        I mean something like this
        =Table.AddColumn( YourLastStepInFirstTable, “Week”, each {1542..1562,1601..1607})
        and then expand this column.
        Now we have the same table as you have. Only the order of records is different.

        Regards

  8. Matt, Imke,

    There are some potential performance problems with these approaches that I cover here:
    http://blog.crossjoin.co.uk/2014/06/02/join-conditions-in-power-query-part-1/
    http://blog.crossjoin.co.uk/2014/06/04/join-conditions-in-power-query-part-2-events-in-progress-performance-and-query-folding/

    The best approach to doing cross joins I have found is to create calculated columns on both tables that return the value 1, then do a merge to join these tables together.

    Chris

  9. Hi Matt,
    very cool trick with the comma in the list! Isn’t Power Query really rewarding once you start playing around with it?

    Re the function for the crossjoin: As it has no parameters, you could actually skip it and simply write the name of the query into the formula section of your new column (=Budget). (In your case you would need to “unfunction” it back to a simple table)

    (…I’m not a programmer at all, but from what I know about it, the simplicity of some concepts in Power Query must come quite as a surprise 🙂 )

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top