Before showing you how to do cross join in power query, let me tell you a story. I had a situation this week where I need to create a budget table in Power Pivot. I had a choice of either loading a weekly budget as a single record per store and using DAX to calculate the Year To Date budgets, or load a set of records (one for each week) in a budget table. I chose the latter in this case as it gave me a chance to try out some new skills in Power Query. I have recreated the scenario below.
Weekly Budgets by Store
I have a spreadsheet that contains the budget per store per week. The budget is the same for each week for each store for the next 15 weeks. The budget table looks like this – easy to enter the data but not optimal for Power Pivot. In my real life example, there were many more columns for Shrinkage, GP and some other metrics across the page.
Long narrow tables are normally better for Power Pivot, so I used Power Query to reshape this table so it looks like the one below – using a simple un-pivot columns transformation from Power Query.
Once this table was created in Power Query (import from spreadsheet then transformed), I named the Query “Budget” and set it to “only create connection”.
This makes it available to use in other queries without actually storing the data anywhere in Excel. Each time you ‘materialise’ a query by adding it as a table in Excel or Power Pivot, it takes up space. Only do this if you need the materialised view of the table.
Week ID Numbers
I needed to set the budget each week for the coming 15 weeks. I was using a weekly calendar in this example (not a daily calendar) so I had week IDs as shown below in the format YYWW. Note the non contiguous jump at the end of the calendar year into 2016.
Creating the List of Week IDs in Power Query
Here was my first opportunity to learn a new skill. I knew it was possible to create a native list in Power Query rather than import the list from Excel ( I read this somewhere – probably something from Chris Webb) but wasn’t quite sure how. I Googled and tried List.Generate() but that was wrong. Then a flashback – all I needed to do was use the following syntax using braces.
={1..5}
The above generates a list of numbers from 1 to 5. With this in mind I did the following:
- I created a new query by executing the following menu steps:
- Power Query
- From Other Sources
- Blank Query
- Renamed the Query “Weeks”
- I wrote a line of code like this in the formula bar = {1545..1552,1601..1607} and it gave me exactly what I needed. I didn’t know that the comma syntax would work for creating non contiguous ranges when I wrote the formula, but it did work. After you get some practice and experience with a new language like this, you start to see the patterns in the syntax and you get free kicks like this one. (Note: If you can’t see the formula bar, turn it on from the View menu)
- Note that at this point it is “just” a list, not a table. To convert the list to a table, I clicked TransformTo Table, then gave the new table a column name.
- I saved the query as “only create connection” as well so it didn’t materialise the table, just created the instructions on how to create the table in a new Query.
At this point I was very happy and impressed with my new skills, but there were still problems ahead. The next step was to Cross Join the tables.
Cross Join the Two Tables
The weeks table above has 15 rows and the Budget Table has 10 rows. When I say “cross join”, what I mean is I need to duplicate the entire Budget table for every week, adding the week column into the budget table. So I need something like shown below, but for every possible combination. ie 15 x 10 = 150 rows in my new table.
I did some quick Googling on Cross Join Power Query and found some references to Full Outer Joins, but this is not what I needed. Then I had an idea – perhaps I could write a dummy function to simulate the Cross Join process in power query. I tested this and it worked well.
Edit 31/10/15
After posting, I got some good suggestions from Imke Feldman; it seems the function is not needed at all – makes sense actually but I hadn’t realised it. Chris Webb posted a more performant approach too (which is important if you have large tables), so here is the updated approach.
Add a Custom Column to create a Cross Join
- Open the Weeks table and add a new custom column (shown below). The formula for the new column is simply the name of the Budget Query.
- After you click OK, you get a new column that “contains” the entire Budget table as an object in each row of the Weeks table (as shown below).
- All I needed to do then is expand the new column by clicking the expand button (shown above) to create all the possible combinations (shown below)
I set the data types of the numeric columns and loaded the table directly into my data model so I could use it in Power Pivot.
You can also read Chris Webb’s suggestion in the comments below)
If you want a comprehensive lesson on how to use Power Query, checkout my Power Query Online training.
Thanks for this Matt, great way to do the cross join!
Nice and easy trick, I was getting maed with Outer Joins but that did the trick!
Our scenario is nearly the same as yours. Except I cannot have my users go into Power Query and manually enter the {1..15} to generate our matrix. They will have to just open the workbook and go. But the numeric sequence values (like {1..15} or {1..10,12..15}) are stored in a database column in our “Jobs” table. I’m trying to build a matrix of those numbers times the tests in our “Tests” table.
Have you looked at the new feature announced in October – dynamic M query parameters. https://powerbi.microsoft.com/en-us/blog/power-bi-october-2020-feature-summary/
It is in preview, so you have to turn the feature on first
Can you, for example, have the {1545..1552,1601..1607} as a value in a database and then get PowerQuery to generate the list from the value? Or is that only available to convert to a list if it’s typed in?
Give it a try. Power Query can certainly generate that list and I see no reason why it can’t be passed to a DB
I can’t seem to figure out how to do it. I don’t see a way to get PQ to recognize I’m trying to pass a value from a query result (or a table or ???) and have the list generated. What I’d like to be able to do is have values like “1..15” or even “{1..15}” in my database and then have PQ build a list from that, but doesn’t work any way I’ve tried.
I haven’t tried it. At least conceptually, I would have thought if you were trying to extract a list of products from SQL, with product codes 1 through 15, you could do the following.
Generate the list of codes ={1..15}
Convert it to a table
Connect to the product table in SQL
Join the list of codes table to the source table using “Combine” using an inner join
That should do it.
But I’m not sure if that is what you are trying to do. Maybe you could ask for help at community.powerbi.com and provide some more context and sample data, possibly
That’s not exactly what I’m trying to do. I am trying to find a way to have that value in a database column (i.e., store “1..15” or “{1..15}” in a database) and have PQ read that result and generate the list 1 through 15.
I will try to post it over there. I posted to another forum and they didn’t think it could be done. Disappointing, would be a powerful feature for us. Thanks!
I think pert of the challenge is clearly describing what you want to do. I’m still not clear.
I had huge performance issues doing the approach suggested here.
The approach recommended by Chris Webb above of adding a match column to each table being joined with a value of 1 was about 3 x faster for me (which meant query took 5-7 seconds instead of 20 seconds with the approach mentioned in this article.
Hi there. And how would you cross join a table with itself? When I try to do it, I get the “cyclic reference” error message.
How about you duplicate the query first
Hi Matt,
In Power Query it show the right results for example (Table A ID = 1,2,3,4,4 and Table B ID = 2,4,4,5)
so 5×4 = 20 but when you use the table visual it show only 15 records ? now showing duplicate from table B
I wonder way? – How do I show all records in my table
Sorry, this is not enough information for anyone to help. I suggest you read my suggestions on how to get help here
https://exceleratorbi.com.au/get-help-power-bi-2/#freehelp
thanks for this post! So simple and exactly what I needed
I just wanted to say thank you for your blog!! Really helpful for those of us still learning the tricks of Power Query. I’ll take you up on your book recommendation too 🙂
Hi Matt,
I think there is a simpler way to do that. The only you have to do is adding the column with your list of weeks to the first table and expand its. That is all 🙂
Regards
Yep, that is what is says. Are you looking at an email copy of the site? I have already update it to reflect the above comment from Imke
But Matt, i’m not telling about adding the column containing first table in each row.
I’m telling about adding the column to the first table (column containing your list in each row).
I mean something like this
=Table.AddColumn( YourLastStepInFirstTable, “Week”, each {1542..1562,1601..1607})
and then expand this column.
Now we have the same table as you have. Only the order of records is different.
Regards
Yep, looks like a nice solution. Thanks for sharing.
oopsss…sorry the list should be {1545..1552, 1601..1607}
Matt, Imke,
There are some potential performance problems with these approaches that I cover here:
http://blog.crossjoin.co.uk/2014/06/02/join-conditions-in-power-query-part-1/
http://blog.crossjoin.co.uk/2014/06/04/join-conditions-in-power-query-part-2-events-in-progress-performance-and-query-folding/
The best approach to doing cross joins I have found is to create calculated columns on both tables that return the value 1, then do a merge to join these tables together.
Chris
Hi Matt and Imke,
Imke’s solution works a treat!
Thank you Matt for this elegant solution. I am using it already!
cheers
ACB
Hi Matt,
very cool trick with the comma in the list! Isn’t Power Query really rewarding once you start playing around with it?
Re the function for the crossjoin: As it has no parameters, you could actually skip it and simply write the name of the query into the formula section of your new column (=Budget). (In your case you would need to “unfunction” it back to a simple table)
(…I’m not a programmer at all, but from what I know about it, the simplicity of some concepts in Power Query must come quite as a surprise 🙂 )
Oh, cool. Thanks Imke, that is just the sort of “better way” I was thinking of/hoping for. I didn’t even think that was possible. I will test it out and update the post