How Query Folding Works - Excelerator BI

How Query Folding Works

Level: Beginners

Power Query is a product that allows business users to extract data from any source, cleanse and transform that data, and then load the modified data to a new location, typically into an Excel spreadsheet or a Power BI data model.  Power Query is very easy to learn, and most tasks can be completed by using the buttons on the menus.

When the data source to be loaded is a file or files eg a csv(s), Power Query reads the data from the file(s) and processes the transformation tasks directly inside Power Query.  This is fine for relatively simple tasks (eg remove or rename columns) and relatively small files (say less than 1 million rows).  But bad things can happen when you try to load tens of millions of rows using Power Query from a csv file and then try to preform ‘heavy’ tasks, such as sorting, over the data to be loaded.  To sort 10 million rows from a csv, Power Query would first need to load every record into memory and then complete the sort task from there.  While it may be possible to do this, it certainly isn’t the best approach.

Persistent Storage

Power BI and Power Pivot for Excel do not have their own persistent storage of the source data.  These products are a database, but they are a reporting database, not a storage database.  You will need a place to persistently store your source data outside of Power BI and Power Pivot for Excel if you want to be able to refresh your data in the future.

You can of course keep your source files (csv, xlsx) as the persistent storage if you have relatively small amounts of data, but if you are going to process a lot of data you really should consider loading the source data into a relational database. This could be something like Microsoft Access but better still, you could use something like SQL Server if you have it available. Even if you don’t have access to SQL Server, you could consider installing SQL Server Express on your laptop (it is free).  Believe me SQL Server is not that hard to learn (with YouTube) and it is well worth the effort to better manage your persistent storage of data.

What is Query Folding

Inside Power Query, the term query folding refers to the the process of taking complex tasks generated by the Power Query UI (such as sorting 10 million rows of data) and pushing those tasks back to the database for execution.  I like to think of Power Query “writing down” some SQL code on a piece of paper, folding the paper in half and passing it over to the database as a set of instructions for execution.

Power Query will automatically execute query folding under certain conditions including:

  1. The source must be a database that can accept a folding request.  Most relational databases are included.  It can also be an OData source, SSAS and Google Analytics (thanks Miguel and Chris).
  2. If you start your query with SQL code you wrote yourself, no further steps inside Power Query will be folded.
  3. Once you perform a step that cannot be folded, query folding stops for the rest of the query.

Power Query Online Training

How do I Know if Query Folding is Active?

When you use Power Query, the steps you produce are listed in the right side “applied steps” panel. You can check if a query step has generated query folding (or not) by checking for the existence of Native Query code inside that step.  To do this, right-click on one of the query steps and then select “view Native Query”. If you are able to do this, ie if view Native Query is enabled, it means Power Query has successfully performed query folding.

Do the Heavy Lifting First

Your objective when using Power Query over a relational database should be to do all the heavy lifting first.  You basically want to make sure query folding is enabled for as long as possible so the harder tasks are performed more efficiently by the database and not by Power Query itself.  The best way to check is to check if the last step in your query uses query folding.   If the last step uses query folding, then every step uses query folding.  However if you find that that last step is not using query folding, then you should work you way back up the list of applied steps to find out where query folding stopped.  If possible, change the order of execution of your steps so that all the foldable tasks are completed first before you complete any step that cannot be folded (e.g. Split Column).

I have created a brief video to show how Query Folding works and how to check where folding stops (there is no sound in this video, so you can look at it now 🙂 ).

Power Up Your Skills

You can learn this stuff from online articles, but why not excelerate your learning (see what I did there?) by doing some formal training.  I have an excellent online Power Query training course that will teach you query folding and so much more with over 7.5 hours of structured video instruction using real world examples.  You can checkout my training course here => Power Query for Excel and Power BI – Online Training

avatar
11 Comment threads
8 Thread replies
0 Followers
 
Most reacted comment
Hottest comment thread
  Subscribe  
Notify of
Lazaros Viastikopoulos
Guest

Brilliant read! Thank you for this Matt. Did I understand correctly through? I have a query (data set) in Power Query which the source data comes from 5 CSV’s (over 5M records) appended together through a folder connector. Then, I return two more Queries (data sets) from a SQL DB Source using a SQL query, conduct a LEFT OUTER MERGE from the CSV data set as it requires some fields from the two SQL DB Sources, the performance will be slow as Power Query must load every record into Memory first and then complete the Left Out Merge. However, if… Read more »

Shobi
Guest

Thanks for sharing, Keep up the good work!

edh
Guest
edh

I love query folding. A comment, and a plea for help. Comment: If you change the data type as a separate step, it will most likely break query folding, but if you include the data type change in the step by manually editing the formula in the formula bar, it will change the data type and preserve query folding. The below will preserve query folding. #”Added Commission” = Table.AddColumn(dbo_oeordlin_sql, “Commission”, each [unit_price] * .05), #”Changed Type” = Table.TransformColumnTypes(#”Added Commission”,{{“Commission”, type number}}), That preserves query folding because the data type was already a number and it is just a visual in… Read more »

Eric Glover
Guest
Eric Glover

Has anyone found a solution to the YYYYMMDD date issue mentioned above by edh? In my case, the YYYYMMDD field comes across from the source in String format. So, anytime I transform the column it breaks query folding.

Chris Webb
Guest

Hi Matt,
Nice article, but there are a few inaccuracies. First, there are a few other sources apart from relational databases and OData that support query folding, for example SSAS and Google Analytics. Second, your point about no manual M coding allowed is wrong – there’s no difference between M code you write manually and M code generated by the UI, some things can be folded and some things can’t but it depends on what you do in the code, no how you write the code. You’re right though that using your own SQL query prevents any further folding.

Chris

Miguel Escobar
Guest

Hey!
OData also supports Query folding 🙂

Jessica Jolly
Guest

Thanks for this. Query Folding has been on my mind lately, and this helped me understand it. I think it should really be called Query Outsourcing! 🙂

John P
Guest
John P

Matt- great tutorial. In this example, it is straightforward. Any way you can do a blog (or video) of query folding when doing a left outer join? I am having trouble understanding this: You connect to Table 1 (75MB) in a database and do some transformations and close and load as a connection. Then you connect to Table 2 (100MB) in the same database and do some transformations and close and load as a connection. So now we have two queries in power query. When I do a left outer join (Table 1 is the left) it takes a long… Read more »

osiel
Guest
osiel

hola Mat siempre es genial tus publicaciones, seria genial que los cursos estuvieran disponibles en español. cuidate mucho, exitos en todo. 🙂

Scroll to Top