Self Referencing Tables in Power Query - Excelerator BI

Self Referencing Tables in Power Query

I have had this idea in my head for over a year, and today was the day that I tested a few scenarios until I got a working solution. Let me start with a problem description and then my solution.

Add Comments to a Bank Statement

The problem I was trying to solve arose when I downloaded a digital copy of my bank statement, loaded it into Excel using Power Query and then wanted to add commentary to some of the transactions. If you can add the comments to the source file, everything is easy. But it is often not practical to add the comments directly to the source file. This could be because you get a new copy of the source data each time you refresh (e.g. you get a CSV file that replaces the old file, and the file just gets larger each month), or because you are using a “combine multiple files” technique and only want to deal with a summarised, cleansed version of the final data instead of dealing with the source files directly.

Once you have loaded the table in Power Query, it is possible to add a new column to the resulting table (shown below).  The problem with this approach is that the comments are not logically linked to the rows of data in the table.  Take the example below.  The data is loaded from table 1 using Power Query, into table 2 on the right.  I then manually added a new comment column and added some comments (as shown below).


The problem that can happen is demonstrated below.  If the sort order of the source table changes and then you refresh the query, the comments no longer align with the original data.  And this is despite having a unique ID column in the original table.


The problem is that the new column I manually added to the Power Query table on the right is not logically joined to the actual table, and hence the comments are actually in a column next to the table rather than part of the rows in the main table.

Enter Self Referencing Tables

I read this blog post from Imke quite some time ago, and that was what gave me the idea on how I could solve this problem. The idea is to load table 2 above a second time (after adding the comments), and then join it back to itself, hence logically joining the manually added comments to the rows in the main table.

Note: Prior to completing the self referencing steps below, I used Power Query to create the second table from the source table. This was my starting point.

Then I loaded the resulting table 2 a second time as follows:

  1. Select a cell in the second table
  2. Go to the Power Query menu
  3. Create a new query using “from table”.


I renamed this new query “Comments” and selected “Close and Load To” so that it only created a connection but didn’t load the new table to the worksheet.
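
If you open the Advanced Editor on the new query, the code is roughly as sketched below. The table name "Table2" is an assumption for this illustration (use whatever name Excel gave your loaded Power Query table), and the editor may also add a "Changed Type" step that I have left out.

```powerquery
// Query name: Comments (connection only)
let
    // "Table2" is an assumed name for the loaded Power Query table,
    // i.e. the one that now includes the manually added comments column
    Source = Excel.CurrentWorkbook(){[Name = "Table2"]}[Content]
in
    Source
```

Because this query reads the worksheet table as it currently stands, it picks up whatever comments have been typed in at the time of the refresh.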


The next thing to do is to merge the original query with the new query.  To do this,

  1. Go back and edit the original query for table 2
  2. Add a new step to the query using “Merge Query”
  3. Merge the second “Comments” query with the original query by joining on the ID column as shown below.


This will give you a new column in your query, and the column contains a table (as shown below).  You can then expand this new column to extract the comments.


I only extracted the comments column, and deselected the last option as shown below. Note, it is important that you deselect “Use original column name as prefix” so that the new column has the same name as the original source column.

Click OK.

Edit May 2020.  Power Query is always changing.  It seems there has been a change which introduces the need for another step at this stage. After expanding the new column, you will see that Power Query renames the new column and calls it Comments.1 (shown below in 1).

Rename this step

You need to edit the code shown in 1 and remove the .1 so the new column has exactly the same name as the comments column in the table you originally loaded.
After you have made this change, you can then close and load.
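
For reference, here is a minimal sketch of what the finished query for table 2 could look like in the Advanced Editor. The names "Table1", "ID" and "Comments" are assumptions based on the demo in this post, not the exact code the user interface generates. I have also deliberately named the nested join column "CommentData" (a hypothetical name) so that it cannot clash with the expanded "Comments" column.

```powerquery
// Query for table 2 (loaded to the worksheet)
let
    // Assumed name of the source table (table 1)
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],

    // Join to the connection-only Comments query on the unique ID.
    // A left outer join keeps every source row, with or without a comment.
    Merged = Table.NestedJoin(
        Source, {"ID"},
        Comments, {"ID"},
        "CommentData", JoinKind.LeftOuter
    ),

    // Expand only the comments column. The last argument is the output name;
    // this is where a ".1" suffix would need to be removed so the name
    // matches the manually added column exactly.
    Expanded = Table.ExpandTableColumn(Merged, "CommentData", {"Comments"}, {"Comments"})
in
    Expanded
```

The important part is that the expanded column name matches the manually added column name character for character, so the loaded table ends up with one comments column, not two.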


When you look at the table now, it all looks the same as before (see below), but there is one important difference.  The last column in the Power Query Table is now a part of the table and not a manually added on column.  You now have a self referencing table.  EDIT 2017.  If you get a new column called Comment2, just delete it.  Everything else will work. There was a change to the way Power Query works at some stage and this duplicate column with “2” appended now appears.  But you can (and should) just delete it – it will then all work.


To prove this, I completed the same sort test as before.  First I sort the source table, then refresh the Power Query table – you can see the results below.  Note how this time the comments stick with the row in the table – sweet!


Incrementally Add Comments

Now that the queries are all set up, you can incrementally add comments to the Power Query table any time you like.  And you can also add new data to the source data – it will all work as you would expect as shown below.


Real Life Data

The above demo is of course test data.  A more realistic real life scenario would be to download a CSV file from your bank, and use one of the many “combine multiple files” techniques to import the data.  It is likely that your source data is messy, and contains lots of columns you don’t need. In this real life scenario, it is not easy to add comments to the source data unless you open each file and add the comments there – this is not ideal.  Another problem could be that your data source could be a CSV file that uses a “delete and replace with the new larger file containing the latest records” method.  In both of these scenarios, you will need to have your manually added comments located somewhere other than the source table.  The solution provided here is a simple and intuitive way to manage this for the user (albeit with a little bit of setup first).

The example shown in this post will work just fine as long as 2 conditions are maintained.

  1. You have a unique ID for each record
  2. You don’t accidentally load duplicate data – so don’t do that!
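
A quick way to check both conditions is a small audit query that lists any ID appearing more than once. This is only a sketch: "Table1" and "ID" are assumed names, and "RowCount" is a column name I made up for the illustration.

```powerquery
// Query name: DuplicateAudit (hypothetical)
let
    // Assumed name of the source table
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],

    // Count how many rows share each ID
    Counted = Table.Group(Source, {"ID"}, {{"RowCount", each Table.RowCount(_), Int64.Type}}),

    // Keep only the IDs that occur more than once
    Duplicates = Table.SelectRows(Counted, each [RowCount] > 1)
in
    Duplicates
```

If this query returns no rows, every record has a unique ID and the join will return exactly one comment per row.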

But What if I don’t have an ID column?

The next logical issue you may have is a scenario where you don’t have an ID column.  I tested quite a few scenarios including adding an ID column inside the first Power Query step (add index column).  I haven’t been able to find a solution that works 100% of the time, but there is one that can work for most people.  If your data is such that each row in the table is unique (eg date, maybe a time column if you are lucky enough to have this, amount of transaction, remaining balance etc) you could create a unique ID by concatenating these columns into a single unique column.  This should solve the issue for most people in non-commercial scenarios (those in commercial scenarios are more likely to have a reference ID anyway).  You could even create an audit query that checks for duplicates and warns you if this occurs, but that sounds like another blog post.
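
As a rough sketch of the concatenation idea, the step below builds a text key from several columns. The column names "Date", "Description", "Amount" and "Balance" are assumptions for the example, as is the "|" separator; swap in whatever columns make your rows unique.

```powerquery
let
    // Assumed name of the source table
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],

    // Build a text key from columns that together identify a row.
    // The date is formatted explicitly so the key does not depend on locale.
    AddedKey = Table.AddColumn(
        Source,
        "ID",
        each Text.Combine(
            {
                Date.ToText(Date.From([Date]), "yyyy-MM-dd"),
                [Description],
                Number.ToText([Amount]),
                Number.ToText([Balance])
            },
            "|"
        ),
        type text
    )
in
    AddedKey
```

The key must be built from source columns only, so it comes out the same on every refresh. Two genuinely identical transactions on the same day would still collide, which is why this cannot work 100% of the time.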

I hope you find this article useful, and/or it inspires you to come up with your own innovative ways to use Power Query.


If you want a comprehensive lesson on how to use Power Query, check out my training course here

91 thoughts on “Self Referencing Tables in Power Query”

  1. Fantastic! This has been a life saver for me – I work for a retailer and sometimes we need to pull a list of returned goods that needs investigating from our SQL-server, for them to be investigated and manually assigned a category in Excel. The problem was, that when I would refresh the data set to pull new orders that needed investigation, the manual assignment got misaligned and put onto the wrong order. Using this, I was able to keep it correctly aligned when pulling new data from our server. Thank you so much.

  2. Derick Reichwald

    This has been a huge help! Thank you! Not sure if this thread is still active, but here goes nothing. I’m working in Excel 2016.

    When removing the .1 after you complete the merge and expand the columns, I get an Expression.Error: The field ‘____’ already exists in the record and it won’t run the query. When watching a video that was posted in the comments, after the .1 was removed the columns just popped together.

    Any advice? Appreciate the help and the article!

    1. The results do vary from time to time as the versions change. I don’t have the magic answer – you will just have to experiment and see what you can do.

  3. Thank you, Matt! This really helps!
    Do you know how I can insert a row manually in the source table so that the comment shifts as well?

    1. This solution has a source file and a comments table. I don’t see how you could add a row in the comments table as the comments table is just a copy of the source file. If you wanted to add new records, I would probably take a different approach and create a second table of manual records and append that to the source file data. It would mean you would have to uniquely ID each new row of data from any other in the source.

  4. This is Great! Everything seems to be working as it should except for when I input formulas into the cells of the new table. When just entering data and I hit refresh, everything stays. However, when I input a formula into one of the cells and hit refresh, the formula disappears in the formula box and it is replaced with the result of the formula. Any thoughts as to why this is happening?

    1. Yes. You can’t use formulas. The data in the table gets loaded into PowerQuery and then reloaded back into the table. During that process, any formulas will be converted to text. There is no way around this for this process

  5. Hi Matt,

    Thanks for publishing this post! It is exactly what I needed. However, upon completing it I noticed that my data is self duplicating random rows upon each refresh. I have not seen anyone else have this issue in the comments below. Here is my situation:

    1) PQ from Exchange Server to pull purchase orders
    1.1) in edit Query I perform lots of table transformations to filter out columns/data that is not needed in my report.
    1.2) Close and Load – creates table “Mail”
    2) Added 5 columns to this table
    3) Create “Input” query with added columns and close and load to connect only.
    4) Merge queries to same “Mail” table (using auto generated “ID” column as the key from exchange data) with Left.outerjoin
    5) delete duplicated columns PQ adds after merge (I.e comments1, etc)

    I have the report set to auto refresh every minute since we get hundreds of orders throughout the day. First, I only used one table in excel for output/input and i had this duplication. Then I tried using two tables (PQ from Exchange and transform data into table 1 then PQ that to Table 2. Table 2 is then made into output/input using your method above with less table transformations since it’s referencing table 1) but i still get random row duplications that exponentially duplicate with each auto refresh.

    Any advice on how to prevent this from randomly duplicating rows? Thanks in advance for your response!

      1. Thanks for the response! In your example in your post, the primary key is the unique ID column? Or is it something else I need to define?

        1. It can be anything, as long as it is unique and comes from the source file. It could be a concatenated column combining date, description and amount if necessary.

  6. Hi Matt,
    can you explain why this works? When you try to use the result of an Excel formula inside the formula, you’ll get an error. Somehow this solution feels like it would also create an error. Does it depend on the order in which the queries are processed? Is there any risk that future versions of Power Query will break my table-with-comments?

    1. Can I explain why? Not really; I just know it works. Effectively there are 2 copies of the table, the original and the one with the comments, and it sorts itself out. Power Query is not Excel. Excel formulas continue to iterate until there is an end point. Power Query doesn’t – it just does one iteration then stops once the table loads. It seems to always work. Yes, there is a risk it will stop working. It has already changed a few times, but it still works.

  7. Matt,

    This has to be one of the most glorious solutions I’ve come across. I’m a heavy user of Visio in my consulting role and needed to be able to make use of the data driven Visio cross functional flowcharts but needed to use the Excel table the Visio interacts with as a dynamic data source. As you may be aware Visio is notoriously finicky about any changes to the underlying table. For the project I’m on I needed to not only derive additional columns from the Visio data but also to extend the dataset out in order to build some semi-dynamic documentation. I’d been losing my mind trying to carefully extend the base spreadsheet only to keep having the link between the Visio and the Excel break.

    Finding this self referencing method has completely solved the issue because I can simply leave the original Visio Excel table alone. I’ve built a second spreadsheet that uses the original table as a source but also extends on that table for the additional fields I need using the Visio Object ID as the primary key. I’ve managed to get a combination of calculated fields, mixed with free text fields (as per your example) and it’s completely saved the day!

    Thank you so much for sharing this brilliant solution.


    ITSM/SIAM Consultant

  8. Thank you Matt for this post; Leila G pointed me to your site which I greatly appreciate (I took her PQ course to find such nuggets – her course is great btw!!). I only found one other YouTube video referring to this topic (Doug H from There are a lot of assumptions here (it should work, you should delete) which may not be straightforward! Have you tested the position of your “comment” column in reference to the original query? I would like my “comment” column to actually be almost at the beginning (2nd column from the left) of my original table (inserting the “comment” in column B); this seems to throw a curve ball to the process. I would love to hear from you if you have tested more scenarios on how the “comment” column position affects performance. My “source” table is a query from an external Excel file. Many thanks in advance! Yves

      1. Matt, I have tested a few options:
        1) I figured it is possible to add multiple “comment” columns (I needed to add 5 comment columns!)
        2) when adding the “comment” columns, I created them first at the end of the table (far right); when I attempted to insert them in the middle or before the first column, the “comments.1″,”comments.2” replicas would pop up on every refresh. This seems to be a key step.
        3) I created the simple self-referencing query as you suggest.
        4) I merged my original query with the self-referencing using several data columns for my merge – I had no unique ID).
        5) Upon the very first refresh, I deleted the extra “comments.1” to the right of my table.
        6) I went back in to query editor and re-ordered the columns as I needed (I wanted the comments columns at the beginning).
        The refresh now holds in place!
        Thanks again for the great pointers Matt!

  9. Hi Matt, trying to get this working… at the very start when you created table 2 from the source data, was that done just in the query or did it get loaded to the power pivot in the background? How did the Comments column get created in the first place? I can’t seem to make a blank column in power query, so have to make it in power pivot. but then I end up with a Comments.1 column in power pivot, even though in the excel page I have deleted it.

    When I hit refresh the comments data I have put in just disappears.

    1. You connect to the source, edit the query and “load to table in Excel”. This has nothing to do with Power Pivot. The comments column is then added by simply clicking in the cell next to the table as shown in the first image in the article. Nowhere do I create a blank column in Power Query, nor do I use Power Pivot.

  10. Ron MVP (2012-2018)

    Funny thing, I just cooked up a similar solution a couple of hours ago. Similar scenario, user wanted to add new comment data columns to data imported using PowerQuery.
    We have a unique row ID for the join.
    I created a “comment” table that consists of the ID column and comment column.
    Use PQ to import the source data, close and load to Excel/PowerBI
    Use PQ to import the Comments table, close create connection only
    Open the source data query, Merge it to comments query using a Left Outer join
    Your self reference has 2 copies of the full table. My way only has the smaller number of rows with comments, but requires a separate comment input table. I’m not sure which is better. Your way may be a little “neater”.

  11. Hi Matt, this is awesome. Is there a way to connect the query table to the source table? What I mean is, my source table is really long and I want the users to make any changes they have on the smaller query table. Is there a way to relay those changes back to the source using a variation of the above?

    1. This solution creates a “copy” of the source with an additional column that only exists in this new copy. It can only display the new column in this “copy”. It can’t “write it back” to the source. I am not sure if I understood correctly, or not.

      1. Thank you Matt

        In my situation, the source table is a live table managed by one stakeholder. My second table using your above self-referencing technique has additional columns that another stakeholder updates. Ultimately, they both have certain columns that (at this moment) can be updated by either parties but with the above structure cannot be updated on different tables. I was looking to understand if there was a variation of the above technique to make that possible. I hope that clarifies things.

        The issue with a “copy” is, (in my beginner level experience), is that the source is a live document and cannot be maintained as such if I make a “copy” from my understanding.

        Thanks again

        1. Matt Allington

          I recommend looking for a different solution. Options include Excel Online, SharePoint and Power Apps. I’m sure there are other options too.

  12. Hi, this is just awesome. Thanks very much for sharing. The additional columns I added included some formula fields, i.e. Table Column C = Table Column A + Table Column B. This formula disappeared on refresh and only the data remained. So any newly added table columns will lose the formulas that were input. I set these formulas for the columns programmatically and it works great. Thanks.

  13. Hi Matt,
    this really is a fantastic solution! Unfortunately, when I tried to share the Excel file with a number of colleagues via SharePoint or OneDrive, working simultaneously in the file seemed to cause issues. As long as we only entered comments we could save/refresh the file, but as soon as we refreshed the query before saving we got an error message upon saving, saying that the “file wasn’t uploaded because we cannot merge changes made by xxxxxx”, followed by the requirement to either “save a copy” or “discard my changes”. Do you have any idea?
    Thank you so much !

    1. I realise this doesn’t help, but this is a OneDrive issue. I actually hate OneDrive with a passion. Give me Dropbox any day! I can’t suggest anything other than to save first. One thing you could do is to make it a VBA macro workbook and add a button to save and refresh. This would prompt people to click, ensuring the required steps are executed in the required order.

      1. Hi Matt, thanks for looking into my question, but I may not have been clear. We can simultaneously enter comments in the Power Query table and can save those comments, but as soon as you run the query (before or after saving) to link the comments to the other data in the table, you can’t save anymore without getting the error message. So it really seems to be the result of the query refresh (which itself works) that cannot be saved. I could write a macro that ensures the required steps are executed in the right order, but I’m not sure it will work, as it really seems to be the end result of running the query that cannot be shared. As long as you work alone in the file it works perfectly. I don’t have enough Power Query knowledge to understand what triggers this and whether there is anything I can do about it. Thanks again!

        1. Hi Matt, I did some further testing . if multiple users are co-authoring the file and user 1 has not saved his comments yet before user 2 saves his comments (and thus also refreshed his sheet with potential saved data from other users) and refreshes the query and saves the end result, the unsaved comments entered by user 1 are wiped out when he clicks the save button. I guess the save action refreshes his view with the end result of the query by user 2, instead of saving his own comments. Any thoughts for a potential solution ? Thanks a lot !

          1. I can’t think of a simple solution to this other than to force a single person to edit at any one time, maybe with check in/check out the documents or turn off the collaboration features.

  14. Hi. I group, index, filter a top 10, then sort by my variable gross value$. Using the old method it used to keep my list sorted correctly, now I cannot see any pattern and it appears random. Any ideas?

  15. George Johnson

    Hi Matt,

    Followed your tutorial but I’m getting the duplicate columns issue. Tried deleting the second one that comes up but it keeps reappearing and the data that I’ve put in the first one shifts over. Does this still work or is there an alternative?

    Excel Version: 365 MSO 16.0.12730.20144 32 Bit



    1. It definitely works – I did it just the other day when recording a training video for Power Query Academy at Skillwave (in case you are interested in more comprehensive training)

      I did notice a small change that could be the issue. After you merge the comments table, Power Query renames the new merged column that you extract and calls it comments.1 You need to edit the rename step so it keeps the original name without the .1

      1. I tried it and also get the duplicate column, in which I delete after refresh reappears. I am wondering if there is a trick for this now. Office 365 ver 2004 (build 12730.20352)

  16. Hi Matt
    This is a very useful article, thank you for that.
    I have followed your implementation steps and the only difference on my side is that after merging the queries and loading table 2 back to Excel I end up with 2 “comments” columns as opposed to one as in your example. Is this expected in the latest version of Excel?
    Claudio Lopes

    1. The behaviour has definitely changed a few times over the years. Have you tried deleting the new second column and then refreshing again? That certainly used to work. Please let me know. Also which build version of Excel you have

  17. Hi Matt

    Great article, thank you. Quick question: If the text in the comments box is a formula, is it possible to retain this formula on refresh?

  18. Matt,
    Can’t thank you enough for this clear explanation.
    I went through this solution on many discussion forums / sites – but was finally able to grasp only after reading the way you have explained!

    A major roadblock is cleared with this. I wonder why don’t Microsoft Technical Team make it a native functionality in their Excel programming. This is something that is obviously needed by many people.

    Thanking you a million times.

    Best Regards

  19. This is a great tip! I wonder though– is it possible to also have dynamic columns that also preserve data even if renamed?

    This is what I have now, using Role and Group tables to define a RoleGroup matrix:

    let
        Role = Excel.CurrentWorkbook(){[Name="Role"]}[Content],
        Group = Excel.CurrentWorkbook(){[Name="Group"]}[Content],
        RoleGroup = Excel.CurrentWorkbook(){[Name="RoleGroup"]}[Content],
        #"Merged Queries" = Table.NestedJoin(Role, {"ID"}, RoleGroup, {"ID"}, "Removed Columns", JoinKind.LeftOuter),
        #"Expanded Removed Columns" = Table.ExpandTableColumn(#"Merged Queries", "Removed Columns", List.Skip(Table.ColumnNames(RoleGroup), 2)),
        #"Removed Other Columns" = Table.SelectColumns(#"Expanded Removed Columns", {"ID", "Client Internal Role"} & Group[Group LongName], MissingField.UseNull)
    in
        #"Removed Other Columns"

    I can add, remove, and rename rows (Roles here) while preserving anything in that row in the RoleGroup matrix (provided the ID doesn’t change), but I can only add and remove columns. If I rename a column (Groups here), the column is renamed (or more accurately, replaced), but the data in it is lost.

    Can you assist? I think the solution would be generally applicable to a lot of use cases. Thanks!

  20. Yes, the blue table is the source data created by converting the data into an Excel table. The second (green) table is the one generated by PQ. To move the green table, right click on the query in the query pane on the right, select “load to” and then specify the location.

  21. Thanks for this tip. Do you know a simple way to change the “comments” column name without losing your data? When I change the column name, it breaks the connection and then the self-referencing query. Even if I manually change the connection (using advanced editor) to the new column name (let’s say “Changed name”), when the query is refreshed, all of the data is lost.

    The general context for the question is that I am aggregating data from multiple bank accounts for analysis. This data set may change as my analysis progresses. Each transaction is given a unique ID in the original account transaction table. I pull each account table into a separate PQ and then append them all together in a master list (called pq_AllTrans). I then want to be able to analyze the data as I see fit (including adding and removing analysis columns, changing their names, etc.). It is critical that my analysis survive any changes to the underlying data set.

    1. It is hard to say. My guess is that it will work as long as you change the column name in the original query(ies) before you load the table. Once you have loaded the table in Excel you won’t be able to change it prior to reloading it.

  22. Hi!
    I think I will be able to use this method as a solution to my problem. Although, I don’t understand this part of your method: “Note: Prior to completing the self referencing steps below, I used Power Query to create the second table from the source table. This was my starting point.” Could you tell how did you do this? Is the “second table” you are referring here the green table named 2. Power Query Table? How did you get the two tables side by side in the same sheet?
    Thanks in advance.

  23. Hi Matt,

    thanks for this smart solution.
    I’m identifying one problem with office 365.
    When I sort the table or enter a new key in the middle of the source table I need to refresh the query two times before the new key is visible or the sorting is correct.
    Do you have a hint why this is happening?

    1. Sorry, I don’t have any tips. My best guess is the order in which the queries are executed is back to front, hence why the second refresh fixes it. Currently there is no way to control which queries are executed in which order natively in PQ. If it is a big issue, I guess you could write some VBA to refresh one at a time, in the correct order.

  24. Hi, I tried this trick because I needed a kind of backlog tracking of comments, but during the self-join (I have > 6 000 rows) the file inflates to a gigantic number of MB of memory. I think the recursive step is probably killing the process by creating a table at each step. Nobody in the comments has reported this kind of problem, which is strange. Maybe I am doing something wrong. I also tried duplicating the query instead of loading from the table.

      1. Hi there,
        In my experience, 6k rows would only create a problem if there are complicated transformations beforehand.
        Buffering could solve the problem. So please “wrap” the first table expression (“PreviousStepName”) in a Table.Buffer like so:

        Table.NestedJoin(Table.Buffer(#"PreviousStepName"), {"ID"}, ImportData, {"ID"}, "ImportData", JoinKind.LeftOuter)

        Duplicate keys could also slow down a join (although it shouldn’t be a problem with 6k rows). But anyhow, removing duplicates would solve that problem like so:

        Table.NestedJoin(
            Table.Buffer(#"PreviousStepName"), {"ID"},
            Table.Distinct(ImportData, {"ID"}), {"ID"},
            "ImportData", JoinKind.LeftOuter)

  25. What about appended tables? I have 5 sheets that are merged into 1 query. When I use your method it works. But when additional data is added to some of the sheets it doesn’t work anymore.

    So how to achieve this:
    i have 5 sheets with data that will constantly add new data. I want 1 query that will append this into one table but with function that i can add comments.

    1. Well you need a unique key on each row to make it work. If you inserted new rows, as long as the keys don’t change and the new rows get new keys, I don’t see why it wouldn’t just work.

  26. I desperately need this to work for me – but I don’t get the same results??
    I’ve created a simple table in Test.xls, Then loaded in Test 2.xls using Power query.
    Added the additional column, loaded from table, merged with itself etc.
    But see the behaviour in this video I posted on YouTube.

  27. Thank you for sharing. I think I encountered the same issue as others here: if I delete the new column, the columns are independent again and re-sorting the data leads to a mess. So I kept (and hid) the added column and it seems to work 😉

  28. But did you really have this idea rolling around in your head for a year prior, Matt? You totally stole the technique AND the name from me. Sorry, but this has been bugging me for years lol

  29. Hi, Matt! It’s a really powerful trick, thanks a lot! I have another issue like this that I can’t handle. I need to change an already loaded value (let’s say the date in your table 1) and that new value should be fixed. Maybe you could give me an idea of how to solve it.

      1. Hi, Matt, scenario is simple. I have a raw data from a DB. And I know that this field in this row is incorrect. I haven’t an opportunity to correct DB, but PBI model should be correct. I need a way to make changes on fly either it’s a fact-table or a dimension-table. Now I make an algorithm with row substitution from a table but it’s a little bit clumsy.

          1. Many thanks for reply, Matt! I’ve worked out such approach, but it would be great (in my dreams) to make it so easy as in your self-referencing solution (Just make changes in output table) 🙂

  30. Matt,
    In my sales data among Customer ShipTo names there is some redundancy, such as “ABC Cleaners” and “ABC Cleaners Inc”. I don’t have access to the primary source to remove duplicates there. Each month I get an updated file. I want to keep the original Customer ShipTo names, so I use your excellent method above to build a crosswalk/lookup table then build a relationship between [CustShipToName] (original w duplicates) in the original table to [CustShipToMergedName] in my lookup table. Works fine in Excel PowerQuery as you show here.

    Can this be done in Power Bi desktop? Imke posted that Power BI makes self-referencing tables easy, but what about your method of adding an editable column to a self-referenced table: can this be done in Power BI Desktop? I’m relatively new to Power BI, and haven’t found a way to easily edit individual cell values (without writing a new function step in PowerQuery for each edit) in the manner needed to update values in my crosswalk table. If I want to run my report again, this time merging sales of XYZ Co and XYZ Assoc because new information tells me they are same company, how best do I do this in Power Bi using your method.

    My workaround is to use Excel to create the lookup table, write the results to a separate CSV file, then add it to Power BI. Can you think of a better method?


    1. Good question Dan. There is nowhere in Power BI that has a “cell” reference where you can take an extract from a query and then manually add data into a new column. So you can do it the way you are doing it (which is not bad). Another approach I can think of (not quite as good, but not bad) is:
      1. Write an “enter data” query to create your substitute table and load it to Power BI.
      2. Write an audit query that returns the new values that you need to match that currently don’t have a substitute. Return those values to an audit table in Power BI.
      3. When you refresh, check this audit table. If it contains any values, copy the values to the clipboard.
      4. Go back into the enter data query, paste the new records at the bottom of the table and manually add the substitutes.
      5. Refresh and make sure the audit table is empty. You could even put a warning flag on your report that tells you when there are records in the audit table.
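
      A minimal sketch of the audit query from step 2, written as a left anti join in Power Query M. The inline `Sales` and `Substitutes` tables are hypothetical stand-ins for the monthly data and the “enter data” table; the column names are borrowed from Dan’s question and are assumptions, not a fixed layout.

      ```powerquery
      let
          // Hypothetical sample data standing in for the monthly sales query
          Sales = #table(
              {"CustShipToName", "Amount"},
              {{"ABC Cleaners", 100}, {"ABC Cleaners Inc", 50}, {"XYZ Co", 75}}
          ),
          // Hypothetical "enter data" substitute table
          Substitutes = #table(
              {"CustShipToName", "CustShipToMergedName"},
              {{"ABC Cleaners", "ABC Cleaners"}, {"ABC Cleaners Inc", "ABC Cleaners"}}
          ),
          // Distinct names that appear in the source data
          SourceNames = Table.Distinct(Table.SelectColumns(Sales, {"CustShipToName"})),
          // A left anti join keeps only the names that have no row in Substitutes
          Unmatched = Table.NestedJoin(
              SourceNames, {"CustShipToName"},
              Substitutes, {"CustShipToName"},
              "Match", JoinKind.LeftAnti
          ),
          // The nested column is empty for anti-join rows, so drop it
          Audit = Table.RemoveColumns(Unmatched, {"Match"})
      in
          Audit
      ```

      With this sample data only “XYZ Co” has no substitute, so it is the one name left in the audit table. In a real model the two inline tables would be replaced by references to the existing queries.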

      Could that work?

  31. Matt, my name is Samir and I live in Azerbaijan. You can’t believe it, this is what I have been looking for for many, many months. I also added comments to my Power Query results, and after sorting, the data moved. I could not understand what was going on. Now I understand why this occurs. You are fantastic!!!! Thank you very, very much!!

  32. I am trying to use this approach to update a table with daily Excel files containing closing stock price info, but something is wrong and at some point it doesn’t work. Let’s assume I start with a daily stock data file stored in a workbook, read from it and import the data into Power Query. I then clean my data and load it to a table. Then I create a new query from this table and add a new column using Date.ToText(#date(2017, 8, 11)), where essentially I tell Power Query that the specific file it will read from and add to the DB is for the 11th of Aug, then another one for the 14th, etc.
    Next step, per your post, I create a connection and then try to merge the original table with the new one and expand the date field only, but I get “null” for dates. Is this because I do not have unique IDs during the merge? Should I create an extra column with, let’s say, the stock ticker and the custom date, e.g. MSFT11/8/2017?

    thank you in advance for your help.

    best regards,


    1. Zaf, my technique has unique IDs, and the new data is manually added in Excel, not Power Query. It sounds like you don’t have IDs and are trying to add data automatically. If you can automatically identify the columns in Power Query before loading to Excel, and if you then manually add data (data, not formulas) in Excel, then I see no reason why it won’t work. If that is not what you are doing, then you will simply have to test it and see what you can work out.

  33. Thank you for the very useful post! I am trying to use its logic to get daily stock market data from the web and from another source (SAP BI). I am not sure, though, what the best way to flag the date would be, as in some cases I might pull the data on a later date (when downloading from SAP).
    1. Let’s assume I want to update a daily price table and, due to vacation, I delayed the update for a whole week. Once I start downloading the daily data, I want my query to read each day and add it to the DB.
    2. In the case of the web update, I link the original source to the website and need to revisit it daily so that it adds the new data and builds up some history. What is the optimal way to do this? Is there a way to schedule the data import on a daily basis even if I am not in front of the PC?

    thank you in advance for your support!

  34. Hi Matt.
    Wrote a post on Microsoft Communities. A lady replied.
    Her solution was to simply delete the additional columns that appear, denoted with the ascending number sequence.
    The query worked after that. Tried again from scratch and the numbers still show up again. At least deleting the columns fixes the issue.
    Guessing it is a bug or addon issue.
    Thanks again for a very useful post. Have implemented it at work on a multi-file and multi sheet query. It is working like a charm.

      1. I have Excel 2016 and last time I tried it worked fine. If you can share more information about the problem I would be happy to look

  35. Hi. Great and helpful post.
    Have it working, with one minor issue (tried 15 times mirroring your data and following exact steps).

    When completing the final step, after clicking load and close (after the merge), an additional Comments column shows up called “Comments2”. Everything works fine (sort and filter, open and close and the additional data remains within the relative record), but can’t get rid of (or hide) the Comments2 column.

    Also tried the same above while adding in 15 additional columns. The same thing happens where the column name is replicated and a 2, 3, 4…n is added on at the end of the table.

    Any suggestions? Thanks in advance.

          1. Hi Matt,
            Thanks for going through the extra work to make the video while testing the code. Much appreciated.
            Our steps are exactly the same. Couple of notes:
            In the video, at 1:54, you can see the table expand, then contract (image link below). It seems my version of Excel is not doing the contracting part (or removing the “Column2” column)

            Curious what your code looks like on the Comments query. Noticed after refreshing that the “Column2” column appeared.
            Image screen shots show the changes. Odd the code doesn’t change.

            Pre-refresh Code:
            Post-refresh Code:

            Excel: 2016 Version 1609 (Build 7369.2095)

            Looks like a tech support question for Microsoft. If I learn anything, will update the post. Thanks again for sharing this very useful query tool!!

            1. My code looks just like yours. Very strange. I even added a second column like you did and it still worked at my end.
              let
                  Source = Excel.CurrentWorkbook(){[Name="Data4"]}[Content],
                  #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Date", type datetime}, {"Amount", type number}}),
                  #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"ID"},Comments,{"ID"},"NewColumn",JoinKind.LeftOuter),
                  #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Comment"}, {"Comment"})
              in
                  #"Expanded NewColumn"

              Let me know if you hear something from MS

  36. Very nice blogpost!
    In order to prevent messing by accidental multiple loads you can add a Remove-Duplicates step on the index – just to be on the safe side 🙂
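
    A minimal sketch of such a Remove Duplicates step, assuming the index is the ID column of the Comments query. The inline table is hypothetical sample data, used only to keep the example self-contained.

    ```powerquery
    let
        // Hypothetical Comments table in which ID 1 was accidentally loaded twice
        Comments = #table(
            {"ID", "Comment"},
            {{1, "Rent"}, {1, "Rent"}, {2, "Groceries"}}
        ),
        // Keep one row per ID so the later merge cannot multiply rows
        Deduplicated = Table.Distinct(Comments, {"ID"})
    in
        Deduplicated
    ```

    In the workbook this would be a single extra step on the Comments query (Remove Duplicates on the ID column in the editor) rather than a separate query.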
