I’m always looking for more productive ways to work with Power BI and associated tools. As I covered in my article last week, development of Power BI reports using SharePoint for your file storage can be so slow that it is impractical. In this article I discuss the various options to speed up the process, plus I then show you how to develop your Power BI workbooks so they use a local PC copy of the files on your HDD (which is relatively fast), then switch to the SharePoint source when you are read to publish online.
Options to Improve the Power BI Authoring Experience using SharePoint File Storage
There are quite a few approaches to improving the Power BI authoring experience when using SharePoint as your file storage. Options include
|1. Sync the SharePoint files to a network connected storage location using OneDrive. Develop the Power BI report to point to this local file storage instead of SharePoint.||Relatively fast to develop.||Files are still on a network drive, so not as fast as local files.
Need to install a gateway when you otherwise may not need one.
|2. Use Web.Contents() to directly access the files as you would a web page.||Reportedly faster to load a file (I’ve never tried it).||I assume it doesn’t work for file combine.|
|3. Use Sharepoint.Contents() instead of Sharepoint.Files()||Reportedly faster to load a file (I’ve never tried it), but can still be slow to navigate.||I assume it doesn’t work for file combine.|
|4. Use Dataflows to map the raw file, then load to PBI from there.||Moves the problem to dataflows so you can then develop quickly from a tabular source.||Only really works for single files in tabular format.
Can’t use it for file combine or other complex branching queries done in dataflows unless you have Premium.
|5. Develop on local file storage but deploy with SharePoint storage||Best of both worlds||Requires some manual editing of queries prior to deployment.|
There may be other approaches I haven’t covered.
Develop Using Local Files but Deploy With SharePoint
There are benefits in each of these approaches as covered below.
|Local File Storage||Accessing the files is the most responsive, particularly during development and file combine tasks.||You don’t get access to all the benefits of SharePoint as your file storage solution.|
|SharePoint File Storage||Inbuilt version control
Check in and check out
Ease of access to multiple users geographically dispersed around the globe.
The ability to access the files from Power BI without the need for a gateway.
|The development can be slow because the refresh and preview process can be slow.|
The solution I now prefer to use (and that I am going to explain in detail here) is to develop your Power BI Reports using local files, then manually switch to SharePoint files prior to deployment. Doing it this way gives you the best of both worlds. To do this, I configure the report to connect to both a local folder and SharePoint folder in Power Query and toggle between the two depending if I am developing or deploying. But there are a few things you need to do in the process to make sure everything works properly. The file path structure for SharePoint is different to the file path structure for local files, so you need to use some caution.
Walked through Demo
To demonstrate the concept, I will show you how I went about doing a file combine using my approach.
I have 3 simple csv files. I have loaded them into a local folder on my PC (#1 below) and also onto a SharePoint folder (#2 below). Note, I went ahead and added some extra files too so I could demonstrate the filtering process (this is particularly important). Also note, the folder path on my PC and SharePoint are not the same, but I do have one sub folder that is the same – the “Consolidate CSV” sub folder containing all my files.
Copying the Files to Your PC
Now there are a few ways you can get a copy of the files from SharePoint onto your PC. The easiest (and my recommendation) is to use OneDrive to Sync the folders you need. That way your PC files will stay in sync with the SharePoint files and the sub folder structure will be the same. But you can also set up a local folder on your PC and simply place the files you need there manually. If you do it this manual way, I recommend that you use at least the lowest level subfolder name so it matches the lowest level SharePoint Sub Folder name exactly (spelling, spaces and capitalisation). This is needed later in the process to ensure that you are swap easily between local files and SharePoint files. It is not mandatory that you replicate the sub folder structure from SharePoint exactly, as long as you can identify your files using at least one sub folder.
Start with the Local PC Version of the Files
This approach is designed to make the development effort as fast as possible. It therefore follows that you should start the process using the local HDD version of the files.
The generic pattern to follow is:
Connect to your local master folder on your HDD.
- Turn off “enable load” and call it RawLocalFiles.
- Create a new query referencing RawLocalFiles and call it ActiveFiles and turn off “enable load”.
Create a new query referencing ActiveFiles and carry out the transformation you need (I will be doing File Combine in this demo).
- Make sure the query in step 3 does not have any absolute references to anything related to the local file path.
- Repeat for as many files as you need to process.
- Connect to your SharePoint master folder.
Turn off “enable load” and call it RawSharePointFiles.
- Make sure the list of files looks identical to the list in step 2.
- The sub folders can be different as long as there is at least 1 sub folder than you can use to uniquely identify the files you need (or a file name).
- When you are ready to deploy, manually edit the query ActiveFiles in step 3 to point to RawSharePoint Files.
In this demo, I used Power Query file combine to connect to my local HDD folder and combine the files. For those who are new to the Power Query Combine feature can read my article here.
Connect to your local master folder on your HDD
- Use Get Data\From File\From Folder
Browse to the main (top level) folder containing your source files on your PC.
- In my demo this will be a top level folder containing my file combine folder and some other files (to demonstrate the point). In your case, just connect to the top level folder of the files that you need to use. It is easiest if they are all in sub folders beneath that.
Click Transform Data – don’t click Combine yet.
- I turned off “enable load” and called it RawLocalFiles.
Note the shape and structure of the table above – I will refer you back to this later.
Create the ActiveFiles Query
- Right Click on RawLocalFiles and select Reference
- Rename it ActiveFiles
- Turn off “enable load”
Use Your Files as Needed
In my demo, I am going to do a File Combine.
- Right click ActiveFiles and select Reference
- Rename the query and call it “Data”
Filter the files so you are only referencing the files you need (for file combine in this case). In my case I filtered on the Folder Path, selected Text Filters, Contains, and typed the name of the sub folder I need called “Consolidate CSV”.
- It is essential that you do not use any backslashes in your filtering. If you need to use backslashes to get to the correct sub folder, I will give you a bonus trick and the end.
- Click the File Combine Button
Then I ended up with this.
Before moving on, I am just going to do some tidying up by grouping my queries together, like shown below.
Remove references to anything related to the local file path
This is critically important, otherwise everything will fail. I manually checked every single step in every query to ensure there were no references to the file path on my PC. It will depend on what you are doing. If you are referencing a file by name, it is normal for Power BI to add a named reference to the folder as well as the file name. If this happens, you must remove the folder reference.
The RawFiles queries can and should have absolute reference to the sources, but these are the only ones.
Connect to your SharePoint master folder
I created a new source, from SharePoint Folder. Note you must use only the URL part of the SharePoint folder as shown below, not the full path to the document library.
Power Query will show all the files in all of the document libraries on this SharePoint site. It is most likely you will not want them all. So next, I went to the Folder Path column (shown below) and filtered to the Sub Folder I needed called “Some Folder”.
At this point – stop and take note. The list of files below is almost identical to the list of files on my local drive shown earlier. But the folder structure is different on SharePoint and my local PC. This doesn’t matter. All that matters is that from here on, you only use relative references to your files by either using the name of the files (without reference to the folder) or buy using Contains Text as covered earlier, making sure you don’t reference backslashes or slashes.
I renamed this query as SharePointFiles and set it not to load.
Now there are two source folder queries, one called RawLocalFiles and one called RawSharePointFiles. Both queries present the same list of files to be processed.
Change the Source Before Publishing
So that’s it. You can now develop to you heart’s content with highly responsive queries. When you are ready to deploy, just edit the single line of code in the ActiveFiles query (1) so it doesn’t point to RawLocalFiles (2) but instead points to RawSharePointFiles (3).
Use a Parameter to Toggle?
Now at this point you may be thinking that it would be neat if you could configure a parameter to toggle between both. I agree – great idea. But in my testing I couldn’t get it to work. I tried creating a parameter and then a few ideas such as an IF statement to toggle the file source depending on the parameter value. It was all going swell until I published the report. Once it was published, PowerBI.com reported that I needed a gateway configured for the local files, even though they were never going to load. If anyone can work out a way to configure this with a parameter without needing the gateway, I would love to know about it.
Changing Slashes to Backslashes
As a bonus trick, you can actually replace the slashes in the SharePoint File path with backslashes. If you do this, you can use a filter on Text, Contains that uses a more absolute folder path, like “\main folder\sub folder\combine csv” for example. You need to flip the slashes in the SharePoint list for this to work.
What do you think?
Tell me what you think in the comments below. If you have a better way, I would love to hear about it. Especially if you know how to fix that parameter thing so you can use a Parameter to toggle the source without configuring the gateway.