Until recently, I had not spent a lot of time building Power BI Reporting solutions that have SharePoint Online as a data source (This article refers to SharePoint Online specifically, but I am sure the experience is the same with on premise SharePoint). Over the last month or so I have been working with a few customers that are using SharePoint for the storage of source files in document libraries. On the face of it, it seems like a great idea to leverage SharePoint as a storage location for CSV and Excel files.
- Everyone has easy access to the files for editing and storage
- SharePoint manages version control, check in, check out etc
- SharePoint can facilitate shared editing of files
- You can build a Power BI report that will refresh online without the need to install a gateway.
Unfortunately, despite the benefits, the experience is not great. Power BI performance with SharePoint as a data source is simply terrible. Ultimately, the problems come down to performance in 2 areas.
Issue 1: Refresh Speed
Firstly, there is the refresh performance. SharePoint has an OData API which is not very performant (same with most OData APIs, actually). When you try to refresh against SharePoint, you will find the refresh is going to be slow. Now that doesn’t mean it is not workable, because refresh happens in the background, after all. In my testing and practical experience, the performance of SharePoint refresh drops off rapidly as soon as you create a number of sub folders in a document library. Stated differently, if you have your files in the main folder of your document library the performance is not too bad, but as soon as you create a few sub folder branches, performance drops away rapidly. A Power BI workbook with a number of sources seems to take 4-5 times longer to refresh when compared to the same workbook pointing to a file server.
Issue 2: Query Editing/Maintenance Speed
This second issue is much bigger than the first issue. When you have many SharePoint connections in your PBIX file, the query editing experience is woeful. Power Query has a few tricks to improve the experience, such as keeping a cached copy of the data for preview purposes. Despite this, I have found that editing queries connected to SharePoint to be so slow that it is literally unworkable when compared working with local files on your PC.
Actually, there is one setting in Power Query that you can alter that can improve the situation a bit.
Go into File\Options and Settings\Options
Go to the current file data load (1 below) and turn off background data (2 below).
This is definitely worth doing, but in my experience, it didn’t solve the SharePoint performance problem. I lost so much time trying to build queries with connections to SharePoint that I simply had to find another way.
Sync Local Files Instead
The best alternative I have come up with is to sync the files from SharePoint to a file server using OneDrive.
- Use OneDrive to sync a copy of the files from SharePoint onto a file server that can be accessed via a Gateway. This can be the same server used for your gateway or a file server on the network.
- OneDrive keeps a local copy on the file server in sync with SharePoint.
- Load the files from the local PC rather than from SharePoint.
This is the best option but comes with some new problems.
- One of the benefits of storing your files on SharePoint Online is that you don’t need a gateway at all. If you take this approach, then you may need to arrange for a gateway to be installed.
- If you replicate the files to the gateway machine, you probably will need to have a different file location on your PC for development and then change the file path when you deploy.
- Thirdly, there is an issue (if not a bug) where you cannot connect Power Query to the root folder of a synchronised OneDrive Folder from SharePoint.
eg, if you sync SharePoint to you PC as
C:\Users\Matt\Excelerator BI Pty Ltd\Excelerator BI Pty Ltd Team Site – Documents
you can connect Power Query to
C:\Users\Matt\Excelerator BI Pty Ltd\Excelerator BI Pty Ltd Team Site – Documents\Sub Folder\
but you cannot connect to
C:\Users\Matt\Excelerator BI Pty Ltd\Excelerator BI Pty Ltd Team Site – Documents\
Just keep that in mind – I lost an hour or so trying to discover why it wouldn’t work.
Also note: If the files are to be stored on a file server, you need to use the full network path to the files, not the mapped drive version. E.g. something like \\fileserver\department\folder name instead of n:\department\folder name
There is another way
There is another way to solve the SharePoint problem. That is a bigger topic and I will cover that off next week. For now, I just wanted to assure you “it’s not just you”.