I had a question from one of my online Power BI Training students just this week who was struggling to import data from a large *.xlsb Excel file. I suggested she save the file as an XLSX file and try again – and that worked a treat. But the question is why? I had an email exchange to Miguel Llopis (from the Power Query team) and asked him why this was the case. As it turns out, the way Power Query connects to XLSB files and XLSX files is very different.
- The XLSX file import uses the OpenXML SDK to get into the actual data
- The XLSB file import (and XLS and some other non XML formats) uses the ACE OLEDB Driver (the one used to import Access data).
The bottom line is the OpenXML import approach is far more efficient than the one that uses the ACE OLEDB driver. While this may not be an issue with smaller XLSB files, it seems that it can be an issue for larger files. So if you have a choice, you could consider saving the XLSB file as an XLSX file instead. If that is not an option, you could consider writing some VBA code to save a copy when the XLSB file is closed.
Other Problems?
What other file formats have you had problems importing into Power BI? Please post your experiences in the comments below.
Files saved as read only and files with sensitivity flag of personal
It doesn’t help if you have macros in your XLSB as XLSX doesn’t support them.
Hi @Robert,
The suggestion to have a macro in the xlsb file that creates an xlsx copy and to have the power query connected to the xlsx is a good tip, which I’ve implemented.
The code I’ve used for saving as xlsx is as follows::
ThisWorkbook.Save ‘this saves the current xlsb,
‘then the next step saves a copy as xlsx (without the macros and in a different folder)
ThisWorkbook.SaveAs Filename:= _
“\\Destination_Folder\BackendReports.xlsx” _
, FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
ActiveWindow.Close True
Hope it helps
Israel Shavit
Not sure if it’s the same in Power BI – but in Excel this has a “nifty” side effect.
If the XLSB workbook in question is open and potentially refreshed or changed, the values returned by a query are not the ones on disk, but the actual real-time values in the currently open XLSB.
Hello Matt, I get the error message and it tells me that I need to install Access Database Engine in order to import .xlsb file.
It refers to this directory: https://docs.microsoft.com/en-us/power-bi/desktop-access-database-errors
Have you experienced downloading and trying out Access Database Engine for binary file and thus importing after to Power BI? I have put request to do that for my Computer Desk Help, but not sure.
Yes, this is a standard issue.with xls files. You need to use the Access data provider tool to import this file type. There can be issues installing the Access data provider that I cover here https://exceleratorbi.com.au/64-bit-power-bi-desktop-32-bit-office-can-do/
Hello Matt, I would like to understand more. I have small, medium, and large files geographically spread over a corporate network. How can I determine the best strategy regarding use of xlsx or xlsb files regarding performance. Are the file across a network copied to the local machine before a connection is made, or is the connection established across the network, working on the distant file? I was hoping that using xlsb files would be better as these are compressed and would result in better transfer to the local engine before processing. I need to stick with xls files but can change format (x or b). Thanks, Mark.
It will always be better to connect to files on a local machine than on a server. Servers will always be slower, and multiple servers will always be slower and harder (you have to manage the server names). I suggest using OneDrive or Sharepoint if you can. If you are refreshing locally on your PC, then sync the files to your local machine and refresh there. If you are using a gateway, better to connect directly to OneDrive (ideally) or Sharepoint. HTH
XLS file import from an ERP it is arround 200 files which daily imported. As i have already try in one file to open and save it, But it does not work.
So how should we overcome this situation with PQ. ?
I would expect that opening as an xls and saving as an xls should fix each file. if that doesn’t work you could try saving as xlsx. If that is too hard, you should log a ticket with microsoft and see what they say. But if your ERP is generating non-xls files with an xls file extension, you may not get any help.
It should work fine. Where do the xls files come from? One thing I know is that some third party systems create xls files, but they don’t strictly comply with the file format and power query wont read them. you could try opening them and then saving them as xls and see if it then works. if that fixes it, then it is an issue with the file format from the source system.
Hello Matt !
How can we merged “XLS” file from Power Query. It always give an error