Importing XLSB into Power BI

I had a question from one of my online Power BI Training students just this week who was struggling to import data from a large *.xlsb Excel file.  I suggested she save the file as an XLSX file and try again – and that worked a treat.  But the question is why?  I had an email exchange to Miguel Llopis (from the Power Query team) and asked him why this was the case.  As it turns out, the way Power Query connects to XLSB files and XLSX files is very different.

  • The XLSX file import uses the OpenXML SDK to get into the actual data
  • The XLSB file import (and XLS and some other non XML formats) uses the ACE OLEDB Driver (the one used to import Access data).

The bottom line is the OpenXML import approach is far more efficient than the one that uses the ACE OLEDB driver.  While this may not be an issue with smaller XLSB files, it seems that it can be an issue for larger files.    So if you have a choice, you could consider saving the XLSB file as an XLSX file instead.  If that is not an option, you could consider writing some VBA code to save a copy when the XLSB file is closed.

Other Problems?

What other file formats have you had problems importing into Power BI?  Please post your experiences in the comments below.

Share?

Comments

  1. It should work fine. Where do the xls files come from? One thing I know is that some third party systems create xls files, but they don’t strictly comply with the file format and power query wont read them. you could try opening them and then saving them as xls and see if it then works. if that fixes it, then it is an issue with the file format from the source system.

  2. XLS file import from an ERP it is arround 200 files which daily imported. As i have already try in one file to open and save it, But it does not work.
    So how should we overcome this situation with PQ. ?

    • I would expect that opening as an xls and saving as an xls should fix each file. if that doesn’t work you could try saving as xlsx. If that is too hard, you should log a ticket with microsoft and see what they say. But if your ERP is generating non-xls files with an xls file extension, you may not get any help.

  3. Hello Matt, I would like to understand more. I have small, medium, and large files geographically spread over a corporate network. How can I determine the best strategy regarding use of xlsx or xlsb files regarding performance. Are the file across a network copied to the local machine before a connection is made, or is the connection established across the network, working on the distant file? I was hoping that using xlsb files would be better as these are compressed and would result in better transfer to the local engine before processing. I need to stick with xls files but can change format (x or b). Thanks, Mark.

    • It will always be better to connect to files on a local machine than on a server. Servers will always be slower, and multiple servers will always be slower and harder (you have to manage the server names). I suggest using OneDrive or Sharepoint if you can. If you are refreshing locally on your PC, then sync the files to your local machine and refresh there. If you are using a gateway, better to connect directly to OneDrive (ideally) or Sharepoint. HTH

Leave a Reply