This is a further (third) update/edit to an article I first wrote on 18th April 2017 and then updated on 30th Jan 2018. When I update my articles I simply change the original post with the latest data – that way only the current/latest version is available preventing people stumbling on old advice.
As Power BI becomes more and more pervasive in the business world, I am being increasingly asked “How can I build 1 master dataset, use it as the 1 version of the truth, but still allow people to do their own thing”? My concept of the Golden Dataset (you heard the term here first, in 2017) will allow you to do just that – read on to find out how.
The Problem – Too Many Data Models
Power BI has so many great features it seems almost sacrilegious to discuss problems, however one of the problems that can occur in the world of Self Service BI is the proliferation of different versions of essentially the same data model. This is made worse by the fact that you can publish the same data model to multiple workspaces at PowerBI.com. Note: data model and dataset are essentially the same thing – I refer to data model when using Power BI Desktop, but when it is published to PowerBI.com it is officially called a dataset.
If you think about it, Self Service BI using your own data models has the potential to be the antithesis of Enterprise BI. Instead of having one single data model that is used for all reporting (Enterprise BI), you can build individual bespoke data models each time you have a new problem. This is both powerful and problematic. If you are not careful, before too long you can have an unwieldy mess of uncontrollable and unmanageable data models that are essentially the same at the core but with different reports sitting over the top. My suggested solution to this problem is to create a “Golden Dataset” and use it as single master source in Power BI Desktop (and Power BI Service). It is a great approach. Let me explain.
What is the Golden Dataset?
The Golden Dataset is a master Power BI Data Model that is loaded to PowerBI.com with or without a report attached. (Actually currently you must have at least a single page in your report, even if it is blank, but you can delete it once loaded to PowerBI.com if you like). Once you have your Golden Dataset loaded to PowerBI.com, you can build as many new ‘thin’ reports as you need directly from Power BI Desktop by connecting Power BI Desktop directly to PowerBI.com with Get Data -> Power BI Service.
Note: A thin workbook is one that doesn’t contain a data model itself, but links into a central data model (in this case stored on PowerBI.com).
The advantages of taking this approach are:
- You only have one dataset/data model to maintain. This will use less space but also eliminate the proliferation of slightly different versions of the data model and minimise the effort to keep the models up to date.
- You can maintain and improve your single data model in the Golden Dataset master PBIX and make the changes available to all users and dependent reports.
- You (and everyone else with access rights) can simply build new ‘thin’ reports connected to the Golden Dataset as needed.
- Anyone building a new report from the Golden Dataset will be able to add their own “local” measures – local to their thin workbook.
- With the implementation of a suitable business process, the best local measures created by users can be retrospectively migrated into the Golden Dataset for the benefit of all.
- You get the benefits of an enterprise solution (kinda like using an SSAS Tabular data model) without having to invest in all that infrastructure.
This is how it works.
Publish Your Golden Dataset
There are actually a few ways you can do this. I learnt options 2 and 3 below the last time I updated this article (from comments from my blog readers).
Your options are:
- Publish your Golden Dataset direct to a Power BI App Workspace. 1 App Workspace has the only copy of the Golden Dataset.
- Save your Golden Dataset to OneDrive for Business and import to one or more App Workspaces from there.
- Save your Golden Dataset to SharePoint Online and import to one or more App Workspaces from there.
The advantage of options 2 and 3 above is that you can have 1 dataset, and then load it to many workspaces while still having a single dataset to maintain. If you set it up this way, you will end up with something like I have illustrated in the image below. The advantage – there is only 1 dataset to maintain – the Golden Dataset – but it can be deployed to as many different App Workspaces as needed.
Loading From OneDrive
To demo how this works, I first saved an Adventure Works PBIX to my OneDrive for Business folder on my PC. If you do this in your environment, you will probably want to save your PBIX to a shared OneDrive folder that is available to everyone that needs to use it from PowerBI.com. Or you can save it to a SharePoint document library that is available to others to access.
I then created 2 new Test App Workspaces at PowerBI.com. To load the PBIX file from OneDrive into each Power BI workspace, I simply
- Went to the workspace
- Selected Get Data
On the next screen, I couldn’t see my OneDrive for Business folder option (my personal OneDrive for Business folder). The OneDrive option (1 below) is actually a dedicated OneDrive folder for this App Workspace. But that doesn’t mean I can’t access files from my personal OneDrive for Business folder. To do this, I selected the SharePoint option (2 below).
When asked for the URL, I simply cut and paste the URL from my browser that points to my OneDrive account. You can tell by the address in the URL that OneDrive is actually integrated with SharePoint – which is why this works.
I then repeated the process for App Workspace Test 2, essentially giving me 2 “copies” of the one master Golden Dataset – one copy loaded to each workspace.
I stress again, in this demo I am connecting to my Golden Dataset from my personal OneDrive for Business folder. If you are doing this inside your enterprise, you would probably use a shared OneDrive folder accessible to those that need the Golden Dataset, or a SharePoint document library.
Making Changes to The Golden Dataset
The real benefits of this approach are realised when you make changes to the Golden Dataset. To demonstrate how this works, when I first saved the PBIX to my personal OneDrive for Business folder on my PC, it had a table called “test”. You can see this table visible at PowerBI.com in App Workspace Test 1 below.
I went to my PC (local copy of my OneDrive for Business), opened the Golden Dataset, deleted the test table, saved the file and waited. I checked OneDrive online and saw the updated copy of the file appeared almost immediately (keep in mind this test workbook is small, at just 2MB. A bigger workbook will take longer to upload). It took a bit longer for the Golden Dataset to flow through to the App Workspaces in Power BI. I seem to remember hearing that this replication takes place every hour or so (maybe if someone knows the exact cycle you could let me know in the comments below). This is automatic – you do not need to install a gateway to make this work.
Time to Create a New Thin Power BI Desktop Report
Time to create a new thin report using Power BI Desktop. To do this, I simply opened a new blank pbix file in Power BI Desktop, and selected “Power BI Service” as the data source (shown below).
If you are not logged in when you do this, you will be prompted to log in before proceeding. Once logged in, you will be presented with a list of datasets that you can access from your various workspaces (see my demo examples below).
In my example, I selected the Golden Dataset (Adventure Works Test) in workspace Test 1 shown above and then created a report from there. I saved the thin Power BI Desktop file with the name Golden Dataset Demo. The new thin workbook does not have a data view or relationship view, only the report view as shown below. Also the thin workbook doesn’t have a copy of the data, so it will only work when you have direct access to PowerBI.com (keep that in mind if you want access when you are out of Internet coverage).
As mentioned earlier, one of the benefits of this approach is that a user can connect to the Golden Dataset but still create local measures when needed. Here is my thin Power BI Desktop workbook below (#1 below). Notice I can right click on a table in the Fields List and create a new Measure (#2 below).
Any new measures you create in a thin workbook are stored as part of the thin workbook and are not visible in the Golden Dataset. (Note, you can’t create calculated columns or new tables, only measures).
If, down the track someone updates the main dataset with a measure with the same name as the one you created locally, your local measure will show an error in your thin workbook and the new measure will appear, working in place of your local version.
One of my readers, Sean commented below that it is difficult to write local measures if you can’t see the relationship view. This is an excellent point and it really needs to be addressed. I have created a new idea at ideas.powerbi.com to have a read only view of the relationships. Please vote for it here https://goo.gl/1XYdiu
Publish the Thin Workbook
I then published this thin workbook to PowerBI.com (the same workspace as before). I therefore have both a desktop file AND a new PowerBI.com thin workbook containing a report – both pointing to the Golden Dataset in the same workspace. After publishing I could see the new report (as highlighted below).
Power BI Service Report
Of course you can use the Golden Dataset to create a PowerBI.com service only report (as you can with any dataset). To do this, simply click on the dataset while in PowerBI.com (1 below) and then build your report as needed.
After you save the report (top right hand corner of PowerBI.com) you will see your new report in the reports list on the left
And you can download a desktop version of this report by clicking File\Download Report as shown below.
Switch Reports to a Different Golden Dataset
Now for some really good news. It is possible to repoint an existing thin Power BI Desktop workbook to a different dataset/workspace. Now of course you wouldn’t want to repoint a Sales report to a Finance dataset. So if you take this Golden Dataset approach and later change your mind which workspace to point to, you can easily repoint the thin workbook to a new dataset.
To test this out I did the following.
- I went to Edit Queries in my Thin Workbook
- Data Source Settings
- From there I was able to change the source and point the Thin workbook to another dataset.
Summary of Key Points
- You can upload a PBIX file to the service and use it as a Golden Dataset (one dataset to rule them all). You can also save the PBIX file to OneDrive or SharePoint with the added benefit that you can have many workspaces using the one Golden Dataset.
- You can then create as many new reports as you want directly connected to this Golden Dataset.
- Connect a new blank Power BI Desktop file effectively creating new thing workbooks.
- Directly create new reports online in the Power BI Service.
- If you create a thin workbook connected to the golden dataset and create a report, you can publish it to the same app workspace and it will work.
- Anyone with access to the workspace can download a copy of the thin report to their desktop and successfully interact with it on Power BI Desktop. If you later remove their access rights to the workspace, then the downloaded workbook will stop working – very cool.
- You can modify and update your Golden Dataset and republish at any time. Any thin workbooks that you have created connected to the Golden Dataset will be able to access the changes you have published –sweet again.
- Any user with access rights can build local measures to a thin Power BI Desktop workbook without impacting the Golden Dataset.
- With a suitable business process, end users can advise the owner (admin) of the Golden Dataset of any useful local measures that have been created. The administrator can then migrate these measures to the Golden Dataset for the benefit of everyone.
- If needed, you can repoint any thin workbook to a different copy of the Golden Dataset if needed. This can be good if you started your journey without using a Golden Dataset but now want to backtrack and sort this out.
Let me know what you think in the comments below. I also would love to hear any other ideas you have on how to get the most from the Golden Dataset.