New Power BI Reports from a Golden Dataset

This is a further (third) update to an article I first wrote on 18th April 2017 and then updated on 30th Jan 2018.  As Power BI becomes more and more pervasive in the business world, I am increasingly asked “How can I build 1 master dataset, use it as the 1 version of the truth, but still allow people to do their own thing?”  My concept of the Golden Dataset will allow you to do just that – read on to find out how.

The Problem – Too Many Data Models

Power BI has so many great features that it seems almost sacrilegious to discuss problems; however, one of the problems that can occur in the world of Self Service BI is the proliferation of different versions of essentially the same data model.  This is made worse by the fact that you can publish the same data model to multiple workspaces at PowerBI.com.  Note: data model and dataset are essentially the same thing – I refer to a data model when using Power BI Desktop, but once it is published to PowerBI.com it is officially called a dataset.

If you think about it, Self Service BI using your own data models has the potential to be the antithesis of Enterprise BI. Instead of having one single data model that is used for all reporting (Enterprise BI), you can build individual bespoke data models each time you have a new problem.  This is both powerful and problematic.  If you are not careful, before too long you can have an unwieldy mess of uncontrollable and unmanageable data models that are essentially the same at the core but with different reports sitting over the top.  My suggested solution to this problem is to create a “Golden Dataset” and use it as the single master source in Power BI Desktop (and the Power BI Service).  It is a great approach.  Let me explain.

What is the Golden Dataset?

The Golden Dataset is a master Power BI data model that is loaded to PowerBI.com with or without a report attached (currently you must have at least a single page in your report, even if it is blank, but you can delete it once loaded to PowerBI.com if you like).  Once you have your Golden Dataset loaded to PowerBI.com, you can build as many new ‘thin’ reports as you need directly from Power BI Desktop by connecting Power BI Desktop to PowerBI.com with Get Data -> Power BI Service.

Note: A thin workbook is one that doesn’t contain a data model itself, but links into a central data model (in this case stored on PowerBI.com).

The advantages of taking this approach are:

  • You only have one dataset/data model to maintain.  This not only uses less space but also eliminates the proliferation of slightly different versions of the data model and minimises the effort to keep the models up to date.
  • You can maintain and improve your single data model in the Golden Dataset master PBIX and make the changes available to all users and dependent reports.
  • You (and everyone else with access rights) can simply build new ‘thin’ reports connected to the Golden Dataset as needed.
  • Anyone building a new report from the Golden Dataset will be able to add their own “local” measures – local to their thin workbook.
  • With the implementation of a suitable business process, the best local measures created by users can be retrospectively migrated into the Golden Dataset for the benefit of all.
  • You get the benefits of an enterprise solution (kinda like using an SSAS Tabular data model) without having to invest in all that infrastructure.

This is how it works.

Publish Your Golden Dataset

There are actually a few ways you can do this.  I learnt about options 2 and 3 below from my blog readers’ comments the last time I updated this article.

Your options are:

  1. Publish your Golden Dataset directly to a Power BI App Workspace.  In this case a single App Workspace holds the only copy of the Golden Dataset.
  2. Save your Golden Dataset to OneDrive for Business and import to one or more App Workspaces from there.
  3. Save your Golden Dataset to SharePoint Online and import to one or more App Workspaces from there.

The advantage of options 2 and 3 above is that you still have only 1 master dataset to maintain – the Golden Dataset – but it can be loaded to and deployed in as many different App Workspaces as needed.  If you set it up this way, you will end up with something like I have illustrated in the image below.

image

Loading From OneDrive

To demo how this works, I first saved an Adventure Works PBIX to my OneDrive for Business folder on my PC.  If you do this in your environment, you will probably want to save your PBIX to a shared OneDrive folder that is available to everyone that needs to use it from PowerBI.com.  Or you can save it to a SharePoint document library that is available to others to access.

image

I then created 2 new Test App Workspaces at PowerBI.com.  To load the PBIX file from OneDrive into each Power BI workspace, I simply

  1. Went to the workspace
  2. Selected Get Data
  3. Files

image

On the next screen, I couldn’t see an option for my personal OneDrive for Business folder.  The OneDrive option (1 below) is actually a dedicated OneDrive folder for this App Workspace.  But that doesn’t mean I can’t access files from my personal OneDrive for Business folder – to do this, I selected the SharePoint option (2 below).

image

When asked for the URL, I simply cut and pasted the URL from my browser that points to my OneDrive account.  You can tell from the address in the URL that OneDrive is actually integrated with SharePoint – which is why this works.
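
For reference, a personal OneDrive for Business URL typically looks something like the example below (this is just an illustration – your tenant name and user name will be different, so always copy the actual URL from your own browser):

    https://yourtenant-my.sharepoint.com/personal/your_name_yourtenant_com/Documents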

image

I then repeated the process for App Workspace Test 2, essentially giving me 2 “copies” of the one master Golden Dataset – one copy loaded to each workspace.

I stress again, in this demo I am connecting to my Golden Dataset from my personal OneDrive for Business folder. If you are doing this inside your enterprise, you would probably use a shared OneDrive folder accessible to those that need the Golden Dataset, or a SharePoint document library.

Making Changes to The Golden Dataset

The real benefits of this approach are realised when you make changes to the Golden Dataset.  To demonstrate how this works, when I first saved the PBIX to my personal OneDrive for Business folder on my PC, it had a table called “test”.  You can see this table visible at PowerBI.com in App Workspace Test 1 below.

image

I went to my PC (the local copy of my OneDrive for Business), opened the Golden Dataset, deleted the test table, saved the file and waited.  I checked OneDrive online and saw that the updated copy of the file appeared almost immediately (keep in mind this test workbook is small, at just 2MB; a bigger workbook will take longer to upload).  It took a bit longer for the updated Golden Dataset to flow through to the App Workspaces in Power BI.  I seem to remember hearing that this replication takes place every hour or so (if someone knows the exact cycle, please let me know in the comments below).  This is automatic – you do not need to install a gateway to make this work.

Time to Create a New Thin Power BI Desktop Report

To create a new thin report, I simply opened a new blank PBIX file in Power BI Desktop and selected “Power BI Service” as the data source (shown below).

image

If you are not logged in when you do this, you will be prompted to log in before proceeding.  Once logged in, you will be presented with a list of datasets that you can access from your various workspaces (see my demo examples below).

image

In my example, I selected the Golden Dataset (Adventure Works Test) in workspace Test 1 shown above and then created a report from there.  I saved the thin Power BI Desktop file with the name Golden Dataset Demo.  The new thin workbook does not have a data view or relationship view, only the report view as shown below.  The thin workbook also doesn’t have a copy of the data, so it will only work when you have direct access to PowerBI.com (keep that in mind if you want access when you are out of Internet coverage).

image

Local Measures

As mentioned earlier, one of the benefits of this approach is that a user can connect to the Golden Dataset but still create local measures when needed.  Here is my thin Power BI Desktop workbook (#1 below). Notice I can right click on a table in the Fields list and create a new measure (#2 below).

Any new measures you create in a thin workbook are stored as part of the thin workbook and are not visible in the Golden Dataset (note: you can’t create calculated columns or new tables, only measures).
If, down the track, someone updates the main dataset with a measure that has the same name as one you created locally, your local measure will show an error in your thin workbook and the new master measure will work in its place.
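
To give you a feel for what a local measure might look like, here is a minimal sketch written against the Adventure Works sample (the table and column names – Sales, ExtendedAmount and Calendar[Date] – are assumptions and will likely differ in your own Golden Dataset):

    Total Sales (Local) = SUM ( Sales[ExtendedAmount] )
    Sales YTD (Local) = TOTALYTD ( [Total Sales (Local)], 'Calendar'[Date] )

These measures live only in the thin workbook; anyone else connected to the same Golden Dataset will not see them.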

One of my readers, Sean, commented below that it is difficult to write local measures if you can’t see the relationship view.  This is an excellent point and it really needs to be addressed.  I have created a new idea at ideas.powerbi.com to have a read only view of the relationships.  Please vote for it here: https://goo.gl/1XYdiu

Publish the Thin Workbook

I then published this thin workbook to PowerBI.com (the same workspace as before). I therefore have both a desktop file AND a new PowerBI.com thin workbook containing a report – both pointing to the Golden Dataset in the same workspace.  After publishing I could see the new report (as highlighted below).

Power BI Service Report

Of course you can use the Golden Dataset to create a report directly in the PowerBI.com service (as you can with any dataset).  To do this, simply click on the dataset while in PowerBI.com (1 below) and then build your report as needed.

image

After you save the report (top right hand corner of PowerBI.com) you will see your new report in the reports list on the left.

image

And you can download a desktop version of this report by clicking File\Download Report as shown below.

image

Switch Reports to a Different Golden Dataset

Now for some really good news.  It is possible to repoint an existing thin Power BI Desktop workbook to a different dataset/workspace. Of course you wouldn’t want to repoint a Sales report to a Finance dataset, but if you take this Golden Dataset approach and later change your mind about which workspace to point to, you can easily repoint the thin workbook to a different copy of the dataset.

To test this out I did the following.

  1. I went to Edit Queries in my thin workbook
  2. Selected Data Source Settings
  3. From there I was able to change the source and point the thin workbook to another dataset.

image

Summary of Key Points

  • You can upload a PBIX file to the service and use it as a Golden Dataset (one dataset to rule them all).  You can also save the PBIX file to OneDrive or SharePoint with the added benefit that you can have many workspaces using the one Golden Dataset.
  • You can then create as many new reports as you want directly connected to this Golden Dataset.
    • Connect a new blank Power BI Desktop file, effectively creating new thin workbooks.
    • Directly create new reports online in the Power BI Service.
  • If you create a thin workbook connected to the golden dataset and create a report, you can publish it to the same app workspace and it will work.
  • Anyone with access to the workspace can download a copy of the thin report to their desktop and successfully interact with it on Power BI Desktop. If you later remove their access rights to the workspace, then the downloaded workbook will stop working – very cool.
  • You can modify and update your Golden Dataset and republish at any time.  Any thin workbooks that you have created connected to the Golden Dataset will be able to access the changes you have published – sweet again.
  • Any user with access rights can add local measures to a thin Power BI Desktop workbook without impacting the Golden Dataset.
  • With a suitable business process, end users can advise the owner (admin) of the Golden Dataset of any useful local measures that have been created.  The administrator can then migrate these measures to the Golden Dataset for the benefit of everyone.
  • If needed, you can repoint any thin workbook to a different copy of the Golden Dataset.  This can be useful if you started your journey without using a Golden Dataset but now want to backtrack and sort this out.

Let me know what you think in the comments below.  I also would love to hear any other ideas you have on how to get the most from the Golden Dataset.


Comments

  1. Matt, yes, excellent, but I am chasing a different scenario. I want to query the “Golden” dataset from Excel/Power Query – not a live connection, but loading some data into a local Power Pivot model.
    Live query is nice and all, but it is a Pro feature and, as you have pointed out, it requires a shared workspace.

    I tried to play with the connection strings but no luck so far.

    But I agree it is a game changer for Pro users.

  2. Would you consider this a golden “Data model” instead of a dataset? I’ve usually referred to a dataset as a single table…

  3. “however from experience loading data from a SSAS model into Power Pivot is pretty painful.” Yes indeed, but it is one of the most cost effective ways to distribute large numbers of reports, of course assuming you have Excel 2013+

  4. Yes! It’s a game changer. An issue I have with a single model is that the number of measures has increased over time, which can be confusing for the developer and user. I am hoping that this will allow a “modular” approach, where each report can focus on a specific dimension table or on Time Intelligence, each with a more limited number of measures.

  5. The game changers are yet to come:
    1. Connect to the data model in a Power BI Desktop file from Excel (without all the hacks)
    2. Connect to an Excel/Access DB stored in personal OneDrive
    3. Password protect the queries / data model – to give some sense of security.

  6. Excellent new feature! Very well described. Casual users can benefit from Power users’ modelling skills much better now! It sounds like this is Microsoft’s goal for each enhancement of Power BI: Power users drive the development and production that Casual users can enjoy peacefully.
    When you talk about a “shared workspace”, it is actually an Office 365 Group accessible in any O365 application, right?

  7. You can create a content pack of the dataset and report that connects to the service and share the pack with the new group. The data refreshed in the original dataset and group is reflected automatically in the new group via content pack.

  8. My wish for the next advancement of this feature is to allow connections to multiple “golden datasets” in the same report, to allow the most flexibility in the data steward process, with PBI as the convenient single platform to manage the data modeling of various data sources.

  9. Take a bow, Sir!

    This is fantastic and I’ve felt in my bones I’ve needed to arrive at this place for a while now. I don’t have the facility to set up SSAS, nor do I have the inclination.

    This is a superb alternative and does the job very nicely indeed!

  10. Sorry, the reason I was turned off this idea in the past is that the measures in the golden model weren’t ‘exposed’ for usage. Is this now the case?

    They are when you use Analyse in Excel, so I’m hoping yes?!

  11. I like this idea, but have a hard time seeing how end users will be able to write anything other than the most basic reports and measures if they don’t understand the structure of the model. This puts a real burden on the model designer to pay a tremendous amount of attention to, for example, naming fields and tables to relay that information. Even then, I’d have a concern that they may create a measure that does not work as they expect and, as a result, report incorrect BI because of it.

    • The way I dealt with one of those issues was to duplicate some measures in the form of a ‘semantic layer’, if you will – a set of friendly names for users, completely separate from what I call the base measures.
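
      For illustration, here is a minimal sketch of that idea in DAX (the measure and column names are hypothetical – the friendly measure simply wraps the base measure):

        Base Sales = SUM ( Sales[ExtendedAmount] )
        Total Revenue = [Base Sales]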

  12. Not sure if this is already well known, but in my scenario I am making this Golden Dataset available to use in multiple App Workspaces, to maximize the application of the “single source of truth”. This is done by using SharePoint to house the Golden Dataset (PBIX), then in each of the App Workspaces using Get Data –> Get Files –> SharePoint Team Sites, then pointing to where the PBIX file is uploaded.

      • Hi Matt, I haven’t seen any documentation or blog from anyone on this scenario; it seems pretty basic so I just assumed it was common knowledge. The setup is pretty straightforward: using SharePoint as the cloud storage for PBIX files, and by using the Get Files from SharePoint method as I mentioned, in each App Workspace the Golden Dataset is now available for use, and subsequently in production Apps. Future updates to the Golden Dataset (data refresh, measures, relationships) will flow through to each App Workspace. If this is not common knowledge, it might be great if you can mention this in your blog post, and also recommend Microsoft add it to their documentation. What do you think?

        • I agree. I will check it out and update this article. Are you using SharePoint Online? I don’t use it much, so this is part of the reason I don’t know about it. Are you just sharing to a SharePoint Library?

          • Hi Matt, for smaller datasets I am using a data gateway to refresh the dataset in SharePoint. The dataset appears in the PBI Service the same way as if it were a direct publish from Desktop, and has the same refresh options in the Service.
            I have a couple of more complex datasets that the data gateway can’t handle just yet due to a timeout issue, so I am manually refreshing them in Desktop on a daily basis, then uploading the PBIX file to SharePoint. The SharePoint file will sync with the PBI Service within an hour, then the updated data will show up in the Service for the various Apps and Reports.

            Our team hasn’t moved onto more enterprise-grade platforms (such as Informatica or SSAS) for dataset management yet, so this solution with PBIX and SharePoint works well for now to manage our datasets.

  13. Maybe I’m missing something, but you can connect to a golden dataset if you’re starting fresh with a new PBIX; what happens if you already have a PBIX and you simply want to cherry pick some data / measures from another data model? My attempts at this proved unsuccessful (I cannot recall the message, something about multiple online sources) but I am happy to be proved wrong.

  14. Hi Matt,
    I think this is a really good idea and a great concept. But it feels like Microsoft has some way to go before this can be fully used in production. For example, in my case I have 7 similar reports, with only minor modifications. I want to base all the reports on the same data model, for all the benefits you mention above, but also to minimize the workload on the database.

    However, I want some differences in e.g. target values etc. that I wanted to set up as parameters. But to my understanding, this cannot be done when designing a “thin” report. I also have some measures designed as indexes, based on the target values. The only solution I can come up with is to hardcode the values into the measures in each report, but it doesn’t seem like a very sustainable alternative. Do you have any ideas how I can tackle this?

    • I’m not sure what you mean by “measures designed as indexes”. Yes, parameters are part of Power Query so they can’t live in a thin workbook. For your hard coded measures, how about you load a table of all the possible values, e.g. with 2 columns (Report Name, Value)? Place a report level filter in each report as appropriate to match each report. Then a generic measure myMeasure = SELECTEDVALUE(Table[Value]) should work for all reports, and you can just change the values in the source table when needed.
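
      For illustration, a minimal sketch of that pattern in DAX (the Targets table and its column names are hypothetical – adapt them to your own model):

        Target Value = SELECTEDVALUE ( Targets[Value] )

      With a report level filter on Targets[Report Name] set differently in each thin report, the same measure returns the right target in every report, and index style measures can then be built on top of [Target Value].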

  15. I voted for the idea to make a schema visible in these cases. In any case, a couple of best practice suggestions would be to (a) always avoid hiding key fields and (b) maybe also duplicate the golden model side by side as a fully imported dataset option. Too often analysts just want to make small variations but will abandon the golden model if they can’t, and then you have a dead end.
    By the way, I tested importing all tables directly into Power Pivot tables from a golden model and it works fine; just FYI, but again you would need the keys to enable relationships to be recreated.

  16. Nice article, Matt. I work in an organisation that hasn’t yet fully come to exploit what Power BI has to offer and hence I didn’t really have the chance to see this being used. I created something similar to what you have described before the connect to Power BI feature was made available.

    The inability to view the data model and the relationships will be a pain point when creating custom measures. You mentioned migrating measures to the Golden Dataset – is this something as trivial as sending an email to the “owner” of the Golden Dataset to add in the desired measures? Or is it something that will be uploaded to the thin report on PowerBI.com?

    Also, if the golden dataset has row level security enforced, would that affect creating custom measures in some way?

    Thanks

    • Yes, I think the process to migrate measures could be as simple as an email. The key is that everyone needs to know what to do. As for RLS, that is a great question – I have never tested this, but I assume that the new thin workbook would have the golden dataset RLS applied for the user of the thin workbook. Nothing else would really make any sense from a design perspective.

  17. Hi Matt,
    Great article.
    I just want to share a demo I have done so far based on your article.

    I’m always looking for this kind of golden dataset, but also golden reports (then customised for each app). So with the help of your article, I am finally “almost” there. I have tried going via the tabular model but all the reports disappear (pbix > Azure Analysis Services > obtain tabular model), so I don’t really like this method.
    So my approach to having a golden dataset and reports is:
    1/ I have a golden dataset with all the reports for different groups of users, but each group will use only a subset of reports and I don’t want to have them in different places, so I have only one pbix file (the viz works perfectly with RLS so I don’t need to apply a filter on groups of business users)

    2/ With the help of your article, the OneDrive master pbix file synchronises with its “Import pbix” files. But I don’t want them to sync (I don’t want the users to see reports that don’t belong to their business), so I turn off the OneDrive sync option and activate Scheduled Refresh.

    3/ By doing that, every time I want to modify the report, I can just go and modify the original and then activate sync, and at the end I will delete the non-necessary reports for this app.

    Let me know what you think about this approach!
    Thanks again for this great article.
    Dung Anh

  18. Is the only difference between storing the model (i.e. dataset) in the cloud (OneDrive/SharePoint) versus on premise with a gateway (when you want to publish to multiple workspaces) this: when changes are made to the model, the cloud options will automatically update the model in ALL workspaces, whereas the on premise version would have to be published manually to each workspace where the model has been previously published?
    There is no difference in the refresh scheduling, correct?
    For both on premise and cloud datasets deployed to multiple workspaces you still have to set up and maintain the refresh schedules in each separate workspace, correct?
    There is no difference in permissions to make changes to the dataset, correct?
    Since the users have to have edit permissions in the workspace, they can therefore download the dataset, make changes, republish and override it?
    Thank you.

  19. Thanks Matt … this is a very helpful article.
    I’ve been using a golden dataset for a while now – loading it to 3 different workspaces and having a separate refresh schedule in each workspace. Had no idea till I found your article that it could be a single dataset “serving” the 3 workspaces.
    I’m a bit unsure how this would work with refreshing the dataset via a PBI Gateway. Jerry’s comment earlier suggests this can work, but how would one configure this? Would you configure the dataset in all 3 workspaces to refresh via the Gateway? Or, if you setup the refresh schedule in just one of the workspaces, would all 3 workspaces reflect the refreshed data? Has anyone tried this?

    • When you load to OneDrive and then import to a workspace, the workspace has its own copy of the dataset (3 workspaces means 3 copies – hence 4 in total, including the OneDrive original). Power BI and OneDrive talk to each other: when the OneDrive version is updated, any workspace versions will automatically be updated. So you have 2 options: 1) refresh the OneDrive version and have it auto flow through, or 2) configure the gateway for all workspace copies (3 gateway configurations in your case). OneDrive doesn’t have the ability to refresh itself – the file needs to be “opened” and then refreshed and saved. You could do this with Power Update ($500) or just use the gateway on all copies.

    • Report Server is a whole different topic. You need 1) Report Server (of course) and 2) the specific edition of Power BI Desktop that is used to publish to the server.

      • I have both, the Report Server and the Power BI Desktop optimised for the Report Server.
        I am not able to import the dataset to the server.

        • Sorry for the confusion in my reply. You “publish” a report to Report Server by using the optimised Power BI Desktop version. But Report Server is not PowerBI.com, and I guess you can’t use this technique with Report Server.

  20. One thing that has worried me from the very beginning – once I change my position in the company, is there a way to smoothly transfer the dataset (either loaded to OneDrive or directly to my PBI account) to one of my colleagues’ accounts, without losing a highly developed set of workspaces, reports and group of recipients?

    • Always create a new App Workspace – that way admin rights can be shared and transferred. Use an internally public OneDrive or SharePoint folder to store the Golden Dataset. Do those things and all will be fine. Even if you do it in your personal workspace, Power BI admins have tools to make changes and repoint workspaces and workbooks.
