Archive for Power BI

An Introduction to DAX as a Query Language

Level: Intermediate

I have written quite a few articles that contain embedded references to DAX Queries and DAX Table functions in the past.  Today I am writing an article that introduces the DAX query language from the beginning so you can learn how to use it and how it will help you to become a Power BI superstar.

What is a Query Language?

A query language is a programming language that allows you to send a request to a database and then receive data back in the form of a table.  If you only have a few hundred rows of data then this is not overly useful.  If on the other hand you have millions of rows of data then it becomes very useful.

The most important difference between DAX used in measures and calculated columns and DAX as a query language is that the query language always returns a table.  The resulting table will consist of one or more columns, zero or more rows, and data in zero or more of the cells.

Why Do I Need a Query Language?

When you create a data model in Power BI or Power Pivot for Excel, you are essentially loading tabular data into a dedicated reporting database for the express purpose of analysis and/or reporting.  Once it is loaded, you (the report author) then typically spend hours of effort building out the data model so it can be used to add value to your business/job.  Finally, once this is all done, the most common way to use your new data model is to build reports and pivot tables that present data to a user in a way that is easy to understand.

Below is an example of an interactive Power BI report that runs off a Power BI data model.

And here is a Pivot Table that runs off an Excel Power Pivot data model.

Both of the visualisations above are summarisations of the underlying data, and in both cases you can’t actually see any of the underlying data itself, just the summary.  The visualisation tools above are great and have enormous utility, however they are generally not the best tools if you want to see the underlying data itself or if you want to extract very large sets of data for whatever reason.

Once you have invested all this effort in building your data model, you are going to want to use it to the fullest extent. Power Pivot and Power BI can handle many millions of rows of data (unlike traditional Excel) and so the “old way” of simply looking at the underlying table is not really practical in this brave new world.

Show Me the Underlying Data!

 

If you take the pivot table I showed earlier, most Excel users would be aware that they can double click on any value cell (say 31.6%, representing Mountain Bike sales to people aged between 30 and 40).   When you double click on a traditional pivot table in this way, something magical happens – a copy of the underlying data is extracted from the source and appears in a new sheet in Excel.  Once you are done looking at the copy of the data, you can delete the new sheet and you are back where you started – life is good.

When you double click on a traditional pivot table in this way, you are effectively sending a query to the underlying database, and the query response is a table of data.

But Does it Work With Power Pivot?

In short, yes and no depending on what you are looking for.  The experience is different if you have an Excel Power Pivot data model or if you are connected to an external SSAS data source. In my experience the results are inconsistent at best and at worst it may not give you what you expect.  But the good news is that with DAX as a query language you can easily write any query you want and extract exactly the subset of data you need.  You no longer have to rely on double clicking your pivot table to interrogate your data.

Time To Learn DAX as a Query Language

OK, enough of the chatter, time to get serious about DAX as a query language.  You have invested time and effort in building your data model in Power BI or Power Pivot and now you rightly want to extract a subset of your data so you can take a look at it.  Let’s get started.

First Select the Right Tool

There are 3 general approaches you can take to extract a table of data from your fabulous data model.

  • Use a dedicated tool like DAX Studio or SQL Server Management Studio.  I wrote an article describing how to install and use DAX Studio here.
  • Write a “New Table” in Power BI Desktop.  I wrote an article about how to use this feature here.
  • Use an Excel “table hack” I learnt from Marco Russo to extract a table directly into Excel.  I will cover this hack later in this article series.

I am going to use DAX Studio in this blog article but you can use Power BI “New Table” if you want to follow along and you don’t have (or want) DAX Studio. In fact if you want to learn this stuff then you really should follow along using either approach.  In my experience you can’t learn new concepts simply by reading.  Please note however that using Power BI and Excel as a query tool should be considered as a temporary extraction of the data.  Once you have looked at the data you want, you would normally delete the table so you don’t end up storing redundant data (you can always extract it again if you need it).

Let’s Start With the Evaluate Statement

All DAX queries must start with the EVALUATE statement, with one exception: when you use the Power BI “New Table” button to write a query, you simply omit the EVALUATE statement.  This is no doubt another example of “syntax sugar”, where the developers shield the user from some of the more confusing syntax.  If you are following along and testing my queries below in Power BI, simply omit the EVALUATE statement when you write the queries.  Likewise, if you ever want to move a DAX Studio query into a table in Power BI, you can just remove the EVALUATE statement at that time.
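For reference, the general shape of a DAX query looks like this (the table and column names here are placeholders, not from any particular model):

```dax
EVALUATE
    'My Table'                      // any table expression: a table name or a table function
ORDER BY 'My Table'[Some Column]    // optional sort clause
```

In Power BI “New Table” you would write only the table expression, dropping both the EVALUATE and ORDER BY lines.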

Connecting DAX Studio to the Database

When I launch DAX Studio I am prompted to connect to my database. In this case I have an instance of Power BI Desktop with an Adventure Works data model running on my PC and DAX Studio gives me an option to connect to that.  You can also connect to Excel Power Pivot data models and Tabular SSAS servers.

image

If you want a more comprehensive overview of the DAX Studio UI, then read my other post I mentioned earlier.  Otherwise let’s get into it.

Return an Existing Table

The easiest DAX query you can write is to simply return a complete copy of an existing table.

When I run this query in DAX Studio (shown as 1 below) and then press F5 to run, the entire table is returned (shown as 2).  A total of 18,484 rows of data (shown as 3).
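The query in the screenshot is just EVALUATE followed by a table name – something like the following, assuming the Adventure Works Customers table (which happens to have 18,484 rows):

```dax
EVALUATE Customers
```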

image

This in itself is not overly useful as chances are you have this table already in some source system.

List of Unique Values

In DAX there are many functions that return tables instead of scalar values.  I covered this topic in depth last week so you can read up on that here if you like.

Let me now look at the ALL function.

When I execute this table function as a query, I can actually “see” the results of the query.  In this case the query returns a single column table consisting of 4 rows – all the unique product categories.
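The query would look like this, assuming a Products table with a Category column:

```dax
EVALUATE ALL ( Products[Category] )
```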

Next let’s look at VALUES.

In this instance, the VALUES table function returns exactly the same result.
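Using the same assumed Products[Category] column, the VALUES version is:

```dax
EVALUATE VALUES ( Products[Category] )
```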

image

Now let’s look at DISTINCT.

And same again here.
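And the DISTINCT version, again over the same assumed column:

```dax
EVALUATE DISTINCT ( Products[Category] )
```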

image

All three of these table functions return the same table in this instance.  In reality they all do different things in different filtering scenarios.  This is one of the great things about DAX queries: you can “inspect” what is happening to these table functions in different scenarios.  More on that later.

A Two Column Table

The ALL function can take 1 or more columns as an input (or a table).  In the example below I have used ALL to materialise a table of all of the combinations of Product Category and Product Sub Category.  There are 37 unique combinations in all.
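Passing two columns to ALL materialises every existing combination – a sketch, assuming the subcategory column is called Products[SubCategory]:

```dax
EVALUATE ALL ( Products[Category], Products[SubCategory] )
```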

image

A Filtered Table

Most people who know DAX will know of the FILTER function.  The FILTER function returns a filtered copy of a table – see the example below.

This is a much more interesting query because it returns a table containing all the products that have sold more than $200,000 of sales over all time.  There are 49 such products.
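The query is along these lines, assuming an existing [Total Sales] measure in the model:

```dax
EVALUATE FILTER ( Products, [Total Sales] > 200000 )
```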

image

Altering Filter Context

When you write a DAX measure or Calculated Column, there is only 1 way you can change the filter context and that is using the CALCULATE function.  CALCULATE returns a scalar value after applying filters.  CALCULATETABLE is the sibling of CALCULATE.  As you can probably guess, CALCULATETABLE returns a table after applying filters.

This next query returns exactly the same result as the last query.
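A plausible reconstruction of that query, wrapping the same FILTER inside CALCULATETABLE (table and measure names assumed as before):

```dax
EVALUATE
CALCULATETABLE (
    Products,
    FILTER ( Products, [Total Sales] > 200000 )
)
```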

image

The main difference with using CALCULATETABLE compared to just using FILTER is that CALCULATETABLE can return a different table to the one returned by the FILTER function itself.

The above query uses the same FILTER function from before, but instead of returning a table containing the products that sold more than $200,000, it returns all of the actual sales transactions that contributed to those products qualifying (see results below).
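That is, the first parameter changes from the Products table to the Sales table while the FILTER stays the same – a sketch, assuming a Sales table related to Products:

```dax
EVALUATE
CALCULATETABLE (
    Sales,
    FILTER ( Products, [Total Sales] > 200000 )
)
```

The FILTER over Products propagates through the relationship and filters the Sales rows returned.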

image

Combining Table Functions

Now that you see how some of the standard table functions work in a query, let’s look at combining them together into a single query.

This query first applies a filter on all products that have sold more than $50,000 (line 4 below), and then returns a list of unique product subcategories that contain these products.
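The query would look something like this (again assuming Products[SubCategory] and a [Total Sales] measure):

```dax
EVALUATE
CALCULATETABLE (
    VALUES ( Products[SubCategory] ),       // unique subcategories…
    FILTER ( Products, [Total Sales] > 50000 )  // …of products selling over $50,000
)
```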

image

One interesting thing in the query results above is that “helmets” is in the list (the only non-bike subcategory). To see exactly which products qualified, it is simply a matter of writing another query as follows.
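The screenshot query is not reproduced in text, but it would be something like this (the product name column is an assumption):

```dax
EVALUATE
CALCULATETABLE (
    VALUES ( Products[ProductName] ),       // swap the column being returned…
    FILTER ( Products, [Total Sales] > 50000 )  // …while keeping the same filter
)
```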

image

Note how I was able to reuse the original query structure to return the list of product names with just a minor change to the query.

More to Come Next Week

This is a big topic – too big for a single blog article.  This week I introduced the topic of DAX queries, I showed how you can write queries to return tables and how you can combine table functions together to interrogate data.  Next week I will introduce some new special query functions that extend the capabilities even further.

Power BI Skills – Thinking of Data in Tables

Level: Beginners

One very important concept that Excel users often struggle with when starting on their Power BI and Power Pivot journey is that they need to learn to think about data in tables rather than thinking about data in cells.  In the spreadsheet world every cell can be a different formula or value, but in the Power BI and Power Pivot worlds, it doesn’t work this way.

The Excel Way

The Excel way is very simple.  You load data into the spreadsheet – each cell in the spreadsheet contains one data element (one scalar value).  From there you can write a formula in a single cell (see cell E2 below).

Standard Excel

If you want the same formula to appear many times in your spreadsheet, you copy the formula down the page. The key point here is that every cell in the above example is either a scalar value or an individual formula that returns a scalar value. That is how traditional Excel works, and that is why we Excel folk are comfortable thinking about data in this way.

But What About Excel Tables I Hear You Say?

Yes, good point.  Sometime in the past Excel was given a table object as part of its suite of tools.  Many Excel users don’t know about these and hence don’t use them.  Excel tables solve quite a few issues compared with Excel cell ranges, including:

  • allowing you to write a single formula, and have that formula apply to the entire column in the table
  • extending the formulas in the table when there are new rows added to the table
  • extending a reference (such as a pivot table source reference) so that it will point to a table (that can change size and shape) rather than pointing to a static range of cells.

Below I convert the same Excel range from the first image into an Excel table.  Note how I now only need to add a single formula to create “Sales Inc Tax”.  The single formula now applies to the entire table.

Excel Tables

When you have an Excel table like the one above, you can then start to refer to the entire table or columns in other formulas in your spreadsheet.  See in the image below how the formula in cell G3 refers to a single column in the table.  As the table grows, the formula doesn’t need to be changed – it just works.

refer to columns

This use of tables in Excel as shown above is very similar to how it works in Power Pivot and Power BI.  You no longer think about clusters of cells and instead have to consider the entire table or column.

Step 1: Thinking of Data in Tables in Power Pivot

The key things you need to know as you learn to think of data in tables in Power Pivot are:

  1. Measures are written to operate over the entire data model.  You can (and normally do) get a different answer to the same formula depending on what filtering is currently applied to the columns, tables and visuals in your data model/report.
  2. You can’t directly reference rows in a table. The only way to reference specific row(s) is to first “filter” the table so that the rows you want to access are “visible” (and all the rows you don’t want are “not visible”).  After the filter is applied, then the measure or calculated column (mentioned in point 1 above) will work for the filtered copy of the table (it is a little more complicated than that for calculated columns but that is beyond the scope of this article).
  3. You need to learn to “imagine” what the table looks like with filters applied.  The reason you need to learn this skill is because many of the tables you will use in DAX formulas are ‘virtual tables’, that is the table is never materialised in a way that you can see it.  You need to learn to “imagine” what the table looks like in its filtered state.  I wrote a blog post at powerpivotpro.com that explains how you can use Power BI to physically materialise these virtual tables into temporary “test tables” to help you get started on your journey of being able to visualise what is happening.  Once you get the hang of it you will only need to materialise a table when you can’t work out why your formula isn’t working or if there is a specific question that is too hard to answer without looking (like question 5 that you will see later in this post).
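As a concrete illustration of point 3, here is the sort of “test table” you could create with the Power BI “New Table” button to materialise a filtered copy of a calendar table – a sketch, assuming a Calendar table with [Year] and [Month Name] columns:

```dax
Test Table =
FILTER (
    'Calendar',
    'Calendar'[Year] = 2012 && 'Calendar'[Month Name] = "Feb"
)
```

Once you have inspected the result, delete the test table so you don’t store redundant data.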

Here is some imagination practice for you

Imagine you have a calendar table containing every date from 1 Jan 2010 to 31 Dec 2016 as well as columns for [Day Name], [Month Name] and [Year].  It would look something like this.

image

This table of course would have more than 2,500 rows.  Picture this table in your mind (i.e. stop looking at the image above).  Now imagine what would happen if you applied a filter on the Year column to be equal to 2012 and another filter on the Month Name column to be Feb.  Once you have applied these filters in your mind, answer the following questions by referring to your imaginary table.

  1. What would the table look like now?
  2. How many rows are there visible in this table with these 2 filters applied?

Step 2: Thinking of Data in Columns

As well as mastering tables, you need to master columns.  Here are some additional questions for your imaginary table – this time the questions are about columns.

  1. How many unique values now appear in the Month Name column?
  2. How many unique values now appear in the Day Name column?
  3. What is the minimum date that exists in the Date column?
  4. What is the maximum date that exists in the Date column?
  5. What is the Day Name for the last date in this filtered table?

The answers are at the bottom of the page.  But do yourself a favour and answer each question yourself without cheating – this will help you understand how skilled you are with thinking of data in tables and columns using your mind’s eye.  In fact I don’t expect you to be able to answer number 5 without doing some research.

The reasons you need to think about columns of data are twofold:

  • Many of your formulas will operate over single or multiple columns.
  • Power Pivot is a columnar database and it is optimised for working with columns.

As a general rule, it is more efficient to apply filters to individual columns one at a time rather than apply filters to multiple columns at the same time.  Consider the following two measures.
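The two measures in the screenshot are not visible as text; a plausible reconstruction, assuming a Calendar table and an existing [Total Sales] measure, would be:

```dax
// Version 1: filters the whole Calendar table, evaluating both conditions row by row
Sales Feb 2012 v1 =
CALCULATE (
    [Total Sales],
    FILTER ( 'Calendar', 'Calendar'[Year] = 2012 && 'Calendar'[Month Name] = "Feb" )
)

// Version 2: two single-column filters, applied one column at a time
Sales Feb 2012 v2 =
CALCULATE (
    [Total Sales],
    FILTER ( ALL ( 'Calendar'[Year] ), 'Calendar'[Year] = 2012 ),
    FILTER ( ALL ( 'Calendar'[Month Name] ), 'Calendar'[Month Name] = "Feb" )
)
```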

The second formula is much more efficient because the two filters are applied to two separate columns, one at a time (combined in a logical AND), rather than asking Power Pivot to consider both columns of the table at the same time.

Note that the FILTER functions in these two measures both return a filtered copy of the calendar table.  You can’t see the filtered copy of the table, and that can make it hard to understand what is happening.  But if you learn to imagine what the tables look like in their filtered state, you will be well on the way to becoming a DAX superstar.

Step 3: Thinking of the Entire Data Model

The final thing you need to learn is that the entire data model operates as an enclosed ecosystem.  In the data model shown below, there are 4 lookup tables at the top of the image and 2 data tables at the bottom (from Adventure Works).

All 6 of these tables operate as a single system.  Filters applied to the lookup tables propagate downhill from the top (one side of the relationship) to the bottom (many side of the relationship).  Filters do not automatically propagate from the bottom to the top.  So once you learn to think about how filtering will affect a single table, you then need to take a further step to imagine how the filters will propagate to all other tables in the data model.

Answers to the Quizzes Above

Tables

  1. What would the table look like now?  It would still have 4 columns but now it has only 29 rows.
  2. How many rows are there visible in this table with the filters applied?  29.

Columns

  1. How many unique values appear in the Month Name column?  1 – Feb.
  2. How many unique values appear in the Day Name column? 7, the days from Sun to Sat
  3. What is the minimum date that exists in the Date column?   1 Feb 2012
  4. What is the maximum date that exists in the Date column?  29 Feb 2012
  5. What is the Day Name for the last date in this filtered table?  It is Wednesday, but this is a very hard question and too difficult to answer without either materialising the table or doing some research to check the day of week for the 29th Feb 2012. 

Download Report from Power BI Service to Desktop

I love to celebrate successes with the Microsoft Power BI team, but I am also not shy in shouting out when I am disappointed.  Back on 25th Nov 2016 I blogged about the then new “Save PBIX from Power BI Service”.  At the time I was disappointed in the way it worked because it didn’t really add any value that I could see.  But that now seems to have all changed.  Today I was delivering a training course and we looked at the “Download Report (Preview)” feature.  I am very wary about telling training participants definitively how PowerBI.com works, because I often get caught out – the way it worked last week can be different this week.  So I said cautiously, “This feature wasn’t very good last time I looked – but let’s see if it works now”.

image

I was very excited to discover that download report now works as I would expect.

Here is the Scenario that now Works

Create a desktop PBIX and upload it to the service, make changes to (or add) a report in the service, then download the changes back to the desktop.  Changes that you make in the Power BI Service in the cloud can be downloaded (and hence backed up) to a PBIX file on your desktop.  You are no longer locked into a scenario where you must author all reports in Power BI Desktop to maintain a backup.  This new feature allows you to share your reports with others using a group workspace, co-author a report together, and then download a PBIX version with all the new reports back to the desktop.

To test this out, I uploaded a PBIX file to PowerBI.com and then added a new report tab containing a new custom visual (shown below) to give it a good workout – I wondered if this would be downloaded.

image

I then downloaded the PBIX to my desktop using download report and opened it in Power BI Desktop.  I was pleased and excited to find the desktop version was identical to the one in the cloud, including the Word Cloud custom visual.

image

It is worth pointing out a few things that still don’t work

  • The Dashboards are not included in the PBIX.
  • You must upload a PBIX with a current version of Power BI Desktop for this feature to work. So if your PBIX file hasn’t been updated in the cloud for a while then now is the time to refresh it.
  • It doesn’t work for cloud-created data sets, only with PBIX files first created on the desktop.

Good job Microsoft.  I hope to see these other features added soon.

The Top 30 Reasons You Should Be Considering Power BI

I wrote this article to be a definitive list of the top reasons you should consider using Power BI in your organisation (in case you haven’t already made the decision to do so).  In fact there is only one thing that worries me about this list, and that is I think many people may be sceptical and think “this is too good to be true”.  But I faithfully and honestly believe every bit of the story is true.  In the future I think there will be those that moved early and quickly to adopt the new approach and are leading from the front, and there will be those that couldn’t, shouldn’t, wouldn’t, and got left behind in the dust.  But to be fair, one of the biggest issues a company can face is when it has large financial investments in a legacy BI solution.  If that is you, then the good news is that you don’t have to dump what you have to get started with Power BI as a self service BI tool.  And if Power BI is a raging success as a self service tool, then you have effectively hedged your bets by starting small and then (hopefully) scaling quickly when the business case supports it.

Strap yourself in – here are my top 30 reasons you and your company should be considering Power BI as a self service and/or enterprise BI solution.

1. It’s Microsoft (the new Microsoft)

No one can deny the strength of Microsoft in the corporate world of business software and IT solutions.

 

Microsoft hasn’t always been great at listening to specific customer concerns, but it has always been great at building and delivering world class robust commercial software.  Power BI is part of that heritage but better, for all the reasons I cover in this article.

When I say “the new Microsoft”, I am talking about Microsoft under the leadership of Satya Nadella, and when it comes to Business Intelligence I am talking about the changes brought about under the leadership of James Phillips.


Satya Nadella


James Phillips

You can read more about Satya Nadella’s success at Microsoft here.   And here is an article in which James Phillips (and Satya) talk about what they are doing with BI at Microsoft: gigaom.com/2015/01/27/microsoft-throws-down-the-gauntlet-in-business-intelligence/

Both of these first class leaders have brought change to Microsoft like you have never seen before.  Who would have thought that SQL Server would ever run on Linux – well it does now.  Who would have thought that Microsoft would ask its customers what was important and then prioritise these things for the new tools?  Well it does now.  Do not underestimate the importance of a powerhouse like Microsoft as the owner and developer of Power BI.  Microsoft has deep pockets and it knows how to win with business.  I personally wouldn’t want to be competing in this space against Microsoft in the next 10+ years.

2. Power BI is built from the ground up on SSAS heritage

Power BI is a greenfield product but it has an enormous heritage.  Microsoft has been one of the most successful companies in the BI space, leading with SQL Server Analysis Services Multidimensional and to a lesser extent with Microsoft Excel.

When it came time to leverage the new column store database technologies to build Power Pivot, Microsoft called on its crack expert SSAS team to build Power Pivot from scratch, leveraging the skills, knowledge and experience of a pedigree race horse.  It is unlikely that a stand-alone company could start from scratch like Microsoft has and hit a home run first batter up.

3. Power BI is built from the ground up on SSIS heritage

Just as Power Pivot has been built from scratch using new technologies but with a pedigree, Power Query (the self service ETL tool – Extract, Transform and Load) has also been built from scratch. But once again Microsoft has wheeled in its crack SQL Server Integration Services (SSIS) team, which has a long heritage of successful ETL software development.  Do not underestimate the benefits of getting this part of the puzzle right the first time, and Microsoft has already proven it is winning in this space.

4. Power BI will ingest data from virtually any source

Most of the major data connectors are already built and deployed in production. There is also a “catch-all” ODBC database connector, and new (less mainstream) connectors are being developed all the time.  In addition there are many cloud based connectors that allow users to easily connect to cloud content like Google Analytics, SalesForce.com, and lots more online software service providers.

image

5. Power BI is highly compressed and performant

Power Pivot (the data modelling engine in Power BI and Excel) is a highly compressed and performant columnar database.  It uses modern tabular database technologies (built from scratch) to create a highly compressed database that loads fully into memory for rapid performance.  It is not unusual for a database to compress by a factor of 6x to 12x, meaning a 1 GB database will compress down to an operating size of between 85 MB and 170 MB.  The performance impact of this compression is amazing, and it needs to be experienced to be believed.

6. Power BI has a brand new visualisation engine

There is no baggage to take along for the ride here – no sign of Silverlight for example.  Microsoft has built a brand new HTML 5 compliant visualisation engine that is simply awesome.  Everything (virtually everything) will slice and dice and cross filter everything else (as shown below).

engine

If you can see a map of your country showing sales by region, you can click on a region to see the underlying details.  If you can see a bar chart that shows performance by year, you can drill down on the bar chart to see the monthly, weekly or daily breakdown. Everything is intuitive and simply works.

Gone are the days when you needed a specialised IT report writer to create every new report for the business.  Anyone that can create a chart in Excel can create a chart in Power BI (and probably some people that can’t create a chart in Excel too for that matter).  Unfortunately for professional report writers, I simply don’t see this role existing in the future as it will be the business users that build what they need.

7. Power BI also has open source visualisations

Microsoft realised early on (under that strong leadership mentioned earlier) that it can’t solve world peace on its own.  Microsoft has been focused with its efforts: rather than try to solve every visualisation problem for every company around the world, it decided to share its visualisation source code so that any capable developer can develop their own visualisations and share them with the community.

image

Developers can either copy an existing visual and extend it, or they can start from scratch and build something completely new.  The quality of what has been delivered already is amazing, and the party hasn’t even started yet.  You can see all the current custom visualisations here.

8. Power BI is built for users familiar with Excel, but it’s not Excel

This is self service BI at its best.  Who better to design the UI for the business user community than the company that built and owns Excel?  So while it is relatively easy for Excel users to get started, it is not Excel.  Microsoft has brought the rigour and structure from SQL Server Analysis Services into this product to ensure that it is a self service BI tool with enterprise strength features (including row level security, Active Directory integration etc.) while still being familiar enough to Excel users that they can and will embrace it.  I can hear a few companies cringing at this point as they think about the BI shelfware they have spent literally million$ on that is not being used.

9. Power BI is in the cloud…

Like it or not, the world is moving to the cloud.  This product was built from the ground up with the view of sharing via the cloud.  It uses all the modern HTML5 coding techniques to ensure compatibility with browsers across any platform.  It has industrial strength security wrapped around the cloud presence and it is definitely secure – Microsoft’s reputation is on the line here.  I realise it won’t work for every company, but frankly it will work for most.  Of course do your due diligence, but then get on board.

10. Unless it’s not in the cloud

Not every company is ready to move to the cloud, for whatever reason. In Jan 2017, Microsoft released a preview of Power BI for SQL Server Reporting Services (SSRS), aka Power BI On Premise.  The product is not quite ready for the mainstream yet but it will be soon (Q2 2017 is my guess).  Given the time it takes for many to deploy a BI solution, I feel confident anyone starting in Feb 2017 will be able to have a working on premise solution available by the time they are ready to go live.  And the desktop version of the software is free for everyone and is at least as good as (if not better than) Excel for BI analytics.  You can build and share PBIX files just like you can share XLSX files with any other user that has the desktop application.  And it is possible to install 64 bit Desktop even if you are locked into 32 bit Office.  Read about that here.

11. Power BI has mobile apps

One of the benefits of starting from scratch is you have no legacy to carry around.  Microsoft has built mobile apps for Apple and Android (and Windows mobile of course).

image

Under the leadership of Satya Nadella, Microsoft supports every platform, and it no longer uses its brand strength in one product to try to force participation in another.  The iPhone and iPad apps are simply great – did I mention they are free?  You can read more about mobile apps here.

12. You can subscribe via eMail – but please don’t!

I know, I know – some of you just want your static report to appear in your email inbox every morning.  OK, Power BI can do that.  You just go to the report tab you want to see, click “subscribe via email” and a nicely formatted PDF that is ideal for printing will be there in your inbox when you get to work.  But please please think about using the mobile apps instead.   These are sooo much better and have full interactivity.  All you have to do is remember to open the app in the morning.  At worst, subscribe to a report via email and when you see it in your inbox, simply open the app instead :-).

13. Power BI can be embedded into your own custom apps

Another benefit of building from the ground up is you can think about all sorts of issues when you design the architecture.  Microsoft has always planned to allow Power BI to be embedded into custom apps.  So if you want to build your own website and control user access via your own logon credentials, then that is fine with Microsoft.  Just embed Power BI into your application and away you go.  Same goes for mobile apps or anything else you can think of for that matter.  There is a dedicated licensing agreement specifically built for Power BI embedded that makes the product financially compelling.

14. Power BI is very cheap to buy

Free to try and indeed free to use as long as you want.  If you want to take advantage of more of the enterprise features including controlled sharing of data, automatic refreshes using gateways etc.  then you have to pay – wait for it – a whole US$9.99 per month per user.  It is enough to make the competition shiver in their boots!

image

Power BI Desktop is free to use on the desktop forever, and the cloud service is free for any user as long as they are not using the features mentioned above.  And frankly the professional features add huge value, so you really can see the benefits when you start to pass a small pile of pennies over to Microsoft.  And it is at least as good (if not better) than Excel in the BI space already – and it is less than 2 years old as a product.  Do you think it will get better in the next 2 years?  You bet your life (or better still your position as a corporate decision maker) it will!

15. Power BI is easy to use (and hence cheap to train users)

Anyone that can click on a button can interact with a Power BI report.  It is intuitive and easy to learn for anyone that wants to consume the content interactively.  The training costs of this product are low making the total cost of ownership very compelling.  And that is before the product even gets used and helps users make decisions to drive business results.

16. Power BI has a modelling engine that power users can learn

In addition, Power BI is enabling self service BI for a new wave of “developers” from the business community via the Power Pivot technology.  No longer is developing a cube the sole domain of an MDX programmer.  There will still be a need for IT cube developers in many companies, and of course they can now develop in Power BI too, however competent Excel users can now also do the work.  Power BI has a language called DAX (Data Analysis eXpressions).  This is a functional language that can be learnt by Excel professionals, hence it doesn’t have to be the IT pros that do all the work now.  And as a bonus, the DAX language in Power BI is also available in Power Pivot for Excel, so Excel users get to learn a new super language that they can leverage in multiple places.
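To give you a feel for the language, here are a couple of simple DAX measures (the Sales table and its column names are illustrative, not from any particular model).  Note how the first one reads much like an Excel formula, just operating over a whole table column:

```dax
-- A simple measure: add up one column of the (illustrative) Sales table
Total Sales = SUM ( Sales[Extended Amount] )

-- DAX also has iterator functions with no direct Excel equivalent,
-- eg SUMX evaluates an expression row by row and then sums the results
Total Margin = SUMX ( Sales, Sales[Extended Amount] - Sales[Total Cost] )
```

An Excel professional can be productive with formulas like these very quickly, which is exactly the point being made above.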

17. Power BI has data loading tools that power users can learn

Power Query is so easy to use it is embarrassing to those skilled in SSIS.  I am not saying that we don’t need SSIS developers any more, as there will always be a need for that technology for large scale enterprise BI projects.  But it is no longer the only robust Microsoft technology out there, and any capable Excel professional can learn and use Power Query.  Power Query has a user driven interface that generates code (in the M language) to transform the data.  The steps are saved as a process that can be re-run over new data at any time.  As a bonus, Power Query is also identical in Excel, so once again you learn a single language that can be used in multiple places.  Did I mention that Power Query is so much more than just a self service ETL tool?  If you use Excel and you don’t use Power Query, you are missing out on something special.
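To illustrate, here is the sort of M code the Power Query user interface generates behind the scenes as you click through the steps.  This is only a sketch – the file path and column names are made up for the example:

```m
let
    // Each step in the Power Query UI becomes one line of generated M code
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source),
    Typed = Table.TransformColumnTypes(Promoted, {{"Amount", type number}}),
    Filtered = Table.SelectRows(Typed, each [Amount] > 0)
in
    Filtered
```

You rarely need to write this by hand – the UI writes it for you – but being able to read and tweak it is a very useful skill.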

18. Power BI is self service BI…

For the 2 reasons above (Power Pivot and Power Query), Power BI is self service BI personified.  It is everything you want self service BI to be and more.  Any competent Excel user can learn and use these tools and hence they can become semi self sufficient in their analytics and reporting needs very quickly.

19. Except when it’s not self service BI

But just because Power BI is a very capable self service BI tool, that doesn’t mean it can’t be an enterprise strength BI tool too – in fact it can be both at the same time.  Power BI has the foundation of an enterprise strength tool and a self service capable tool all at once.  And don’t underestimate the value in having ad hoc analysis and enterprise analysis sharing the same underlying technology – everyone benefits from this.  What’s more, I fully expect that in the future users will be able to extend deployed, standardised enterprise Power BI reporting by creating user specific measures that allow additional ad hoc analysis on top of the core (not here yet, but I can imagine it will come).

20. Power BI has consolidation dashboards

Power BI is among other things a data source aggregator. It doesn’t matter where your data comes from or how/if it interacts with other data you care about.  If it is data and you can report on it, you can include it in Power BI and pin your visualisations to a dashboard/cockpit where you can see everything you care about and nothing you don’t.  And if you want more detail, then just click on the tile and drill down into the report.

21. Power BI gets better every day – literally!

The Power BI service is on a continuous release cycle.  As improvements are made they are released into the cloud service.  Microsoft also needs to build parallel improvements to the mobile apps and to Power BI Desktop.  The release cycles for these products are monthly rather than daily, however the pace is still incredible.  Quite literally, today’s biggest complaint can disappear when you wake up tomorrow morning.  You can read about all the latest updates from the official MS Power BI Blog.

22. Power BI has a massive user community

Power BI is less than 2 years old at this writing, however there are tens of thousands of users in the community that ask questions and help out other like-minded people.  There are showcases demonstrating how people are already exploiting Power BI, and there are more than 100 user groups in major cities around the world.  On the user forum specifically, it is not uncommon for people to get answers to their questions in minutes (if not a few hours), and the content of their questions remains on the web for others to learn from.

image

And of course there are all the other community forums like powerpivotforum.com.au that provide help and solutions to users as well.

23. The Power BI user community decides what’s important

Did I mention Microsoft has changed?  Instead of building only what Microsoft thinks we need, it is asking the community what new features are most valued by the community.  Anyone can sign up and create and/or vote for new ideas on how to improve the product.

image

Of course Microsoft also needs to build out the foundation of the technology, but I regularly see highly supported features make their way into the core product…sweet!  You can add your own suggestions or vote for another here.

24. Power BI has a natural language query engine

This sounds like a gimmick but believe me it is super cool and it works.  You can ask natural language questions of a well designed Power BI data model, like “what were my sales yesterday by state?” or “how are bike sales trending vs last year?”, and Power BI will give you its best effort to present the answer as a table, chart or whatever makes sense.

QandA

If you don’t quite like the first response you can tweak the settings from the standard user interface.  If you do like the report then pin it to your dashboard and see it updated every day forward.  Believe me this is awesome.   Oh and the mobile apps have just had an update that allows voice recognition Q&A, so you can literally ask your phone/tablet for some information and it will get the answer for you.  Oh and it is integrated into Cortana on the desktop in Windows 10 too.

25. Power BI has quick insights for automatic analysis

There is a quick insights tool that will trawl through your Power BI reports (when you load them to PowerBI.com) and analyse the data for quick insights.  When I used Quick Insights on the Adventure Works demo database it returned almost 50 interesting facts about the data in my report.

image

26. Power BI seamlessly integrates with Excel (no need for export)

The best BI industry joke I know is this (thanks Rob Collie).

Q: What are the 3 most common buttons in any BI solution?

A: OK, Cancel and Export to Excel

But it really isn’t a joke – it’s true.  One of the biggest problems BI vendors have with their products is that business users simply are not familiar with them and don’t want to learn a new tool.  You can argue all you want that the users should go to training and learn the new tool, but let me tell you – they don’t/won’t do it.  So what happens when business users are under pressure and need data quickly?  They go into the expensive BI tool, grab the data, click “Export to Excel” and then do the manipulation and analysis inside Excel instead – you better believe it is true.

Power BI has 2 killer advantages here over every other player.  Firstly it doesn’t need to export to Excel; you just click “Analyze in Excel” and Power BI talks natively to Excel as if they were twins joined by an umbilical cord (this is literally true in fact).  Secondly, the look and feel within Power BI is so similar to Excel that many users will end up staying in Power BI rather than clicking “Analyze in Excel” anyway.  Over time this will become more pervasive.

27. Power BI seamlessly integrates with PowerPoint

The truth is that most businesses rely on PowerPoint to tell stories, and hence integration between Power BI and PowerPoint is highly desirable to many.  OK, PowerPoint integration is pretty basic today (you can just export reports to slides), but it works and it will only get better.  There is already a third party product that will let you embed Power BI reports directly inside PowerPoint so you can interact with the reports live during a presentation.  And Microsoft’s earlier experiments in this space (Power View) had strong PowerPoint integration, allowing in-presentation interactivity with live reports – it just worked.  There is no doubt native live integration of Power BI reports into PowerPoint is coming soon.

28. Power BI has R Integration

R is one of the most used open source statistics software packages in the world today.  Microsoft isn’t trying to compete with R.  Instead it has decided to integrate R directly into Power BI so you can use your R insights and packages directly with Power BI.  Read more about it here: powerbi.microsoft.com/en-us/documentation/powerbi-desktop-r-visuals/

29. Power BI has Esri Integration

Another great third party integration is Esri.  If you need more granular geo mapping, or if you need to (or already do) use Esri for routing and other geospatial work, then you can leverage your content, maps and data directly in Power BI.  Read more about it here: www.esri.com/software/arcgis/arcgis-maps-for-power-bi

30. Gartner Loves Microsoft

Last and by no means least is that Gartner rates Microsoft highly for its completeness of vision and ability to deliver.

Let there be no mistake here.  This rating is largely driven by the excellent work Microsoft is doing with Power BI.

Power BI On Premise – Not in the Cloud!

Microsoft has recently released a technical preview of Power BI that can be installed on premise inside a company firewall and the Power BI reports can be securely published over the Intranet (not Internet) via SSRS.  Today I am sharing my experiences setting up the technical preview for the first time.  I installed the software on my PC however if you are going to use it for your company, you need to install it on a server.  Time to get one of your friendly IT folk to help you when you are ready to set this up and demonstrate what can be done to your business leaders.

Guy in a Cube is Your Friend

Adam Saxton shares a truck load of information at his Guy in a Cube YouTube Channel.  I watched Adam’s video on how to install this technical preview and followed his instructions.  I won’t repeat all the details here as he has done a great job of that already, however I wanted to share my experience and observations.

My Install Experience

For the record, I installed the 64 bit version on my local PC.  I already had SQL Server 2012 installed as well as SSAS Tabular 2012.

1. I downloaded the Jan 2017 technical preview of Power BI Desktop (SSRS edition) PBIDesktopRS_x64.msi and SQL Server Reporting Services SQLServerReportingServices.exe (both available here).  1 minute to download both files.

image

2. I installed and ran Power BI Desktop installer for SSRS first.  After installing, I ran the new software.  It looks, loads and behaves just like Power BI Desktop except for the reference to SSRS.  There was no conflict with my existing copy of Power BI Desktop, so now I have both versions installed side by side.  I assume the special SSRS features will find their way into the main version of Power BI Desktop in time.

image

3. I then installed the SSRS Technical Preview software (I note it has a short life – 180 days, which is to be expected).  1 minute to install.

image

4. I was then prompted to configure SSRS.  I already have a SQL Server instance running on my PC (2012), so I used that (I didn’t need to download the SQL Server Evaluation Edition from the link shown above).

5. I then clicked the configuration button and gave it the name of my localhost server. I then got the following message.

image

6.  I then had to refer back to Adam’s video as I wasn’t sure what to do next.  Adam instructed to navigate to a URL in a browser (MachineName/Reports).  I found the exact URL for my SSRS server in the configuration tool as shown below.

image

Note you don’t need the :80 port number as port 80 is the default.  So for me the URL was simply

desktop-1sckel7/Reports

But when I went to that address in Google Chrome, it didn’t work.

image

On a hunch I decided to reboot my PC, then it worked just fine.

image

7. I noticed in the SSRS configuration manager that there is an option to register Power BI Integration.  This seems to be related to users being able to pin tiles from an On Premise Power BI report onto a Cloud based Dashboard, although I am not 100% clear on this.  I didn’t touch anything here and it all worked fine.

image

8.  This preview only works with a live connection to SSAS (Tabular or Multi Dimensional).  I have SSAS Tabular running on my PC but I didn’t have a Tabular Data Model I could use, so I decided to restore a Power Pivot model from Excel to my server.  I launched SSMS to connect to Analysis Services Tabular (2012 in my case).

image

I right clicked on Databases (shown as 1 below) and then selected Restore from Power Pivot (2 below).

image

I followed the instructions to restore the file.  At first I couldn’t restore the file due to a file access permission error. I have struggled with this forever (I am not a SQL Server Pro) but I quickly found this article that told me how to give SQL Server permission to access the folder that contained my Power Pivot workbook.

After restoring I couldn’t see the database.  I had to first right click (1 below), then select refresh (2 below).

image

9. Using my new version of Power BI Desktop, I created a new workbook and connected it to my SSAS server using connect live.

image

I created a quick demo report and then saved the report.  I navigated to the SSRS Reports Page in my Browser (shown below) and uploaded my new report.

image

Bingo – On Premise Power BI

And bingo – there it is, working through a browser like a charm.

image

Observations

  • The first report was slow to render.  The second and subsequent reports were fast.  This could be related to my SSAS Server on my PC not being hot.
  • Now when I double click on a PBIX file, the SSRS version of Power BI Desktop launches by default.  I tried to manually change this back but couldn’t get it to change.  I decided to reinstall the latest production version of Power BI Desktop (repair install) and that fixed it.  The SSRS version still worked but was now not the default.
  • The URL for the report page is in the format MACHINENAME/Reports/PowerBI/FileNameWithoutThePBIXextension.  As you can see in the image above, my file name was “Power BI On Premise” including spaces.  That meant the URL was actually desktop-1sckel7/Reports/powerbi/Power%20BI%20On%20Premise.  I hate all those %20s that indicate spaces, so I renamed the file using underscore characters instead of spaces, and that gave me a much neater URL.

image

Summary

All in all I have to say this is pretty exciting.  I have spoken to lots of current and potential clients that are not willing to proceed with a cloud based solution.  A production ready on-premise solution seems very close now – can’t wait.

Best Practices for Power Pivot, Power Query and Power BI

Level: Beginners

There are many best practices for Power BI, Power Pivot and Power Query.  I know these things so intuitively now that it is very easy to forget how I incrementally learnt these things along the journey.  Most of these things are not “right” vs “wrong” – they are more often simply “better”.  I thought there would be value in producing a page that clearly outlines the important best practices as a learning reference for everyone.  In all cases I have outlined why it is a best practice to help the understanding.  You would be well placed to adopt these best practices as part of your DAX journey.

Naming Conventions

Naming your Columns and Measures

  • Always write a Column in the format TableName[Column Name]
  • Always write a Measure in the format [Measure Name]

This is the foundation of all the DAX formulas you will write.  Both columns and measures use the same square bracket syntax.  It is technically possible to write measures and columns both including the table name as follows.

  • TableName[Column Name]
  • TableName[Measure Name]

These 2 examples above do not cause a problem because their names make it easy to know what they are.  But if you always place the table name at the front of every measure and every column, then it will be impossible to tell them apart by reading the formula.  Take the following example:

  • TableName[Total Sales]

Is the above a column or a measure?  It is impossible to tell unless you are using the best practice naming convention.

Measures and columns are very different in the DAX language.  It is essential that you can tell at a glance which is which.  In addition if you hard code a measure with the table name and then you later move the measure to another table, any formulas referring to this measure will stop working.
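Applied in a formula, the convention looks like the sketch below (the table, column and measure names are illustrative).  A reader can tell at a glance that anything with a table name prefix is a column, and anything in bare square brackets is a measure:

```dax
-- Column references always carry the table name prefix
Total Sales = SUMX ( Sales, Sales[Qty] * Sales[Unit Price] )

-- Measure references never carry a table name prefix
% of All Sales = DIVIDE ( [Total Sales], CALCULATE ( [Total Sales], ALL ( Sales ) ) )
```

Following this convention also means a measure keeps working even if you later move it to a different table.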

Give Tables a Single Noun Name

  • Don’t just accept the table name from your source system.  Preferably give the table a single word noun description/name.

Many BI data sources will have long table names like fctSalesTransactionsHistory or dimCustomerMasterFile.  This is a common practice in IT and is related to the Kimball dimension modelling methodology.  The problem is that with Self Service BI these table names, column names and measure names are more visible to business users than ever before.  Given many business users are going to be reading these table names, it is much easier for them to comprehend the “Sales” table rather than the “fctSalesTransactionsHistory” table.  In addition earlier versions of Power Pivot do not have fully featured intellisense – you are required to refer to columns by starting to type the table name from the beginning.  If every table starts with either fct or dim, you have just added 3 additional characters you have to type for each formula before Intellisense can help you.

Also PowerBI.com has a natural language query tool that allows you to ask questions of your data.  If your table names are real ‘words’, then you are helping the engine find what you are looking for.

Using Spaces in Names

  • Don’t use spaces in table names
  • Do use spaces in column names
  • Do use spaces in measure names

If you use spaces in table names you will be forced to add single quotes around the table name each time you refer to it in a formula.  This makes the code longer, harder to read and “untidy” (IMO anyway).  It is better to use underscore_characters or CamelCase instead of spaces (or better still use a single noun name as mentioned above).

Columns and measures always need to be wrapped in [Square Brackets] anyway, and hence adding spaces does not make the code any more complex.  Columns and measures are easier to read if they have spaces.
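To see why spaces in table names are a nuisance, compare the two styles below (the table and column names are made up for this sketch):

```dax
-- A space in the table name forces single quotes every single time you refer to it
Total Sales v1 = SUM ( 'Sales History'[Amount] )

-- A single noun table name keeps every formula clean
Total Sales = SUM ( Sales[Amount] )
```

Multiply that extra quoting across hundreds of formulas in a real model and the single noun name quickly pays for itself.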

Don’t Overly Abbreviate Business Terms

  • Give your tables, columns and measures descriptive business names without overly short abbreviations.

Firstly you should use language and abbreviations that are commonly used in your organisation.  So if “Year to Date” is commonly abbreviated to YTD, then for sure you can use this abbreviation in your measure names eg [Total Sales YTD].  However if you develop a new measure called [Total Sales Last Rolling Quarter] and this is not a common concept across the organisation, then you are just making it hard for yourself if you call your measure [Ttl Sales LRQ].  You will simply have people calling you asking what it means.

Secondly Power BI has a feature called Q&A that allows a user to ask a natural language question about data.

eg.  What were the total sales for bikes last year?

This natural language algorithm looks for matches between words in the question and the data model to help answer the question.  If you abbreviate your measure names to, for example, [TtlSales] instead of [Total Sales], you are making it hard for Q&A to do its work.  You can help Q&A using synonyms, but do yourself a favour and don’t over abbreviate your names.

Measures or Calculated Fields

  • Measures is a better name than Calculated Fields

The term measures has been a Microsoft term for a BI formula for many years.  In the first release of Power Pivot in Excel 2010, Microsoft adopted this term.  Unfortunately in Excel 2013 somehow a decision was taken to rename “measures” to be called “calculated fields”.  This was a bad decision and thanks to lobbying from many people Excel 2016 reverted to using the term measures (as does Power BI).  I always now use the term measures and never refer to Calculated Fields unless I am explaining to Excel 2013 users why they are stuck with a bad name.

Loading and Shaping Data

Push Shaping as Close to the Source as Possible

  • Wherever possible, you should do your data shaping as close as possible to the data source.

There are many ways that you can shape your data in the Microsoft BI stack.  Power Query is a great tool to reshape your data, however you can also use Power Pivot (calculated columns, filters on load), and Power BI also includes calculated tables.  And you can always write SQL code and paste that into the tools to extract the data that way.  The main problem with these approaches is that you are effectively hard coding a solution for a single data set.  If you want to build another data set in the future, the work needs to be done again (either copied or re-written).  The data shaping tools are designed to let you do whatever you need without having to rely on a third party – use them if you need to.  However if you have a common need for data in a particular shape, and you can get support (from IT or otherwise) to shape your data at the source so you can easily get what you need, then there is definitely value in doing that.

Shape in Power Query, Model in Power Pivot

Power Query and Power Pivot were built to do 2 completely different tasks.  Power Query is built for cleansing and shaping while Power Pivot is built for modelling and reporting.  It is possible to shape your data in Power Pivot (eg you can write calculated columns, and in the newer versions you can add calculated tables).  But just because you can do these things in Power Pivot doesn’t mean you should.  For example it is possible to write letters to people using Excel, but Word is a much better tool for that task (I knew someone that once did just that!).

Best practice is that you should use Power Query to shape your data before/during load, and then use Power Pivot for measures and reporting. I have deeper coverage on this topic here.

Use A Calendar Table

  • If you want to do any sort of time calculations, get a Calendar table

It is possible that you can analyse your data in a single flat table without using any lookup/dimension tables.  A Calendar table is a special type of lookup/dimension table because it can be used to perform time intelligence functions.  I have an article on time intelligence here and another on Calendar tables here.  Bottom line – get a Calendar table.
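If your source data doesn’t include a Calendar table, you can generate one.  In Power BI (and the newer versions of Power Pivot that support calculated tables) a sketch like the one below will do the job – the date range and extra columns are illustrative, so adjust them to cover your data:

```dax
Calendar =
ADDCOLUMNS (
    -- One row per day across the (illustrative) date range of your data
    CALENDAR ( DATE ( 2015, 1, 1 ), DATE ( 2017, 12, 31 ) ),
    "Year", YEAR ( [Date] ),
    "Month Number", MONTH ( [Date] ),
    "Month", FORMAT ( [Date], "mmm" )
)
```

Once the Calendar table is joined to your data table on the date column, the time intelligence functions (like SAMEPERIODLASTYEAR) have what they need to work.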

A Star Schema is Optimal

  • Power Pivot is optimised to use a Star Schema table structure

I have an in-depth article about star schemas here that you can read if need be.  I am not saying this is the only layout that will work, or that other designs will always be slow.  I am saying that if you start out thinking about a star schema and aim to build that design you will be well under way to success.  Two key things you should know.

  • Don’t just bring in what is in your source transactional database – that would likely put you into a world of pain.
  • There is no need to create a lookup/dimension table just for the sake of it.  If your sales table has customer name and you don’t care about anything else about the customer (eg city, state etc), then there is no need to create a lookup table just for the sake of creating a star schema.  If you have 2 or more columns relating to the same object in your data table, then it is time to consider a lookup table.

You Should Prefer Long and Narrow Tables

  • Short wide tables are generally bad for Power Pivot but long narrow tables are great.

image

There are 2 main reasons why loading data this way is a good idea.

  • Power Pivot is a column store database.  It uses advanced compression techniques to store the data efficiently so it takes up less space and so it is fast to access the data when needed.  Simplistically speaking, long narrow tables compress better than short wide tables.
  • Power Pivot is designed to quickly and easily filter your data.  It is much easier/better to write one formula to add up a single column and then filter on an attribute column (such as month name in the green table above) than it is to write many different measures to add up each column separately.
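The second bullet is easier to see with a sketch (the table and column names are illustrative).  With a long narrow layout you write one measure and let the attribute column do the filtering; with a short wide layout you are forced into one measure per column:

```dax
-- Long narrow table: one measure covers every month, filterable by [Month]
Total Amount = SUM ( Data[Amount] )

-- Short wide table: a separate measure is needed for every month column
Jan Amount = SUM ( Data[Jan] )
Feb Amount = SUM ( Data[Feb] )
-- ...and so on, twelve times over
```

The single measure also keeps working unchanged when new months of data arrive, whereas the wide layout needs a new column and a new measure every month.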

Only Load the Data You Need

  • Load all the data you need, and nothing you don’t need.

If you have data (particularly in extra columns) you don’t need loaded, then don’t load it. Loading data you don’t need will make your workbooks bigger and slower than they need to be.  In the old world of Excel we all used to ask IT to “give me everything” because it was too hard to go back and add the missing columns of data later.  This is no longer the case – it is very easy to change your data load query to add in a column you are missing.  So bring in all of what you need and nothing you don’t.  If you need something else later, then go and get it later.  It is even advisable to use a tool like PP Utilities to show you which columns are not in use so you can delete them.  Focus mainly on your large data tables – the lookup/dimension tables tend to be smaller and hence are generally less of an issue (not always).

Don’t use Linked Tables

It is possible to add your data to a table in Excel and then use a Linked Table to load it into Power Pivot.  You simply select the data, go to the Power Pivot Menu (1 below) and click Add to Data Model (2 below).

The trouble with doing this is that you end up with 2 copies of the data in your workbook. The Excel table is an uncompressed copy and then you also have a compressed copy inside Power Pivot.  In the example (shown as 3 above) there are many thousands of rows of data.  The uncompressed data can take up 6 – 10 times more space than the equivalent compressed data.  If you have a small table with a couple of columns and 10 or 20 rows then it is fine.  However if you have anything more than that you are better to have 1 workbook for your data and then import the table directly into Power Pivot without storing it in Excel at all.

Modelling

Avoid Bi-Directional Relationships

  • Avoid using the default bi-directional relationships in Power BI unless you need them.

image

Microsoft introduced bi-directional filter propagation in Power BI (it is currently not available in Excel).  This is a good feature for beginners and those that don’t really understand how the plumbing works.  But it comes at a cost:

  • The performance can be negatively affected
  • If there is more than 1 data table, there can be circular relationships created (just like in cells in Excel)

I recommend you turn your bi-directional relationships to single direction (double click to change) and only turn them back on if you really need them.

Measures are Better than Calculated Columns

  • Wherever possible, you should write measures in preference to calculated columns

I have been a strong proponent of this for many years.  It mainly applies to Excel users that don’t have any formal learning about database design.  It is very easy for an Excel user to think they should write a calculated column (because they know how to do that) and not a measure (because that is a bit foreign to an Excel user).  I am not going to cover this in depth again now as I have already covered it here.  Just do yourself a favour Excel folk and start with the assumption that a measure is what you should write unless you know why a calculated column is a better fit.

For the record there are times when a calculated column is the best option, but 99.9% of all use cases coming from new Excel users won’t need them.  The main exception is if you need to use the formula in a slicer to filter your data – then you will need a column.
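As a sketch of the difference (using an illustrative Sales table), the same business question can be answered either way, but the two approaches behave very differently under the hood:

```dax
-- Calculated column: computed row by row at refresh time and STORED in the model,
-- consuming disk space and RAM whether or not it is ever used in a report
Sales[Line Margin] = Sales[Extended Amount] - Sales[Total Cost]

-- Measure: computed on the fly only when it appears in a visual,
-- and it responds to whatever filters the user applies
Total Margin = SUMX ( Sales, Sales[Extended Amount] - Sales[Total Cost] )
```

If you are a new Excel user and you are not sure which to write, start with the measure.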

Store Measures in the Table Where the Data Comes from

  • You have a choice where to store your measures, so place them in the table where the data comes from.

Take for example a measure like this.

Total Sales = SUM(Sales[Extended Amount])

The data for this measure is coming from the [Extended Amount] column in the sales table.  You should therefore store the measure in the sales table.   If you place it in (say) the Calendar table, Excel will likely give you a warning similar to shown below.

image

This issue doesn’t occur in Power BI.

Break Measures into Interim Parts

  • Break your DAX problems into manageable pieces and solve each piece one at a time.

You wouldn’t use a single cell in a spreadsheet to build a financial model.  The cells are there to be used, and they make it easier to build a solution that meets your needs.  You should think the same way about measures.  The following formula is valid, however it is hard to read, write and debug.  It also repeats the same code multiple times throughout the measure.  Having said that, it will give you the % change in sales vs last year.

% Chg vs LY =
DIVIDE (
    SUMX ( Sales, Sales[Qty] * Sales[Unit Price] )
        - CALCULATE ( SUMX ( Sales, Sales[Qty] * Sales[Unit Price] ), SAMEPERIODLASTYEAR ( Calendar[Date] ) ),
    CALCULATE ( SUMX ( Sales, Sales[Qty] * Sales[Unit Price] ), SAMEPERIODLASTYEAR ( Calendar[Date] ) )
)

It is much easier to write interim measures and then join the pieces together to solve your problem.  Plus you get each interim measure available for reuse elsewhere in your model.  I am sure you will agree the following set of 4 measures are much easier to understand.

Total Sales = SUMX(Sales,Sales[Qty] * Sales[Unit Price])

Total Sales LY  = CALCULATE([Total Sales],SAMEPERIODLASTYEAR(Calendar[Date]))

Chg vs LY = [Total Sales] - [Total Sales LY]

% Chg vs LY = DIVIDE ( [Chg vs LY], [Total Sales LY] )

Don’t Break Calculated Columns into Interim Parts

  • It is good to have interim measures but it is bad to keep interim columns.

Interim measures are calculated on the fly, on demand; they take up little space and make it easier to write your DAX.  As with measures, it is easier to write calculated columns using interim calculated columns.  The problem, however, is that every column is pre-calculated and stored on disk, and each additional column makes the data model take up more space on disk and in memory, hence making it less efficient.  By all means write interim columns if you need to in order to create a calculated column (notwithstanding the earlier comments on columns vs measures), however once you have worked out the correct syntax, concatenate all the code into a single “Mega DAX formula” in a single column.  This is an Excel concept I learnt from John Walkenbach.

Other Advice

You Can’t Start to Learn DAX by Reading Alone

I say this up front in my book “Learn to Write DAX”.  If you think you are going to learn a new skill like Power Pivot, Power Query or Power BI by reading a book and not getting your hands into the system, let me tell you: you can’t.  The exception is if you are a professional SQL Server database user with a solid background in reporting and analytics database technology – then I am sure you can learn this way.  For the rest of us Excel folk, there is no substitute for practising what you read – so do yourself a favour.

Use 64 bit If You Can

This one causes a world of pain for many people.  Power Pivot is the ONLY MS Office product that can benefit from 64 bit, but unfortunately it is all or nothing.  Most organisations have deployed 32 bit Office and will be very reluctant to give you 64 bit Office.  You can read my article on this topic here, and also read up about my workarounds, including installing 64 bit Power BI Desktop with 32 bit Office and then using Power BI Desktop as a local server for Excel.

DAX Time Intelligence Explained

Level: Beginners

I help a lot of people on forums who ask questions about time intelligence for DAX.  If you are just starting out then the chances are that you may not even be clear what time intelligence is, and hence sometimes you don’t even know what to ask.  Often the question is something like “I want to show this year and last year on a chart”, or “total year to date this year compared with last year”, etc.  If you want to do any time-based comparison using DAX, Power Pivot and/or Power BI, then this article explaining time intelligence is the right article for you.

Definition of Time Intelligence

Time intelligence is the collective name for a set of patterns (DAX in this case) that can be used to solve time comparison problems.  Examples include comparing:

  • Same period prior month, quarter, year etc.
  • Same period next month, quarter, year etc.
  • Same period year to date compared with prior year, next year etc.
  • Rolling 30 days, 60 days, 90 days, 12 months etc.
  • etc. – there are many, many more

Time intelligence is used when you want to “time shift” any period with another period of time for comparison purposes or to simply display a different period than the selection.

Understanding Filter Context

Before you can understand why time intelligence needs a special approach in DAX, you first need to have a clear understanding of Filter Context.

Pivot Tables and Power BI visuals both slice data so that you can “drill” and “filter” to see a sub-set of data.  Take the image shown below.  On the left is a pivot table and on the right is a Power BI visual.

image

Starting with the Excel Pivot table on the left, every value cell in the pivot table has been filtered more or less by the Rows, Columns, Filters and Slicers that make up the Pivot Table.  The cell highlighted as 1 (above left) has the following filters applied.

  • Territory[Country] = “Australia” – this comes from the Rows in the Pivot
  • Calendar[Year] = “2003” – this comes from the Year in the slicer
  • Product[Category] = “Bikes” – this comes from the Category in the filter.

After these filters are applied, the calculation for the cell is evaluated and the answer $2,947,789 is returned to the cell.  Every value cell in the pivot table is evaluated in exactly the same way – including the Grand Total row in the pivot table.  In the case of the grand total row, Product[Category] and Calendar[Year] have the same filters, but there is no filter on Territory[Country].

On the right hand side in the image above is a Power BI visual.  Filtering in Power BI visuals fundamentally works the same way as a pivot table however there are more places for cross filtering to come from.  In the image above, the same filtering is applied as in the Pivot table but in the Power BI example the filters are applied in a different way.

  • Territory[Country] = “Australia” – this comes from the bar chart Axis
  • Calendar[Year] = “2003” – this comes from the Year in the slicer
  • Product[Category] = “Bikes” – this comes from the Category in the tree map visual.

When filter context is passed from a visual to the underlying data model, all the relevant tables are filtered before the calculation is completed.  Filter first, evaluate second is a fundamental principle for all DAX formulas.

The Time Intelligence “Problem”

Let’s assume you want to compare total sales on a particular year vs prior year.  One way to do this (in Excel) is to put the years onto Columns in a pivot table as shown below (a similar approach can be used in Power BI).

image

But doing it this way causes many problems, including:

  • There are years in the pivot table that you don’t want (eg 2001, 2004).  You will need to somehow manually hide or filter the ones you don’t need.
  • If you want to calculate the absolute change or % change year on year you will need to hard code some formulas in the cells next to the spreadsheet and hence they can’t be reused in other visuals later.
  • If you want to look at a different year you will potentially have to go through the process of doing the filtering and formulas again from scratch.

A better way to solve this problem is to select the current period (using a slicer or filter of some sort) and have a universal formula that returns the result relative to the selection.  So if you select 2003, the formula will automatically select 2002 for you.  If you select 2002, it will automatically select 2001 (and so on).

Filtering is Now Your Enemy

If you want to write a relative time intelligence formula, then the natural filtering behaviour of a visual becomes your enemy. Imagine a calendar table with a filter applied Calendar[Year] = 2003.  If you imagine what the filtered data model would look like “Under the hood”, it should be clear that the filtered calendar table will show rows starting on 1 Jan 2003 and ending on 31 Dec 2003 – it will contain 365 unique days for the single year 2003.  The filter has already been applied to the data model so only days in 2003 are available for the calculation – how then can the data model possibly access sales for the year 2002?  There needs to be a way to go back in time and fetch a different period (in this case the period is 1 year earlier in time than the selected period), yet the 2003 filter has already been applied preventing this from occurring naturally.  This is the reason why there needs to be a special set of time intelligence functions – to solve this natural filtering “problem”.

How Time Intelligence Functions Work

Time intelligence functions are specifically designed to solve the filtering issue described above.  All time intelligence functions execute the following 4 steps:

  1. First “detect” the current filter context to determine what the “current” selected period is.
  2. Then remove the calendar filtering from the underlying tables so that data for “all time” is once again available.
  3. Then perform a time shift to find a different period in time (as specified in the formula).
  4. Finally, reapply filters on the data model for the time-shifted period prior to calculating the final value.

Custom vs. Inbuilt Time Intelligence

There are 2 types of time intelligence functions – Custom and Inbuilt.  Inbuilt time intelligence functions are easier to write because they have been designed to protect the user (ie you) from the complexity.  I am not going to cover Inbuilt time intelligence in depth here because they are relatively easy to use.  See link at the bottom of the page if you would like to download the DAX reference guide I produced (including all the time intelligence functions).

The reason inbuilt time intelligence functions are easier to learn is that they are what is known as “syntax sugar”.  Microsoft has created these special functions to make them easier to write and easier to understand.  You follow the simple syntax and Power Pivot does the rest.  But under the hood, the inbuilt time intelligence functions are converted to the full syntax prior to evaluation.  Consider the following two examples (just to illustrate the potential complexity of the custom version).

Total Sales Year to Date (inbuilt) = TOTALSYTD(Calendar[Date],[Total Sales])
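And here is the equivalent custom version – a sketch, assuming a Calendar table with Date and Year columns:

Total Sales Year to Date (custom) =
CALCULATE (
    [Total Sales],
    FILTER (
        ALL ( Calendar ),
        Calendar[Year] = MAX ( Calendar[Year] )
            && Calendar[Date] <= MAX ( Calendar[Date] )
    )
)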

Both of these formulas return the exact same result, and in fact both use the same approach to calculate the result under the hood.  The only difference is that the inbuilt version is much easier to write because you (the DAX author) are protected from the full syntax.

The end result (regardless of which formula is used) looks like this in a Pivot Table.

image

Looking at the highlighted cells, even though cell 1 above is filtered to show only sales for the month of July 2003, the Time Intelligence function (cell 2 above) is able to display sales for the period Jan – Jul 2003.  It does this because the formula performs the following 4 steps.

  1. It first “detects” the current filter context to determine what the “current” selected period is.  In this case July 2003
  2. It then removes the calendar filtering from the underlying tables so that all data is once again available.
  3. It then performs a time shift to find a different period in time – in this case it holds the end date in the current filter context the same (31 July 2003) but shifts the start date back to 1 Jan 2003.
  4. Finally it reapplies filters on the data model for the time shifted period prior to calculating the final value.

How to Read a Custom Time Intelligence Formula

The custom time intelligence formulas can be daunting when you first see them – but actually they are easy to understand as long as you clearly understand the role of each part of the formula.  Below is the same formula again (from above) along with line numbers to make it easier for me to refer to.

image
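Here is the custom formula again, reconstructed with line numbers so the explanation below can refer to each part (the formula itself is the same sketch as earlier, assuming a Calendar table with Date and Year columns):

1  Total Sales Year to Date (custom) =
2  CALCULATE (
3      [Total Sales],
4      FILTER (
5          ALL ( Calendar ),
6          Calendar[Year] = MAX ( Calendar[Year] )
7              && Calendar[Date] <= MAX ( Calendar[Date] )
8      )
9  )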

Line 2 (CALCULATE) is the only function that can change the natural filtering behaviour coming from a visual – that’s what it does.   CALCULATE always filters first, evaluates second (as mentioned above).  So lines 5 – 8 are executed first (filters applied) and then the formula on line 3 is evaluated last.

Lines 4 – 8 (FILTER) are where the filtering occurs.  There are a few confusing things here.

  • Line 5 refers to ALL(Calendar) instead of simply Calendar.
  • Lines 6 and 7 seem to be evaluating against themselves – very confusing.
  • Line 7 starts with a double ampersand &&

Let me explain each line.

Line 5 ALL(Calendar)

The key thing to understand here is that the filter portion of CALCULATE always operates in the current filter context.  That means that the Calendar table in line 5 has already been filtered by the visual (eg the Pivot Table).  Looking back at the pivot table image above, this means that line 5 is already filtered by the pivot table, and hence the Calendar table only has dates from 1 July 2003 until 31 July 2003 available – all other dates have been filtered away.  The purpose of the ALL function therefore is to remove all filters from the current filter context.  If it didn’t remove these filters, it would not be possible to access dates outside of the month of July 2003.

Now they have all been removed, it is time to add back the date filters that are needed.

Line 6 MAX( )

Line 6 reads “where Calendar[Year] = MAX(Calendar[Year])” and hence it seems to be referring to itself. The way to read and understand line 6 (and 7) is as follows:

  • Whenever you see an aggregation function (in this case MAX) around a column, it means “go and read the value from the current filter context”.
  • Whenever you see a “naked” reference to a column (ie there is no aggregation function around the column), then it means “go and apply a new filter on this column in the data model”.

Taking these 2 rules, it should be clear that MAX(Calendar[Year]) in the current filter context is 2003.  Line 6 therefore says “Go and apply a new filter on Calendar[Year] to be equal to 2003”.

Note that you can use any aggregation function in these formulas that does the job.  In line 6, you could use SUM, MIN, MAX, AVERAGE or any other similar aggregation function and still get the same result.  You could also use VALUES or DISTINCT in the case of line 6, and LASTDATE in the case of line 7.  There is no right or wrong answer, simply think about the need and then find the best function to use.

Line 7 && and MAX( )

Line 7 reads “and Calendar[Date] <= MAX(Calendar[Date])”.  The double ampersand && is the inline syntax for a logical AND.  Using this knowledge and the same logic as with line 6, the way to read line 7 is as follows:

“AND also go and apply another new filter this time on Calendar[Date] to be less than or equal to 31 July 2003”.

With both of these filters applied in a logical AND, the Calendar table will contain all of the dates from 1 Jan 2003 until 31 July 2003.

The Result

Once the time intelligence function has been written, it can be added to a visual as shown below (Power BI Desktop).  Note how the YTD formula seems to “defy” the filter context coming from the visualisation due to the custom time intelligence function that has been written and explained.

image

The Trouble with Syntax Sugar

Syntax sugar is great because it makes otherwise hard formulas very easy to write with only a limited understanding of filter context, row context, filter propagation, context transition etc.  There are a few downsides, however.  The first is that you don’t get to learn these very important concepts, and hence you are delayed in becoming a true Power Pivot and Power BI ninja.  The second is that the inbuilt time intelligence functions only work in certain circumstances where you have a day-level Gregorian calendar.  Read more about that here: exceleratorbi.com.au/power-pivot-calendar-tables/

Granularity

I personally hardly ever use the inbuilt time intelligence functions unless my data is at a day level of granularity (which is rare), and I find the custom functions become easy to write with practice.  Custom time intelligence functions become much more important when your data is not at a day level of granularity.  Most of the work I do for clients is either weekly or monthly data.  If you are in this situation you could “trick” Power Pivot into treating your data as daily data by loading all data in any given week or month against the same date (eg the last day of the month) and use inbuilt time intelligence, however this is hardly best practice.  A much better approach, I think, is to write custom time intelligence functions as outlined in this article.  If you are going down the path of writing custom time intelligence, you should read my advice about adding an ID column to a calendar table to make custom time intelligence functions easier to write: exceleratorbi.com.au/power-pivot-calendar-tables/

Sales vs Prior Year

Time for a different example.  Now that I have covered how a custom time intelligence function works, let me show you a couple of inbuilt time intelligence measures that solve the original problem (Sales vs Prior Year).

Sales Prior Year 1 = CALCULATE([Total Sales],SAMEPERIODLASTYEAR(Calendar[Date]))

Sales Prior Year 2 = CALCULATE([Total Sales],DATESADD(Calendar[Date],-1,YEAR))

Both of the above formulas use inbuilt time intelligence functions (SAMEPERIODLASTYEAR and DATESADD), but they also use CALCULATE.  Now that you understand how time shifting works, it should be much easier to understand what is happening in these formulas.  Each of the inbuilt functions produces a table of dates that has been time shifted back by 1 year.  CALCULATE then takes this new table of dates, removes the current filter context from the Calendar table, and applies the time-shifted dates as the new filter before doing the calculation.  Once you have [Sales Prior Year] it is easy to write:

Change vs Prior Year = [Total Sales] - [Sales Prior Year]

% Change vs Prior Year = DIVIDE([Change vs Prior Year] , [Sales Prior Year])

So where can you find a list of all the inbuilt time intelligence functions?

 A Free DAX Reference Guide

One of my students at a recent training class asked me if there was a list of all DAX functions – kind of like a cheat sheet.  I wasn’t able to find such a thing, so I produced exactly that and I am making it available free to anyone who would like a copy here.

This reference guide covers all of the inbuilt time intelligence functions on page 14, as well as every other function across the language, all nicely laid out to make them easy to find.  You can download this reference guide below.  If you haven’t already done so, why not sign up for my weekly newsletter at the same time so you are kept up to date with my latest tips and tricks about Power Pivot, Power Query and Power BI.

Download the DAX Reference Guide Using the Form Below

Extract Tabular Data From Power BI Service to Excel

Someone asked me a question yesterday about exporting data from the Power BI Service into Excel.  There are a few options to do this however they all have their problems (these problems are not covered in great detail in this post).

  • Power BI has an inbuilt export data feature (there is an export limit of 30k rows for a free account and 150k rows for a Pro account, plus the connection cannot be refreshed).
  • Analyze in Excel (can only view the data in Excel via a pivot table, not a table of records).
  • Download the Power BI Desktop file to your PC and use DAX Studio (you have to download the entire model just to get the records you want, besides it may not even work at all).

I started to think that I may be able to connect directly to the Power BI Service using DAX Studio and then use the techniques I cover here to extract the records.  I tried this but couldn’t get it to work (please let me know if you find a way to do this).  I started searching and found this old article I wrote in 2015 that uses a technique I learnt from Marco Russo.  I have used a similar approach to create a direct query of a Power BI data model and extract the records directly into a table in Excel.

Edit: 11 Jan 2017 – see the bottom of this post to see how Chris Webb recommends to do this task – it is much easier :-).  But read on for some interesting learning.

Create a Link to PowerBI.com

The first thing I did was to select my dataset from the Power BI Service, clicked the Ellipsis (1 below) and then Analyze in Excel (2 below).

image

This downloads an ODC file to the downloads folder.  Mine was called Adventure Works.odc

image

Note: there have been many updates to the Analyze in Excel connectors since its launch last year.  If you haven’t updated the Analyze in Excel connectors for a while you would be well advised to do that first before attempting the rest of the procedure that follows.  This can be done by logging into Power BI, clicking Downloads (shown as 1 below) and then downloading and installing the connector software (2 below).

image

Edit the ODC File to Make it a Query

I then edited the ODC file using Notepad.  It looks like this originally

image

It is a bit hard to read the file as is, so I cut and paste the contents into an HTML editor to improve the readability, (as shown below) then pasted the formatted text back into Notepad.

image

As you can see above, the default command is “Cube” and the default text is “Model”.  The trick is to change the command to be “Query” and replace the text to be any valid DAX Query.

I changed lines 19 and 20 below as follows:

image

<odc:CommandType>Query</odc:CommandType>
<odc:CommandText>Evaluate Sales</odc:CommandText>

The Command Text I used above is one of the simplest DAX queries you can write.  All queries must start with the EVALUATE statement followed by any valid DAX code that returns a table of data.  The simplest way to do that is to just specify an existing table (Sales in this case).  If you are doing this, you can use any table name in your data model.
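If you only want some of the rows, you can wrap the table in a table function instead of naming it directly.  A sketch, reusing the Sales[Qty] column from earlier (your own column names may differ):

EVALUATE FILTER ( Sales, Sales[Qty] > 1 )

This returns only the sales rows where more than 1 unit was sold, which can dramatically reduce the amount of data pulled into Excel.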

I then saved and closed the ODC file.

Open the ODC in Excel

To open the ODC file in Excel, I then simply double clicked on the file.  This opened Excel and gave me the following security warning

image

I clicked enable, and there appeared the entire Sales table from my Power BI data model directly in a table in Excel.

image

Now that the connection has been established, the file can be saved and refreshed as needed in the future.  To refresh, simply open the file, right click on the table and click “refresh”.

Change the Query

So far this is pretty boring, albeit maybe useful for some people.  Much more interesting is being able to extract a subset of the data to Excel.  To do this you need to learn a bit of the DAX query language.  I am not going to go into the DAX query language in detail in this post, but below I show how to edit the current query and change it to a new query – in this case a query that will return the Total Sales by Invoice Number.

To edit the existing query, I simply right clicked anywhere in the table (1 below), then selected Table (2 below), then Edit Query (3 below).

image

This brings up a dialog that can be edited as needed.

image

For my demo, I replaced the command text with my new DAX Query as follows

image
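The query was along these lines – a sketch, where the invoice number column name is an assumption and [Total Sales] is assumed to be a measure already defined in the data model:

EVALUATE
SUMMARIZE (
    Sales,
    Sales[Invoice Number],
    "Total Sales", [Total Sales]
)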

After clicking OK, this query then returned a table of all invoice numbers and the total value of sales for those orders.  Interestingly the number formatting for Total Sales did not flow through to Excel as I expected it to.

image

Chris Webb’s Trick

Edit 11 Jan 2017. Chris Webb tweeted an alternate approach that is faster to execute. Simply create any pivot table using Analyze in Excel, drill through to the underlying table and then edit the query as before.

Further Learning

I may come back and blog about DAX as a Query language at another time.  Until then if you would like to learn what is possible, I recommend this video from Alberto Ferrari (although it is a bit advanced unless you have some solid understanding of tables and databases).

www.sqlbi.com/tv/querying-in-dax-tee2014/

Data Modelling with OneDrive Source Files

Last week I shared how it was possible to easily create an online survey using Excel for O365.  Once such a survey is created, the data can be imported into PowerBI.com and used to create a simple report.  The biggest issue with directly connecting to OneDrive like I showed last week is that you can’t do any serious data modelling.  You can’t write any DAX measures or join other tables into the data model etc.  Not being able to build a proper data model with my data set last week created the following issues (I am sure there are more).

  1. I couldn’t easily create a sort order column for my “how long have you been reading” question (hence why I needed to prepend a letter A, B, C etc to get the answers to sort).
  2. Without data modelling, there was no easy way for me to cleanse the data, eg some people said they came from “United States” and others said “US”, but that didn’t work in my map (the values had to be consistent).
  3. image

  4. I couldn’t write proper measures such as COUNTROWS(Survey) to find out how many surveys there were.  Instead I had to rely on “Count of Name”; if someone left the name field blank, it would not be counted.
  5. I couldn’t change the column names to be something more suitable for reporting – I was stuck with the actual survey questions in the visualisations.

All of these issues (and more) are solvable if you can simply build a regular data model, but this is not possible when you directly connect PowerBI.com to a flat table of data in the cloud.

This article explains how you can create a data model with a cloud-based service (like OneDrive) and also how I then went about improving the reporting from last week given I could create a proper data model.

Connecting Power BI Desktop to OneDrive

The alternative to connecting PowerBI.com to the OneDrive file is to use Power BI Desktop instead.

  • First build a data model using Power BI Desktop connecting to the cloud copy of the OneDrive file.
  • Publish the Power BI Desktop file to the service in the cloud (after you have done your data modelling of course)
  • Configure Power BI cloud service to directly authenticate to OneDrive in the cloud so that it can directly access the OneDrive data without further configuration or gateways.  Power BI automatically refreshes the data for you approximately once every hour or so.

Let me demonstrate how to do this using the same survey from last week.  If you haven’t already completed the survey you can do so by clicking here, and then see your survey response included in the reporting below.  You can read about how to create an Excel survey like this in the blog article from last week.

Copy the URL to your OneDrive File in the Cloud

There are a few ways to do this but the way I did it was to copy the URL of my OneDrive XLSX file and then use that to import the data “From Web”.

Get the URL

The tricky bit is finding the correct web URL of your file on OneDrive.  The way to do this is to open the Excel file in Excel on your PC and extract the URL from there.  You can either open the file from OneDrive in the cloud or the copy on your PC (I used the one on my PC).

Once you have the file open in Excel, click on the File menu, then Info (1 below), click on the OneDrive Link (2 below) and then click “Copy link to Clipboard” (3 below).  This placed the link into the clipboard.

image

With the URL still in the clipboard, I created a new Power BI Desktop report using Get data, Web as shown below.

image

I then pasted the URL from the clipboard and manually deleted everything from the question mark to the right (highlighted below).

image

So I was left with the following

image

I then got the following complicated set of options for Authentication. I took the last option to sign into my Organizational Account, and signed in with my Office 365 credentials

image

image

After authenticating, I was able to import the table of data into Power BI Desktop and Load as normal.

image

Load to Power BI and Configure Credentials

I will continue the process of building out my data model further down the page, but first let me complete the process to publish and configure the online credentials process.

First I saved the query (Close and Apply), saved the workbook and published it to my PowerBI.com account.  Once the dataset was in the cloud, I needed to configure my credentials for the dataset.  I clicked on Settings (1 and 2), Datasets (3) and then selected my dataset (4 below).

image

Because this is an online data source, I was able to edit the credentials as shown below.

image

After clicking Edit Credentials, I selected OAuth2 and signed in.

image

And I then got this confirmation.

image

Improving the Reporting Using Data Modelling

Back to the data modelling in Power BI Desktop.  The first thing I did was turn off the data load for the table I had previously loaded.  This was quite easy: right click on the table, then uncheck “Enable Load”.  This keeps the connection to the table but doesn’t load it in its current form.

image

The rest of this article explains how I changed the data to make it more useable in my final report.

Cleanse the Country Data

The next thing I did was create a data cleansing table for the list of countries. Ideally the survey tool would provide a list of all valid countries to select from a list, however this is not available using the simple online Excel survey tool I am using.  Instead I created a substitution table where I can manually create substitutes for data that I wanted to change.

I created a new query with reference to the first.  Right click on the first query (1 below) and then select Reference (2 below).

image

I then deleted all columns other than the country column, removed duplicates and sorted the column.  Finally I set it to disable load.  It is now possible to see all the data points entered and identify the dirty data that was included (shown below).

image

I then created a manual mapping table to make it easier to control the data cleansing process.  To do this, I simply created a new manual table in Power BI Desktop as shown below (and also turned off data load).

image

Sorry to the nondescript North American if you happen to be Canadian, Mexican or of some other nationality.  I decided to map this to the USA.

Now that I had the mapping table, the next step was to replace the original country data with the cleansed data.  I created a new query referencing Table1, then joined the country columns to the CountryCleanse table.

image

Using a Left Outer Join as shown below

image

I then expanded the new column to extract the cleansed version of the data.

image

This gave me a new column that contains the substitutes in the case there was a match in the cleansing table.

image

I then created a new custom column that uses this cleansed column if it exists, otherwise the original column.
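The formula for the custom column was something like this – a Power Query sketch, where “CountryCleanse.Substitute” is an assumed name for the expanded column:

= if [CountryCleanse.Substitute] = null then [Country] else [CountryCleanse.Substitute]

In other words: use the substitute value where the cleansing table provided one, otherwise fall back to the original country.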

image

image

I then deleted the other 2 columns and renamed my new column so that I had a cleansed copy of the Country.

Cleanse and Sort “How Long” Answers

In my survey question, I asked how long the participant had been reading my blog.  When the answers were placed in Power BI they didn’t sort based on length of time (they sorted alphabetically).  To fix this (using the original approach I blogged about last week) I was forced to add a prefix to force the answers to sort logically.  Using data modelling, it is possible to solve this problem properly.  For variety (in this blog), I decided to use a different mapping table solution.  I first created a mapping table with the original survey choices, a new set of labels and a sort column as follows.

image

Then I loaded this mapping table as a lookup table to my data model and joined it on the Original column as the key.

image

Then to avoid confusion in the final report, I hid the original columns in both tables.  Right click on the column (1 below) and then select Hide in Report View (2 below).

image

Of course I sorted the Tenure column

image

Adding “All of the Above” into the other 3 choices

The last problem I wanted to solve was that many people chose “All of the above” when asked about their favourite topics.  When this was put in a visual (shown as 1 below) it is not possible to see the overall most popular item.

image

There are many ways to handle this problem.  The measure I wrote was as follows:

Count Most Useful =
COUNTROWS ( Data )
    + CALCULATE (
        COUNTROWS ( Data ),
        Data[Which topics do you find most useful?] = "All of the above"
    )

The measure simply adds the count for each item to the count of the “All of the above” answer (shown as 3 above), and I then simply filtered out the “All of the above” answer from the visual (shown as 4 above).

My report was starting to look good (see below) but I still wasn’t happy.

image

Ideally I should be able to show “what is most useful” and also “What do you want to see more of” in a single bar chart.   But the problem was that there were 2 different measures and the “bars” are coming from 2 different columns in the data source.  This could not be solved without thinking differently about the problem.

I created a new table that has the 3 responses and loaded it to the data model.  I didn’t connect this table to any other tables in the data model.

image

I placed this new column of data on the Axis of my visual and placed the 2 measures as Values.  This then gave me a chart that looked like what I needed, but of course there was no filtering, because the new “Types” table is not connected to anything in the data model.

This is important:  The Axis is part of the filter context, the Types table is being filtered by the visual, but this filtering is not being used in the measures I wrote.

image

So I changed the measures so they would “harvest” the value on the Axis to filter the results.

Count Most Useful =
IF (
    HASONEVALUE ( Types[Product] ),
    CALCULATE (
        COUNTROWS ( Data ),
        Data[Which topics do you find most useful?] = VALUES ( Types[Product] )
    )
        + CALCULATE (
            COUNTROWS ( Data ),
            Data[Which topics do you find most useful?] = "All of the above"
        )
)
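The matching measure for the “see more of” question follows the same harvesting pattern.  The sketch below is my reconstruction; the exact column name for that survey question is an assumption, so adjust it to match your own data:

Count More Of =
IF (
    HASONEVALUE ( Types[Product] ),
    CALCULATE (
        COUNTROWS ( Data ),
        Data[What do you want to see more of?] = VALUES ( Types[Product] )
    )
        + CALCULATE (
            COUNTROWS ( Data ),
            Data[What do you want to see more of?] = "All of the above"
        )
)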

The final visual looks like this, which is much easier to read and understand.

image

And here is the final report

Easy Online Surveys with Power BI Reporting

I think today’s article will be of interest to my readers even though it is a little astray from my normally pure Power BI, Power Pivot and Power Query content. I will show you how to quickly and easily create an On-Line Survey that you can distribute to anyone that has an Internet connection, and then use Power BI to report on the results from your survey.

Office 365 Excel Surveys

You may not be aware that O365 has an inbuilt Excel survey tool that is very easy to use.  And who wouldn’t want to use Excel to solve a problem if they could?  You can find the survey tool by going to OneDrive online.  I am using OneDrive for business for this demo, but it also works for OneDrive Personal.

Here is Survey – Please Participate

For demonstration purposes, I have created a survey to ask my readers what they would like to see more of on my website and to see which countries the readers live in.  Edit 7 Feb 2016.  The survey is now closed, but you can still read on.

image

Now that you have completed the survey, read on to see how you can do the same yourself and to see the latest results via an embedded Power BI report at the bottom of the page.  Note that it can take up to 1 hour for your specific survey response to be available in the report (i.e. it is near real time, not real time).

How I Created the Survey

First Open OneDrive Online

Frankly I get confused with all the logins for Microsoft sites. The easiest way I know how to login to OneDrive Online is to Open Windows Explorer, right click on the OneDrive folder and then select View Online.

image

Create Your Excel Survey

Creating an Excel survey couldn’t be easier.  Just click New\Excel Survey as shown below.

image

I immediately got some survey header information to fill in as shown below.

image

There are no user manuals required to work out how to do this.  I just updated the information and moved on to the next section.  It is easy to enter the questions you have, and there are a number of intuitive response types that you can choose to meet your needs.

image

Survey Results Database

After you have completed your survey, you can see the underlying Excel spreadsheet that stores your survey results.  It is possible to add new columns to the right of the results table.  Additional columns added manually can be used to store information you want to add without it being part of the survey.

image

Connecting the Data to PowerBI.com

There are a few ways you can create a Power BI report.  If you want a fully featured data model, including the ability to import other tables of data and write your own DAX, then you need to import this new data source into Power BI Desktop as you would with any other report.  For this demo, I used the “Get Data” feature directly inside the Power BI Service to connect to the Excel workbook on OneDrive.  To do this, I first logged into PowerBI.com and selected Get Data from Files as shown below.

image

Then I selected OneDrive – Business as shown below.

image

Then I selected my survey file and clicked Connect

image

I was then prompted to either import the data or connect.  The connect option works just like Excel Online – not what I wanted – so I selected Import.  When you use Import in this way, Power BI will check your OneDrive account about once each hour for new data and will automatically update your report.

image

Power BI then created a new Dataset for me (1 below).  You can see the table of data shown in 2 below.  It also created a Dashboard, however this was not much use so I simply deleted the dashboard.

image

Here is the Final Power BI Report

Once I had the data connected, I simply created a few simple visualisations using the capabilities of Power BI Service.  You can see the final interactive report below.

Be sure to read next week’s post where I cover how you can use Power BI Desktop to model your online data. exceleratorbi.com.au/data-modelling-onedrive-source-files/

Save PBIX from Power BI Service Under Delivers

This blog post was going to be about how excited I was to hear that there is now an option to download a PBIX file from the Power BI Service to your PC.  Unfortunately all that glitters is not gold.  This new option in the Power BI Service promises a lot but delivers little – let me explain.

save pbix updated

The Authoring Pattern

The general pattern I recommend for Power BI is:

  1. Author your workbook in Power BI Desktop
  2. Publish the workbook to the Power BI Service and share from there.

I have always recommended that you do not make any changes to your reports or datasets in the Power BI Service because:

  1. There is no way to backup the changes you make
  2. There is no way to download the changes back to your Power BI Desktop file.

I was hoping that this new feature solved this problem, but it doesn’t.

The Problems With This Feature

The first problem is that any PBIX files I previously uploaded to the server cannot be downloaded (maybe because you need to upload from the latest version of Power BI Desktop – I’m not sure).  This problem seems fixable by simply uploading a new copy of the PBIX and overwriting the current version.

The second issue is that you cannot download a report created from scratch on an existing dataset in the Power BI Service.  If the report was created online, the download option is greyed out (shown below).

image

This limitation therefore means the new feature doesn’t solve the 2 issues that existed in the first place.  It seems to me that the only possible value of this new feature is that it allows other users to take a local copy of the PBIX file (which is good), and also allows you to recover the PBIX in the instance where you misplaced (or lost) the original.

I would like to hope that this is just the first step of a multi-step process towards making all reports authored in the Power BI Service downloadable.  It remains to be seen if this will be the case.

Edit:  A colleague of mine, Gilbert Quevauvilliers, tells me that if you do download a PBIX, you must have the latest version of Power BI Desktop to open it.  It is a good idea to always keep Power BI Desktop up to date.

Find Duplicate Files on Your PC with Power BI

Level: Beginners

If you want to learn new skills using a new tool, then you simply must practice.  One great way to practice is to weave the new tool into your daily problem solving.  If you have something meaningful to do with the new tool, then you are much more likely to be motivated to practice.  And the new tool I am talking about, of course, is Power BI.

Last week I showed how easy it is to use Power BI to help you track down large files saved in Dropbox so you could manage the overall space usage.  As a result of that article, Graham Whiteman posted a comment suggesting it would be a good next step to find duplicate files.  I think that is a great idea, so I decided to test it out on my PC.  Read on to see how I did it, and how you can do it too.

Create a Query to Fetch All PC files

I started a new Power BI Desktop file, then connected to my PC documents folder

image

image

I immediately selected Edit query as shown in 1 below.

image

The only time you would immediately select Load (2 above) is if the data you are importing is already in the correct shape for Power BI.

The only columns I needed were the file name, date modified, attributes and path (shown below).  I multi-selected the columns I wanted to keep, then right clicked and selected “Remove Other Columns”.

image

The next step was to extract the file size from the attributes list. To do this, I expanded the list of attributes (1 below), deselected all the columns and then reselected the file size (3 below).

image

Then I renamed the query (1 below) and changed the query so it didn’t load to Power BI by right clicking on the query and un-checking the enable load option.

image

This created a query that links to the PC, keeps the columns of data needed but didn’t load anything to Power BI yet.

Create a New Query that Accesses the File List

The next step was to create a new query that references the File List.  I right clicked on the first query (1 below) and then selected Reference (2 below).  Note how the File List query is shown in Italics indicating that it won’t load to Power BI.

image

The next step was to merge this data with itself by going to the Home Ribbon and selecting Merge Queries.

image

In the Merge Queries dialog, I joined the list of files File List (2) with the original query File List so that it was joined to itself on 3 columns (the File Name, Modify Date and File Size) but not the File Path as shown below.

join file list

The above steps added a new column to the query.  I then expanded the new column as shown below making sure to keep the original column name prefix.

image

Find The Duplicate Files

The second query now looked like this.  As you can see in the image below, the query returned all the files (name column) along with the folder paths from the query “File List” shown as 1, and a second column containing the folder paths from the query “File List (2)” shown as 2 below.

image

The next step was to get rid of all rows in this query where the 2 folder paths are identical.  Doing this is easy with a custom column.  I added a custom column (steps 1 and 2), and wrote a formula to return TRUE if the 2 folder paths were identical.

image

I then filtered out everything that returned a TRUE in the new column using the filter button as shown below.

remove matches

I then deleted this custom column as it was no longer needed.  I just right clicked and selected remove.

Format the Number Columns

It is very important in Power BI to set the number formats before loading the data.  Any numeric column that has a data type “Any” should be changed to a suitable numeric format (as shown below).

image

I did this, renamed the query to “Duplicates” and then selected “Close and Load” to get the data into Power BI.

Time to Write Some DAX

Now the data is loaded, I could of course just drag one or more of the columns onto the Power BI canvas.  But remember, half the reason for doing this is to get some new skills.  So instead of dragging the Size column and creating an implicit measure, I wrote some DAX – it isn’t hard to get started with such simple formulas.  Here’s how to do it.

Select the Size column, go to the Modelling Ribbon and select New Measure.

image

The formula I wrote is as follows

File Size MB = sum(Duplicates[Size])/(1024 * 1024)

image

Note a few things that I was able to do by writing this measure myself

  1. I converted the units of the result from bytes to megabytes by dividing by (1024 x 1024).
  2. I gave the measure a more meaningful name “File Size MB”
  3. I was able to set the formatting to comma separated with 1 decimal place

And of course I practiced my DAX.

And the Results

I then simply added the file size, File Name, Folder Path and second Folder Path to a table in Power BI as shown below.

image

I then discovered I had around 9 GB of duplicate files on my PC.  I sorted the table by File Size descending and discovered that I had multiple identical copies of contoso.pbix.  It looks above like there are 6 copies of contoso.pbix, but this is deceiving: every copy of a file will find a match with every other copy, so 3 copies of a file produce 6 rows in the table.  If you look at the Folder Path column, there are only 3 unique folder paths, hence 3 files.

The next thing I did was add a Tree Map as shown, with the file name in the Group section and File Size MB in the Values section.

image

To find out accurately how many copies of each file there were, I had to write some more DAX.  This formula is a bit more involved (intermediate DAX).

2016-10-31_120903

Let me explain this formula starting from the inside out.  There are 4 functions in this DAX formula and I describe their role below.

  1. SUMX is an iterator.  It iterates over a table specified as the first parameter (VALUES in this case).  You can read more about SUMX here.
  2. The VALUES function returns a table of unique file names (in this case it is unique values in the column Duplicates[Name]).  So SUMX above will iterate over each file name in the name column.
  3. SUMX is iterating over a Virtual Table (VALUES).  The CALCULATE is required to force context transition.
  4. Then for each file name in the table (in 2 above), DISTINCTCOUNT will count how many unique folder names there are.
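Putting these 4 functions together, the measure reads like the sketch below.  This is my reconstruction from the description above (the original is shown in the image), and I am assuming the folder path column in the Duplicates table is called Folder Path:

File Count =
SUMX (
    VALUES ( Duplicates[Name] ),
    CALCULATE ( DISTINCTCOUNT ( Duplicates[Folder Path] ) )
)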

I then added the new File Count measure to the Colour Saturation section of the Tree Map (1 below).  This does 2 things.  Firstly it shows the high folder count files as being a darker colour, and secondly it adds the file count to the tool tips (visible when you hover the mouse over the visual).

image

And Now Some Fun

I’ve been looking for an excuse to do this for some time.  I want to find the fattest fish in my pond (aka most space taken by file name).  I went to visuals.powerbi.com and downloaded the Enlighten Aquarium custom visual.

app.powerbi.com/visuals/show/Aquarium1442671919391

I then imported the custom visual into Power BI Desktop

image

Then I copied my Tree Map visual (Ctrl-C, Ctrl-V), selected the copy and changed the visualisation to the Aquarium.  This visual shows the largest individual files regardless of location or how many copies exist.  I am not saying this is the best way to visualise data, but surely it is one of the most creative.

fish

Here is my final workbook canvas

image

For the purists out there, I wrote a new file size formula as follows.

final

 

The original formula I wrote double counts the file size when there are multiple duplicates.  The above formula is almost identical to the File Count I explained above.  The only difference really is the inclusion of MAX(Duplicates[Size]).  This is a “trick” to handle the fact that for each file name there will be multiple records in the data model.  Each file will have the exact same file size, so by selecting MAX I simply get to access the file size.  I could have used any other aggregator (eg Min, Avg, Sum) and got the same outcome.
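Based on that description, the corrected measure would look something like this sketch (the image shows the original; column names here are assumed to match the Duplicates query above):

File Size MB Unique =
SUMX (
    VALUES ( Duplicates[Name] ),
    CALCULATE ( MAX ( Duplicates[Size] ) )
)
    / ( 1024 * 1024 )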

I haven’t shared the actual workbook here.  The whole idea is for you to try this yourself so you get 3 benefits: more disk space, some practice with Power BI Desktop, and some fun.

Who Needs Power Pivot, Power Query and Power BI Anyway?

Level: Beginners

One of the great challenges Microsoft has faced with its “new” suite of Self Service BI tools (particularly Power Pivot) is that most people that could benefit from the holy trinity (Power Pivot, Power Query and Power BI) don’t even know these tools exist, let alone how the tools can help them succeed in their jobs.  The situation is definitely getting better as Power BI starts to get a presence in the market place, however I still talk to people who have heard of Power BI, but have no idea what Power Pivot or Power Query are, and what’s more they don’t know why they should care.  I personally believe a big part of the awareness problem is that Microsoft insists on shipping Microsoft Excel with the Power Pivot plugin disabled.  There is no reference to Power Pivot when you excitedly receive your brand spanking new version of Excel 2016 – what a marketing opportunity gone begging!

image

I have been an Excel nerd for 30 years.  There is nothing I used to enjoy more than installing a new version of Excel and clicking through every menu item to find something shiny and new that would make my life easier.  By not turning on the Power Pivot menu by default, Microsoft is missing one of the best silent selling opportunities it has for this fabulous addition to Excel.

Given there is no “pull through” on the menus, many people don’t know what these products are or why they should care.  I am often asked by people “who can benefit from these tools?”.  This post sets out to explain who can benefit and why.  Note when I say “who can benefit”, I am not talking about “consumers of reports” here, I am talking about “authors of reports”.  It is clear that people that consume quality reports and analysis will benefit, whatever the tool.  This article is focused on the benefits to those people that are responsible for building the reports and analysis that others will end up consuming.

Power BI

Who can benefit from Power BI is probably the easiest to understand.  The product is well marketed and has a clear role to play.   Power BI is a complete self service BI tool.  It is designed to bring business intelligence capabilities to the masses instead of the elite (e.g. instead of just highly skilled IT MDX programmers).  Rob Collie wrote a good article last week about the democratisation of BI tools.  Power BI will add value to people who:

  1. Have problems sharing reports with others because the file size is too large.
  2. Need to share data with people on the go, that maybe only have a Tablet or a Mobile phone.
  3. Have large data sets that can’t be managed in traditional Excel.
  4. Are spending too much time each week/month manually maintaining reports with new source data and/or new visualisation requests.
  5. Can’t get the (timely) support they need from their IT department using traditional Enterprise BI tools.

Power BI is great because it puts capabilities across the end to end BI stack into the hands of end users (authors), including:

  1. Extraction of data from the source (using the Power Query engine)
  2. Transformation of that data into a new shape that is optimum for BI reporting and analytics (Power Query again).
  3. Data modelling capabilities, so you can turn the raw data into business meaningful insights (using the Power Pivot engine).
  4. A reporting and analytics front end allowing you to build reports to visualise your data (Power BI Desktop and Power BI Service).
  5. A fully integrated cloud based delivery framework so you can easily share with anyone over the internet (Power BI Service).
  6. A full set of Mobile applications across the major operating systems (Power BI Mobile).

Notice how steps 1 and 2 use Power Query, and step 3 uses Power Pivot.  So if you want to learn about Power BI, you really need to learn about Power Pivot and Power Query too.

Power Pivot

Conversely, Power Pivot is the hardest to understand – i.e. it is the hardest for individuals (potential authors) to understand “what does Power Pivot do for me and why do I need it?”.  I have had people enquire about Power BI training courses who were not interested in Power Pivot or DAX*.  But the truth is, if you want to be able to write your own reports in Power BI, you really need to learn at least some basic Power Pivot skills.

Power Pivot is a data modelling tool.  It is the underlying reporting engine that enables Power BI and Modern Excel to deliver those modern funky reports that can help you succeed in business.  The Power Pivot engine allows you (the report author) to take your business knowledge and configure the reporting tools so that Power BI and Excel Pivot Tables can be used to find and report on insights in your data.

Most business users have never heard of the term “Data Modelling” before, and the reason for this is quite simple – it has always been the IT department that has been responsible for data modelling.  Power Pivot brings the power of data modelling and puts it in the hands of competent business/Excel users.    An example of data modelling will make it easier to understand.

Example of Data Modelling

Consider a scenario where you download sales data from your company transaction system and it looks something like this in a spreadsheet.

image

You can see the sell price and the cost price information in the table above.  But there is nothing about the $ Margin per product and nothing about the % Margin, let alone insights like Sales Year to Date, Top 3 Selling Products, Fastest Growing Product etc.  In a traditional Excel world you would simply write formulas in your spreadsheet(s) to enhance this raw data and extract the additional insights.  The trouble with this approach is that all these formulas are only ever “one off”.  You write them for this report, and then when you create another report later, you have to write the formulas again.  Power Pivot handles this problem in a different way.  Power Pivot is a data modelling tool; instead of writing formulas in your final reports, you write them “under the hood”, directly in the Power Pivot engine, as part of the data model.  You build the “rules” that describe how to calculate $ Margin and % Margin ONCE and only once, directly in the Power Pivot engine.  Once you have created the rules for each insight (e.g. $ Margin, % Margin etc), it is forever available for you (and others) to use over and over again in any report, chart, Pivot Table or any other visualisation you can think of.  Never again do you have to write/copy a new formula every time you create a new report.
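To make this concrete, here is what such “rules” might look like as DAX measures.  The table and column names (Sales, Sell Price, Cost Price) are hypothetical, chosen only to illustrate the idea of writing the logic once in the model:

Total Sales $ = SUM ( Sales[Sell Price] )
Total Cost $ = SUM ( Sales[Cost Price] )
Total Margin $ = [Total Sales $] - [Total Cost $]
Margin % = DIVIDE ( [Total Margin $], [Total Sales $] )

Once defined, any report, chart or Pivot Table built on the model can reuse [Total Margin $] and [Margin %] without rewriting the logic.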

*What About DAX?

Data Analysis Expressions (DAX) is the formula language of Power Pivot.  DAX is very similar to the Excel formula language yet there are also a lot of differences that you will have to learn from scratch.  Simply put, if you want to learn Power Pivot (and/or Power BI), you will need to learn at least some DAX as well as lots of other things about how the Power Pivot engine works.

Who needs Power Pivot and why?

People that need this tool are typically Microsoft Excel users/report authors that analyse or report on data, particularly when the following conditions apply:

    1. There is a lot of data that stretches the capacity of traditional Excel (file size, re-calculation speed etc).
    2. If you use Pivot Tables a lot to analyse your data.
    3. If you are writing a lot of VLOOKUP (or INDEX/MATCH) formulas to join data from different tables so you can analyse in a Pivot Table.
    4. If you have had to create bespoke reports in traditional Excel, but then have to spend hours rebuilding the report when asked for a different view of the data (e.g. you have a report that shows sales by half year, and then you are asked to produce the same report by month – and have to start again almost from scratch).
    5. Anyone that wants to start using Power BI for anything more than the most simple aggregation reports.  The “out of the box” capabilities of data modelling in Power BI are limited, and you will quickly realise that you need to learn some new skills (Power Pivot/DAX) to be able to leverage the strengths of Power BI.

Power Query

Power Query is a user friendly ETL (Extract, Transform, Load) tool.  Unfortunately Microsoft muddied the waters by renaming Power Query “Get and Transform” in Excel 2016, and “Get Data” in Power BI Desktop.  That aside, Power Query is used to:

  1. Extract:  Connect directly to the data source and ingest the data (into Power BI or Excel) so you can work with it.  It retains a connection to the source so when the source data is updated you can automatically “refresh” the data without having to go through the import process again from scratch. It is amazing.
  2. Transform:  You can clean and reshape the data so that by the time you are ready to use it, it already is in the format you need instead of the format you are given.  Again, you do this only once, and then you can automatically refresh when the data is updated later.
  3. Load the data directly to the place where you want to use it.  The end location for the data will typically be inside Power Pivot or Power BI, but it is also an invaluable tool for use with traditional Excel.

Don’t think of Power Query as simply a tool for self service BI reporting.  I have used Power Query to automatically audit information (eg XLSX files, csv extracts from other systems etc) and produce exception reports directly in Excel.  Once you understand what Power Query can do, you will start to realise all the ways you could use it to make your life easier.

Who needs Power Query and why?

People that need this tool typically are trying to solve the following problems.

  1. They regularly receive file extracts from someone (or some system) and need to manually manipulate this data before they can use it.
  2. They want to use Power BI/Power Pivot but can’t get the IT department to shape the data they need prior to loading it into Power BI/Power Pivot.
  3. People that are doing repetitive tasks such as matching invoices on a remittance advice against a bank statement (or similar) – Power Query eats this kind of work.
  4. Anyone that is given data in the wrong shape (example below).  And yes I quickly used Power Query to un-pivot the data in the blue table so it looked like the green table.  It is worth getting and using Power Query just to get this un-pivot feature alone!
    image

Wrap Up

Hopefully this overview has helped demystify how Power Pivot, Power Query and Power BI can help you in your job.  If you would like to learn more about Power Pivot, you can read my book “Learn to Write DAX“.  If you want to learn more about Power Query, I recommend Ken and Miguel’s book “M is for Data Monkey“.  If you live in Australia, you can attend one of my live training courses where I cover what you need to know about Power Pivot, Power Query and Power BI.

Shaping vs Modelling in Power BI

Level: Beginners

Power Pivot, Power Query and Power BI are 3 products that are closely related to each other and were all built for the same purpose – enabling Self Service Business Intelligence.  I first learnt to use Power Pivot for Excel, then Power Query for Excel, and finally Power BI.  But there is a new wave of users that have never heard of Power Pivot and Power Query, and their first exposure to the underlying technologies is with Power BI.  This can be confusing and hence I explain what you need to know below.

Power BI Can be Confusing to First Timers

I realised recently that people who have never used Power Pivot and Power Query could easily be confused by the different options to do things in Power BI.  There are lots of reasons to be confused, including:

  • There is no reference to the term “Power Query” inside the Power BI tools.  Microsoft stopped using this name in favour of terms like “Get Data”, and “Edit Queries”.  I think this was a bad decision and it has fuelled the confusion. When I tell people to do something in Power Query, they therefore don’t know what I am talking about.
  • There is no reference to the term “Power Pivot” inside Power BI either.  The most relevant proxy for Power Pivot inside Power BI is the “Modelling” menu. Use of the term Power Pivot therefore doesn’t make any sense to “Power BI only users” either.

Shaping and Modelling Your Data

The implication of the above is that there is no clear way to delineate (with language/labels) between the process of loading the data and the process of modelling the data.  I like the term “Shaping” as a replacement for Power Query/Get Data, and “Modelling” as a replacement for Power Pivot.

Definition of Shaping

Shaping is the process of extracting your data from the source and transforming it so that you can load it into Power BI (often referred to as ETL).  The reason I like the word Shaping is that it clearly describes what is normally needed as the first step in any BI project.  The shape of your data refers to:

  • What data you extract
  • How your data is loaded into one or more tables, and which tables you import
  • What names you give to the columns and tables
  • Whether your tables are wide and short (many columns), or long and narrow (fewer columns)

For example, data tables (also called fact tables) are normally best shaped into long, narrow tables (lots of rows and a minimum number of columns), whereas lookup (dimension) tables can be as wide as needed for the task.

image

All tables should only be loaded with data that you actually need.  One big difference between Power Pivot/Power Query and Excel is that it is easy to go back later and get the data you missed.

Definition of Modelling

Modelling is the process of building the relationships between your tables and converting the business requirements into measures and calculated columns using the DAX language.  The measures contain

  • The business logic that transforms your raw data to a usable, value adding calculation
  • The formatting required for the numbers (currency, percentage)
  • A good business name for the measure that accurately describes the measure

Prior to Self Service Business Intelligence, data modelling was always done by expert IT professionals using a complex language like MDX.  The complexity of such languages put it out of reach of users, as it was simply too hard to learn casually.

2 Ways to Add A Column – Wait What?

A perfect example of the type of confusion Power BI first timers face is that there are 2 ways to add columns to your data.  You can add a column during the shaping process, and you can add a column during the modelling process.

image

So which should you use?  Well it depends on what you are trying to do.  My general advice on adding columns to your tables is to make the change as close to the source as possible. So in order of preference:

  1. Add the column in your source data (eg your Data Warehouse) with the help of your IT friends if necessary
  2. Add the column during the shaping process
  3. Add the column during the modelling process

If you add the column at the source, then you have access to it for your current BI needs but also for future BI needs.  Adding the column at the source provides the most reuse, and if you need to make a change down the track, it will be the least maintenance.  But sometimes you can’t do that for various reasons, or you can’t do it quickly.

If you add the column during shaping, it will be loaded and compressed with all other columns during data load.  This normally means better overall table compression and hence a good outcome for the overall model.

The most common reason you would add a column during the modelling process is if you need to leverage other parts of the model to create the data.  An example would be if you wanted to use a measure (eg Total Sales) in your data model to create the new column in the model (eg Banded groups of Customers based on High, Med, Low annual sales).  But don’t fall into the trap of using too many calculated columns instead of measures. I rant on about that here.  Use them when they are needed, otherwise avoid them.
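A banded-grouping column like that example might look like the following sketch.  The [Total Sales] measure, the thresholds and the band names are all hypothetical.  Note that referencing a measure in a calculated column triggers context transition, so each row of the customer table evaluates [Total Sales] for that customer alone:

Customer Band =
SWITCH (
    TRUE (),
    [Total Sales] >= 100000, "High",
    [Total Sales] >= 20000, "Med",
    "Low"
)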

Wrap Up

Hopefully this explanation of the role of shaping and modelling in Power BI has helped clarify some things for you.  Don’t forget to check out the other articles in my knowledge base here.

Top 10 Tips for Getting Started with Power BI

Level: Beginners

image

I really love Power BI, and I have learnt so much over the last 12 months that sometimes it is easy to forget the challenges I had in getting started.

Today I am sharing my top 10 tips on how to get started with Power BI.

Build Your Reports in Power BI Desktop, Not the Service

One thing that can be confusing is that there are 2 ways you can build reports – Power BI Desktop and Power BI Service.  My first tip is that you should always use Power BI Desktop.  Reasons include:

  • There are more configuration capabilities in Power BI Desktop
  • You can upload changes made in Power BI Desktop to the Service, but you can’t download changes made in Power BI Service to the desktop.
  • There is currently no way to back up your changes in Power BI Service, but you can back up Power BI Desktop yourself.

Don’t Try to Replicate What You Do in Traditional Excel

Power BI is not the same as Excel, and hence you should not start out trying to simply replicate what you currently do in Excel.  The closest thing in Excel to Power BI is the Pivot Table (or more accurately, the Pivot Chart).  The key similarity here is that a Pivot Table is an aggregation and summarisation visualisation tool – this is exactly what Power BI is too.  Yet there is no Pivot Table object currently available in Power BI.  The closest thing to a Pivot Table in Power BI is the Matrix, yet if you start using the Matrix with the view of replicating a Pivot Table, then you are bound to be disappointed.  Hence you should explore the other options (likely to your great delight).  See the next tip.

Explore the Visualisations to Understand Capabilities

Instead of replicating what you do in Excel, I recommend that you explore the various visualisations to see what each of them does.  There are lots of new visualisations that don’t exist in Excel, and some of these are very effective in helping you communicate the insights in data.  A great place to start is the Tree Map or a Column Chart.  You can get these visualisations to behave just like a Pivot Table in Excel.  Just place a number of summary levels into the visual (see the Axis section in the image below), then you can drill through the visual just like a Pivot Table.  It is a great experience.

drill

You should also check out the Custom Visuals Gallery for some of the great free tools that have been shared with the community by industry leading experts.

Understand the Purpose of Your Report, Then Find a Way to Build It

A common mistake is to start out thinking that you should simply replicate what you have previously built in Excel.  As covered above, this approach will limit your thinking and make the journey so much harder.  Instead I recommend you start out by thinking about the purpose of your report.  You need to be clear “What” you are trying to communicate, “How” the user will consume the information, and what business decisions will be made with the data.  If you can write down a couple of sentences about the purpose of your report, that is even better.  Once you are clear on the purpose, then set about finding the visualisations and techniques that best deliver on that purpose.

Assume Your Current Data Shape is Not Ideal for Power BI

Many Excel professionals make a fundamental mistake by simply loading data from their current source systems into Power BI as is.  The most common issue I find when helping people with their DAX problems is they don’t have a good data structure.  There is a big difference between transactional business systems and BI reporting tools.  If you want your reporting tools to be easy to use, easy to enhance and to perform well, then you really need to understand the data structure needed by BI Reporting tools.  I wrote a separate article about data shape here.

Assume Calculated Columns are Not What You Need

Self-taught Power Pivot/Power BI users from an Excel background tend to make the mistake of writing far too many calculated columns.  I have hardly ever seen a calculated column from a self-taught user where it was “the correct decision” – in almost all cases the calculated column should have been written as a measure.  I cover this topic in a lot more detail here.  I generally recommend (to Excel users) that you assume your formulas should be written as measures unless you know for sure why a calculated column is needed.

Learn How Cross Filtering Works, and How to Turn it Off

Cross filtering is one of the great features of Power BI, and you should learn how it works so you can use it to your advantage.  However in my experience the default cross filtering behaviour is not always the best experience for the user.  In a recent release, Microsoft deployed a feature that allows you to turn off (or change) the cross filtering behaviour between visualisations.

Once you select a visualisation (1 below), the “format” menu (2 below) appears at the top of the page.  From there you can select “Edit Interactions” (3 below) to change the default cross filtering behaviour.  It is easy to work out how to do it, so just give it a go and learn by doing.

image

Load Content to be Shared into a Group Workspace

If you intend to share data with others, make sure you load the content into a Group Workspace right from the start. I have found it is much easier to keep control of data that you intend to share if you do this.  Of course at times you may be working on something in your own Workspace and want to share it, and that is fine to do.  However if you are starting out with the intention of sharing data, then the Group Workspace is the way to go.  I cover the topic of workspaces and sharing in some depth here.

Learn to Write DAX

At some time early in your journey you will discover that there is a limit to what you can do using the automated summarisation techniques that come with Power BI (and Power Pivot).  Data Analysis eXpressions (DAX) is the language of Power Pivot, and it is also the language used in Power BI.  DAX has a lot of similarities with Excel formulas, however there is a lot to learn and it is not always intuitive.  If you want to turbocharge your Power BI journey, you really do need to do some formal learning.  I recommend you read my book “Learn to Write DAX” as a cheap and easy way to make rapid progress.  Many Excel professionals tell me that they wish they had started with my book first, as it would have saved them a world of pain.  There are many other great books available too.  I keep a curated list of the best books here.

Learn to Use Power Query

Power Query (get data) is much easier to learn than DAX.  The user interface is very intuitive and using the menu items in the UI will normally give you excellent transformations with very little knowledge or previous experience.  You should explore what you can do, and go in with the assumption that “there has to be a way to do this”.  Once you get the basics down, you should consider learning some more advanced techniques from books like M is for Data Monkey.

Happy First Birthday Power BI

HB PBI 1

One year ago today on the 24th July 2015 a brand spanking new BI baby was born.  And its proud parents named the new baby Power BI, and soon Power BI would become known to everyone around the world.  Today is therefore a day to celebrate what a great first year it has been for this bambino.  

From Little Things, Big Things Grow

It is no secret that I thought the baby was just a little bit ugly on its day of birth, but then again many babies are!  However it didn’t take long for it to grow into something that would be a game changer for personal and enterprise business intelligence.  For me it was late in October 2015 when Power BI hit the sweet spot of “good enough”, and that was the trigger for me to jump all in.  The reality is that the product will grow faster and better as a live product than it could possibly ever grow if it was being developed in a backroom pending a release with every visualisation feature of the more mature products.  It is only the fact that it is being used by so many early adopters, and the fact that Microsoft is taking their feedback on board, that this product is growing at such an incredible pace.  However we must remember that the product is only a one year old; we can’t expect it to be as sharp and mature as its 30-year-old favourite sibling Microsoft Excel.

Birthday Video (coming soon)

My Top 10 Things I Love About Power BI in its First Year

Here are my top 10 things that I love about Power BI at the sweet age of 1 year old.

  1. It’s free to get started.
  2. The integration and reuse of Power Pivot and Power Query across Power BI, Microsoft Excel and SSAS Tabular – all the same software – great job!!
  3. The Power BI Desktop application is a cracking piece of free software.
  4. The cloud service is effectively your own personal free SSAS Tabular server.
  5. The open source visuals gallery that allows the community to build new visualisations.
  6. Excel and Power BI are better together, including Pin Ranges to Dashboards, and Analyse in Excel.
  7. The innovative cross filter behaviour across visualisations (plus the ability now to control this by exception).
  8. Seamless integration with Mobile devices across Android, iOS and Windows Mobile
  9. So many data connectors to so many data sources.
  10. Q&A is showing so much promise.

Thanks a Million Microsoft

A big thanks a million to all the fabulous people at Microsoft on the Power BI team for doing such a great job over the last year raising this baby to make it such a precocious one year old.  Keep up the good work.

Measures on Rows – Here is How I did it

Level: Intermediate

You may or may not be aware that it is not possible to put Measures on rows in a Matrix in Power BI. But I came up with a trick that makes it possible, so read on to find out how.

Measures Can Only be Placed on Columns

First the problem. The only way that you can use the Power BI Matrix visualisation (at this writing) is to place the measures on the columns as shown below.  You can then take any column of data from your data model (typically from a Dimension/Lookup table) and place that on rows in the Matrix.  So you end up with this.

image

This limitation does not exist in an Excel Pivot Table.  As you can see below, it is possible to flip the Values between columns and rows from the Pivot Table Fields List.

measures on rows

Old tricks for New Purposes

When I was thinking through this problem, my first intuition was to use a Switch statement somehow.  Switch statements can be used to create a Switch Measure (like the ones I used in this blog post here back in 2014) and I figured this could be part of the solution.  And then I remembered another trick I learnt from Rob Collie – using disconnected tables to feed a measure – and I knew I had a solution.

So in summary, I wrote a Switch measure that will morph into different measures when filtered, and then used a disconnected table to pass a filter to that measure.

Write a Switch Measure

The Switch measure is pretty easy to write as follows:
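Here is a hedged sketch of such a SWITCH measure; the measure names are assumed from the examples later in this post, and the integer input comes from the harvester measure covered in a later section.

```dax
-- A sketch of the SWITCH measure pattern; measure names are assumed
Measure on Rows =
SWITCH (
    [Selected Measure ID],  -- the harvester measure (returns an integer)
    1, [Total Sales],
    2, [Total Cost],
    3, [Total Margin],
    4, [Total Invoices]
)
```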

The way this measure works is that it takes an input in the form of an integer, and then depending on that number, it will return the value of the appropriate measure as the result.

Then Create a Helper Table

The next step is the secret sauce to this trick.  I created a disconnected table with the sole purpose of passing the correct filter values (the integers) to the Switch measure.

I used the “Enter Data” feature in Power BI to create a list of the measures I want to put on rows in the Matrix.

image

Note that the ID column is also the sort order of the measures, and the ID column also matches with the Switch measure numbering scheme.  Once the table is created, I added it to the data model as a disconnected table – no need to join it to any other tables in the data model.

I then set the sort order of the Measure column as follows:

image

Write a Harvester Measure

The next step is to write a harvester measure to extract the selected filtered value of the table.  This measure is simply as follows:
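A minimal sketch of the harvester measure, assuming the table and column names shown in the screenshots (MeasureTable with an ID column):

```dax
-- Harvester measure: returns the ID visible in the current filter context
Selected Measure ID = MAX ( MeasureTable[ID] )
```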


The measure above will return the largest integer in the current filter context.  If there is a filter (eg a slicer) on the Measure column and the user has selected “Total Margin” in the slicer, then there is only a single row visible in the filter context, and hence the MAX of the visible rows is 3.  I would of course get the same result if I used MIN, AVERAGE, SUM or indeed even VALUES in my harvester measure.

Filter context is one of the hardest things for new users to understand properly.  I explain filter context in detail (and everything else you need to know to be great at using Power Pivot and Power BI) in my book “Learn to Write DAX”.

You can see the harvester measure in action below.  I have placed the column from the MeasureTable into a slicer and the harvester measure into a Matrix.  When I click on the slicer, the harvester measure updates to show the ID of the selected filter.

slicer

Add the Measure Column to the Matrix

There is more than 1 way to filter a table of course.  In the image above I am using a slicer, but I can also use the rows on the Matrix to provide filter context.  When I place the Measure column from the MeasureTable onto rows in the Matrix, you can see below that the harvester measure is correctly returning the value of the ID column to the visualisation.

image

Pulling it All Together

In order to pull this all together, I replaced the harvester measure with the Switch measure I wrote earlier, and put the Categories column from the Products table onto columns in the Matrix.

image

But there are now 2 problems.  The first problem is that Total Invoices is shown in currency format.  One limitation of this trick is that all the measures must have the same number formatting.  If you would like to see an optional “Alternative Format” for the SWITCH measure, please vote for that idea here.

The second issue is that the total row is actually showing the total invoices and not the total of all the measures.  This makes sense of course because the grand total row is not filtered.  As a result, the MAX of ID is 4 and hence the SWITCH measure is returning the value of the measure that maps to ID 4 (Total Invoices).

To fix these 2 problems, I removed Total Invoices from my table and rewrote the SWITCH statement to correctly handle the grand total row.
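A hedged sketch of that rewritten SWITCH measure, using HASONEVALUE to detect the unfiltered grand total row (measure names are assumed as before):

```dax
-- Only return a result when a single measure row is in the filter context
Measure on Rows =
IF (
    HASONEVALUE ( MeasureTable[ID] ),
    SWITCH (
        VALUES ( MeasureTable[ID] ),
        1, [Total Sales],
        2, [Total Cost],
        3, [Total Margin]
    )
)
```

When the grand total row is evaluated, HASONEVALUE returns FALSE and the measure returns BLANK, so no misleading total is displayed.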

And there you have it, Measures on Rows in a Power BI Matrix.

image

Direct Connect from Excel to Power BI Service

Today Microsoft announced a great new feature that allows you to direct connect FROM Excel TO Power BI and not the other way around.  This simple change really streamlines the integration experience between Excel and the Power BI Service, and makes Power BI even more like your own personal SSAS server.

Power BI direct connect

There are 2 immediate use cases that come to mind

Personal SSAS Server

It is now easier than ever to effectively have your own free SSAS Server.  Just create a free Power BI account, load up your Power BI workbook, and then access your data models directly from within Excel.

Corporate Sharing

If you are using the paid Pro version of Power BI, the ability to use Power BI Service as a “bonus” SSAS server is also now easier than ever.  Just open Excel and connect directly to the data models that have been shared with you from your organisation via the Excel Power BI menu.

Installing and Getting Started

Download the new version of the Excel Dashboard Publisher Addin here

After installing, you will notice a new Menu Bar in Excel.  Click on this menu and then connect to your Power BI Service account.

image

Once you are signed in, you will be presented with a prompt asking how you want to connect.  In my case I was offered just a single workspace as shown below.

image

What is interesting is that I actually have 2 workspaces in my service, but I can only see one of them.   So something seems a bit amiss here, but I am sure that will be sorted out shortly.

image

A second interesting thing is that you will be presented with options to connect to either a report or a dataset.

image

In my testing, the results were the same.  Maybe something new is coming in the future here!

And yet a third interesting thing is that I was then asked to Authenticate again.  It would be nice if these Azure Authentication dialog boxes were not modal and supported Lastpass.  This would make the login process so much easier.  Even better – Excel could pass the stored credentials to the service so I don’t have to authenticate twice.

image

The good news is this is a one-off additional login process, so the next time you connect from Excel you won’t have to log in again.  The exception is if you have multiple Power BI accounts (eg if you are a consultant like me with many clients on different domains) – in that case you will have to continuously log in and out, and will not be able to benefit from Lastpass.

After logging in, you will then see a new blank Pivot Table connected directly to the Power BI Service.

image

A Final Word

The original Pin to Power BI Dashboard/Pin Manager is still there (see 1 below), giving you great flexibility in sharing ranges in your workbooks in the Power BI Service.

And last but not least, look how POWERful my Excel Ribbon has become (2 below) – I now have 5 POWERful Addins.  Sweeeet.

image

Top Tips for Sharing Content Using Power BI

I have learnt a lot about sharing Power BI content over the last year since Power BI hit the mainstream.  A lot of my learning has come from trial and error, and there are lots of things I wish I knew before I started.  Today I am sharing my top tips about how you can share, and when you should use each of the various methods.

It is not easy to categorise the sharing approaches into logical buckets and there is quite a bit of overlap.  Hopefully my explanation below will make it easier to understand the options.

PBIX Workbooks

The most basic way to share your Power BI content is to simply share your PBIX workbook (created by Power BI Desktop) using traditional methods.  There are a few reasons why you might want to do this.

Reasons you might want to do it:

  • Everyone can download the software and use it for free (Windows users – sorry Mac boys and girls)
  • It has lots of new visualisations you can’t get in Excel
  • You don’t need your IT department to sign you up to the Power BI Service to get started (although you do need Admin Rights to install)
  • It is no harder to share a PBIX workbook than a Power Pivot Excel workbook.

Things that might turn you off:

  • Workbooks can get very large – hundreds of Mb – so they can be hard to share if they are that big.
  • There are no Dashboard capabilities until you publish to Power BI Service.
  • You can’t stop sharing – once you send the workbook, the other party has it (not so with the Service).

So while you can do it this way, sharing via the Power BI Service is probably a better way to go.  Power BI Service was built for the explicit purpose of sharing, so why not use it – it’s free to get started.

My Workspace

Once you sign up to the Power BI Service, the default location where you will publish your workbooks is the “My Workspace” area.

image

  • The workspace area is a place for you to store your Datasets, Reports and Dashboards.
  • You are the only person that can see the items in your own “My Workspace” (unless you explicitly share material)
  • Items that other people share with you will also be visible in your own “My Workspace”, but with a “shared with me” icon.

Sharing from My Workspace (use caution)

There are 3 ways you can share from My Workspace.  These methods of sharing are also available in other areas of Power BI (covered further down the page).  While you “can” share this way from My Workspace, be sure to read up on the downsides of sharing this way and consider the other options further down the page.

Here are the 3 options to share from My Workspace.

Share a Dashboard from My Workspace

Click on the ellipsis next to the dashboard to bring up the sharing menu.  You can share to individual email addresses or to Active Directory Groups.  It is possible to securely share a dashboard this way with people internal to your organisation (email domain) as well as with people outside of your organisation.

image

Publish to Web from My Workspace

Sharing this way allows you to make the material publicly visible over the web.  You can get an embedding code to publish the material via a web page (see this example in a recent blog) or you can get a link that you can send to anyone. But be careful!  Even if you send the link to a single person, the material is available to anyone that has the link (or can find it, hack it etc) – no password required.

image

Create a Content Pack from My Workspace

A Content Pack is a bundle of Datasets, Reports and Dashboards that can be distributed to defined users in your organisation.  When you create the Content Pack, you can specify individual users or groups of users via Active Directory.

image

If someone wants to access this data shared with them as a Content Pack, they simply click “Get Data” and select the content pack from “My Organisation”

image

The benefit of sharing via Content Packs is that you can bundle relevant material and share with only those that need it.  They can choose to access it only when they want it.

There are some downsides of sharing from My Workspace

Here are the pros and cons of sharing this way from My Workspace.  Remember it is possible to share using these methods from a Group Workspace too – more on that below.

Pros:

  • It is free for the first 2 options, but you need a Pro Licence to create a content pack (option 3).
  • It is easy to share very large workbooks with other users as long as they have an account too.
  • You can share with Mac users (via browser)
  • You can share with Mobile users (there are apps and browser solutions)
  • You can stop sharing the content at any time by going back into the share options in the service and removing the sharing approval.

Cons:

  • It is difficult to easily spot the dashboards you have shared from the My Workspace Dashboard view.
  • My Workspace can get very crowded with ad hoc reports and dashboards for your personal use, and these can then get in the way of shared material.
  • If you have a role to share material within an organisation (eg a Business Analyst sharing material), then what will happen when you go on leave?  No one else can access your workspace.
  • There are better ways of doing it by using Group Workspaces.

Group Workspaces

If you are serious about sharing material, then you really should start using Group Workspaces. Group Workspaces are a Pro Licence feature, but it is almost a must for any organisation serious about sharing content via Power BI.

The benefits of using Group Workspaces vs My Workspace include:

  • More than one person can publish, edit and manage all content within a Group Workspace.
  • One or more people can administer user access to the Group Workspace.
  • If the original author is not available (ie they are on holidays or they leave the company) then someone else can take ownership of the material without starting again from scratch.
  • You can have as many Group Workspaces as you need, and these can be based on any group categorisation that works for you and your organisation.
  • When you have your shared material in a Group Workspace, there is no confusion as to what is your personal material and what material you have loaded for the purpose of sharing (see below how you can categorise the Group Workspaces).

image

Once you have decided to use a Group Workspace, you have the same sharing options as covered earlier with all the same benefits.

  • Share a Dashboard
  • Create and Share a Content Pack
  • Public Sharing – Publish to Web

Sharing Group Workspaces

The final method of sharing is to share the Group Workspace itself.  You can of course share the Group Workspace for the purpose of “administration”, but you can also share with the purpose of Group Collaboration and/or Content Sharing.  You have the ability to set access for individuals to be either an administrator or a member.  Admin always has full privileges. Members can be configured by an administrator to either be all “editors” or all “viewers” depending on the group requirements.  One limitation is that you can only share a Group Workspace with individual email address users, not an Active Directory group.  If you think that is just wrong (as I do), then please vote here to have this changed.

Hopefully this article has helped you gain a deeper understanding of the sharing options available with Power BI.  Microsoft has announced it is reviewing a new idea called Private Publish to web, so keep an eye out for that, and vote for it if you like the idea.

Sharing Directly In Excel

As pointed out by Anthony (in the comments below), once you have access to shared data via Power BI, you will have access to another KILLER feature – direct access to the data from within Excel.  There was a new update on this today, and I blogged about it here. exceleratorbi.com.au/direct-connect-excel-power-bi-service/

Power BI May 2016 Update is Another Cracker

I was super excited by the May 2016 Power BI Desktop update as it had a stack of great new features.  Today I am covering the ones I think are the most useful.  They are:

  1. Conditional Formatting
  2. Nudge Objects on the canvas
  3. Quick Calcs
  4. Customisable ToolTips
  5. Improved CSV connector

Conditional Formatting

One thing I find curious about this new feature is that it is only available on the Table visual.  It is not available on the Matrix!! Hopefully Microsoft is planning to fix this soon, as I find the Tables quite limited in their possible use compared to a Matrix.

Conditional formatting is applied from the fields view, not the format view.  You can either right click on a measure in the values section of the fields view, or click on the arrow as shown, then select Conditional Formatting.

image

The formatting options are as you would expect (shown below).

image

And it looks like this once applied.  Note that you can’t apply the format to the total row (you have this optional setting for a pivot table in Excel, but not here – yet).

image

Nudge Objects

This is a big one for me, and makes a BIG difference when editing reports.  I asked for this in one of my ideas posted at ideas.powerbi.com.  What it does is allow you to move objects around on the report canvas by first selecting the object and then using the arrow keys on the keyboard.  The usage is shown below – the object is being moved with the arrow keys only.

keyboard nudge

Quick Calcs

This feature is similar to what is available in Pivot Tables in Excel.  It allows you to easily create percentage of total calculations without having to write any DAX.

image

Currently there are only 2 choices – “no calculation” and “Percent of grand total”.  Given the design of the dialog box, it seems the list of options may grow over time.

image

Once applied, the visualisation changes to show the percentage.

image

It is possible to bring in a second copy of the value field so you can have the value and the % of total together.

image

The quick calc feature is available for naked columns as well as measures.
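Under the hood, a percent of grand total quick calc is roughly equivalent to a measure you could write yourself.  A hedged sketch, assuming a [Total Sales] measure and a Sales table:

```dax
-- Roughly what the quick calc does: current filter context vs all selected rows
% of Grand Total =
DIVIDE ( [Total Sales], CALCULATE ( [Total Sales], ALLSELECTED ( Sales ) ) )
```

Writing it yourself gives you control over the format string and the denominator (eg ALL vs ALLSELECTED).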

Customisable Tool Tips

This useful feature is very easy to use.  There is a new Tooltips drop zone on the Fields list pane.  Just drag in any measure you want to see included in the tool tips.

image

The default tool tip behaviour shows only items in the visualisation (as shown below).

image

You can significantly enhance the experience by adding an additional measure (such as change vs last year), as shown below.

image

Improved CSV Connector

CSV files are the only data source that I have had problems with.  The import dialog has now been enhanced so that you have more granular control over how the file is imported.  Hopefully some of the import problems of the past will be resolved with this enhancement.

image

Here is the link to the full list of new features powerbi.microsoft.com/en-us/blog/power-bi-desktop-may-update-feature-summary/