Solving a Complex Time Problem In Power BI - Excelerator BI

Solving a Complex Time Problem In Power BI

Level: Intermediate

I was delivering some training last week for a customer in Sydney and they participants had an interesting problem where they needed to be able to report on a rolling time window.  Think of it a bit like Rolling 12 Months Sales, or average weekly sales over a rolling 4 week period, but instead this problem uses a rolling time window of just a few hours.  I have decided to demonstrate how to solve this problem using my own solar electricity data and build a rolling average 3 hourly kWh consumption.

This is an interesting problem for a few reasons which I will explain below.

Split Your Date and Time

The first thing to note is that when ever possible, you should split your DateTime columns into separate Date and Time columns.  The reason this is best practice is due to the way Power BI (The Vertipaq Engine) compresses data.  The more unique values in any given column, the worse the data will be compressed.  It therefore follows that 2 columns, one for date and one for time will compress better than 1 column containing DateTime.  As a side point, you should also remove un-necessary precision from your time columns where possible.  If you don’t need seconds, then truncate the seconds and just load minutes.  If you don’t need minutes, then just load the hour, and so on.  In summary, don’t load precision you don’t need in your reporting – your report performance will improve as a result.

Rolling 3 Hour Window

One consequence of splitting a DateTime column into 2 columns is that it is a bit harder to write a rolling 3 hour calculation.  I can’t simply take the current DateTime and then go back in time and grab the data points that make up preceding 3 hours because the filters for date and time are applied independently.  This can be clearly seen below from my work in progress measure shown below.

image

Note above, when the day changes, my rolling 3 hourly window measure (WIP) resets and is not correctly loading the data from late in the previous day.  While this is a tricky problem, it is actually quite easy to solve as long as you break the problem into pieces and solve each part of the problem one at a time.

Watch Me Solve This Problem Myself

One thing I like to teach people is the process of how to solve a problem.  Each problem is different and hence each solution is different.  When I start the process I normally don’t know exactly what the final result will look like.  All I do is work through the problem, one step at a time, until I have a working solution.  If you watch my video below, you will see how I solved this particular problem, but more importantly you will see the approach I used and hopefully you will be able to adopt this approach yourself to solve your own problems.

My Final Measure

For the record, here is the final measure I ended up writing.  Remember the key point is “How” I wrote the measure, not the measure itself.

Rolling 3 hour average =
VAR ToTimeID = SELECTEDVALUE ( 'Time'[TimeID] )
VAR FromTimeID = ToTimeID - 11
VAR TodaysValue =
    CALCULATE (
        [Total kWh],
        FILTER (
            ALL ( 'Time' ),
            'Time'[TimeID] >= FromTimeID
                && 'Time'[TimeID] <= ToTimeID
        )
    )
VAR TodaysDate = SELECTEDVALUE ( 'Calendar'[Date] )
VAR YesterdaysDate = TodaysDate - 1
VAR ToTimeIDYesterday = 96
VAR PeriodsNeededFromYesterday =
    IF ( ToTimeID < 1212 - ToTimeID )
VAR FromTimeIDYesterday = 96 - PeriodsNeededFromYesterday + 1
VAR YesterdaysValue =
    CALCULATE (
        [Total kWh],
        FILTER (
            ALL ( 'Time' ),
            'Time'[TimeID] >= FromTimeIDYesterday
                && 'Time'[TimeID] <= ToTimeIDYesterday
        ),
        ALL ( 'Calendar'[Date] ),
        'Calendar'[Date] = YesterdaysDate
    )
RETURN  YesterdaysValue + TodaysValue

I have also loaded my sample workbook here if you would like to take a look.

18 thoughts on “Solving a Complex Time Problem In Power BI”

  1. Hi Matt,

    Thanks for another great article. Can I dig a bit deeper into splitting date and time?

    I’ve read that it reduces the file size of load-enabled queries. Is there also a benefit to splitting them in queries that you’re not loading?

    Also, if you have several queries all joining to a central date/time table, do the advantages of splitting the date and time fields outweigh the disadvantages of needing to join all those queries on two fields rather than one? Or are there in fact no disadvantages and Power BI performs better with such double joins rather than worse?

    Thanks again,
    PR

    1. 1. There is no benefit of splitting dateTime in queries that are not loaded. The benefit comes from compression of the loaded data, hence not loaded = no benefit.
      2. It always depends on your data, but my guess is that splitting date and time would normally be better than the alternative, particularly if you can reduce the granularity of the time too (ie round to minutes, not milliseconds). A high cardinality relationship will always be more expensive than a low cardinality relationship. But sometimes there are different ways to solve problems which may end up being more efficient. Eg if you have 2 datetime columns, you could load one datetime column and then load the “difference” between the 2 columns. It always depends, but personally I would always start out with the assumption that splitting dateTime is likely to be the most efficient approach.

  2. Matt, Another good explanation of breaking down a problem. I’m wondering whether the MOD function would be useful here?
    The TimeID would start from 0 and run to 95.
    E.g. 1: With a “to” TimeID of 11 (2:45am), the “from” TimeID would be 0.
    E.g. 2: With a “to” TimeID of 5 (1:15am), the “from” TimeID would be MOD(5-11,96) = 90. Add in similar logic to yours to make sure the today’s component is from 0 to 5, and yesterday’s is from 90 to 95.
    The total needs to be divided by 12 to get the rolling average
    The same end result but maybe a touch “cleaner”. Just a thought 🙂

        • Rolling 3 hour average DS = VAR ToTimeID = SELECTEDVALUE(‘Time'[TimeID])
        • VAR FromTimeID = max(0,ToTimeID – 11)
        • VAR TodaysValue = CALCULATE([Total kWh],FILTER(ALL(‘Time’),’Time'[TimeID] >=FromTimeID && ‘Time'[TimeID] < =ToTimeID))
        • VAR TodaysDate = SELECTEDVALUE(‘Calendar'[Date])
        • VAR YesterdaysDate = TodaysDate – 1
        • VAR ToTimeIDYesterday = 95
        • VAR FromTimeIDYesterday =if([ToTimeID]<11,mod([ToTimeID]-11,96),96)
        • VAR YesterdaysValue = CALCULATE([Total kWh],
        • FILTER(ALL(‘Time’),’Time'[TimeID] >=FromTimeIDYesterday && ‘Time'[TimeID] < =ToTimeIDYesterday),
        • ALL(‘Calendar'[Date]),’Calendar'[Date] = YesterdaysDate)
        • Return (YesterdaysValue + TodaysValue)/12
  3. fact table
    Sale Date Sale
    1-Jan 14
    2-Jan 14
    3-Jan 12
    4-Jan 13
    5-Jan 12
    6-Jan 12
    1-Feb 12
    2-Feb 10
    3-Feb 13
    4-Feb 13
    5-Feb 11

    datetable
    DateTable
    1-Jan
    2-Jan
    3-Jan
    4-Jan
    5-Jan
    6-Jan
    7-Jan
    8-Jan
    9-Jan
    10-Jan
    11-Jan

    i want to populate Jan sale against each month, Please help
    Row Labels Sum of Sale Sum of Sale
    Jan 77 77
    Feb 59 77
    Mar 81 77
    Apr 45 77

  4. Thank you for being willing to use your expertise to guide beginners like myself! Very informative to watch you work through that.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top