
Solving a Complex Time Problem In Power BI

Level: Intermediate

I was delivering some training last week for a customer in Sydney, and the participants had an interesting problem: they needed to report on a rolling time window.  Think of it a bit like rolling 12 month sales, or average weekly sales over a rolling 4 week period, except that this problem uses a rolling time window of just a few hours.  I have decided to demonstrate how to solve this problem using my own solar electricity data, building a rolling 3 hour average of kWh consumption.

This is an interesting problem for a few reasons which I will explain below.

Split Your Date and Time

The first thing to note is that whenever possible, you should split your DateTime columns into separate Date and Time columns.  The reason this is best practice is the way Power BI (the VertiPaq engine) compresses data.  The more unique values in any given column, the worse that column compresses.  It therefore follows that 2 columns, one for date and one for time, will compress better than 1 column containing DateTime: a year of data at one second precision has over 31 million possible DateTime values, but only 365 unique dates and 86,400 unique times.  As a side point, you should also remove unnecessary precision from your time columns where possible.  If you don’t need seconds, then truncate the seconds and just load minutes.  If you don’t need minutes, then just load the hour, and so on.  In summary, don’t load precision you don’t need in your reporting – your report performance will improve as a result.
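
To make the idea concrete, here is a minimal sketch of the split and the precision truncation, written as DAX expressions with hypothetical column names (Fact[ReadingDateTime]).  In practice you would do this transformation in Power Query or at the source, before the data is loaded, so the original DateTime column never reaches the model.

    // Hypothetical fact table column: Fact[ReadingDateTime]
    // Date part only (no time)
    Reading Date = DATE ( YEAR ( Fact[ReadingDateTime] ), MONTH ( Fact[ReadingDateTime] ), DAY ( Fact[ReadingDateTime] ) )
    // Time part, truncated to 15 minute precision (seconds discarded)
    Reading Time = TIME ( HOUR ( Fact[ReadingDateTime] ), FLOOR ( MINUTE ( Fact[ReadingDateTime] ), 15 ), 0 )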

Rolling 3 Hour Window

One consequence of splitting a DateTime column into 2 columns is that it is a bit harder to write a rolling 3 hour calculation.  I can’t simply take the current DateTime and go back in time to grab the data points that make up the preceding 3 hours, because the filters for date and time are applied independently.  This can be seen clearly in my work in progress measure shown below.

[Image: the work in progress rolling 3 hour measure, which resets when the day changes]

Note above that when the day changes, my rolling 3 hour window measure (WIP) resets and does not correctly pick up the data from late in the previous day.  While this is a tricky problem, it is actually quite easy to solve as long as you break it into pieces and solve each part one at a time.
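
To see why the reset happens, here is a sketch of the kind of work in progress measure I mean.  It is essentially the “today” half of the final measure further down: it assumes a TimeID column that numbers the 15 minute periods of the day from 1 to 96 (so 12 periods is 3 hours) and only manipulates the filter on the 'Time' table.  The filter on 'Calendar'[Date] is left alone, so for the first 11 periods of each day there is nothing earlier to sum and the window appears to reset.

    Rolling 3 hour kWh (WIP) = 
        // Only the Time filter is changed; the Calendar date filter stays on the current day
        VAR ToTimeID = SELECTEDVALUE('Time'[TimeID])
        VAR FromTimeID = ToTimeID - 11
        RETURN
            CALCULATE(
                [Total kWh],
                FILTER(ALL('Time'), 'Time'[TimeID] >= FromTimeID && 'Time'[TimeID] <= ToTimeID)
            )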

Watch Me Solve This Problem Myself

One thing I like to teach people is the process of solving a problem.  Each problem is different, and hence each solution is different.  When I start, I normally don’t know exactly what the final result will look like.  All I do is work through the problem, one step at a time, until I have a working solution.  If you watch my video below, you will see how I solved this particular problem, but more importantly you will see the approach I used, and hopefully you will be able to adopt that approach to solve your own problems.


My Final Measure

For the record, here is the final measure I ended up writing.  Remember, the key point is how I wrote the measure, not the measure itself.

Rolling 3 hour average = 
    // TimeID numbers the 15 minute periods of the day from 1 to 96, so 12 periods = 3 hours
    VAR ToTimeID = SELECTEDVALUE('Time'[TimeID])
    VAR FromTimeID = ToTimeID - 11
    // The part of the 3 hour window that falls within the current day
    VAR TodaysValue = 
        CALCULATE(
            [Total kWh],
            FILTER(ALL('Time'), 'Time'[TimeID] >= FromTimeID && 'Time'[TimeID] <= ToTimeID)
        )
    VAR TodaysDate = SELECTEDVALUE('Calendar'[Date])
    VAR YesterdaysDate = TodaysDate - 1
    VAR ToTimeIDYesterday = 96
    // For the first 11 periods of each day, the balance of the window comes from the end of yesterday
    VAR PeriodsNeededFromYesterday = IF(ToTimeID < 12, 12 - ToTimeID)
    VAR FromTimeIDYesterday = 96 - PeriodsNeededFromYesterday + 1
    VAR YesterdaysValue = 
        CALCULATE(
            [Total kWh],
            FILTER(
                ALL('Time'),
                'Time'[TimeID] >= FromTimeIDYesterday && 'Time'[TimeID] <= ToTimeIDYesterday
            ),
            ALL('Calendar'[Date]), 'Calendar'[Date] = YesterdaysDate
        )
    RETURN YesterdaysValue + TodaysValue
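
For reference, TimeID above is the 15 minute period of the day, numbered 1 (00:00) to 96 (23:45).  That is my reading of the workbook; as a sketch, such a column could be built on the 'Time' table like this, assuming a Time column at 15 minute granularity (the sample workbook may derive it differently):

    // 00:00 -> 1, 00:15 -> 2, ... , 23:45 -> 96
    TimeID = HOUR('Time'[Time]) * 4 + MINUTE('Time'[Time]) / 15 + 1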

I have also loaded my sample workbook here if you would like to take a look.

17 thoughts on “Solving a Complex Time Problem In Power BI”

  1. Hi Matt,

    I have also found this solution really helpful, and your way of teaching and solving complex problems is awesome. Thank you for all your efforts in the journey of mentoring Power BI learners and practitioners.

  2. Hi There,

    Awesome solution! I’m having issues when I try to apply this measure over an entire dataset, rather than the currently selected date. Is there a way to use the same DAX over a large dataset? (I have about 9 months’ worth of data, and I’m calculating hourly rolling averages.)

    Essentially, I have the correct output visually, but I can’t feed the results into cards to calculate the average over 1 day, 3 days, 7 days, a month, etc.

    Hope that makes sense! Any suggestion would be greatly appreciated 🙂

    1. Hi Shane,

      I have a similar issue to the one you were facing. I would appreciate it if you could share how you solved this over the entire dataset. I also get the correct output for the hourly rolling sum when I put the measure in a line chart. Now I want to find the maximum hourly rolling sum over a 3 month dataset. Thanks!

  3. Hi Matt,

    This is a great tutorial and thanks for sharing this! Could you share some thoughts on the following?

    When you split your datetime into date and time columns to allow VertiPaq to compress the data, was this purely for storage savings or also for DAX performance?

    I’m wondering if you also tried performing the moving average by referencing the datetime column with an ALLSELECTED/ALL? I have tried this for a similar problem (hourly price data) on my side, but am facing some performance problems and am unsure of the root cause.

    1. Storage performance and DAX performance are 2 sides of the same coin – they are deeply related. So the short answer is “both”. This is a big topic. I have no doubt there could be issues using ALL and its siblings. I did some work for a client where we changed the structure of the source data to solve a reporting issue, but these things always depend on the detail.

  4. Hi Matt,
    Thank you for your crystal clear explanation. I have data at hour granularity (a date/time column in my fact table). In my fact table I have data for every hour of the day, covering three years, 2018-2020. I want to calculate 1 hour, 8 hour and 24 hour rolling averages using your approach, but I couldn’t. Specifically, I am struggling to join the time table and the data table, given that my date table has unique date values that relate to my fact table and time table on one-to-many relationships. I really appreciate your help.

  5. Hi Matt,

    Thanks for another great article. Can I dig a bit deeper into splitting date and time?

    I’ve read that it reduces the file size of load-enabled queries. Is there also a benefit to splitting them in queries that you’re not loading?

    Also, if you have several queries all joining to a central date/time table, do the advantages of splitting the date and time fields outweigh the disadvantages of needing to join all those queries on two fields rather than one? Or are there in fact no disadvantages and Power BI performs better with such double joins rather than worse?

    Thanks again,
    PR

    1. 1. There is no benefit to splitting DateTime in queries that are not loaded. The benefit comes from compression of the loaded data, hence not loaded = no benefit.
      2. It always depends on your data, but my guess is that splitting date and time would normally be better than the alternative, particularly if you can reduce the granularity of the time too (i.e. round to minutes, not milliseconds). A high cardinality relationship will always be more expensive than a low cardinality relationship. But sometimes there are different ways to solve problems which may end up being more efficient. E.g. if you have 2 datetime columns, you could load one datetime column and then load the “difference” between the 2 columns. It always depends, but personally I would start out with the assumption that splitting DateTime is likely to be the most efficient approach.
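
      As a minimal sketch of that “difference” idea, assuming hypothetical StartDateTime and EndDateTime columns (and, as above, ideally computed before load rather than in DAX):

          // Load StartDateTime plus a low cardinality duration, instead of a second DateTime column
          Duration Minutes = DATEDIFF('Fact'[StartDateTime], 'Fact'[EndDateTime], MINUTE)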

  6. Matt, another good explanation of breaking down a problem. I’m wondering whether the MOD function would be useful here?
    The TimeID would start from 0 and run to 95.
    E.g. 1: With a “to” TimeID of 11 (2:45am), the “from” TimeID would be 0.
    E.g. 2: With a “to” TimeID of 5 (1:15am), the “from” TimeID would be MOD(5-11,96) = 90. Add in similar logic to yours to make sure the today’s component is from 0 to 5, and yesterday’s is from 90 to 95.
    The total needs to be divided by 12 to get the rolling average.
    The same end result but maybe a touch “cleaner”. Just a thought 🙂

        Rolling 3 hour average DS = 
            VAR ToTimeID = SELECTEDVALUE('Time'[TimeID])
            VAR FromTimeID = MAX(0, ToTimeID - 11)
            VAR TodaysValue = 
                CALCULATE(
                    [Total kWh],
                    FILTER(ALL('Time'), 'Time'[TimeID] >= FromTimeID && 'Time'[TimeID] <= ToTimeID)
                )
            VAR TodaysDate = SELECTEDVALUE('Calendar'[Date])
            VAR YesterdaysDate = TodaysDate - 1
            VAR ToTimeIDYesterday = 95
            VAR FromTimeIDYesterday = IF(ToTimeID < 11, MOD(ToTimeID - 11, 96), 96)
            VAR YesterdaysValue = 
                CALCULATE(
                    [Total kWh],
                    FILTER(ALL('Time'), 'Time'[TimeID] >= FromTimeIDYesterday && 'Time'[TimeID] <= ToTimeIDYesterday),
                    ALL('Calendar'[Date]), 'Calendar'[Date] = YesterdaysDate
                )
            RETURN (YesterdaysValue + TodaysValue) / 12
  7. fact table
    Sale Date Sale
    1-Jan 14
    2-Jan 14
    3-Jan 12
    4-Jan 13
    5-Jan 12
    6-Jan 12
    1-Feb 12
    2-Feb 10
    3-Feb 13
    4-Feb 13
    5-Feb 11

    datetable
    DateTable
    1-Jan
    2-Jan
    3-Jan
    4-Jan
    5-Jan
    6-Jan
    7-Jan
    8-Jan
    9-Jan
    10-Jan
    11-Jan

    I want to populate the Jan sale against each month. Please help.
    Row Labels Sum of Sale Sum of Sale
    Jan 77 77
    Feb 59 77
    Mar 81 77
    Apr 45 77

  8. Thank you for being willing to use your expertise to guide beginners like myself! Very informative to watch you work through that.
