I am writing an off-topic article today, but I promise to circle back later in the week with a follow-up article that brings it back on topic. In this article, I am going to share a COVID report I built today; in the follow-up, I will show how I created the full report so you can see me work through an end-to-end problem.
Background – COVID in Australia
Firstly, some background. Thankfully, Australia has been spared the worst of the COVID deaths that have occurred in many countries. This is one of the benefits of being an island nation on the other side of the world. This outcome has been achieved by locking the national borders to international travellers, forcing all returning residents into mandatory 14 day hotel quarantine, and jumping on any outbreaks with lockdowns of various severities.
As my Australian readers would well know (and maybe some readers from around the world, too), Australia is now entering a new stage of its fight against COVID-19. The Delta strain has appeared on our shores and by all reports (and observations), this virus is much more contagious than previous strains. The response from two of the state governments in Australia (NSW and VIC) is to hard lock down 50% of the country in an attempt to eradicate the Delta strain from Australia. At the time of this article, Australia is experiencing a 7 day average of 114 cases per day (in a population of 25 million people). Despite the lockdowns, the signs are that control is slipping away.
Without wanting to get too political in this article, I must say that I have been frustrated by the lack of availability of all relevant data provided by the Australian media and Governments. There is a lot of commentary about how highly contagious the Delta strain is (vs previous strains), but not much commentary about how severe the Delta cases are, nor what impact vaccinations are having on outcomes. When I look at data coming from overseas (mainly the UK), it would seem that the Delta variant in a partially vaccinated population is significantly less deadly than previous strains of COVID, 10 to 20 times less deadly, in fact.
Let’s Look at All the Data
So with that as background, I decided to do my own research. This is one of the benefits of being a Power BI professional – if you don’t get to see the analysis that you need, then you can simply build your own. I set out to find all the relevant data, build a report, and then form my own conclusion as to whether this Delta variant outbreak in partially vaccinated Australia is less deadly than previous outbreaks.
Spoiler alert 19th July 2021 – it’s too early to say based on the Australian data, but there should be a clear answer by the end of July 2021.
Methodology
After I loaded and completed a brief exploration of the data, I decided to group and then split the data into 4 phases as follows.
I acknowledge that this is somewhat arbitrary, but it is not without some logic. I analysed the case and death data to find some logical groupings of time where it made sense (to me at least) to split and group the data. I have called these “phases” and not “waves”, as the third phase could not really be called a “wave”; it was a period of relative calm in the country. I explain the logic for my splits in the video below. If you think there is a better way to split the data, then please say so in the comments below.
Case Fatality Rate
After completing my analysis, here is the case fatality rate for each of my phases (as at 17th July 2021).
The way to read this chart is that phase 2 was the most fatal in Australia, with 3.94% CFR (deaths as a percentage of recorded cases). All other phases are compared against that worst period to determine how many times “less fatal” each other phase was vs phase 2. Phase 3 is showing that it was 33 times less fatal than phase 2, and phase 4 (so far, at least) is running at 17 times less fatal.
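As a rough illustration of the arithmetic behind this comparison, here is a minimal Python sketch. It is not the Power BI model itself; the function names are made up for this example. The 3.94% figure is the phase 2 CFR quoted above, while the 0.12% figure for phase 3 is a rounded value taken from a reader's table in the comments and should be treated as an assumption.

```python
def case_fatality_rate(deaths: int, recorded_cases: int) -> float:
    """CFR = deaths as a fraction of recorded cases."""
    if recorded_cases <= 0:
        raise ValueError("recorded_cases must be positive")
    return deaths / recorded_cases


def times_less_fatal(cfr_by_phase: dict[str, float]) -> dict[str, float]:
    """Compare every phase against the phase with the highest CFR."""
    worst = max(cfr_by_phase.values())
    return {
        phase: (worst / cfr if cfr > 0 else float("inf"))
        for phase, cfr in cfr_by_phase.items()
    }


# Rounded CFRs: 3.94% is quoted in the article; 0.12% is an assumed,
# rounded phase 3 value borrowed from the comment thread.
cfr_by_phase = {"Phase 2": 0.0394, "Phase 3": 0.0012}
ratios = times_less_fatal(cfr_by_phase)
print(f"Phase 3 was about {ratios['Phase 3']:.0f} times less fatal than phase 2")
```

Because the inputs are rounded percentages, ratios computed this way can differ slightly from those in a report built on the underlying case and death counts.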
I specifically want to call out “so far” for this phase for a very important reason. The data clearly shows that, historically, deaths lag the initial spike in case numbers by about 45 days. It would appear from the data that if there is going to be a spike in deaths in Australia from the cases recorded in this latest phase, then we will start to see the spike appear in the next 2 weeks. Hopefully this will not happen, and the theory that the Delta strain and the benefits of vaccination have significantly reduced the CFR will be shown to be correct.
I will update this article over the next few weeks to keep you informed of what the numbers show.
Video Run Through
I recorded a brief video explaining the report I have built, and also showing how I came to the decision on where to start and end the phases.
Edit: When I first released this article, I incorrectly referred to the Infection Fatality Rate when what I was actually showing was the Case Fatality Rate (thanks Paul for pointing this out to me). My calculations divide deaths by recorded cases. The number of recorded cases is always less than the number of infections, because not all infections are recorded. I have corrected the reference where possible, but please note that in the video I continue to refer to Infection Fatality Rate whereas the correct term is Case Fatality Rate.
Interactive Report
And here is an interactive copy of the report if you would like to take a closer look. I have now scheduled this to auto refresh every day, so it should stay up to date from here on in.
Watch Out for my Next Article
If you are interested in seeing how I used Power BI to build this report, keep an eye out for my more technical article on this topic later in the week.
Cool analysis.
Will you be sharing the model in a form we could use? I would like to use our local values, both for my province and country. For a while last August I lived in the “worst” part of the country. I didn’t realize that until somewhat later as I wasn’t following the stats.
Absolute counts are “nice”, but misleading. I would like to see a stat that can be easily and universally compared. Counts can’t be compared, but rates per 100K or per million population allow much better and more reasonable comparisons between geographic locations, independent of population size.
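To make the per-capita idea concrete, here is a minimal Python sketch. The function name is made up for this example; the inputs are the 7 day average of 114 cases per day and the population of 25 million quoted in the article.

```python
def rate_per(count: float, population: int, per: int = 1_000_000) -> float:
    """Scale an absolute count to a rate per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * per


daily_cases = 114          # 7 day average quoted in the article
population = 25_000_000    # population quoted in the article

per_million = rate_per(daily_cases, population)
per_100k = rate_per(daily_cases, population, per=100_000)
print(f"{per_million:.2f} cases per million per day")
print(f"{per_100k:.3f} cases per 100K per day")
```

The same function can be applied to any region's counts and population, which is what makes the resulting rates comparable across locations.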
I expect the mortality rate for the current phase will be much lower than earlier phases. Beyond the positive effect of inoculations, there is a simpler reason. We (as societies) have already murdered a large part of the most vulnerable population. That is the elderly. I don’t know how it was down under, but here the “LTC”, Long Term Care (using care loosely), homes were grossly negligent. They were, and still are, unprepared, understaffed, incompetently run and entirely focused on profits. Even if you give them a pass on the first phase, which we shouldn’t, they still had unacceptable surges in subsequent phases. Instead of allowing limited family access (i.e. 1 selected family member) to provide personal support, they locked everyone out. Then their overworked staff effectively abandoned the residents to die in squalor. The stories that our military reported when they were finally asked to step in and help are truly horrifying.
Disturbingly, the articles I’ve been reading suggest that Covid may now be attacking the other end of the age spectrum, including unvaccinated kids under 12, but not exclusive to them. Under-30s who are partying are getting infected too.
PS: I’m getting an error on the inserted chart. “The visual has unrecognized fields”
“Unrecognized fields in this visual
We are not able to identify the following fields: Data[Lag Point], Data[Chart Title]. Please update the visual with fields that exist in the dataset.
Activity ID: 0b5e4dc2-c0c1-40e1-ab4f-9cdd560ae8cb
Request ID: f0e0842f-769a-454c-2173-1af23ef58b6a
Correlation ID: f2b143a2-5816-63a2-0df1-2a8c28b648bc
Time: Tue Jul 20 2021 00:27:41 GMT-0500 (Central Daylight Time)
Service version: 13.0.16442.42
Client version: 2107.2.06866-train
Cluster URI: https://wabi-australia-southeast-api.analysis.windows.net/”
Hey Ron.
Yes, I will share my workbook in the next article.
Yes, rates per million population are great for comparing countries and regions, but they are not needed for my article (which compares the same region in different time periods). The data is there.
Yes, the most vulnerable will obviously be affected first. And yes, there are many reasons these things become less deadly over time. The question is, at what point do you stop treating something as extraordinary and when do you go back to “the new” normal?
The purpose of my post is to provide some objectivity into the changing pattern between case rates and death rates. If the pattern has changed, then we need to know and we need to talk about it. If we don’t monitor this, then governments may be taking the wrong response based on tracking only a lead indicator that is no longer fit for purpose. I am not saying it is, or it isn’t. I am saying we need to track it so we can have an objective conversation.
I’m not sure what the error is – I will take a look. I think it was a version issue, and I think it should be OK now.
That’s interesting, Matt! It inspired me to carry out your form of analysis on the data. I do follow the OWID data series but not as you have done. However, I have three comments to make.
As far as the lag in deaths is concerned, it seems to me that it is 36 days for Australia rather than 45 days initially. I say this by observing the date of the first deaths, 1/3/2020 and finding that since the first cases in the database are recorded on 26/1/2020, the lag seems to be 36 days not 45. For the rest of my comment, I stuck to a 36 day lag, right or wrong since I have no idea what it is.
Secondly, I was able to follow your methodology easily since you have put this piece together so well. What I then think you are missing in your table of IFR is the lagging: whether 45 days or 36 days, you don’t seem to have lagged the deaths for that part of your exercise. My suggestions for lagging give these results:
36 day lag
Date Range | IFR MA 0 | IFR MA 36
26/1/2020 – 31/5/2020 | 1.43% | 1.47%
1/6/2020 – 31/10/2020 | 3.94% | 3.93%
1/11/2020 – 31/5/2021 | 0.12% | 0.08%
1/6/2021 – 13/6/2021 | 0.22% | 2.78%
45 day lag
Date Range | IFR MA 0 | IFR MA 45
26/1/2020 – 31/5/2020 | 1.43% | 1.54%
1/6/2020 – 31/10/2020 | 3.94% | 3.91%
1/11/2020 – 31/5/2021 | 0.12% | 0.16%
1/6/2021 – 4/6/2021 | 0.22% | 5.13%
Because of Australia’s covid-19 profile, the differences between your lagged and unlagged results are not so massive except for the final line in each case.
In your unlagged version, you are able to evaluate the IFR up until the end of the available data. With a 36 day lag, you could only evaluate the IFR up until 13/6/2021, and with a 45 day lag that becomes 4/6/2021.
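The lagging idea described above can be sketched in Python as follows. This is only one possible reading of the approach, not the method actually used for the tables: deaths are counted over the case window shifted forward by the lag. The function names and the dictionary-based daily series are assumptions for illustration, and the data end date of 19/7/2021 is inferred from the two cut-off dates quoted above rather than stated anywhere.

```python
from datetime import date, timedelta


def last_evaluable_case_date(data_end: date, lag_days: int) -> date:
    """Latest case date whose lagged deaths are already in the data."""
    return data_end - timedelta(days=lag_days)


def lagged_cfr(cases: dict[date, int], deaths: dict[date, int],
               start: date, end: date, lag_days: int) -> float:
    """Deaths recorded `lag_days` after each case date, divided by cases.

    `cases` and `deaths` map a date to that day's new count.
    """
    lag = timedelta(days=lag_days)
    n_days = (end - start).days + 1
    case_days = [start + timedelta(days=i) for i in range(n_days)]
    total_cases = sum(cases.get(d, 0) for d in case_days)
    total_deaths = sum(deaths.get(d + lag, 0) for d in case_days)
    if total_cases == 0:
        raise ValueError("no cases recorded in the chosen window")
    return total_deaths / total_cases


# Assumed data end date, inferred from the cut-offs in the comment.
data_end = date(2021, 7, 19)
print(last_evaluable_case_date(data_end, 36))  # cut-off for a 36 day lag
print(last_evaluable_case_date(data_end, 45))  # cut-off for a 45 day lag
```

Setting `lag_days` to 0 reduces `lagged_cfr` to the plain, unlagged CFR for the window, which makes it easy to compare the two versions side by side.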
Finally, having looked at the data for Australia, my own cut off dates, for what they are worth would be:
Date Range
26/1/2020 – 31/3/2020
1/4/2020 – 30/6/2020
1/7/2020 – 31/8/2020
1/9/2020 – 4/6/2021
I chose cut off points at the start of what I saw as a new phase to suggest those dates.
I hope you find that useful, Matt. I have only concerned myself with the statistics, since my Power BI skills are nowhere near good enough to challenge what looks really good!
Duncan
Good comments. I think it makes sense to lag the deaths against the cases in order to get a more accurate figure. This could be easily built into the model. I do not expect that the change in the results will be material, however. My expectation is that one of two things will happen next.
1. Australia will follow the patterns seen around the world, with the CFR being orders of magnitude lower in this current phase, or
2. Something different will happen.
My tip is 1, but let’s wait and see. The other interesting thing to watch will be the difference between Australia, with relatively low vaccination rates overall, and the UK, with higher rates overall.
I may do a follow up article where I build the lag factor into the model.
2nd phase was when it got into nursing homes in Vic
Yes
I would be interested in any analysis of the case morbidity rate, i.e. the fraction of cases where survivors have serious long term health effects. Unfortunately I don’t know if any numbers are kept about this; I haven’t seen any in the news I’ve read.
The long term selective pressures on viruses tend to push them to higher transmission rates and lower fatality rates. It would not surprise me if the Delta variant spreads easier and is milder, however, some numbers and analysis on this would be good.
Thanks Matt 🙂
I totally agree with you Andrew. I guess there is no long term data on serious long term health effects because we are still in the early days. Long term data only becomes available in the “long term”. In the same way, there is no long term data about the impact of vaccines for the same reason. It would also not surprise me if the CFR is lower with Delta in Australia – in fact that is what I set out to show. But what I discovered is “it’s too early to tell” for the reasons covered in my video. We will know for sure in 4 weeks from now.
Very interesting Matt, though I believe the correct scientific term for the calculation is “Case fatality rate”, since the calculation is based on the cases identified as positive and not the total population who have actually been infected by Covid (which would be the Infection fatality rate). There is a large number of cases which have not officially been recorded (my whole family is part of that number & I know of dozens of others), since a large % show no symptoms (and therefore never get tested – this segment can be substantial, especially in the younger age groups), plus those who caught Covid in the various waves but were told to stay at home unless the disease developed into something serious: these cases have not been recorded as Covid positive since no test was done at the time.
To calculate the IFR, some countries have tested a sample of the population for antibodies to come up with a statistical estimate of the total population which has had Covid. For example in Spain, the last study I know of was in November, where the case infection rate was around 2.5% but the estimated IFR was 0.98%. I believe other countries with similar studies have come up with an IFR between 0.6% and 0.8%.
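The difference between the two rates can be sketched in Python as follows. All the numbers here are made up purely for illustration and are not taken from any study; the function names are also assumptions. The only point is that the IFR divides the same deaths by a larger, estimated number of infections.

```python
def case_fatality_rate(deaths: int, recorded_cases: int) -> float:
    """Deaths divided by cases confirmed by a test."""
    return deaths / recorded_cases


def infection_fatality_rate(deaths: int, population: int,
                            seroprevalence: float) -> float:
    """Deaths divided by estimated infections.

    Infections are estimated as the share of an antibody survey sample
    that tested positive (`seroprevalence`) times the population.
    """
    estimated_infections = seroprevalence * population
    return deaths / estimated_infections


# Hypothetical inputs, for illustration only.
deaths = 1_000
recorded_cases = 40_000
population = 1_000_000
seroprevalence = 0.10  # 10% of the survey sample had antibodies

cfr = case_fatality_rate(deaths, recorded_cases)
ifr = infection_fatality_rate(deaths, population, seroprevalence)
print(f"CFR: {cfr:.2%}, IFR: {ifr:.2%}")
```

With these made-up inputs the CFR comes out higher than the IFR, which is the expected direction whenever recorded cases undercount true infections.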
Just thought it was worth pointing out.
Thanks Paul. I was aware that the official numbers understated the true infection rate for the reasons you mentioned, but I was not aware that there was a different term for this. Thanks for clarifying. I have already recorded the second video (how I produced this report) but I will add an update correcting this.
Thanks for sharing and enlightening.
Happy to have helped! As it is, I myself have made the mistake of writing “case infection rate” in Spain’s example when it should of course be “case fatality rate”. Not sure if you’re interested in exploring this further, but I have myself been following the pandemic in a PBI report (please excuse the fact that it’s in Spanish and the appalling graphics & design):
https://app.powerbi.com/view?r=eyJrIjoiMWM1MTVjYmItODg0Yi00YjVhLWJhMjEtYWIxODQ2ZDRhMTRkIiwidCI6ImFiOTJkNmExLTU4NWItNGU2Ny04OWNmLTQ2ZGZhZmE2YzQyNyIsImMiOjl9
Estimates of IFR I have seen are all around 0.1%-0.2% with some below 0.1%. Even back in June 2020, CDC (US) guidance to hospitals was based implicitly on a rate of 0.26%. That IFR is well below 1% has been well established for over a year now.
Of course IFR isn’t a single rate, as it is a function of age and health (and later waves seem to be targeting mostly younger people).
The best insights into IFR use Number of tests and %positive rate as a way to guess at full numbers of infections, with some recognition that current testing is much more random than it was before.
This becomes dangerous talk, because if you work backwards from the lower IFR numbers accepted now and the high numbers of deaths in early phases, only 3 conclusions make sense: that case treatment has dramatically improved (likely to be only a partial answer), that early waves infected a huge proportion of the population, and that vulnerable groups were being exposed too much in early phases. All 3 suggest that Covid was not the threat it was reported to be and is even less so now.
I could not agree more.