As promised, today I am sharing with you the process I went through to build the COVID report I shared in my last article. One of the best ways to learn and improve your Power BI skills is to practice. The best way to practice is to find something you care about (work or home) and build something useful. I built this report because I care about the lack of transparency and lack of full information in Australia about what is happening with COVID.
Watch Me Do It
The video today is simply a walk through of what I did to build this report. It is actually a reproduction – I did it first, then recorded a video so you can see what I did. I have tried to keep it “warts and all” so that you can see where I made mistakes, etc. It is quite a long video (36 mins) but I hope you find it a good learning resource.
Summary Of Key Learnings
If you would like to know what I cover in the video, here is a brief list.
- I created a staging query connected to the source
- Inspected the new data source to understand its structure
- Built a star schema
- Unpivoted the data into a long skinny table
- Found and fixed errors in the transformed data
- Realised I should remove zeros from the data
- I showed how to enumerate a list of dates in Power Query
- I showed how to enter today’s date in Power Query
Interactive Report V1.0
Here is a link to the live, interactive report. This will auto refresh every day. I will show you how I did that in my next article.
Wrap Up and Next Steps
Here is a copy of the workbook so you can take a look for yourself.
You will need to download a copy of the data yourself from this link and reconnect the workbook to your local copy. https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.csv
I have already started to improve my report, and I plan to record more videos in the coming weeks to show you the process of continuous improvement to make version 1.0 bigger and better. Keep an eye out for those videos in the coming weeks.
Please let me know in the comments below if you like this style of video so I can get some feedback on that.
HI, Matt, I am back about a similar question since I didn’t get a satisfied answer from other resource.
As you know, Power BI only allows to copy and paste Excel file with 3000 cells into Power BI table.
Unfortunately, I have an Excel file with more than 3000 cells. So I split the Excel file into six small Excel files.
After creating six Power BI tables through copy and paste, I merged the six Power BI tables into the first table through Append Queries approach.
Currently, there is one existing table [New York] loaded from the same original Excel file. I can copy the query in the newly merged table and paste into [New York] table. Then deleted the merged table. But I could not delete tables 2-6 since tables 2-6 are appended to table [New York].
Are there any ways to delete tables 2-6?
Appreciate your help.
Dennis
#”Appended Query” = Table.Combine({#”Changed Type”, Table2, Table3, Table4, Table5, Table6})
I do t really understand what you are saying. In short, you can’t delete query 1 if it is being used in query 2. If you want to delete query 1, presumably you don’t want the data in query 2. So edit query 2 so it doesn’t append query 1, then delete query 1
Also, why did you cut and paste from excel rather than importing from Excel? Was it to remove the reliance on the excel file?
Thank you, Matt,
Sorry for the confusion.
Yes, I try to remove the reliance of the table ‘New York’ on the Excel data source since I could not set up scheduled refresh from clouds service.
I just delete the existing table ‘New York’ and re-load the Excel file ‘New York’. After unchecking ‘Enable Load’, I revise the visuals.
Now I can set up the scheduled refresh from clouds server.
Appreciate your help.
Dennis
Thank you, Matt,
If your video, while loading the csv file, you said:
The first thing I’m going to do is right click and turn off the load so that I have a permanent connection to that csv file.
My understanding about this paragraph is that if I turn off the load, then Power BI Desktop report will keep updating automatically.
In other words, I have some some tables Direct Loaded from Excel. If the Excel files are updated, I have to manually refresh the Power BI Desktop file. How can I refresh the Power BI Desktop file on scheduled or constant basis?
Dennis
Load the report to powerbi.com, install the gateway, set the refresh schedule from there
Thank you, Matt,
If the load is enabled, what will it happen? Could you further explain the reason to turn off the load?
Dennis
The first thing I’m going to do is right click and turn off the load so that I have a permanent connection to that csv file.
If it’s enabled, it will be loaded into the data model. It is just a good practice (not essential) to connect to your source and keep a clean connection with no transformation. This allows you to easily find and see the source data any time, as well as build additional tables from the same source without having to make a new connection
Thank you, Matt,
Finally, I set up the scheduled refresh after installing the gateway.
Appreciate your help.
Dennis
Thank you, Matt, it just open a window for me to work on Power BI.
To Keep the connection constant, you disable the Enable Load. Do you mean the Power BI report always keeps updating given any change in the database?
Dennis
I’m not really sure what you are asking here. If you disable load, the query will still refresh if it is being referenced by another query.
e.g.
Q1 (disabled) => Q2 (enabled)
Q2 References Q1
If you refresh Q2, Q1 will also be refreshed.
Great video, thank you Matt! So helpful to hear and see your thought process and steps as you work through the development of the data load, transformation, and report.
Very awesome Matt. Thank you.
Loved learning new tricks.
At about 5:13 in te video you said ….
“Turn off the load so you have a permanent connection to the CSV file”.
ummm what now? Why? – Did you do this just so it doesn’t reload the data when you click Close & Apply? I haven’t heard of this before.
Also… loved the PowerQuery steps… especially adding Phase to the date table.
And as usual your DAX demos are fantastic.
– David
Hey David. Sorry about that – my words were a bit vague/imprecise. Whether you turn off load or not, the query keeps a permanent connection to the file. What I should have said is “turn off the load. This keeps the permanent connection to the file but simply doesn’t load the data to the model in its raw form”.
The reason I do this is so I can easily see the full raw version of the file if I want to go back to it, plus I can build many new queries off the raw file should I choose to down the track. I learnt this from Ken Puls in the Power Query Academy course, and I do this as my standard operating approach now.
Adding the phases – yes, one of my favourite tricks. The trouble is, when you have powerful tricks up your sleeve, sometimes you miss the obvious. See Kamil’s comment below for a much simpler method.
Great tutorial, Matt. I would pass it to everyone who ask me to help starting his PBI adventure. Just to make things simpler, instead of building lists for phases, I would just merge it for start dates and then fill down for the rest of (sorted) calendar. Just to phase knowledge for beginners :).
Oh, that’s such a good idea, Kamil. Thanks for sharing. The best solutions are the simplest.