When a new disease starts to spread, the most pressing questions are: How deadly is this? And how many people are likely to die?

One way to measure the severity of disease is by calculating the case fatality rate, or CFR.

Watch the video above to find out more about how CFR is determined and how this relates to Covid-19, the disease caused by the new coronavirus.

Mortality generally LAGS Diagnosis for this disease. I think we can all agree on that, correct?

Now, consider that the diagnosis curve is exponential in nature, which for almost all uncontained pandemics it generally is. With semi-effective containment you get more of a geometric curve, and with containment an asymptote.

So allow me to illustrate my disagreement with the media reported CFR:

Let x = # of fatalities.

Let y = # of diagnosed.

Let t = time.

Let tau = lag between diagnosis and death.

The media is reporting present CFR as x(t)/y(t).

Can someone explain to me why? Because I find this to be quite erroneous.

They should be reporting it as x(t)/y(t – tau).

This is reflective of mortality lagging diagnosis by tau days, with y(t – tau) an estimator of the cumulative diagnoses tau days ago, when the deceased were, on average, first diagnosed. Yeah, it’s not great, but far better than assuming that the ratio of deceased to presently diagnosed is accurate by any means.
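
For anyone who wants to see the size of the effect, here is a minimal Python sketch comparing x(t)/y(t) with x(t)/y(t – tau). The epidemic is synthetic: the growth rate, the 5% fatality and the fixed 7-day lag are assumptions picked for illustration, not figures from any official report, and the function names are made up for this example.

```python
def naive_cfr(deaths, cases, t):
    """x(t) / y(t): cumulative deaths over cumulative cases on the same day."""
    return deaths[t] / cases[t]


def lagged_cfr(deaths, cases, t, tau):
    """x(t) / y(t - tau): cumulative deaths over cumulative cases tau days earlier."""
    if t - tau < 0:
        raise ValueError("need at least tau days of history")
    return deaths[t] / cases[t - tau]


# Synthetic epidemic (every number here is an assumption for illustration):
# cumulative cases grow 20% per day, and cumulative deaths equal 5% of the
# cumulative cases recorded TAU days earlier.
TRUE_CFR, GROWTH, TAU, DAYS = 0.05, 1.20, 7, 30
cases = [100 * GROWTH ** d for d in range(DAYS)]
deaths = [TRUE_CFR * cases[d - TAU] if d >= TAU else 0.0 for d in range(DAYS)]

t = DAYS - 1  # most recent day in the series
print(f"naive  CFR: {naive_cfr(deaths, cases, t):.4f}")
print(f"lagged CFR: {lagged_cfr(deaths, cases, t, TAU):.4f}")
```

In this toy setup the lagged ratio recovers the assumed 5%, while the same-day ratio comes out at roughly 1.4%. Real data will not be this clean, because tau is a distribution rather than a single number.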

So why is this so incredibly relevant?

Well, as a first-order rough approximation, time shift your diagnostic curve to the left by time tau, and then re-estimate CFR, ratio’ing PRESENT mortality with the “tau time units” PAST diagnosis metrics. Not a happy picture (your denominator exponentially decreases).
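
To put a rough number on that shift, here is a back-of-the-envelope derivation. It assumes cumulative diagnoses grow purely exponentially at a constant rate r per day (r and y_0 are symbols introduced here for illustration), so treat it as a sketch and not a model of the real outbreak.

```latex
\[
y(t) = y_0 e^{rt}
\quad\Longrightarrow\quad
y(t-\tau) = y_0 e^{r(t-\tau)} = y(t)\, e^{-r\tau}
\]
\[
\frac{x(t)}{y(t-\tau)} = e^{r\tau}\,\frac{x(t)}{y(t)}
\]
```

Under that assumption the lagged estimate is larger than the same-day ratio by the factor e^{r tau}. For example, with a doubling time of 7 days (r = ln 2 / 7) and tau = 7, the factor is exactly 2.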

This is the 800 lb gorilla in the room that for some reason the media is ignorant of, or is ignoring.

Generally speaking most people do not die of this infection on the day of diagnosis. Of course the lag period between diagnosis and death itself is stochastic, which makes any attempt to present CFR estimates stochastic too, but I do know that at least the right-sided tail is in the 5-7 day range based on widely circulated recent official reports.

What is being generally reported though is tau = 0. Seriously? Does anyone believe that? That alone is enough information to call baloney on these presently reported metrics. The “why” matters not to me. I leave that up to the audience to speculate if they so choose. But IMO “the null hypothesis” ( official reports are accurate) has been rejected, and that’s what really matters.

For the folks witty enough to recognize that unresolved cases inhibit an accurate calculation: this is the biggest error that I believe is occurring in the metrics presently reported.

Thoughts? Am I missing something here? If so then please by all means. I don’t think that I am.

The true, reliable value is only obtained once it’s over and all information has been harvested. That is, the deterministic calculation. But I am stating the obvious here.

Yes, you are quite right. I have been charting a lagged CFR of 7 days, since I have read that the average time from onset of symptoms to a confirmed positive test is 7 days and the average time from onset of symptoms to death (for those who die) is 14 days. The resulting figure has been consistently falling for 2 weeks and now looks to be asymptotically approaching 4%. But I do not trust either the denominator or the numerator figures. There may be many more deaths by now and there may have been many more cases by 7 days ago.

By the way it should be x(t) – x(t – tau) all divided by y (t – tau). Only the *additional* deaths from t – tau to t should be counted.

Andrew, thanks!

What a silly oversight by me on the math. You are correct with your correction. Again, thank you.

But I am glad that someone else sees my point, and I did not know that the lag factor was 14 days. That is concerning indeed.

I am monitoring the news outlets’ reporting of the CFR, asking myself, “Am I delusional here or are they just dividing total deaths by total cases?”

Thanks for validating my sanity check.

Hi Pete. Thinking again, my corrected formula is only the right one if y(t – tau) stands for the number of cases who were *currently suffering symptoms* at (t – tau). If it stands for the cumulative number of all those who had suffered symptoms at one time or another up to (t – tau), which is what the Chinese authorities’ reports give us, then your original formula is right (I think!).

The simple solution is to just look at deaths/(deaths+recoveries). This might be a little TOO high because there may also be a lag from when one is most likely to die to when they’d be declared recovered if they didn’t. Anyway, the correction is HUGE and must be done correctly. The definition is simply not correct for ongoing cases.
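
Here is a small Python sketch of why deaths/(deaths+recoveries) can run high while cases are still growing. It assumes, purely for illustration, a true fatality of 5%, death 14 days after diagnosis and recovery declared 21 days after diagnosis; none of those numbers come from real data, and the function names are invented for this example.

```python
def resolved_cfr(deaths, recoveries):
    """deaths / (deaths + recoveries), using resolved cases only."""
    resolved = deaths + recoveries
    if resolved == 0:
        raise ValueError("no resolved cases yet")
    return deaths / resolved


# Assumed values, for illustration only.
TRUE_CFR = 0.05
DAYS_TO_DEATH = 14
DAYS_TO_RECOVERY = 21


def estimate_on(day, growth):
    """Resolved-case estimate on a given day for a daily growth factor."""
    def cumulative_cases(d):
        return 100 * growth ** d if d >= 0 else 0.0

    # Deaths reflect cases from 14 days ago, recoveries cases from 21 days ago.
    deaths = TRUE_CFR * cumulative_cases(day - DAYS_TO_DEATH)
    recoveries = (1 - TRUE_CFR) * cumulative_cases(day - DAYS_TO_RECOVERY)
    return resolved_cfr(deaths, recoveries)


growing = estimate_on(40, growth=1.15)  # cases rising 15% per day
flat = estimate_on(40, growth=1.0)      # no growth at all
print(f"growing: {growing:.3f}, flat: {flat:.3f}")
```

With no growth the estimate matches the assumed 5%. With 15% daily growth it comes out more than twice as high, because the recoveries being counted belong to an older, smaller cohort than the deaths.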

What value are you using for (t)? Or rather, how are you determining (t).

Looking for a good way to use this with the JHU aggregate data to get a better approximation of the actual mortality rates. Thanks!

I never knew how many doctors were on here until now.

I regret that this video completely conflates ‘naive’ CFR (100 x known deaths to date / known cases to date) with true CFR (100 x actual eventual deaths among those who have actually caught it to date / actual number who have caught it to date). Naive CFR refers to what we know, true CFR to what is the case. We know the former by definition, but we can only estimate the latter.

It’s been said below that resolved case fatality ratio is a good estimate of true CFR, but I am not so sure. The denominator is those who have been suffering badly enough to be tested *and* have since either recovered or died. So it leaves out all those who are unresolved and all those who never got to the point of being tested. So it is a poor indicator. BTW Covid-19’s resolved CFR has fallen rapidly day by day from around 40% and is now apparently asymptotically approaching about 4%.

Sorry, my last statement was incorrect. Resolved CFR for Mainland China has progressively fallen from 55% on 30th January to 15% on 15th February and seems to be asymptotically approaching about 10%.

Chitty, that’s what I calculate too, a steady decrease in estimated CFR (as per deaths / (deaths + recoveries) ), dropping from 48% (in my truncated dataset) down to 15% among those who came into the scope of official counting, last I checked. And there’s no lower bound indicated yet.

CFR for all cases should be much smaller, Imperial College is estimating ~1% CFR among all infected…. https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-2019-nCoV-severity-10-02-2020.pdf

The OTHER thing people are not talking about is survival of this virus on surfaces. Because it has a hardy shell, I believe it can last longer on surfaces and survive hot and cold weather, thus increasing the spread, unlike influenza. Bleach does kill it, and probably other disinfectants do too. This is another way it can spread.

Here’s a citation that says other coronaviruses can survive on surfaces for up to 9 days, and that they are susceptible to disinfection measures (there isn’t yet any data I’m aware of on this new strain’s external survivability)…

https://www.journalofhospitalinfection.com/article/S0195-6701(20)30046-3/fulltext

Conclusions: Human coronaviruses can remain infectious on inanimate surfaces for up to 9 days. Surface disinfection with 0.1% sodium hypochlorite or 62-71% ethanol significantly reduces coronavirus infectivity on surfaces within 1 min exposure time. We expect a similar effect against the 2019-nCoV.

This is one of the most informative threads on coronavirus I’ve come across. I agree that the closest estimator of mortality rate is where deaths and recoveries are the variables. New cases tell us nothing as the outcome is unknown.

Using these parameters, clearly we are closing in on a 20% rate. However, this formula leaves out an important parameter in my opinion: those who could have fallen ill but resolved without being hospitalized. If this number is high, in the 1,000s, and the unreported deaths are relatively low, then we could be looking at a single-digit fatality rate.

Any data on demographics of the hospitalized? Deaths? Re-infection? I’m aware of the skew towards the elderly and those with pre-existing conditions, but such numbers would shed more light. What is the likelihood of corona turning fatal if you don’t have pre-existing conditions?

Well, one thing we can agree on is that the epidemic is at an early stage, hence the statistics are still very volatile.

The panic across our dear continent Africa is real now with a confirmed case. It’s the talk of town here in Kenya.

We can only hope for the best

Forgive my English, it’s my third language 🙂

Forgive my English, it’s my third language 🙂? I thought, Samuel, you were writing from somewhere in the UK (like Cambridge, perhaps?), your English is so good. As for the Corona mess, IDK why everyone is so worried.

@Hazel, Sorry, I was referring to the one confirmed case in Egypt. No case confirmed in Kenya ..yet. There has been a lot of talk on the impact of the epidemic should it hit African nations. When we heard of the first case in Africa, the trending Swahili words here were “kwisha sisi”, meaning “We are dead”.

Fake news is making matters worse. I got a forwarded WhatsApp message from my mum last night telling us to keep off broiler chicken as it has been ‘confirmed’ they are carriers of the deadly virus. muuuuum! 🙁

@AI I do agree with you..

With ultimate respect to the fallen heroes, the infected and directly affected, I think I am currently more worried about whether our Kipchoge will beat Bekele in the upcoming Olympics than about the greater danger COVID-19 poses to mankind.

I am confident it will be contained soon. If it hits our country Kenya, we may as well give it a marathon orientation by trying to literally outrun it.

As a statistician, I remain forever thankful to the contributors of this blog. I had a discussion with my colleagues at work on the CFR and how misleading various versions could be. I am glad that at the end of it we did agree that there is more than meets the eye.

I believe sooner than later, after the epidemic is contained, we will be here discussing the actual CFR against the volatile estimates we are currently limited to.

How is it that mild cases that never sought treatment in the past MERS and SARS epidemics became known cases? That seems to be implied in this video.

However, I can’t see how all mild cases that never sought treatment could ever become known (unless every single person possibly infected was tested while actively infected, which would entail testing millions of people quickly), and thus I can’t see why not knowing the number of mild cases for this virus now is a unique problem.

Would the number be more plausible if this had been around since, say, January 2019?

I’m happy to have found https://home.exetel.com.au/upload/coronavirus-predictions/

I made the same observations on 4 February, and I have hacked the Chinese model and the reason behind it. Please check these 4 links and you will understand the real situation.

* My simple Excel sheet, done on 4 February: https://myhc.page.link/simulation-5-february

* My first analysis, on 6 February: https://myhc.page.link/analys-6-fevrier

* My conclusion, on 10 February: https://myhc.page.link/conclusion-10fevrier

* The 45-minute video that opened my eyes, for anyone who wants to make the right move to protect their family, friends and business:

A data analysis specialist studying the impact of epidemics and pandemics takes up questions such as “when did the school/plant reopen?” and “what is the impact on supply and financing?”, and explains why this is unprecedented and what future mechanisms this epidemic may trigger: https://myhc.page.link/analysis_CV_05_FEB_2020

I have worked on a number of viral vaccines for global companies, and recently we developed a cell line platform that is available to grow coronaviruses and is proven to grow pandemic influenza viruses. I am reaching out to all scientists, public health care workers, business development professionals and companies to partner and help develop a vaccine against this highly contagious, pathogenic epidemic virus. Please reach out to me to discuss details.

Indeed, we should hope many cases are undiagnosed. Right now, the CFR looks to be about 2%, equivalent to that of unvaccinated measles (see the irony?). However, some mathematical modeling estimates the number of infections to be ten times higher. If that’s the case, the CFR is 0.2%, or the same as that of the common influenza.
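
The arithmetic behind that 0.2% can be sketched in a few lines of Python. The sketch assumes every death is already counted among the confirmed cases, so only the denominator grows; the function and parameter names are made up for this example.

```python
def fatality_among_all_infected(confirmed_cfr, infections_per_confirmed_case):
    """Rescale a CFR based on confirmed cases to the whole infected population.

    Assumption: all deaths are already captured among confirmed cases, so
    undiagnosed infections add to the denominator only.
    """
    if infections_per_confirmed_case < 1:
        raise ValueError("there cannot be fewer infections than confirmed cases")
    return confirmed_cfr / infections_per_confirmed_case


adjusted = fatality_among_all_infected(0.02, 10)
print(f"{adjusted:.1%}")
```

If some deaths also go uncounted, the numerator rises too and the adjusted figure would be higher than this simple division suggests.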

And nobody is panicking about influenza.

That figure comes from calculating active cases that have not resolved, which makes no sense because we do not yet know how they will resolve.

However, we do know of resolved cases to date (6,945), of which 1,369 have died and 5,576 have recovered. That indicates a mortality rate of 19.7% of those who have sought medical attention and reached the resolution of their case.
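
As a quick check of those figures, here is the same calculation in Python, using only the counts quoted above:

```python
deaths = 1369
recovered = 5576

resolved = deaths + recovered  # cases with a known outcome
rate = deaths / resolved       # deaths / (deaths + recoveries)

print(resolved, f"{rate:.1%}")
```

The counts do add up to 6,945 resolved cases, and the ratio rounds to 19.7%.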

A friend of mine just shared that way of examining the data, based on resolved cases only. It seems to make sense, certainly more so than calculating from unresolved cases. If anyone can comment on whether this method of calculating a mortality rate during an epidemic is legit, please do so.

Ian, I totally agree.

Unresolved cases should not be used to make the calculation. I came to think of it the same way myself, but only yesterday evening UK health professionals on TV were still stating 1 or 2%.

Thanks Elias. And I’ve found that the equation ( deaths / (deaths + recoveries) ) is a legit estimator of CFR. I found a study that tested whether it would have been accurate if used during the 2003 SARS outbreak; the authors found that it would have “adequately estimated” the ultimate CFR during that outbreak. See:

“Methods for Estimating the Case Fatality Ratio for a Novel, Emerging Infectious Disease” @ https://academic.oup.com/aje/article/162/5/479/82647

I also tested it with CDC data for the seasonal flu, using (deaths + recoveries of those who sought medical attention), and for the 2016-17 flu season found a 0.2% fatality rate. Compare that to the current deaths and recoveries for COVID-19 using the same method, which gives a 17% mortality rate *of those who sought medical attention.* So this legit real-time estimator shows why this novel virus is not comparable to the seasonal flu, as many insist.

These “static” models don’t take into account dynamic variables, such as physician learning, hospital crowding, the changing availability of medical supplies, and the non-constant demographics of patient populations over time.

I’m guessing the CFR diminishes over time because the most vulnerable patients have already died. Also, experiential learning amongst physicians leads to better outcomes.

I’m guessing we’ll see widely divergent CFRs across countries and regions. For example, in Wenzhou there hasn’t been a single death, despite close ties to Wuhan. Why? Plenty of beds and medical personnel.

In fact, there have been few deaths outside China so far. It’s really about patient demographics and level of available care. As improved treatments become known and available, the CFR will diminish yet further.

day1-1 case

day2-2 case-1death <<< everyone uses this as the method CFR

day3———-2deaths

In the example above, everyone died, so the true rate is 100%. But if you divide deaths by cases on day 2, you get 50%, not 100%. You must follow each actual case until it resolves, or wait until the outbreak has run its course, as with coronavirus.

day1-1 case

day2-3 case-1death <<< this calculates totals like coronavirus

day3———-3deaths

Total deaths are 3 and cases are 3. If it gets calculated on day 2, it gives 33%, which we know is wrong because they all died.
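
The second toy timeline can be replayed in a few lines of Python. It assumes the case count stays at 3 on day 3, which the timeline leaves implicit:

```python
# (label, cumulative cases, cumulative deaths) for the toy timeline.
# Assumption: no new cases on day 3, so cumulative cases stay at 3.
timeline = [
    ("day 1", 1, 0),
    ("day 2", 3, 1),
    ("day 3", 3, 3),
]

# Same-day ratio of cumulative deaths to cumulative cases.
ratios = [deaths / cases for _, cases, deaths in timeline]

for (label, _, _), ratio in zip(timeline, ratios):
    print(label, f"{ratio:.0%}")
```

The same-day ratio reads 0%, then 33%, and only reaches 100% once every case has resolved.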

Most media sources calculated it at 2%, and from the above examples you now know how wrong they all are.

I have info here on the high confirmed 64.54% rate. That's spread across a long period of time in which deaths occur (plus some die beforehand and never get confirmed; as of today, 13.02.20, these are now included in the stats).

Here is the link I forgot:

https://home.exetel.com.au/upload/coronavirus-predictions/

Aussie, you bring out a good point, but you’re mixing statistics. The 64% is a growth rate. There are other statistical considerations you may be mixing.

Hi Franklin

Did you read the info at the bottom of the page on my site? If there are no test kits, there will be an increase once they’re available. The spike occurred on 27.01.2020, at 64.54%. That’s massive, and it’s why I switched to a percentage, so it’s more easily seen. There have been none like it since (until 12.02.2020, which is explained on the page). The media found out about it on 2.2.2020.

Chinese state media have reported test kit shortages and processing bottlenecks

https://www.nytimes.com/2020/02/02/health/coronavirus-pandemic-china.html

Forgot to mention: the death rates all the way through that period are very smooth and consistent, which is why they’re highlighted. All the info on my page proves what the media had known existed, and it took me until 2.2.2020 to find the answer to why it happened.