I'm going to get back to writing this morning...working on some stuff about age distributions and invisibility of epidemics in "young" countries. But first...
I want to do a tweetorial about epidemic curves, lags and the genesis of epidemiological optimism. As some Canadians noticed, Ontario made partial data from the provincial line list publicly available yesterday. A big deal, given the culture around data in this province.
You can download the data here. It's not totally complete, but it's the real deal: https://data.ontario.ca/dataset/confirmed-positive-cases-of-covid-19-in-ontario
"accurate case date" signifies an approach used by epidemiologists to fill in onset dates when they're missing. They will use onset date if available, replace with test submission date if that's missing, and replace with test-received-by-lab date if THAT's missing.
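For the code-inclined, that fallback rule can be sketched roughly like this. The function and argument names are my own invention for illustration, not fields from the Ontario file, and a missing date is assumed to be represented as None:

```python
from datetime import date
from typing import Optional

def accurate_case_date(
    onset: Optional[date],
    test_submitted: Optional[date],
    lab_received: Optional[date],
) -> Optional[date]:
    """Return the first non-missing date, in order of preference.

    Hypothetical helper: onset date if we have it, else test submission
    date, else the date the lab received the test. None means "missing".
    """
    for candidate in (onset, test_submitted, lab_received):
        if candidate is not None:
            return candidate
    return None  # all three dates are missing

# Onset missing, so we fall back to the test submission date.
print(accurate_case_date(None, date(2020, 3, 30), date(2020, 3, 31)))
```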
Epi nerds reading this will know there are probably better ways to do this, but whatevs, that's what people do.
So if you look at approximate case onset date for Ontario what you see is this:
The natural response to such a curve is to say: "woohoo. We have flattened that curve."
But of course, we EXPECT the cases to trail off on the right hand side of the curve, because of lags between onset and reporting. In other words, we expect April 1 cases to be an under-count, because most of those cases have yet to be reported.
This issue is acknowledged by good epidemiologists, like whichever smarty at @PHAC_GC made this figure for their Canadian national epi report ( https://www.canada.ca/content/dam/phac-aspc/documents/services/diseases-maladies/2019-novel-coronavirus-infection/SURV_COVID19%20Epi%20update%20APR1.pdf).
So what do we do about that? How do we know whether we're flattening this mofo or not?

We treat this as a time-to-event or survival analysis problem. We can impute lags between case onset and reporting by comparing epi curves released on different days.
Today is April 2. We have 32 cases (I think) for April 1. If by tomorrow (April 3) the April 1 count goes to 96, then goes to 128 on April 4, then stops increasing, we know that 25% of cases are reported within a 1 day lag (32/128), 75% within a 2 day lag (96/128), and 100% by day 3.
Simplistic to make the math easier, and oh that our lags were that short, but you get the idea.
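Here is that toy example as a few lines of Python, just to make the arithmetic concrete. The counts are the made-up 32, 96 and 128 from the example, and the function name is hypothetical:

```python
def cumulative_completeness(snapshots: dict[int, int]) -> dict[int, float]:
    """Fraction of the eventual count already reported at each lag.

    `snapshots` maps lag in days -> count for one onset date as seen in
    the curve released at that lag. Assumes the count at the largest lag
    is final (i.e. it has stopped increasing).
    """
    final_count = snapshots[max(snapshots)]
    return {lag: count / final_count for lag, count in sorted(snapshots.items())}

# April 1 onset count as seen on April 2, 3 and 4 (toy numbers).
april_1 = {1: 32, 2: 96, 3: 128}
print(cumulative_completeness(april_1))  # {1: 0.25, 2: 0.75, 3: 1.0}
```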

If this is consistent, then we can look at the April 1 count as it stands today (1 day lag) and say: divide by 0.25 to inflate. March 31 is 2 days ago, so divide its count by 0.75.
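A minimal sketch of that inflation step, using the toy fractions from the example (25% in at a 1 day lag, 75% at 2 days, 100% at 3). The names are hypothetical, and it assumes any lag past the end of the table is fully reported:

```python
# Toy cumulative reporting fractions by lag in days, from the example.
COMPLETENESS_BY_LAG = {1: 0.25, 2: 0.75, 3: 1.0}

def inflate(observed: int, lag_days: int,
            completeness: dict[int, float] = COMPLETENESS_BY_LAG) -> float:
    """Estimate the eventual count from a partially reported count."""
    if lag_days >= max(completeness):
        return float(observed)  # assumed complete beyond the table
    return observed / completeness[lag_days]

# 32 cases seen so far for yesterday -> expect about 128 eventually.
print(inflate(32, lag_days=1))
# 96 cases seen so far for two days ago -> also about 128.
print(inflate(96, lag_days=2))
```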
If you have delays between onset and case report, you can create a table of "failure times" using survival analytic methods, and do this with actual data. We did (late last night) get access to the good stuff (the real iPHIS data), but I will be executed or something if we share it,
so I can't. But we can also do this just based on varying assumptions around the average lag between onset and report in Ontario. Our tests are coming back slowly, so let's say we have an average time from onset to report of 5 days (I know, I know, it's longer).
If we assume that there's an exponential hazard here for reporting, completeness of test reporting by some time "t" is going to be:

Completeness = 1-exp(-r t)
Sorry, math and now some will flee. But this isn't that big a deal...exp(x) just means e raised to the power x. On your calculator that's the e^x thing.

r is the daily reporting rate, and the beauty of exponential hazards is that r is equal to 1/(average time from onset to reporting)
If we assume average time from onset to reporting is 5 days, then r is just 1/5. With a 6 day average turnaround time (TAT) it's 1/6, and so on.
For a 5 day turnaround, test completeness looks like this:
You can just divide 1 by each of those values to get the inflation factor by day lag.
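If you'd rather not do this on a calculator, here is a rough sketch of the same calculation. It assumes the exponential reporting hazard described above with r = 1/(average lag); the function names and the 5 day default are just for illustration:

```python
import math

def completeness(t_days: float, mean_lag_days: float = 5.0) -> float:
    """Fraction of cases reported by lag t: 1 - exp(-r t), with r = 1/mean lag."""
    r = 1.0 / mean_lag_days  # daily reporting rate
    return 1.0 - math.exp(-r * t_days)

def inflation_factor(t_days: float, mean_lag_days: float = 5.0) -> float:
    """What to multiply an observed count by: 1 / completeness."""
    return 1.0 / completeness(t_days, mean_lag_days)

# Completeness and inflation factor by lag, assuming a 5 day average lag.
for t in range(1, 11):
    print(f"lag {t:>2} d: completeness {completeness(t):.3f}, "
          f"inflate x{inflation_factor(t):.2f}")
```

With a 5 day average lag, only about 18% of cases are in at a 1 day lag, so yesterday's count gets multiplied by roughly 5.5. Swap in mean_lag_days=6 or whatever you believe to see how much the answer moves.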
So let's circle back to those Ontario data that looked so awesome and encouraging, and assume varying lag functions. Our lag-corrected epi curves look like this:
So sadly we must now retract our high-fives. But it's important to actually know where we're at.

I'm going to spare you all my grumbling about the March break travellers, but I ain't seeing a whole lot of the ol' flattenin' happening here.
Kudos to our amazing Dr. de Villa and Mayor Tory for doubling down physical/social distancing. We are not out of the soup yet.
Thus endeth ye epi tweetorial.
I'm going to put the spreadsheet up on figshare if any under-employed quants want to take this and make a nice app...you can choose your own adventure wrt which survival time function you choose. But the tldr is: don't believe the RHS of epi curves. They're trying to trick you.
@JPSoucy @ishaberry2 we should do this for your awesome data resource, except you don't have onset date yet (I don't think).
You can follow @DFisman.