The Big Table

A huge issue is that reported numbers (eg death rates) from different areas (like Germany vs Italy or SF vs NYC) look very different.

Theoretically, you could root cause this heterogeneity by forming a single giant table where rows are people & columns are features. 🧵
https://marginalrevolution.com/marginalrevolution/2020/03/where-does-all-the-heterogeneity-come-from.html
Specifically, suppose you had a giant table of all the people in the world. Columns would include things like:

- demographics
- test results, types, dates
- other features derived from medical charts
- location over time
- nearby individuals over time
- wearable data

And so on.
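
One way to picture a single row of that table is as a small record type. This is only a sketch: the field names, types, and the `LabTest` / `PersonRow` classes are hypothetical, chosen to mirror the column list above rather than any real schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class LabTest:
    """One test result with its type and date (hypothetical shape)."""
    test_type: str
    taken_on: date
    positive: bool


@dataclass
class PersonRow:
    """One row of the big table: one person, many feature columns."""
    person_id: str
    # demographics
    age: Optional[int] = None
    sex: Optional[str] = None
    # test results, types, dates
    tests: List[LabTest] = field(default_factory=list)
    # other features derived from medical charts
    chart_features: Dict[str, float] = field(default_factory=dict)
    # location over time: (day, latitude, longitude)
    locations: List[Tuple[date, float, float]] = field(default_factory=list)
    # nearby individuals over time: (day, other person's id)
    nearby: List[Tuple[date, str]] = field(default_factory=list)
    # wearable data, keyed by an assumed metric name
    wearable: Dict[str, float] = field(default_factory=dict)


# Invented example row, not real data.
row = PersonRow(
    person_id="p1",
    age=64,
    tests=[LabTest("PCR", date(2020, 3, 20), True)],
)
```

Most fields default to empty because in practice most cells of a table like this would be missing for most people.
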
This is a thought experiment for now.

But if you could form a table like this (eg by contacting every hospital in the world) then you could derive conflicting stats like the Italy vs Germany death rates from functions run on the *same* table.

Now you’re getting somewhere.
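
To make that concrete, here is a minimal Python sketch, assuming a toy version of the table held as a list of dicts. The rows, the column names, and the `death_rate_by` helper are all made up for illustration; none of the numbers are real.

```python
from collections import defaultdict

# Toy stand-in for the big table: one dict per person, one key per column.
# Every value here is invented for illustration only.
BIG_TABLE = [
    {"person_id": 1, "country": "IT", "age": 81, "tested_positive": True, "died": True},
    {"person_id": 2, "country": "IT", "age": 77, "tested_positive": True, "died": True},
    {"person_id": 3, "country": "IT", "age": 64, "tested_positive": True, "died": False},
    {"person_id": 4, "country": "IT", "age": 52, "tested_positive": True, "died": False},
    {"person_id": 5, "country": "DE", "age": 79, "tested_positive": True, "died": True},
    {"person_id": 6, "country": "DE", "age": 45, "tested_positive": True, "died": False},
    {"person_id": 7, "country": "DE", "age": 38, "tested_positive": True, "died": False},
    {"person_id": 8, "country": "DE", "age": 50, "tested_positive": True, "died": False},
    {"person_id": 9, "country": "DE", "age": 30, "tested_positive": False, "died": False},
]


def death_rate_by(rows, column):
    """Deaths divided by confirmed cases, grouped by the given column."""
    cases = defaultdict(int)
    deaths = defaultdict(int)
    for row in rows:
        if row["tested_positive"]:
            cases[row[column]] += 1
            deaths[row[column]] += row["died"]  # True counts as 1
    return {key: deaths[key] / cases[key] for key in cases}


# Both countries' rates come from the same function on the same table.
rates = death_rate_by(BIG_TABLE, "country")
```

Because both figures come out of one function call on one table, any gap between them can be chased down through the other columns of that same table.
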
Continuing that example, let’s say there was another column — a hidden variable — that was responsible for the Italy/Germany death rate difference.

It could be the quality of the healthcare those specific people received, as quantified by (eg) whether they got an ICU bed or not.
The process of systematically cleaning up & integrating the patient data into one big table would, in theory, allow us to find hidden variables that explain variation between groups.

Maybe that hidden variable is a pharmacogenomic variant we didn’t test. Or a vitamin deficiency.
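
Here is a hedged sketch of what finding such a variable could look like, using the ICU bed example. The counts below are invented so that the effect is easy to see, and `got_icu_bed`, `expand`, and `death_rate` are hypothetical names, not anything from a real dataset or library.

```python
from collections import defaultdict

# Invented counts: (country, got_icu_bed, cases, deaths). Not real data.
COUNTS = [
    ("IT", True, 10, 2),
    ("IT", False, 30, 18),
    ("DE", True, 30, 6),
    ("DE", False, 10, 6),
]


def expand(counts):
    """Turn grouped counts into one row per person, like the big table."""
    rows = []
    for country, icu, cases, deaths in counts:
        for i in range(cases):
            rows.append(
                {"country": country, "got_icu_bed": icu, "died": i < deaths}
            )
    return rows


def death_rate(rows, *columns):
    """Death rate grouped by any combination of columns."""
    cases = defaultdict(int)
    deaths = defaultdict(int)
    for row in rows:
        key = tuple(row[c] for c in columns)
        cases[key] += 1
        deaths[key] += row["died"]  # True counts as 1
    return {key: deaths[key] / cases[key] for key in cases}


rows = expand(COUNTS)
crude = death_rate(rows, "country")
stratified = death_rate(rows, "country", "got_icu_bed")
```

In this toy data the crude rates differ by country, but once you group by `got_icu_bed` as well, the rates within each stratum match. The whole gap comes from how many people in each country got a bed. Real data would rarely be this clean, and a matching pattern alone would not prove causation.
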
This kind of thing is called “data integration” and it’s done in genomics all the time.

It differs from many meta-analyses in that you aren’t just comparing p-values from different studies, but actually stacking raw tables.

You have to be careful of (eg) batch effects, but it’s doable.
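
As a rough sketch of what stacking raw tables means in practice, assume two hospitals that export the same kind of measurement under different column names and units. Both tables, the `harmonize_*` helpers, and the `source` column are hypothetical, and the values are invented.

```python
from collections import defaultdict
from statistics import mean

# Two invented raw tables with different conventions. Not real data.
HOSPITAL_A = [
    {"id": "a1", "age": 70, "crp_mg_per_l": 40.0},
    {"id": "a2", "age": 55, "crp_mg_per_l": 20.0},
]
HOSPITAL_B = [
    {"patient": "b1", "age_years": 68, "crp_mg_per_dl": 4.5},
    {"patient": "b2", "age_years": 49, "crp_mg_per_dl": 2.5},
]


def harmonize_a(row):
    """Map hospital A's columns onto the shared schema."""
    return {
        "person_id": row["id"],
        "age": row["age"],
        "crp_mg_per_l": row["crp_mg_per_l"],
        "source": "hospital_a",
    }


def harmonize_b(row):
    """Map hospital B's columns onto the shared schema (mg/dL to mg/L)."""
    return {
        "person_id": row["patient"],
        "age": row["age_years"],
        "crp_mg_per_l": row["crp_mg_per_dl"] * 10,
        "source": "hospital_b",
    }


# Stack the raw rows into one table, keeping track of where each came from.
stacked = [harmonize_a(r) for r in HOSPITAL_A] + [harmonize_b(r) for r in HOSPITAL_B]


def mean_by_source(rows, column):
    """Crude batch-effect check: average a column separately per source."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["source"]].append(row[column])
    return {source: mean(values) for source, values in groups.items()}


per_source = mean_by_source(stacked, "crp_mg_per_l")
```

Keeping a `source` column is the design choice that matters here. A gap between sources might be a real difference between patient populations, or it might be a batch effect from different instruments or protocols, and you can only investigate that if every stacked row remembers where it came from.
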
Now, the punchline.

It seems insane to ask for a gigantic table with millions or even billions of rows, in standardized form, with every patient & possible patient in the world.

But Facebook has that kind of data. Google and Apple do too. Maybe something can be done here.
You can follow @balajis.