I gave @Surgisphere the benefit of the doubt. I believe it is very possible to have data like this. Other companies have similarly large EHR datasets, such as SVS, Explorys, etc. I have even worked with large EHR datasets.

But I am getting worried. Let me explain. /Thread
CC @raj_mehta
Datasets like this are not easy to create. Explorys, for example, which started in Cleveland and is now owned by IBM, has a huge infrastructure w/ > 360 hospitals participating nationally. Contrary to popular belief, you do not need 100s of authors on these papers. /N
Participating hospitals have data-use agreements to provide data to the company and in return get to use the cloud-based data for their own research. These data agreements often specify that the identity of the hospitals remains anonymous in order to be HIPAA and HITECH compliant. /N
Of course, the owners of Surgisphere know which hospitals are in the dataset, but divulging that would require new agreements with those hospitals.

These databases collect data from hospitals via health records, laboratory systems, and billing inquiries /N
With Explorys, e.g., hospitals feed data from their respective EHRs to Explorys once every 24h.

This means you have to have hospitals with strong computer networks and EHRs! African researchers have questioned whether that holds for the data from their continent, which is legit. /N
Diagnoses/procedures for patient records are often mapped into Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) hierarchies. SNOMED-CT collapses diagnostic codes from ICD into clinically meaningful, standardized categories using “umbrella” terms. /N
With Explorys, the PopEx system can be used in the Cloud to search for diseases, procedures, and findings at the epidemiological level of a de-identified, aggregated patient cohort. At my previous hospital, the IRB convened & decided this use to be exempt a priori. /N
It is also possible to get patient-level data that is de-identified, similar to NHAMCS, NAMCS, HCUP, NHANES, etc., but that usually requires specific IRB approval and not a blanket one. I'm not sure what Surgisphere uses, but would like to know. /N
The power of EHR data from these large registries is for hospital QI projects, preliminary data to use in a big grant application to justify sample size / prevalence etc., or very rare diseases that require a broad net (think inborn errors of metabolism, factitious d/o, etc.). /N
At the population level, this data is very weak. You cannot account for missing data. There's a lot of confounding. It's dependent on how people enter data, which isn't standardized. It rounds values to the nearest 10... etc. /N
Apparently, this company used individualized data. I suppose propensity score matching would be possible here, but I can't imagine that would be a priori IRB exempt. Even with Explorys, you can't account for this with individual data. /N
I think it is MORE than possible for data like this to exist. It does elsewhere. But the authors have a lot of explaining to do. Why did the NEJM paper have < 200 hospitals on May 1, and then the Lancet paper in late May had 671? That would require a HUGE support system & $$ /N
At this point, with WHO stopping RCTs, I think this paper sadly needs to be retracted unless the authors give a full accounting. This is a pandemic. Our research has consequences. /END
Let me take you on a little journey of how I used Explorys:

1) Rare presentations are one way.

@AlexChaitoff & I characterized demographic features of pts dx with factitious d/o in the U.S., which previously had only been done in case series. https://pubmed.ncbi.nlm.nih.gov/32081412/
4) Peds rare disease: A paper we had preliminarily accepted -- don't ask why it never went farther.... lol -- looked at common labs for patients diagnosed with inborn errors of metabolism, as well as the distribution of different diseases. /N
All of these studies have a LOT of limitations, but they provide some data on patients where more studies are needed!

But I wouldn't publish pandemic data with it... especially stating in interviews that RCTs might not be needed now.
Good thread for others to read below! H/t @scleroplex https://twitter.com/jwato_watson/status/1265603863551389697?s=20
Others more knowledgeable than me have weighed in too, and I recommend following this: https://twitter.com/leorahorwitzmd/status/1266331791876657157?s=20
this is also worrisome.... https://twitter.com/MaartenvSmeden/status/1266834030445633539?s=20
You can follow @reverendofdoubt.