1/Now in @nature: our Matters Arising letter describing how the @GoogleHealth study claiming an AI system can outperform radiologists at predicting cancer violates basic scientific standards of transparency and reproducibility. @bhaibeka @hugo_aerts et al. https://www.nature.com/articles/s41586-020-2766-y
2/We submitted the letter on 1 February. We posted a preprint @arxiv a month later, and I summarized that version of the letter, the problems with the Google article, and the authors' post-publication behavior in this Twitter thread. Check it out. https://twitter.com/michaelhoffman/status/1237349469118586891
3/New is the @GoogleHealth rebuttal, which I'll discuss next. But first let's look at our letter. Roughly, I've highlighted in yellow parts discussing problems with insufficiently described methods and lack of code. Pink is parts about lack of data sharing. Orange discusses both.
4/Here is the rebuttal highlighted the same way. While most of our letter describes the problems with insufficiently described *methods* and *code*, the @GoogleHealth authors chose to focus first on why they would not share *data*. https://www.nature.com/articles/s41586-020-2767-x
5/Unfortunately this tactic is not uncommon. @GoogleHealth abuses its access to private health data as a way to justify restrictions that have nothing to do with keeping that data private, such as restrictions on sharing code or even model details. This should be shut down every time.
6/Google: "Given the extensive textual description in the supplementary information… we believe that investigators proficient in deep learning should be able to learn from and expand upon our approach."

Doesn't say that others can *reproduce* their approach. Because they can't.
7/Reproducibility is a core element of the scientific method. So is independent scrutiny of methods and data. When authors withhold the details needed for reproducibility and independent scrutiny, it is questionable whether the work is still science.
8/As a private company, Google is free to withhold these details and inhibit independent reproduction and analysis of their claims. But that doesn't mean they should get to publish advertisements for their software in scholarly journals without providing those details.
9/If you're editing or reviewing a manuscript, demand public access to relevant code. It is essential for science. Despite Google's excuses for withholding code & model details, if @nature said they would not publish the paper without them, I think Google would have found a way.
10/Let's face it: following good practices for sharing code, data, and other materials can be inconvenient for authors anywhere (although some practices can make it more convenient). But it's essential for the scientific enterprise. For-profit businesses don't get a free pass.
11/The most outrageous part of the rebuttal is Google's suggestion that releasing a version of their model one could test independently without "regulatory oversight" would be unethical because it might be misused.
12/As far as I can tell this is a model that they developed without any regulatory oversight. Nothing protects us from any misuse of this technology by Google and its business partners. And this whole argument is actually an excuse to avoid independent oversight!
13/I appreciate @nature's willingness to publish a critique of a paper they published. Many more will see this than the arXiv version so we were willing to go through their process. But there's no reason this should take 8.5 months. They publish original research papers faster!
14/Science moves fast. The article has already received 236 citations. We need another channel to publish serious critiques of published literature where readers are likely to see it. Three anonymous peer reviewers will never think of every issue that readers should know about.
You can follow @michaelhoffman.