Thread by @ProfDFrancis

Thank you to @willsuh76 for highlighting this to me.

Let's have a think about what they have done. https://twitter.com/nickmmark/status/1248736307062267904

First, we must not let the tail wag the dog.

Which should one decide first?

Choose the endpoint, then calculate the sample size

or

Calculate the sample size, then choose the endpoint

They are, of course, presented together, so it is simultaneous in PRESENTATION to the public, but this Twitter group is for intelligent people who think about where knowledge comes from, not for soundbite-swallowers.

Yes the first 22 people are 100% correct! Maybe I should always run my tweetorials at 04:29 London time, when the trolls are asleep.

For those wanting to play along at home, I am looking at the "Moderate Covid" design. There appear to be several similar remdesivir trials on http://ClinicalTrials.gov, and I have happened to pick NCT04292730.

First the praise

They weren't sure what to look for, so they went for something that didn't need any understanding of medicine.

Even the janitor knows that if you get home alive within 14 days, that is better than still being stuck in hospital (or dead) at 14 days.

At the Janitors' Club, they can have discussions like this.

"My mom was really ill. She was in hospital for a month."

"Ah, my uncle did better. He was home in a week."

They don't need to know any test results, decide which tests are more important, or by how much.

There's nothing special about 14 days. It's a made-up time. You could do it by chopping at any number of days.

But you are not allowed to decide afterwards. Why?

Whenever there is a company involved, with investors paying big $ to run a trial, they want a return, and this puts intolerable pressure on investigators to present a positive picture.

This is not financial pressure - it is desire to avoid embarrassment.

To protect investigators against the temptation to pick-and-choose endpoints, we require them to specify the endpoint, e.g. "Getting home by 14 days", in advance.

This way, when a trial is negative, they can proclaim innocence to the company.

"Not my fault, I have no choice on what to present.
I know that it is significant for going-home at 4 days, but we prespecified 14 days. Just be happy people can't pick and choose, because it is significant the other way at 19 days!"

I know this from personal experience because I have seen it extensively in the great "Bone Marrow Cell Therapy for Heart Failure" worldwide fiasco.

Investigators would confidentially admit fudging or spinning the results to be positive, because of

- embarrassment of trial being neutral
- wanting to give patients hope
- not wanting to have to sack researchers dependent on research money, in turn dependent on positive results

Janitor endpoints, yes-no of a very simple thing, like "Gone home at 14 days" are good in one way.

Which way?

Yes! First 5 answers are correct. 6th answer wrong.

The endpoint of "gone home at 14d" is not unbiased in an unblinded trial, since staff (and patients) may be happier with a discharge home when they have the feeling of protection, even if it is from Francisoglimeprivir and therefore useless.

It's clearly not specific for coronavirus. Indeed that is its strength.

When we don't know much about the disease, it is hard for scientists to decide what to use as the endpoint.

It is easy for stupid people to decide, because stupid people think that everything is of equal difficulty (and perhaps easy), since they do not have experience of solving problems of varying difficulty and seeing what makes some problems difficult.

I should emphasise that it is easy for ONE scientist to decide, because as an individual a scientist can just pick a thing and declare it to be good.

But scientistS (plural) have a language of discourse and need to persuade each other, and that is what makes science good.

Does it need fewer patients?

Let's think about numbers. How many binary digits are in this number, which is the binary representation of 7:

1 1 1

How many binary digits are required to answer this question?

"Did the patient get discharged, alive, by 14 days? (Yes/No)"

Thank Frank Harrell for pointing this out on a slide he presented to the FDA and put online.

Dichotomizing produces simplicity, but at the cost of throwing away lots of stuff, which you might regret.

Sorry some people are finding the binary questions above a bit hard.

To store the ANSWER for a "yes/no" question, you only need ONE binary digit in total. You don't need one for Yes and one for No. It is enough to store 1 for Yes, and 0 for No.

So "one binary digit" is in some senses the smallest amount of information you can get per patient from a trial.

(I know this is not true in the Information Theory sense, with entropy & stuff, and I know @mshunshin and @DrJHoward will beat me up tomorrow, but you get the point.)

So generally, Yes/No endpoints tend to need more patients.

The @mshunshin formula for this is as follows.

What is the proportion of bad events (e.g. death-or-still-in-hospital-at-14d)?

Let's say that is 25%, or 1-in-4. We call that a RARITY factor of 4.

How weak is your therapeutic effect? i.e. what fraction of deaths do you prevent?

Suppose you prevent 20% of deaths, or 1-in-5 *of deaths*.

We call that a WEAKNESS factor of 5. (Why "weakness"? Because 100 means you only prevent one death out of 100 deaths).

The @mshunshin formula is that you need roughly

30 * W squared * R

patients

So for 25% mortality that you are reducing to 20%, that is

Rarity 4 (since one in 4 die if untreated) and

Weakness 5 (since 1 in 5 of 25 deaths are prevented, bringing deaths down to 20)

What is the number of patients needed, according to that quick formula?

30 * 5^2 * 4

So the sample size of a few hundred would never have answered the question adequately, if we were expecting an effect of the order of "event rates down from 25% to 20%".

So either they were expecting a spectacularly huge effect, like when penicillin was introduced for streptococcal infections.

(Incidentally at my hospital - shout out to St Marys Paddington!

Sorry it looks shabby. Tradition, you know.)

Oops thank you David!

A nice round 3000!

That makes me feel a bit better. I knew the formula was a bit rough-and-ready, but I was quietly puzzling why it was so far out, when the official proper formula was giving 2922.

(90% power, 5% significance, and terms-and-conditions apply. Do not use the @mshunshin formula as your definitive design for a clinical trial. Just use it in your head to protect you from saying stupid things in corridor conversations.)
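For anyone who wants to check the two numbers above, here is a rough sketch. It assumes a two-sided 5% significance level, 90% power, equal-sized arms and the ordinary normal-approximation formula for comparing two proportions; the thread does not say exactly which "official proper formula" was used, so treat that choice as my assumption. The function names are made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def rule_of_thumb(rarity, weakness):
    """Quick corridor estimate from the thread: 30 * W squared * R patients in total."""
    return 30 * weakness**2 * rarity


def two_proportion_total(p_control, p_treated, alpha=0.05, power=0.90):
    """Total patients for a two-arm trial comparing two event rates.

    Assumption: normal approximation, two-sided alpha, equal allocation,
    unpooled variance. Other textbook variants give slightly different numbers.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.28 for 90% power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    per_arm = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * ceil(per_arm)


# Event rate 25% (rarity 4), brought down to 20% (weakness 5)
quick = rule_of_thumb(rarity=4, weakness=5)
proper = two_proportion_total(0.25, 0.20)
print(quick, proper)  # 3000 2922
```

Under those assumptions the quick formula lands within about 3% of the longer calculation, which is all it is meant to do.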

Yup, two possibilities.

A. Hoping for a ludicrously large effect size.

B. Weren't thinking explicitly about the sample size. Just wanted to start something, and then update once they had more information. So started with a sample size they were sure they could fund.

Therefore I am not particularly sad that the sample size has increased. It has gone from woefully inadequate (but presumably only a temporary placeholder) to perfectly reasonable.

As for the endpoint, it has moved from a binary thing to a 7-level thing.

This is good, because it means that a patient who is benefitted (but doesn't cross the 14-day discharge threshold) can still contribute useful information about the benefit.

This looks pretty cunning to me.

At 11 days (this is a non-round number so presumably they looked at a variety of days to pick the most informative day, i.e. where there was the greatest spread of outcomes in patients as a whole, but blinded to allocation arm).

At 11 days, they will assess patients on a scale of how seriously ill they are:

Dead
Intubated/ECMO
CPAP or high flow O2
Low flow O2
Hosp, needing things other than O2
Hosp, not needing anything
Home

Very reasonable.

What do you think my disappointment is?

While we wait for more answers, thank you to Darren Dahly @statsepi for sending me this.

I am a professor, i.e. I love the sound of my own voice, and I pride myself on the Trumpian virtue of always having something to say, however inane.

However I am unable to comment on the quote Darren has sent, because (a) I can't understand what it means.

(b) I disagree that it is critical for all prophylactic trials to have the same endpoint and same DSMB (surely they don't mean the actual same people?)

My reason is that while *I* know what endpoint I want for *my* trial, and I can persuade my colleagues of this with a limited amount of arm-twisting, if I disagree with another scientific team, we have neither the time nor the scientific data to resolve it scientifically.

Oxygen is critical.

Water is critical.

Having the same endpoint across all Covid trials is not critical. We can make plenty of progress with each trial deciding for itself and pre-specifying.

Anyway, maybe I misunderstood what the WHO were saying.

And at the end of the day,

(a) they probably contain more than one person and so more than one opinion (otherwise what is the point of having more than one person)

(b) even if it is just one person, they are entitled to change their mind - after all, we call that process "thinking".

So let me get back to my disappointment regarding the remdesivir trial.

Trials like this are likely to have a blinded endpoint committee.

This blinded endpoint committee is NOT allowed to see what arm the patient is in.

They review the records of the patient and rate the outcome, without knowledge of the treatment allocation.

Does this make the trial blinded?

Are the endpoints EVALUATED blinded?

First 3 people are seeing the distinction very clearly.

Sadly, two funding peer reviewers, one of ORBITA and one of ORBITA-2, completely failed to understand this, despite our valiant attempts to explain.

They "killed" it because they said other PCI trials were blinded too!

The endpoints are EVALUATED blinded, i.e. the committee has no knowledge of the treatment arm.

But what are they looking at?

And who decided what treatments the patients needed to be on?

And do those people know which arm the patient is in?

So that is my sadness.

Death is death, but everything else is a measure of what their clinical staff decided to give that patient at that time, in the full knowledge of whether they were taking the drug, and likely the tacit belief that the drug is probably beneficial.

My only suggestion would have been to add a placebo control infusion. I can't see it being very expensive or difficult, since it is not a tablet that has to be manufactured.

Otherwise I think the whole thing is excellent, and I wish them, and the patients, the best of luck!

One little postscript.

(I see the anti-capitalism campaigners are in: Smash ECMO! Down with Ventilation! Up the revolution!)

Think why I would not push for a much longer timepoint than 11d, for this 7-level endpoint.

Do a thought experiment. At 90 days, let's say (just a made-up number) 20% of people are dead.

What will the status of the other 80% be, along the spectrum of 6 remaining states?

If you have difficulty answering, think about the answer at 365 days.

With that 365-day timepoint picture in mind, what is the benefit of choosing the 7-level endpoint to evaluate that outcome, rather than a dichotomous outcome?

OMG 100% of the first 6 people are wrong on this question:

Let me put it in words of one syllable.

1000 peeps go in da hosp

At 3 months (MONTHS) [M O N T H S],
200 are dead.

Dat meens 800 still live

Where are dose 800 peeps? (At 3 months, or wun year?)

And of those 800 who have gone home, how many are on ECMO & shit?

OK so at a very long time point, like 3 months or one year or whatever, EVERYONE who hasn't died has recovered and is at home.

So the 7-level fanciness turns to crap.

It has a bunch of people (say 20%) in the worst tier, "dead".

And all of the rest in the best, "home".

So we have shot the 7-level endpoint in the head.

What we should have done is picked the timepoint when people were spread out across the levels of the endpoint.

Right in the middle of the bad times.

11 days sounds good to me!
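A rough way to put a number on "spread out across the levels" is to count the bits of information per patient, in the entropy sense I waved away earlier. This is only a sketch: the 20% dead at a late timepoint is the made-up figure from above, and the perfectly even spread across the 7 levels is a made-up best case, not data from any trial.

```python
from math import log2


def entropy_bits(probabilities):
    """Shannon entropy in bits; levels with zero probability contribute nothing."""
    return sum(-p * log2(p) for p in probabilities if p > 0)


# Order of levels: dead, intubated/ECMO, CPAP or high flow, low flow O2,
# hospital needing other things, hospital needing nothing, home.

# Made-up late timepoint: 20% dead, everyone else at home
late = [0.20, 0, 0, 0, 0, 0, 0.80]

# Hypothetical best case: patients spread evenly over all 7 levels
spread = [1 / 7] * 7

print(round(entropy_bits(late), 2))    # 0.72 bits per patient
print(round(entropy_bits(spread), 2))  # 2.81 bits per patient
```

At the late timepoint the 7-level scale carries no more than a yes/no question would, which is the point of picking a day in the middle of the bad times.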

A brief math interlude and then another postscript.

I love today's puzzle from Mathirati.

Simple to ask
Impossible (for me) to do in one's head: had to type two lines of notes
And answer is hard to believe from picture

Let's take the radius of the circles as one unit.

Look at this line, the side of the small octagon.

How long is it?

Now look at the blue line

How long is the blue line?

Remember that thing is a 45 degree triangle.
In a 45 degree triangle the short sides are Square-root-of-2 times smaller than the long side.

And you know the long side is two units: you just told me.

A short side (blue) is:
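Working only from what is stated just above (a 45 degree triangle whose long side is two units, with each short side Square-root-of-2 times smaller), the blue side would come out as follows. The picture is not reproduced here, so this assumes the blue line really is one of the short sides of that triangle.

```latex
\[
\text{blue} = \frac{2}{\sqrt{2}} = \sqrt{2} \approx 1.41 \text{ units}
\]
```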

Look at this black line, which takes us down to the centre of the octagons.

How long is the black line?

So the total from the top of the inner octagon to the centre of the octagons is what?

Add the blue and the black heights.

Now, from that midpoint, how many units further up is the top of the big octagon?

If your thread ends here, click here to continue: https://twitter.com/ProfDFrancis/status/1248874996841304064