Sunday, March 20, 2005

Interesting discussion on Tim Lambert's 41 post on the Lancet 100,000 death study. SoldierDad has an excellent point here as well, noting that Iraq's pre-war death rate seems incredibly low -- lower even than the EU death rate(!). That would seem to fail the laugh test. I'm guessing one reason besides the one he mentions is that when Saddam's goons dragged you off to be tortured to death for drawing a funny beard on a Saddam portrait, they probably weren't that fastidious about doing all the paperwork.
Here are what I thought were the more interesting and relevant comments:
TallDave 21/3/2005 05:01:39
Binomial, I would think, but both are reasonably approximated by a normal distribution in this case
There is no basis on which to make that statement. The data barely even produces a significant correlation for the confidence interval itself, so to pretend you can describe the distribution within it is laughable.

Yes, the 95% confidence interval by itself doesn’t tell us what the probabilities are. But this doesn’t mean that each value is equally likely. We can also construct other confidence intervals. We can be 67% confident that the number is between 50,000 and 150,000. In this sense the end points of the 95% CI are less likely and the middle is most likely.
No, you can't. Besides the fact a 67% probability is practically meaningless (that's a 1 in 3 chance the effect doesn't even exist), the 67% probability bounds speak only to the 67% probability interval. They tell you NOTHING about any other interval. Every number from 8,000 to 194,000 is 95% likely. They are all equally probable, and none of them is "more equal" than the others.
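For what it's worth, the 67% claim quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the estimate is roughly normal and symmetric about the midpoint of the quoted 8,000 to 194,000 range, which is an assumption made for illustration, not something taken from the study. The names `MID`, `SE` and `central_interval` are made up for this example.

```python
from statistics import NormalDist

# 95% interval as quoted in the thread.
LOW_95, HIGH_95 = 8_000, 194_000

# ASSUMPTION: a symmetric normal sampling distribution centred on the midpoint.
MID = (LOW_95 + HIGH_95) / 2
SE = (HIGH_95 - LOW_95) / (2 * NormalDist().inv_cdf(0.975))  # implied standard error


def central_interval(level):
    """Central interval at the given confidence level under the normal assumption."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return MID - z * SE, MID + z * SE


for level in (0.50, 0.67, 0.95):
    low, high = central_interval(level)
    print(f"{level:.0%} interval: {low:,.0f} to {high:,.0f}")
```

Under that assumption the 67% interval comes out at roughly 55,000 to 147,000, close to the 50,000 to 150,000 figure quoted above, and it sits strictly inside the 95% interval.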

TallDave 21/3/2005 05:15:52
I'm sure this point has been made before, but just to make it again: The main reason the study is worthless is that all it can conclude with a 95% confidence interval is that something happened which created between 8,000 and 194,000 additional deaths. This is not useful. We already know the war probably killed more than 8,000 and less than 194,000. The study tells us nothing new or useful -- unless someone arbitrarily grabs a number with no confidence interval to bandy about as an "estimate."

Tim Lambert 21/3/2005 05:24:18
TallDave, it is absolutely false to say that each number in the 95% CI is equally likely.

Pat Curley, for deaths before the war see link

The study did not just count violent deaths but all deaths. Part of the increase was an increase in deaths from disease.
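Tim's point that the values inside the interval are not equally likely can be illustrated with a quick comparison of densities. Again, this is a sketch under an assumed normal shape centred on the midpoint of the quoted interval; the study itself may not be this tidy, and the variable names are just for this example.

```python
from statistics import NormalDist

# 95% interval as quoted in the thread.
LOW_95, HIGH_95 = 8_000, 194_000

# ASSUMPTION: symmetric normal distribution implied by that interval.
mid = (LOW_95 + HIGH_95) / 2
se = (HIGH_95 - LOW_95) / (2 * NormalDist().inv_cdf(0.975))
dist = NormalDist(mu=mid, sigma=se)

# Relative plausibility of the centre versus the lower end point.
ratio = dist.pdf(mid) / dist.pdf(LOW_95)
print(f"density at midpoint / density at 8,000 = {ratio:.1f}")
```

With a normal curve the ratio is roughly 7, so a value near the middle is several times more plausible than one at either end of the interval.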

TallDave 21/3/2005 05:24:41
You know, if the authors wanted to be honest about that 100,000 number, they would have to say "We believe 100,000 people were killed, but our confidence level for the 100,000 +/- 0 range is approximately ZERO percent. Any single-number estimate is just a guess and nothing more."

Soldier's Dad 21/3/2005 05:25:19
The mortality rate for the EU is 10/1000.
The mortality rate for the World is 8.81/1000.

The mortality rate for Iraq according to Lancet is 7.9/1000. The baseline death rate (needed to calculate excess deaths) would mean Iraq had one of the lowest mortality rates in the world in 2002.

As "food rations" were determined by family size, and it is largely accepted that the rations were not nearly enough, it is an easy leap to question whether families might have hidden deaths in order to maintain the additional ration. The Lancet study does nothing to determine whether the 2002 mortality rate in Iraq, which would have made Iraq one of the healthiest places on the planet, was in fact accurate.

TallDave 21/3/2005 05:25:29
No Tim, it is absolutely false to claim you can say which numbers within that interval are more likely than the others. The range is a quantum.

I suppose technically I am in error to say that no conclusions can be drawn about the interval data (esp. since you could use sources outside the study itself); but it is certainly true that no meaningful (i.e. 95% confidence) conclusions about ranges within the interval can be made on the basis of the numbers in the study. The whole reason you do the 95% confidence interval in the first place is to make meaningful conclusions; when you start parsing the interior of the interval based on other intervals you are flailing at them with non-meaningful correlations. So that point stands, with this clarification.

As I pointed out above, the 100,000 number is statistically meaningless as it has confidence of essentially zero. The only meaningful statement that can be made from this study is that the number of dead is between 8,000 and 194,000.


Anonymous Dan said...

I've been commenting on the same post. I think there's grounds to be cautious when interpreting the Lancet study, but don't really have a strong opinion. So I'm thoroughly in favour of people looking at it carefully.

That said, a couple of quick stats comments. Tim's comment about distributions is more about stats theory than about the Lancet study itself: the binomial distribution and the Poisson distribution both start to look like a normal distribution quite quickly as the number of observations grows, so the "normality" assumption made by the authors probably isn't too far wrong (other assumptions might be, though).
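The claim about binomial and Poisson distributions converging on the normal is easy to check numerically. The sketch below sums the absolute gap between each discrete distribution and a normal curve with the same mean and variance. The event rate of 0.1 and the sample sizes are arbitrary illustrative choices, not numbers from the Lancet study, and the helper names are made up for this example.

```python
import math
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Binomial probability of exactly k events in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def gap_binomial(n, p):
    """Summed absolute difference between binomial pmf and matching normal pdf."""
    normal = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return sum(abs(binom_pmf(k, n, p) - normal.pdf(k)) for k in range(n + 1))


def gap_poisson(lam):
    """Summed absolute difference between Poisson pmf and matching normal pdf."""
    normal = NormalDist(lam, math.sqrt(lam))
    upper = int(lam + 10 * math.sqrt(lam)) + 1  # far enough into the tail
    return sum(abs(poisson_pmf(k, lam) - normal.pdf(k)) for k in range(upper))


P = 0.1  # ASSUMPTION: arbitrary event rate, for illustration only
for n in (10, 100, 1000):
    print(n, round(gap_binomial(n, P), 4), round(gap_poisson(n * P), 4))
```

Both gaps shrink as the number of observations grows, which is the sense in which the normal approximation "probably isn't too far wrong" for large samples.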

Dull stats comment 2. Depending on what you think a probability is (insert long, boring technical argument here), you can say something about the relative likelihood of different bits of the confidence interval. A lot of orthodox statistics gets itself into trouble when it does so, however, because they end up pretending to be more confident than they really are. Ultimately, you're better off just looking at the distribution itself. My (pissant) example is on my blog (see link attached to my name: entry is "on sampling distributions").

My other comment would be about the 8.8/1000 vs 10/1000 thing. My guess is that's just noise (I could be wrong, though). Whatever else you might think, Iraq is a pretty typical, fairly developed country. Personally, I wouldn't overinterpret those numbers.

Ok, so most of that is me being a picky bastard. My point is that the theory of confidence intervals is sound (up to a point), and so the conclusions being drawn are ok (sort of). But, since no study is flawless, I agree that we should be skeptical. That part is always true.

8:26 AM  
Anonymous Dan said...

Ah. Just followed your instapundit link. So, two more things.

1. The "95% confidence range ... is nearly flat" comment is true. It's an outcome of the fact that the data are very noisy (what else would you expect under the circumstances?). That doesn't (necessarily) mean that we should throw the data away. You just need to be conservative about how they are interpreted. So, for instance, while 98,000 is the mean (or "expected") number of excess deaths, it's not a reliable number. So, if the methodology is ok, there's (kind of) a 95% chance that the number of excess deaths over that period was 20,000 or more (somewhere on Tim's blog I explained how you get that number, but I forget where exactly - sorry). In stats terms, that's a pretty big margin for error (hence the "nearly flat" bit), though 20,000 dead people is still a pretty serious issue.

2. The later comment about the inability to compare probabilities inside the interval. That's (sort of) true too. Technically, confidence intervals come from "frequentist" statistics, which don't allow you to look at all the information available to you. As the writer notes, "Bayesian" statistics do let you do this. So, while I agree that some of the comments on Tim's blog are a bit disingenuous about this, the substance of the remarks about comparing relative probabilities inside the interval does have a solid basis in statistical theory.
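One way to arrive at a figure near the 20,000 mentioned in point 1 is a one-sided 95% lower bound under a normal approximation. This may not be how Dan derived it (his explanation is not reproduced here), so treat it as a guess at the arithmetic: the point estimate of 98,000 and the 8,000 to 194,000 interval come from the thread, while the normal shape and the names below are assumptions.

```python
from statistics import NormalDist

# Figures quoted in the thread.
POINT_ESTIMATE = 98_000
LOW_95, HIGH_95 = 8_000, 194_000

# ASSUMPTION: normal sampling distribution; standard error implied by the
# width of the two-sided 95% interval.
se = (HIGH_95 - LOW_95) / (2 * NormalDist().inv_cdf(0.975))

# One-sided bound: 95% of the assumed distribution lies above this value.
z_one_sided = NormalDist().inv_cdf(0.95)
lower_bound = POINT_ESTIMATE - z_one_sided * se
print(f"one-sided 95% lower bound: {lower_bound:,.0f}")
```

The bound lands just under 20,000. The one-sided version uses a smaller multiplier (about 1.645 rather than 1.96), which is why it is higher than the 8,000 at the bottom of the two-sided interval.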


8:55 AM  
Blogger TallDave said...

Thanks, I later clarified to say that while some numbers inside the distribution are more likely than others, only the aggregate has the 95% confidence level. So while the 95% confidence interval proves at least 8,000 excess deaths, the point estimate doesn't prove anything, and is made esp. meaningless by the size of the interval.

2:09 PM  
