Friday, March 25, 2011

inflated conclusions from small self-selected sample sizes

yesterday mark thompson's post titled "Just Who Are These Libyan Rebels?" got a bit of attention. the post highlights the fact that, according to a 2007 study of documents captured in iraq (pdf), libya contributed more foreign fighters to that country per capita than any other nation, as shown in the below chart:

 furthermore, of the libyan foreign fighters in iraq, a majority came from either benghazi or darnah, both eastern libyan towns that are currently rebelling against qadhafi.


the thing that is annoying about the point thompson's post makes is that he slips in the key words "per capita." libya has a relatively small population of only 6.4 million. libya's population is less than two percent of the population of the entire arab world. plus, as thompson notes, the 2007 study had a small sample size. it only looked at the countries and towns of origin of 595 foreign fighters (see the bottom of page 7 of the pdf of the study). when you mix a small sample size with a per capita adjustment for a relatively low-population country, small blips in the data can balloon and distort the ultimate conclusion.

if you look at the actual data, the study identified 112 individual foreign fighters who were from libya. the study also identified more than twice as many (244) from saudi arabia. but because saudi arabia has four times the population of libya, a per capita accounting makes the libyan per capita contribution much higher. which is potentially interesting, but how do we know that the sample is representative of the total number of actual foreign fighters? we don't. the study could only look at the records that u.s. forces happened to capture. it could be that the documents came from a raid on a safe house that happened to hold a disproportionate number of libyans.
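to make the per capita arithmetic concrete, here is a rough sketch in python that uses only the figures quoted above. the saudi population is an assumption (four times libya's 6.4 million, as the paragraph says), and the "blip" of 20 extra records is a purely hypothetical number picked to show how the same handful of records moves a small country's rate far more than a large one's.

```python
# per capita arithmetic using only the figures quoted in this post.
LIBYA_POP = 6_400_000        # from the post
SAUDI_POP = 4 * LIBYA_POP    # assumption: "four times the population of libya"

records = {"libya": 112, "saudi arabia": 244}   # fighters identified in the study
population = {"libya": LIBYA_POP, "saudi arabia": SAUDI_POP}


def per_million(count, pop):
    """fighters per one million residents."""
    return count / pop * 1_000_000


rates = {c: per_million(records[c], population[c]) for c in records}

# hypothetical blip: one captured batch happens to hold 20 extra records
BLIP = 20
shift = {c: per_million(BLIP, population[c]) for c in records}

for country in records:
    print(f"{country}: {rates[country]:.1f} per million; "
          f"{BLIP} extra records would add {shift[country]:.1f}")
```

under these assumptions the same 20 records shift libya's per capita rate four times as much as saudi arabia's, simply because the denominator is four times smaller.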

the small sample distortion gets even worse when thompson "drills down" and looks at the home town data from the study. he uses the above pie chart to suggest that 23.9% of those libyans came from benghazi, the city that is now the effective capital of the anti-qadhafi rebellion. but that's not really true. look at the fine print at the bottom of the pie chart. of the 112 records identifying libyan foreign fighters, only 88 indicated the fighter's home town. which means the already tiny sample size just got even smaller. it's 23.9% of 88, which is 21 individuals that the study identified as being from benghazi. 21 people out of a city of about a half-million.
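the numbers in that paragraph can be checked with a few lines of python. the sketch below also computes a wilson 95% interval for the benghazi share, which is my own addition and not something the study or thompson reports. it generously assumes the 88 records were a random sample, which is exactly the assumption this post doubts, so treat the interval as a best case for how precise the 23.9% figure could be.

```python
import math

WITH_HOME_TOWN = 88       # libyan records that listed a home town
BENGHAZI_SHARE = 0.239    # share shown in the pie chart
BENGHAZI_POP = 500_000    # "about a half-million", per the post

# 23.9% of 88 records, rounded to whole people
benghazi_count = round(BENGHAZI_SHARE * WITH_HOME_TOWN)

# those individuals as a fraction of the whole city
share_of_city = benghazi_count / BENGHAZI_POP


def wilson_interval(successes, n, z=1.96):
    """wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


low, high = wilson_interval(benghazi_count, WITH_HOME_TOWN)

print(f"benghazi records: {benghazi_count}")
print(f"share of the city's population: {share_of_city:.4%}")
print(f"plausible range for the benghazi share: {low:.1%} to {high:.1%}")
```

even in that best case, the range around 23.9% is wide, and the count itself is a vanishingly small slice of the city.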

so how exactly does that say anything about the people who lead the current rebellion? it doesn't. the other problem with using the 2007 study of foreign fighters in iraq to make a statement about the people who currently live in benghazi is that the sample in the 2007 study is at least partly self-selected. that is, it looks at people who chose to leave their home town to fight the u.s. in iraq and makes it seem like that is representative of the city they came from as a whole. which is nonsense.

for example, in 2008, one year after the study that thompson cites, the FBI estimated that "[a]nywhere from 15 to 20" minneapolis residents left to become foreign fighters in somalia for a group the FBI claims is linked to terrorists. minneapolis has a population of 382,578, which is about 25% smaller than benghazi's. that means that on a per capita basis, in 2008 minneapolis was home to roughly as many foreign fighters as benghazi had the year before, and possibly more. what does that say about the members of the minneapolis city council? uh, nothing. every city has its wackos (the minneapolis suburbs elected michele bachmann to congress!). the existence of those wackos doesn't mean that the entire city is filled with crazy people.
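here is the minneapolis comparison worked out in python, using only numbers from this post. the benghazi population of 500,000 is the rough "half-million" figure from above, so the result is a ballpark, not a precise rate.

```python
MINNEAPOLIS_POP = 382_578
BENGHAZI_POP = 500_000          # rough figure: "about a half-million"

MINNEAPOLIS_FIGHTERS = (15, 20)  # low and high ends of the FBI estimate
BENGHAZI_FIGHTERS = 21           # 23.9% of the 88 records with a home town


def per_100k(count, pop):
    """fighters per 100,000 residents."""
    return count / pop * 100_000


benghazi_rate = per_100k(BENGHAZI_FIGHTERS, BENGHAZI_POP)
mpls_low, mpls_high = (per_100k(n, MINNEAPOLIS_POP) for n in MINNEAPOLIS_FIGHTERS)

# how much smaller minneapolis is than benghazi, as a fraction
size_gap = 1 - MINNEAPOLIS_POP / BENGHAZI_POP

print(f"benghazi:    {benghazi_rate:.1f} per 100k")
print(f"minneapolis: {mpls_low:.1f} to {mpls_high:.1f} per 100k")
print(f"minneapolis is about {size_gap:.0%} smaller than benghazi")
```

the benghazi rate lands inside the minneapolis range, which is the whole point: a couple dozen people tell you nothing about a city of hundreds of thousands.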