[Image: back of the envelope, courtesy of dynamicdesigns]

Charles Stewart of MIT has a new post over at ElectionUpdates about the state of Florida’s capacity to handle the 2012 election turnout without lines.

Using 2010 data from the federal Election Administration and Voting Survey (EAVS), Charles calculated how many voters Florida’s voting stations should have been able to handle. In particular, he used a calculation proposed in a recent paper by William and Arthur Edelstein. As Charles notes:

Edelstein and Edelstein propose something called the “Queue Stop Rule.” Stated simply, the Queue Stop Rule calculates the number of voters who can be expected to use one voting station in one day without causing lines to form of people waiting for a voting station to open up. The formula is ½ x TD/TV, where TD equals the number of minutes on Election Day allocated to voting and TV is the average number of minutes it takes a voter to cast a ballot. In the case of Florida, with a twelve-hour voting time on Election Day, TD is equal to 720. If it takes an average of 5 minutes to cast a ballot, then no voting booth should handle more than ½ x 720/5 = 72 voters per day. If it takes 3 minutes to cast a ballot, then a voting booth should be expected to handle 120 voters per day; if 7 minutes, then the voting booth could handle 51 voters.
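The arithmetic in that passage is easy to reproduce. Here is a minimal sketch; the function name `queue_stop_capacity` and its parameter names are my own labels, not anything taken from the Edelsteins' paper or Charles' post, and rounding down to whole voters is my assumption about how the quoted figures were reached.

```python
def queue_stop_capacity(minutes_open, minutes_per_voter, coefficient=0.5):
    """Voters one station can serve in a day under the Queue Stop Rule.

    Implements coefficient * TD / TV, where TD is minutes_open and
    TV is minutes_per_voter. The default coefficient of 1/2 is the
    one quoted in the post.
    """
    if minutes_per_voter <= 0:
        raise ValueError("minutes_per_voter must be positive")
    return coefficient * minutes_open / minutes_per_voter


# Florida's twelve-hour Election Day: TD = 720 minutes.
for tv in (3, 5, 7):
    # int() rounds down, since a station cannot serve a fraction of a voter.
    print(f"{tv} minutes per voter: {int(queue_stop_capacity(720, tv))} voters")
```

Run against the three voting times in the quote, this should give back the same per-station figures Charles cites.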

Applying the Rule to Florida’s EAVS data, after accounting for (and in some cases correcting) missing figures, Charles comes to a very surprising conclusion:

Assuming the numbers reported in the EAVS are mostly accurate — with the exceptions noted above and accounted for — then the number of voting booths is consistent with most of the state having no lines at the polls on Election Day, so long as the average time to vote is five minutes or less. If it takes longer on average, then the number of voting booths is insufficient. There were complaints about the length of the Florida ballot in 2012, but from my own examination of sample ballots posted on various web sites, it is easy to imagine that the average voting time was around five minutes. [emphasis added]

Being the cautious-yet-data-driven type that he is, Charles concedes that his analysis could be “garbage in/garbage out” – but only if the data is inaccurate and better data comes along; until then, he says, “I will stand by this analysis.” Moreover:

if the analysis performed here is accurate, then it is hard to argue that the long lines on Florida on Election Day were caused by an under-supply of voting places, or even the ballots being long.

The envelope says that there shouldn’t have been lines, but experience tells us there were lines, especially in South Florida. That means one or more of the following things are true:

  1. The data is wrong;
  2. The formula is wrong; or
  3. We’re missing something.

Charles has already accounted for the first possibility in his post; indeed, spotty and/or incomplete responses to the EAVS have been an issue from its inception – though data quality is improving. But Charles’ efforts to conservatively account for those data issues don’t appear to create holes in his back of the envelope numbers.

The idea that the formula could be wrong is intriguing, but I don’t feel qualified to do much digging into methodological flaws. The one aspect of the formula that catches my eye is the 1/2 coefficient, which appears to have been added by the authors as a hedge against the ebbs and flows in voter arrival times:

As a sanity check, we consider what would happen if one were to specify that the number of voters per voting station should equal the number of minutes in a day divided by voting time needed by each voter … This would work only if voters came along at exact [“clockwork”] intervals. Even if the average voter flow were constant throughout the day, fluctuations of voter arrivals would result in small pileups. Surges would result in major pileups, as we have demonstrated. Our Queue Stop Rule, specifying half the number of voting stations obtained by assuming clockwork voter attendance, should have enough capacity to make long line formation extremely rare.
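The "clockwork" point can be illustrated with a toy simulation. This is only a sketch under assumptions of my own: a single voting station, a fixed five-minute voting time, and voters who each arrive at a uniformly random moment during the day. It is not the Edelsteins' queuing model, and `max_wait` is a hypothetical helper name.

```python
import random


def max_wait(arrivals, service_minutes):
    """Longest wait (in minutes) at one station serving voters in arrival order."""
    station_free_at = 0.0
    longest = 0.0
    for arrival in sorted(arrivals):
        # A voter starts when they arrive or when the station frees up,
        # whichever is later.
        start = max(arrival, station_free_at)
        longest = max(longest, start - arrival)
        station_free_at = start + service_minutes
    return longest


TD, TV = 720, 5  # minutes open and minutes per voter, as in the Florida example

# Clockwork: 144 voters arriving exactly TV minutes apart.
clockwork = [i * TV for i in range(TD // TV)]
print("clockwork, 144 voters:", max_wait(clockwork, TV))

# Assumed random arrivals: the same station at full load (144) and half load (72).
rng = random.Random(2012)  # fixed seed so a run is repeatable
for voters in (144, 72):
    arrivals = [rng.uniform(0, TD) for _ in range(voters)]
    print(f"random, {voters} voters:", round(max_wait(arrivals, TV), 1))
```

With evenly spaced arrivals nobody waits; once arrivals are random, the full-load station has no slack to absorb bunching, which is the intuition behind halving the load.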

If that coefficient is wrong, then the expected throughput of the voting stations will be wrong as well. Using the five-minute example from Charles’ post (720/5 = 144 voters at a clockwork pace), a coefficient of 1/3 would put the expected throughput at 48 voters per station; at 1/4, it would be 36. These small differences begin to add up as the number of voters increases.
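To see how sensitive the result is to the coefficient, here is a short sketch that holds the five-minute example fixed (TD = 720, TV = 5) and varies only the coefficient. The helper name `throughput` is hypothetical, and exact fractions are used simply to avoid floating-point rounding surprises.

```python
from fractions import Fraction


def throughput(coefficient, minutes_open=720, minutes_per_voter=5):
    """Expected voters per station per day: coefficient * TD / TV."""
    return coefficient * Fraction(minutes_open, minutes_per_voter)


# 1/2 is the Queue Stop Rule's coefficient; 1/3 and 1/4 are what-if values.
for coefficient in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)):
    voters = int(throughput(coefficient))  # whole voters, rounded down
    print(f"coefficient {coefficient}: {voters} voters per station")
```

Multiplying any of these per-station figures by a county's count of voting stations gives the total turnout that county could absorb under that assumption.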

Most likely, we’re missing something. Even the Edelsteins concede that their calculations would benefit from actual observation of polling places:

Finally, further observational data on voting times and voter cycle times are sorely needed. These data can be used in conjunction with queuing simulation to help decide how much equipment is required for each step of the voting process, and thus help specify cost-effective equipment that enables expeditious voting. Studies of line formation during elections would be very instructive in refining our model.

It’s all guesswork at this point, but I wonder if longer scan times for the lengthy ballot, combined with the physical limitations of polling places (e.g., backups in the voting area slowing check-in because there was no more room for new voters), contributed to congestion notwithstanding the number of voting stations.

If nothing else, Charles’ calculations are a great service because they help us diagnose what happened last week while also enabling us to refine the formula going forward.

That’s quite an envelope. Thanks, Charles …