Tuesday, November 9, 2010

I'm calling the race for...

... Justice-elect Charlie Wiggins.

I've run three different statistical models to project the final outcome of the race, and the result is unmistakable.  In each of the three models, Wiggins wins by a margin well outside the recount threshold.

Here's the breakdown:


Model 1 uses the "average" VCR/SMOV% totals.  These averages are calculated independently for each county based on all data returned so far.  Because the vote totals from election day are treated equally with the ballots being counted today, this model excludes most of the effects of the late shift in vote totals towards Wiggins.  This is the most conservative (not politically -- statistically) of the three models.  Although there are certainly theoretical models that could show a narrower gap, I probably wouldn't find them credible.


Model 2 swings towards the other end of the spectrum of acceptable modeling techniques.  This model uses the "latest" VCR/SMOV% totals.  These numbers come from the most recent day on which the county reported new votes, and are based only on that day's votes.  Accordingly, it picks up all of the late trends towards Wiggins, but in many counties, those numbers are based on a very small, unreliable sample size.  That said, this race has continued to trend further towards Wiggins each day that results have been counted.  It would not be unrealistic to see that trend continue; in that case, even this model would underestimate the potential Wiggins gain.
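
To make the difference between the "average" and "latest" approaches concrete, here is a minimal Python sketch.  It assumes each county's returns arrive as daily batches of (votes for the candidate, total votes counted that day), and it uses a plain vote share as a stand-in for the VCR/SMOV% figures.  The function names and the sample numbers are hypothetical and are not taken from the spreadsheet behind these projections.

```python
def average_share(batches):
    """Pooled share across every day reported so far (Model 1 style).

    batches: list of (candidate_votes, total_votes) tuples, oldest first.
    Returns None if no votes have been counted at all.
    """
    total = sum(t for _, t in batches)
    if total == 0:
        return None
    return sum(c for c, _ in batches) / total


def latest_share(batches):
    """Share from the most recent day that reported new votes (Model 2 style).

    Days with zero new votes are skipped, so the result comes from a single
    day's batch, which may be a very small sample.
    """
    for candidate_votes, total_votes in reversed(batches):
        if total_votes > 0:
            return candidate_votes / total_votes
    return None


# Made-up numbers for one county: election day, a later small batch,
# and a day with no new ballots reported.
example_batches = [(480, 1000), (110, 200), (0, 0)]
print(average_share(example_batches))  # pooled over all 1,200 ballots
print(latest_share(example_batches))   # based only on the 200-ballot day
```

In this made-up county the latest-day share is higher than the pooled share, but it rests on only 200 ballots, which is exactly the sample-size worry with Model 2.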


Model 3 uses a "progressive" methodology (again, that's statistically, not politically) for the VCR/SMOV% totals.  This is the same model I've been applying to the vote projections I posted on Monday and Tuesday.  In general, it uses the "average" number from each county, but it will use the specific averages from Monday, Saturday, or Friday (in order of preference) if the net new ballot data supplied that day is statistically significant.  For our purposes, statistically significant means two things: the county reported counting at least 500 new ballots that day, and those ballots amount to at least 10% of the overall ballots counted from that county so far.
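
The selection rule can be sketched in a few lines of Python.  This is only an illustration of the rule as described, not the actual spreadsheet logic.  The function names and the input layout are hypothetical, and it assumes "overall ballots counted so far" means the county's current cumulative total, including the day in question.

```python
MIN_NEW_BALLOTS = 500         # at least 500 new ballots counted that day
MIN_SHARE_OF_COUNTED = 0.10   # at least 10% of the county's ballots so far
PREFERRED_DAYS = ("Monday", "Saturday", "Friday")  # order of preference


def is_significant(new_ballots, counted_so_far):
    """True if a day's batch meets both thresholds.

    Assumption: counted_so_far is the county's cumulative total to date.
    """
    if counted_so_far <= 0:
        return False
    return (new_ballots >= MIN_NEW_BALLOTS
            and new_ballots / counted_so_far >= MIN_SHARE_OF_COUNTED)


def pick_rate(average_rate, daily, counted_so_far):
    """Choose the rate to project with for one county.

    daily: dict mapping a day name to (new_ballots, rate_for_that_day).
    Returns the first preferred day's rate that passes the significance
    test, falling back to the county's overall average otherwise.
    """
    for day in PREFERRED_DAYS:
        if day in daily:
            new_ballots, rate = daily[day]
            if is_significant(new_ballots, counted_so_far):
                return rate
    return average_rate
```

For example, a county with 6,000 ballots counted that reported 300 new ballots on Monday and 800 on Saturday would skip Monday (under 500 ballots) and use Saturday's number, since 800 is both over 500 and more than 10% of 6,000.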

All things being equal, I would normally expect this model to be a fairly accurate predictor of the final results.  However, all things are not equal.  In light of the shifting Wiggins trends across the board, I would probably lean towards somewhere between Models 2 and 3.


So there it is.  Even the most conservative model has Wiggins winning by over 5,000 votes.  I'm always hesitant to call a race with so many ballots outstanding and such a small margin of victory -- particularly when I'm calling it for the candidate not currently leading -- but looking at this data, I'm convinced that Wiggins has the upper hand and will coast to victory.

Expect the vote count to switch around 4:30 today when King County posts updated results.  Wiggins should take the lead then, and I don't see a model where he relinquishes it.

4 comments:

  1. Great stuff! Nate Silver would be impressed.

  2. Way to go! Can I steal your spreadsheet? I want to incorporate part of it into mine for a separate bet I've got going :-)

    mookie5381@gmail.com

    Awesome!

  3. Hey Michael - it's on its way!

  4. Got it! You should hit up Slog Happy next Thursday, and I'll buy you a beer. Your work is amazing, which is saying something for a 2Y (tee hee! - j/k). I'm going to be sure to keep the mixed model that I put together (tomorrow morning, that is - I'll share) for future statewide races, because it is simply that awesome, and that great of a predictor!
