
Dear PEC readers, I have a math puzzle. It relates to my gerrymandering project. If you are good at working with probability distributions, take a look. Can you solve it?

Here is the puzzle. It is basically a closed-form calculation of the numerical simulations I did for that NYT piece. It is for a peer-reviewed paper I am writing on how to establish criteria for fair Congressional districting.

**Partitioning of voters in a state with randomly selected districts.** Imagine a state with N districts and a two-party winner-take-all system (i.e., the U.S. system for electing House members). Select districts at random so that each district’s vote share v (for party #1) follows a near-Gaussian distribution with mean A and standard deviation S.

Now add the condition that the statewide two-party vote yields a fraction F_{1} of votes for political party #1 (and of course the other party gets F_{2}=1-F_{1}). Therefore districts v_{1..N} must satisfy the constraint sum(v_{1..N})/N=F_{1}.

What is the probability distribution of k, where k is defined as the number of districts in which v_{i}>0.5? Give the mean and SD of the expected number of seats to be won by party #1. Also describe the degree to which the distribution resembles a Gaussian.

*P.S. When F_{1} is close to A, I believe the answer is approximately <k> = N*p and std(k) = sqrt[N*p*(1-p)], where p = normcdf(F_{1},0.5,S). If you can do better, let me know!*

P.P.S. Here is a rephrasing of the problem: *Consider a normally distributed variable with mean mu and standard deviation sigma. Draw from it N times. You only accept sets of draws whose average is constrained to be mu’, which is unequal to mu. What is the distribution of the draws?*

*P.P.P.S. Probably solved. It’s as above, except instead of normcdf(F_{1},0.5,S) we have normcdf(F_{1},0.5,S*sqrt((N-1)/N)), where N is the number of districts. This arises in a semi-obvious way from the derivation of the standard error of the mean.*
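
For anyone who wants to check the proposed answer numerically, here is a minimal Monte Carlo sketch. It is not the simulation code from the NYT piece. It assumes exactly Gaussian districts of equal size, and it imposes the statewide constraint by recentring each set of draws so that its average equals F_{1} (for Gaussian draws this is equivalent to conditioning on the average). The function names `simulate_seats` and `predicted_seats` and the example parameter values are made up for illustration.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def simulate_seats(n_districts, avg, sd, statewide_share, n_trials=20000, seed=0):
    """Return simulated seat counts k for party #1 under the statewide constraint."""
    rng = random.Random(seed)
    seat_counts = []
    for _ in range(n_trials):
        draws = [rng.gauss(avg, sd) for _ in range(n_districts)]
        # Shift every district equally so the average is exactly statewide_share.
        # Assumption: exactly Gaussian draws, for which this recentring matches
        # conditioning on the average; the result then no longer depends on avg.
        offset = statewide_share - sum(draws) / n_districts
        seat_counts.append(sum(1 for v in draws if v + offset > 0.5))
    return seat_counts


def predicted_seats(n_districts, sd, statewide_share):
    """Binomial-style guess with SD shrunk by sqrt((N-1)/N), as in the P.P.P.S."""
    shrunk_sd = sd * math.sqrt((n_districts - 1) / n_districts)
    # NormalDist(0.5, s).cdf(F1) plays the role of normcdf(F1, 0.5, s).
    p = NormalDist(0.5, shrunk_sd).cdf(statewide_share)
    return n_districts * p, math.sqrt(n_districts * p * (1 - p))


if __name__ == "__main__":
    # Hypothetical example values, not taken from any real state.
    n, a, s, f1 = 10, 0.50, 0.10, 0.55
    ks = simulate_seats(n, a, s, f1)
    guess_mean, guess_sd = predicted_seats(n, s, f1)
    print("simulated mean, SD of k:", mean(ks), pstdev(ks))
    print("predicted mean, SD of k:", guess_mean, guess_sd)
```

The mean should agree closely. The SD is where the binomial formula is only approximate, because the constraint makes the districts slightly dependent on one another, so comparing the two SD values gives a rough sense of how good the approximation is for a given N.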


The gift I have in mind is kind of small: a signed copy of either (or both) of my books. I will see if I can think of something nicer to send…

I am traveling, and will reply soon. You’ll want to read it. Check back soon!

I’m preparing a long-form piece (for elsewhere) on the topic of partisan House gerrymandering. We’re cooking up some graphs to drive home some basic points. Your immediate reactions and critical questions will be welcome.

This graph shows what fraction of the two-party vote would have been needed for Democrats to control the House of Representatives.

The procedure was:

- Calculate the % two-party vote for all 435 districts.
- Calculate the shift in vote needed to make an outcome of exactly 218 Democratic seats.
- Add this shift to the national % Democratic vote.
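
The three steps above can be sketched in a few lines. This is an illustration rather than the code behind the graph. It assumes the shift is applied equally to every district, that the national Democratic share is supplied separately, and that a district sitting at exactly 50% counts as the tipping point. The function name `control_threshold` and the five-district toy data are invented for the example.

```python
def control_threshold(district_dem_shares, national_dem_share, seats_needed):
    """Return (threshold, shift): the national two-party share needed for control.

    district_dem_shares: Democratic two-party vote share in each district (0 to 1).
    national_dem_share:  national Democratic two-party share (0 to 1), supplied
                         by the caller; how to compute it is left open here.
    seats_needed:        seats required for control (218 for the 435-seat House).
    """
    if not 1 <= seats_needed <= len(district_dem_shares):
        raise ValueError("seats_needed must be between 1 and the number of districts")
    # Rank districts from most to least Democratic; the district in position
    # seats_needed is the one that must be carried to reach control.
    ranked = sorted(district_dem_shares, reverse=True)
    pivotal_share = ranked[seats_needed - 1]
    # Assumption: a uniform shift that brings the pivotal district to 50%.
    shift = 0.5 - pivotal_share
    return national_dem_share + shift, shift


if __name__ == "__main__":
    # Toy data: five made-up districts, three seats needed for control.
    toy_shares = [0.70, 0.55, 0.48, 0.45, 0.30]
    threshold, shift = control_threshold(toy_shares, national_dem_share=0.50, seats_needed=3)
    print(f"shift needed: {shift:+.3f}, threshold: {threshold:.3f}")
```

A negative shift means the party already holds at least the required number of seats, in which case the threshold falls below its actual national share.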

The colored horizontal line segments indicate which party was in control. Generally, the out-party needs a bit more than 50% of the two-party vote to gain control. This extra barrier is an advantage for the incumbent party.

*Note 1:* Dealing with uncontested races is a challenge. For instance, the 2006 data point is distorted by the fact that there were 47 uncontested races won by Democrats (versus only 10 won by Republicans). Forty-seven is an unusually high number. With other definitions, this data point is more comparable to 1996-2004.

*Note 2:* I came into this analysis expecting the 2012 value to be unusually high because of partisan gerrymandering. It is indeed high – but it is only on a par with 2004. I am pondering if there is a problem I am missing.

This post will self-destruct in 12 hours.

The code is a bit of a mess: mysterious variable names, bad structure, that kind of thing. I’ll clean it up later.

If you have 2010 or earlier House voting data in tabular form, let me know. It will allow additional tests.

I miss my commenters! Let’s see if Facebook-based threads are sustainable. Open discussion thread for the Presidential race. Ro-mentum, early voting, whatever…**have at it!**

I’ve identified the districts – now I need a way to display them conveniently. The ideal tool would be a compact app that uses a ZIP code to return the nearest three swing CDs, along with links to resources such as Pollster.com and campaigns (both D and R). For example, in California the swing districts are CA-07, 09, 10, 24, 26, 41, and 52. These are places where Get-Out-The-Vote (GOTV) activity would be most effective – for either side.

The swing districts are listed after the jump. Write me directly (left sidebar, About Us).

**Update for the very knowledgeable:** in one solution, the key missing piece of information is GIS-friendly Congressional district boundaries. If you have those…swoon!

**Pacific Coast states**

CA-07

CA-09

CA-10

CA-24

CA-26

CA-36

CA-41

CA-52

WA-01

**Arizona/Nevada/Utah/Colorado**

AZ-01

AZ-09

CO-03

CO-06

NV-03

NV-04

UT-04

**Midwest**

IA-03

IA-04

IL-10

IL-11

IL-12

IL-13

IL-17

IN-08

KY-06

MI-01

MI-11

MN-08

OH-06

OH-16

WI-07

**South, including Texas**

FL-10

FL-18

FL-22

FL-26

GA-12

NC-07

TX-23

**New England**

CT-05

MA-06

NH-01

NH-02

RI-01

**Northeast**

NJ-03

NY-01

NY-11

NY-18

NY-19

NY-21

NY-24

NY-27

PA-08

PA-12