Thanks to the chapter written by Peter Brent in The Crikey Guide to the 2007 Federal Election, I've started to look more closely at the statistics that lurk behind opinion polls.

Peter Brent's article set me off on a bit of a study on how polling works: so credit for the information I'm sharing here should go to him and some other web sources that I link to. Peter runs an interesting web site called Mumble.

I'll discuss the mechanics of opinion polling in the context of the most recent Galaxy poll of John Howard's seat of Bennelong.

The headline is that Labor candidate Maxine McKew is leading sitting member (Prime Minister) John Howard by 53% to 47%.

The poll is a serious concern for Howard, but the game's not over yet. Why?

Sample Size

It's an interesting fact that a sample of about 1000 well-chosen people is enough for a reasonably safe prediction of how voters will vote. But you need those 1000 people whether you are polling a single seat (Bennelong, with 86,220 voters), Australia (population 21,036,214 at 9:14pm on 12th August 2007) or the whole USA (population approximately 302,584,856). The accuracy depends on the size of the sample, not on the size of the population it is drawn from.
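
A rough way to see this is to apply the textbook margin-of-error formula with the finite population correction. This is only a sketch: it assumes a simple random sample, a 50/50 split and a 95% confidence level, and the function name is mine, not anything a pollster publishes.

```python
import math

def margin_of_error(sample_size, population, proportion=0.5, z=1.96):
    """95% margin of error for a simple random sample,
    with the finite population correction applied."""
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # The correction only bites when the sample is a sizeable
    # fraction of the whole population.
    correction = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * correction

# Population figures are the ones quoted in the post.
populations = {
    "Bennelong": 86_220,
    "Australia": 21_036_214,
    "USA": 302_584_856,
}

for name, population in populations.items():
    moe = margin_of_error(1000, population)
    print(f"{name:10s} + or - {100 * moe:.2f}%")
```

All three come out at about 3.1%. The correction for Bennelong shaves off only a few hundredths of a percentage point.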

This probably explains why opinion polls aren't carried out on single seats very often. It costs just as much for Howard's local paper (The Weekly Times) to conduct a poll as it does for the New York Times or USA Today to predict the US Presidential election. The Weekly Times have passed up on doing a poll of Bennelong, so we've got to be grateful to the Daily Telegraph for commissioning this opinion poll.

The sample for this poll was 800 people. That's a respectable size, but it's on the lower edge of respectability.

A thousand voters is the sample size normally used. With that sample, there's a margin of error of plus or minus three percent. So Maxine might get 50% or she could get as high as 56%. 53% is the most likely result, and there's a bell curve centred there. 52 or 54% are pretty possible too. 51 or 55 less so, but still within range. 50 or 56 are just possible. However: if the election were run 20 times, then on 19 occasions the result would fit somewhere within the 50-56 margin.
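
One way to check the "19 times out of 20" figure is to turn the question around: suppose true support really is 53%, and ask how often a poll of 1000 people would report something between 50% and 56%. The sketch below does that with the exact binomial distribution. It assumes a simple random sample, and the function names are my own.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials (0 < p < 1).
    Computed via logarithms to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)

def chance_poll_within(n, true_share, points):
    """Chance that a poll of n people lands within `points` of the true share."""
    # The small epsilon guards against floating-point noise at the boundaries.
    low = math.ceil(n * (true_share - points) - 1e-9)
    high = math.floor(n * (true_share + points) + 1e-9)
    return sum(binomial_pmf(k, n, true_share) for k in range(low, high + 1))

# Assumption: true support is exactly 53%.
print(f"{chance_poll_within(1000, 0.53, 0.03):.3f}")
```

The answer comes out close to 95%, which is where "19 out of 20" comes from.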

Which means you must realise: results outside that range ARE possible. And if you are doing polls over enough years, you must hit that 1 in 20 exception sometimes. So Maxine might get 49% and John Howard 51%, and that is an outcome you should sometimes expect to get with this poll.

Now: the fact that the sample was only 800 (rather than 1000) lowers the accuracy (or expands the range of error) a little bit.

Public Agenda have a good table:

The sample sizes go like this:

  • 2000 people:     + or - 2%
  • 1500 people:     + or - 3%
  • 1000 people:     + or - 3%
  • 900 people:       + or - 3%
  • 800 people:       + or - 3%
  • 700 people:       + or - 4%

You'll see that Galaxy's choice of 800 was kind of strategic: the error still rounds to 3%, but only just. The standard formula puts it at about 3.5%, right on the edge of rounding up to 4%. A sample size of 1000 people sits more safely in the middle of the 3% range.
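
The rounded figures in the table can be reproduced with the usual formula for a 95% margin of error. I'm assuming that is how the table was built (a 50/50 split and a simple random sample); the source doesn't spell it out.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """95% margin of error for a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (2000, 1500, 1000, 900, 800, 700):
    moe = 100 * margin_of_error(n)
    print(f"{n:5d} people: + or - {moe:.2f}% (rounds to {round(moe)}%)")
```

For 800 people the unrounded figure is about 3.46%, and for 1000 people it is about 3.10%.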

(These figures assume a roughly 50/50 outcome. If your respondents were out in the range of 90% in favour of something and 10% against, the error ranges would be different.)
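
To get a feel for how much the split matters, here is the same textbook formula applied to a sample of 800 at a few different splits. Again this is a sketch that assumes a simple random sample and a 95% confidence level.

```python
import math

def margin_of_error(sample_size, proportion, z=1.96):
    """95% margin of error for a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for proportion in (0.5, 0.7, 0.9):
    moe = 100 * margin_of_error(800, proportion)
    print(f"{proportion:.0%} in favour: + or - {moe:.1f}%")
```

The margin is widest at 50/50 and shrinks as opinion becomes more lopsided: at 90/10 it is roughly 2% rather than 3.5%.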

Preference Distribution

In reality the poll respondents didn't all select only Johnnie or Maxine. Some of them chose candidates from other parties. As a first preference, 47% chose McKew and 44% chose Howard. McKew would need to pick up preferences from the Greens or some other minor party candidate before she got elected.

According to Peter Brent's article in the Crikey guide, Galaxy has one of the best methods for deciding how the 9% of people voting for minor parties will be allocating their preferences between Labor and Liberal. Some polls simply split the difference according to major party preferences. Others ask the voters where their next preferences will flow. Galaxy looks at how minor party preferences flowed in past elections. In Galaxy's predictions over the past few elections this method has proved better than asking people where their preferences will go. But I'm not entirely sure if it's appropriate for a single electorate where local issues and local candidates might work differently.
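
The arithmetic behind the headline can be sketched from the published numbers alone. The functions below are my own illustration, not Galaxy's method: they simply work out what share of the 9% minor-party vote must have been allocated to McKew to turn 47/44 into 53/47, and what a plain 50/50 split would have given instead.

```python
def two_party_preferred(primary_a, primary_b, flow_to_a):
    """Split the minor-party vote between two candidates.
    Primaries are in percent; flow_to_a is a fraction between 0 and 1."""
    minor = 100 - primary_a - primary_b
    return primary_a + minor * flow_to_a, primary_b + minor * (1 - flow_to_a)

def implied_flow(primary_a, primary_b, tpp_a):
    """Fraction of the minor-party vote that candidate A must receive
    to reach a two-party-preferred figure of tpp_a percent."""
    minor = 100 - primary_a - primary_b
    return (tpp_a - primary_a) / minor

# Figures from the poll: McKew 47, Howard 44, headline 53 to 47.
flow = implied_flow(47, 44, 53)
print(f"Implied flow to McKew: {flow:.0%}")
print("Even split would give:", two_party_preferred(47, 44, 0.5))
```

On these rounded figures McKew is being given about two thirds of minor-party preferences. An even split would leave her on 51.5% to Howard's 48.5%, so the allocation method matters.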

In the absence of any better information, we'll have to assume that Galaxy has done a good job in allocating preferences here. (Unfortunately I can't yet find any description on Galaxy's web site about how they allocated preferences for this poll.)

So can we trust the poll?

Well, despite my sceptical mind: Probably.

  • Galaxy haven't told us how many people are still undecided. Undecideds can sway a poll if they are generally from one party.
  • It's still a long time until the election. Howard's still got time to announce a federal government takeover of the local kindergarten, or whatever it takes to get his electorate feeling good about him. Galaxy's poll might be great for 10/8/07 but hopeless for 17/11/07 (or whenever the still unannounced election ends up happening.)  
  • Even if the poll is accurate and people don't change their minds between now and election day, there's still roughly a one in 20 chance that a majority of Bennelongians really do want to re-elect Howard.
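
That last point can be roughed out with a normal approximation. This is a back-of-the-envelope sketch only: it treats sampling error as the sole source of error, takes the 53% two-party figure at face value, and ignores the preference allocation and the undecideds. The function name is mine.

```python
import math

def chance_true_share_below(poll_share, sample_size, threshold=0.5):
    """Normal-approximation chance that true support is below `threshold`,
    given a poll result from a simple random sample."""
    standard_error = math.sqrt(poll_share * (1 - poll_share) / sample_size)
    z = (threshold - poll_share) / standard_error
    # Standard normal cumulative distribution function, via erf.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p = chance_true_share_below(0.53, 800)
print(f"Chance McKew is really below 50%: {p:.3f} (about 1 in {1 / p:.0f})")
```

Under those assumptions the chance comes out in the neighbourhood of one in 20.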

But he'd have to be feeling rather worried.

Comments (1)
Anthony Holmes August 12th, 2007 09:59:13 PM

1) Update: 31 percent undecided !
13/08/2007 10:09:16 PM

In analysing the Galaxy poll of Bennelong (above), I mentioned that the percentage of undecided voters in the poll was not specified. I've just re-read the Telegraph article (linked above) and I now notice that they did mention the undecided proportion.

They said "Dangerously for Mr Howard, the poll shows that an increasing proportion of voters have now locked in their decision - a figure of 69 percent compared to 62 percent for a similar poll in May".

Which means that 31 percent are still either undecided, or they deliberately chose to keep their preferences secret.


Despite the Telegraph's spin, that's a huge undecided number, which undermines the accuracy of the poll. The predictions *might* be right. But they *might* be very wrong too.
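
To put a rough number on how wrong, here is a what-if calculation. It rests on an assumption the Telegraph article does not confirm: that the 69 percent who have locked in split 53/47 like the headline figure. The function is purely illustrative.

```python
def share_needed_from_unlocked(locked_fraction, locked_share, target=0.5):
    """Share of the not-yet-locked-in voters a candidate needs
    in order to reach `target` overall."""
    unlocked_fraction = 1 - locked_fraction
    return (target - locked_fraction * locked_share) / unlocked_fraction

# Assumption: the locked-in 69% split 47% to Howard, as in the headline.
needed = share_needed_from_unlocked(0.69, 0.47)
print(f"Howard would need about {needed:.0%} of the voters who haven't locked in")
```

On that assumption Howard would need around 57% of the voters who haven't locked in, which is a tall order but hardly impossible.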

If I were Howard, I'd be delighted to hear that 31 percent of my electorate hadn't yet decided to vote against me.


