IFComp 2009: Correlation between rating and the number of testers

Everybody's always talking about how important it is to have your game tested (or at least I'm always talking about it). But does it really matter? Surely if you have a great idea and enough enthusiasm you can do without?

Well, no. People's tolerance for broken or flawed games seems to go down every year. On the other hand, it is delightful to see that such broken games are becoming rarer and rarer. This year only 5 games had no testers at all. Yes, five. That means 79 % of the games had at least some testing. Compared to last year's figure, which was about 50 %, this is a huge improvement. I feel happy and warm inside.

So, how did the number of testers affect the final rating? Here's a handy scatter plot.

[Scatter plot: number of testers on the vertical axis, rating (from 1 to 24) on the horizontal axis]

The correlation looks quite clear, doesn't it? One game had 15 testers and three games had 16; those four all placed between 2nd and 5th. Only the winner had somewhat fewer -- 9 (though it had two authors, whose design and testing process might arguably work a bit differently).

Conversely, the bottom three had no testers at all. The other two untested games placed 18th and 19th, with only two games separating them from the rest of the untested works.

Here's the same data, rank by rank (asterisks explained below):

Rank  Game                                  Testers
  1   Rover's Day Out                          9
  2   Broken Legs                             16
  3   Snowquest                               16
  4   The Duel That Spanned the Ages          15
  5   Earl Grey                               16
  6   The Duel in the Snow                     7
  7   Resonance                                2
  8   Interface                                8
  9   Byzantine Perspective                   10
 10   Grounded In Space                        5
 11   Yon Astounding Castle!                   4
 12   Condemned                                1*
 13   Eruption                                 2
 14   Beta Tester                              1*
 15   The Ascot                                1
 16   Spelunker's Quest                        3
 17   The Believable Adventures of...          4
 18   The Grand Quest                          0
 19   Star Hunter                              0
 20   GATOR-ON, Friend to Wetlands!            1
 21   Gleaming the Verb                        4
 22   zork, buried chaos                       0
 23   Trap Cave                                0
 24   The Hangover                             0

The correlation becomes even clearer when we compare the number of testers to each game's average score:

[Scatter plot: number of testers on the vertical axis, average score (from 10 to 1) on the horizontal axis]

I don't think I need to spell out the lesson to be learned from this data.

* Note that an exact number of testers was hard to pin down for some games. Yon Astounding Castle named one tester and mentioned "anonymous testers" without saying how many. Condemned and Beta Tester did not credit any testers in-game (boo), but evidence elsewhere on the Internet reveals that they did have testers. Both have been marked down as having one tester, though in truth they could have had many more. Many other games credited people for tools, inspiration, technical support or other help; these are not counted in the numbers. If you have more accurate information, please let me know so I can update the charts.

Update: The author of Yon Astounding Castle confirms that there were 4 testers total.

5 thoughts on "IFComp 2009: Correlation between rating and the number of testers"

  1. This is an interesting correlation, but I wonder how much control authors have over the number of testers they have? In the case of Snowquest I simply advertised for testers and used all the volunteers I got (as well as recruiting one or two known testers at an earlier stage to get early feedback on the design). I was happy to have 16 testers, but I didn't set out to recruit that particular number!

  2. The correlation makes perfect sense. Some testers are better than others, of course, but if you start with more, you'll surely end up with better data (other things being equal). Collaborating with another writer does seem to improve the game's quality as well (as I think Eric might agree).

  3. Oops ... didn't mean to be anonymous there. Another factor, I think, is that authors who are better at their craft (either because of their experience or simply because they care more about their work) are likely to be the ones who employ more testers. So there's a positive correlation that is not entirely causative in nature.

  4. Hey, love your data. Can I have them?

    -- If you have the spreadsheets, or even just tables, available to email me, I'd like to check them out with the software I know.

    Conrad.

  5. Sure. Here's the whole shebang with all sorts of data I collected from the comp09 games. Anyone is free to use it as they please. I don't claim that it's 100 % accurate; if you find errors, let me know so I can update it.

Comments are closed.

