The biggest problem (and a very common one) is unknown factors. In particular, most people have not personally seen all the shows and need outside input in order to come to a decision. A close second is that it's hard to figure out what the results of that decision might be. The first issue is self-explanatory. The second is more complicated. Let's say my goal were to pick shows that would increase the size of the audience, equalize the gender balance, and raise the audience's level of education. For audience size, I might think that more popular shows (e.g. those with higher ANN ratings) would be more attractive, but what if everyone has already seen those shows and would be driven away? For gender balance, I might think that shoujo-genre shows would be more attractive, but what is the mechanism by which women would learn about the showings and choose to attend? For education, I might think that more "fringe" shows (e.g. Tale of Genji or Angel's Egg) might broaden people's horizons, but they might also drive people away.
Ben started the reviews site in the e-board repository last semester, which seems to give people significant information about shows they have not seen, in an easy, centralized manner. There seems to be interest in this, and if people get in the habit of contributing to it and reading it, people's general level of knowledge should increase. But summaries only get us part of the way there. Dossiers are often written to attract people to a show, focusing on aspects that would convince a person to start it. While there is some attempt to structure things to balance present and future expectations (e.g. with the "how strong is the second half" question), in reality the general motivation will be to induce an unknowing person to start the show, not to get a person in the middle of the show to finish it. To some extent, the quantitative ratings statistics carry some of this information. A higher-rated show is more likely to have a satisfying start and finish (since others found it satisfying), and we also get a sense of how the general public feels about the show. Thus, someone evaluating the show could combine the numerical rating with their own attraction to elements of the show and weigh both when voting. However, these numerical scores carry their own problem: how do we know how ratings by unknown thousands of people on the internet relate to how CJAS members will react? (People won't like a show just because we show it. Say you have a bunch of people who don't like comedy. If you inundate them with comedy, they won't learn to like it. They'll just leave.)
Of course, these issues are nothing new. But I was revisiting the topic of surveys recently and realized that answers to a lot of these issues could be found in existing survey data. So I took the FA03, FA04, and SP05 survey data and did some analysis on the derived statistics for each show. In particular, for each show, I analyzed the following variables:
- Fall mean rating
- Fall std. dev.
- Fall # responses
- Spring mean rating
- Spring std. dev.
- Spring # responses
- ANN # ratings
- ANN # people indicating they've seen part or all of the show
- ANN arithmetic mean
- ANN arithmetic std. dev.
- ANN weighted mean
- ANN Bayesian score
- Was the show shown as a series (0/1 dummy variable)
- Moderate (0.5 <>
- High correlation with high significance between Spring mean and ANN Bayesian score
- No relationship between Fall # responses and ANN Bayesian score
- High correlation with high significance between Spring # responses and ANN Bayesian score
- Low (0.3 <>
- High correlation with high significance between Spring # responses and shown as a series
- No relationship between Fall mean and Fall # responses
- High correlation (R > 0.7) with high significance between Spring mean and ANN # ratings
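Findings of the "correlation with significance" kind listed above can be reproduced with a few lines of code. The following is only a minimal sketch, not the analysis actually run for this post: it computes Pearson's R for two per-show columns and estimates significance with a permutation test, which is swapped in here because it needs nothing beyond the Python standard library. The function names and the six data points are invented for illustration; they are not CJAS or ANN figures.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided p-value: how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible 0.
    return (hits + 1) / (trials + 1)


# Hypothetical placeholder values, one entry per show (NOT real survey data).
ann_bayesian = [7.1, 8.3, 6.5, 8.9, 7.8, 6.9]
spring_mean = [3.2, 4.1, 2.9, 4.5, 3.8, 3.0]

r = pearson_r(ann_bayesian, spring_mean)
p = permutation_p_value(ann_bayesian, spring_mean)
print(f"R = {r:.2f}, permutation p = {p:.4f}")
```

With only a handful of shows per semester, a permutation test avoids leaning on distributional assumptions, but any standard significance test for a correlation would serve the same purpose.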
Caveats aside, it's quite interesting that showing a series drives up # responses so much more in the spring than in the fall. It's also interesting that ANN ratings drive # responses significantly in the spring but not in the fall, and that those same ratings are much more strongly correlated with CJAS ratings in the spring than in the fall.
When I did some preliminary work on regression models to predict the CJAS rating from ANN data, I was surprised to find that the multiple R of my best multiple regression on Fall mean was around 0.6, whereas on Spring mean it was above 0.9. This would suggest that the ANN data are a good predictor of the audience's impression of a show after seeing the whole thing, but not as good a predictor of the audience's first-half reaction.
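For readers unfamiliar with the term, multiple R is the square root of R-squared for a regression with several predictors, or equivalently the correlation between the fitted and the observed values. The sketch below shows one way to compute it with ordinary least squares using only the Python standard library. It is an illustration under assumptions, not the model from this post: the choice of predictors (ANN Bayesian score plus the shown-as-series dummy), the function names, and all the numbers are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, ys):
    """Return [intercept, coef_1, ..., coef_k] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def multiple_r(rows, ys, coefs):
    """Square root of R-squared for a fitted model with an intercept."""
    fitted = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("multiple R is undefined for a constant response")
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Hypothetical placeholder data, one row per show (NOT real survey data):
# (ANN Bayesian score, shown-as-series dummy) -> Spring mean rating.
rows = [(7.1, 0), (8.3, 1), (6.5, 0), (8.9, 1), (7.8, 0), (6.9, 1)]
spring_mean = [3.2, 4.3, 2.9, 4.6, 3.7, 3.3]

coefs = fit_ols(rows, spring_mean)
print("coefficients:", [round(c, 3) for c in coefs])
print("multiple R:", round(multiple_r(rows, spring_mean, coefs), 3))
```

Running the same fit once against the Fall means and once against the Spring means would give the two multiple-R values being compared here.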
What I eventually aim for is getting some statistically significant model that can predict the audience's reaction to a show based on factors we can discover or control at the time of scheduling.
1 comment:
There could be a reason for the Spring/Fall difference--most of our members were Fall start. See if filtering out the freshmen reduces the difference? -Kris