Related Articles?

Here’s a great example of the danger of using code to suggest related articles without having someone check the results. In an article in the UK’s Daily Telegraph entitled “Education secretary Michael Gove admits he was beaten at school”, there is a panel headed “Related Articles”. This is what it lists:

  • Conservative Party Conference 2010 live
  • The Ashes: first Test, day two, report
  • Ryan Giggs is the Premier League player for all seasons
  • Gay Saudi prince 'murdered servant in ferocious attack'
  • England given Ashes hope as brittle Australia lose another cliff-hanger, this time to India
  • Kim Jong-un rumoured to have undergone plastic surgery

I think this is fertile ground for at least three subject areas in a school:

The ICT or education technology area

Using code to generate recommended or related articles may save time, but when it goes wrong we get a first-class demonstration of the fact that computers and related technology don’t think. Part of the ICT curriculum in England and Wales requires pupils to be able to evaluate the plausibility of information. This is not the same as accuracy, and must surely take context into consideration. After all, there is nothing to suggest that any of the news headlines listed above is untrue (inaccurate), but to suggest they are somehow related to our Education Secretary being beaten at school is hardly plausible.

It also highlights the fact that it’s too easy to blame mishaps on “computer error”. At some point, a human being is responsible, either at the programming stage or the checking and proofreading stage.

For me, it also calls into question the “most popular articles” insets you see on many blogs. I don’t see the point of them anyway: I’d rather have a snippet of code that links to my least popular blog posts, so that they get read more; otherwise the list just becomes a self-fulfilling prophecy. But leaving that aside for the moment, how do you know those are the most popular articles? Maybe someone clicked on one of them 50 times. Most of the time this would not be important enough to lose sleep over, but if someone were running a supposedly unbiased series of reports on a particular issue, and on their blog you saw that their “most popular article” was especially one-sided, would you not have reasonable cause to be sceptical?
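
The “50 clicks” worry is easy to make concrete. Here is a toy sketch, with entirely invented data, of how a naive “most popular” widget that counts raw clicks can crown a different winner from one that counts distinct readers:

```python
from collections import Counter

# Hypothetical click log as (visitor, article) pairs -- invented data,
# purely to illustrate the point.
clicks = [
    ("alice", "Post B"),
    ("bob", "Post B"),
    ("carol", "Post B"),
] + [("dave", "Post A")] * 50   # one reader hammering the same post

# "Most popular" by raw clicks: Post A wins with 50
by_clicks = Counter(article for _, article in clicks)

# "Most popular" by distinct readers: Post B wins with 3
unique_visits = {(visitor, article) for visitor, article in clicks}
by_readers = Counter(article for _, article in unique_visits)

print(by_clicks.most_common(1))   # [('Post A', 50)]
print(by_readers.most_common(1))  # [('Post B', 3)]
```

Same log, two defensible definitions of “popular”, two different answers — which is exactly why a one-sided “most popular article” deserves a sceptical eye.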

The Citizenship area

This leads nicely on to considering the citizenship or civic education area. Can bias be present in the news even if code is generating the content (or some of it)? And what is “news” anyway? For some, it will be the Conservative Party Conference in the list above; for others, cricket, and so on. What about the importance of news stories? If that is decided, in effect, by a popularity contest, is that really democracy in action? Is it right that the issues deemed to be the most important are the ones chosen in a way that involves (or may involve) less debate than the X Factor?

The Literacy area

It’s interesting to me that (I infer) the keyword on which this list of “related” news stories is based is “beaten”. So the questions which arise in such instances are:

  • What’s the main keyword?
  • Was that the most appropriate keyword in the circumstances?
  • What are its synonyms?
  • Do these synonyms really mean the same as the word from which they arise?
  • What is someone who is not fluent enough in English to pick up all the different nuances to make of the list of headlines?
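
To see how a single keyword can do this much damage, here is a minimal sketch of shared-keyword matching. This is only my guess at the general technique — the stop-word list, helper names, and paraphrased headlines are all invented for illustration, not the newspaper’s actual code:

```python
# Common words to ignore when extracting keywords (an invented list).
STOP_WORDS = {"the", "a", "at", "he", "was", "in", "to", "is", "for", "all"}

def keywords(headline):
    """Lower-case the headline, strip punctuation, drop stop words."""
    return {w.strip("'\",.:") for w in headline.lower().split()} - STOP_WORDS

def related(target, candidates):
    """Return every candidate sharing at least one keyword with the target."""
    target_kw = keywords(target)
    return [c for c in candidates if keywords(c) & target_kw]

article = "Education secretary Michael Gove admits he was beaten at school"
pool = [
    "England beaten as brittle Australia lose another cliff-hanger",
    "Ryan Giggs is the Premier League player for all seasons",
]

print(related(article, pool))
# The cricket report comes back as "related" solely because it shares
# the word "beaten"; the Giggs story, sharing no keyword, does not.
```

The words match; the meanings do not — which is precisely the gap between accuracy and plausibility that the matcher cannot see.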

All of which goes to show, I think, that accuracy on websites must depend on extrinsic aspects like context and audience as well as intrinsic ones like grammar.

This will be cross-posted at