We’ve spent the morning looking for datasets around the games, and the torch relay in particular. Here’s what we’ve got so far:
The first step was to map out exactly what we wanted to do and the elements involved. The games involve so many aspects that it’s easy to get sidetracked into various explorations. Here’s the image rounding up the elements:
The end result we are working back from is some sort of overview of content generated around the route. Working back from that, we have the following steps:
- Identify the route for the torch and related events: times and places
- Identify sources of content around those events: the Citizen Relay project; use of the #citizenrelay tag by others; traditional media; hyperlocal media; alternative media; issue-led media (e.g. groups concerned with accessibility); geo-located social media (those near event locations); keyword-identified social media (those mentioning ‘torch’, ‘relay’, ‘venue name’, etc.)
- Set up systems to capture and possibly classify that content – e.g. social media scrapers; training for Citizen Relay members
- Create systems to interrogate that content, e.g. by theme, quantity, network relationships, etc.
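As a rough illustration of the "capture and classify" and "interrogate by theme, quantity" steps, here is a minimal Python sketch that tags captured posts by keyword and counts them per theme. The theme names, the keyword lists and the function names are our own assumptions for the example, not part of any existing Citizen Relay tooling, and real venue names would need adding.

```python
# Hypothetical keyword groups: theme name -> keywords that signal it.
# Venue names would be added here once the route data is collated.
KEYWORDS = {
    "torch": ["torch", "torchbearer"],
    "relay": ["relay", "#citizenrelay"],
}


def classify(text, keywords=KEYWORDS):
    """Return the sorted list of themes whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(
        theme
        for theme, words in keywords.items()
        if any(word in lowered for word in words)
    )


def count_by_theme(posts, keywords=KEYWORDS):
    """Count how many posts mention each theme at least once."""
    counts = {}
    for post in posts:
        for theme in classify(post, keywords):
            counts[theme] = counts.get(theme, 0) + 1
    return counts
```

Plain substring matching is crude (it would also match "torchlight", for instance), but it is enough to get a first sense of quantity before investing in anything more sophisticated.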
The data
It turns out Oliver O’Brien mapped the general route (1029 locations) last November. He blogged the background to it at the time, and the JSON with the data on the locations is here. There’s a similar list on the Telegraph website produced around the same time.
That doesn’t give us street-level data, however. This is listed, for the first 6 days at least, in a series of PDFs on the ITV News site. Running those through PDFtoexcelonline.com doesn’t seem to work, though, and 6 days isn’t that useful anyway, so I stopped there.
That said, an advanced search for “torch route locations filetype:xls” or “torch route locations filetype:pdf” does bring up some results from local authorities. The problem is collating all these into a single dataset. Crowdsourcing may be a possible solution.
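To give a sense of what collating might involve, here is a minimal Python sketch that merges several local-authority files into one list of rows. It assumes each file has already been exported to CSV text, and that different councils use different column headings for the same thing; the alias table and the function names are hypothetical and would need extending as real files turn up.

```python
import csv
import io

# Hypothetical mapping from headings seen in different files
# to one canonical set of column names.
COLUMN_ALIASES = {
    "date": "date",
    "day": "date",
    "time": "time",
    "street": "location",
    "road": "location",
    "location": "location",
}


def normalise_row(row, source):
    """Rename known columns to canonical names and record the source."""
    out = {"source": source}
    for key, value in row.items():
        if key is None or value is None:
            continue  # skip ragged rows with extra or missing cells
        canonical = COLUMN_ALIASES.get(key.strip().lower())
        if canonical:
            out[canonical] = value.strip()
    return out


def collate(sources):
    """Merge a mapping of source name -> CSV text into one list of rows."""
    rows = []
    for name, text in sources.items():
        for row in csv.DictReader(io.StringIO(text)):
            rows.append(normalise_row(row, name))
    return rows
```

Keeping the source name on every row matters if the work is crowdsourced, because it lets anyone trace a suspect entry back to the file it came from.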
Some other thoughts:
- The games site itself includes data on all the torchbearers, navigable by date, location and name (3 characters minimum).
- Hyperlocal blogs could be located through OpenlyLocal.
- Local campaign groups around accessibility could also be identified.
- The 60 members of the Citizen Relay project could be encouraged to add postcodes to their tweets to aid geolocation, and to encourage other tweeters outside the group to do the same.
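If tweeters did add postcodes, pulling them out of the tweet text could look something like the Python sketch below. The regular expression is a deliberately simplified approximation of the UK postcode format rather than a full validator, and the function name is our own.

```python
import re

# Simplified UK postcode pattern: outward code (e.g. "B4", "SW1A")
# then inward code (e.g. "7XG"), with optional whitespace between.
# This is an approximation and will accept some invalid postcodes.
POSTCODE_RE = re.compile(
    r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b",
    re.IGNORECASE,
)


def extract_postcodes(text):
    """Return postcodes found in the text, upper-cased with one space."""
    return [
        f"{outward.upper()} {inward.upper()}"
        for outward, inward in POSTCODE_RE.findall(text)
    ]
```

The extracted postcodes would still need looking up against a postcode-to-coordinates dataset before they could be placed on a map.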