Tag Archives: big data

5 tips on mining and using big data for journalists

Matt Fowler is a freelance application developer and programmer who helps journalists understand and use big data. At the CIJ Summer School this year he gave some top tips in the field, which we have summarised below…

1. Double check privacy settings of your data

You don’t want private work published for all to see.

2. Tidy up the data and make the structure simpler

Restructuring the data can bring details to the surface and help you discover information to turn into stories.

Too big for Excel? What to do with big datasets

Recently, members of the NICAR mailing list (for journalists who use computer-assisted reporting) discussed how they deal with datasets that are ‘too big for Excel’. With their permission, I’m reproducing a digest of the highlights.

How much is too much?

Different versions of Excel have different limits on the data they can handle, from just over a million rows (1,048,576) in Excel 2010 down to just 16,384 rows by 256 columns in Excel 5. Office Watch gives a good rundown of the various versions.
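As a rough illustration of checking a file against those limits before opening it, here is a minimal Python sketch. It assumes the data is a CSV file; the function names `count_rows` and `versions_that_fit` are made up for this example, and the table uses the exact per-sheet row limits (16,384 for Excel 5, 65,536 for Excel 97 to 2003, 1,048,576 for Excel 2007 onwards). The sample data is generated in memory purely for demonstration.

```python
import csv
import io

# Per-sheet row limits for each Excel generation.
EXCEL_ROW_LIMITS = {
    "Excel 5": 16_384,
    "Excel 97-2003": 65_536,
    "Excel 2007 and later": 1_048_576,
}


def count_rows(csv_file):
    """Count the records in an open CSV file, header row included."""
    return sum(1 for _ in csv.reader(csv_file))


def versions_that_fit(row_count):
    """Return the Excel versions whose single-sheet limit can hold row_count rows."""
    return [name for name, limit in EXCEL_ROW_LIMITS.items() if row_count <= limit]


# Hypothetical dataset: one header row plus 70,000 data rows, built in memory.
sample = io.StringIO(
    "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(70_000))
)
n = count_rows(sample)
print(n, versions_that_fit(n))
```

With a real file you would pass `open("yourfile.csv", newline="")` instead of the in-memory sample. An empty list from `versions_that_fit` means the data will not fit on a single sheet in any of the listed versions.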

Tom Torok points out that Excel 2007’s million-row limit is per sheet, rather than per workbook (spreadsheet), so if you have…