Sid Ryan wanted to see if planning applications near planning committee members were more or less likely to be accepted. Here’s the first part of how he did it (a second part – on researching people – here):
While researching Hammersmith and Fulham councillors’ registers of interest for a feature, I began looking into the council’s planning applications database.
By joining up the council’s data and presenting it on a simple map I could show the building hotspots and make accessing public data much easier, even if I didn’t find the undue influence by councillors I was looking for.
Below is a guide to making the map itself; a follow-up post will go through adding the councillors’ details and researching them using public records.
Getting the Data
Hammersmith and Fulham use a format for publishing their planning applications that makes it difficult to scrape. Whereas a long list of applications (similar to the XML lists released on Openly Local) would have been relatively easy, the aspx database the council uses is more difficult.

An aspx scraper was beyond my technical skill to write, so I asked for help on the ScraperWiki Google Group. Paul Bradshaw gave me some pointers and explained how the scraper would work. While Hammersmith and Fulham’s website doesn’t have a simple list, it does have a drop-down menu for applications made each week. This means that to get the data into a usable form, the scraper would have to be coded to work around the menu to get at the weekly list, pull the data for that week and then look for the next week. Craig Russell wrote and annotated this aspx scraper, which can help in getting your head around what the program is doing.

The scraper we decided to use was an existing one written by Andrew Speakman. It collects planning applications for the whole of London, using different modules to probe the different types of database in use by London’s boroughs. One of those modules can navigate Hammersmith and Fulham’s aspx database. However, the raw data from Andrew’s scraper contains 12,000 results from across London. To isolate the borough we want, clicking “Explore with API” on the scraper’s page on ScraperWiki will bring up a filtering tool. Change the format to CSV and enter the SQL query:

select * from `applications` where `authority` LIKE "Hammersmith and Fulham"
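The filtering step can also be reproduced locally. As a rough sketch (the table and column names follow the article; the sample rows are invented for illustration), the same SQL query run against an in-memory SQLite copy of the data looks like this:

```python
import sqlite3

# A tiny in-memory stand-in for the scraper's "applications" table.
# Column names follow the article; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applications (authority TEXT, reference TEXT)")
conn.executemany(
    "INSERT INTO applications VALUES (?, ?)",
    [
        ("Hammersmith and Fulham", "2012/01578/ADV"),
        ("Camden", "2012/9999/FUL"),
        ("Hammersmith and Fulham", "2012/02244/FUL"),
    ],
)

# The same query you would enter in the "Explore with API" box.
rows = conn.execute(
    'SELECT * FROM applications WHERE authority LIKE "Hammersmith and Fulham"'
).fetchall()

for authority, reference in rows:
    print(authority, reference)
```

Note that LIKE with no wildcard characters behaves as a (case-insensitive) exact match, which is all we need here.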
This will select every record in the dataset where the authority name is "Hammersmith and Fulham". It returns a CSV file with the 923 planning applications made in the last six months, which looks something like this.
Tailoring the Data
The CSV file will upload straight into Google Drive, where you can choose for it to be converted into a spreadsheet. Each record has a column for the local authority, the description of the works to be carried out, the address, postcode, latitude and longitude, the dates the record was received, validated and scraped, the URL to the record on the council’s database, and a reference. What’s missing is whether the application was approved or denied.

But using the link to the council’s own record you can find out. If you go to the council’s page and click “view source”, you’ll find this piece of HTML, which shows whether the application was approved or denied:

<input type="text" readonly name="decision" id="decision" size="40" class="DetailInput" value="Application Approved">

Google spreadsheets can use the importXML function to fetch this for us. Enter the following into a column on the right-hand side of the table:

=ImportXML(**cell reference for the planning URL**, "//input[@name='decision']//@value")
This tells Google spreadsheets to follow the link you supplied, look inside tags called “<input>” for the one whose name attribute is ‘decision’, and then return whatever is inside the quotation marks of the value attribute. That tells you whether the application was approved, and you can then copy the formula down the page.

The problem is that Google spreadsheets only lets you have 50 importXML functions live at any one time. You can work around this by using 50 functions, copying them and then using “paste values only” over the top. Unfortunately you’ll have to repeat this quite a few times. You can also use importXML again to retrieve the name of the applicant:

=ImportXML(**cell reference for the planning URL**, "//input[@id='applicantname']//@value")
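If the 50-formula limit becomes painful, the same extraction can be done outside the spreadsheet. This is a sketch, not the method used in the article: it pulls the value attributes out of the quoted <input> tags using Python’s standard-library HTML parser. The applicant value here is invented sample data; to run this against the real pages you would fetch each planning URL with urllib.request and feed the response into the parser.

```python
from html.parser import HTMLParser

class InputValueParser(HTMLParser):
    """Collect the value attribute of every <input> tag, keyed by name or id."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attrs = dict(attrs)
            key = attrs.get("name") or attrs.get("id")
            if key:
                self.values[key] = attrs.get("value", "")

# Sample fragment modelled on the council page quoted above;
# the applicant name is invented for illustration.
sample = (
    '<input type="text" readonly name="decision" id="decision" size="40" '
    'class="DetailInput" value="Application Approved">'
    '<input type="text" name="applicantname" id="applicantname" value="BBC">'
)

parser = InputValueParser()
parser.feed(sample)
print(parser.values["decision"])       # Application Approved
print(parser.values["applicantname"])  # BBC
```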
With this in place, you can sort the table by approval status and remove the third of the planning applications that have lapsed or been withdrawn.

The other important piece of information missing from the dataset is a link to the design documents that are submitted with every planning application. The button linking to that page is hidden behind an “Associated Documents” tab on the council’s website, so we can’t scrape the address as we did for the approval status and the applicant’s name. However, there is a way around this. The design documents are held on a different part of the council website, with an address that looks like this:

http://www.idoxwam1.lbhf.gov.uk/WAM/showCaseFile.do?action=show&appType=planning&appNumber=2012/01578/ADV

This takes you to the planning reference for part of the BBC building, where they wanted to erect Olympic signs. If you’re paying attention you’ll notice that the URL ends with the planning reference for the application: 2012/01578/ADV. As all the applications follow this pattern, although we can’t pull the URL to the planning documents from the council’s website, we can recreate it from the reference and the part of the URL that stays the same. To do this we can use Google spreadsheets’ CONCAT function, which joins two pieces of text together. Enter the following in a column to the right of your data:

=CONCAT("http://www.idoxwam1.lbhf.gov.uk/WAM/showCaseFile.do?action=show&appType=planning&appNumber=", **cell reference for the planning reference code**)
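The CONCAT step is just string joining, so the same URL can be rebuilt in a couple of lines of Python. A minimal sketch, using the base URL and example reference quoted above:

```python
# The fixed part of the case-file URL, taken from the article; only the
# "appNumber" query parameter changes between applications.
BASE = ("http://www.idoxwam1.lbhf.gov.uk/WAM/showCaseFile.do"
        "?action=show&appType=planning&appNumber=")

def document_url(reference):
    """Rebuild the link to an application's design documents from its reference."""
    return BASE + reference

print(document_url("2012/01578/ADV"))
```

Applied down a column of references, this reproduces what the spreadsheet formula does for each row.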
Copying this formula down the column will return the planning document URL for every application. The last piece of tailoring defines what our map will look like. Add one more column called “icon style”, or similar, and write “small_red” next to all the denied applications and “small_green” next to all the approved ones. Or you could choose from Google’s other marker icons.
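The icon column can likewise be generated rather than typed by hand. A minimal sketch, assuming the decision text for anything other than an approval differs from “Application Approved” (the article only quotes the approved wording):

```python
def icon_style(decision):
    """Map a decision string to a Fusion Tables marker icon name.

    "Application Approved" is the wording quoted in the article; any other
    decision text is assumed here to mean the application was denied.
    """
    return "small_green" if decision == "Application Approved" else "small_red"

for decision in ["Application Approved", "Application Refused"]:
    print(decision, "->", icon_style(decision))
```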
Mapping the data
Once the data is ready, use Google Drive to create a new fusion table and choose to import the data from your spreadsheet. Fusion tables should automatically detect that you’re using location data and have a basic map ready for you.

We can customise the map by changing the icons. On the map page, click Tools > Change map styles > Marker icon > Column and choose the icon style column from the drop-down menu. This lets Google know you want a different icon for approved and denied applications. Finally, customise the pop-up boxes that appear when you click on each record by going to Tools > Change info window layout. Google will let you check a box to include any of the fields in your table in the pop-up, or you can use the custom settings to edit the HTML.

That’s your map finished. I’ll follow up with how to add the councillors onto the map too, and how to use Companies House, Duedil.com, the Land Registry and Google’s advanced search terms to find out what they’re up to.