Archive

Archive for the ‘Uncategorized’ Category

Parsing USGS BIL Digital Elevation Models in Python

A false colour elevation map of the Hawaiian island of Kahoolawe

The USGS have a digital elevation model covering most of the US, Hawaii and Puerto Rico. If you use the National Map Seamless Server it’s possible to select a region and download a digital elevation model in a number of formats. One of these, BIL, is a simple binary raster file, which can be read quite easily in Python – in this, I’ll base this on code from a previous post.

Getting the data

  • Go to National Map Seamless Server
  • The map is fairly intuitive, but if you get stuck check out their tutorial
  • When you use the download rectangle tool, make sure to click on “Modify data request” and choose the BIL data format (BIL_INT16) rather than the default “Arc_GIS”
  • Choose zip or TGZ
  • You’ll then download a compressed archive – may take a while
  • The archive contains lots of files; look at the contents of output_parameters.txt. This tells you the dimensions of the file. Make a note of the sizes.
  • Extract the .bil file – this is what has the raw data.

Extracting the binary elevation data from the BIL file

# USGS Seamless Server BIL Format Parser
#

import struct

# get these from the output_parameters.txt file
# included in the download
height=437
width=663

# where you put the extracted BIL file
fi=open(r"my-extracted.bil","rb")
contents=fi.read()
fi.close()

# unpack binary data into a flat tuple z
s="<%dH" % (int(width*height),)
z=struct.unpack(s, contents)

After running this, z is now a tuple of width*height elements, with data stored in raster order (entries in row 1 from left-to-right, then row 2 from left-to-right, and so on)

The struct module for reading binary data

struct is a fantastic Python library, it really makes reading binary files pretty easy. Say we have a raster-layout binary file of 50 rows by 100 values per row. We’d expect 5000 values. Each value is an unsigned short (H) and little-endian (<)

In this case, we build up a format specification like this
“<5000H”

Plotting the end result

And optionally, if you have matplotlib and numpy libraries, you can get a quick preview of the resulting array.

from pylab import *
import numpy as np
import matplotlib.cm as cm

heights = np.zeros((height,width))
for r in range(0,height):
    for c in range(0,width):
        elevation=z[((width)*r)+c]
        if (elevation==65535 or elevation<0 or elevation>20000):
            # may not be needed depending on format, and the "magic number"
            # value used for 'void' or missing data
            elevation=0.0
        heights[r][c]=float(elevation)
imshow(heights, interpolation='bilinear',cmap=cm.prism,alpha=1.0)
grid(False)
show()

Using R and ggplot2 to create a timeline visualization of UK government politics

creating an abstract timeline chart with R and ggplot

This was an attempt to visualise the ebb and flow of various political parties in the UK, over time.

The data was made available on the Guardian DataBlog, so I thought I’d have a go at producing a timeline image, using vertical strips of colour to show which party was in power in each year.

I’m not an expert in R, so there may well be a better way of doing this!

The steps taken were:-

  • download the OpenOffice spreadsheet from the Google Spreadsheet linked to on the Guardian DataBlog.
  • remove footer rows (copyright notices at the foot of the spreadsheet).
  • tidy up headers in row 1 – simplify these to single words like “Party”
  • save as a CSV, using the pipe (|) character as a delimiter. I put this in c:\infoviz\data.csv

First of all, we need to load the CSV file into a data frame

inp <- read.table("c:\\infoviz\\data.csv",header=T, sep="|")

This works fine, but the column ‘Party’ contains names like “Conservative”, “Labour” and “Liberal”.

How to convert this into something a bit more abstract, like a colour? The answer was to define a palette of colours…

palette <- c("blue", "red", "orange1", "rosybrown", "orange3", "red4", "blue4", "grey") 

.. and use a function to change a party name to a colour mentioned in that array…

getcol <- function(party) { 
	if ( party=="Conservative" | party=="Tory" )
         r<-1
     	else if ( party=="Labour" ) 
         r<-2
     	else if ( party=="Liberal" ) 
         r<-3
     	else if ( party=="Whig" ) 
         r<-4
	else if (party=="National Labour National Government") 
	   r<-6
	else if (party=="Conservative National Government") 
	   r<-7
     	else 
         r<-8
	return(palette[r])
}

Now, we can use the sapply function to map the values in the “party” column of the data frame into a colour..

inp$PartyCol <- sapply(inp$Party,getcol)

Now, the data frame has a column called ‘PartyCol’ which maps “Conservative” to “blue”, “Labour” to “red” and so on.

So now, ggplot…

qplot(factor(Year), title="history", data=inp, geom="bar", fill=factor(PartyCol), color=factor(PartyCol)) +scale_fill_identity(name=inp$PartyCol) +scale_colour_identity(name=inp$YearCol)

This shows a whole lot of labels and grids that I didn’t want…,

…so I added these options to give a clear, blank theme…

+ opts(panel.background = theme_rect(size = 1, colour = "lightgray"),panel.grid.major = theme_blank(),panel.grid.minor = theme_blank(),axis.line = theme_blank(),axis.text.x = theme_blank(),axis.text.y = theme_blank(),axis.title.x = theme_blank(),axis.title.y = theme_blank(), axis.ticks = theme_blank(),strip.background = theme_blank(),strip.text.y = theme_blank())

Here’s the whole source, if you’re brave enough to want to try this yourself!

library(ggplot2)
library(rgb)
palette <- c("blue", "red", "orange1", "rosybrown", "orange3", "red4", "blue4", "grey") 
getcol <- function(party) { 
	if ( party=="Conservative" | party=="Tory" )
         r<-1
     	else if ( party=="Labour" ) 
         r<-2
     	else if ( party=="Liberal" ) 
         r<-3
     	else if ( party=="Whig" ) 
         r<-4
	else if (party=="National Labour National Government") 
	   r<-6
	else if (party=="Conservative National Government") 
	   r<-7
     	else 
         r<-8
	return(palette[r])
}
inp <- read.table("c:\\infoviz\\data.csv",header=T, sep="|")
inp$PartyCol <- sapply(inp$Party,getcol)
qplot(factor(Year), title="history", data=inp, geom="bar", fill=factor(PartyCol), color=factor(PartyCol)) +scale_fill_identity(name=inp$PartyCol) +scale_colour_identity(name=inp$YearCol) + opts(panel.background = theme_rect(size = 1, colour = "lightgray"),panel.grid.major = theme_blank(),panel.grid.minor = theme_blank(),axis.line = theme_blank(),axis.text.x = theme_blank(),axis.text.y = theme_blank(),axis.title.x = theme_blank(),axis.title.y = theme_blank(), axis.ticks = theme_blank(),strip.background = theme_blank(),strip.text.y = theme_blank())

Plotting points on an OpenStreetMap Export

February 24, 2010 Leave a comment

This map shows the location of pictures in a Flickr group superimposed on an OpenStreetMap (OSM) export.

If you try plotting points directly on an OSM map, you’ll find that points are all over the shop. The reason is that OSM exports use the Mercator projection; you need to change the latitude and longitude coordinates into the Mercator projection.

Basemap to the rescue!

If you use the code in the previous post, you can change the plotting code.

from basemap import Basemap 
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import numpy as np
import string
import matplotlib.cm as cm

x=[] #longitudes
y=[] #latitudes

fi=open(r'C:\infoviz\scotland_photos.csv','r')

linenum=0
for line in fi:
    if linenum>0:
        line=string.replace(line, "\n","")
        try:
            fields=string.split(line,",")
            lon,lat=fields[0:2]
            x.append(float(lon))
            y.append(float(lat))
        except:
            print "Error!"
    linenum+=1
fi.close()

m = Basemap(llcrnrlon=-8.0,llcrnrlat=54.5,urcrnrlon=1.5,urcrnrlat=59.5,lat_ts=20,
            resolution='h',projection='merc',lon_0=-4.36,lat_0=54.5)
x1,y1=m(x,y)
m.drawmapboundary(fill_color='white') # fill to edge
m.scatter(x1,y1,s=5,c='r',marker="o",cmap=cm.jet,alpha=1.0)

Rather than using basemap to draw the outline of Scotland, this script simply creates a scatter plot on a white background, like so:-

Now, you need to export the map from OpenStreetMap.

The corners have been set as follows..

… llcrnrlon=-8.0,llcrnrlat=54.5,urcrnrlon=1.5,urcrnrlat=59.5 …

llcrn stands for the lower-left coordinate, and and urcrn for the upper-right coordinate.

So if you if you export from OpenStreetMap using these coordinates…

.. you’ll have a map of Scotland with the Mercator projection.

The two images can then be composited in Photoshop or another graphics app like the Gimp.

unemployment statistics in the UK

October 14, 2009 2 comments

Visualizing recent trends in benefit claimant counts in the UK.

Unemployment data from the Guardian Data Blog.

Constituency coordinates courtesy of the TheyWorkForYou API.

The three heatmaps show, respectively, from left to right:-

(1) the %age change in those claiming benefits (hotspot in the Thames Valley)

(2) the %age of the workforce out of work and claiming benefits (hotspots in the Midlands, Hull, London, Liverpool, Glasgow)

(3) the gender ratio of claimant percentages. Red=higher ratio of male to female claimants, blue=lower ratio of male to female claimants

visualising the nationality of Nobel Peace Prize Winners

October 9, 2009 Leave a comment

Visualizing the nationality of Nobel Peace Prize winners over time

hello world

September 5, 2009 Leave a comment

hey, we all have to start somewhere!

print "Hello world!"
Categories: Uncategorized