Parsing USGS BIL Digital Elevation Models in Python
The USGS have a digital elevation model covering most of the US, Hawaii and Puerto Rico. If you use the National Map Seamless Server it’s possible to select a region and download a digital elevation model in a number of formats. One of these, BIL, is a simple binary raster file, which can be read quite easily in Python – in this, I’ll base this on code from a previous post.
Getting the data
- Go to National Map Seamless Server
- The map is fairly intuitive, but if you get stuck check out their tutorial
- When you use the download rectangle tool, make sure to click on “Modify data request” and choose the BIL data format (BIL_INT16) rather than the default “Arc_GIS”
- Choose zip or TGZ
- You’ll then download a compressed archive – may take a while
- The archive contains lots of files; look at the contents of output_parameters.txt. This tells you the dimensions of the file. Make a note of the sizes.
- Extract the .bil file – this is what has the raw data.
Extracting the binary elevation data from the BIL file
# USGS Seamless Server BIL Format Parser # import struct # get these from the output_parameters.txt file # included in the download height=437 width=663 # where you put the extracted BIL file fi=open(r"my-extracted.bil","rb") contents=fi.read() fi.close() # unpack binary data into a flat tuple z s="<%dH" % (int(width*height),) z=struct.unpack(s, contents)
After running this, z is now a tuple of width*height elements, with data stored in raster order (entries in row 1 from left-to-right, then row 2 from left-to-right, and so on)
The struct module for reading binary data
struct is a fantastic Python library, it really makes reading binary files pretty easy. Say we have a raster-layout binary file of 50 rows by 100 values per row. We’d expect 5000 values. Each value is an unsigned short (H) and little-endian (<)
In this case, we build up a format specification like this
“<5000H”
Plotting the end result
And optionally, if you have matplotlib and numpy libraries, you can get a quick preview of the resulting array.
from pylab import * import numpy as np import matplotlib.cm as cm heights = np.zeros((height,width)) for r in range(0,height): for c in range(0,width): elevation=z[((width)*r)+c] if (elevation==65535 or elevation<0 or elevation>20000): # may not be needed depending on format, and the "magic number" # value used for 'void' or missing data elevation=0.0 heights[r][c]=float(elevation) imshow(heights, interpolation='bilinear',cmap=cm.prism,alpha=1.0) grid(False) show()