Creating a pinboard map of geotagged photos in a flickr pool

January 14, 2010 stevendkay

In this post I’ll show how to produce a simple pinboard map of geotagged photos in a flickr group pool, using Python and Basemap/Matplotlib. You’ll need Python, the flickrapi module, Matplotlib with the Basemap toolkit, and a flickr API key.

There are two short scripts here:-

  • A script to find the longitude and latitude of geotagged photos in the group pool
  • A script to generate the plot

The first script produces a CSV file; the second uses this CSV file to produce the plot.

Here’s the script to produce the CSV file with photo locations:-

# -*- coding: UTF8 -*-
'''
Created on 12 May 2009
Based on beejs flickr API
Produce a list of photo locations for a given
group's pool on flickr
@author: Steven Kay
'''

import flickrapi
import time

# Enter your API key below
# You can apply for an API key at
# http://www.flickr.com/services/apps/create/apply
api_key = '' 

# paste group NSID below
group = '1124494@N22'

# fetch the pool's photos page by page and write
# the coordinates of the geotagged ones to a CSV file

if __name__ == '__main__':
    flickr = flickrapi.FlickrAPI(api_key)
    response_photos = flickr.groups_pools_getPhotos(group_id=group,per_page=500,extras='geo')
    root=response_photos.findall('.//photos')
    pages=int(root[0].get('pages'))
    if pages>8:
        # stop after 8 pages of 500 images
        # not sure if groups.pools.getPhotos has the same
        # 4000 image limit as photos.search..?
        pages=8

    fo=open(r"C:\infoviz\scotland_photos.csv","w")
    print "Longitude,Latitude"
    fo.write("Longitude,Latitude\n")
    for page in range(1,pages+1):
        # flickr page numbers start at 1, so count from 1 to pages inclusive
        response_photos = flickr.groups_pools_getPhotos(group_id=group,per_page=500,page=str(page),extras='geo')
        for photo in response_photos.findall(".//photos/photo"):
            try:
                lat=photo.get('latitude')
                lon=photo.get('longitude')
                st="%s,%s" %(lon,lat)
                if not st=="0,0":
                    # ignore the odd buggy 0,0 coords
                    print "%s,%s" %(lon,lat)
                    fo.write("%s,%s\n" %(lon,lat))
            except:
                pass
        time.sleep(1)
    fo.close()

You’ll need the NSID of the group as an input; you can look it up with the flickr API call flickr.groups.search.

Now, you have a simple CSV file with the latitude and longitude of each geotagged image in the pool.

Longitude,Latitude
-5.167792,58.352519
-4.024359,57.675544
-4.230251,57.497356
-4.2348,57.501045
-4.84703,56.646034
-4.306168,55.873986
-3.586263,56.564732
...
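
If you want to try the filtering step without an API key, here is a minimal sketch that runs the same logic over a hand-written XML response. The sample XML and the function name geotagged_rows are my own stand-ins for illustration (not real pool data), and the sketch is written for current Python 3 rather than the 2.x used in the script above.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a groups.pools.getPhotos response with extras='geo'
SAMPLE_RESPONSE = """
<rsp stat="ok">
  <photos page="1" pages="1" perpage="500" total="3">
    <photo id="1" latitude="58.352519" longitude="-5.167792" />
    <photo id="2" latitude="0" longitude="0" />
    <photo id="3" latitude="57.675544" longitude="-4.024359" />
  </photos>
</rsp>
"""

def geotagged_rows(root):
    """Return 'lon,lat' strings for photos that carry usable coordinates."""
    rows = []
    for photo in root.findall(".//photos/photo"):
        lat = photo.get("latitude")
        lon = photo.get("longitude")
        if lat is None or lon is None:
            continue  # attribute missing altogether
        if float(lat) == 0.0 and float(lon) == 0.0:
            continue  # 0,0 is treated as "no real location", as in the script
        rows.append("%s,%s" % (lon, lat))
    return rows

rows = geotagged_rows(ET.fromstring(SAMPLE_RESPONSE))
print("\n".join(["Longitude,Latitude"] + rows))
```

Comparing the values as numbers rather than as the string "0,0" also catches variants such as "0.0,0.0", should the API ever return them.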

This demo uses the Photography Guide to Scotland pool.

The next step is to plot the map.

'''
Simple Matplotlib/Basemap pinboard map for
Flickr Groups.

Need to provide a CSV file in following format

Longitude,Latitude
20.1,-3.25
20.225,-3.125
.. etc..

Created on 10 Oct 2009

@author: Steven Kay
'''

# Basemap installs under the mpl_toolkits namespace
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import matplotlib.cm as cm

x=[] #longitudes
y=[] #latitudes

fi=open(r'C:\infoviz\scotland_photos.csv','r')

linenum=0
for line in fi:
    # line 0 is the header row, so skip it
    if linenum>0:
        try:
            lon,lat=line.strip().split(",")[0:2]
            x.append(float(lon))
            y.append(float(lat))
        except ValueError:
            # ignore blank or malformed lines
            pass
    linenum+=1
fi.close()

# cass projection centred on scotland
# will need to replace with a projection more suited
# to the group you're plotting

m = Basemap(llcrnrlon=-8.0,llcrnrlat=54.5,urcrnrlon=1.5,urcrnrlat=59.5,
            resolution='h',projection='cass',lon_0=-4.36,lat_0=54.5)
x1,y1=m(x,y)
m.drawmapboundary(fill_color='cyan') # fill to edge
m.drawcountries()
m.drawrivers() # you may want to turn this off for larger areas like continents
m.fillcontinents(color='white',lake_color='cyan',zorder=0)
m.scatter(x1,y1,s=5,c='r',marker="o",cmap=cm.jet,alpha=1.0)

plt.title("Photography Guide to Scotland in FlickR") # might want to change this!
plt.show()

This script uses a projection centred on Scotland; you’ll need to change the following line…

m = Basemap(llcrnrlon=-8.0,llcrnrlat=54.5,urcrnrlon=1.5,urcrnrlat=59.5,
            resolution='h',projection='cass',lon_0=-4.36,lat_0=54.5)

…to something more suitable for your group. Basemap provides an intimidating list of projections, so there should be one that fits.
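
One way to get starting values for the corner arguments is to derive them from the points themselves. This is only a sketch: the helper name basemap_corners and the half-degree margin are my own choices, and it centres lon_0/lat_0 on the middle of the box, which is not what the Scotland example above does. It also ignores pools that straddle the 180° meridian.

```python
def basemap_corners(lons, lats, margin=0.5):
    """Return Basemap corner keywords enclosing all points, padded by margin degrees."""
    return {
        "llcrnrlon": min(lons) - margin,
        "llcrnrlat": max(min(lats) - margin, -90.0),  # clamp to valid latitudes
        "urcrnrlon": max(lons) + margin,
        "urcrnrlat": min(max(lats) + margin, 90.0),
        "lon_0": (min(lons) + max(lons)) / 2.0,
        "lat_0": (min(lats) + max(lats)) / 2.0,
    }

corners = basemap_corners([-5.0, -3.0], [56.0, 58.0], margin=0.5)
print(corners)
```

The dictionary can then be unpacked into the constructor, for example Basemap(resolution='h', projection='cass', **corners), and tweaked by hand from there.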

Processing Sketches

January 11, 2010 stevendkay

I’ve been experimenting with the Processing Language recently. It’s quite addictive :)

Here are a few tasters… click through to see the animation (needs Java) and the source.

ripples processing sketch

You can see the rest of my Processing portfolio on Openprocessing.org

Categories: Processing

world endangered species map

October 25, 2009 stevendkay

Visualizing the number of endangered species worldwide.

The outer circle represents the total number of species; the green inner circle is the proportion that are plant species.

Sources of data: Guardian Data Blog. Country locations from the CIA World Factbook. Plotted using Python/Matplotlib with the Basemap extension.

Categories: Basemap, Infovis, Matplotlib

unemployment statistics in the UK

October 14, 2009 stevendkay

Visualizing recent trends in benefit claimant counts in the UK.

Unemployment data from the Guardian Data Blog.

Constituency coordinates courtesy of the TheyWorkForYou API.

The three heatmaps show, respectively, from left to right:-

(1) the percentage change in those claiming benefits (hotspot in the Thames Valley)

(2) the percentage of the workforce out of work and claiming benefits (hotspots in the Midlands, Hull, London, Liverpool, Glasgow)

(3) the gender ratio of claimant percentages. Red=higher ratio of male to female claimants, blue=lower ratio of male to female claimants

homicide rates

October 13, 2009 stevendkay



homicide rates

Originally uploaded by stevefaeembra

A visualisation of the homicide rates across the world.

Scatter plots with Basemap and Matplotlib

October 12, 2009 stevendkay

A while back I used the Flickr API to map 24 hours’ worth of geotagged photos.

My previous attempts needed some manual Photoshop work to superimpose the plots on a map. The next logical step is to do the whole process – from start to finish – in code, and remove the manual steps.

To do this, I tried the awesome Basemap toolkit. This library allows all sorts of cartographic projections…

Installing Basemap

Basemap is an extension available with Matplotlib. You can download it here (under matplotlib-toolkits)

I installed the version for Python 2.5 on Windows; this was missing a dependency on httplib2, which I needed to install separately from here.

Getting started

Let’s assume you have 3 arrays – x, y and z. These contain the longitudes, latitudes, and data values at each point. In this case, I binned the geotagged photos into a grid of degree points (360×180), so that each degree square contained the number of photos tagged in that degree square.

Setting up

# Basemap installs under the mpl_toolkits namespace
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
import string
import matplotlib.cm as cm

x=[]
y=[]
z=[]

Now, you need to populate the x, y and z arrays with values. I’ll leave that as an exercise for you :) All three arrays need to be the same length.
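
If you want a head start on that exercise, here is a minimal sketch of the binning described earlier: it counts points per one-degree square and returns three lists of equal length. The function name bin_by_degree is made up for illustration, and placing each value at the centre of its square is my assumption; the original plot may have used the corner instead.

```python
import math
from collections import Counter

def bin_by_degree(lons, lats):
    """Count points per one-degree square; return x, y, z lists of equal length."""
    counts = Counter(
        (math.floor(lon), math.floor(lat)) for lon, lat in zip(lons, lats)
    )
    x, y, z = [], [], []
    for (lon_deg, lat_deg), n in sorted(counts.items()):
        x.append(lon_deg + 0.5)  # centre of the degree square (an assumption)
        y.append(lat_deg + 0.5)
        z.append(n)              # number of photos in that square
    return x, y, z

x, y, z = bin_by_degree([-4.2, -4.8, -3.5, 10.1], [57.5, 57.9, 56.5, 59.9])
print(x, y, z)
```

Only squares that contain at least one photo appear in the output, which keeps the scatter plot from drawing thousands of empty markers.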

Now, you need to decide which projection to use. Here, I’ve used the Orthographic projection.

m = Basemap(projection='ortho',lon_0=-50,lat_0=60,resolution='l')

Here is the secret sauce I took a while to work out. That’ll teach me not to R the FM. This line transforms all the lon, lat coordinates into the x, y coordinates of the chosen projection. Note the order: longitudes first, then latitudes.

x1,y1=m(x,y)

Next, decide which features you want to plot: land masses, country boundaries etc.

m.drawmapboundary(fill_color='black') # fill to edge
m.drawcountries()
m.fillcontinents(color='white',lake_color='black',zorder=0)

Finally, the scatter plot.

m.scatter(x1,y1,s=sizes,c=cols,marker="o",cmap=cm.cool,alpha=0.7)
plt.title("Flickr Geotagging Counts with Basemap")
plt.show()
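
The scatter call uses two lists, sizes and cols, that the snippets above never define. Here is one possible way to derive them from the z values; the helper name marker_styles and the size range of 5 to 100 are my own assumptions, not what the original plot used. Matplotlib reads s as the marker area in points squared and maps the numbers in c through the colour map.

```python
def marker_styles(z, min_size=5.0, max_size=100.0):
    """Map counts to marker areas (sizes) and 0..1 colour values (cols)."""
    lo, hi = min(z), max(z)
    span = float(hi - lo) or 1.0  # avoid dividing by zero when all counts match
    cols = [(v - lo) / span for v in z]
    sizes = [min_size + c * (max_size - min_size) for c in cols]
    return sizes, cols

sizes, cols = marker_styles([1, 3, 5])
print(sizes, cols)
```

With heavily skewed counts (a handful of very busy squares) a linear scale hides most of the detail, so taking a logarithm of z before scaling may give a more readable map.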

visualising the nationality of Nobel Peace Prize Winners

October 9, 2009 stevendkay

Visualizing the nationality of Nobel Peace Prize winners over time

Image Steganography with PIL

October 7, 2009 stevendkay

Steganography comes from the Greek for ‘hidden writing’; it is the act of hiding a message inside another message.

In this case, we hide an image inside another image, without it being obvious to the viewer. The example I’ll give here is only a ‘toy’ implementation, for two reasons:-

  • easily cracked: it wouldn’t take the authorities long to spot the hidden message, not least because the algorithm is described on Wikipedia ;-)
  • fragile: the hidden image-within-an-image can easily be broken, if the image has its colours changed afterwards.

But it does illustrate how to do bitwise-manipulation of images in PIL using the ImageMath module, which is the purpose of the post.

How it works

The watermark – the image we wish to hide – is a bitonal image, with black and white pixels only. It’s then resized to be the same size as the original image.

We ‘smuggle’ the watermark inside the original by replacing the LSB (least significant bit) of each colour channel (R, G and B) in the original with the corresponding pixel in the watermark: either 1 for white, or 0 for black.

This image shows the binary arithmetic…least significant bit on the right.

watermarking
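
The same arithmetic can be tried on single pixel values in plain Python before involving any images. This is just an illustration of the two bit masks; the function names embed_bit and extract_bit are made up for the example.

```python
def embed_bit(channel_value, watermark_value):
    """Replace the least significant bit of an 8-bit channel value."""
    return (channel_value & 0xFE) | (watermark_value & 0x1)

def extract_bit(channel_value):
    """Recover the hidden pixel as 0 (black) or 255 (white)."""
    return (channel_value & 0x1) * 255

hidden_white = embed_bit(200, 255)  # 0b11001000 -> 0b11001001
hidden_black = embed_bit(201, 0)    # 0b11001001 -> 0b11001000
print(hidden_white, hidden_black)
print(extract_bit(hidden_white), extract_bit(hidden_black))

# Fragility: nudging the value by one flips the hidden bit
print(extract_bit(hidden_white + 1))
```

Each channel value changes by at most 1 out of 255, which is why the result looks identical to the eye, and also why any later change to the colours wipes out the hidden image.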

Hiding the watermark image inside our image

For this, we’ll need these imports…

from PIL import Image, ImageMath

and open the two files. The watermark is scaled to match the size of the original image.

# convert both to RGB so that split() always yields three channels
watermark=Image.open(r"c:\watermark.png").convert("RGB")
original=Image.open(r"c:\original.jpg").convert("RGB")
watermark=watermark.resize(original.size)

ImageMath only works with single channel (greyscale) images, so we need to split the two images into their three channels (Red, Green and Blue) using the split() method.

red, green, blue = original.split()
wred, wgreen, wblue = watermark.split()

Now for ImageMath, which lets you write simple expressions using values from one or more images. Here, ‘a’ and ‘b’ are bound to the values in the original and watermark images, respectively. The convert() call casts the result back to a greyscale image (mode ‘L’), which we need later when merging the channels.

red2 = ImageMath.eval("convert(a&0xFE|b&0x1,'L')", a=red, b=wred)
green2 = ImageMath.eval("convert(a&0xFE|b&0x1,'L')", a=green, b=wgreen)
blue2 = ImageMath.eval("convert(a&0xFE|b&0x1,'L')", a=blue, b=wblue)

Okay, so now we have three channels whose LSBs have been replaced with the LSB of the watermark.

But we need to combine the 3 channels back to get an RGB image ready for saving.

out = Image.merge("RGB", (red2, green2, blue2))
out.save(r"c:\merged.png")

Open the original and the processed images; can you see any difference?

Extracting the hidden image

All this is for nought if you can’t extract the hidden image afterwards.

This is simpler, as we only need to produce a black/white image from the LSB of the image. Here, I’ve only bothered with the Red channel.

stegged=Image.open(r"c:\merged.png")
red, green, blue = stegged.split()
watermark=ImageMath.eval("(a&0x1)*255",a=red) # convert to 0 or 255
watermark=watermark.convert("L")
watermark.save(r"c:\extracted-watermark.png")

map of the flags of the world

September 26, 2009 stevendkay

A map of the flags of the world.

Using Flickr API to get the views, faves and comments of your most popular images

September 25, 2009 stevendkay

One of the first things I wanted to do with the Flickr API was to get some stats on my most popular images.

You can get this info through the web front end, but there’s no option to download the stats in delimited format (such as CSV) so it can be analysed in a spreadsheet.

I wanted to work out if there was a pattern emerging in the key stats for my 200 most popular images…

  1. Number of Views
  2. Number of Favourites
  3. Number of Groups posted to
  4. Number of sets an image is in

Using a Python script (v2.5) and Beej’s FlickR API, this is fairly straightforward. It doesn’t require authentication.

The script runs slowly as it ‘plays nice’, leaving a one-second pause between calls, courtesy of the time.sleep() function. I don’t want to thrash the server.

# -*- coding: UTF8 -*-

import flickrapi
import datetime
import time
import string

# enter your api key below
api_key = 'PUT_YOUR_API_KEY_HERE' 

# enter the user id below (you can use flickr.people.findByUsername to get this for any user)
# it'll look something like 99999999@N99
userid='USER_ID_TO_SEARCH'

# delimiter. Use comma if you want, I tend to use ~
DELIMITER="~"

# dump number of views in delimited format

if __name__ == '__main__':
    #output format : "photoid,title,views,comments,faves,groups,sets"
    flickr = flickrapi.FlickrAPI(api_key)
    photos = flickr.photos_search(user_id=userid,extras='views', per_page='200', page='1', sort='interestingness-desc')
    for photo in photos.find('photos'):

        title = photo.get('title').replace(",","") #in case you want to use comma as a delimiter ;0)

        # number of views
        id = photo.get('id')
        views=photo.get('views')

        # fave count (up to 50)
        faves = flickr.photos_getFavorites(photo_id=id,per_page=50)
        countfaves=faves.find('photo').get('total')
        time.sleep(1)

        # pools and sets posted to
        contexts=flickr.photos_getAllContexts(photo_id=id)
        posted_groups=len(contexts.findall('.//pool'))
        posted_sets=len(contexts.findall('.//set'))
        time.sleep(1)

        # comments
        comments=flickr.photos_comments_getList(photo_id=id)
        countcomments=len(comments.findall('.//comment'))

        # output as delimited text
        tokens=(id,title,str(views),str(countcomments),str(countfaves),str(posted_groups),str(posted_sets))
        converted=DELIMITER.join(tokens)

        print converted

This script dumps to the console, rather than a file; but it’s easily modified to write to a file. It should work with comma as a separator (for CSV use) as the title tag is stripped of commas…
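
Here is a rough sketch of that modification using the standard csv module, written for current Python 3. The helper name write_rows and the header labels are my own, and the example writes to an in-memory buffer where a real run would pass an open file. A side benefit of csv.writer is that it quotes any field containing the delimiter, so titles would not need their commas stripped.

```python
import csv
import io

DELIMITER = "~"
HEADER = ["photo id", "title", "views", "comments", "favourites", "groups", "sets"]

def write_rows(fileobj, rows, delimiter=DELIMITER):
    """Write a header line followed by one delimited line per photo."""
    writer = csv.writer(fileobj, delimiter=delimiter, lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow(row)

buf = io.StringIO()  # stand-in for open("stats.txt", "w", newline="")
write_rows(buf, [("2717978614", "st marys", "88", "9", "7", "3", "2")])
print(buf.getvalue())

# With a comma delimiter, a title containing a comma is quoted automatically
csv_buf = io.StringIO()
write_rows(csv_buf, [("1", "leith, edinburgh", "10", "0", "1", "2", "1")], delimiter=",")
print(csv_buf.getvalue())
```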

Here’s some sample output from my photostream…

The format is :
photo id~title~views~comments~favourites~groups~sets


2113237108~north-berwick-old-pier~259~25~21~6~3
3598511429~paris photo heatmap~736~6~12~5~3
3609118442~heart texture~861~10~26~0~2
2304836447~persistence de-motivator~4964~1~4~1~1
2347673075~bergen-ole-bull-plass-lensbaby~184~12~7~6~4
1621047086~banners-down-princes-street~325~11~8~4~2
3688253826~St Anthony's Chapel Edinburgh~124~15~15~9~1
2717978614~st marys~88~9~7~3~2
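
If you would rather stay in Python than open a spreadsheet, a line of this output can be split back into named fields. This is a small sketch that assumes the seven-field order the script prints (id, title, views, comments, favourites, groups, sets); the names FIELDS and parse_line are made up for the example.

```python
FIELDS = ("photo_id", "title", "views", "comments", "favourites", "groups", "sets")

def parse_line(line, delimiter="~"):
    """Turn one line of script output into a dict with integer counts."""
    values = line.rstrip("\n").split(delimiter)
    record = dict(zip(FIELDS, values))
    for key in FIELDS[2:]:
        record[key] = int(record[key])  # everything after the title is a count
    return record

record = parse_line("2717978614~st marys~88~9~7~3~2")
print(record["title"], record["views"], record["favourites"])

# Example of a derived figure: favourites per 100 views
faves_per_100_views = 100.0 * record["favourites"] / record["views"]
print(round(faves_per_100_views, 1))
```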

Once you have the output saved to a text file, you can import it into a spreadsheet (like OpenOffice or Excel) and play around with the figures :-)