Mapping 24 hours of Flickr geotagging in Python

The aim of this project was to find out where in the world people were geotagging their photos on Flickr, using the Flickr API.

[Image: world]

The approach taken was to poll one of the Flickr ‘pandas’, ‘Wang Wang’. This is a service that keeps track of geotagged photos as they come in.

The following Python script runs in the background, polls the service once a minute, and appends the locations of newly geotagged photos to a CSV file. It asks for at most 100 photos from the previous minute, although in practice up to 120 were returned in a single minute. The average was around 80 per minute when I last ran this.

The Flickr API is accessed using Beej’s Python flickrapi library.

# -*- coding: utf-8 -*-
'''
Created on 28 Apr 2009
Ask WangWang for recently geotagged photos
@author: Steven
'''

import time

import flickrapi

api_key = 'YOUR_API_KEY_HERE'
flickr = flickrapi.FlickrAPI(api_key)

if __name__ == '__main__':
    ct = 0      # total photos logged so far
    lastct = 0  # total as of the previous poll
    print("Timestamp  Total   This")
    try:
        while True:
            # Ask for photos geotagged since one minute ago
            tstamp = int(time.time()) - 60
            wangwang = flickr.panda_getPhotos(panda_name='wang wang', interval=60000, per_page=100, last_update=tstamp, extras='geo')
            # Append one row per photo: timestamp, longitude, latitude, photo id
            with open("c:\\wangwang24hours.csv", "a") as fo:
                for x in wangwang.find('photos'):
                    fo.write("%d,%s,%s,%s\n" % (tstamp, x.get('longitude'), x.get('latitude'), x.get('id')))
                    ct += 1
            print("%10s %07d %04d" % (tstamp, ct, ct - lastct))
            lastct = ct
            time.sleep(60)
    except KeyboardInterrupt:
        # Ctrl-C ends the otherwise endless polling loop
        print('done')
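
A rough way to check the per-minute figures quoted above is to count the CSV rows that share a polling timestamp. The sketch below is only an illustration: the function names photos_per_poll and average_per_poll are made up for this example, and the sample rows are invented rather than real Flickr data. It assumes the four-column layout written by the script (timestamp, longitude, latitude, photo id).

```python
from collections import Counter


def photos_per_poll(lines):
    """Count CSV rows per polling timestamp (the first column)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tstamp = line.split(",", 1)[0]
        counts[int(tstamp)] += 1
    return counts


def average_per_poll(counts):
    """Mean number of rows per timestamp that appears in the file."""
    if not counts:
        return 0.0
    return sum(counts.values()) / float(len(counts))


# Invented sample rows in the same layout as the logging script's output
sample = [
    "1240900000,-0.1276,51.5072,111\n",
    "1240900000,None,None,112\n",
    "1240900060,139.6917,35.6895,113\n",
]
counts = photos_per_poll(sample)
print(sorted(counts.items()))
print(average_per_poll(counts))
```

To run it on the real file, pass the open file object instead of the sample list. Polls that returned no photos leave no rows behind, so the average is taken only over minutes that logged at least one photo.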

Once we have the data, it’s time to visualise it. A heatmap seemed a good choice; the chart uses Matplotlib’s ‘hexbin’ plot. This takes two arrays of the same size (here, the longitudes are in X and the latitudes in Y) and maps the points onto a hexagonal grid, counting the number of photos which fall into each hexagonal bin. Here gridsize=180, which sets the number of hexagons across the x direction; Matplotlib chooses the number in the y direction so that the hexagons come out roughly regular.

Each bin is coloured according to the number of points that fall into it, on a logarithmic scale (bins='log') using the ‘jet’ colour map: red bins have the most, green have fewer, and blue have the fewest.
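
To make the idea of counting per bin and then log-scaling the counts concrete, here is a minimal sketch that uses a plain rectangular longitude/latitude grid instead of hexagons. It is not how Matplotlib implements hexbin; the function names grid_counts and log_scaled, the 2-degree cell size, and the sample coordinates are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def grid_counts(lons, lats, cell_deg=2.0):
    """Count points per rectangular cell of cell_deg x cell_deg degrees."""
    counts = Counter()
    for lon, lat in zip(lons, lats):
        # Shift so that columns and rows start at 0 in the south-west corner
        col = int(math.floor((lon + 180.0) / cell_deg))
        row = int(math.floor((lat + 90.0) / cell_deg))
        counts[(col, row)] += 1
    return counts


def log_scaled(counts):
    """Map each occupied cell to log10 of its count (a count of 1 gives 0.0)."""
    return dict((cell, math.log10(n)) for cell, n in counts.items())


# Three invented points near London and one near Tokyo
lons = [-0.1, -0.2, -0.3, 139.7]
lats = [51.5, 51.6, 51.4, 35.7]
cells = grid_counts(lons, lats)
scaled = log_scaled(cells)
print(sorted(cells.items()))
```

The log transform compresses the range, so a handful of very busy cells do not wash out everything else on the colour scale, which is presumably why the plot uses bins='log'.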

The following script takes the output from the previous script, and plots it.

import matplotlib.cm as cm
import matplotlib.pyplot as plt

X = []  # longitudes
Y = []  # latitudes
fi = open(r"c:\wangwang24hours.csv")
for line in fi:
    # Columns: timestamp, longitude, latitude, photo id
    ignore, x, y, ignore2 = line.strip().split(",")
    # Photos with no location were written as the string 'None'
    if x != 'None' and y != 'None':
        X.append(float(x))
        Y.append(float(y))
fi.close()
plt.hexbin(X, Y, gridsize=180, bins='log', cmap=cm.jet, linewidths=0, edgecolors=None)
plt.show()