Monday, September 26, 2011

Converting a list of IP addresses to countries

If you have an Excel file with a column filled with IP addresses that need to be converted to countries, here's a way to do it. These instructions were tested on a Mac, but it should work fine in any environment with Python.

Start by downloading the GeoIP City database from MaxMind. They have a free version that you can download here. Download the one in binary format, and uncompress it.

Next, you need the library to access this database format. There's a pure Python library called pygeoip that you can download from google code. To install it, just uncompress it and run the installer: sudo python setup.py install

Next, you need to build a small script to convert the IP addresses to countries. Here's the script I used (note that the countries database should be in the same directory as the script).

#!/usr/bin/env python

import pygeoip, sys
gi = pygeoip.GeoIP('GeoLiteCity.dat')

for line in sys.stdin:
	rec = gi.record_by_addr(line)
	print rec['country_name']

I used this script (geo.py) by copying the IP list from excel to a plain text file (ips.txt), where you get one address per line. Then just run it with something like python geo.py < ips.text and you get a list of countries on your terminal window. Copy/paste to excel and you're done!

If you want more than just the country, just play a bit with the print line. Here's a variation I did to get the state and the city. The output is tab separated so that you can copy it easily to excel:

#!/usr/bin/env python

import pygeoip, sys
gi = pygeoip.GeoIP('GeoLiteCity.dat')

for line in sys.stdin:
	rec = gi.record_by_addr(line)
	print rec['country_name'] + '\t',
	if rec['country_code'] == 'US' and 'region_name' in rec:
		print rec['region_name'] + '\t' + rec['city']
	else:
		print '-' + '\t' + rec['city']

As a side note, I tried another database from hostip.info, but it was only able to convert about half of the IPs I threw at it, so I recommend going with the one from MaxMind...

No comments: