Folding@home Team Statistics Scraper

December 11, 2009

I created a team for Little Filament on Folding@home. Our team number is 172406 (in case you want to join), and I wanted to show our latest stats on the Little Filament site. As far as I can tell there is no API for the stats, so I worked up a scraper in bash.

Basically, all it does is fetch the page, then grep and sed its way to the variables, finally dumping them into a JSON file (for easy JavaScript consumption).
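As a rough illustration of the grep-and-sed step, the sketch below runs the same kind of pipeline against a made-up HTML fragment. The markup in `sample_html` and the helper name `extract_digits` are assumptions for the demo; the real stats page may well be laid out differently.

```shell
#!/bin/bash

# Hypothetical fragment standing in for the downloaded stats page;
# the real markup may differ.
sample_html() {
	cat <<'EOF'
<tr>
<td>Grand Score</td>
<td>
4355
</td>
</tr>
<tr>
<td>Work Unit Count</td>
<td>
20
</td>
</tr>
EOF
}

# extract_digits LABEL: read HTML on stdin and print every digit found
# within two lines of the line containing LABEL.
extract_digits() {
	grep -C 2 "$1" | sed 's/[^0-9]//g' | tr -d '\n'
}

SCORE=$(sample_html | extract_digits "Grand Score")
WU=$(sample_html | extract_digits "Work Unit Count")
echo "{\"score\": \"$SCORE\", \"work_units\": \"$WU\"}"
```

The approach is brittle by design: any stray digit within the context window (an attribute value, say) would end up in the result, so it only works while the page layout stays put.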

The kicker is that the stats server is overloaded or down a lot, so we can’t rely on it, and we don’t want to stress it out further. My decision was to poll it at a large interval, 12-24 hours. I don’t have enough clients on the team to effect significant change over 6-12 hours, but I don’t want to fall too far out of date either. So if the server is overloaded and drops a request once or twice, it's not a big deal.
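The interval check can be sketched as a small function. The name `is_stale` and the throwaway stamp file are hypothetical, and this is only one way to do it: a missing or empty timestamp file is treated as "never updated".

```shell
#!/bin/bash

# is_stale FILE INTERVAL [NOW]: succeed (exit 0) when the epoch timestamp
# stored in FILE is more than INTERVAL seconds before NOW.
# NOW defaults to the current time; a missing or empty FILE counts as stale.
is_stale() {
	local file=$1 interval=$2 now=${3:-$(date +%s)}
	local last=0
	[ -r "$file" ] && last=$(tr -d '\n' < "$file")
	[ "$now" -gt $(( ${last:-0} + interval )) ]
}

# Demo with a temporary stamp file recording a last update at t=1000.
STAMP=$(mktemp)
echo 1000 > "$STAMP"

is_stale "$STAMP" 86400 50000 && echo "poll" || echo "skip"   # 49000s elapsed
is_stale "$STAMP" 86400 90000 && echo "poll" || echo "skip"   # 89000s elapsed
rm -f "$STAMP"
```

Because the timestamp is only meant to be written after a successful update, a failed fetch leaves it untouched and the next run simply tries again.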

Without further ado, here is the script.

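#!/bin/bash

# Current time and time of the last successful update, in epoch seconds.
NOW=$(date +%s)
# A missing or empty lock file counts as "never updated", so the first run works.
THEN=$(cat fah_check.lock 2>/dev/null | tr -d '\n')
THEN=${THEN:-0}

# Only contact the stats server if the last success was over 24 hours ago.
if [ "$NOW" -gt $((THEN + 86400)) ]; then
	if wget "http://fah-web.stanford.edu/cgi-bin/main.py?qtype=teampage&teamnum=172406" -O fah_check.html; then
		# Make sure the page actually contains the stats before parsing it.
		if grep -q "Grand Score" fah_check.html; then
			# Pull the lines around each label and keep only the digits.
			SCORE=$(grep -C 2 "Grand Score" fah_check.html | sed 's/[^0-9]//g' | tr -d '\n')
			WU=$(grep -C 2 "Work Unit Count" fah_check.html | sed 's/[^0-9]//g' | tr -d '\n')
			# Keep digits plus "o" and "f", then rebuild the "N of M" ranking.
			RANK=$(grep -C 1 "Team Ranking" fah_check.html | sed 's/[^0-9of]//g' | tr -d '\n' | sed 's/f\([0-9]*\)of\([0-9]*\)/\1 of \2/')
			echo "{\"score\": \"$SCORE\", \"work_units\": \"$WU\", \"rank\": \"$RANK\" }" > fah_check.json
			echo "[$NOW] - Success!" >> fah_check.log
			# Record the time only on success, so failures are retried on the next run.
			echo "$NOW" > fah_check.lock
		else
			echo "[$NOW] - Filter Failed" >> fah_check.log
		fi
	else
		echo "[$NOW] - Download Failed" >> fah_check.log
	fi
else
	echo "[$NOW] - Skip Update" >> fah_check.log
fi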

That cranks out fah_check.json, which looks like this:

{"score": "4355", "work_units": "20", "rank": "39881 of 169721" }

To see it in action, check out the Little Filament Folding page.
