Monday, August 4, 2008

Trackulator: APRS Web Service on Google App Engine

I've been experimenting with APRS for quite a while. I even wrote a pure-Perl APRS parser, Ham::APRS::Parser, which is on SourceForge (although not on CPAN; I'm too lazy to polish it up for submission). For quite some time I've wanted to write an APRS web client/service. I really wanted to write it in Python, make it open source, and allow public programmatic access to the database. But I could never find a web application framework and hosting solution that I was happy with, so it sat on the back burner for years.

Then I read about Google App Engine. Free Python web application hosting on an automagically scaling cluster?? Sign me up! I banged out a very simple APRS web application in about a day.

Google App Engine has several limitations. There is no cron, and there are no long-running processes. The only way to interact with an application is via HTTP, and any request that doesn't return within a few seconds is terminated with extreme prejudice. Furthermore, it doesn't allow access to low-level sockets, so it was impossible for the web application to receive the APRS-IS stream directly.

I solved this problem by writing a small daemon to run on my computer at work. It connects to the APRS-IS server, collects the packets, and every couple of seconds uploads a chunk of them to the site via an HTTP POST request. It's a bit kludgy: the Python daemon opens the APRS-IS connection, then pipes the raw data to my Ham::APRS::Parser, which pipes the decoded packets back to Python in a different pack()'ed encoding. The packed encoding lets me transfer as many packets as possible per request with very little processing and transit overhead. Once the web app receives the request, it unpack()s each report and inserts it into the GAE datastore.
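
To make the pack()/unpack() round trip concrete, here is a minimal sketch using Python's struct module. The record layout (a 9-byte callsign, a Unix timestamp, and latitude/longitude as 32-bit floats) and the function names are my assumptions for illustration only; the actual Trackulator wire format isn't described here and may well differ.

```python
import struct

# Hypothetical fixed-width record: 9-byte callsign, unsigned 32-bit Unix
# timestamp, latitude and longitude as 32-bit floats, network byte order.
# This layout is an assumption for illustration, not the real format.
RECORD = struct.Struct("!9sIff")

def pack_reports(reports):
    """Pack (callsign, timestamp, lat, lon) tuples into one byte string."""
    return b"".join(
        RECORD.pack(call.encode("ascii"), ts, lat, lon)
        for call, ts, lat, lon in reports
    )

def unpack_reports(payload):
    """Reverse pack_reports(); NUL padding is stripped from callsigns."""
    reports = []
    for call, ts, lat, lon in RECORD.iter_unpack(payload):
        reports.append((call.rstrip(b"\0").decode("ascii"), ts, lat, lon))
    return reports
```

With this layout each report costs 21 bytes, so a chunk of 30 reports is well under 1KB of POST body. The 32-bit floats trade a little coordinate precision for size, which is one of the choices a real format would have to make.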

Right now it's only receiving the USA stream. Processor time and disk storage are the major constraints. Google set some fairly low resource quotas for the GAE testing period. Eventually I would like to feed it the global stream; when GAE goes production, it will be possible to purchase additional resources.

Reports are stored using the GAE Datastore, which is based on Google's Bigtable DBMS. This is supposed to scale up into the petabyte zone with good performance, although currently Google limits GAE apps to 500MB. Again, when GAE goes production, it will be possible to buy more storage. The datastore performance has not actually been very good so far. I can only insert 30 or so position reports in a single request; any more and the request times out before the datastore finishes inserting them. A batch of that size usually takes a few seconds to process and works OK. Still, sometimes a request takes over 10 seconds and times out, and currently that causes the data to be lost. This is probably due to varying load on whatever server is handling that particular request. I could send fewer reports per HTTP request, but that seems like a waste, since every request carries its own overhead.
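
One way the data loss on timeouts could be avoided is to have the daemon hold on to a chunk until the upload succeeds. The following is only a sketch of that idea, not what the daemon currently does. The `send` callable is hypothetical: it stands for whatever function POSTs one chunk and returns True on success and False on a timeout or error.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def upload_all(reports, send, chunk_size=30, max_attempts=3):
    """Send reports in chunks; return the ones that could not be delivered.

    `send` is a caller-supplied (hypothetical) function that uploads one
    chunk and returns True on success, False on a timeout or error.
    """
    undelivered = []
    for chunk in chunked(reports, chunk_size):
        for _attempt in range(max_attempts):
            if send(chunk):
                break
        else:
            # Every attempt failed: keep the chunk instead of dropping it.
            undelivered.extend(chunk)
    return undelivered
```

The caller can prepend whatever comes back to the next upload cycle. One caveat: a request that times out may already have inserted some of its reports, so retrying can create duplicates unless the server side checks for them.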

Deleting old data from the datastore is also an issue. Here I can only seem to safely handle about 10 reports at a time, and even that times out more frequently than the inserts do. Currently I'm deleting all reports over 24 hours old. This results in a database that is approximately 200MB in size, for just the USA feed.
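
The purge logic amounts to "find the oldest reports past the 24-hour cutoff and delete at most 10 of them per request." Here is a rough sketch of that pattern. To keep it runnable anywhere, a plain dict stands in for the datastore, and the names `purge_batch` and `store` are made up for this example; the real code would issue a datastore query and delete instead.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=24)   # retention window described above
BATCH_SIZE = 10                 # roughly what fits in one request

def purge_batch(store, now, batch_size=BATCH_SIZE):
    """Delete up to batch_size expired reports; return how many were removed.

    `store` is a plain dict mapping report key -> timestamp. It is a
    stand-in for the datastore, used here only for illustration.
    """
    cutoff = now - MAX_AGE
    expired = [key for key, ts in store.items() if ts < cutoff]
    expired.sort(key=store.get)          # oldest first
    doomed = expired[:batch_size]
    for key in doomed:
        del store[key]
    return len(doomed)
```

Because each call removes only a small batch, something outside the app has to keep calling it (for example, the same daemon that does the uploads) until it returns 0.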

I wrote a horribly simple query function that returns an ugly static Google map with the station's 25 most recent reports, and a track-line (if any).

I have also written a simple JSON API to allow remote programmatic database queries. Anybody anywhere can efficiently query the database using HTTP and JSON from any programming language they like. Google App Engine will automatically scale the application across as many servers as necessary to handle the load. This is what really sets this system apart from the other web APRS offerings, except for OpenAPRS, which also offers a programmatic interface via XML. Oh well. None of the other APRS web clients that I am aware of (FindU, aprs.fi) offers programmatic access.
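
To give a feel for what a client of an HTTP and JSON API like this looks like, here is a small sketch. Note that everything specific in it is a placeholder: the endpoint URL, the `callsign` and `limit` parameters, and the `time`/`lat`/`lon` response fields are invented for illustration and are not the actual Trackulator API.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint; the real API URL is not given in this post.
BASE_URL = "http://example.invalid/api/reports"

def build_query_url(callsign, limit=25):
    """Build a GET URL for a station's recent reports (hypothetical params)."""
    return BASE_URL + "?" + urlencode({"callsign": callsign, "limit": limit})

def parse_reports(body):
    """Turn a JSON response body into (timestamp, lat, lon) tuples.

    Assumes a JSON list of objects with 'time', 'lat' and 'lon' keys,
    which is an invented shape used only for this example.
    """
    return [(r["time"], r["lat"], r["lon"]) for r in json.loads(body)]
```

A client would fetch the URL with any HTTP library and hand the response body to `parse_reports`; the same two steps translate directly to any other language with an HTTP client and a JSON parser.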

Future plans include storing APRS data beyond just positions, additional search API functions, and expanding to the global feed. I suppose a prettier web GUI would be nice, but I'm not really much of a JavaScript programmer. Maybe if I provide the database API, somebody else will work on the GUI.

Please try it out. You can get a list of moving callsigns to search for from http://aprs.fi/moving/. Please report any issues you find. Remember, not only is my application in testing, but Google App Engine is in testing too, so anything might happen. Don't rely on this for production use.

Trackulator is Copyright 2008 Jeffrey M. Laughlin, and is released under the GNU AGPL.

Related links: