Using Google AppEngine for a Little Micro-Scalability
In the move I exported HighScalability from a VPS and imported it into a shared hosting service. I could never quite configure Apache and MySQL well enough that they wouldn't spike memory periodically and crash the VPS. And since the VPS did not automatically restart after a memory crash, that was unacceptable. I also wrote a script to convert a few thousand pages of JSPWiki to MediaWiki format, moved off my own mail server, moved all my code to a hosted SVN server, and moved a few other blogs and static sites along the way. No, it wasn’t very fun.
One service I had a problem moving was http://innertwitter.com because of two scripts it used. In one script (Perl) I log in to Twitter, download the most recent tweets for an account, and display them on a web page. In another script (Java) I periodically post messages to various Twitter accounts.
Without my own server I had nowhere to run these programs. I could keep a VPS but that would cost a lot and I would still have to worry about failure. I could use AWS but the cost of a fault-tolerant system would be too high for my meager needs. I could rewrite the functionality in PHP and use a shared hosting account, but I didn’t want to go down the PHP road. What to do?
Then Google AppEngine was announced and I saw an opportunity to kill two stones with one bird: learn something while doing something useful. With no Python skills I just couldn’t get started, so I ordered Learning Python by Mark Lutz. It arrived a few days later and I read it over an afternoon. I knew just enough Python to get started and that was all I needed. Excellent book, BTW.
My first impression of Python is that it is a huge language. It gives you a full plate of functional and object oriented dishes and it will clearly take a while to digest. I’m pretty language agnostic so I’m not much of a fan boy of any language. A lot of people are quite passionate about Python. I don’t exactly understand why, but it looks like it does the job and that’s all I really care about.
Basic Python skills in hand, I ran through the GAE tutorial. Shockingly, it all just worked. They kept it very basic, which is probably why it worked so well. With little ceremony I was able to create a site, access the database, register the application, upload the application, and then access it over the web. To get to the same point using AWS was *a lot* harder.
Time to take off the training wheels. In the same way understanding a foreign language is a lot easier than speaking it, I found writing Python from scratch a lot harder than simply reading/editing it. I’m sure I’m committing all the classic noob mistakes. The indenting thing is a bit of a pain at first, but I like the resulting clean looking code. Not using semicolons at the end of a line takes getting used to. I found the error messages none too helpful. Everything was a syntax error. Sorry folks, statically typed languages are still far superior in this regard. But the warm fuzzy feeling you get from changing code and immediately running it never gets old.
My first task was to get recent entries from my Twitter account. My original Perl code looks like:
use strict;
use warnings;
use CGI;
use LWP;

eval
{
    my $query = new CGI;
    print $query->header;
    my $callback= $query->param("callback");
    my $url= "http://twitter.com/statuses/replies.json";
    my $ua= new LWP::UserAgent;
    $ua->agent("InnerTwitter/0.1" . $ua->agent);
    my $header= new HTTP::Headers;
    $header->authorization_basic("user", "password");
    my $req= new HTTP::Request("GET", $url, $header);
    my $res= $ua->request($req);
    if ($res->is_success)
    { print "$callback(" . $res->content . ")"; }
    else
    {
        my $msg= $res->error_as_HTML();
        print $msg;
    }
};
My strategy was to try and do a pretty straightforward replacement of Perl with Python. From my reading, URL fetch was what I needed to make the JSON call. Well, the documentation for URL fetch is nearly useless. There’s not a practical “help get stuff done” line in it. How do I perform authorization, for example? Eventually I hit on:
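import base64
import wsgiref.handlers

from google.appengine.api import urlfetch
from google.appengine.ext import webapp

class InnerTwitter(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        callback = self.request.get("callback")
        # Build the HTTP Basic credentials; [:-1] drops the trailing newline.
        base64string = base64.encodestring('%s:%s' % ("user", "password"))[:-1]
        headers = {'Authorization': "Basic %s" % base64string}
        url = "http://twitter.com/statuses/replies.json"
        result = urlfetch.fetch(url, method=urlfetch.GET, headers=headers)
        # Wrap the JSON reply in the caller's callback function.
        self.response.out.write(callback + "(" + result.content + ")")

def main():
    application = webapp.WSGIApplication(
        [('/innertwitter', InnerTwitter)],
        debug=True)
    # Hand the WSGI application to the CGI-style runtime.
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()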
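The authorization step is the part the URL fetch documentation leaves out, so here is a minimal standalone sketch of it. It assumes a current Python 3 standard library rather than the Python 2 runtime the handler above targets, and the helper names basic_auth_header and wrap_jsonp are made up for illustration.

```python
import base64


def basic_auth_header(user, password):
    """Return a headers dict carrying HTTP Basic credentials."""
    # Basic auth is just base64("user:password") behind the word "Basic".
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic %s" % token}


def wrap_jsonp(callback, json_text):
    """Wrap a JSON document in a callback call, as the handler does."""
    return "%s(%s)" % (callback, json_text)


if __name__ == "__main__":
    print(basic_auth_header("user", "password"))
    print(wrap_jsonp("showReplies", '[{"text": "hello"}]'))
```

Unlike encodestring, b64encode does not append a newline, so there is nothing to slice off the end.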
For me the Perl code was easier simply because there is example code everywhere. Perhaps Python programmers already know all this stuff so it’s easier for them. I eventually figured out all the WSGI stuff is standard and there was doc available. Once I figured out what I needed to do, the code was simple and straightforward. The one thing I really dislike is passing self around. It just indicates bolt-on to me, but other than that I like it. I also like the simple mapping of URL to handler. As an early CGI user I could never quite understand why more modern developers need a framework to “route” to URL handlers. This approach hits just the right level of abstraction for me.
My next task was to write a string to a Twitter account. Here’s my original Java code:
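To show how little is involved in mapping a URL to a handler, here is a rough sketch in plain WSGI with no framework. It is not how webapp is implemented; it is an illustration assuming Python 3, and the handler functions and the ROUTES table are hypothetical names.

```python
def inner_twitter(environ):
    """Placeholder handler for /innertwitter."""
    return "replies would go here\n"


def send_chime(environ):
    """Placeholder handler for /sendchime."""
    return "chime sent\n"


# The whole "router": a dict from URL path to handler function.
ROUTES = {
    "/innertwitter": inner_twitter,
    "/sendchime": send_chime,
}


def application(environ, start_response):
    """WSGI entry point: look up the path and call its handler."""
    handler = ROUTES.get(environ.get("PATH_INFO", ""))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    body = handler(environ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, application).serve_forever() will serve it on port 8000.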
private static void sendTwitter(String username)
{
    username+= "@domain.com";
    String password = "password";
    try
    {
        String chime= getChimeForUser(username);
        String msg= "status=" + URLEncoder.encode(chime);
        msg+= "&source=innertwitter";
        URL url = new URL("http://twitter.com/statuses/update.xml");
        URLConnection conn = url.openConnection();
        conn.setDoOutput(true); // set POST
        conn.setUseCaches (false);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setRequestProperty("CONTENT_LENGTH", "" + msg.length());
        String credentials = new sun.misc.BASE64Encoder().encode((username
            + ":" + password).getBytes());
        conn.setRequestProperty("Authorization", "Basic " + credentials);
        OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
        wr.write(msg);
        wr.flush();
        wr.close();
        BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line = "";
        while ((line = rd.readLine()) != null)
        {
            System.out.println(line);
        }
    } catch (Exception e)
    {
        e.printStackTrace();
    }
}

private static String getChimeForUser(String username)
{
    Date date = new Date();
    Format formatter = new SimpleDateFormat("........hh:mm EEE, MMM d");
    String chime= "........*chime* " + formatter.format(date);
    return chime;
}
Here’s my Python translation:
import base64
import datetime
import urllib
import wsgiref.handlers

from google.appengine.api import urlfetch
from google.appengine.ext import webapp

class SendChime(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        username = self.request.get("username")
        login = username
        password = "password"
        chime = self.get_chime()
        # Form-encode the POST body.
        payload= {'status' : chime, 'source' : "innertwitter"}
        payload= urllib.urlencode(payload)
        base64string = base64.encodestring('%s:%s' % (login, password))[:-1]
        headers = {'Authorization': "Basic %s" % base64string}
        url = "http://twitter.com/statuses/update.xml"
        result = urlfetch.fetch(url, payload=payload, method=urlfetch.POST, headers=headers)
        self.response.out.write(result.content)

    def get_chime(self):
        now = datetime.datetime.now()
        chime = "........*chime*.............." + now.ctime()
        return chime

def main():
    application = webapp.WSGIApplication(
        [('/innertwitter', InnerTwitter),
         ('/sendchime', SendChime)],
        debug=True)
    # Hand the WSGI application to the CGI-style runtime.
    wsgiref.handlers.CGIHandler().run(application)
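The two pieces that differ most from the Java version are the timestamp and the form-encoded body, so here is a small sketch of both in isolation. It assumes Python 3 (urllib.parse instead of the Python 2 urllib module used above), leaves out the padding dots, and swaps ctime() for an strftime pattern that roughly matches the Java "hh:mm EEE, MMM d" format. The function names are illustrative only.

```python
import datetime
import urllib.parse


def get_chime(now):
    """Format a chime message for the given datetime."""
    # Roughly Java's "hh:mm EEE, MMM d"; note %d zero-pads the day.
    return "*chime* " + now.strftime("%I:%M %a, %b %d")


def build_payload(chime, source="innertwitter"):
    """Form-encode the POST body for the status update."""
    return urllib.parse.urlencode({"status": chime, "source": source})


if __name__ == "__main__":
    chime = get_chime(datetime.datetime.now())
    print(build_payload(chime))
```

urlencode takes care of escaping spaces, colons, and the asterisks, which is what URLEncoder.encode did by hand in the Java code.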
I had to drive the timed execution of this URL from an external cron service, which points out that GAE is still a very limited environment.
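For anyone without a cron service handy, the external trigger can be as small as the sketch below, run on a schedule from any machine that has cron. It assumes Python 3; the host example.appspot.com, the account name, and the function names are placeholders, not the real deployment.

```python
import urllib.parse
import urllib.request


def build_chime_url(base_url, username):
    """Build the /sendchime URL for one account."""
    query = urllib.parse.urlencode({"username": username})
    return "%s/sendchime?%s" % (base_url.rstrip("/"), query)


def trigger_chime(base_url, username, opener=urllib.request.urlopen, timeout=30):
    """Hit the handler once and return the response body."""
    # opener is injectable so the function can be exercised without a network.
    response = opener(build_chime_url(base_url, username), timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


if __name__ == "__main__":
    # Hypothetical host and account; replace with real values.
    print(trigger_chime("http://example.appspot.com", "me@domain.com"))
```

A crontab entry that runs this script once an hour would then stand in for the timer the Java program used to provide.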
Start to finish the coding took me 4 hours and the scripts are now running in production. Certainly this is not a complex application in any sense, but I was happy it never degenerated into the all too familiar debug fest where you continually fight infrastructure problems and don’t get anything done. I developed code locally and it worked. I pushed code into the cloud and it worked. Nice.
Most of my time was spent trying to wrap my head around how you code standard HTTP tasks in Python/GAE. The development process went smoothly. The local web server and the deployment environment seemed to be in harmony. And deploying the local site into Google’s cloud went without a hitch. The debugging environment is primitive, but I imagine that will improve over time.
This wasn’t merely a programming exercise for an overly long and boring post. I got some real value out of this:
GAE offers a kind of micro-scalability. All the little things you didn’t have a place to put before can now find a home. And as they grow up they might just find they like staying around for a little of momma’s home cooking.