Monday, April 21, 2008

Using Google AppEngine for a Little Micro-Scalability

Over the years I've accumulated quite a ragtag collection of personal systems scattered wide across a galaxy of different servers. For the past month I've been on a quest to rationalize this conglomeration by moving everything to a managed service of one kind or another. The goal: lift a load of worry from my mind. I like to do my own stuff myself so I learn something and have control. Control always comes with headaches and it was time for a little aspirin. As part of the process GAE came in handy as a host for a few Twitter-related scripts I couldn't manage to run anywhere else. I recoded my simple little scripts into Python/GAE and learned a lot in the process.

In the move I exported HighScalability from a VPS and imported it into a shared hosting service. I could never quite configure Apache and MySQL well enough that they wouldn't spike memory periodically and crash the VPS. And since the VPS did not automatically restart after a memory crash, that was unacceptable. I also wrote a script to convert a few thousand pages of JSPWiki to MediaWiki format, moved off my own mail server, moved all my code to a hosted SVN server, and moved a few other blogs and static sites along the way. No, it wasn’t very fun.

One service I had a problem moving was http://innertwitter.com because of two scripts it used. In one script (Perl) I login to Twitter and download the most recent tweets for an account and display them on a web page. In another script (Java) I periodically post messages to various Twitter accounts.

Without my own server I had nowhere to run these programs. I could keep a VPS but that would cost a lot and I would still have to worry about failure. I could use AWS but the cost of a fault-tolerant system would be too high for my meager needs. I could rewrite the functionality in PHP and use a shared hosting account, but I didn’t want to go down the PHP road. What to do?

Then Google AppEngine was announced and I saw an opportunity to kill two stones with one bird: learn something while doing something useful. With no Python skills I just couldn’t get started, so I ordered Learning Python by Mark Lutz. It arrived a few days later and I read it over an afternoon. I knew just enough Python to get started and that was all I needed. Excellent book, BTW.

My first impression of Python is that it is a huge language. It gives you a full plate of functional and object-oriented dishes and it will clearly take a while to digest. I’m pretty language agnostic so I’m not much of a fanboy of any language. A lot of people are quite passionate about Python. I don’t exactly understand why, but it looks like it does the job and that’s all I really care about.

With basic Python skills in hand, I ran through the GAE tutorial. Shockingly it all just worked. They kept it very basic, which is probably why it worked so well. With little ceremony I was able to create a site, access the database, register the application, upload the application, and then access it over the web. To get to the same point using AWS was *a lot* harder.

Time to take off the training wheels. In the same way understanding a foreign language is a lot easier than speaking it, I found writing Python from scratch a lot harder than simply reading/editing it. I’m sure I’m committing all the classic noob mistakes. The indenting thing is a bit of a pain at first, but I like the resulting clean-looking code. Not using semicolons at the end of a line takes getting used to. I found the error messages none too helpful. Everything was a syntax error. Sorry folks, statically typed languages are still far superior in this regard. But the warm fuzzy feeling you get from changing code and immediately running it never gets old.

My first task was to get recent entries from my Twitter account. My original Perl code looks like:

use strict;
use warnings;
use CGI;
use LWP;

eval
{
    my $query = new CGI;
    print $query->header;
    my $callback = $query->param("callback");
    my $url = "http://twitter.com/statuses/replies.json";
    my $ua = new LWP::UserAgent;
    # Prepend our name to the default LWP agent string (note the separating space).
    $ua->agent("InnerTwitter/0.1 " . $ua->agent);
    my $header = new HTTP::Headers;
    $header->authorization_basic("user", "password");
    my $req = new HTTP::Request("GET", $url, $header);
    my $res = $ua->request($req);
    if ($res->is_success)
    {
        # Wrap the JSON in the caller's callback (JSONP).
        print "$callback(" . $res->content . ")";
    }
    else
    {
        my $msg = $res->error_as_HTML();
        print $msg;
    }
};


My strategy was to try and do a pretty straightforward replacement of Perl with Python. From my reading, URL Fetch was what I needed to make the JSON call. Well, the documentation for URL Fetch is nearly useless. There’s not a practical “help get stuff done” line in it. How do I perform authorization, for example? Eventually I hit on:

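import base64

from google.appengine.api import urlfetch
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class InnerTwitter(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        callback = self.request.get("callback")
        # HTTP Basic auth: base64 of "user:password"; [:-1] strips the trailing newline.
        base64string = base64.encodestring('%s:%s' % ("user", "password"))[:-1]
        headers = {'Authorization': "Basic %s" % base64string}
        url = "http://twitter.com/statuses/replies.json"
        result = urlfetch.fetch(url, method=urlfetch.GET, headers=headers)
        # Wrap the JSON in the caller's callback (JSONP).
        self.response.out.write(callback + "(" + result.content + ")")

def main():
    application = webapp.WSGIApplication(
        [('/innertwitter', InnerTwitter)],
        debug=True)
    run_wsgi_app(application)

if __name__ == "__main__":
    main()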
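
Since authorization was the part I had to dig for, here is a minimal sketch of just that step, plus the callback wrapping, pulled out so it can be run on its own. It assumes plain Python 3 with only the standard library, not the Python 2 GAE runtime used in the handler, and the helper names basic_auth_header and jsonp are made up for illustration.

```python
import base64


def basic_auth_header(user, password):
    """Return a headers dict carrying an HTTP Basic Authorization value."""
    # Basic auth is base64("user:password"); b64encode adds no trailing newline,
    # so there is nothing to strip afterwards.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def jsonp(callback, json_text):
    """Wrap raw JSON text in a call to the named JavaScript callback."""
    return f"{callback}({json_text})"


if __name__ == "__main__":
    print(basic_auth_header("user", "password"))
    print(jsonp("showTweets", '[{"text": "hello"}]'))
```

The resulting dict can be passed as the headers argument of whatever HTTP client is in use, which is all the handler does with it.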


For me the Perl code was easier simply because there is example code everywhere. Perhaps Python programmers already know all this stuff so it’s easier for them. I eventually figured out all the WSGI stuff is standard and documentation was available. Once I figured out what I needed to do, the code was simple and straightforward. The one thing I really dislike is passing self around. It just indicates bolt-on to me, but other than that I like it. I also like the simple mapping of URL to handler. As an early CGI user I could never quite understand why more moderns need a framework to “route” to URL handlers. This approach hits just the right level of abstraction to me.
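
To show how little is going on in that URL-to-handler mapping, here is a rough sketch of the same idea as a bare WSGI callable with a dictionary as the route table. This is not how webapp is implemented; it is plain Python 3 under my own assumptions, and the handler bodies and the ROUTES name are placeholders.

```python
def inner_twitter(environ, start_response):
    # Placeholder handler: the real one would fetch and return tweets.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"recent tweets would go here"]


def send_chime(environ, start_response):
    # Placeholder handler: the real one would post a chime.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"chime would be sent here"]


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


# The whole "router": request path -> WSGI handler.
ROUTES = {
    "/innertwitter": inner_twitter,
    "/sendchime": send_chime,
}


def application(environ, start_response):
    """Dispatch on PATH_INFO, falling back to a 404 handler."""
    handler = ROUTES.get(environ.get("PATH_INFO", ""), not_found)
    return handler(environ, start_response)
```

Any WSGI server can host application; the standard library's wsgiref.simple_server.make_server is enough for trying it locally.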

My next task was to write a string to a Twitter account. Here’s my original Java code:

private static void sendTwitter(String username)
{
    username += "@domain.com";
    String password = "password";

    try
    {
        String chime = getChimeForUser(username);
        // Form-encode the status; name the charset explicitly.
        String msg = "status=" + URLEncoder.encode(chime, "UTF-8");
        msg += "&source=innertwitter";
        URL url = new URL("http://twitter.com/statuses/update.xml");
        URLConnection conn = url.openConnection();
        conn.setDoOutput(true); // set POST
        conn.setUseCaches(false);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setRequestProperty("Content-Length", "" + msg.length());
        // HTTP Basic auth: base64 of "username:password".
        String credentials = new sun.misc.BASE64Encoder().encode(
            (username + ":" + password).getBytes());
        conn.setRequestProperty("Authorization", "Basic " + credentials);
        OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
        wr.write(msg);
        wr.flush();
        wr.close();
        // Echo Twitter's response.
        BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line = "";
        while ((line = rd.readLine()) != null)
        {
            System.out.println(line);
        }
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}

private static String getChimeForUser(String username)
{
    Date date = new Date();
    Format formatter = new SimpleDateFormat("........hh:mm EEE, MMM d");
    String chime = "........*chime* " + formatter.format(date);
    return chime;
}



Here’s my Python translation:


import base64
import datetime
import urllib

from google.appengine.api import urlfetch
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class SendChime(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        username = self.request.get("username")

        login = username
        password = "password"
        chime = self.get_chime()
        # Form-encode the POST body.
        payload = {'status': chime, 'source': "innertwitter"}
        payload = urllib.urlencode(payload)

        # HTTP Basic auth: base64 of "login:password"; [:-1] strips the trailing newline.
        base64string = base64.encodestring('%s:%s' % (login, password))[:-1]
        headers = {'Authorization': "Basic %s" % base64string}

        url = "http://twitter.com/statuses/update.xml"
        result = urlfetch.fetch(url, payload=payload, method=urlfetch.POST, headers=headers)

        self.response.out.write(result.content)

    def get_chime(self):
        now = datetime.datetime.now()
        chime = "........*chime*.............." + now.ctime()
        return chime

def main():
    application = webapp.WSGIApplication(
        [('/innertwitter', InnerTwitter),
         ('/sendchime', SendChime)],
        debug=True)
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
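
For anyone who wants to see the form-encoded POST without the GAE pieces around it, here is a small sketch that builds the request object and stops short of sending it. It assumes Python 3's standard urllib rather than urlfetch, and build_status_post is a name I made up for this example.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_status_post(url, status, auth_header):
    """Build (but do not send) a form-encoded POST carrying a status update."""
    # urlencode escapes spaces, '*', ':' and friends for the request body.
    payload = urlencode({"status": status, "source": "innertwitter"})
    headers = {
        "Authorization": auth_header,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(url, data=payload.encode("ascii"), headers=headers, method="POST")


if __name__ == "__main__":
    req = build_status_post(
        "http://twitter.com/statuses/update.xml",
        "*chime* Mon Apr 21",
        "Basic <token>",  # placeholder, not a real credential
    )
    print(req.get_method())
    print(parse_qs(req.data.decode("ascii")))
```

Passing the request to urllib.request.urlopen would actually send it; keeping construction separate makes the encoding easy to inspect first.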


I had to drive the timed execution of this URL from an external cron service, which points out that GAE is still a very limited environment.
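
The cron side can be as small as a script that requests the handler URL. What follows is only a sketch of that idea in standard-library Python 3, not the job I actually run: the base URL is a placeholder, chime_url and trigger_chime are invented names, and only the /sendchime path and the username parameter come from the handler.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def chime_url(base_url, username):
    """Build the handler URL, e.g. <base_url>/sendchime?username=..."""
    query = urlencode({"username": username})
    return f"{base_url}/sendchime?{query}"


def trigger_chime(base_url, username, opener=urlopen):
    """GET the handler URL and return the HTTP status code.

    `opener` is injectable so the function can be exercised without a network.
    """
    with opener(chime_url(base_url, username), timeout=30) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder host: print the URL a cron job would request.
    print(chime_url("http://example.invalid", "someuser"))
```

An external cron entry would then just run this script on whatever schedule the chimes should follow.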

Start to finish the coding took me 4 hours and the scripts are now running in production. Certainly this is not a complex application in any sense, but I was happy it never degenerated into the all too familiar debug fest where you continually fight infrastructure problems and don’t get anything done. I developed code locally and it worked. I pushed code into the cloud and it worked. Nice.

Most of my time was spent trying to wrap my head around how you code standard HTTP tasks in Python/GAE. The development process went smoothly. The local web server and the deployment environment seemed to be in harmony. And deploying the local site into Google’s cloud went without a hitch. The debugging environment is primitive, but I imagine that will improve over time.

This wasn’t merely a programming exercise for an overly long and boring post. I got some real value out of this:
  • Hosting for my programs. I didn’t have any great alternatives to solve my hosting problem and GAE fit a nice niche for me.
  • Free. I wouldn’t really mind if it was low cost, but since most of my stuff never makes money I need to be frugal.
  • Scalable. I don’t have to worry about overloading the service.
  • Reliable. I don’t have to worry about the service going down and people not seeing their tweets or getting their chimes.
  • Simple. The process was very simple and developer friendly. AWS will be the way to go for “real” apps, but for simpler apps a lighter weight approach is refreshing. One can see the GUI layer in GAE and the service layer in AWS.

    GAE offers a kind of micro-scalability. All the little things you didn’t have a place to put before can now find a home. And as they grow up they might just find they like staying around for a little of momma’s home cooking.


    Related Articles



  • How SimpleDB Differs from a RDBMS
  • Google AppEngine – A Second Look
  • Is App Tone Enough? at Appistry.

    Reader Comments (3)

    Hosted SVN? Who are they? Are they any good?
    External Cron? Who are they? Are they any good?

    November 29, 1990 | Unregistered CommenterFrancis

    I'm using the cron that's part of my siteground.com account. I didn't find an inexpensive web service or I would have used one. Much love to siteground so far. I'm traveling at the moment and I don't remember the SVN provider. I'll get back to you on that. I don't have a lot of experience with them yet so I wouldn't want to recommend them yet in any case.

    November 29, 1990 | Unregistered CommenterTodd Hoff

    That algorithm is really hard to understand unless a programmer can easily get it. Not to everyone I guess..

    November 29, 1990 | Unregistered Commenterfarhaj
