Catherine: pyOraGeek

Tuesday, September 02, 2008

BigTable blues

This was supposed to be the blog entry where I would announce the Geek Event Aggregator's successful port to Google App Engine.

(sigh)

I've read an awful lot of buzz about how GAE's BigTable is the Next Big Thing in data, makes RDBMS obsolete, etc. Maybe I'm just doing it wrong, but right now I am utterly unimpressed.

The Geek Event Aggregator needs to search its database of 5000 or so events for events whose longitude and latitude are close enough to the user to be of interest. Does that sound so impossible?

I couldn't do it in GAE. First, "Inequality Filters Are Allowed On One Property Only", so I can filter on longitude or latitude, but not both. I had to filter only on longitude, pull all the resulting records into the application, and finish boiling the ocean in my app. It was slow in the local development environment, but I hoped it would run faster once uploaded to the actual GAE production servers.
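
The shape of that workaround can be sketched in plain Python. This is only an illustration, not the Aggregator's actual code: the `Event` class, the `events_near` function, and the `box` parameter are hypothetical names, and the first list comprehension merely stands in for the single-inequality datastore query.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical stand-in for a stored event record.
    name: str
    latitude: float
    longitude: float


def events_near(events, lat, lon, box=1.0):
    """Return events inside a square of +/- `box` degrees around (lat, lon)."""
    # Step 1: the one inequality filter the datastore allows (longitude only).
    # In the real app this would be the datastore query; here it is simulated.
    lon_matches = [e for e in events if lon - box <= e.longitude <= lon + box]
    # Step 2: the latitude check has to happen in application code.
    return [e for e in lon_matches if lat - box <= e.latitude <= lat + box]


events = [
    Event("Columbus meetup", 39.96, -83.00),
    Event("Dayton meetup", 39.76, -84.19),
    Event("Tallahassee meetup", 30.44, -84.28),
    Event("Atlanta meetup", 33.75, -84.39),
]
nearby = events_near(events, lat=40.0, lon=-83.5, box=1.0)
print([e.name for e in nearby])
```

The cost is in step 1: every record in the longitude band comes back from the datastore, however far north or south it is, before the latitude check can discard it.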

In production, though, it doesn't run at all: "Timeout: datastore timeout: operation took too long." Querying 5000 records is too much for the mighty BigTable, apparently. Dropping the filter on longitude (to do all the filtering in the app, in case inequality filtering is just so poisonous) didn't help, either.

Oh well. I still enjoyed working with GAE at first, and maybe I'll use it again for something with very light data demands. For the Geek Event Aggregator, I do have a server available where I can host in TurboGears - it'll just take a bit of rewriting. Later this week, hopefully.

2 Comments:

  • Consider having a look at: http://geohash.org/
    It lets you encode Lat and Long into a simple string which has the nice property of sorting places close to each other when you do a simple alphabetical sort of the list of strings.

    And yes, GAE is funky, but you need to approach things from a different angle. It's difficult to forget all the RDBMS habits, but you have to try.

    By Blogger Etienne, at 1:27 AM  
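
    For readers unfamiliar with the encoding mentioned above, here is a minimal sketch of the public geohash scheme in plain Python. The function name `geohash_encode` and the `precision` parameter are hypothetical choices for illustration; this is not code from geohash.org or from the Aggregator.

```python
# Geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the interval
            rng[0] = mid
        else:
            bits = bits << 1        # lower half of the interval
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))
```

    Because a shorter hash is always a prefix of a longer one for the same point, a nearby-events search can become a single prefix (range) query on one string property, which fits the one-inequality-property limit. One caveat: two places on opposite sides of a cell boundary can be close together yet share no prefix, so a real search usually checks the neighbouring cells as well.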

  • Imagine you arrive at a stationery store and the clerk tells you, "OK miss, please wait just a bit while I check against my whole inventory of 5,000 products to find those 2 or 3 that match your request."

    Sounds stupid in the real world, doesn't it?

    But programmers got so used to this sort of approach because of RDBMS that anything that doesn't fit this model seems deficient.

    The bottom line: Refactor your model, re-tag it, think outside the RDBMS box!

    By Blogger Jay A., at 7:01 PM  
