Google App Engine is a fantastic platform for hosting webapps, and a great resource for iOS developers who need an online component to their products. It’s hard to believe that the service is essentially free! I’m using it with The Cartographer, but I found myself coming up against a hard limit with the datastore.
You see, the datastore limits entities to 1 MB. I’m trying to store XML data in there, and sometimes that can exceed the 1 MB limit.
XML, being the verbose creature that it is, compresses very nicely, so it occurred to me that if I selectively compress the larger blocks, I should be able to squeeze in underneath the limit quite easily. Sure enough, a 1.6 MB XML block compressed into about 200 KB.
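To get a feel for how well this kind of data compresses, here is a quick standalone sketch using Python's standard zlib module. The XML below is made up for illustration and is far more repetitive than real data, so expect it to shrink much more than the 1.6 MB block mentioned above did.

```python
import zlib

# Hypothetical sample data: one made-up record repeated many times.
record = '<point lat="51.5074" lon="-0.1278"><name>Marker</name></point>\n'
xml_text = "<points>\n" + record * 20000 + "</points>\n"

raw = xml_text.encode("utf-8")
compressed = zlib.compress(raw)

print("raw bytes:        %d" % len(raw))
print("compressed bytes: %d" % len(compressed))

# Round trip to confirm that nothing is lost.
restored = zlib.decompress(compressed)
```

Running something like this against a sample of your own data is a cheap way to check whether compression alone will get you under the limit.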
App Engine makes it very easy to define custom properties on data models, so I’ve written a CompressibleTextProperty class that automatically compresses and decompresses property values above a certain size. This means that there’s no performance loss for entities that are small enough to fit easily, while still allowing bigger blocks of content to be stored.
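Stripped of the App Engine plumbing, the idea can be sketched as a pair of plain functions. This is a minimal illustration in Python 3 rather than the property class itself: the names pack and unpack are hypothetical, and the threshold simply mirrors the one used in the listing further down.

```python
import zlib

LENGTH_THRESHOLD = 500000  # characters; same value as in this post's listing
ZLIB_HEADER = b"x\x9c"     # first two bytes of zlib output at the default level


def pack(text):
    """Return the bytes to store, compressing only when the text is long."""
    raw = text.encode("utf-8")
    if len(text) > LENGTH_THRESHOLD:
        return zlib.compress(raw)
    return raw


def unpack(stored):
    """Reverse pack(): decompress only if the zlib header is present."""
    # Valid UTF-8 can never start with these two bytes, because 0x9c is a
    # continuation byte and cannot follow the ASCII character "x".
    if stored.startswith(ZLIB_HEADER):
        return zlib.decompress(stored).decode("utf-8")
    return stored.decode("utf-8")


small = "<a/>"
big = "<row>\u00e9</row>" * 100000
print(len(pack(small)), len(pack(big)))
```

Small values pass through untouched, so only the oversized ones pay the cost of compression, and the header check means nothing extra has to be stored to remember which values were compressed.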
The alternative was to break large entities up into several smaller datastore entities, but that sounded like much more work, and much less elegant.
#!/usr/bin/env python
# encoding: utf-8
"""
compressible_text_property.py

A string property that will automatically be stored compressed if larger than a given length threshold

Created by Michael Tyson on 2011-01-07.
Copyright (c) 2011 A Tasty Pixel. All rights reserved.

BSD LICENSE
"""

from google.appengine.ext import db
from google.appengine.api import datastore_types
import zlib

Text = datastore_types.Text

LENGTH_THRESHOLD = 500000  # Characters, as measured by len() on the text
EXPECTED_ZLIB_HEADER = u"x\x9c"  # First two bytes of zlib output at the default level


class CompressibleTextProperty(db.TextProperty):
    """A string property that will automatically be stored compressed if larger than a given length threshold

    This is designed to be used with textual properties that may exceed App Engine's 1MB entity size limit.
    Note that, if compressed, property will not be searchable.
    """

    def validate(self, value):
        """Validate text property; Nicked verbatim from TextProperty.

        Returns:
          A valid value.

        Raises:
          BadValueError if property is not instance of 'Text'.
        """
        if value is not None and not isinstance(value, Text):
            try:
                value = db.Text(value)
            except TypeError as err:
                # BadValueError lives in the db module, so it must be qualified here
                raise db.BadValueError('Property %s must be convertible '
                                       'to a Text instance (%s)' % (self.name, err))
        # Deliberately skips TextProperty.validate and calls its parent's version
        value = super(db.TextProperty, self).validate(value)
        if value is not None and not isinstance(value, Text):
            raise db.BadValueError('Property %s must be a Text instance' % self.name)
        return value

    def get_value_for_datastore(self, model_instance):
        """For writing to the datastore: Performs compression if length is greater than the threshold"""
        value = super(CompressibleTextProperty, self).get_value_for_datastore(model_instance)
        if value is None:
            # Unset properties have no length, so pass them through untouched
            return None
        if len(value) > LENGTH_THRESHOLD and not value.startswith(EXPECTED_ZLIB_HEADER):
            # Encode to UTF-8 first so non-ASCII text does not raise UnicodeEncodeError;
            # ISO-8859-1 then maps each compressed byte to exactly one character
            value = unicode(zlib.compress(value.encode('utf-8')), 'ISO-8859-1')
        return Text(value)

    def make_value_from_datastore(self, value):
        """For reading from the datastore: Decompresses if compressed data detected"""
        if value is None:
            return None
        if value.startswith(EXPECTED_ZLIB_HEADER):
            # Reverse the steps above: characters back to bytes, inflate, then decode the UTF-8 text
            value = zlib.decompress(value.encode('ISO-8859-1')).decode('utf-8')
        return value

    data_type = Text
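The ISO-8859-1 step in the listing is worth a second look, because it is what lets binary zlib output live inside a text property. ISO-8859-1 maps every byte value from 0 to 255 to the character with the same number, so the conversion is lossless in both directions. Here is a small standalone Python 3 sketch of that round trip; the sample string is made up.

```python
import zlib

sample = "<doc>caf\u00e9</doc>"  # hypothetical sample text
payload = zlib.compress(sample.encode("utf-8"))

# Bytes to text: one character per byte, nothing dropped or merged.
as_text = payload.decode("ISO-8859-1")

# Text back to bytes: recovers the compressed payload exactly.
back = as_text.encode("ISO-8859-1")
original = zlib.decompress(back).decode("utf-8")

# If this text were later serialised as UTF-8, every character above
# U+007F would take two bytes, so the result can be larger than payload.
utf8_size = len(as_text.encode("utf-8"))
print(len(payload), utf8_size)
```

That last figure matters if the storage layer encodes text as UTF-8, which is an assumption here rather than something verified against the datastore: the space actually used may be somewhat more than the raw compressed size, so it pays to leave some headroom under the limit.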