Category Archives: Django

Porting chrss from Turbogears 1.0 to Django 1.3

For the a period of 18 months or so I slowly ported my chess by rss site (chrss) from Turbogears 1.0 to Django. When I started the port Django 1.1 was the latest version. I took so long to finish the port, that Django got to version 1.3 in the meantime! The slowness was mostly due to my then pending and now actual fatherhood. However by slowly toiling away, with the aid of quite a few automated tests I managed to deploy a Django version of the chrss the other week a couple few of months ago.

Python Framework to Python Framework

The good thing about moving from Turbogears to Python was that it’s all Python code. This meant that things like the core chess code didn’t need touching at all. It also meant that a lot of fairly standard code could be almost directly ported. Methods on controller objects in Turbogears became view functions in Django. Methods on model objects stayed methods and so on. A lot of the porting was fairly mechanistic. I moved all of the Turbogears code into a folder for easy referral and then built the Django version from nothing back up. Initially most of the work was done at the model/db level where I could copy over the code, convert it to Django style and then copy over and update the automated tests. I used Django Coverage to make sure the code was still all actually getting tested.

I could have opted to exactly match the database from the Turbogears version, but opted instead to make it a more Django like. This meant using the regular Django user models and so on. As Turbogears 1.0 involved a custom user model I had to create a separate user profile model in the Django port. There were a few other changes along these lines, but most of the porting was not so structural.

A lot of the hard work during porting came from missing bits of functionality that had far reaching effects. Testing was very awkward until a lot of the code had been ported already.

Cheetah templates to Django templates

Chrss used Cheetah for templates. Cheetah is not as restrictive with it’s templates as Django. It’s very easy to add lots of logic and code in there. Some pages in chrss had quite a bit of logic – in particular the main chess board page. This made porting rather tricky with Django. I had to put a lot of code into template tags and filters and carefully re-organise things. Porting the templates was probably the hardest part. Automated tests helped a bit with this, but a lot of the issues needed visual verification to ensure things really were working as they should.

One advantage of going from Cheetah to Django’s tempate language was the incompatible syntax. This meant I could simply leave bit’s of Cheetah code in a template and it would serve as quite a good visual reminder of something that was yet to be ported.

The second 90%

A good portion of the site was ported before my son’s birth. It seemed like maybe it wouldn’t take much longer, as it felt like 90% of the work was done. Of course it turned out there was another 90% yet to finish.

Beyond the usual tweaking and finishing of various odds and ends, the remaining work consisted of:

  • Completing the openid integration
  • Migrating the database

For the open id integration I opted to use Simon Willison’s Django OpenID app – hoping to be able to have a nice drop-in app. Possibly due to the slightly unfinished nature of the app and mostly due to my desire to customise the urls and general flow of the login/register process this took a fair while. It might have been quicker directly porting the original OpenID code I used with Turbogears, but it did work out ok in the end.

Of course it turns out that after all that hard work, that hardly anyone seems to use OpenID to login to sites anymore. I might have been better off integrating Django Social Auth instead, for Twitter and/or Facebook logings. However I decided that this would have been too much like feature creep and stuck with the original plan.

The chrss database isn’t very large. The table recording the moves for each game is currently over 70,000 rows, but each row is very small. So the database dump when tar gzipped is less than 3Mb in size. This gave me plenty of options for database migration. I’d roughly matched the schema used for the Turbogears site, but did take the opportunity to break from the past slightly. I created a Python script that used talked to the MySQL database and generated an sql dump in the new format. By using Python I was free to do any slightly more complex database manipulation I needed. The only real tricky part was converting to using Django’s user models.

One wrinkle with the database migrating was a bit disturbing. When I setup the app on webfaction I found some very odd behaviour. Logging in seemed to work, but after a couple of page requests you’d be logged out. The guys at webfaction were very helpful, but we were unable to get to the bottom of the problem. During testing I found that this problem did not happen with SQLite or Postgres, so was something to do with MySQL. This was one of those times when using an ORM paid off massively. Apart from the database migration script no code needed to be touched. If I’d had more time I might have persevered with MySQL, but Postgres has been working very well for me.

Conclusion

Chrss has been running using Django and Postgres for nearly eleven months now and has been working very well. I’ve had to fix a few things now and then, but doing this in Django has been very easy. I was also able to automate deployment using Fabric, so new code could be put live with the minimum of fuss. When you’ve only got a limited time to work with, automated deployment makes a huge difference.

Hopefully now that sleep routines are better established and my own sleep is less interrupted I’ll have a chance to add new features to chrss soon.

Django User Profiles and select_related optimisation

If you want to associate extra information with a user account in Django you need to create a separate “User Profile” model. To get the profile for a given User object one simply calls get_profile. This is a nice easy way of handling things and helps keep the User model in Django simple. However it does have a downside – fetching the user and user profile requires two database queries. That’s not so bad when we’re selecting just one user and profile, but when we are displaying a list of users we’d end up doing one query for the users and another n-queries for the profiles – the classic n+1 query problem!

To reduce the number of queries to just one we can use select_related. Prior to Django 1.2 select_related did not support the reverse direction of a OneToOneField, so we couldn’t actually use it with the user and user profile models. Luckily that’s no longer a problem.

If we are clever we can setup the user profile so it stills works with get_profile and does not create an extra query.

get_profile looks like this:


def get_profile(self):
        if not hasattr(self, '_profile_cache'):
            # ...
            # query to get profile and store it in _profile_cache on instance
            # so we don't need to look it up again later
            # ...
        return self._profile_cache

So if select_related stores the profile object in _profile_cache then get_profile will not need to do any more querying.

To do this we’d define the user profile model like this:


class UserProfile(models.Model):
    user = models.OneToOneField(User, related_name='_profile_cache')
    # ...
    # other model fields
    # ...

The key thing is that we have set related_name on the OneToOneField to be '_profile_cache'. The related_name defines the attribute on the *other* model (User in this case) and is what we need to refer to when we use select_related.

Querying for all user instances and user profiles at the same time would look like:


User.objects.selected_related('_profile_cache')

The only downside is that this does change the behaviour of get_profile slightly. Previously if no profile existed for a given user a DoesNotExist exception is raised. With this approach get_profile instead returns None. So you’re code will need to handle this. On the upside repeated calls to get_profile, when no profile exists, won’t re-query the database.

The other minor problem is that we are relying on the name of a private attribute on the User model (that little pesky underscore indicates it’s not officially for public consumption). Theoretically the attribute name could change in a future version of Django. To mitigate against the name changing I’d personally just store the name in a variable or on the profile model as a class attribute and reference that whenever you need it, so at least there’s only one place to make any change. Apart from that this only requires a minor modification to your user profile model and the use of select_related, so it’s not a massively invasive optimisation.

Django Batch Select

So quite a while ago I wrote about avoiding the n+1 query problem with SQLObject. I’ve since been using Django a lot more. In particular at my day job I’ve been improving a Django site we’ve inherited. This particular site suffered from a few too many SQL queries and needed speeding up. Some of the issues related to the classic n+1 problem. i.e. one initial query triggering a further n queries (where n is the number of results in the first query).

A liberal dose of select_related helped. However that was only useful in the cases where a ForeignKey needed to be pre-fetched.

In the case however there was a page that was selecting a set of objects that had tags. The tags for each object were being displayed along side a link to the main object. Given that the initial query returned over three hundred objects, this meant the page was performing another three hundred (plus) queries to fetch the individual tags for each object! Now we could cache the page and that’s was indeed what was being done. The trouble however was when the cache expired. It also made things painful when developing – as I’d typically want to disable caching whilst I’m making changes to pages frequently.

I came up with a specific solution for this project – to perform the initial query, then a second query to fetch *all* of the other tags in one go. The results of the second query could then be stitched into the original results, to make them available in a handy manor within the page’s template.

I took the original idea and made it re-usable and am now releasing that code as Django Batch Select. There are examples of how to use it over at github, but it works a bit like this:


>>> entries = Entry.objects.batch_select('tags').all()
>>> entry = entries[0]
>>> print entry.tags_all
[<Tag: Tag object>]

It’s a very early release – after the result of only a few hours coding, so use with care. It does have 100% test code coverage at the moment and I’m reasonable confident about it working. Please try it out and let me know whether it works for you.

Django simple admin ImageField thumbnail

The Django admin site is one of the best features of Django. It really lets you just get on with building your app, without having to worry too much about how you’ll administer your site.

The defaults are generally pretty good, but it’s often the case that you’ll want to tweak and change it (particularly when you have clients involved). Luckily it’s pretty easy to customize.

One common change that many people will want to do is to display a thumbnail of an uploaded image in the admin. The default image field (when an image has been uploaded) in the admin looks like:

image_field1

and we want to change it to look like this:

image_field2

In my case I had images that I knew would be fairly small so I didn’t need to use any auto-resizing or anything like that. I found this snippet for doing the same task, but decided to simplify it a fair bit and ended up with:


from django.contrib.admin.widgets import AdminFileWidget
from django.utils.translation import ugettext as _
from django.utils.safestring import mark_safe

class AdminImageWidget(AdminFileWidget):
    def render(self, name, value, attrs=None):
        output = []
        if value and getattr(value, "url", None):
            image_url = value.url
            file_name=str(value)
            output.append(u' <a href="%s" target="_blank"><img src="%s" alt="%s" /></a> %s ' % \
                (image_url, image_url, file_name, _('Change:')))
        output.append(super(AdminFileWidget, self).render(name, value, attrs))
        return mark_safe(u''.join(output))

Then to use it I simply override formfield_for_dbfield to return a field that uses that widget for the field I’m interested in:


class MyAdmin(admin.ModelAdmin):
    def formfield_for_dbfield(self, db_field, **kwargs):
        if db_field.name == 'profile_image':
            request = kwargs.pop("request", None)
            kwargs['widget'] = AdminImageWidget
            return db_field.formfield(**kwargs)
        return super(MyAdmin,self).formfield_for_dbfield(db_field, **kwargs)

It’s a fairly straightforward alteration to how the ImageField widget renders in the admin, but it is often a big help to be able to actually see the image in question.

UPDATED: Feb 24th, 2010. Based on Jeremy’s comment below and having upgraded to Django 1.1 I’ve modified this example to remove the “request” param from kwargs before passing it db_field.