So here’s a little script for compressing a Python script to make it much smaller:
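import sys
import bz2
import base64
write = sys.stdout.write
# Pack every file named on the command line (Python 2 syntax)
for name in sys.argv[1:]:
    contents = bz2.compress(open(name).read())
    # Emit a two-line wrapper that decodes, decompresses and runs the script
    write('import bz2, base64\n')
    write('exec bz2.decompress(base64.b64decode(\'')
    write(base64.b64encode(contents))
    write('\'))\n')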
To use it, save that code into a file (e.g. pack.py). It will write the compressed code to standard output (i.e. the screen), so you can redirect it to whatever file you want:
python pack.py my_script.py > my_script_packed.py
The compressed code looks like this:
import bz2, base64
exec bz2.decompress(base64.b64decode('<base64 encoded compressed data>'))
Hopefully it is fairly clear what this is doing, but to summarize:
- The script data is base64-decoded into bytes
- The bytes are then decompressed (bz2) into the text of the original script
- exec is then called to run the original script
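The three steps above can be reproduced by hand. The sketch below assumes Python 3, where `exec` is a function and `bz2.compress` needs bytes, so it is not the same code as the Python 2 script in this post. The helper names `pack_source` and `unpack_source` are made up for illustration.

```python
import base64
import bz2


def pack_source(source):
    """Compress source text and return the base64 payload as ASCII text."""
    compressed = bz2.compress(source.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def unpack_source(payload):
    """Reverse pack_source: base64-decode, then bz2-decompress."""
    compressed = base64.b64decode(payload)          # step 1: text -> bytes
    return bz2.decompress(compressed).decode("utf-8")  # step 2: bytes -> source


# A made-up one-line "script" to round-trip
original = "result = sum(range(10))\n"
payload = pack_source(original)
restored = unpack_source(payload)

namespace = {}
exec(restored, namespace)  # step 3: run the recovered script
```

After this runs, `restored` should be identical to `original`, and `namespace["result"]` holds whatever the little script computed.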
One nice benefit to this way of compressing the script is that the final module namespace (after decompression) will look essentially the same as it would if the script had not been compressed. The only difference is that the bz2 and base64 modules will also be present. This should mean that you can actually compress multiple Python files and importing from them should still work. Though of course, as is the case whenever you add an extra layer of complexity, your mileage may vary…
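One way to check the namespace claim is to run the same source both directly and through the packed wrapper, then compare the names each leaves behind. This is only a sketch, assuming Python 3 (so the wrapper calls `exec(...)` as a function rather than using the Python 2 statement). The sample source and the helper `run_in_namespace` are hypothetical.

```python
import base64
import bz2

# A made-up module body with one constant and one function
source = "GREETING = 'hello'\ndef shout():\n    return GREETING.upper()\n"

# Build the packed wrapper as text (Python 3 form of the generated code)
payload = base64.b64encode(bz2.compress(source.encode("utf-8"))).decode("ascii")
packed = (
    "import bz2, base64\n"
    "exec(bz2.decompress(base64.b64decode('" + payload + "')))\n"
)


def run_in_namespace(code):
    """Execute code in a fresh dict and return the names it defines."""
    namespace = {}
    exec(code, namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the script
    return namespace


plain_names = set(run_in_namespace(source))
packed_names = set(run_in_namespace(packed))
extra_names = packed_names - plain_names  # names only the packed version has
```

If the claim holds, `extra_names` contains just the two modules imported by the wrapper.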
For an idea of how effective this compression can be, I took the Python script for calculating pi on Wikipedia and ran it through the script. After confirming that the packed version ran the same, a quick comparison revealed the script had shrunk from 12658 bytes to 3421 bytes, less than a third of the original size.
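The same kind of comparison can be scripted. The sketch below assumes Python 3 and uses a hypothetical helper, `packed_size`, that counts the bytes of the whole generated wrapper (import line, `exec` call and payload), not only the compressed data. Base64 inflates the compressed data by roughly a third, so very small scripts can come out larger than they went in.

```python
import base64
import bz2

# Python 3 form of the wrapper text around the payload
WRAPPER_HEAD = "import bz2, base64\nexec(bz2.decompress(base64.b64decode('"
WRAPPER_TAIL = "')))\n"


def packed_size(source):
    """Size in bytes of the packed file that would be generated for source."""
    payload = base64.b64encode(bz2.compress(source.encode("utf-8")))
    return len(WRAPPER_HEAD) + len(payload) + len(WRAPPER_TAIL)


# Made-up sample inputs, not the pi script from the post
repetitive = "print('spam and eggs')\n" * 500
tiny = "print('hi')\n"

for label, text in (("repetitive", repetitive), ("tiny", tiny)):
    print(label, len(text.encode("utf-8")), "->", packed_size(text))
```

Highly repetitive source shrinks dramatically, while the tiny script grows because the fixed wrapper and bz2 header outweigh any savings.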
As I work on the example Python 5K app I should hopefully get a good feel for how the competition rules might need changing to allow “scripting” languages like Python, Ruby, Perl and PHP. I think the plan will be that the app must be contained in a single file (less than 5120 bytes in size), with any resources embedded in the file. This is the norm for Java and Flash, but will probably require an extra packaging step for most other languages/runtimes. That single file can then be run either via a GUI (double-clicking) or via a standard invocation from the command line (e.g. python my_script_packed.py) using only a “standard” version of the language runtime. Note that the standard installed version of Python on Mac OS X 10.5 (Leopard) includes quite a few extra libraries (e.g. wxPython), so these libraries would be eligible for use in the 5K app. The same will be true for Ruby and Perl, so that should hopefully help open things out. Otherwise Java’s large standard library might give it too much of an advantage…
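The size rule is easy to check mechanically. A minimal sketch, assuming the limit applies to the file's size on disk and that “less than 5120 bytes” is meant strictly; `fits_5k` is a hypothetical helper, not part of any competition tooling, and the demo files are throwaway placeholders.

```python
import os
import tempfile

LIMIT_BYTES = 5120  # 5K, per the proposed rule


def fits_5k(path, limit=LIMIT_BYTES):
    """Return True if the file at path is strictly smaller than the limit."""
    return os.path.getsize(path) < limit


# Demo with two placeholder files, one just under and one exactly at the limit
with tempfile.TemporaryDirectory() as folder:
    small = os.path.join(folder, "small.py")
    big = os.path.join(folder, "big.py")
    with open(small, "wb") as handle:
        handle.write(b"#" * (LIMIT_BYTES - 1))
    with open(big, "wb") as handle:
        handle.write(b"#" * LIMIT_BYTES)
    small_ok = fits_5k(small)
    big_ok = fits_5k(big)
```

Under the strict reading, a file of exactly 5120 bytes would not qualify.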