spaceblog

lca2007 CFP open!

lca2007 CFP open! icon

I was on a jet from Austin to London when it happened, but busy the hours beforehand in a flat in Austin nursing a hangover putting the final touches on the website. So, though I missed the chance to announce at the time, I’ll take this opportunity to blog about it now!

The LCA 2007 CFP is open, so submit your proposal for a talk, miniconf, etc now! (Or in reality, start writing your proposal now, so that you can submit it before the CFP closes ;-)

We’re looking forward to your submissions!

pylons, paste, and wsgi

I scribbled this down on our whiteboard last Friday, trying to explain how Pylons and Paste fit together. Prevously jdub and Lindsay had asked me similar questions. Until Friday, I wasn’t even sure myself.

pylons and paste stack diagram

The first thing to note is that Paste is not a framework or single library, it’s a collection of components that by themselves don’t do a lot, but with their powers combined form a set of useful and sometimes essential tools for building a web application in Python.

Paste implements an interface known as WSGI, aka the Web Server Gateway Interface. It’s defined in PEP 333. Basically WSGI describes a Chain of Command design pattern; each piece of a WSGI application takes a request, and either acts on that request or passes it along the chain. The interface described by WSGI means you can plug WSGI apps (or as Pylons calls them, /middleware/) together in any order as you like.

Why is this useful? Well, it means you can take an off-the-shelf authentication handler to cope with 403 and 401 responses and take care of logins. One would only need to say “this is how you authenticate someone” and “this is how you ask the user for their password.” Other things are possible; Pylons ships with an ultra-sexy 500 handler that puts you in a DHTML debugger, complete with traceback and Python interpreter. (Of course such a tool is a giant security hole so it is easily turned off in production environments.)

So, that’s Paste. There’s a few special cases in there, though: PasteScript and PasteDeploy. They’re special in that they tend to be at the bottom of the stack – they’re specifically for launching WSGI applications, configuration of the application (e.g. authenticatoin details alluded to above) and connecting to the application (e.g. direct HTTP, FastCGI, and other connectors). I suspect that my diagram above doesn’t lend itself well to describing how PasteScript and PasteDeploy really work; it’s still a bit of dark magic to me. I hope someone else would be able to build on this article with their own that rebuts the errors and clears the grey areas.

In a Pylons app, you tend not to notice Paste, except when deploying (because you tend to run the command paster serve to launch a development environment). Pylons itself is mostly just glue. It’s a thin veil of a framework over the top of some very powerful supporting libraries but presents them in a convenient and well defined way.

When you create a Pylons app, you get your paste middleware built for you, and then the entry point for your app is created as a WSGI application too. So it sits on top of the stack, taking in requests, and sending out responses. Your app can define its own middleware, too, so you have a lot of control over what happens between your app and the browser.

The main components of a Pylons app are:

  • A route mapper, by default Routes. The route mapper takes in URLs from the request passed into the app, and maps that URL to a controller object and method call. (If you’ve used RoR then you probably are familiar with this already.)

  • A templating engine, by default Myghty. The templating engine generates the view presented to the browser.

  • A data model. Pylons doesn’t prefer any method of data model, it just makes available a model module within which you can define your own data model. I use SQLAlchemy as an ORM because it is very powerful and is nicely suited to working with existing schemas. It works as an MVC between the data model presented to the application and the database schema itself.

Pylons lets you swap out any of these components with your own, if you desire. I find Routes and Myghty to be powerful and flexible and friendly enough that there’s no reason to want anything else.

Your controller objects, like any MVC pattern, coordinate between the model and the view. An action performed on a controller retrieves some data from the model, possibly altering it, and renders that data using the template engine.

There are other parts, other libraries that you’ll see in a Pylons app, that aren’t represented here. WebHelpers is a library of convenience functions used in the template engine, for generating common HTML and JavaScript. paste.fixture is a web app test framework that takes advantage of the common interface of WSGI to allow one to test their application without requiring a full web server and socket handling. FormEncode handles form validation, useful from within a controller object. These are but to name a few.

Unfortunately there is a sore need for overviews like this one in the Paste and Pylons community; as stated earlier I didn’t fully understand the relationships myself until I came up with this diagram. Hopefully then, dear reader, you have a better insight into how this collection of names fit together, and can avoid the steep learning curve :-)

TDD promotes good health

There’s an important advantage to Test Driven Development that I don’t think was covered on list or by Rob at his talk.

By having a test suite, you can code after a heavy liquid lunch and be sure that you’re not decreasing the quality of existing code. It makes it easier to focus on a specific task and write code to solve that problem. Having something do all the work for you when testing is a massive bonus, because obviulsy the side-effects of a pub lunch are that you are easily distracted and lack the willingness to focus on the task at hand. Test suites lower the barrier of entry to getting work done.

Who’da thought that best practices would also be best for drinking practice?

semiautomatic test generation

Whilst developing two webapps, one at work and one for the los palmas seven, I found that when writing the tests for the data model (hell yes I use test driven development, it’s very productive as it defines short and medium term goals and keeps you in focus) I was writing the same code over and over, but was very reliant on the structure of the data model I was testing.

Having done lots of currying in functional programming back in uni, I could see there was an obvious pattern, and thus it needed refactoring to reduce the amount of code I was writing and in doing so reduce the probability of error.

So I pulled out a set of functoins from one test, and made it a base class, and refactored it to allow the child class to specify the data; but due to the way python’s unittest and the test runner nose work, I had to manually write a function in each child class to call the worker functoin in the parent.

Obviously requiring the user to type more meant that there was a risk of error, and it was also tedious (read: very boring) to have to do this every time. Not knowing enough about python internals, I posted to the new slug coders list, christening it with the first development related post, and got back a good suggestion, but I didn’t try it out. Michael K suggested at the slug meeting last friday that I check out metaclasses if his previous suggestion didn’t work.

Saturday I jumped into the deep end with metaclasses, and it turned out they were easier than I thought, but having done so now I wouldn’t recommend them as a hammer to bash in all those screw and rivet problems. They are very cool though.

So I ended up with a base class that had a metaclass that would monkey patch the child class with the correct method names just as it was being inspected, and had a small test program written that demonstrated my idea was sound.

However, applying this method to the actual tests running under nose didn’t work. Lots of debugging printfs later I eventually traced this to a peculiarity in the way nose decides on what tests to run: to prevent test methods being run twice, it makes sure that when running a test, the module that it is defined in is the same as the module currently being tested, i.e. it makes sure that __module__ matches in both the callable and the current test case.

Now, when you define your methods in a parent, the method’s module is that of the parent, so a structure like the following:

tests/module/
tests/module/__init__.py
tests/module/test_something.py

sets __module__ in __init__.py to tests.module and in test_something.py to tests.module.test_something. Running a method from tests.module.test_something that’s defined in tests.module, nose says no.

Enter Benno and his knowledge of python internals. An impromptu 7 hackfest on Sunday at jdub’s house let me show him what I was trying to do, where he suggested using python’s new package, and the im_func attributes on callables, to build a workaround for nose's features, which is much better than what I was thinking of doing to solve the problem (something about injecting docstrings into the child class and then evaling them in the child’s context).

Some quick hacks later, we had a test program that showed it’d work, and patched up the base test class. Appended for your enjoyment, a base test class that one can inherit from to automatically generate common tests for tables when using SQLAlchemy.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
from unittest import TestCase

import sqlalchemy

import zookeepr.models as model


class TestBase(TestCase):
    def assertRaisesAny(self, callable_obj, *args, **kwargs):
        """Assert that the ``callable_obj`` raises any exception."""
        try:
            callable_obj(*args, **kwargs)
        except:
            pass
        else:
            self.fail("callable %s failed to raise an exception" % callable_obj)


def monkeypatch(cls, test_name, func_name):
    """Create a method on a class with a different name.

    This method patches ``cls`` with a method called ``test_name``, which
    is bound to the actual callable ``func_name``.

    In order to make sure test cases get run in children of the assortment
    of test base classes in this module, we do not name the worker methods
    with the prefix 'test_'.  Instead they are named otherwise, and we
    alias them in the metaclass of the test class.

    However, due to the behaviour of ``nose`` to not run tests that are
    defined outside of the module of the current test class being run, we
    need to create these test aliases with the model of the child class,
    rather than simply calling ``setattr``.

    (Curious readers can study ``node.selector``, in particular the
    ``wantMethod``, ``callableInTests``, and ``anytests`` methods (as of
    this writing).)

    You can't set __module__ directly because it's a r/o attribute, so we
    call ``new.function`` to create a new function with the same code as
    the original.  The __module__ attribute is set by the new.function
    method from the globals dict that it is passed, so here we make a
    shallow copy of the original and override the __name__ attribute to
    point to the module of the class we're actually testing.
    
    By this stage, you may think that this is crack.  You're right.
    But at least I don't have to repeat the same code over and
    over in the actual tests ;-)
    """
    # get the code
    code = getattr(cls, func_name).im_func.func_code
    # get the function globals so we can overwrite the module
    g = getattr(cls, func_name).im_func.func_globals.copy()
    g['__name__'] = cls.__module__
    # create a new function with:
    # the code of the original function,
    # our patched globals,
    # and the new name of the function
    setattr(cls, test_name, new.function(code, g, test_name))


class TableTestGenerator(type):
    """Monkeypatching metaclass for table schema test classes.
    
    This metaclass does some funky class method rewriting to generate
    test methods so that one doesn't actually need to do any work to get
    table tests written.  How awesome is that for TDD? :-)
    """
    def __init__(mcs, name, bases, classdict):
        type.__init__(mcs, name, bases, classdict)
        if 'table' in classdict:
            monkeypatch(mcs, 'test_insert', 'insert')
            
            for k in ['not_nullable', 'unique']:
                if k + 's' in classdict:
                    monkeypatch(mcs, 'test_' + k, k)


class TableTest(TestBase):
    """Base class for testing the database schema.

    Derived classes should set the following attributes:

    ``table`` is a string containing the name of the table being tested,
    scoped relative to the module ``zookeepr.models``.

    ``samples`` is a list of dictionaries of columns and their values to use
    when inserting a row into the table.

    ``not_nullables`` is a list of column names that must not be undefined
    in the table.

    ``uniques`` is a list of column names that must uniquely identify
    the object.

    An example using this base class:

    class TestSomeTable(TestTable):
        table = 'module.SomeTable'
        samples = [dict(name='testguy', email_address='test@example.org')]
        not_nullables = ['name']
        uniques = ['name', 'email_address']
    """
    __metaclass__ = TableTestGenerator

    def get_table(self):
        """Return the table, coping with scoping.

        Set the ``table`` class variable to the name of the table variable
        relative to anchor.model.
        """
        module = model
        # cope with classes in sub-models
        for submodule in self.table.split('.'):
            module = getattr(module, submodule)
        return module
        
    def check_empty_table(self):
        """Check that the database was left empty after the test"""
        query = sqlalchemy.select([sqlalchemy.func.count(self.get_table().c.id)])
        result = query.execute()
        self.assertEqual(0, result.fetchone()[0])

    def insert(self):
        """Test insertion of sample data

        Insert a row into the table, check that it was
        inserted into the database, and then delete it.
    
        Set the attributes for this model object in the ``attrs`` class
        variable.
        """

        self.failIf(len(self.samples) < 1, "not enough sample data, stranger")
        
        for sample in self.samples:
            print "testing insert of s %s" % sample
            query = self.get_table().insert()
            print query
            query.execute(sample)

            for key in sample.keys():
                col = getattr(self.get_table().c, key)
                query = sqlalchemy.select([col])
                print "query", query
                result = query.execute()
                print result
                row = result.fetchone()
                print "row", row
                self.assertEqual(sample[key], row[0])

            self.get_table().delete().execute()

        # do this again to make sure the test data is all able to go into
        # the db, so that we know it's good to do uniqueness tests, for example
        for sample in self.samples:
            query = self.get_table().insert()
            query.execute(sample)

        # get the count of rows
        query = sqlalchemy.select([sqlalchemy.func.count(self.get_table().c.id)])
        result = query.execute()
        # check that it's the same length as the sample data
        self.assertEqual(len(self.samples), result.fetchone()[0])

        # ok, delete it
        self.get_table().delete().execute()

        self.check_empty_table()

    def not_nullable(self):
        """Check that certain columns of a table are not nullable.
    
        Specify the ``not_nullables`` class variable with a list of column names
        that must not be null, and this method will insert into the table rows
        with each set to null and test for an exception from the database layer.
        """

        self.failIf(len(self.samples) < 1, "not enough sample data, stranger")

        for col in self.not_nullables:
            print "testing that %s is not nullable" % col
            
            # construct an attribute dictionary without the 'not null' attribute
            coldata = {}
            coldata.update(self.samples[0])
            coldata[col] = None
    
            # create the model object
            print coldata

            query = self.get_table().insert()
            self.assertRaisesAny(query.execute, coldata)

            self.get_table().delete().execute()

            self.check_empty_table()

    def unique(self):
        """Check that certain attributes of a model object are unique.

        Specify the ``uniques`` class variable with a list of attributes
        that must be unique, and this method will create two copies of the
        model object with that attribute the same and test for an exception
        from the database layer.
        """

        self.failIf(len(self.samples) < 2, "not enough sample data, stranger")

        for col in self.uniques:

            self.get_table().insert().execute(self.samples[0])

            attr = {}
            attr.update(self.samples[1])

            attr[col] = self.samples[0][col]

            query = self.get_table().insert()
            self.assertRaisesAny(query.execute, attr)

            self.get_table().delete().execute()

            self.check_empty_table()

Feel free to use this in your own code, I’m placing it in the public domain.

cityrail ticket cubohemioctahedron

A while ago, as many of you would, I saw the Metrocard tricontahedron linked from the Makezine blog.

This weekend, whilst hacking away, I took a break and was distracted by the large pile of used tickets that have built up on my desk. I found the above site again, and spend about 15 minutes folding some tickets until I emerged to the loungeroom to show off the results.

cityrail ticketcubohemioctahedron (flickr)

So what it this shape actually called? I googled ‘tricontrahedron’ (because I can’t read) and then ‘tricontahedron’, and discovered that this name is derived from the prefix tri-, the modifier conta- (which refers to a group of ten), and the suffix -hedron, meaning “faces”, so we have a 30-faced polyhedra. True so far, but actually googling or searching Mathworld for tricontahedron wasn’t showing me any graphical evidence that this was the right name. In fact, the closest by name, the rhombic tricontahedron, looks nothing like this shape.

But finally I stumbled on a set of excellent polyhedra sites:

Thanks to FreeWRL I managed to finally find the cubohemioctahedron, which although the image on that linked page doesn’t look terribly much like our shape, upon loading the VRML for the polyhedron I was able to convince myself that this is the one.