wrapping CGI applications in WSGI
We’ve got a large body of “legacy” code that our staff use to track most of our business. It’s a whole lot of Python CGI built on some custom HTML and DB frameworky code. It’s pretty ugly, and having become a convert to the cult of Pylons, WSGI, and SQLAlchemy, I really want to replace it.
Of course, anyone knows that one of the Things You Should Never Do is rewrite from scratch. Even in the same language.
It would be much easier to integrate the old app into a new Pylons app, have them running side by side, and slowly deprecate the old one as new interfaces are written. (This is still not a perfect idea, as demonstrated by the 4-year-old TCL code that the current app was meant to replace, which is still running in production ;-) As bugs in the old code are found, we can either beat our heads against brick walls or replace just that functionality with a sane data model, similar-looking templates, and shiny new controller smarts. No-one would be the wiser, except of course that for some reason the developers are no longer constantly grumpy and the webapp is running smoother and faster than before, and crashing less often…
It occurred to me yesterday that the best way to get a legacy CGI app to run alongside Pylons is to convert it to a WSGI application, and just mash it in at the bottom of the application stack, where Pylons would normally go when it 404s.
Here’s the result of some free time and caffeinated excitement this morning:
Our CGI apps print to stdout, as you’d expect, so we need to trap that, here done with a StringIO monkeypatched on top of sys.stdout. We also need to hack sys.exit out of the way, so that the CGIs don’t quit before we’ve completed the WSGI protocol. (I think this might cause some bugs in the execution though, because now it’s not terminating execution of the module, but I haven’t found an example yet to bother worrying about it.)
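As a rough illustration of the trapping described above, here is a minimal sketch written for current Python 3. The function names capture_cgi_output and respond_with_cgi_output are made up for this example, and the header parsing assumes the script separates headers from body with a blank line written by print; it is one plausible shape for the idea, not the wrapper itself.

```python
import io
import sys


def capture_cgi_output(run_script):
    """Run a callable with sys.stdout and sys.exit patched; return what it printed."""
    saved_stdout, saved_exit = sys.stdout, sys.exit
    sys.stdout = io.StringIO()
    # No-op exit: the script carries on past sys.exit(), which is the
    # possible source of bugs mentioned in the text.
    sys.exit = lambda *args, **kwargs: None
    try:
        run_script()
        return sys.stdout.getvalue()
    finally:
        # Always put the real stdout and exit back, even if the script raised.
        sys.stdout, sys.exit = saved_stdout, saved_exit


def respond_with_cgi_output(output, start_response):
    """Split captured CGI output into headers and body, then answer via WSGI."""
    head, _, body = output.partition("\n\n")
    status = "200 OK"  # CGI default when the script sends no Status header
    headers = []
    for line in head.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "status":
            status = value.strip()
        else:
            headers.append((name.strip(), value.strip()))
    start_response(status, headers)
    return [body.encode("utf-8")]
```

Restoring both attributes in a finally block matters in a long-running WSGI process, since a leaked StringIO on sys.stdout would swallow output from every later request.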
I import the script, rather than using os.system, because it feels right. I use imp.load_module rather than import because we don’t know what the script is until runtime :)
The real trick comes from a tip I found here, whilst looking for how to run the imported module as __main__. Just imp.load_module and tell it that it’s __main__! Simple!
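The imp module was removed from the standard library in Python 3.12, so the trick can't be shown verbatim on a current interpreter. The sketch below swaps in runpy.run_path, which also executes a file chosen at runtime under the name __main__. The helper name run_as_main is hypothetical.

```python
import runpy


def run_as_main(script_path):
    """Execute the file at script_path as if it had been run as a script.

    Inside the file, __name__ is "__main__", so the usual
    `if __name__ == "__main__":` block fires. Returns the resulting
    module globals as a dict.
    """
    return runpy.run_path(script_path, run_name="__main__")
```

Because the path is an ordinary string, the script to run can be picked per request, which is the same reason given above for not using a plain import statement.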
(The hardest part about this whole exercise was fiddling with sys.path and the CWD to make sure the imported script was running with the environment that the CGIs used to expect. This is all done in the CGI runner dispatch.cgi, which I won’t copy here because it’s pretty trivial and well documented in the WSGI spec.)
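Since dispatch.cgi isn't reproduced, the following is only a guess at what the sys.path and CWD fiddling could look like: a small context manager that gives the script the directory and import path it expects, then puts everything back. The name cgi_environment and the script_dir parameter are assumptions for illustration.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def cgi_environment(script_dir):
    """Temporarily chdir into script_dir and put it first on sys.path."""
    saved_cwd = os.getcwd()
    saved_path = list(sys.path)
    os.chdir(script_dir)
    sys.path.insert(0, script_dir)
    try:
        yield
    finally:
        # Restore the process-wide state so later requests are unaffected.
        os.chdir(saved_cwd)
        sys.path[:] = saved_path
```

Both the working directory and sys.path are process-wide, so a sketch like this is only safe when one request runs at a time, as it does under a CGI runner.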