03 Feb 2010
    I develop a number of Django-powered websites at work, and usually I want to leave them running when I’m not
working on them so others can check out my progress and give me suggestions. The Django development server is
incredibly useful when developing, but it’s not detached from the terminal so as soon as you log out the
server gets switched off. One alternative is to run the website under Apache, as you would deploy it normally.
This solves the problem of leaving the website running, but makes it much harder to develop with.
A third option is the GNU program Screen. When run without arguments
screen puts you into a new bash session. Pressing Ctrl+d drops you back out to where you were. The magic
occurs when you press Ctrl+a d. This drops you back out, but the bash session is still running!
By typing screen -r you’ll reattach to the session and can carry on working as before. You can leave it as
long as you like between detaching and reattaching to a session, as long as the computer is still running.
It is possible to run multiple screen sessions at once, perhaps with a different Django development server
running in each. Unfortunately screen will only reattach automatically when there is just one detached
session. If you have more than one then you’ll be confronted by a cryptic series of numbers that uniquely
identifies each session. To reattach to a specific session you can type screen -r <pid>.
To make it easier to reattach to the session that I’m working on, I give these sessions names, so rather than a
cryptic series of numbers I see a useful set of names. To do this you just need to type Ctrl+a :
sessionname name.
There are plenty of other useful things that screen can do, but named sessions are by far the feature that I
use most.
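The same workflow can also be scripted. The sketch below is only an illustration: the helper names are made up, the session name mysite and the manage.py command are placeholders, and it assumes screen's -dmS (start detached with a session name), -ls and -r options. It only builds the command lines rather than running them.

```python
def start_command(name, command):
    """Build the argv that starts `command` in a detached screen session called `name`."""
    # -d -m: start the session detached; -S: give it a readable name.
    return ["screen", "-dmS", name] + list(command)


def list_command():
    """Build the argv that lists the running sessions."""
    return ["screen", "-ls"]


def reattach_command(name):
    """Build the argv that reattaches to the session called `name`."""
    return ["screen", "-r", name]


# Placeholder session name and command; adjust both for a real project.
argv = start_command("mysite", ["python", "manage.py", "runserver"])
print(" ".join(argv))
print(" ".join(reattach_command("mysite")))
```

Passing one of these lists to subprocess.call would actually run it, provided screen is installed.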
   
  
  
    
    27 Jan 2010
    
While on my delayed train this morning I was listening to episode
80 of the excellent Stack Overflow
podcast. In this episode Jeff Atwood was complaining to Joel Spolsky about
his problems with GitHub.
GitHub is a social coding site, along the same lines as SourceForge or Google Code, but focused entirely on
the distributed version control system Git. Where GitHub differs from the other project hosting sites, and
where I think Jeff’s confusion comes from, is that the primary structure on GitHub is that of the developer,
not of the project. They treat every developer as a rock star, who is bigger than the projects
that they work on.
GitHub makes it incredibly easy to take a codebase, make your own changes and publish them to the world. What
GitHub fails to do is to encourage people to collaborate to push one codebase forward. I’m not
suggesting that branching is a bad idea. Branching code is a useful coding technique which can be used to
separate in-development features from other changes until the code has stabilised again. What GitHub focuses
on is the changes that an individual developer makes, not the changes required for a particular feature.
   
  
  
    
    21 Jan 2010
    Whoosh is quite a nice pure-Python full-text search engine. While it is still being
actively developed and is suitable for production usage, there are still some rough edges. One problem that
stumped me for a while was searching stemmed fields.
Stemming is where you take the endings off words, such as ‘ings’ on the word endings. This reduces the
accuracy of searches but greatly increases the chances of users finding something related to what they were
looking for.
To create a stemmed field you need to tell Whoosh to use the StemmingAnalyzer, as
shown in the schema definition below.
from whoosh.analysis import StemmingAnalyzer
from whoosh.fields import Schema, TEXT, ID

# The text field is analysed with StemmingAnalyzer, so its terms are indexed as stems.
schema = Schema(id=ID(stored=True, unique=True),
                text=TEXT(analyzer=StemmingAnalyzer()))
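
To make the accuracy trade-off described above concrete, here is a toy illustration. It is not Whoosh's stemmer: the suffix list and the toy_stem and matches helpers are invented for this example and are far cruder than a real stemming algorithm. It shows both the benefit (endings and ending meet at the same stem) and the cost (news collapses onto new).

```python
# Toy suffix list, checked longest first; an arbitrary choice for illustration.
SUFFIXES = ("ings", "ing", "es", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, text):
    """Return True if any word in `text` shares a stem with the query word."""
    target = toy_stem(query)
    return any(toy_stem(word) == target for word in text.split())


# Different forms of a word meet at the same stem, so more documents match...
print(toy_stem("endings"), toy_stem("ending"))
print(matches("searching", "full text search engine"))
# ...but unrelated words can collide too, which is the loss of accuracy.
print(toy_stem("news"), toy_stem("new"))
```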
 
   
  
  
    
    17 Dec 2009
    I love listening to both BBC Radio 4 and BBC 6 Music. As with the rest of the BBC radio stations, a significant
proportion of the shows are available as podcasts. Unfortunately this is not true of all the shows, and for
those that feature music, such as Adam & Joe or Steve Lamacq, the podcasts contain only the talking.
I watch almost all of my TV through MythTV, which records all of my favourite shows automatically, while on my
way to work I like to listen to podcasts that are downloaded automatically by iTunes. Would it be possible to
automatically record shows with MythTV that aren’t available as podcasts and sync them to my iPhone
automatically?
Recording a radio show with MythTV is no different to recording a TV show so that’s not a problem. MythTV also
provides the ability to run a script after certain shows have been recorded. All that is required is a script
that converts the recording into an MP3 file and builds an RSS feed which can be read by iTunes.
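As a rough sketch of the feed-building half of such a script (the MP3 conversion is left out), the following uses only the Python standard library to write a minimal RSS 2.0 feed with one enclosure per recording. The build_feed function, the episode dictionary keys and the example.com URLs are all assumptions made for illustration, and a real feed for iTunes would probably want more metadata than this.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, episodes):
    """Return an RSS 2.0 document as a string, with one <item> per episode.

    Each episode is a dict with the keys title, url, size (in bytes)
    and recorded (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        ET.SubElement(item, "guid").text = episode["url"]
        # RSS dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(episode["recorded"])
        # The enclosure is what a podcast client actually downloads.
        ET.SubElement(item, "enclosure", url=episode["url"],
                      length=str(episode["size"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")


# Placeholder values; a real script would take these from the recording.
feed = build_feed(
    "Recorded radio",
    "http://example.com/radio/",
    [{
        "title": "Example show",
        "url": "http://example.com/radio/example-show.mp3",
        "size": 1234,
        "recorded": datetime(2009, 12, 17, 9, 0, tzinfo=timezone.utc),
    }],
)
print(feed)
```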
   
  
  
    
    29 Sep 2009
    It’s well known that one of the best things you can do to speed up CouchDB is to use bulk
inserts to add or update many documents at one
time.
Bulk updates are easy to use if you’re just blindly inserting documents into the database because you can just
maintain a list of documents. However, a scheme that I often use is to call a view to determine whether
a document representing an object exists, update it if it does, and add a new document if it doesn’t. To help make
this easier I use the DocCache class given below.
The cache contains two interesting methods, get and update. Rather than writing directly to CouchDB when
you want to add or update a document, just pass the document to update. This will cache the document and
periodically save the cached documents in a bulk update.
It is possible that you will retrieve a document from CouchDB for which an updated version exists in the cache. To
avoid the possibility that changes get lost you should pass the retrieved document to get. This will either
return the document you passed in or the document that’s waiting to be saved if it exists in the cache.
Because there is a gap between when you ask for a document to be saved and when it actually is saved, any views
you use may be out of date, but that’s the cost of faster updates with CouchDB.
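The DocCache class itself is not reproduced in this excerpt, so the following is only a minimal sketch of how a cache with the get and update behaviour described above might look. The class name BulkDocCache, the limit parameter and the flush method are hypothetical. It assumes every document already carries an _id and that the database object exposes an update method that takes a list of documents and saves them in bulk (couchdb-python's Database.update works this way). RecordingDB is a stand-in so the example runs without a server.

```python
class BulkDocCache:
    """Hypothetical sketch; not the DocCache class from the original post."""

    def __init__(self, db, limit=100):
        self.db = db        # assumed to expose update(list_of_docs) for bulk saves
        self.limit = limit  # number of pending documents that triggers a save
        self._pending = {}  # maps _id to the document waiting to be saved

    def get(self, doc):
        """Return the pending version of `doc` if one is cached, else `doc` itself."""
        return self._pending.get(doc["_id"], doc)

    def update(self, doc):
        """Queue `doc` for the next bulk save, saving once the limit is reached."""
        self._pending[doc["_id"]] = doc
        if len(self._pending) >= self.limit:
            self.flush()

    def flush(self):
        """Write all pending documents in a single bulk update."""
        if self._pending:
            self.db.update(list(self._pending.values()))
            self._pending = {}


class RecordingDB:
    """Stand-in for a real database object; it just records each bulk save."""

    def __init__(self):
        self.batches = []

    def update(self, docs):
        self.batches.append(list(docs))


db = RecordingDB()
cache = BulkDocCache(db, limit=2)
cache.update({"_id": "a", "colour": "red"})
stale = {"_id": "a", "colour": "blue"}  # e.g. fetched from a view that is out of date
print(cache.get(stale)["colour"])       # the pending version takes precedence
cache.update({"_id": "b", "colour": "green"})  # reaches the limit, so one bulk save happens
print(len(db.batches))
```

Anything still pending when the program ends needs a final call to flush, otherwise those changes are never written.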