Naming Screen Sessions

I develop a number of Django-powered websites at work, and usually I want to leave them running when I’m not working on them so others can check out my progress and give me suggestions. The Django development server is incredibly useful when developing, but it’s not detached from the terminal so as soon as you log out the server gets switched off. One alternative is to run the website under Apache, as you would deploy it normally. This solves the problem of leaving the website running, but makes it much harder to develop with.

A third option is the GNU program Screen. When run without arguments, screen puts you into a new bash session. Pressing Ctrl+d drops you back out to where you were. The magic occurs when you press Ctrl+a d. This drops you back out, but the bash session is still running! By typing screen -r you’ll reattach to the session and can carry on working as before. You can leave it as long as you like between detaching and reattaching to a session, as long as the computer is still running.

It is possible to run multiple screen sessions at once, perhaps with a different Django development server running in each. Unfortunately screen will only reattach automatically when there is just one detached session. If you have more than one then you’ll be confronted by a cryptic series of numbers that uniquely identifies each session. To reattach to a specific session you can type screen -r <pid>.

To make it easier to reattach to the session that I’m working on, I give these sessions names, so rather than a cryptic series of numbers I see a useful set of names. To do this you just need to type Ctrl+a : sessionname name.
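Sessions can also be named when they are started, rather than from inside a running session. The sketch below only builds the command lines; the helper names and the "blog" session name are made up for illustration, and it assumes screen’s -S (session name), -d -m (start detached) and -r (reattach) options.

```python
import shlex


def start_named_session(name, command):
    """Build the argv that starts `command` in a detached screen session."""
    # -S gives the session a name; -d -m starts it without attaching to it.
    return ["screen", "-S", name, "-d", "-m"] + shlex.split(command)


def reattach(name):
    """Build the argv that reattaches to the session called `name`."""
    return ["screen", "-r", name]


start_cmd = start_named_session("blog", "python manage.py runserver")
reattach_cmd = reattach("blog")

# Print the commands instead of running them, so this works without screen.
# Passing either list to subprocess.run(..., check=True) would execute it.
print(" ".join(start_cmd))
print(" ".join(reattach_cmd))
```

With one named session per project, reattaching becomes a matter of remembering the project name rather than a process ID.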

There are plenty of other useful things that screen can do, but naming sessions is by far the feature I use most often.


Where GitHub (Possibly) Went Wrong


While on my delayed train this morning I was listening to episode 80 of the excellent Stack Overflow podcast. In this episode Jeff Atwood was complaining to Joel Spolsky about his problems with GitHub.

GitHub is a social coding site, along the same lines as Sourceforge or Google Code, but focused entirely on the distributed version control system Git. Where GitHub differs from the other project hosting sites, and where I think Jeff’s confusion comes from, is that the primary structure on the site is the developer, not the project. They treat every developer as a rock star who is bigger than the projects that they work on.

GitHub makes it incredibly easy to take a codebase, make your own changes and publish them to the world. What GitHub fails to do is encourage people to collaborate to push one codebase forward. I’m not suggesting that branching is a bad idea. Branching code is a useful coding technique which can be used to separate in-development features from other changes until the code has stabilised again. What GitHub focuses on is the changes that an individual developer makes, not the changes required for a particular feature.


Searching Stemmed Fields With Whoosh

Whoosh is quite a nice pure-Python full-text search engine. While it is actively developed and suitable for production use, there are still some rough edges. One problem that stumped me for a while was searching stemmed fields.

Stemming is where you take the endings off words, such as the ‘ings’ on the word ‘endings’. This reduces the accuracy of searches but greatly increases the chances of users finding something related to what they were looking for.

To create a stemmed field you need to tell Whoosh to use the StemmingAnalyzer, as shown in the schema definition below.

from whoosh.analysis import StemmingAnalyzer
from whoosh.fields import Schema, TEXT, ID

# The text field is indexed through the stemming analyzer.
schema = Schema(id=ID(stored=True, unique=True),
                text=TEXT(analyzer=StemmingAnalyzer()))
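
To make the trade-off concrete, here is a deliberately naive suffix stripper. It is not the algorithm Whoosh uses, and the function names and suffix list are invented for this illustration; it only shows how reducing words to a stem lets related words match while also producing the occasional false match.

```python
# Longest suffixes first, so "ings" is tried before "ing" and "s".
SUFFIXES = ("ings", "ing", "ed", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, text):
    """Return True if the stemmed query equals any stemmed word in text."""
    stems = {naive_stem(w) for w in text.split()}
    return naive_stem(query) in stems


# Related words collapse to the same stem, so a search for one finds the other.
print(naive_stem("endings"), naive_stem("ending"), naive_stem("ended"))
print(matches("ending", "Happy endings are rare"))

# The cost is accuracy: unrelated words can collapse to the same stem too.
print(naive_stem("news") == naive_stem("new"))
```

A real stemmer applies far more careful rules, but the same effect remains: the index stores stems, so queries need to be stemmed the same way before they are compared.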

Custom Podcasts With MythTV

I love listening to both BBC Radio 4 and BBC 6 Music. As with the rest of the BBC radio stations, a significant proportion of their shows are available as podcasts. Unfortunately this is not true of all the shows, and for those that feature music, such as Adam & Joe or Steve Lamacq, the podcasts contain only the talking.

I watch almost all of my TV through MythTV, which records all of my favourite shows automatically. On my way to work I like to listen to podcasts that are downloaded automatically by iTunes. Would it be possible to automatically record shows with MythTV that aren’t available as podcasts and sync them to my iPhone automatically?

Recording a radio show with MythTV is no different to recording a TV show so that’s not a problem. MythTV also provides the ability to run a script after certain shows have been recorded. All that is required is a script that converts the recording into an mp3 file and builds an RSS feed which can be read by iTunes.
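
As a rough idea of the feed-building half of such a script, the sketch below uses only the Python standard library. It is not the script from this post: the function name, the episode dictionary keys and the example.com URLs are all placeholders, and the mp3 conversion step is left out entirely.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, episodes):
    """Return an RSS 2.0 document with one <item> per recorded episode."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title

    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        ET.SubElement(item, "guid").text = episode["url"]
        # RSS dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(episode["recorded"])
        # The enclosure is the file a podcast client actually downloads.
        ET.SubElement(item, "enclosure", url=episode["url"],
                      length=str(episode["size"]), type="audio/mpeg")

    return ET.tostring(rss, encoding="unicode")


# Hypothetical episode data; a real script would gather this per recording.
episodes = [{
    "title": "Example show",
    "url": "http://example.com/podcasts/example-show.mp3",
    "size": 12345678,
    "recorded": datetime(2010, 1, 9, 10, 0, tzinfo=timezone.utc),
}]
feed = build_feed("My MythTV recordings", "http://example.com/podcasts/", episodes)
print(feed)
```

The feed and the mp3 files then need to be served from a URL that iTunes can subscribe to.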


CouchDB Document Cache

It’s well known that one of the best things you can do to speed up CouchDB is to use bulk inserts to add or update many documents at one time.

Bulk updates are easy to use if you’re just blindly inserting documents into the database because you can just maintain a list of documents. However, a scheme that I often use is to call a view to determine whether a document representing an object exists, then update it if it does or add a new document if it doesn’t. To help make this easier I use the DocCache class given below.

The cache contains two interesting methods, get and update. Rather than writing directly to CouchDB when you want to add or update a document, just pass the document to update. This will cache the document and periodically save the cached documents in a bulk update.

It is possible that you will retrieve a document from CouchDB for which an updated version exists in the cache. To avoid the possibility that changes get lost you should pass the retrieved document to get. This will either return the document you passed in or the document that’s waiting to be saved if it exists in the cache. Because there is a gap between when you ask for a document to be saved and when it actually is saved, any views you use may be out of date, but that’s the cost of faster updates with CouchDB.
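
The behaviour described above can be sketched in a few lines. This is a minimal reconstruction from the description, not the original class: it assumes every document is a dictionary with an "_id" key, that the database object exposes a bulk update(docs) method (as couchdb-python’s Database does), and that "periodically" means flushing once a size limit is reached. RecordingDatabase is a stand-in so the example runs without a CouchDB server.

```python
class DocCache:
    """Buffer document writes and send them to the database in bulk."""

    def __init__(self, db, limit=100):
        self.db = db
        self.limit = limit      # assumed flush threshold
        self.pending = {}       # documents waiting to be saved, keyed by _id

    def get(self, doc):
        """Return the pending version of doc if one exists, else doc itself."""
        return self.pending.get(doc["_id"], doc)

    def update(self, doc):
        """Queue doc for saving, flushing once the limit is reached."""
        self.pending[doc["_id"]] = doc
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        """Save all pending documents in one bulk call."""
        if self.pending:
            self.db.update(list(self.pending.values()))
            self.pending = {}


class RecordingDatabase:
    """Hypothetical stand-in that records each bulk update it receives."""

    def __init__(self):
        self.batches = []

    def update(self, docs):
        self.batches.append(docs)


db = RecordingDatabase()
cache = DocCache(db, limit=2)
cache.update({"_id": "a", "count": 2})

# A stale copy fetched from a view is swapped for the pending version.
stale = {"_id": "a", "count": 1}
current = cache.get(stale)

cache.update({"_id": "b", "count": 1})  # reaches the limit and flushes
```

Remember to call flush once at the end of a run, otherwise any documents still in the cache are never written.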
