Thursday, February 28, 2008

Simple, complete example of Python __getstate__ and __setstate__

I've been doing serialization and/or object-relation mapping in languages like C++ and Java for at least 15 years. I've known about Python's serialization facility (the pickle and cPickle modules) for as long as they've existed, but I've never had a need to use them. Recently, I needed to pickle an object to store in memcached to reduce database traffic.

Wouldn't you know it - the first class I try to pickle throws an exception because it contains some attributes that can't be serialized. I couldn't figure out where the problem was because the exception and traceback didn't include the name of the attribute that contained the threading lock that couldn't be serialized. However, a quick look at the code revealed a couple of suspects.

Even though I didn't know which attributes were causing the problem, I knew that the only solution would be to take control of the serialization process. Once I could pick and choose which attributes were being pickled, I could search for the offender(s). As it turned out, both of my initial suspects were guilty of evading pickling.

From the documentation on pickling, I could see that I needed to implement the __getstate__ and __setstate__ methods, but it wasn't clear what those methods need to look like. I found an example online, but the guy was having problems (it was posted to a mailing list), and as I implemented my own methods, I realized what his problem was. So, here's the code:

    def __getstate__(self):
        result = self.__dict__.copy()
        del result['log']
        del result['cfg']
        return result

The problems I was having with pickling were caused by the logging and configuration attributes. These needed to be removed from the object's state before pickling. Fortunately, they're not unique to the instance, so they're easy to recreate during unpickling.

As you can tell, __getstate__ returns a dictionary of the object's state. By default (if you didn't implement the method), this is just the __dict__ member. To exclude some attributes, we just need to delete the keys from the dictionary. However, the crucial step is that you have to make a (shallow) copy of __dict__ first. Otherwise, deleting the keys from the dictionary is the same as deleting the attributes from the instance, which would be bad. (This is where the other example I found online failed - he didn't make a copy.)
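To make the copy issue concrete, here is a minimal sketch. The class names NoCopy and WithCopy are made up for illustration, and the 'log' attribute is just a string standing in for a real logger. It calls __getstate__ directly, the way pickle would, and then checks what happened to the live object.

```python
class NoCopy(object):
    """Hypothetical class whose __getstate__ forgets to copy __dict__."""

    def __init__(self):
        self.name = 'example'
        self.log = 'stand-in for an unpicklable logger'

    def __getstate__(self):
        result = self.__dict__      # no copy: this IS the instance's dictionary
        del result['log']           # ...so this deletes self.log as well
        return result


class WithCopy(NoCopy):
    """Same class, but __getstate__ works on a shallow copy."""

    def __getstate__(self):
        result = self.__dict__.copy()
        del result['log']
        return result


broken = NoCopy()
broken.__getstate__()                          # what pickle calls for us
broken_lost_log = not hasattr(broken, 'log')   # the live object lost its attribute

safe = WithCopy()
safe_state = safe.__getstate__()
safe_kept_log = hasattr(safe, 'log')           # the live object is untouched
```

In both cases the returned state has no 'log' key; the only difference is whether the original instance survives the call intact.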

The __setstate__ method is the reverse, only we don't have to mess with copies:

    def __setstate__(self, state):
        self.__dict__ = state
        # Recreate the attributes that __getstate__ left out.
        self.cfg = getConfig()
        self.log = getLog()
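For anyone who wants something to paste and run, here is a sketch of the two methods in a whole class, with a round trip through pickle. It assumes Python 3. The class name Job and the bodies of getConfig() and getLog() are stand-ins I made up for this example (the real ones aren't shown in this post); the stand-in configuration just holds a threading lock, which pickle refuses to serialize.

```python
import logging
import pickle
import threading


def getLog():
    # Stand-in for the real getLog(); returns a shared logger.
    return logging.getLogger('example')


def getConfig():
    # Stand-in for the real getConfig(); the lock makes it unpicklable.
    return {'lock': threading.Lock()}


class Job(object):
    def __init__(self, name):
        self.name = name
        self.cfg = getConfig()
        self.log = getLog()

    def __getstate__(self):
        result = self.__dict__.copy()   # shallow copy; leave self untouched
        del result['log']
        del result['cfg']
        return result

    def __setstate__(self, state):
        self.__dict__ = state
        self.cfg = getConfig()          # recreate what was left out
        self.log = getLog()


original = Job('nightly')
restored = pickle.loads(pickle.dumps(original))
```

After the round trip, restored has the same name as original plus a freshly created cfg and log, and original still has all three of its attributes.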


Enjoy,
Charles.

Saturday, February 23, 2008

Mac OS X 10.5.2 did not completely fix stacks

Apple's newest update to Leopard, 10.5.2, has greatly improved the new Stack feature by adding a hierarchical list view, but it is still not as functional as the list view in Tiger. As I noted before, I created my own directory that has a collection of aliases (symbolic links) to the applications I use most frequently, as well as links to Applications, Utilities, and the LocalApps directory where I install third-party applications. The problem is that even with the 10.5.2 update, the list view does not follow the symbolic links, so my folder of links is basically useless.

I'm still pleased with the list view - it is a huge improvement, but I won't be totally satisfied until it follows links.

Charles.

Sunday, February 17, 2008

Simply Mercurial

Mercurial is the easiest revision control system I've used, "and so can you" to quote Stephen Colbert. I became interested in the idea of a distributed SCM tool in order to keep my revision history with me while I'm on the road and not necessarily connected. I would have assumed that to get that power, the tool would be more complex - you can't get something for free, right? However, Mercurial is so easy to use, I'm using it for simple one-off revision needs.

Consider the case of a lone developer with a modest number of files to keep track of. To use Mercurial, all he needs to do is change into the directory where the files are and run:

    hg init

That creates a repository, hidden in the .hg subdirectory, and sets the directory up as a working directory. The hg status command shows that none of the files is under control, yet. Running hg add * (or whatever subset of the files is appropriate) marks all of the files to be added to the repository. Finally, hg commit commits the files.

The real beauty was in that first step - hg init. That is so much easier than CVS or Subversion where you either have to create a new repository or figure out where in an existing repository you want to put these files. And it's easier than the dinosaurs, RCS and SCCS, where you have to set up subdirectories to hold the version files in every subdirectory - not to mention the fact that those tools don't really deal with multiple users.

Mercurial is about as simple as can be, and if you never work with multiple developers and never pass changes around between developers and repositories, then it stays that simple. Period.


enjoy,
Charles.

Saturday, February 02, 2008

Complete example of __getattr__ in Python

I've always known about the __getattr__ function on classes in Python and how it could theoretically be used. However, I never had a real need for it, and so I had never actually implemented one. For whatever reason, it was a tad more difficult than I thought, so I figured I'd share an example with y'all.

In case you don't already know, the idea is that when code references an attribute of an object (e.g., obj.x) and the normal lookup doesn't find it, __getattr__ gets called to fetch or compute the value of the attribute. This function can do almost anything, but you must be careful when referencing attributes of self inside it, because any attribute that doesn't exist yet will trigger a recursive call to __getattr__.

In my case, I was writing some code for unit testing. I needed to create a mock object that's used to store configuration information for the system. In the real object, every configuration attribute is initialized from an ini file parsed by ConfigParser in the constructor. For testing, I didn't want to have a huge configuration file for every test. So, I wanted to create a system that performed lazy initialization of the data attributes - i.e., only look in the ini file if we actually need a given item, and if the attribute is never referenced, we never need to fetch it from the ini file. Therefore, the ini file only needs the attributes that are actually used by a given test. Implementing __getattr__ is the way to hook into the process to provide this lazy initialization.

The basic outline/algorithm is:
  1. If the attribute already exists on self, return that value
  2. Fetch/compute the missing value
  3. Store the value on self for subsequent use
  4. Return the value
The key to making this work and avoiding infinite recursion is the __dict__ attribute, which is a (regular) dictionary, the keys of which are attributes that exist on the object and the values are the values of the attributes. We can access these keys and values without going through __getattr__, thus avoiding recursion.



    def __getattr__(self, attrName):
        # Go through __dict__ directly to avoid recursing into __getattr__.
        if attrName not in self.__dict__:
            value = self.fetchAttr(attrName) # computes the value
            self.__dict__[attrName] = value
        return self.__dict__[attrName]



It's pretty straightforward. In retrospect, I'm not sure what tripped me up when I first went to implement it. In the end, the fetchAttr function ended up being pretty fancy, but I'll write more about that later. You gotta love a dynamic language like Python that makes this as simple as it is, even if it does require a bunch of underscores.
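Until I get around to that write-up, here is a rough sketch of how the pieces could fit together in a whole class. It is not my real mock object: the class name LazyConfig, the ini contents, the section name, and this simple fetchAttr are all made up for illustration, and it is written for Python 3, where the parser module is named configparser. The fetch_count attribute is only there so you can see the caching happen.

```python
import configparser

# Hypothetical ini contents for one test: only the items this test touches.
INI_TEXT = """
[main]
host = localhost
port = 8080
"""


class LazyConfig(object):
    """Mock configuration object that reads items from the ini data on demand."""

    def __init__(self, ini_text, section='main'):
        parser = configparser.ConfigParser()
        parser.read_string(ini_text)
        # Plain assignments do not go through __getattr__.
        self._parser = parser
        self._section = section
        self.fetch_count = 0

    def fetchAttr(self, attrName):
        # Simplified stand-in: look the name up as an option in one section.
        self.fetch_count += 1
        try:
            return self._parser.get(self._section, attrName)
        except (configparser.NoSectionError, configparser.NoOptionError):
            raise AttributeError(attrName)

    def __getattr__(self, attrName):
        # Reached only after normal attribute lookup has failed.
        if attrName.startswith('_'):
            # Never fetch private names; this prevents endless recursion
            # if _parser itself is missing for some reason.
            raise AttributeError(attrName)
        if attrName not in self.__dict__:
            self.__dict__[attrName] = self.fetchAttr(attrName)
        return self.__dict__[attrName]


cfg = LazyConfig(INI_TEXT)
first = cfg.host      # fetched from the ini data and cached on the instance
second = cfg.host     # found by normal lookup; __getattr__ is not called
```

Raising AttributeError for items that aren't in the ini data keeps hasattr() and getattr() with a default behaving the way callers expect.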


Enjoy,
Charles.

Thursday, January 24, 2008

Remove Office on Mac sux

The Remove Office program that comes with the trial version of Microsoft Office just sucks. The two times I've used it, it's failed. The first time, I had my old iMac connected via Firewire (target disk mode) to my new iMac. I ran Remove Office on the new iMac to remove the trial version, and it removed the non-trial version on my old iMac - without even asking me which version to remove. Now, on my new MacBook, I tried running Remove Office (without any other computers connected), and it couldn't find the trial version to remove - never mind the fact that I was running Remove Office from the folder containing Office.

My advice to anyone is to just use 'rm -r' from the Terminal window. If you just drag it to the trash (without emptying the trash), the trial version will keep launching.

enjoy,
Charles.

Sunday, January 20, 2008

Exporting from ExamView to Moodle

I've just started using Moodle for a course (more on that later), and I wanted to import some questions I wrote using ExamView into Moodle. Although our version of Moodle (1.8?) suggests that it can import ExamView questions, it failed for me. I came across this post that suggests that it isn't even supported - at least not by the company that makes ExamView. Based on that post, here's what I ended up doing:
  1. From ExamView Pro (version 4.0.8) export as Blackboard 5.x, which is a zip file.
  2. Unzip the zip file.
  3. Import the .dat file into Moodle using the Blackboard format, not Blackboard V6+.
(And, all of this is made more complicated by the fact that I've basically abandoned the PC for the Mac, and I only have a version of ExamView for the PC. Praise be to VMware for Fusion.)

There may be a simpler way to do this, but it works, which is all that matters right now. It would be swell if I could automate the process...

enjoy,
Charles.

Friday, October 12, 2007

Subversion on the Mac

If you've ever used Subversion (SVN) on Windows, chances are you've used TortoiseSVN. (If you haven't, definitely check it out.) It works as an extension to Windows Explorer. What this means is that you don't have a separate UI for SVN - you do everything in Explorer. I was really struck by how cool this was when I was working on an old project using an old version of Perforce; it seemed barbaric to have to keep switching between the SCM UI and Explorer.

Until recently, there wasn't anything like that for the Mac, at least not anything that was free/open source and that I knew about. Then, along comes SCPlugin. From the web page: "The goal of the SCPlugin project is to integrate Subversion into the Mac OS X Finder. The inspiration for this project came from the TortoiseSVN project."

As of this writing, it's at version 0.7, which means it has a few rough spots, but for the light use I've put it to so far, it's been mostly pretty good. And since it's 0.7, it has nowhere to go but up.

enjoy,
Charles.

Thursday, October 04, 2007

Seven Habits of Effective Text Editing - with Vim

I'm a big fan of vim. It's the Unix vi editor with a large number of improvements. The biggest thing I like about it is that it runs pretty much everywhere - Unix/Linux, Mac, and Windows. Having a powerful, familiar editor on Windows was a huge technological leap forward for me when I spent a lot of time on Windows, and it's still very useful on those occasions when I'm "stuck" on Windows.

I recently stumbled across an old article (Nov 2002) from the creator of vim, Bram Moolenaar, that describes how to use vim more effectively. The big thing I learned from it is how to use ctrl-N to complete things like identifiers in programming languages - e.g., type "read" and it expands to "readlines" and offers you a list of other options in your program. Granted, this is old hat for fancy-ass IDEs like Eclipse and Netbeans, but I had no idea that little old vim had it. I guess that's what they mean when they say vim stands for "vi improved."

enjoy,
Charles.

Tuesday, September 25, 2007

Very non-intuitive error message from SVN

I just got the following error message out of Subversion (on the command line under Linux):


sh: ./svn-commit.tmp: Permission denied
svn: Commit failed (details follow):
svn: system(' svn-commit.tmp') returned 32256


It sounds like a file permission problem, but I couldn't find any problems - I was working in my own directory that I checked out of SVN. I ran strace to see what was going on, and right about the time that the Shinola hit the fan, I could see that it was trying to fire up an editor for the check-in comment. I looked, and I did not have the EDITOR environment variable set. Once I set it, it worked fine.

Let's see, "Permission denied" means that an environment variable was not set. Sure, that makes sense - NOT.
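For what it's worth, the odd-looking 32256 does carry a clue. On Linux, the value returned by system() is a wait status, with the command's exit code in the high byte. A quick sketch of the arithmetic (the variable names are mine):

```python
status = 32256                      # the value svn reported from system()

exit_code = (status >> 8) & 0xFF    # high byte: the command's exit code
signal_no = status & 0x7F           # low 7 bits: terminating signal, 0 if none
```

That works out to exit code 126, which by shell convention means the command was found but could not be executed. That fits the first line of the error: with no editor configured, the shell apparently tried to run svn-commit.tmp itself as the command.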

enjoy,
Charles.

Tuesday, September 04, 2007

Getting Maxtor OneTouch III to work on MacTel

I recently bought a Maxtor (or is it Seagate?) OneTouch III external hard drive. It's a pretty sweet drive, maybe a bit big and noisy, but perfect for my needs - backing up my systems. One really nice thing is that it's totally Mac-centric: it ships preformatted with HFS+, it supports Firewire 400 and 800 (nice with the new iMac), and it includes Retrospect Express backup software, which originated on the Mac, as I recall.

It includes a button on the front that you can program to do various things, the most obvious being to initiate a backup. But it didn't work; I'd push the button and nothing would happen. After poking around, I found this update for Intel Macs. After installing it (and rebooting), it works like a charm.

enjoy,
Charles.