Western Skies: 2008

Wednesday, November 26, 2008

Book: The Productive Programmer

The Productive Programmer by Neil Ford
ISBN: 978-0-596-51978-0

I've always had a more-than-passing interest in productivity porn, as Merlin Mann calls it. I'm not obsessive about seeking out productivity books (I can quit any time), but I've read more than a few such books in my day, and I do keep my tasks organized with OmniFocus. When I saw The Productive Programmer at the Powell's table at OSCON 2008, I confess I got pretty tingly, kinda like "when we used to climb the rope in gym class." Here was a book talking about productivity aimed specifically at what I do every day - programming, not management, not sales, not coaching a football team, but programming. In the words of Cartman, "sweet."

That said, I wouldn't classify this book as hard-core productivity porn. It doesn't lay out a dogmatic formula, nor does it suggest or require specific tools or techniques. And that's probably a good thing; in my experience, programmers can be some of the most opinionated people I've known, especially when it comes to their craft (e.g., editor wars). If the author had attempted to prescribe a specific set of practices, I think almost every programmer would have found something to hate about the book, and what's the point of that - we could just as easily go back to ragging on emacs, Windows, or Steve Jobs.

Instead, Mr. Ford offers a number of possible suggestions that one can take or leave. These are organized into two parts: mechanics (the productivity principles) and practice (philosphy), or what I would call tactical (little picture) and strategic (big picture) techniques. Mechanics includes things like controlling interruptions from things like email and using tools like Quicksilver (which I finally started using after reading this book). Philosophy includes things like test-driven development/design (TDD) and using things like static analysis tools.

The various suggestions were all very well and good, but what I liked most about this book is that it made me thing about mom-and-apple-pie topics like TDD (of course, we all write unit tests, right?) from a totally different angle - productivity. Of course, that what the agile folks have been saying all along, but somehow this book shed a whole new light on it and helps drive it even deeper. And the whole book got me thinking about the bigger question - "how can I be more productive and effective in my programming?"

Like I said, this isn't hard-core productivity porn, but it's a very useful and approachable guide to productivity by a programmer for programmers. Maybe that makes it productivity literary erotica?

enjoy,
Charles.

In Praise of TDD and Mocking

As noted elsewhere in the blog, I've been doing a bunch of work to script ESXi using Vmware's (sometimes troublesome) RCLI tools. I'm developing a higher-level set of code written in Python. (In theory, using Vmware's Perl toolkit would be cleaner, and I wouldn't have to bitch about the RCLI tools, but I've done enough Perl for one lifetime.) In addition to the mom-and-apple-pie goodness of TDD, I'm also getting a huge productivity boost by employing TDD via mocking.

In the code I'm writing, each method typically makes one or more RCLI calls. Each RCLI call takes 3-5 seconds. I structured the code so that the RCLI invocations all funnel through one point in the code which can easily be monkey-patched to go through a mock function. (More details on that in a later post.) After mocking, the result is that I can execute 20 tests with dozens of mocked RCLI calls in less than a tenth of second. After the unit tests pass, I can push the code out to a real host and run real RCLI comannds, and for the most part, "it just works."

When I started, I figured mocking was the only way to unit test this code, and it was more convenient to develop the code on a machine where I don't have the RCLI installed. The performance boost was unexpected but by far the most significant benefit of the unit tests. And when I needed to perform a couple of refactorings during the development, I got to enjoy wicked fast speed plus safety - two great tastes that taste great together.

Charles.

Thursday, November 20, 2008

Vmware RCLI and exit status

Yet another gripe about Vmware's RCLI commands: they almost always return an exit status of zero, even if the command failed. For example, 'vmware-cmd -s unregister' for a non-existent virtual machine, return a status of zero (which is the same as if the command succeeded). One has to look at the standard output of the command to see "No virtual machine found." This is OK for an interactive user, but it's a pain-in-the-ass if you're trying to write scripts, as I happen to be. For every command, I have to parse the output to see if it succeed or not.

But then some commands (e.g., vifs) do return useful exit codes, at least some of the time. This whole process has to be handled on a case-by-case basis.

Speaking of parsing output, the outputs are not consistent. For example, when that unregister command is successful, it returns "unregister() = 1". Note there is a space on both sides of the equals sign. When the corresponding register command succeeds, it returns "register() =1". Note that there is no space on the left side of the equal sign. As I write code to parse these outputs (because I can't count on the exit status), I can't help but wonder how brittle this code will be...

Charles.

Friday, November 07, 2008

Book: Working Effectively with Legacy Code

Working Effectively with Legacy Code by Michael Feathers
ISBN: 0-13-11705-2

"...legacy code is simply code without tests... Code without tests is bad code."
From those statements, it doesn't take much to figure out what this book is about - how to write unit tests for code without tests. Of course, if you've ever tried to do that, you know that it's easier said than done. Many/most programmers who inherit a big ball of mud of code without tests (legacy code) just punt; the existing code has no tests, I can't see how to get any of it under test, so I'll just hack and pray - just like the original author(s) did.

That's where this book comes in. It's a primarily large collection of recipes about how to write unit tests for legacy code. That said, the focus is not really on how to write the tests but rather how to get chunks of the legacy code into a test harness so that you can write unit tests to characterize the existing functionality before adding or modifying functionality. It also contains techniques to add new functionality in such a way that you can test it immediately and possibly execute the "clean up" that the original author(s) promised would happen as soon as that next deadline was reached - all those years and deadlines ago.

The bulk of the book (Part 2) is organized as a series of complaints or excuses and how to deal with them. These include such topics as "My application has no structure," "Dependencies on libraries are killing me," and "This class is too big, and I don't want it to get any bigger." In each chapter, the author provides examples (in multiple languages - Java, C++, C, etc.) of these problems and specific techniques that can be used to address them. The last chapter (Part 3) is an encyclopedia of the techniques for easy reference.

If you're lucky enough to do only green-field development, you might think this book would be useless. However, one interpretation of this books is that it is a list of sins to avoid while your playing in the green field. And, many of the techniques can be interpreted as best practices for how to write you code to ensure it's testable. (Of course, you're following test driven development and achieving near 100% coverage, so that would never be a problem with your new code, would it? :-)

My one disappointment with this book was that I was hoping it would provide ideas about how to create higher-level (e.g., functional) tests. Of course, high-level tests are no substitute for unit tests. It's just that I was tasked with creating "some tests" quickly for an entire application, and unit tests are not practical in this particular case, which is my problem, not the author's.

This book is an excellent resource and cookbook for how to add unit tests to an existing code base that lacks tests, and it also provides design and implementation templates to ensure that new code is testable as it's created.

enjoy,
Charles.

Wednesday, November 05, 2008

vmware-cmd: SystemError=HASH(0x95d8d70)

More tales from the dark side. Trying to start a VM out on an ESXi host:

vmware-cmd  [conn options]  '[datastore] path-to-vm/conf.vmx' start hard

Returns this output:

Fault string: A general system error occurred: Internal error
Fault detail: SystemError=HASH(0x95d8d70)

I spent hours permuting the parameters. And vmware-cmd's -v option doesn't dump out all of the SOAP guts, so I had no view on what was going on. Finally I ran a vifs --dir out on the VM dir and found a bunch of vmware.log files. I fetched one that showed a normal startup and then this:

Nov 05 05:51:51.883: vmx| [msg.License.product.expired] This product has expired.
Nov 05 05:51:51.883: vmx| Be sure that your host machine's date and time are set correctly.

There was a botched version of ESXi that we installed that contained a time bomb, and it had come home to roost. Fair enough - our bad, but certainly there could be a better error message than SystemError=HASH(0x95d8d70).

Charles.

Vmware ESXi - Unable to clone virtual disk

To import a VM from VMware Server to ESXi one must convert the disk by cloning the old one using vmkfstools. Fair enough. Doing this using the ESXi RCLI tools should look something like (based on what I've seen for non-RCLI for ESX):
vmkfstools [conn-options] -i old-disk new-disk -d thin

However, when you do this with RCLI, you get the following useless error message:
Unable to clone virtual disk : A general system error occurred: Internal error

Using the --verbose flag to look at the gory SOAP details I could see that the thin option wasn't getting sent. Looking through the Perl code it looks like one needs to specify both the -d and -a (adapter type) options. Once I added -a lsilogic it worked like a charm.

My big complaint (and I have other examples of this up my sleeve) is if the user makes a simple, bonehead parameter error, the command should point it out rather than saying "general system error...internal error." I hope this post will help someone else out if s/he is unlucky enough to encounter this error message.

Charles.

Thursday, October 30, 2008

WebTest Groovy example thows NoSuchMethodError

I've been working on a project to do functional testing on a web application using WebTest. I've coded up some prototypes using the standard XML/Ant format/language/DSL, but I wanted to try doing it in Groovy, since I'm not a big fan of XML. I copied the sample application from the manual, but when I ran it I was getting and exception:

Caught: : java.lang.NoSuchMethodError: org.apache.xpath.compiler.FunctionTable.installFunction(Ljava/lang/String;Ljava/lang/Class;)I
at test.run(test.groovy:33)
at test.main(test.groovy)

(If you're following along at home, you'll note that that line number is too large for the example - I've already hacked some stuff into it.)

I couldn't find any examples online of people having exactly the same problem, but I'd seen other errors blaming an old version of xalan.jar for this problem, especially under Mac OS 10.5 - my platform. (BTW, this happened under both Java5 and Java6.) Although I didn't have CLASSPATH set to anything, I figured I'd try setting it to have the version of xalan.jar that ships with WebTest, and it worked. So, my complete invocation line is:

groovy -cp /usr/local/java/webtest_1720/lib/xalan-2.7.0.jar -Dwebtest.home=/usr/local/java/webtest_1720 test.groovy

enjoy,
Charles.

Wednesday, October 29, 2008

VMware ESXi default resource pool name

I've been playing around with ESXi trying to figure out how to use their Remote Command Line Interface (RCLI) to import and run a VM. This has been a major PITA, which I suppose I could have been documenting as I go but I didn't. (Full disclosure - I'm doing this without any formal training, without their snazzy Virtual Infrastructure tools, or anything else.)

According to VMware's RCLI Installation and Reference Guide, the data center and resource pool parameters to vmware-cmd -s register are optional, but nonetheless, it kept telling me "Must specify resource pool".

Eventually, I found a forum thread that contained the answer: Resources. (Kinda like "plastics" only different.) I don't have a data center set up, but I found that using almost any word for the data center name seemed to work.

So, in the end, this is what worked for me (your milage may vary):

vmware-cmd  [conn options] -s register '[datastore-name] vm-dir/conf.vmx' root Resources

enjoy,
Charles.

Thursday, September 04, 2008

Increasing Memory for Ant

I had an issue today with the javac Ant task running out of memory when compiling a project (~800 files) under Java 6 on the Mac. It was fine under the default Java 5 compiler on the Mac, and it was fine under Java 6 on the PC. Initially, I set memoryMaximumSize on the javac task, which also required that fork be set.
This was OK, but since this was specific to compiling on the Mac, I figured the solution should be specific to my environment on the Mac. After looking at the source of the ant script, I realized that the ANT_OPTS variable could be set to contain an option (-Xmx512m) for the JRE running Ant, and that ANT_OPTS could be set from either ~/.antrc or ~/.ant/ant.conf. So, I created ~/.antrc with one line: ANT_OPTS=-DXmx512m.
(Update: oops, that should be ANT_OPTS=-Xmx512m)
And it worked perfectly, of course - so simple it had to work.

enjoy,
Charles.

Thursday, August 28, 2008

Ant, JavaDoc, and CreateProcess error=87

I've been developing an Ant build file for a large-ish base of existing code. I've been using my Mac as the primary devleopment platform, but the ultimate target (for the other developers) is Windows. My task to build the javadocs runs fine on OS X, but the first time I rant it on Windows, I got the following:

c:\demo\build.xml:180: Javadoc failed: java.io.IOException: Cannot run program "c:\Program Files\Java\jdk1.6.0\bin\javadoc.exe": CreateProcess error=87, The parameter is incorrect

After about 10 minutes of Googling, I found a/the simple solution: add the useexternalfile attribute to the javadoc task in Ant. Voila , back in business, on the PC at least, and it didn't seem to break the build on the Mac.

Charles.

Tuesday, August 12, 2008

Book: Essential SQLAlchemy

Essential SQLAlchemy by Rick Copeland
O'Reilly Media
ISBN 10: 0-596-51614-2 / ISBN 13: 9780596516147

This is a great book describing how to use SQLAlchemy to connect Python programs to databases. In fact, at the moment (mid-summer 2008), it is the book, since there are no other books on the subject, yet. Athough I am not (yet) a SQLAlchemy user, this book seems to cover all of the core topics in SQLAlchemy. The text includes many straightforward examples of how to use various facilities in SQLAlchemy and how to map various database programming problems into Python code via SQLAlchemy. Copeland also provides a whirlwind tour of some extensions to SQLAlchemy.

I heard about SQLAlchemy project on the This Week in Django podcast. Django doesn’t use SQLAlchmey, but it does use a similar object-relational mapper (ORM). As I mentioned, I haven’t used SQLAlchemy so I came into this book with a somewhat blank slate. I have, however, been programming in Python since before 1.0, and I’ve worked with database APIs and ORMs since the early 90s in C++, Java, and Python. So, I was familiar with the basic landscape of database programming, even if I hadn’t used SQLAlchemy. And, I’m currently working on a large Python project that is coded using the Python database API directly, which is very tedious. So, the whole time I was reading this book, I was looking at how to fit SQLAlchemy into this existing code base.

To be honest, the first chapter (the proverbial introduction) almost turned me off. The author starts out slowly enough, but then he starts touching on a huge number details, which were glazing my eyes over. However, the second chapter (getting started) started back at ground zero and stepped through everything in a nice clear fashion, and the rest of the book continued in that vein. He covers all the topics you would expect in a database programming book: queries, updates, joins, the built-in types, and how to hook in to provide support for your own types.

Something I didn’t realize about SQLAlchemy coming into this is that SQLAlchemy is both an ORM (what I expected) as well as a high-level, database-independent API. Which is to say, you can just access the database as tables, columns and rows rather than as classes, attributes, and object instances. Although I’d personally prefer to use the ORM, I can imagine cases where it might not be the right tool for the job, and it’s good to have a choice.

I was also surprised to see the ORM supports two styles of object-relational access: the data mapper pattern (which I had seen in Django and Hibernate) and the active record (used in Ruby). The author does a good job of explaining both of these and how to use them. He even devotes a whole chapter to Exlir, which is an extension that implements the active record pattern.

One thing that many people might consider odd is the fact that although SQLAlchemy is an ORM, the author waits until chapter eight to discuss how to map object inheritance hierarchies onto relational databases. Most books I’ve read on ORMs discuss this topic early, but I applaud Copeland’s decisions to hold off on discussing it. When books bring this up early (e.g., in chapter three), the discussion often gets bogged down in details, which glaze the reader’s eyes. I’ve dealt with the issue of inheritance mapping enough in ORMs I’ve used and those I’ve written enough that I’m not that interested in the topic (assuming the tool provides the typical, reasonable solutions), and was grateful that he held off on it.

One issue I had with the overall structure of the book is that I’m hard pressed to pigeon-hole the book. Books about a single technology such as SQLAlchemy usually occupy one end or the other of a spectrum. Either they’re hard-core references, often times copying-and-pasting API documentation from a web site (I really hate that), or they’re largish tutorials that may or may not contain enough technical meat. This is a short (roughly 200 page) book that contains plenty of technical meat, but it also includes some simple tutorial motivations for using various capabilities of the tool. Although this mix felt odd to me, I’m sure it will be perfect when I go to apply SQLAlchemy to my existing database projects since I don’t really want a hand-holding tutorial, but a pure reference wouldn’t quite work for me either when I’m just starting out.

In conclusion, Essential SQLAlchemy provides a thorough presentation of the SQLAlchemy tool for interfacing Python code to SQL databases. The author covers a number of different methods in which SQLAlchemy can be used to access databases from Python, and he provides plenty of details of the various APIs available to the programmer.

Enjoy,
Charles.

Sunday, July 06, 2008

Converting a Django Site to newforms-admin

Right after I learned about Django at OSCON 2006, I created a prototype to replace an existing PHP site. Due to various distractions, I haven't done much with that Django site since then and it never went live, but I've come back to it recently.

My first order of business was to upgrade from 0.95 to something modern. The first stop was to upgrade to trunk. Following the instructions on the Django site, I got it running on trunk and only had to change maxlength to max_length in my models.

Since, newforms-admin will be merging with trunk soon, porting to newforms-admin was my next step. I pointed my django install at my newforms admin svn tree and let it rip - i.e., I didn't bother reading any documentation or anything.
It blew up with a nasty-looking traceback that ended with this:

func(self, *args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'prepopulate_from'

A quick Google search turned up this thread on the Django Users mailing list. Five minutes of reading the newforms-admin wiki page and refactoring my code, and I was up and running. So, despite the ugly-looking traceback, it just wan't that bad.

Granted, my project is pretty small because I never had much time to work on it. However, all in all, the conversion process from 0.95 to the bleeding edge of newforms-admin was pretty painless.

Another happy Django user,
Charles.

Book: Effective Java

Effective Java, 2nd Edition by Joshua Bloch. ISBN 0321356683

This is the best book that I've read that I didn't know I needed to read. If you are a pretty good Java programmer, and you want to be better, this is a book you should definitely read.

Back in the day (BITD), we had Scott Meyer's book Effective C++, and we needed it. Even without templates and all the "new" things in C++, C++ was a large and complex language. I loved Effective C++ not only because it had lots of tips to keep you out of trouble when coding C++, but also because the organization was brilliant: each tip was about the right length to be read while sitting on the can.

I ignored the first edition of Effective Java book when it came out because I smugly assumed that Java was so superior to C++ (Java is C++ with the pointy bits filed down) that it wasn't really necessary. And that may or may not have been true in the 1.0 or 1.1 days of Java, but with the release of Java 5, the language has certainly grown and is now sufficiently complex that a book like this is a necessity, especially if you've recently moved up to Java 5 or Java 6. Lucky for me, this book landed on my desk against my will.

The book uses that same sitting-on-the-can format from Effective C++; the information is broken down into nice, small bits of information that you can read one or two at a time. These are grouped into 10 sections. There are specific sections for Java 5 features like Generics and Enumerations, which is great for people like me who were stuck on 1.4 for far too long.

What is really amazing about this book is that I learned a lot about topics that I thought I already knew about. For example, I've used serialization in a number of formats (built-in and do-it-yourself) for years, but I realized there's a lot more that I had no idea about. I will certainly think twice before doing any serialization in the future.

Unless you're the sort that knows the Java Language Specification by heart, you need to read this book.

Enjoy,
Charles.

Tuesday, June 24, 2008

Configuring TortiseMerge for use with Mercurial

I've been using Mercurial primarily on a Mac, but also on a Windows box for a client. Since the Windows box isn't my primary platform, it tends to be under-configured/under-tricked-out. After my first conflicted hg merge, it was time to set up a merge program.
I guess Mercurial looks through the registry where it was finding some wacked out description for P4 merge. (The client currently uses P4.) It was referring to some path that didn't exist.
I have TortiseSVN on the machine from work for a different client, so I figured I'd use that merge program. Googling around, I didn't find an example, so here you go (even though it's not real hard - see MergeToolConfiguration for details):

[ui]
merge=TortoiseMerge
[merge-tools]
TortoiseMerge.executable=C:\Programs\TortoiseSVN\bin\TortoiseMerge.exe
TortoiseMerge.args=/mine:$local /theirs:$other /base:$base -o /merged:$output

enjoy,
Charles.

Tuesday, May 27, 2008

DVCS (Mercurial) for Students

I just had a bit of an epiphany for yet another reason why distributed version control systems (DVCS) like Mercurial rock. In an advanced software engineering class (e.g., a capstone project), it would be appropriate to have project teams using a SCM/VCS tool. At my campus, we've never pushed that because it can be a pain to set up a server for students to access. Gotta have a dedicated host for it. Firewalls have to be open. Permissions and users have to be set up. Yada, yada, yada.

However, with a DVCS tool like Mercurial, it would be trivial and wouldn't require any networking at all. Students could use the modern equivalent of SneakerNet: ThumbNet. Put the whole repo on a thumb drive, meet up with your project partners, push and pull, done. Or, if the project is small enough - just email whole repos around. In this context, "distributed" also means you don't need any support from The Man, and that's a good thing.

For better or worse, I don't teach those classes, but if I did....

Charles.

Friday, May 16, 2008

@Override is your friend

When Java 5 came out, I had my head down teaching and wasn't really paying attention. I've since started working with it and have been using annotations for "big" tasks like Hibernate/JPA metadata. I was pretty underwhelmed by @Override - one of the only stock annotations. When I implement toString, I know I'm overriding the method on Object. Who cares?

The other day I was beating my head into the desk trying to figure out why a Swing table wasn't editable. I had overridden the isCellEditable method on JTable, but the cells weren't editable. Then, I remembered something from the annoations tutorial I'd read at some point: "While it's not required ... it helps to prevent errors." So, I added @Override, and sure enough - I'd misspelled the method name, just the sort of error that @Override can prevent.

I've got the religion. And like any recent convert, I suggest you get it too.

enjoy,
Charles.

Monday, May 12, 2008

Data Driven, my eye!

I've started using the WebTest web testing framework. Mostly, it's pretty cool. However, I have a bone to pick with screencast demonstrating the dataDriven task.

There's a "slide" that says "Do you know the dataDriven Ant task?" I know of no such standard Ant task. It turns out that it's specific to WebTest. Not really clear.
They show no configuration steps to use it, implying that it works out-of-the-box. I don't know if my environment is wacked (I installed from the developer build, as they suggested), but I had to add an Ant taskdef referring to com.canoo.ant.task.PropertyTableTask, and I only found that by looking in the source.
The screencast shows running ant at the command line, which is how I've been running my tests, but I had to run their webtest script instead. Again, maybe my installation is wacked.
I really wish it could handle data in a TSV/CSV/text file, since I don't have Excel installed on the machine where I'm running these tests, but it only seems to accept an xls file.
Just to add insult to injury, Google Spreadsheet (which I'm using to generate the data file) seems to append a bunch of empty lines to my spreadsheet, which causes the dataDriven task to repeat the last line 90-odd times.

Grrr,
Charles - aka Cranky Pants.

Update:
OK, so it sucked to be me, but not any more. I figured out my various issues with the dataDriven task. It turns out that the screencast (clearly) shows them developing in the tests directory of a WebTest project. I missed that and tried to use dataDriven in the top-level build.xml in a target that didn't declare wt.defineTasks as a dependency. Guess what wt.defineTask does - yup, it does the taskdef.
I coupled that breakthrough with breaking down and using Excel to create the xls file (instead of exporting from Google) and viola - I'm livin' large and no longer cranky.

enjoy,
Charles.

P.S. Apologies to Cheech and Chong.

Wednesday, May 07, 2008

Universal???

Apple has released Java6 for the Mac, which I have been eagerly awaiting. However, their download page is a bit contradictory:

This is limited to 64-bit Intel macs (which is fine with me), but yet they still call it Universal. What's Universal about that? The only way I could imagine it being less universal is if it they specified a number of cores or CPU speed.

Anyway, Java6 is good stuff for those of us on Universal Core2 Duo iMacs.

Charles.

Saturday, March 29, 2008

Netbeans is a memory hog?

I was looking at Activity Monitor on my iMac, and I couldn't read the size entry for the Netbeans process that was running. So, I selected it and got a detail report:

I knew that Java and Netbeans could be memory hogs, but 16 million TB (16 exabytes) is a bit much. Good thing I have virtual memory! (I didn't alter the image, but clearly it's a bug with Activity Monitor and/or OS X 10.5.2.)

Charles.

Thursday, February 28, 2008

Simple, complete example of Python getstate and setstate

I've been doing serialization and/or object-relation mapping in languages like C++ and Java for at least 15 years. I've known about Python's serialization facility (the pickle and cPickle modules) for as long as they've existed, but I've never had a need to use them. Recently, I needed to pickle an object to store in memcached to reduce database traffic.

Wouldn't you know it - the first class I try to pickle throws an exception because it contains some attributes that can't be serialized. I couldn't figure out where the problem was because the Exception and trace back didn't include the name of the attribute that contained the threading lock that couldn't be serialized. However, a quick look at the code revealed a couple of suspects.

Even though I didn't know which attributes were causing the problem, I knew that the only solution would be to take control of the serialization process. Once I could pick and choose which attributes were being pickled, I could search for the offender(s). As it turned out, both of my initial suspects were guilty of evading pickling.

From the documentation on pickling, I could see that implementing the __getstate__ and __setstate__ methods, but it wasn't clear what those methods need to look like. I found an example online, but the guy was having problems (it was posted to a mailing list), and as I implemented my own methods, I realized what his problem was. So, here's the code:


    def __getstate__(self):
        result = self.__dict__.copy()
        del result['log']
        del result['cfg']
        return result

The problem I was having with pickling were the logging and configuration attributes. These needed to be removed from the object before pickling. Fortunately, they're not unique to the instance, so they're easy to recreate during unpickling.

As you can tell, __getstate__ returns a dictionary of the object's state. By default (if you didn't implement the method), this is just the __dict__ member. To exclude some attributes, we just need to delete the keys from the dictionary. However, the crucial step is that you have to make a (shallow) copy of __dict__ first. Otherwise, deleting the keys from the dictionary is the same as deleting the attributes from the instance, which would be bad. (This is where the other example I found online failed - he didn't make a copy.)

The __setstate__ method is the reverse, only we don't have to mess with copies:


def __setstate__(self, dict):
    self.__dict__ = dict
    cfg = self.cfg = getConfig()
    self.log = getLog()

Enjoy,
Charles.

Saturday, February 23, 2008

Mac OS X 10.5.2 did not completely fix stacks

Apple's newest update to Leopard, 10.5.2, has greatly improved the new Stack feature by adding a hierarchical list view, but it is still not as functional as the list view in Tiger. As I noted before, I created my own directory that has a collection of aliases (symbolic links) to the applications I use most frequently, as well as links to Applications, Utilities, and the LocalApps directory where I install third-party applications. The problem is even with the 10.5.2 update, the list view does not follow the symbolic links, so my folder of links is basically useless.

I'm still pleased with the list view - it is a huge improvement, but I won't be totally satisfied until it follows links.

Charles.

Sunday, February 17, 2008

Simply Mercurial

Mercurial is the easiest revision control system I've used, "and so can you" to quote Stephen Colbert. I became interested in the idea of a distributed SCM tool in order to keep my revision history with me while I'm on the road and not necessarily connected. I would have assumed that to get that power, the tool would be more complex - you can't get something for free, right? However, Mercurial is so easy to use, I'm using it for simple one-off revision needs.

Consider the case of a lone developer with a modest number of files to keep track of. To use Mercurial, all he needs to do change into the directory where the files are and run:hg init
That creates a repository, hidden in the .hg subdirectory, and sets the directory up as a working directory. The hg status command shows that none of the files is under control, yet. Running hg add * (or whatever subset of the files is appropriate) marks all of the files to be added to the repository. Finally, hg commit commits the files.

The real beauty was in that first step - hg init. That is so much easier than CVS or Subversion where you either have to create a new repository or figure out where in an existing repository you want to put these files. And it's easier than the dinosaurs, RCS and SCCS, where you have to set up subdirectories to hold the version files in every subdirectory - not to mention the fact that those tools don't really deal with multiple users.

Mercurial is about as simple as can be, and if you never work with multiple developers and passing changes around between developers and repositories, then it stays that simple. Period.

enjoy,
Charles.

Saturday, February 02, 2008

Complete example of __getattr_ in Python

I've always known about the _getattr_ function on classes in Python and how it could theoretically be used. However, I never had a real need to implement it, and so I had never actually implemented __getattr__. For whatever reason, it was a tad more difficult that I thought, so I figured I'd share an example with you'all.

In case you don't already know, the idea is that any time any code makes are reference to an attribute a class (e.g., obj.x ), __getattr__ gets called to fetch or compute the value of the attribute. This function can do almost anything, but you must be careful when making references to attributes, because that will trigger a recursive call to __getattr__.

In my case, I was writing some code for unit testing. I needed to create a mock object that's used to store configuration information for the system. In the real object, every configuration attribute is initialized from an ini file parsed by ConfigParser in the constructor. For testing, didn't want to have a huge configuration file for every test. So, I wanted to create a system that performed lazy initialization of the data attributes - i.e., only look in the ini file if we actually need a given item, and if the attribute is never referenced, we never need to fetch it from the ini file. Therefore, the ini file only needs the attributes that are actually used by a given test. Implementing __getattr__ is the way to hook into the process to provide this lazy initialization.

The basic outline/algorithm is:

If the attribute already exists on self, return that value
Fetch/compute the missing value
Store the value on self for subsequent use
Return the value

The key to making this work and avoiding infinite recursion is the __dict__ attribute, which is a (regular) dictionary, the keys of which are attributes that exist on the object and the values are the values of the attributes. We can access these keys and values without going through __getattr__, thus avoiding recursion.


def __getattr__(self, attrName):
  if not self.__dict__.has_key(attrName):
     value = self.fetchAttr(attrName)    # computes the value
     self.__dict__[attrName] = value
  return self.__dict__[attrName]

It's pretty straightforward. In retrospect, I'm not sure what tripped me up when I first went to implement it. In the end, the fetchAttr function ended up being pretty fancy, but I'll write more about that later. You gotta love a dynamic language like Python that makes this as simple as it is, even if it does require a bunch of underscores.

Enjoy,
Charles.

Thursday, January 24, 2008

Remove Office on Mac sux

The Remove Office program that comes with the trial version of Microsoft Office just sucks. The two times I've used it, it's failed. The first time, I had my old iMac connected via Firewire (target disk mode) to my new iMac. I ran Remove Office on the new iMac to remove the trial version, and it removed the non-trial version on my old iMac - without even asking me which version to remove. Now, on my new MacBook, I tried running Remove Office (without any other computers connected), and it couldn't find the trial version to remove - never mind the fact that I was running Remove Office from the folder containing Office.

My advice to anyone is to just use 'rm -r' from the Terminal window. If you just drag it to the trash (without emptying the trash), the trail version will keep launching.

enjoy,
Charles.

Sunday, January 20, 2008

Exporting from ExamView to Moodle

I've just started using Moodle for a course (more on that later), and I wanted to import some questions I wrote using ExamView into Moodle. Although our version of Moodle (1.8?) suggests that it can import EvamView questions, it failed for me. I came across this post that suggests that it isn't even supported - at least not by the company that makes ExamView. Based on that post, here's what I ended up doing:

From ExamView Pro (version 4.0.8) export as Blackboard 5.x, which is a zip file.
Unzip the zip file.
Import the .dat file into Moodle as a Blackboard, not Blackboard V6+

(And, all of this is made more complicated by the fact that I've basically abandoned the PC for the Mac, and I only have a version of ExamView for the PC. Praise be to VMware for Fusion.)

There may be a simpler way to do this, but it works, which is all that matters right now. It would be swell if I could automate the process...

enjoy,
Charles.