Sunday, July 06, 2008

Book: Effective Java

Effective Java, 2nd Edition by Joshua Bloch. ISBN 0321356683

This is the best book that I've read that I didn't know I needed to read. If you are a pretty good Java programmer, and you want to be better, this is a book you should definitely read.

Back in the day (BITD), we had Scott Meyer's book Effective C++, and we needed it. Even without templates and all the "new" things in C++, C++ was a large and complex language. I loved Effective C++ not only because it had lots of tips to keep you out of trouble when coding C++, but also because the organization was brilliant: each tip was about the right length to be read while sitting on the can.

I ignored the first edition of Effective Java book when it came out because I smugly assumed that Java was so superior to C++ (Java is C++ with the pointy bits filed down) that it wasn't really necessary. And that may or may not have been true in the 1.0 or 1.1 days of Java, but with the release of Java 5, the language has certainly grown and is now sufficiently complex that a book like this is a necessity, especially if you've recently moved up to Java 5 or Java 6. Lucky for me, this book landed on my desk against my will.

The book uses that same sitting-on-the-can format from Effective C++; the information is broken down into nice, small bits of information that you can read one or two at a time. These are grouped into 10 sections. There are specific sections for Java 5 features like Generics and Enumerations, which is great for people like me who were stuck on 1.4 for far too long.

What is really amazing about this book is that I learned a lot about topics that I thought I already knew about. For example, I've used serialization in a number of formats (built-in and do-it-yourself) for years, but I realized there's a lot more that I had no idea about. I will certainly think twice before doing any serialization in the future.

Unless you're the sort that knows the Java Language Specification by heart, you need to read this book.

Enjoy,
Charles.

Tuesday, June 24, 2008

Configuring TortiseMerge for use with Mercurial

I've been using Mercurial primarily on a Mac, but also on a Windows box for a client. Since the Windows box isn't my primary platform, it tends to be under-configured/under-tricked-out. After my first conflicted hg merge, it was time to set up a merge program.
I guess Mercurial looks through the registry where it was finding some wacked out description for P4 merge. (The client currently uses P4.) It was referring to some path that didn't exist.
I have TortiseSVN on the machine from work for a different client, so I figured I'd use that merge program. Googling around, I didn't find an example, so here you go (even though it's not real hard - see MergeToolConfiguration for details):

[ui]
merge=TortoiseMerge
[merge-tools]
TortoiseMerge.executable=C:\Programs\TortoiseSVN\bin\TortoiseMerge.exe
TortoiseMerge.args=/mine:$local /theirs:$other /base:$base -o /merged:$output


enjoy,
Charles.

Tuesday, May 27, 2008

DVCS (Mercurial) for Students

I just had a bit of an epiphany for yet another reason why distributed version control systems (DVCS) like Mercurial rock. In an advanced software engineering class (e.g., a capstone project), it would be appropriate to have project teams using a SCM/VCS tool. At my campus, we've never pushed that because it can be a pain to set up a server for students to access. Gotta have a dedicated host for it. Firewalls have to be open. Permissions and users have to be set up. Yada, yada, yada.

However, with a DVCS tool like Mercurial, it would be trivial and wouldn't require any networking at all. Students could use the modern equivalent of SneakerNet: ThumbNet. Put the whole repo on a thumb drive, meet up with your project partners, push and pull, done. Or, if the project is small enough - just email whole repos around. In this context, "distributed" also means you don't need any support from The Man, and that's a good thing.

For better or worse, I don't teach those classes, but if I did....


Charles.

Friday, May 16, 2008

@Override is your friend

When Java 5 came out, I had my head down teaching and wasn't really paying attention. I've since started working with it and have been using annotations for "big" tasks like Hibernate/JPA metadata. I was pretty underwhelmed by @Override - one of the only stock annotations. When I implement toString, I know I'm overriding the method on Object. Who cares?

The other day I was beating my head into the desk trying to figure out why a Swing table wasn't editable. I had overridden the isCellEditable method on JTable, but the cells weren't editable. Then, I remembered something from the annoations tutorial I'd read at some point: "While it's not required ... it helps to prevent errors." So, I added @Override, and sure enough - I'd misspelled the method name, just the sort of error that @Override can prevent.

I've got the religion. And like any recent convert, I suggest you get it too.

enjoy,
Charles.

Monday, May 12, 2008

Data Driven, my eye!

I've started using the WebTest web testing framework. Mostly, it's pretty cool. However, I have a bone to pick with screencast demonstrating the dataDriven task.

  1. There's a "slide" that says "Do you know the dataDriven Ant task?" I know of no such standard Ant task. It turns out that it's specific to WebTest. Not really clear.
  2. They show no configuration steps to use it, implying that it works out-of-the-box. I don't know if my environment is wacked (I installed from the developer build, as they suggested), but I had to add an Ant taskdef referring to com.canoo.ant.task.PropertyTableTask, and I only found that by looking in the source.
  3. The screencast shows running ant at the command line, which is how I've been running my tests, but I had to run their webtest script instead. Again, maybe my installation is wacked.
  4. I really wish it could handle data in a TSV/CSV/text file, since I don't have Excel installed on the machine where I'm running these tests, but it only seems to accept an xls file.
  5. Just to add insult to injury, Google Spreadsheet (which I'm using to generate the data file) seems to append a bunch of empty lines to my spreadsheet, which causes the dataDriven task to repeat the last line 90-odd times.
Grrr,
Charles - aka Cranky Pants.

Update:
OK, so it sucked to be me, but not any more. I figured out my various issues with the dataDriven task. It turns out that the screencast (clearly) shows them developing in the tests directory of a WebTest project. I missed that and tried to use dataDriven in the top-level build.xml in a target that didn't declare wt.defineTasks as a dependency. Guess what wt.defineTask does - yup, it does the taskdef.
I coupled that breakthrough with breaking down and using Excel to create the xls file (instead of exporting from Google) and viola - I'm livin' large and no longer cranky.

enjoy,
Charles.

P.S. Apologies to Cheech and Chong.

Wednesday, May 07, 2008

Universal???

Apple has released Java6 for the Mac, which I have been eagerly awaiting. However, their download page is a bit contradictory:


This is limited to 64-bit Intel macs (which is fine with me), but yet they still call it Universal. What's Universal about that? The only way I could imagine it being less universal is if it they specified a number of cores or CPU speed.

Anyway, Java6 is good stuff for those of us on Universal Core2 Duo iMacs.


Charles.

Saturday, March 29, 2008

Netbeans is a memory hog?

I was looking at Activity Monitor on my iMac, and I couldn't read the size entry for the Netbeans process that was running. So, I selected it and got a detail report:



I knew that Java and Netbeans could be memory hogs, but 16 million TB (16 exabytes) is a bit much. Good thing I have virtual memory! (I didn't alter the image, but clearly it's a bug with Activity Monitor and/or OS X 10.5.2.)


Charles.

Thursday, February 28, 2008

Simple, complete example of Python getstate and setstate

I've been doing serialization and/or object-relation mapping in languages like C++ and Java for at least 15 years. I've known about Python's serialization facility (the pickle and cPickle modules) for as long as they've existed, but I've never had a need to use them. Recently, I needed to pickle an object to store in memcached to reduce database traffic.

Wouldn't you know it - the first class I try to pickle throws an exception because it contains some attributes that can't be serialized. I couldn't figure out where the problem was because the Exception and trace back didn't include the name of the attribute that contained the threading lock that couldn't be serialized. However, a quick look at the code revealed a couple of suspects.

Even though I didn't know which attributes were causing the problem, I knew that the only solution would be to take control of the serialization process. Once I could pick and choose which attributes were being pickled, I could search for the offender(s). As it turned out, both of my initial suspects were guilty of evading pickling.

From the documentation on pickling, I could see that implementing the __getstate__ and __setstate__ methods, but it wasn't clear what those methods need to look like. I found an example online, but the guy was having problems (it was posted to a mailing list), and as I implemented my own methods, I realized what his problem was. So, here's the code:

def __getstate__(self):
result = self.__dict__.copy()
del result['log']
del result['cfg']
return result

The problem I was having with pickling were the logging and configuration attributes. These needed to be removed from the object before pickling. Fortunately, they're not unique to the instance, so they're easy to recreate during unpickling.

As you can tell, __getstate__ returns a dictionary of the object's state. By default (if you didn't implement the method), this is just the __dict__ member. To exclude some attributes, we just need to delete the keys from the dictionary. However, the crucial step is that you have to make a (shallow) copy of __dict__ first. Otherwise, deleting the keys from the dictionary is the same as deleting the attributes from the instance, which would be bad. (This is where the other example I found online failed - he didn't make a copy.)

The __setstate__ method is the reverse, only we don't have to mess with copies:

def __setstate__(self, dict):
self.__dict__ = dict
cfg = self.cfg = getConfig()
self.log = getLog()


Enjoy,
Charles.

Saturday, February 23, 2008

Mac OS X 10.5.2 did not completely fix stacks

Apple's newest update to Leopard, 10.5.2, has greatly improved the new Stack feature by adding a hierarchical list view, but it is still not as functional as the list view in Tiger. As I noted before, I created my own directory that has a collection of aliases (symbolic links) to the applications I use most frequently, as well as links to Applications, Utilities, and the LocalApps directory where I install third-party applications. The problem is even with the 10.5.2 update, the list view does not follow the symbolic links, so my folder of links is basically useless.

I'm still pleased with the list view - it is a huge improvement, but I won't be totally satisfied until it follows links.

Charles.

Sunday, February 17, 2008

Simply Mercurial

Mercurial is the easiest revision control system I've used, "and so can you" to quote Stephen Colbert. I became interested in the idea of a distributed SCM tool in order to keep my revision history with me while I'm on the road and not necessarily connected. I would have assumed that to get that power, the tool would be more complex - you can't get something for free, right? However, Mercurial is so easy to use, I'm using it for simple one-off revision needs.

Consider the case of a lone developer with a modest number of files to keep track of. To use Mercurial, all he needs to do change into the directory where the files are and run:hg init
That creates a repository, hidden in the .hg subdirectory, and sets the directory up as a working directory. The hg status command shows that none of the files is under control, yet. Running hg add * (or whatever subset of the files is appropriate) marks all of the files to be added to the repository. Finally, hg commit commits the files.

The real beauty was in that first step - hg init. That is so much easier than CVS or Subversion where you either have to create a new repository or figure out where in an existing repository you want to put these files. And it's easier than the dinosaurs, RCS and SCCS, where you have to set up subdirectories to hold the version files in every subdirectory - not to mention the fact that those tools don't really deal with multiple users.

Mercurial is about as simple as can be, and if you never work with multiple developers and passing changes around between developers and repositories, then it stays that simple. Period.


enjoy,
Charles.