Western Skies

SCIM with Azure AD - no parameters coming through to Rails controller

2018-08-02T16:43:00.000-07:00

I've been working to integrate our application with Azure Active Directory via SCIM - i.e., to allow Azure AD to provision users in our application using SCIM. The problem I was having was that when Azure went to create users, the parameters hash in Rails was empty - {}. I opened a ticket with Microsoft and spent weeks (literally) going back and forth with them. After they assured me that the parameters were being sent, I started dumping all the info I could about the incoming requests.

Eventually, I found that the request was coming in with a "Content-type: application/scim+json" header, even though Microsoft's documentation showed "application/json". (I've opened an issue for this.) Once I saw that, I could easily reproduce the bug locally with curl. The fix was pretty straightforward - add a new MIME type so that Rails parses that content type as JSON. I found a thread on GitHub hashing out how/when to do that with Rails API, but it applied just the same in vanilla Rails.
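To make the failure mode concrete, here is a minimal sketch in plain Ruby of content-type-driven body parsing. It is a simplified model and not Rails' actual code: the PARSERS table and the parse_params helper are made up for illustration, and the user name is invented. It only shows why an unrecognized content type leaves you with an empty hash, and why treating the SCIM type as JSON fixes it.

```ruby
require "json"

# Hypothetical table of content types the framework knows how to parse.
PARSERS = {
  "application/json" => ->(body) { JSON.parse(body) }
}

# Hypothetical helper: pick a parser by content type, or give up.
def parse_params(content_type, body, parsers = PARSERS)
  parser = parsers[content_type]
  # An unrecognized content type means the body is never parsed.
  parser ? parser.call(body) : {}
end

body = '{"userName": "jdoe@example.com"}'

# Before the fix: the SCIM content type is unknown, so params are empty.
before = parse_params("application/scim+json", body)

# After the fix: "register" the SCIM type with the same JSON parser.
with_scim = PARSERS.merge("application/scim+json" => PARSERS["application/json"])
after = parse_params("application/scim+json", body, with_scim)

puts before.inspect
puts after.inspect
```

In a real app the equivalent change lives in the framework's MIME type configuration rather than in a lookup table like this one.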

I hope this helps someone else out there. Enjoy.

Fixed: Problem with Safari To Go App for ACM Content

2017-02-12T20:07:00.000-08:00

TL;DR - launch the Safari To Go app from the ACM Safari page.

One benefit of the ACM is access to a nice subset of O'Reilly's Safari library. One can read the content on the website. O'Reilly has an Android app for accessing the content called Safari To Go. (Note, they also have a newer app that doesn't work with the ACM library.) However, there is a problem when trying to authenticate with ACM and Safari from the app. The login page flashes a message about being signed in, and then it pops up a message saying the session has expired ("your session has been logged out"). I think this started when they added the ability to access books offline. There are several one-star ratings in the app store bitching about this problem, and no responses from O'Reilly, both of which are frustrating.

Here's the simple solution I found:

1. Go to the ACM's learning site (https://learning.acm.org/) on your Android device.
2. Log in with your ACM web account credentials in the top right corner.
3. Click on "Safari" in the "Go to my..." menu in the top right corner.
4. Click on "Launch the mobile app..." at the top left.

This worked for me in February 2017. All interfaces subject to change. Your mileage may vary.

enjoy,
Charles.

Updating Ruby versions in a Boxen - rbenv - ruby-build Environment

2016-04-22T16:02:00.000-07:00

TL;DR: cd /opt/boxen/rbenv/plugins/ruby-build && git pull origin master

A while ago, I set up my development environment on my two machines using GitHub's Boxen. To be honest, I got it going and moved on. I have not been updating it. Then, I recently started a new Ruby project, and I wanted the latest version of Ruby. I went into rbenv, and it had no recent versions of Ruby - none since I set up Boxen.

How do I update the list of Ruby versions available in my Boxen environment?

I could update Boxen, but that's more than I wanted to bite off. My first guess was to run brew update and brew upgrade rbenv ruby-build, but that failed because rbenv is part of the bootstrap of Boxen and not installed by Boxen's version of Homebrew.

So, I thought I'd need to update rbenv directly. I figured it was installed as a git repo. The question was: where? which rbenv says it's a shell function. Looking at that code, it un-aliases itself to run.

which -a shows that rbenv is in /opt/boxen/rbenv/bin/rbenv. From there I found the ruby-build plugin, which leads to the solution:

cd /opt/boxen/rbenv/plugins/ruby-build
git pull origin master

It's not like it's rocket-surgery, but it was a curious puzzle to sort out. Soon I should update my Boxen environment, but that's a story for another day.

enjoy,
Charles.

Getting Started

2016-03-15T21:00:00.000-07:00

How to begin the HashiQuest?

The world of AWS is huge these days. The HashiCorp tools can be counted on two hands, but since they interface with AWS, that limited count is deceiving.

I actually started by getting the lay of the land from AWS in Action, which Manning conveniently had on special just about the time I was interested in learning more about it. The book isn't an exhaustive coverage of all of the AWS services, but it's an excellent overview. I did their tutorial for building a WordPress site. The authors provide their code examples online in GitHub, which is excellent.

After that, Terraform seemed like the logical place to start since it deals with building infrastructure in AWS. Again, I followed the online tutorial, and was pleased by the lack of drama.

I don't for a minute think that doing either of these tutorials qualifies as any real expertise, but I found that just typing the commands and checking the results in the AWS console starts to build both the physical and mental "muscle memory." Both tutorials were done with the free tier in AWS, and free (as in beer) is always good.

Next, I checked out a new tool from HashiCorp - Otto. Otto is a successor to Vagrant, but it's heading in very different directions. If you start thinking that Otto is just Vagrant++, it's hard to understand the infrastructure and deployment functionality that Otto provides. Otto provides a path from a development environment on a single machine, to a simple AWS deployment, to a more sophisticated AWS deployment.

Because Otto is based on some opinionated policies and best practices, it provides a great way to see how all the pieces of the HashiCorp ecosystem and AWS fit together. It generates plain-text configurations and scripts in the .otto directory in your project's tree. These are there to read and learn from. Some AWS masters might chafe at the best practices, but everyone's gotta start somewhere, so it might as well be something sane.

I'm not sure if this is the best way to learn about AWS and the HashiCorp tools, but it's what I've done. Your mileage may vary.

Enjoy,
Charles.

HashiQuest

2016-02-22T21:04:00.000-08:00

I'm starting a new project at work to build a new infrastructure for hosting our apps. My objectives/requirements include:

Elastic - something we can easily scale up and down.
Redundant - something that can tolerate reasonable outages. With the recent Xen security issues, I've had more than one ride in the reboot-rodeo, and I'm getting tired of that.
Invented Here as opposed to Not Invented Here (NIH) - I inherited the existing infrastructure, and there's nothing wrong with it per-se, but because I didn't build it, it surprises me from time to time.
Immutable - I want to get on the immutable infrastructure bandwagon because it's what the cool kids have been doing. But, as I get into this, I realize that immutable infrastructure can lead to...
Fearless - I want to be able to make changes quickly and easily without uttering "what could possibly go wrong?" before each change.

To achieve these objectives, I plan on using tools from HashiCorp to build out a pretty traditional infrastructure on Amazon Web Services. I'm a big fan of HashiCorp and their tools. Most of their tools are open source, which I like for cost and "religious" reasons. Mitchell Hashimoto was my first guest on the SE Radio podcast when HashiCorp was just launching, and he's great. Once my infrastructure is up and running, I look forward to using their Atlas tool to manage it all and pay Mitchell for all the great stuff he's done.

As mentioned, my initial plan is to build the first version of the infrastructure using AMIs running on EC2 instances as opposed to building Docker containers or running on Google Compute Engine. I made that decision in part to be more conservative (I hate explaining our current environment to prospective customers - no one ever got fired for picking IBM/Cisco/Amazon.) However, by using HashiCorp tools, I am hoping that I can keep my options open in the future.

Hence, I have begun what I'm calling HashiQuest. Stay tuned.

Charles.

A Pair of Interviews

2016-01-13T20:05:00.000-08:00

I had two interviews released back-to-back on Software Engineering Radio around New Year's:

Episode 245: John Sonmez and I discussed his book Soft Skills - in particular the chapters on career management and marketing yourself. He was a great guest. I wish we could have gone over the whole book.
Episode 246: I interviewed John Wilkes from Google about the Borg cluster management software used at Google and Kubernetes. No one, except John, will ever know what a pain in the butt it was to record that episode - epic Skype fails the first time, but John was exceedingly helpful and understanding.

Check them out,

Charles.

A Pleasant Echo

2015-06-15T21:11:00.000-07:00

My podcast interview with James Turnbull was transcribed and published by the nice folks at IEEE. Check it out in the May/June 2015 issue of IEEE Software. That's something I wasn't expecting when I did the interview.

Charles.

Open Classes and Lazy Loading in Rails Don't Always Mix

2015-02-04T20:48:00.000-08:00

Once upon a time, I came across an odd bug in some Rails code that was kicking my butt for some time. The reason it perplexed me was that it was a classic Heisenbug that seemed to come and go. (That, and this was a side project that I couldn't devote much focused time to.)

The bug sometimes manifested itself with this error when running tests -
NoMethodError: undefined method `all' for SocialQueue:Class
where SocialQueue is an ActiveRecord model. If I subsetted the tests being run to just the ones that failed, the bug would go away. And, if I ran other tests before, the error would go away. In other words, the act of trying to observe the bug would change it.

Another variant of the error I found when running the server in development mode was:
undefined method arel_table for QueuedPost:Class. Again, QueuedPost is a model class, and I assume it has that method somewhere in the voodoo that is ActiveRecord.

The error showed up when I added an "innocuous" tracing statement. If I replaced the tracing statement with a puts statement, it worked. If I put a return statement as the first statement in the tracing method, the error persisted - i.e., nothing in the body of that tracing method was causing harm.

The tracing method was in a module, in a separate file, and just requiring that module would cause the errors. How could that be? The module in question didn't hold methods to be included in a class - it was just a namespace for this tracing method. What's so bad about requiring a module?

What else is in that file? Oh yeah, I have some code in there that opens my model classes to add a method to each class. I put the new methods in that file, away from the rest of the model's definitions, because this tracing facility was experimental, and I didn't want to commit to modifying the model classes just yet.

In the words of Merlin Mann, "turns out" in development and test modes, Rails loads class definitions lazily. When my module that opened model classes was loaded, the model classes hadn't necessarily been loaded. If the model was loaded, it worked. Otherwise, it wasn't opening an existing model class, but rather it was opening a new class. Then, when my code that tried to use the models was run, there was already a definition for the class, so Rails didn't load the model class, and the object I thought was a model, was basically a lump of uselessness with the one method I intended to inject into a model class but none of the model methods. I have since heard of lazy loading causing problems in STI.

The long-term solution for my problem is to move those new, injected methods into the main definitions of the models, now that my experiment is over, and I know I want to keep those methods.

In the interim, I came up with a simple hack that can be used anywhere you want to open a model class from some other file (or, maybe I've convinced you not to bother - just edit the model). Right before you open the model class, just mention it. For example:
# Mention the constant first so the real model gets loaded.
SomeModel
class SomeModel
  def my_new_method
  end
end
Mentioning the model causes Rails to load it. Then, when you open it, you're actually opening the real model class.
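The whole effect can be reproduced outside Rails. The sketch below is plain Ruby with a hypothetical stand-in for a lazy loader: REAL_MODELS, Widget, Gadget, and traced? are all made-up names, and the const_missing hook only imitates the "define the class on first reference" behavior described above. It is not how Rails itself is implemented.

```ruby
# Hypothetical stand-in for a lazy loader: the "real" model classes are
# only defined the first time their constant is referenced and found missing.
REAL_MODELS = {
  Widget: -> { Class.new { def self.all; [:row1, :row2]; end } },
  Gadget: -> { Class.new { def self.all; [:row3]; end } }
}

def Object.const_missing(name)
  builder = REAL_MODELS[name]
  return super unless builder
  Object.const_set(name, builder.call)
end

# Reopening Widget before it is loaded silently defines a new, empty class.
# The loader never runs for it, so the model methods never appear.
class Widget
  def traced?
    true
  end
end

# Mentioning Gadget first triggers the loader, so this reopens the real class.
Gadget
class Gadget
  def traced?
    true
  end
end

puts Widget.respond_to?(:all)
puts Gadget.respond_to?(:all)
```

The key detail is that a bare "class Widget" never asks the loader for anything. If the constant is missing, Ruby simply creates a fresh class with that name.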

Perhaps there are better ways to skin this cat, but this works, and in the process, I learned about the existence and dangers of lazy loading in Rails.

enjoy,
Charles.

James Turnbull on Docker

2015-01-07T20:54:00.000-08:00

I interviewed James Turnbull on the Software Engineering Radio podcast to discuss Docker.

Charles.

Mitchell Hashimoto on the Vagrant Project

2014-08-07T18:49:00.000-07:00

I almost forgot to mention it here, but my first podcast episode with SE Radio went live the last week of July. In it, I interviewed Mitchell Hashimoto about the Vagrant project. Stay tuned for more...

Charles.

Living Dangerously with MySQL Fatal Error 1236

2014-04-15T15:48:00.001-07:00

This past weekend, the data center where our MySQL master resides suffered some issues. At first I thought it was just some connectivity issues, but it was a power outage and our nodes were all rebooted. While cleaning up various messes from that, I discovered that our D/R slave in another data center was stuck with the error message:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.014593' at 52888137, the last event read from '/var/log/mysql/mysql-bin.014593' at 4, the last byte read from '/var/log/mysql/mysql-bin.014593' at 4.'

Googling around, I found that this error means that the slave got more data from the master than what the master wrote to its log before the crash. The result was that the slave was asking for data beyond the end of the log file - the master started a new log when it restarted.

I wondered if it would be possible to just move on to the next log, and looking at more postings, I found that it is possible.

NOTE: this procedure is dangerous and may lead to data loss or corruption. I would never do it on a truly critical system like a financial system.

I figured it was worth a try. In the worst case, I would hose the slave and have to rebuild from a fresh dump, which was the only other alternative. I also realized that when the slave restarted, there might be some replication issues around that area in the "log transition."

As the blonde said, "do you want to live forever?"

So, I stopped the slave, moved it to the beginning of the next log file and started it again.
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_POS = 4;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.014593';
START SLAVE;

As anticipated, there were issues. There were a number of UPDATE statements that couldn't be applied because of a missing row. I steered around them, one at a time with:
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

It was a hassle, and it took interventions that I expected, but it was quicker than shutting down my production applications to take a consistent dump, transferring, and restoring it. And, while I was babysitting it, I could write a blog post.

Your mileage may vary,
Charles.

Active Record - joins + includes Methods Causing an Unintended Join

2013-11-25T21:38:00.000-08:00

I recently fixed a problem in my Rails 3.2 app where I was using both the joins and includes methods in an Active Record query, and it was triggering a join that I didn't want. WTF? Why are you using includes and joins, and you don't want a join?

I needed to run a query on table A and I needed to apply criteria against another table B. Thus, I needed to (inner) join those two with the joins method. For the rows of A that met the search criteria, I wanted to eagerly load the corresponding rows from tables X, Y, and Z. Of course, I wanted to avoid a 3N+1 query situation. So, I also used the includes method.

Typically, the includes method generates a query by IDs for the related objects. In my case, I was getting four INNER JOINs - one each for B, X, Y, and Z. Under "normal" circumstances, maybe this would have been OK, but my problem is that table Y is in a separate database, and you can't join across databases. (You can't really do transactions across databases, either.)

My original code used an array of named associations in the joins method - joins(:bs). On a lark, I decided to recode it to use a string - joins('INNER JOIN bs ON bs.a_id = as.id'), and it worked: I got the inner join for B and three individual queries for X, Y, and Z. Because Y is queried as a simple query with an array of IDs, the fact that Y is in another database isn't a problem - it just works.

Anyway, if you've stumbled across this post while trying to solve the same problem, I hope this helps.

Charles.

Ctags for Puppet - Three (previously missing) Pieces

2013-05-23T20:46:00.001-07:00

Back in the day, when I was coding in C on the Unix kernel (before Linux even existed), I used vi's tags functionality extensively. We had a patched version of vi (before vim existed) that supported tag stacks and a hacked version of ctags that picked up all kinds of things like #defines, and it used the -D flags you used when compiling to get you to the right definition of something that was defined many times for various architectures, etc. But, when I moved to C++ with function overloading, ctags broke down for me, and I quit using it.

Recently, I inherited a pretty big Puppet code base. For a long time, I was just navigating it by hand using lots of find and grep commands. Finally, I broke down and figured out how to get ctags working for my Puppet code on OS X. Actually, other people figured it out, but here were the three pieces I had to string together.

A modern version of ctags - aka exuberant ctags. This is pretty easy to install with homebrew, but there is a rub: OS X already has a version of it installed, and depending on how your PATH is configured, the stock version might trump homebrew's version. Matt Pollito has a nice, concise blog post explaining how to cope with that.

Tell ctags about Puppet's syntax: Paul Nasrat has a little post describing the definitions needed in the ~/.ctags file and the invocation of ctags.

Tell vim about Puppet's syntax: Netdata's vimrc file has the last piece:
set iskeyword=-,:,@,48-57,_,192-255
The colon is the key there (no pun intended) - without that, vim wasn't dealing with scoped identifiers and was just hitting the top-level modules.

The last bit is for me to re-learn the muscle memory for navigating with tags that has atrophied after 20 years give or take. BTW, if you don't have tags, a cool approximation within a single file is '*' in command mode - it searches for the word under the cursor.

enjoy,
Charles.

Hadoop Beginner's Guide

2013-05-07T21:21:00.000-07:00

Hadoop Beginner's Guide by Garry Turkington
ISBN: 1849517304

Hadoop Beginner's Guide is, as the title suggests, a new introductory book to the Hadoop ecosystem. It provides an introduction to how to get up and running with the core components of Hadoop (Map-Reduce and HDFS), some higher level tools like Hive, integration tools like Sqoop and Flume, and it also provides some good starting information relating to operational issues with Hadoop. This is not an exhaustive reference like Hadoop: The Definitive Guide, and for a beginner, that's probably a good thing. (In my day, we only had The Definitive Guide, and we liked it!)

Most of the topics are covered in a "dive right in" format. After some brief introduction to the topic the author provides a list of commands or a block of code and invites you to run it. This is followed by "What just happened?" that explains the details of the operation or code. Personally, I don't care for that too much because the explanation is sometimes separated from the code by multiple pages, which was a real hassle reading this as a PDF. But, maybe that's just me.

As I mentioned, the book includes a couple of chapters on operations, which I found to be a nice addition to a beginner's book. Some of these operational details were explained by hands-on experiments like shutting down processes or nodes, in which case "What just happened?" is more like "What just broke?" The operational scenarios are by no means exhaustive (that's what you learn from production), but they provide the reader with some "real life" experience gained in a low-risk environment. And, they introduce a powerful method to learn more operational details: set up an experiment and find out what happens. Learning to learn is the most valuable thing you can gain from any book, class, or seminar.

Another nice feature of this book that I haven't seen in others is that the author includes examples of Amazon EC2 and Elastic Map Reduce (EMR). There are examples of both Map Reduce and Hive jobs on EMR. He doesn't do everything with both "raw" Map Reduce and EMR, because once you know the basics of EMR, the same principles apply to both.

I do have some complaints about the book, but many of them are nit-picking or personal style. That said, I think the biggest thing this book would benefit from would be some very detailed "technical editing." By that I mean there are technical details that got corrupted during the book production process. For example, the hadoop command is often rendered as Hadoop in examples. There are plenty of similar formatting and typographic errors. Of course, an experienced Hadoop user wouldn't be tripped up by these, but this is a "beginner's guide," and such details can cause tremendous pain and suffering for newbies.

To wrap things up, Hadoop Beginner's Guide is a pretty good introduction to the Hadoop ecosystem. I'd recommend it to anyone just starting out with Hadoop before moving on to something more reference-oriented like The Definitive Guide.

enjoy,
Charles.

FTC disclaimer: I received a free review copy of this book from DZone. The links to Amazon above contain my Amazon Associates tag.

Why is my Rails app calling Solr so often?

2011-10-14T21:44:00.000-07:00

I work on the back-end of a Rails app that uses Solr via Sunspot. Looking at the Solr logs, I could see the same item being added/indexed repeatedly, sometimes right before it was deleted from Solr. I didn't write the code, but I was tasked with figuring it out.

Glancing at the main path of the code didn't show anything obvious. I figured the superfluous Solr calls were happening via callbacks somewhere in the graph of objects related to my object in Solr, but which one(s)? Again, I didn't write the code, I just had to make it perform.

I hit on the idea of monkey-patching (for good, not evil) the Sunspot module. Fortunately, most/all of the methods on the Sunspot module just forward the call onto the session object. So, it's really easy to replace the original call with anything you want and still call the real Sunspot code, if that's what you want to do.

This is so easy to do that I even did it the first time in the rails console. In that case, I was happy to abort the index operation when it first happened. So, I whipped this up in a text file and pasted it into the console:

module Sunspot
  class << self
    def index(*objects)
      raise "not gonna do it!"
    end
  end
end

Then, I invoked the destroy operation that was triggering the solr adds, got the stack trace, and could clearly see which dependent object was causing the index operation.

For another case, I needed to run a complex workflow in a script to trigger the offending solr operations. In that case, I wanted something automatically installed when the script started up, and I wanted something that didn't abort - all I wanted was a stack trace. So, I installed the monkey-patch in config/initializers/sunspot.rb and had a more involved index function:

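    def index(*objects)
      puts "Indexing the following objects:"
      objects.each { |o| puts "#{o.class} - #{o.id}" }
      puts "From: =============="
      # Raise and immediately rescue, just to print the current stack trace.
      raise rescue puts $!.backtrace
      puts "===========\n"
      # Forward to the real implementation.
      session.index(*objects)
    end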

That last line is the body of the real version of the index method - like I said, trivial to re-implement; no alias chaining required.
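For anyone who wants to try the technique without a Solr setup, here is a self-contained sketch of the same idea. FakeSearch and FakeSession are hypothetical stand-ins (they are not Sunspot's classes), and the sketch swaps in Kernel#caller to grab the stack instead of the raise/rescue trick, which is just a different way to get the same information.

```ruby
# Hypothetical stand-in for a library whose module-level methods
# simply forward to a session object.
class FakeSession
  attr_reader :indexed

  def initialize
    @indexed = []
  end

  def index(*objects)
    @indexed.concat(objects)
  end
end

module FakeSearch
  class << self
    def session
      @session ||= FakeSession.new
    end

    def index(*objects)
      session.index(*objects)
    end
  end
end

# The patch: reopen the module, record who called, then forward as before.
module FakeSearch
  class << self
    def traces
      @traces ||= []
    end

    def index(*objects)
      # Kernel#caller returns the current call stack without raising.
      traces << { objects: objects, backtrace: caller }
      session.index(*objects)
    end
  end
end

# Hypothetical application code with a "surprise" re-index buried in it.
def destroy_record(record)
  FakeSearch.index(record)
end

destroy_record(:post_42)
trace = FakeSearch.traces.last
puts "Indexed #{trace[:objects].inspect} from:"
puts trace[:backtrace].first(3)
```

Because the patched method still forwards to the session, the application behaves as before. The only addition is the recorded trace, which points at the caller responsible for the extra index operation.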

Maybe there's some cooler way to figure this out, but this worked for me.

enjoy,
Charles.

Rails/Rspec does not clean up model instances on MySQL

2011-08-18T09:50:00.000-07:00

I recently solved a thorn in my side relating to some Rspec tests in our code base when running on my development machine using MySQL. For some reason, some instances that were created using Factory Girl weren't getting cleaned up, which in turn would cause subsequent test runs to fail because of duplicate data. So, I'd DELETE the whole tables from the MySQL prompt. I looked in the test.log file, and I could see the save points being issued before the objects were created, but they weren't getting removed at the end of the test.

I didn't have a lot of time to look into it, and I didn't know where to look - Rspec, Factory Girl, Rails? So, in the short-term, I just added after(:each) hooks to destroy the objects. And, I moved on.

Then, I was dumping schemas in MySQL using SHOW CREATE TABLE in order to analyze some tables and indexes, and I noticed the storage ENGINE flag on the tables. I went back and looked at the tables in my test database that were giving me trouble, and, of course(?), they were MyISAM rather than InnoDB. So, transaction rollback (used to clean up after tests) didn't work.

I changed the storage engine on those tables (ALTER TABLE t1 ENGINE = InnoDB), commented out the manual clean-up code, and voila! It works right now. Pretty obvious in retrospect, but I didn't even know where to start looking in our stack.

I hope this helps some other poor souls, too.

Charles.

Freeing up phone space on Android

2011-04-26T10:02:00.000-07:00

For the last couple of months my Motorola Droid running Android 2.2.2 has been complaining about being "low on space" for the phone, not the SD card. I pruned some apps, but that didn't help much. Things really came to a head this morning when my phone was so low on memory that it was no longer downloading email.

I found this article to be quite helpful -
http://www.androidcentral.com/monthly-maintenance-keeping-things-speedy

For me, the two big ones were Messaging and the Browser cache. I had a couple of threads in Messaging containing a number of pictures. Once I saved the pictures off to the SD card, I purged the threads, which freed up ~20MB. Clearing the browser cache freed another ~20MB, but that will probably evaporate again as the browser caches things.

Here's a minor whine about Android: the SD card and phone storage settings page tells you how big your SD card is and how much space is remaining, but the phone storage just says how much is left. Without knowing how much I had to start with, it's hard to know if, say, 20MB is a lot or not. As near as I can tell, Android seems to complain when the space is less than 25MB.

Update: I ran out of space again, and clearing the browser cache didn't help. After bumping around some more, first I discovered that in "Manage Applications" the one and only menu option is to sort by size. Doing that revealed that the new pig was the (post pay-wall version) New York Times application. It was using over 60MB of data space in the Phone Storage area. The app doesn't have a "clear cache" function, so I used the "Clear Data" button from within Manage Applications, and I was back in action.

enjoy,
Charles.

A Fix for "Exceeded MAX_FAILED_UNIQUE_FETCHES" in Hadoop

2010-11-05T17:19:00.000-07:00

In a project I'm currently working on, we're moving a bunch of our back-end processing to Hadoop. We started a two-node cluster: one master, one slave. That seemed to work fine. Then, we went to four nodes, and about the same time I was testing out a new Hadoop job. The (single) reducer was hanging with this somewhat cryptic message:

Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out

I went out to the slave node and looked through the job logs, and I could see that it was timing out trying to transfer data from one of the other slave nodes - an upstream mapper node. Upon closer scrutiny of the log file I realized that Hadoop was trying to transfer from the other slave's public IP address, which is behind a firewall that blocks public access.

Key take-away number one: when you're just starting out with Hadoop, if you only have one slave, you've only demonstrated one real communication path: master-to-slave. Your cluster isn't doing any slave-to-slave transfers because everything was on the one slave. Also, our initial job had no reducer, so it ran fine on the new, 4-node cluster because it was still only master-slave communication.

For some reason, the mapper slave was advertising the location of the map output data via its public IP address. My first attempt at fixing this problem involved the dfs.datanode.dns.interface configuration parameter (and its mapred equivalent). This tells Hadoop that when a process (mapred or dfs) wants to figure out its host name, it should use the IP address associated with the given interface. (You could even have dfs and mapred using separate interfaces for additional throughput.)

This failed for me because I had one interface with two addresses, not two interfaces. I dug through the Hadoop DNS code (org.apache.hadoop.net.DNS - God, I love open-source: you can just look for yourself) and saw that if there is one interface, the code loops through the IP addresses and performs reverse DNS lookups and takes the first successful result. I was fortunate in that the private IP address was coming up first in that enumeration of the IPs on the interface, but it still wasn't working. I talked to our system admin/configuration guru. It turns out that our hosting provider doesn't provide reverse DNS for those private IP addresses. We could have set up our own DNS server for just these reverse lookups, but there was a brute-force option available to us.

You can bypass all of Hadoop's efforts to automatically figure out the slave's host name by specifying the slave.host.name parameter in the configuration files. If that is set, Hadoop will just take your word for it and use the name you provide. Now, in theory, this might be onerous - it means you have a different configuration file per-slave. However, our cluster is configured and maintained via Puppet. So, our puppet master just tweaked his Puppet script, and we never looked back.
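Here is a rough sketch of the lookup order as I've described it, to show why the public name won and why the override helps. It models only the behavior described in this post and is not Hadoop's actual code: advertised_host_name, the reverse_dns lambda, the addresses, and the host names are all invented for illustration.

```ruby
# Hypothetical model of the lookup order described above.
# reverse_dns is any callable that returns a host name or nil for an IP.
def advertised_host_name(addresses, reverse_dns, configured_name = nil)
  # An explicit name (the role slave.host.name plays) short-circuits everything.
  return configured_name if configured_name

  # Otherwise, take the first address whose reverse lookup succeeds.
  addresses.each do |ip|
    name = reverse_dns.call(ip)
    return name if name
  end
  nil
end

# Made-up data: the private address comes first but has no reverse DNS entry.
addresses = ["10.0.0.12", "203.0.113.7"]
reverse_table = { "203.0.113.7" => "node2.public.example.com" }
reverse_dns = ->(ip) { reverse_table[ip] }

auto = advertised_host_name(addresses, reverse_dns)
forced = advertised_host_name(addresses, reverse_dns, "node2.internal.example.com")

puts auto
puts forced
```

Without the override, the private address is skipped because its reverse lookup fails, and the public name gets advertised. With the override, no lookup happens at all.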

Take-away number two: Exceeded MAX_FAILED_UNIQUE_FETCHES could mean a simple connectivity problem. I'm sure there are other possible causes, but an inability to connect between slaves is comparatively simple to troubleshoot.

enjoy,
Charles.

Django vs. Grails

2010-04-06T14:56:00.000-07:00

When I came up with the Five Technologies in Five Weeks project, I hadn't intended to compare any of the technologies to each other directly. My original ordering for the technologies didn't have any head-to-head match-ups, but I just finished my Grails project, which was "right after" (modulo interruptions) my Django/GAE project. And, while I was working on Grails, I kept comparing it to Django, and I felt compelled to write up my observations.

"All things being equal, which they never are" (Manager Tools), I'd choose Django over Grails for a new, green-field project. However, given the constraint to run in a Java environment, I'd gladly choose Grails over the other Java/J2EE frameworks I'm familiar with. (Django on Jython would be a contender as would Wicket, but I don't have hands-on experience with either.) This is not a slam on Grails.

Here are some of the reasons I prefer Django over Grails:

Development Speed (not runtime performance): running (a trivial number of) unit tests for Grails on my machine took about 15 seconds (with Folding@Home in the background). Django unit tests typically only take me a couple of seconds (for a trivial number of tests). Integration tests in Grails were 30-40 seconds; Django might be 5 seconds. Although Grails will reload the running webapp when you make a change, sometimes things are really buggered, which requires a full restart. That seemed to happen more often in Grails than Django, and restarts take ~20 seconds vs. 2 seconds.
Compared to the 5-7 minutes I've seen WebSphere take to restart an application, Grails is blazingly fast. But the 15-30 seconds that many operations take in Grails is long enough for the mind to wander, which is not good.
Update: a commenter pointed out the (under-documented) interactive mode of Grails. That speeds the edit-test-edit loop considerably.
Voodoo: both Grails and Django do some voodoo behind the scenes to reduce the amount of chimp programming (e.g., non-DRY boilerplate) the programmer has to do. This voodoo involves topics of the high priests like meta object programming that mere mortals typically don't have to worry about. Maybe it's just that I don't have as much "time behind the wheel" with Grails, but I feel like its voodoo leaks out a bit and is too voodoo-rific. A Grails programmer has to be aware of which methods are dynamically added to a class in order to separate unit tests (no voodoo) from integration tests (full voodoo). And, although I like the power of dynamic finder methods like User.findByName(), it kinda bugs me that the code for that doesn't exist somewhere that I can see (you know, on punch cards!) You don't see the dynamic methods on a Django class, but there are a lot fewer of them, so it seems less voodoo-rific. Again, maybe more time behind the wheel of Grails would make me feel more comfortable.
(As an aside, when I taught Python and Django in Spring 2009, the students didn't even notice the voodoo of fields on models in Django until I pointed it out to them. Then, I exploded their heads with MOP.)
View Technology: I have a long-standing personal preference for templates instead of ASP/JSP-like mark-up languages, and I'd lump Grails GSP pages in the latter category. Back in the day, Jason Hunter wrote an essay called The Problems with JSP that really stuck with me. GSPs are much, much better than the Model 1 JSP pages that Hunter talks about, but they still feel similar enough to make me think about using something like Velocity with Grails. Django kicks templates up a notch by having inheritance with templates, which I really love. (Even if Django didn't invent template inheritance.)
Dynamic Language Issues: This is a really unfair comparison because I've been using Python for 15 years, but I felt that errors in Grails/Groovy were more cryptic and harder to find compared to Python. If I typo the named parameter to a method in Python, it lets me know immediately. More than once in Grails I misspelled a named parameter to a constructor, which failed silently and then led to a validation failure later when I went to save the object. Some of that is from lack of experience with Groovy/Grails - i.e., learning what error messages really mean. But still, Python seems to fail in a more helpful way. (I'd say Python fails gracefully, but if anything it's the opposite: very loud - "you dumb-ass, there is no parameter called recipent" when I misspelled recipient.)
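
To make the "fails loudly" point concrete, here is a minimal sketch of what Python does with a misspelled named parameter. The function send_message and its parameters are made up for illustration; they are not from my project.

```python
def send_message(recipient, body):
    """Hypothetical function used only to show the failure mode."""
    return f"To {recipient}: {body}"


try:
    # "recipent" is deliberately misspelled.
    send_message(recipent="bob", body="hello")
except TypeError as err:
    # Python rejects the call at the call site instead of silently
    # ignoring the unknown name.
    message = str(err)
    print(message)
```

The call never reaches the function body, so the mistake shows up at the line that made it rather than at some later save or validation step.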

I don't mean for this to be a hit-piece against Grails; I really like Grails. I look forward to using it some more. It's just that, for what it's worth, I like Django more.

5Tech - Week 2: Grails

2010-04-06T13:54:00.000-07:00

For the second project and technology in five weeks I chose Grails. I was very curious to try Grails on a "real"/non-tutorial project to confirm its usability and productivity. On my last contract gig, I was in a non-development role (configuration management) on a project that was using JSF 1.1/Spring/Hibernate and WebSphere, and their development looked really painful by my standards. For example, they'd spend 5-7 minutes to redeploy the app just to inspect some HTML change in the JSF page. I was looking to Grails to provide a much more productive environment.

The project involved creating a system to process events and route them to users. It was inspired by a code base that I worked on for a previous client. (This was done with their permission.) They have a large code base that includes "application functionality" like this event routing, as well as a lot of "technical functionality" that they rolled themselves years ago before modern tools like Spring and Hibernate were created and became mainstream. In a way, this was a small prototype to research the feasibility of reimplementing the application functionality on a more modern platform - Grails.

Analysis
All in all the project went very smoothly. Because Grails is comparatively mature, there exists a fair amount of documentation, including numerous books. I leaned heavily on Grails in Action by Smith and Ledbrook. I was able to follow their examples and adapt them to my application easily. There just weren't any serious gaps or surprises.

Using Grails to create domain classes (database entities) was a breeze. It's so nice to just declare the fields and their constraints and have Grails "do the right thing." You don't have to bother with annotations, let alone XML configuration files. Creating controllers to process web requests is trivial, and being able to scaffold them to get the basic CRUD functionality in place immediately was very conducive to high productivity.

I also liked the idea that services are a first-class concept in Grails (along with controllers and domain classes), which makes it very easy to program with them. Grails wires them into the classes that use the services via Spring, but again, you don't have to monkey with Spring's applicationContext.xml file. Finally, adding the REST interface for incoming events was almost trivial, especially since I already had the service in place.

Grails (like Rails) treats testing as a core concept, not something you wave your hands at after the fact. It has the concept of unit and integration tests, and there are a number of functional test plug-ins, too. I did a fair amount of unit and integration testing, which was a real life-saver. Due to the very dynamic nature of Groovy, many of my typos were not caught in the compile phase, but exercising the code in tests did catch those. I had very few issues when I ran the actual web application.

A lot of developers are unclear on the difference between unit testing and integration testing. For better or worse, Grails makes some clear distinctions. A lot of the Grails voodoo (e.g., dependency injection and adding dynamic methods to classes) is not available in unit tests, or you have to add it yourself via mocking or manual injection. Thus, there is a simple distinction: if it runs under "test-app -unit", it's a unit test, otherwise it's an integration test. This bit me once early on when I wrote a test that needed that higher-order functionality, but I put it in a unit test. First I got a null pointer exception because the service hadn't been injected, then I got a missing method exception because the save method on the domain object hadn't been added dynamically. However, the fix was simple enough - move the test to the integration folder, and it ran fine with -integration. Then, I copied it back, added a few mocks (very easy in Grails), and I had a unit test, too.
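
The same trap can be sketched outside of Grails. What follows is a rough Python analogy, not Grails code: EventRouter and its notifier are hypothetical names, and the "framework" that would normally inject the collaborator is simply absent. It shows the unit test failing without injection and passing once a mock is supplied by hand.

```python
from unittest import mock


class EventRouter:
    """Hypothetical service; a framework would normally inject notifier."""

    def __init__(self, notifier=None):
        self.notifier = notifier

    def route(self, event):
        self.notifier.send(event["user"], event["message"])
        return True


event = {"user": "alice", "message": "disk full"}

# Without injection: the Python cousin of the null pointer exception.
try:
    EventRouter().route(event)
    failed_without_injection = False
except AttributeError:
    failed_without_injection = True

# With a hand-supplied mock: a true unit test, no framework required.
fake_notifier = mock.Mock()
routed = EventRouter(fake_notifier).route(event)
fake_notifier.send.assert_called_once_with("alice", "disk full")
```

The design choice is the same one Grails pushes you toward: anything the framework would normally provide is either mocked explicitly or the test belongs in the integration folder.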

In terms of "project management," this step of the Five Technologies suffered from some "life happens" distractions. Rather than running Monday through Friday, it ended up being Tuesday through Monday, and some of those days were a bit short-changed. When I was on task, the Pomodoro Technique continued to be effective, and I replaced my dead watch/timer with a dedicated Android application called Pomodoro Tasks - it's even open-source.

In conclusion, Grails was another success for the Five Technologies in Five Weeks project. I got the application functionality done that I expected. Grails was very usable and productive - no nasty surprises. What's next? Probably, NetBeans Platform because I have documentation for it already whereas I'm waiting on some Android books.

5Tech - Week 1: Google App Engine

2010-03-24T11:15:00.000-07:00

For the first of the five technologies in five weeks, I picked something easy - Google App Engine using Python and Django. As someone who's been using Python for 15 years, there was no language learning curve. And using the Django helper for app engine package allowed me to leverage my Django experience. So with a minimal learning curve, the results were basically all good.

The project involved creating a simple application to monitor web sites to check if they are up or down and notify the user about status changes. As such, the core of the application isn't even a web app. In fact, I've implemented the same thing a couple of different times as a standalone program in Python or Java. However, to run a standalone application like that requires a server where it can run, and that's not something I've always had access to. GAE's cron service provides the ability to run checks periodically - just like the main loop in my standalone applications, and GAE provides a large number of notification options, although I only used email initially. The application does have a web interface for configuring checks and viewing the status.
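
For a sense of how small the core is, here is a minimal sketch of the check-and-compare logic using only the Python standard library. It is not the GAE code: the function names is_up and status_changes are invented for this illustration, and the scheduling (cron) and notification (email) pieces are left out.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the site answers with a non-error HTTP status.

    opener is injectable so the check can be tested without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False


def status_changes(previous, current):
    """Return {url: (old, new)} for known sites whose status flipped."""
    return {
        url: (previous[url], up)
        for url, up in current.items()
        if url in previous and previous[url] != up
    }


# One pass of what a periodic job would do, with made-up results:
last_run = {"http://example.com": True, "http://example.org": True}
this_run = {"http://example.com": True, "http://example.org": False}
to_notify = status_changes(last_run, this_run)
```

Only the entries in to_notify would trigger a notification, which keeps the user from being pestered every time a check runs.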

Project Analysis

My biggest problem developing the application was the fact that I was working with three separate but overlapping tools/frameworks: GAE, Django, and the Django helper. These all worked fine, but when I wanted to do some task, I didn't know where to look for the "right" way to do it. After wasting a bunch of time searching for things only to find that the Django way was the right way, I just quit asking and adopted the "try Django first" strategy. And then, the first time I went to apply that I got burned - there are some slight differences in how the filter method is handled in Django helper versus "native" Django. But for the most part, Django first is the way to go. In addition to the online documentation for these three tools/frameworks, I used Programming Google App Engine by Sanderson, which covers all of them - I highly recommend it.

Although my plan with Five Technologies is to use best practices, I am embarrassed to admit that I didn't practice much test driven development on this project despite having "home field advantage" with the technology - Django. Some of that is due to the exploratory nature of the programming - just figuring out which end is up, and some of it was the confusion over how to do tests - as noted above, the answer is "the Django way". Also, I found a bug in the app engine helper when loading a fixture, which hung me up - I'll be submitting a patch shortly.

One (non-technical) practice that I adopted was the Pomodoro Technique, and that worked pretty well. The 25-minute mini-sprints were really nice to contain technology-induced ADD. However, the watch I was using as my timer died during a pomodoro, which led to a very long and productive pomodoro.

In terms of the bigger goals of Five Technologies, I have another confession: I did not restrict myself to one week. I created version 0.1 and deployed it to the cloud within one week, but the week following this project I went to the Java Posse Roundup, and I kept working on the Django app, even though I should have been focusing on Java. It's just that after a week of working on GAE, I had built up some good momentum and didn't want to quit. I fear that this will be a real problem for the next projects because I won't have the luxury of continuing to work on whatever technology when I begin the next project.

In conclusion, the first technology experiment was a success, even if I wasn't dogmatic. What's next? Most likely Grails. Stay tuned.

Update: I forgot to mention something cool I learned about - schema migration, or the lack thereof. Before I began the project I was fretting about schema migration on GAE because I've been too lazy to learn something like South, and therefore I do schema migrations at a SQL command prompt. Obviously, there is no SQL prompt for a NoSQL database like Google's BigTable. Then I forgot about the issue, but halfway through I looked up and said, "hey, I haven't been doing any migrations, but this all works." Duh! - like many NoSQL databases, BigTable is schema-less, so there is no schema to migrate. Problem solved. OK, the application has to be prepared to deal with an attribute on a record/row that isn't present, but that code is basically the same as dealing with a NULL value that you might specify as the default value when you issue an ALTER TABLE to add a column. Also, you can still imagine scenarios where you might have to do some sort of schema/data conversion, but without even consciously thinking about it, I managed to avoid those. That was cool.
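
Here is a minimal sketch of what "be prepared for a missing attribute" looks like. Plain Python dicts stand in for schema-less records, which is an assumption for illustration only (it is not the GAE datastore API), and the field names are made up.

```python
# A record saved before the notify_email attribute existed...
old_record = {"url": "http://example.com", "interval_minutes": 5}

# ...and one saved after it was added. No migration ran in between.
new_record = {
    "url": "http://example.org",
    "interval_minutes": 5,
    "notify_email": "me@example.com",
}


def notification_target(record, default="nobody"):
    """Read an attribute that older records may not have.

    The default plays the role of the DEFAULT/NULL value you would
    otherwise specify in an ALTER TABLE ... ADD COLUMN.
    """
    return record.get("notify_email", default)


targets = [notification_target(r) for r in (old_record, new_record)]
print(targets)
```

The cost of skipping migrations is that every reader of the attribute has to carry the default with it, so it pays to keep that logic in one place.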

Five Technologies in Five Weeks

2010-03-08T16:35:00.000-08:00

I am currently between consulting jobs, and during the down time, I have embarked on a project to learn five new (to me) technologies in five weeks. The reasons for doing this include:

Learn new things - this project is a variant on the "learn one new language a year" meme that's been going around. I'm just taking on five things (not necessarily languages) in much less than a year.
Bust some code - it's been a while since I've been able to do any hard-core coding. This will be a sprint which can blow out some cobwebs.
Improve my development practices by trying some new techniques and focusing on refining existing ones.

The technologies I plan on tackling are (subject to change):

Google App Engine (with Python, not Java)
Grails
Android
NetBeans Platform (not just using the IDE)
Griffon

I chose these technologies because I know of them but haven't actually built anything with them. Also, they are technologies that I'm interested in testing out to see how usable/effective they are and if I should pursue them further.

In a way, each week's technology is a bit like a "spike" in an agile project, only I'm not looking to evaluate/sketch out solutions to application problems but rather evaluate technologies in a more abstract sense. Although, in some cases (e.g., NetBeans Platform and Griffon) I have some application ideas I'd like to implement, and I really am spiking possible solutions.

For each technology, I've got a modest application in mind. I have (or will have) a series of story cards describing various aspects of the application I'd like to create. And then, I will sprint for a week to implement as many of the stories as possible. I'll also post at least one blog entry as a retrospective for each sprint.

I've been planning this for some time - ever since the end was in sight on my last contract. The most significant threat to this undertaking is not failing at one or more technologies, but rather if I find another contract before I've completed the technologies. (There are worse things than finding a paying job when you're currently between jobs.) Another known disruption to the "five weeks" is that I'm taking one week off to go to the Java Posse Roundup, which will technically make this five technologies in six weeks, but that isn't as snappy as five in five.

Hiring is Only for Managers?

2010-03-04T10:12:00.000-08:00

A friend of mine told me that he just met the new guy on their team. I thought it was odd that he was just meeting a new team member, so I asked if he wasn't around during the interviews, and he told me that the developers on the team never interview candidates - only managers do that. As near as I can tell, they do this to minimize the time required during interviewing. This is just wrong.

In the words of Manager Tools, "hiring is the most important job that a manager does" because, in part, failures pull down the whole team for months or years. And, team-fit is a crucial part of that interviewing process. All things being equal (which they never are), it's better to hire a technical 8 who has a 10 personality than it is to hire a technical 10 who has only an 8 (or lower) personality. If nothing else, you can teach technical stuff, but you can't teach personality.

Dave Ramsey describes his lengthy, multi-round interview process that even includes dinner with spouses to ensure team (in the largest sense) fit. As lengthy as the process is, he points out that fixing a hiring mistake costs much more than the added time of proper interviews.

I'd even argue that this manager-only interviewing process produces shortcomings in the technical area, too, because the manager works off of a superficial checklist that s/he has to get through quickly in the interview. Thus, if a candidate is asked "do you know web framework X," and the answer is "no, but I know frameworks A through F, and I wrote a framework I call G," that candidate is treated the same as someone straight out of school who doesn't know any frameworks. This narrow-mindedness leads to hires that know framework X, but they store passwords as plain-text because no one told them not to, and none of them knows how to hash a password before storing it, not even with MD5 (another story from my friend). These are what Eric Sink calls programmers, not developers - you want (well-rounded) developers.
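
For the record, not storing plain-text passwords takes only a few lines. This is a minimal sketch using Python's standard library; note that it swaps in salted PBKDF2 rather than the MD5 from my friend's story, since a bare MD5 hash is too fast to resist guessing. The function names and the iteration count are my own assumptions, not anything from his code base.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for storage instead of the password itself."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

A candidate who can explain why the salt and the iteration count are there is probably a developer, whether or not they know framework X.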

So, with apologies to Georges Clemenceau, interviewing is too important to be left exclusively to the managers.

JIRA Workflow Transition Displaying Wrong Text

2010-02-11T10:11:00.000-08:00

I've been doing some JIRA administration and customization for a client, and I ran into a problem that was driving me crazy. I created a new workflow to deal with an Issue Type called Change Request by copying an existing workflow. I renamed one of the transitions from "Close Issue" to "Decline CR," and I associated a screen with the transition (the reason for creating a new workflow).

The name of the transition ("Decline CR") displayed correctly in the administration page, but in the regular display page of a Change Request issue, the name of the workflow action was still the old value ("Close Issue") from the original workflow. I tried a few things to fix it, but nothing seemed to help. I even tried changing the workflow scheme and changing it back. Still nothing.

This morning while fishing around in the administration pages, I found the problem. The transition included an internationalization property, jira.i18n.title, that specifies what appears to be the title of the workflow action, closeissue.title. I picked up that property when I cloned the workflow - the ultimate origin of the workflow was JIRA's default workflow, which was internationalized. Removing the property got the workflow action name ("Decline CR") to display correctly. (I'm not working in a multi-language environment, so I don't need any localization.)

World's Ugliest URLs

2009-10-01T20:58:00.000-07:00

I live outside the city of Monmouth, Oregon (beneath the Western Skies, of course). Our city's official website has URLs like:
http://www.ci.monmouth.or.us/index.asp?Type=B_BASIC&SEC={F6D36CB4-8AB1-4E2F-9F16-EEB14A3A83DD} - Things to See & Do
http://www.ci.monmouth.or.us/index.asp?Type=B_BASIC&SEC={6010A930-F666-42AF-A359-971AC53933A1} - City Government

OK, maybe they're not the ugliest in the world, but those bad boys have gotta rank way up there. Anyway, when I first heard Jacob Kaplan-Moss talk about pretty URLs in Django, I thought some of the criticisms were a little nit-picky. Initially, I didn't see anything wrong with index.php, but I could see the basic point even before seeing Monmouth's URLs. When I converted the Evergreen Terrace Farms site from PHP to Django, cleaning up the URLs was something I was pleased with. For example, www.et-farms.com/animals/detail.php?name=bart became www.etfarms.com/animals/bart. So, I guess I have drunk the Kool-Aid, and I have become a bit of a URL snob.
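
If you make the same move, the old links still have to land somewhere. Here is a minimal sketch, using only the Python standard library, of mapping an old detail.php-style URL to its pretty replacement. The function name pretty_path is invented for this illustration, and this is not the code from the actual site; in Django the new path would be matched by a URL pattern instead.

```python
from urllib.parse import parse_qs, urlsplit


def pretty_path(old_url):
    """Map .../animals/detail.php?name=X to /animals/X, else None."""
    parts = urlsplit(old_url)
    names = parse_qs(parts.query).get("name")
    if parts.path.endswith("/animals/detail.php") and names:
        return f"/animals/{names[0]}"
    return None  # not one of the old-style animal URLs


print(pretty_path("http://www.et-farms.com/animals/detail.php?name=bart"))
```

Anything that returns None falls through to the normal handling, so only the old animal pages get redirected.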

Regardless of how picky you are about URLs, I think any sane person would agree that those URLs for the Monmouth site are just crazy. Just imagine trying to read one of those to your mom over the phone - "no, it's 42AF, and the A and the F are capitalized."

In my opinion, the saddest thing is that the site is a commercial product created by a company that boasts about how many governments they've sold it to. It would be one thing if some students created something like that for a senior project, but when you're charging people money for something like that you should at least not expose the ugliest of the ugly Microsoft crap from the depths of the implementation (I assume those are UUIDs generated by .NET). I guess I'm also disappointed that no one in the city even noticed those URLs before putting out taxpayer money.

Charles.