Monday, June 15, 2015

A Pleasant Echo

My podcast interview with James Turnbull was transcribed and published by the nice folks at IEEE.  Check it out in the May/June 2015 issue of IEEE Software.  That's something I wasn't expecting when I did the interview.


Wednesday, February 04, 2015

Open Classes and Lazy Loading in Rails Don't Always Mix

Once upon a time, I came across an odd bug in some Rails code that was kicking my butt for some time.  The reason it perplexed me was that it was a classic Heisenbug that seemed to come and go.  (That, and this was a side project that I couldn't devote much focused time to.)

The bug sometimes manifested itself with this error when running tests -
NoMethodError: undefined method `all' for SocialQueue:Class
where SocialQueue is an ActiveRecord model.  If I subsetted the tests being run to just the ones that failed, the bug would go away.  And, if I ran other tests before, the error would go away.   In other words, the act of trying to observe the bug would change it.

Another variant of the error I found when running the server in development mode was:
undefined method arel_table for QueuedPost:Class
Again, QueuedPost is a model class, and I assume it has that method somewhere in the voodoo that is ActiveRecord.

The error showed up when I added an "innocuous" tracing statement.  If I replaced the tracing statement with a puts statement, it worked.  If I put a return statement as the first statement in the tracing method, the error persisted - i.e., nothing in the body of that tracing method was causing harm.

The tracing method was in a module, in a separate file, and just requiring that module would cause the errors.  How could that be?  The module in question wasn't methods to be included in a class - it's just a name space to put this tracing method.  What's so bad about requiring a module?

What else is in that file?  Oh yeah, I have some code in there that opens my model classes to add a method to each class.  I put the new methods in that file, away from the rest of the model's definitions, because this tracing facility was experimental, and I didn't want to commit to modifying the model classes just yet.

In the words of Merlin Mann, "turns out" in development and test modes, Rails loads class definitions lazily.  When my module that opened model classes was loaded, the model classes hadn't necessarily been loaded.  If the model was already loaded, it worked.  Otherwise, my code wasn't opening an existing model class; it was defining a new class.  Then, when the code that tried to use the model ran, there was already a definition for the class, so Rails didn't load the real model, and the object I thought was a model was basically a lump of uselessness with the one method I intended to inject but none of the model methods.  I have since heard of lazy loading causing problems with STI (single-table inheritance), too.

The long-term solution for my problem is to move those new, injected methods into the main definitions of the models, now that my experiment is over, and I know I want to keep those methods.

In the interim, I came up with a simple hack that can be used anywhere you want to open a model class from some other file (or, maybe I've convinced you not to bother - just edit the model).  Right before you open the model class, just mention it.  For example:
SomeModel  # mention the model first so Rails loads the real class
class SomeModel
  def my_new_method(); end
end
Mentioning the model causes Rails to load it.  Then, when you open it, you're actually opening the real model class.
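
To see the failure mode outside of Rails, here is a minimal plain-Ruby sketch.  It assumes lazy loading is driven by a const_missing hook, which is how I understand Rails' classic autoloader to work.  The LAZY_MODELS registry, the trace_tag method, and the stand-in all method are hypothetical names that only imitate the behavior described above.

```ruby
# A toy lazy loader (hypothetical): a model class is only defined the first
# time its constant is referenced and found to be missing.
LAZY_MODELS = {
  SocialQueue: -> { Class.new { def self.all; [:row1, :row2]; end } },
  QueuedPost:  -> { Class.new { def self.all; [:row3]; end } }
}

class Object
  def self.const_missing(name)
    loader = LAZY_MODELS[name]
    return super unless loader
    Object.const_set(name, loader.call)
  end
end

# Broken: QueuedPost has not been loaded yet, so this defines a brand-new,
# nearly empty class, and the lazy loader never gets a chance to run.
class QueuedPost
  def trace_tag; "queued_post"; end
end

# Fixed: mentioning the constant first triggers the load, so the class
# keyword reopens the real model.
SocialQueue
class SocialQueue
  def trace_tag; "social_queue"; end
end

puts QueuedPost.respond_to?(:all)   # => false (the impostor has no model methods)
puts SocialQueue.respond_to?(:all)  # => true  (the real model keeps its methods)
```

The class keyword never consults const_missing; it either reopens a constant that already exists or quietly creates a new one, which is why the order of the two statements matters.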

Perhaps there are better ways to skin this cat, but this works, and in the process, I learned about the existence and dangers of lazy loading in Rails.


Wednesday, January 07, 2015

Thursday, August 07, 2014

Mitchell Hashimoto on the Vagrant Project

I almost forgot to mention it here, but my first podcast episode with SE Radio went live the last week of July.  In it, I interviewed Mitchell Hashimoto about the Vagrant project.  Stay tuned for more...


Tuesday, April 15, 2014

Living Dangerously with MySQL Fatal Error 1236

This past weekend, the data center where our MySQL master resides suffered some issues.  At first I thought it was just some connectivity issues, but it was a power outage and our nodes were all rebooted.  While cleaning up various messes from that, I discovered that our D/R slave in another data center was stuck with the error message:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.014593' at 52888137, the last event read from '/var/log/mysql/mysql-bin.014593' at 4, the last byte read from '/var/log/mysql/mysql-bin.014593' at 4.'

Googling around, I found that this error means that the slave got more data from the master than what the master wrote to its log before the crash.  The result was that the slave was asking for data beyond the end of the log file - the master started a new log when it restarted.

I wondered if it would be possible to just move on to the next log, and looking at more postings, I found that it is possible.

NOTE: this procedure is dangerous and may lead to data loss or corruption.  I would never do it on a truly critical system like a financial system.

I figured it was worth a try.  In the worst case, I would hose the slave and have to rebuild from a fresh dump, which was the only other alternative.  I also realized that when the slave restarted, there might be some replication issues around that area in the "log transition."

As the blonde said, "do you want to live forever?"

So, I stopped the slave, moved it to the beginning of the next log file and started it again.
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.014593';

As anticipated, there were issues.  There were a number of UPDATE statements that couldn't be applied because of a missing row.  I steered around them, one at a time with:

It was a hassle, and it took interventions that I expected, but it was quicker than shutting down my production applications to take a consistent dump, transferring, and restoring it.  And, while I was babysitting it, I could write a blog post.

Your mileage may vary.

Monday, November 25, 2013

Active Record - joins + includes Methods Causing an Unintended Join

I recently fixed a problem in my Rails 3.2 app where I was using both the joins and includes methods in an Active Record query, and it was triggering a join that I didn't want.  WTF?  Why are you using includes and joins, and you don't want a join?

I needed to run a query on table A and I needed to apply criteria against another table B.  Thus, I needed to (inner) join those two with the joins method.  For the rows of A that met the search criteria, I wanted to eagerly load the corresponding rows from tables X, Y, and Z.  Of course, I wanted to avoid a 3N+1 query situation.  So, I also used the includes method.

Typically, the includes method generates a query by IDs for the related objects.  In my case, I was getting four INNER JOINs - one each for B, X, Y, and Z.  Under "normal" circumstances, maybe this would have been OK, but my problem is that table Y is in a separate database, and you can't join across databases.  (You can't really do transactions across databases, either.)

My original code used an array of named associations in the joins method - joins(:bs).  On a lark, I decided to recode it to use a string - joins('INNER JOIN bs ON bs.a_id ='), and it worked:  I got the inner join for B and three individual queries for X, Y, and Z.  Because Y is queried as a simple query with an array of IDs, the fact that Y is in another database isn't a problem - it just works.

Anyway, if you've stumbled across this post while trying to solve the same problem, I hope this helps.


Thursday, May 23, 2013

Ctags for Puppet - Three (previously missing) Pieces

Back in the day, when I was coding in C on the Unix kernel (before Linux even existed), I used vi's tags functionality extensively.  We had a patched version of vi (before vim existed) that supported tag stacks and a hacked version of ctags that picked up all kinds of things like #defines, and it used the -D flags you used when compiling to get you to the right definition of something that was defined many times for various architectures, etc.  But, when I moved to C++ with function overloading, ctags broke down for me, and I quit using it.

Recently, I inherited a pretty big Puppet code base.  For a long time, I was just navigating it by hand using lots of find and grep commands.  Finally, I broke down and figured out how to get ctags working for my Puppet code on OS X.  Actually, other people figured it out, but here are the three pieces I had to string together.

A modern version of ctags - aka exuberant ctags.  This is pretty easy to install with homebrew, but there is a rub: OS X already has a version of it installed, and depending on how your PATH is configured, the stock version might trump homebrew's version.  Matt Pollito has a nice, concise blog post explaining how to cope with that.

Tell ctags about Puppet's syntax: Paul Nasrat has a little post describing the definitions needed in the ~/.ctags file and the invocation of ctags.

Tell vim about Puppet's syntax: Netdata's vimrc file has the last piece:
set iskeyword=-,:,@,48-57,_,192-255
The colon is the key there (no pun intended) - without that, vim wasn't dealing with scoped identifiers and was just hitting the top-level modules.

The last bit is for me to re-learn the muscle memory for navigating with tags that has atrophied after 20 years give or take.  BTW, if you don't have tags, a cool approximation within a single file is '*' in command mode - it searches for the word under the cursor.


Tuesday, May 07, 2013

Hadoop Beginner's Guide

Hadoop Beginner's Guide by Garry Turkington
ISBN: 1849517304

Hadoop Beginner's Guide is, as the title suggests, a new introductory book to the Hadoop ecosystem.  It provides an introduction to how to get up and running with the core components of Hadoop (Map-Reduce and HDFS),  some higher level tools like Hive, integration tools like Sqoop and Flume, and it also provides some good starting information relating to operational issues with Hadoop. This is not an exhaustive reference like Hadoop: The Definitive Guide, and for a beginner, that's probably a good thing.  (In my day, we only had The Definitive Guide, and we liked it!)

Most of the topics are covered in a "dive right in" format.  After some brief introduction to the topic the author provides a list of commands or a block of code and invites you to run it.  This is followed by "What just happened?" that explains the details of the operation or code.  Personally, I don't care for that too much because the explanation is sometimes separated from the code by multiple pages, which was a real hassle reading this as a PDF.  But, maybe that's just me.

As I mentioned, the book includes a couple of chapters on operations, which I found to be a nice addition to a beginner's book.  Some of these operational details were explained by hands-on experiments like shutting down processes or nodes, in which case "What just happened?" is more like "What just broke?"  The operational scenarios are by no means exhaustive (that's what you learn from production), but they provide the reader with some "real life" experience gained in a low-risk environment.  And, they introduce a powerful method to learn more operational details: set up an experiment and find out what happens.  Learning to learn is the most valuable thing you can gain from any book, class, or seminar.

Another nice feature of this book that I haven't seen in others is that the author includes examples using Amazon EC2 and Elastic Map Reduce (EMR).  There are examples of both Map Reduce and Hive jobs on EMR.  He doesn't run every example on both "raw" Hadoop and EMR because, once you know the basics of EMR, the same principles apply to both.

I do have some complaints about the book, but many of them are nit-picking or personal style.  That said, I think the biggest thing this book would benefit from would be some very detailed "technical editing."  By that I mean there are technical details that got corrupted during the book production process.  For example, the hadoop command is often rendered as Hadoop in examples.  There are plenty of similar formatting and typographic errors. Of course, an experienced Hadoop user wouldn't be tripped up by these, but this is a "beginner's guide," and such details can cause tremendous pain and suffering for newbies.

To wrap things up, Hadoop Beginner's Guide is a pretty good introduction to the Hadoop ecosystem.  I'd recommend it to anyone just starting out with Hadoop before moving on to something more reference-oriented like The Definitive Guide.


FTC disclaimer: I received a free review copy of this book from DZone.  The links to Amazon above contain my Amazon Associates tag.

Friday, October 14, 2011

Why is my Rails app calling Solr so often?

I work on the back-end of a Rails app that uses Solr via Sunspot. Looking at the Solr logs, I could see the same item being added/indexed repeatedly, sometimes right before it was deleted from Solr. I didn't write the code, but I was tasked with figuring it out.

Glancing at the main path of the code didn't show anything obvious. I figured the superfluous Solr calls were happening via callbacks somewhere in the graph of objects related to my object in Solr, but which one(s)?  Again, I didn't write the code, I just had to make it perform.

I hit on the idea of monkey-patching (for good, not evil) the Sunspot module.  Fortunately, most/all of the methods on the Sunspot module just forward the call onto the session object.  So, it's really easy to replace the original call with anything you want and still call the real Sunspot code, if that's what you want to do.

This is so easy to do that I even did it the first time in the rails console.  In that case, I was happy to abort the index operation when it first happened.  So, I whipped this up in a text file and pasted it into the console:

module Sunspot
  class <<self
    def index(*objects)
      raise "not gonna do it!"
    end
  end
end

Then, I invoked the destroy operation that was triggering the solr adds, got the stack trace, and could clearly see which dependent object was causing the index operation.

For another case, I needed to run a complex workflow in a script to trigger the offending solr operations. In that case, I wanted something automatically installed when the script started up, and I wanted something that didn't abort - all I wanted was a stack trace. So, I installed the monkey-patch in config/initializers/sunspot.rb and had a more involved index function:

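    def index(*objects)
      puts "Indexing the following objects:"
      # o.id is an assumption; the original interpolation was lost
      objects.each { |o| puts "#{o.class} - #{o.id}" }
      puts "From: =============="
      # raise and immediately rescue, just to get a backtrace to print
      raise rescue puts $!.backtrace
      puts "===========\n"
      # forward to the session, as the Sunspot module methods do
      session.index(*objects)
    end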

That last line is the body of the real version of the index method - like I said, trivial to re-implement; no alias chaining required.
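
If you want to try the technique without a Solr server handy, here is a self-contained sketch of the same idea.  FakeSearch, its Session class, and the Comment model are hypothetical stand-ins (this is not Sunspot's code), and it uses Kernel#caller rather than the raise/rescue trick to get at the stack.

```ruby
# Hypothetical stand-in for a library whose module-level methods simply
# forward to a session object (this is not Sunspot itself).
module FakeSearch
  class Session
    attr_reader :indexed

    def initialize
      @indexed = []
    end

    def index(*objects)
      @indexed.concat(objects)
    end
  end

  class << self
    def session
      @session ||= Session.new
    end

    def index(*objects)
      session.index(*objects)
    end
  end
end

# The monkey-patch: reopen the singleton class and replace index with a
# version that reports who called it, then forwards just like the original.
module FakeSearch
  class << self
    attr_reader :last_backtrace

    def index(*objects)
      @last_backtrace = caller  # the stack at the point of the call
      puts "Indexing: #{objects.map(&:class).join(', ')}"
      puts @last_backtrace
      session.index(*objects)
    end
  end
end

# A hypothetical model whose callback quietly triggers indexing.
class Comment
  def after_destroy_hook
    FakeSearch.index(self)
  end
end

Comment.new.after_destroy_hook
```

The first frame printed names after_destroy_hook, which is exactly the kind of clue I was after in the real app.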

Maybe there's some cooler way to figure this out, but this worked for me.


Thursday, August 18, 2011

Rails/Rspec does not clean up model instances on MySQL

I recently removed a thorn in my side relating to some Rspec tests in our code base when running on my development machine using MySQL.  For some reason, some instances that were created using Factory Girl weren't getting cleaned up, which in turn would cause subsequent test runs to fail because of duplicate data.  So, I'd DELETE the whole tables from the MySQL prompt.  I looked in the test.log file, and I could see the savepoints being issued before the objects were created, but the objects weren't getting removed at the end of the test.

I didn't have a lot of time to look into it, and I didn't know where to look - Rspec, Factory Girl, Rails?  So, in the short-term, I just added after(:each) hooks to destroy the objects.  And, I moved on.

Then, I was dumping schemas in MySQL using SHOW CREATE TABLE in order to analyze some tables and indexes, and I noticed the storage ENGINE flag on the tables.  I went back and looked at the tables in my test database that were giving me trouble, and, of course(?), they were MyISAM rather than InnoDB.  So, transaction rollback (used to clean up after tests) didn't work.

I changed the storage engine on those tables (ALTER TABLE t1 ENGINE = InnoDB), commented out the manual clean-up code, and voila!  It works right, now.  Pretty obvious in retrospect, but I didn't even know where to start looking in our stack.
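
If you'd rather not eyeball every table, a few lines of Ruby can pick the engine out of the SHOW CREATE TABLE text.  This is only a sketch: the helper names are mine, the sample schema strings are made up for illustration, and it assumes you have already collected the output for each table by whatever means you like.

```ruby
# Extract the storage engine from the text printed by SHOW CREATE TABLE.
def storage_engine(create_table_sql)
  create_table_sql[/\bENGINE\s*=\s*(\w+)/i, 1]
end

# Names of tables whose engine is anything other than InnoDB; these are the
# ones where rolling back a test transaction may silently do nothing.
def non_innodb_tables(schemas)
  schemas.reject { |_name, sql| storage_engine(sql).to_s.casecmp("InnoDB").zero? }.keys
end

# Hypothetical sample output, for illustration only.
schemas = {
  "users" => "CREATE TABLE `users` (\n  `id` int(11) NOT NULL\n) ENGINE=InnoDB DEFAULT CHARSET=utf8",
  "posts" => "CREATE TABLE `posts` (\n  `id` int(11) NOT NULL\n) ENGINE=MyISAM DEFAULT CHARSET=utf8"
}

puts non_innodb_tables(schemas).inspect  # => ["posts"]
```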

I hope this helps some other poor souls, too.