Friday, October 14, 2011

Why is my Rails app calling Solr so often?

I work on the back-end of a Rails app that uses Solr via Sunspot. Looking at the Solr logs, I could see the same item being added/indexed repeatedly, sometimes right before it was deleted from Solr. I didn't write the code, but I was tasked with figuring it out.

Glancing at the main path of the code didn't show anything obvious. I figured the superfluous Solr calls were happening via callbacks somewhere in the graph of objects related to my object in Solr, but which one(s)? Again, I didn't write the code; I just had to make it perform.

I hit on the idea of monkey-patching (for good, not evil) the Sunspot module.  Fortunately, most/all of the methods on the Sunspot module just forward the call on to the session object.  So, it's really easy to replace the original method with anything you want and still call the real Sunspot code, if that's what you want to do.

This is so easy to do that I even did it the first time in the Rails console.  In that case, I was happy to abort the index operation when it first happened.  So, I whipped this up in a text file and pasted it into the console:

module Sunspot
  class <<self
    def index(*objects)
      raise "not gonna do it!"
    end
  end
end


Then, I invoked the destroy operation that was triggering the Solr adds, got the stack trace, and could clearly see which dependent object was causing the index operation.
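If you want to try the trick without a Rails app handy, here is a minimal sketch that runs in plain Ruby. Everything in it is a hypothetical stand-in, not real Sunspot or ActiveRecord code: FakeSunspot imitates a module whose methods forward to a session, and Comment#destroy imitates a model with a callback that reindexes a related Post.

```ruby
# Hypothetical stand-in for a module whose singleton methods forward
# to a session object. This is NOT the real Sunspot library.
module FakeSunspot
  class Session
    def index(*objects)
      objects.size
    end
  end

  class << self
    def session
      @session ||= Session.new
    end

    def index(*objects)
      session.index(*objects)
    end
  end
end

# Hypothetical models: destroying a Comment reindexes its parent Post.
class Post; end

class Comment
  def initialize(post)
    @post = post
  end

  def destroy
    reindex_parent # stands in for an after_destroy callback
  end

  def reindex_parent
    FakeSunspot.index(@post)
  end
end

# The patch: reopen the module and make index raise instead of indexing.
module FakeSunspot
  class << self
    def index(*objects)
      raise "not gonna do it!"
    end
  end
end

# Trigger the destroy and capture the backtrace of the resulting error.
def trace_for_destroy
  Comment.new(Post.new).destroy
  nil
rescue RuntimeError => e
  e.backtrace
end

trace = trace_for_destroy
puts trace.first(3)
```

The frame right below index in the printed trace is the method that asked for the reindex (reindex_parent here), which is exactly the information I was after.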

For another case, I needed to run a complex workflow in a script to trigger the offending Solr operations. In that case, I wanted something automatically installed when the script started up, and I wanted something that didn't abort - all I wanted was a stack trace. So, I installed the monkey-patch in config/initializers/sunspot.rb and had a more involved index method:

    # Goes inside "module Sunspot; class <<self", as in the first snippet.
    def index(*objects)
      puts "Indexing the following objects:"
      objects.each { |o| puts "#{o.class} - #{o.id}" }
      puts "From: =============="
      puts caller  # print the current call stack without raising
      puts "===========\n"
      session.index(*objects)
    end


That last line is the body of the real version of the index method - like I said, trivial to re-implement; no alias chaining required.
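If the method you want to trace does more than forward to another object, copying its body is less appealing. One alternative, assuming Ruby 2.1 or later, is to prepend a module onto the singleton class and call super. This is a different technique from the one I used above, and the sketch below runs against a hypothetical FakeSearch module rather than Sunspot itself; IndexTracer is also a made-up name.

```ruby
# Hypothetical module with a non-trivial singleton method we don't
# want to copy. Stands in for any library module; not Sunspot.
module FakeSearch
  class << self
    def index(*objects)
      objects.map(&:to_s) # pretend this is real indexing work
    end
  end
end

# Tracing wrapper: prints where the call came from, then defers to
# the original implementation via super.
module IndexTracer
  def index(*objects)
    puts "index called from:"
    puts caller.first(5)
    super
  end
end

# Prepending onto the singleton class puts IndexTracer#index ahead of
# the original FakeSearch.index in method lookup.
FakeSearch.singleton_class.prepend(IndexTracer)

result = FakeSearch.index(:a, :b)
```

The original method is left untouched, so its return value comes back unchanged and removing the trace is just a matter of deleting the prepend line.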

Maybe there's some cooler way to figure this out, but this worked for me.

enjoy,
Charles.