Monday, February 22, 2016

HashiQuest

I'm starting a new project at work to build a new infrastructure for hosting our apps.  My objectives/requirements include:

  • Elastic - something we can easily scale up and down.
  • Redundant - something that can tolerate reasonable outages.  With the recent Xen security issues, I've had more than one ride in the reboot-rodeo, and I'm getting tired of that.
  • Invented Here as opposed to Not Invented Here (NIH) - I inherited the existing infrastructure, and there's nothing wrong with it per se, but because I didn't build it, it surprises me from time to time.
  • Immutable - I want to get on the immutable infrastructure bandwagon because it's what the cool kids have been doing.  But, as I get into this, I realize that immutable infrastructure can lead to...
  • Fearless - I want to be able to make changes quickly and easily without uttering "what could possibly go wrong?" before each change.
To achieve these objectives, I plan on using tools from HashiCorp to build out a pretty traditional infrastructure on Amazon Web Services.  I'm a big fan of HashiCorp and their tools.  Most of their tools are open source, which I like for cost and "religious" reasons.  Mitchell Hashimoto was my first guest on the SE Radio podcast when HashiCorp was just launching, and he's great.  Once my infrastructure is up and running, I look forward to using their Atlas tool to manage it all and pay Mitchell for all the great stuff he's done.

As mentioned, my initial plan is to build the first version of the infrastructure using AMIs running on EC2 instances as opposed to building Docker containers or running on Google Compute Engine. I made that decision in part to be more conservative (I hate explaining our current environment to prospective customers - no one ever got fired for picking IBM/Cisco/Amazon.)  However, by using HashiCorp tools, I am hoping that I can keep my options open in the future.

Hence, I have begun what I'm calling HashiQuest.  Stay tuned.

Charles.

Wednesday, January 13, 2016

A Pair of Interviews

I had two interviews released back-to-back on Software Engineering Radio around New Year's:

  • Episode 245: John Sonmez and I discussed his book Soft Skills - in particular the chapters on career management and marketing yourself.  He was a great guest.  I wish we could have gone over the whole book.
  • Episode 246: I interviewed John Wilkes from Google about the Borg cluster management software and Kubernetes.  No one, except John, will ever know what a pain in the butt it was to record that episode - epic Skype fails the first time, but John was exceedingly helpful and understanding.
Check them out,
Charles.

Monday, June 15, 2015

A Pleasant Echo

My podcast interview with James Turnbull was transcribed and published by the nice folks at IEEE.  Check it out in the May/June 2015 issue of IEEE Software.  That's something I wasn't expecting when I did the interview.

Charles.

Wednesday, February 04, 2015

Open Classes and Lazy Loading in Rails Don't Always Mix


Once upon a time, I came across an odd bug in some Rails code that kicked my butt for quite a while.  It perplexed me because it was a classic Heisenbug that seemed to come and go.  (That, and this was a side project that I couldn't devote much focused time to.)

The bug sometimes manifested itself with this error when running tests:
NoMethodError: undefined method `all' for SocialQueue:Class
where SocialQueue is an ActiveRecord model.  If I subsetted the test run down to just the failing tests, the error would go away.  Likewise, if I ran certain other tests first, the error would go away.  In other words, the act of trying to observe the bug changed it.

Another variant of the error showed up when running the server in development mode:
undefined method `arel_table' for QueuedPost:Class
Again, QueuedPost is a model class, and I assume it gets that method somewhere in the voodoo that is ActiveRecord.

The error showed up when I added an "innocuous" tracing statement.  If I replaced the tracing statement with a puts statement, everything worked.  If I put a return statement as the first statement in the tracing method, the error persisted - i.e., nothing in the body of that tracing method was causing harm.

The tracing method lived in a module in a separate file, and just requiring that module would cause the errors.  How could that be?  The module in question wasn't a mixin to be included in a class - it was just a namespace for this tracing method.  What's so bad about requiring a module?

What else is in that file?  Oh yeah, I have some code in there that opens my model classes to add a method to each class.  I put the new methods in that file, away from the rest of the model's definitions, because this tracing facility was experimental, and I didn't want to commit to modifying the model classes just yet.

In the words of Merlin Mann, "turns out" Rails loads class definitions lazily in development and test modes.  When my module that reopened the model classes was loaded, the model classes themselves hadn't necessarily been loaded yet.  If a model had already been loaded, my code reopened it, and everything worked.  Otherwise, instead of reopening an existing model class, it silently defined a brand-new class with the same name.  Later, when my code tried to use that model, the constant was already defined, so Rails didn't load the real model class.  The object I thought was a model was basically a lump of uselessness with the one method I had intended to inject into a model class and none of the model methods.  I have since heard of lazy loading causing similar problems with single table inheritance (STI).
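
Here's a minimal, self-contained sketch of the pattern that bit me (the file layout and method names are hypothetical stand-ins, not the actual code from my project):

```ruby
# tracing.rb - a "harmless" namespace module plus a reopened model class.
module Tracing
  # The tracing helper itself does nothing dangerous; merely requiring
  # this file is what caused the trouble.
  def self.trace(msg)
    puts "[trace] #{msg}"
  end
end

# If Rails hasn't autoloaded the SocialQueue model yet, this does NOT
# reopen the ActiveRecord model.  It defines a brand-new, empty class
# named SocialQueue, and Rails later skips autoloading the real model
# because the constant already exists.
class SocialQueue
  def queue_label
    "queue"
  end
end
```

The resulting class has the one injected method and nothing else - no all, no arel_table - which is exactly the shape of the NoMethodError above.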

The long-term solution for my problem is to move those new, injected methods into the main definitions of the models, now that my experiment is over, and I know I want to keep those methods.

In the interim, I came up with a simple hack that can be used anywhere you want to open a model class from some other file (or maybe I've convinced you not to bother - just edit the model).  Right before you open the model class, just mention it.  For example:
SomeModel   # mentioning the constant forces Rails to load the real model
class SomeModel
  def my_new_method
  end
end
Mentioning the model causes Rails to load it.  Then, when you open it, you're actually opening the real model class.

Perhaps there are better ways to skin this cat, but this works, and in the process, I learned about the existence and dangers of lazy loading in Rails.

enjoy,
Charles.

Thursday, August 07, 2014

Mitchell Hashimoto on the Vagrant Project

I almost forgot to mention it here, but my first podcast episode with SE Radio went live the last week of July.  In it, I interviewed Mitchell Hashimoto about the Vagrant project.  Stay tuned for more...

Charles.

Tuesday, April 15, 2014

Living Dangerously with MySQL Fatal Error 1236

This past weekend, the data center where our MySQL master resides suffered an outage.  At first I thought it was just a connectivity problem, but it was a power outage, and our nodes were all rebooted.  While cleaning up various messes from that, I discovered that our D/R slave in another data center was stuck with the error message:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.014593' at 52888137, the last event read from '/var/log/mysql/mysql-bin.014593' at 4, the last byte read from '/var/log/mysql/mysql-bin.014593' at 4.'

Googling around, I found that this error means that the slave got more data from the master than what the master wrote to its log before the crash.  The result was that the slave was asking for data beyond the end of the log file - the master started a new log when it restarted.

I wondered if it would be possible to just move on to the next log, and looking at more postings, I found that it is possible.

NOTE: this procedure is dangerous and may lead to data loss or corruption.  I would never do it on a truly critical system like a financial system.

I figured it was worth a try.  In the worst case, I would hose the slave and have to rebuild it from a fresh dump, which was the only other alternative anyway.  I also realized that when the slave restarted, there might be some replication issues around the "log transition."

As the blonde said, "do you want to live forever?"

So, I stopped the slave, pointed it at the beginning of the next log file, and started it again:
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.014594', MASTER_LOG_POS = 4;
START SLAVE;

As anticipated, there were issues.  There were a number of UPDATE statements that couldn't be applied because of a missing row.  I steered around them, one at a time with:
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

It was a hassle, and it took the interventions I expected, but it was quicker than shutting down my production applications to take a consistent dump, transferring it, and restoring it.  And, while I was babysitting it, I could write a blog post.

Your mileage may vary,
Charles.

Monday, November 25, 2013

Active Record - joins + includes Methods Causing an Unintended Join

I recently fixed a problem in my Rails 3.2 app where I was using both the joins and includes methods in an Active Record query, and it was triggering a join that I didn't want.  WTF?  Why are you using includes and joins if you don't want a join?

I needed to run a query on table A, applying criteria against a related table B.  Thus, I needed to (inner) join those two with the joins method.  For the rows of A that met the search criteria, I wanted to eagerly load the corresponding rows from tables X, Y, and Z, and of course, I wanted to avoid a 3N+1 query situation.  So, I also used the includes method.

Typically, the includes method generates a separate query by IDs for the related objects.  In my case, I was getting four INNER JOINs - one each for B, X, Y, and Z.  Under "normal" circumstances, maybe that would have been OK, but table Y is in a separate database, and you can't join across databases.  (You can't really do transactions across databases, either.)

My original code used a named association in the joins method - joins(:bs).  On a lark, I recoded it to use a string - joins('INNER JOIN bs ON bs.a_id = as.id') - and it worked:  I got the inner join for B and three individual queries for X, Y, and Z.  Because Y is fetched with a simple query by an array of IDs, the fact that it lives in another database isn't a problem - it just works.
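
In query-shape terms, the change looked roughly like this.  The model and association names below are hypothetical stand-ins (an A model with bs, xs, ys, and zs associations), so treat this as an illustration of the two forms rather than code from the app:

```ruby
# Before: symbol-based join.  Combined with includes, Rails 3.2 folded
# B, X, Y, and Z into one query with four INNER JOINs - and the join
# to Y failed because Y lives in another database.
A.joins(:bs).includes(:xs, :ys, :zs)

# After: string-based join.  B is still an INNER JOIN for the search
# criteria, but X, Y, and Z are each fetched with a separate
# "WHERE id IN (...)" query, so Y's separate database no longer matters.
A.joins('INNER JOIN bs ON bs.a_id = as.id').includes(:xs, :ys, :zs)
```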

Anyway, if you've stumbled across this post while trying to solve the same problem, I hope this helps.

Charles.

Thursday, May 23, 2013

Ctags for Puppet - Three (previously missing) Pieces


Back in the day, when I was coding in C on the Unix kernel (before Linux even existed), I used vi's tags functionality extensively.  We had a patched version of vi (before vim existed) that supported tag stacks and a hacked version of ctags that picked up all kinds of things like #defines, and it used the -D flags you used when compiling to get you to the right definition of something that was defined many times for various architectures, etc.  But, when I moved to C++ with function overloading, ctags broke down for me, and I quit using it.

Recently, I inherited a pretty big Puppet code base.  For a long time, I was just navigating it by hand using lots of find and grep commands.  Finally, I broke down and figured out how to get ctags working for my Puppet code on OS X.  Actually, other people figured it out, but here are the three pieces I had to string together.

A modern version of ctags - aka exuberant ctags.  This is pretty easy to install with homebrew, but there is a rub: OS X already has a version of it installed, and depending on how your PATH is configured, the stock version might trump homebrew's version.  Matt Pollito has a nice, concise blog post explaining how to cope with that.

Tell ctags about Puppet's syntax: Paul Nasrat has a little post describing the definitions needed in the ~/.ctags file and the invocation of ctags.
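
For reference, the ~/.ctags additions were along these lines - this is a sketch from memory of that approach, so the exact regexes in Paul's post may differ:

```
--langdef=puppet
--langmap=puppet:.pp
--regex-puppet=/^class[ \t]+([a-zA-Z0-9_:]+)/\1/c,class/
--regex-puppet=/^define[ \t]+([a-zA-Z0-9_:]+)/\1/d,define/
--regex-puppet=/^node[ \t]+([a-zA-Z0-9_:]+)/\1/n,node/
```

With definitions like those in place, a plain ctags -R at the top of the code base picks up Puppet classes, defines, and nodes.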

Tell vim about Puppet's syntax: Netdata's vimrc file has the last piece:
set iskeyword=-,:,@,48-57,_,192-255
The colon is the key there (no pun intended) - without that, vim wasn't dealing with scoped identifiers and was just hitting the top-level modules.

The last bit is for me to re-learn the muscle memory for navigating with tags that has atrophied after 20 years give or take.  BTW, if you don't have tags, a cool approximation within a single file is '*' in command mode - it searches for the word under the cursor.

enjoy,
Charles.

Tuesday, May 07, 2013

Hadoop Beginner's Guide

Hadoop Beginner's Guide by Garry Turkington
ISBN: 1849517304

Hadoop Beginner's Guide is, as the title suggests, a new introductory book to the Hadoop ecosystem.  It provides an introduction to getting up and running with the core components of Hadoop (MapReduce and HDFS), some higher-level tools like Hive, and integration tools like Sqoop and Flume, and it also provides some good starting information on operational issues with Hadoop.  This is not an exhaustive reference like Hadoop: The Definitive Guide, and for a beginner, that's probably a good thing.  (In my day, we only had The Definitive Guide, and we liked it!)

Most of the topics are covered in a "dive right in" format.  After a brief introduction to the topic, the author provides a list of commands or a block of code and invites you to run it.  This is followed by a "What just happened?" section that explains the details of the operation or code.  Personally, I don't care for that format because the explanation is sometimes separated from the code by multiple pages, which was a real hassle reading this as a PDF.  But maybe that's just me.

As I mentioned, the book includes a couple of chapters on operations, which I found to be a nice addition to a beginner's book.  Some of these operational details were explained by hands-on experiments like shutting down processes or nodes, in which case "What just happened?" is more like "What just broke?"  The operational scenarios are by no means exhaustive (that's what you learn from production), but they provide the reader with some "real life" experience gained in a low-risk environment.  And, they introduce a powerful method to learn more operational details: set up an experiment and find out what happens.  Learning to learn is the most valuable thing you can gain from any book, class, or seminar.

Another nice feature of this book that I haven't seen in others is that the author includes examples using Amazon EC2 and Elastic MapReduce (EMR).  There are examples of both MapReduce and Hive jobs on EMR.  He doesn't duplicate everything on both "raw" Hadoop and EMR, because once you know the basics of each, the same principles apply to both.

I do have some complaints about the book, but many of them are nit-picking or personal style.  That said, I think the biggest thing this book would benefit from would be some very detailed "technical editing."  By that I mean there are technical details that got corrupted during the book production process.  For example, the hadoop command is often rendered as Hadoop in examples.  There are plenty of similar formatting and typographic errors. Of course, an experienced Hadoop user wouldn't be tripped up by these, but this is a "beginner's guide," and such details can cause tremendous pain and suffering for newbies.

To wrap things up, Hadoop Beginner's Guide is a pretty good introduction to the Hadoop ecosystem.  I'd recommend it to anyone just starting out with Hadoop before moving on to something more reference-oriented like The Definitive Guide.

enjoy,
Charles.




FTC disclaimer: I received a free review copy of this book from DZone.  The links to Amazon above contain my Amazon Associates tag.