Showing posts with label books. Show all posts

Tuesday, May 07, 2013

Hadoop Beginner's Guide

Hadoop Beginner's Guide by Garry Turkington
ISBN: 1849517304

Hadoop Beginner's Guide is, as the title suggests, a new introductory book on the Hadoop ecosystem.  It provides an introduction to how to get up and running with the core components of Hadoop (MapReduce and HDFS), some higher-level tools like Hive, integration tools like Sqoop and Flume, and it also provides some good starting information relating to operational issues with Hadoop. This is not an exhaustive reference like Hadoop: The Definitive Guide, and for a beginner, that's probably a good thing.  (In my day, we only had The Definitive Guide, and we liked it!)

Most of the topics are covered in a "dive right in" format.  After a brief introduction to the topic, the author provides a list of commands or a block of code and invites you to run it.  This is followed by a "What just happened?" section that explains the details of the operation or code.  Personally, I don't care for that too much because the explanation is sometimes separated from the code by multiple pages, which was a real hassle when reading this as a PDF.  But, maybe that's just me.

As I mentioned, the book includes a couple of chapters on operations, which I found to be a nice addition to a beginner's book.  Some of these operational details were explained by hands-on experiments like shutting down processes or nodes, in which case "What just happened?" is more like "What just broke?"  The operational scenarios are by no means exhaustive (that's what you learn from production), but they provide the reader with some "real life" experience gained in a low-risk environment.  And, they introduce a powerful method to learn more operational details: set up an experiment and find out what happens.  Learning to learn is the most valuable thing you can gain from any book, class, or seminar.

Another nice feature of this book that I haven't seen in others is that the author includes examples of Amazon EC2 and Elastic MapReduce (EMR).  There are examples of both MapReduce and Hive jobs on EMR.  He doesn't run every example on both "raw" Hadoop and EMR because, once you know the basics of EMR, the same principles apply to both.

I do have some complaints about the book, but many of them are nit-picking or personal style.  That said, I think the biggest thing this book would benefit from would be some very detailed "technical editing."  By that I mean there are technical details that got corrupted during the book production process.  For example, the hadoop command is often rendered as Hadoop in examples.  There are plenty of similar formatting and typographic errors. Of course, an experienced Hadoop user wouldn't be tripped up by these, but this is a "beginner's guide," and such details can cause tremendous pain and suffering for newbies.

To wrap things up, Hadoop Beginner's Guide is a pretty good introduction to the Hadoop ecosystem.  I'd recommend it to anyone just starting out with Hadoop before moving on to something more reference-oriented like The Definitive Guide.

enjoy,
Charles.




FTC disclaimer: I received a free review copy of this book from DZone.  The links to Amazon above contain my Amazon Associates tag.

Tuesday, July 14, 2009

Book: The Principles of Successful Freelancing

The Principles of Successful Freelancing by Miles Burke
ISBN 978-0-98004552-4-6

The Principles of Successful Freelancing is a comprehensive introduction (if that's not a contradiction in terms) to striking out on your own as a freelancer. This book is perfect for someone who is considering moving to freelancing or possibly for someone just starting out.

Mr. Burke covers all of the basic areas of starting and running a freelance business. He discusses how to set up your business and your office, how to sell your services, how to manage your money, and how to give good customer service, which is ultimately the most important aspect of a personal freelancing business. He also addresses how to balance work and life beyond work, which is hard in general and specifically hard in a one-man shop. He concludes with something I haven't seen in a "start your own business" book - where to go next. Do you want to remain as a one-man shop, do you want to grow into a "real" business, or do you just want to "retreat" to the old 9-to-5 job? I don't recall another book like this considering the option of going back to the grind.

Each chapter concludes with two "case studies" - Emily and Jacob. These two characters represent two very different people who might want to go into freelancing. The studies at the end of each chapter explain how these personality types might react to the issues and challenges discussed in the chapter. This device helps the reader envision how he or she might deal with the issues discussed.

Early on, I got the mistaken impression that this book was a bit fluffy. The typography has a fair amount of white space, and it looks kinda arty rather than serious and dense. (OK, I grew up with punched cards and line printers. When's Matlock on?) But, by the time I finished the book, as I looked back across it, I really couldn't think of anything that wasn't covered. Sure, there are whole MBAs built around marketing, and this book only has one chapter on it, but Mr. Burke provides a perfectly reasonable introduction to the subject. I think I got this "fluffy" mis-impression because immediately prior to reading Successful Freelancing I read Eric Sink's The Business of Software, which is very detailed about a few aspects specifically related to running a small software business. Successful Freelancing covers a wider range of topics, and it is not aimed specifically at software freelancers. If anything, it's aimed more at web designers who probably like nicer typography.

To conclude, The Principles of Successful Freelancing is a great first introduction to the idea of freelancing. It covers all the bases to help someone evaluate whether or not to go into business for himself.

enjoy,
Charles.

Tuesday, March 24, 2009

Book: Writing for Scholars

Writing for Scholars: A Practical Guide to Making Sense and Being Heard by Lynn P. Nygaard
ISBN: 9788215013916

Writing for Scholars is a great guide for (aspiring) academic writers. The simplest thing I could say is that it ought to be required reading for anyone in graduate school who will be doing academic writing - e.g., journal articles or a dissertation. In my experience, academic writing was something that a graduate student was expected to either know already or absorb quickly with little or no coaching.

I've read a couple of books on the subject of academic writing, especially in the area of the sciences. Those books focused on a lot of the minutiae of presenting and formatting one's work in a journal or similar medium. Ms. Nygaard takes a much larger view of the writing process, and she de-emphasizes (without completely dismissing) the technical minutiae, putting them in the later chapters. She begins by talking about how to develop good writing habits, which is applicable to non-academic writers, too. She also explains the academic dialogue and how an academic paper has to fit into and extend that dialogue.

She continues by explaining how to identify your audience, which is also applicable to non-academic writing. Then she gets down to what I would term the core of the writing process: forming your argument and expressing it in standard academic form (abstract, introduction, method, results, discussion). She also explains how and when to use figures and tables.

A couple of topics that I don't recall reading in other books are: feedback (giving and receiving) and presenting a paper at a conference. Again, both of these are subjects which were never taught in my graduate schooling. These are both crucial topics that complete the academic dialogue.

Throughout the book, Ms. Nygaard includes numerous (sometimes humorous) examples drawn from a wide range of academic disciplines. Perhaps I'm just reading it through my own science-tinted glasses, but I'd say the book does lean more towards the "hard" sciences rather than social sciences or liberal arts. However, I would assume that non-science writers would find this book just as useful as the geeks in the world.

If I had to make a minor criticism of this book, I'd say that Ms. Nygaard should include some references to other sources relating to the various topics she addresses. This is a short book (less than 200 pages), and that's a good thing. But, as a short book, it cannot possibly be the end-all and be-all encyclopedia for academic writing. For example, her chapter on figures and tables is a great introduction, but references to authors like Tufte would serve the (novice) reader well.

In conclusion, Writing for Scholars is a great guide to academic writing. It is a must-read for anyone beginning a career that will involve such writing, and even seasoned writers can learn a few things by filling in some gaps that were left over from learning by osmosis.


enjoy,
Charles.

Wednesday, March 04, 2009

Book: Pro Django

Pro Django by Marty Alchin
ISBN: 978-1-4302-1047-4

Pro Django is an excellent book on Django, but it's not for beginners. The term "Pro" gets thrown around a lot, and it gets applied to things that might better be described with "Dummies." This is the Real McCoy - it's serious advanced stuff.

The chapters are centered around nice little chunks of the Django system: Models, Views, Forms, Templates, etc. Each chapter is a nice, self-contained bit of Django knowledge, except for Chapter 2, which is a great survey of advanced Python topics like metaclasses. Most chapters also include an Applied Techniques section which gives some examples of how to apply the material in the chapter.

While reading this book, what struck me was how the chapters seem to pack in a level of detail that you'd typically find only in a comprehensive reference, yet this book is not a bunch of dry reference material, or worse yet, copies of online manuals. The reader gets serious detailed information, but it almost reads like a fluffy tutorial. It's pretty remarkable.

Something that's unique about this book at this time (Q1 2009) is that it covers the 1.0 version of Django. A bunch of the first books on Django were written against 0.96 or earlier. You'd think there wouldn't be much difference (0.04 versions if you only look at the numbers), but the jump to 1.0 was significant for Django. It's nice to have a book that reflects the 1.0 world.

enjoy,
Charles.

Wednesday, January 21, 2009

Book: Java Power Tools

Java Power Tools by John Ferguson Smart
ISBN: 978-0-596-52793-8

Java Power Tools provides a fairly detailed introduction to a number of tools for Java programmers. It fits nicely between the O'Reilly Hacks series and having a dozen books like Ant: The Definitive Guide, 2nd Edition. Like the Hacks books, Java Power Tools provides an introduction to a bunch of tools. The Hacks books are great for answering the question "I've heard of that tool, but where does it fit?" But whereas the Hacks books provide just an appetizer, this book provides a main course, enough to get seriously started with the tool being discussed. And then, if you want all the gory details, a Definitive Guide could provide the full five-course meal.

The selection of tools presented was really good, at least for me. For example, I know about continuous integration servers, but I haven't set one up. At one client site, they were using Hudson, which I had some exposure to, but I didn't know much about the others like CruiseControl, Continuum, and LuntBuild. Similarly, I've been using JUnit 3.x for years, but I didn't really know what was different in JUnit 4 or how that compares to TestNG. This book provided me with a great overview of these and other tools. Java Power Tools provides a great way to get up to speed with a general area of tooling (e.g., continuous integration servers) or a good cross-section of the majority of the Java tools in use today.

If I had to pick something to complain about, it would be Part II - Version Control Tools. These aren't really Java tools, although every programmer (Java or otherwise) should be using them. Or, given the decision to include version control tools, I'd suggest excluding CVS because it's old and including at least one distributed version control tool like Mercurial (used by the OpenJDK project and NetBeans) or git (used by the Linux kernel).

So, in conclusion, unless you have no free will about tool selection or you already know all of these tools backwards and forwards, I highly recommend this book to almost any Java programmer.

enjoy,
Charles.

Wednesday, November 26, 2008

Book: The Productive Programmer

The Productive Programmer by Neal Ford
ISBN: 978-0-596-51978-0

I've always had a more-than-passing interest in productivity porn, as Merlin Mann calls it. I'm not obsessive about seeking out productivity books (I can quit any time), but I've read more than a few such books in my day, and I do keep my tasks organized with OmniFocus. When I saw The Productive Programmer at the Powell's table at OSCON 2008, I confess I got pretty tingly, kinda like "when we used to climb the rope in gym class." Here was a book talking about productivity aimed specifically at what I do every day - programming, not management, not sales, not coaching a football team, but programming. In the words of Cartman, "sweet."

That said, I wouldn't classify this book as hard-core productivity porn. It doesn't lay out a dogmatic formula, nor does it suggest or require specific tools or techniques. And that's probably a good thing; in my experience, programmers can be some of the most opinionated people I've known, especially when it comes to their craft (e.g., editor wars). If the author had attempted to prescribe a specific set of practices, I think almost every programmer would have found something to hate about the book, and what's the point of that - we could just as easily go back to ragging on emacs, Windows, or Steve Jobs.

Instead, Mr. Ford offers a number of suggestions that one can take or leave. These are organized into two parts: mechanics (the productivity principles) and practice (philosophy), or what I would call tactical (little picture) and strategic (big picture) techniques. Mechanics includes controlling interruptions from sources like email and using tools like Quicksilver (which I finally started using after reading this book). Philosophy includes practices like test-driven development/design (TDD) and the use of static analysis tools.

The various suggestions were all very well and good, but what I liked most about this book is that it made me think about mom-and-apple-pie topics like TDD (of course, we all write unit tests, right?) from a totally different angle - productivity. Of course, that's what the agile folks have been saying all along, but somehow this book shed a whole new light on it and helped drive it even deeper. And the whole book got me thinking about the bigger question - "how can I be more productive and effective in my programming?"

Like I said, this isn't hard-core productivity porn, but it's a very useful and approachable guide to productivity by a programmer for programmers. Maybe that makes it productivity literary erotica?

enjoy,
Charles.

Friday, November 07, 2008

Book: Working Effectively with Legacy Code

Working Effectively with Legacy Code by Michael Feathers
ISBN: 0-13-117705-2

"...legacy code is simply code without tests... Code without tests is bad code."
From those statements, it doesn't take much to figure out what this book is about - how to write unit tests for code without tests. Of course, if you've ever tried to do that, you know that it's easier said than done. Many/most programmers who inherit a big ball of mud of code without tests (legacy code) just punt; the existing code has no tests, I can't see how to get any of it under test, so I'll just hack and pray - just like the original author(s) did.

That's where this book comes in. It's primarily a large collection of recipes about how to write unit tests for legacy code. That said, the focus is not really on how to write the tests but rather how to get chunks of the legacy code into a test harness so that you can write unit tests to characterize the existing functionality before adding or modifying functionality. It also contains techniques to add new functionality in such a way that you can test it immediately and possibly execute the "clean up" that the original author(s) promised would happen as soon as that next deadline was reached - all those years and deadlines ago.
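
To make "characterize the existing functionality" concrete, here is a minimal sketch in Python (not one of the book's own examples, which are in Java, C++, and C). The function legacy_price and its odd discount rule are hypothetical stand-ins for inherited code; the point is that the tests pin down what the code does today, not what anyone thinks it should do.

```python
import unittest


def legacy_price(quantity, unit_price):
    """Hypothetical inherited function with undocumented behavior."""
    total = quantity * unit_price
    if quantity > 10:
        # An integer-division discount nobody remembers adding.
        total = total - total // 10
    return total


class CharacterizeLegacyPrice(unittest.TestCase):
    """Characterization tests: the expected values were obtained by
    running the existing code and recording what it returned."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 50), 100)

    def test_bulk_order_discount_rounds_down(self):
        # 20 * 7 = 140, minus 140 // 10 = 14
        self.assertEqual(legacy_price(20, 7), 126)

    def test_boundary_of_ten_is_not_discounted(self):
        self.assertEqual(legacy_price(10, 9), 90)
```

Saved to a file, this can be run with python -m unittest. Once tests like these pass, you have a safety net: any refactoring that changes one of the recorded values is flagged immediately.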

The bulk of the book (Part 2) is organized as a series of complaints or excuses and how to deal with them. These include such topics as "My application has no structure," "Dependencies on libraries are killing me," and "This class is too big, and I don't want it to get any bigger." In each chapter, the author provides examples (in multiple languages - Java, C++, C, etc.) of these problems and specific techniques that can be used to address them. The last chapter (Part 3) is an encyclopedia of the techniques for easy reference.

If you're lucky enough to do only green-field development, you might think this book would be useless. However, one interpretation of this book is that it is a list of sins to avoid while you're playing in the green field. And, many of the techniques can be interpreted as best practices for how to write your code to ensure it's testable. (Of course, you're following test driven development and achieving near 100% coverage, so that would never be a problem with your new code, would it? :-)

My one disappointment with this book was that I was hoping it would provide ideas about how to create higher-level (e.g., functional) tests. Of course, high-level tests are no substitute for unit tests. It's just that I was tasked with creating "some tests" quickly for an entire application, and unit tests are not practical in this particular case, which is my problem, not the author's.

This book is an excellent resource and cookbook for how to add unit tests to an existing code base that lacks tests, and it also provides design and implementation templates to ensure that new code is testable as it's created.


enjoy,
Charles.

Tuesday, August 12, 2008

Book: Essential SQLAlchemy

Essential SQLAlchemy by Rick Copeland
O'Reilly Media
ISBN 10: 0-596-51614-2 / ISBN 13: 9780596516147

This is a great book describing how to use SQLAlchemy to connect Python programs to databases. In fact, at the moment (mid-summer 2008), it is the book, since there are no other books on the subject, yet. Although I am not (yet) a SQLAlchemy user, this book seems to cover all of the core topics in SQLAlchemy. The text includes many straightforward examples of how to use various facilities in SQLAlchemy and how to map various database programming problems into Python code via SQLAlchemy. Copeland also provides a whirlwind tour of some extensions to SQLAlchemy.

I heard about the SQLAlchemy project on the This Week in Django podcast. Django doesn’t use SQLAlchemy, but it does use a similar object-relational mapper (ORM). As I mentioned, I haven’t used SQLAlchemy so I came into this book with a somewhat blank slate. I have, however, been programming in Python since before 1.0, and I’ve worked with database APIs and ORMs since the early 90s in C++, Java, and Python. So, I was familiar with the basic landscape of database programming, even if I hadn’t used SQLAlchemy. And, I’m currently working on a large Python project that is coded using the Python database API directly, which is very tedious. So, the whole time I was reading this book, I was looking at how to fit SQLAlchemy into this existing code base.

To be honest, the first chapter (the proverbial introduction) almost turned me off. The author starts out slowly enough, but then he starts touching on a huge number of details, which were glazing my eyes over. However, the second chapter (getting started) started back at ground zero and stepped through everything in a nice clear fashion, and the rest of the book continued in that vein. He covers all the topics you would expect in a database programming book: queries, updates, joins, the built-in types, and how to hook in to provide support for your own types.

Something I didn’t realize about SQLAlchemy coming into this is that SQLAlchemy is both an ORM (what I expected) as well as a high-level, database-independent API. Which is to say, you can just access the database as tables, columns and rows rather than as classes, attributes, and object instances. Although I’d personally prefer to use the ORM, I can imagine cases where it might not be the right tool for the job, and it’s good to have a choice.

I was also surprised to see the ORM supports two styles of object-relational access: the data mapper pattern (which I had seen in Django and Hibernate) and the active record pattern (used in Ruby). The author does a good job of explaining both of these and how to use them. He even devotes a whole chapter to Elixir, which is an extension that implements the active record pattern.
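
For readers who haven't met the two patterns, here is a rough sketch of the difference. It deliberately uses only Python's standard sqlite3 module, not the SQLAlchemy or Elixir APIs, and the users table and class names are made up for illustration. In the active record style the object saves itself; in the data mapper style the object is plain and a separate mapper moves it in and out of the database.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


# Active record: the domain object carries its own persistence logic.
class UserRecord:
    def __init__(self, name, id=None):
        self.id = id
        self.name = name

    def save(self, conn):
        cur = conn.execute("INSERT INTO users (name) VALUES (?)", (self.name,))
        self.id = cur.lastrowid


# Data mapper: the domain object knows nothing about the database...
@dataclass
class User:
    name: str
    id: Optional[int] = None


# ...and a separate mapper translates between objects and rows.
class UserMapper:
    def __init__(self, conn):
        self.conn = conn

    def insert(self, user):
        cur = self.conn.execute(
            "INSERT INTO users (name) VALUES (?)", (user.name,)
        )
        user.id = cur.lastrowid

    def find(self, user_id):
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return User(name=row[1], id=row[0]) if row else None


alice = UserRecord("alice")
alice.save(conn)            # active record: ask the object to save itself

mapper = UserMapper(conn)
bob = User("bob")
mapper.insert(bob)          # data mapper: ask the mapper to save the object
```

The trade-off is the usual one: active record is less code for simple cases, while a data mapper keeps the domain classes testable without a database.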

One thing that many people might consider odd is the fact that although SQLAlchemy is an ORM, the author waits until chapter eight to discuss how to map object inheritance hierarchies onto relational databases. Most books I’ve read on ORMs discuss this topic early, but I applaud Copeland’s decision to hold off on discussing it. When books bring this up early (e.g., in chapter three), the discussion often gets bogged down in details, which glaze the reader’s eyes. I’ve dealt with inheritance mapping enough, in ORMs I’ve used and in those I’ve written, that I’m not that interested in the topic (assuming the tool provides the typical, reasonable solutions), and I was grateful that he held off on it.

One issue I had with the overall structure of the book is that I’m hard pressed to pigeon-hole the book. Books about a single technology such as SQLAlchemy usually occupy one end or the other of a spectrum. Either they’re hard-core references, often times copying-and-pasting API documentation from a web site (I really hate that), or they’re largish tutorials that may or may not contain enough technical meat. This is a short (roughly 200 page) book that contains plenty of technical meat, but it also includes some simple tutorial motivations for using various capabilities of the tool. Although this mix felt odd to me, I’m sure it will be perfect when I go to apply SQLAlchemy to my existing database projects since I don’t really want a hand-holding tutorial, but a pure reference wouldn’t quite work for me either when I’m just starting out.

In conclusion, Essential SQLAlchemy provides a thorough presentation of the SQLAlchemy tool for interfacing Python code to SQL databases. The author covers a number of different methods in which SQLAlchemy can be used to access databases from Python, and he provides plenty of details of the various APIs available to the programmer.

Enjoy,
Charles.


Sunday, July 06, 2008

Book: Effective Java

Effective Java, 2nd Edition by Joshua Bloch. ISBN 0321356683

This is the best book that I've read that I didn't know I needed to read. If you are a pretty good Java programmer, and you want to be better, this is a book you should definitely read.

Back in the day (BITD), we had Scott Meyers's book Effective C++, and we needed it. Even without templates and all the "new" things in C++, C++ was a large and complex language. I loved Effective C++ not only because it had lots of tips to keep you out of trouble when coding C++, but also because the organization was brilliant: each tip was about the right length to be read while sitting on the can.

I ignored the first edition of Effective Java when it came out because I smugly assumed that Java was so superior to C++ (Java is C++ with the pointy bits filed down) that it wasn't really necessary. And that may or may not have been true in the 1.0 or 1.1 days of Java, but with the release of Java 5, the language has certainly grown and is now sufficiently complex that a book like this is a necessity, especially if you've recently moved up to Java 5 or Java 6. Lucky for me, this book landed on my desk against my will.

The book uses that same sitting-on-the-can format from Effective C++; the information is broken down into nice, small bits of information that you can read one or two at a time. These are grouped into 10 sections. There are specific sections for Java 5 features like Generics and Enumerations, which is great for people like me who were stuck on 1.4 for far too long.

What is really amazing about this book is that I learned a lot about topics that I thought I already knew about. For example, I've used serialization in a number of formats (built-in and do-it-yourself) for years, but I realized there's a lot more that I had no idea about. I will certainly think twice before doing any serialization in the future.

Unless you're the sort that knows the Java Language Specification by heart, you need to read this book.

Enjoy,
Charles.

Wednesday, August 22, 2007

Everything I Know About Business I Learned from My Mama

Everything I Know About Business I Learned from My Mama by Tim Knox is nominally a business book. However, it is certainly not your typical business book. The author doesn't tell you what corporate structure (e.g., LLC vs. S-Corp) is best, how to keep the books, or how to manage a Fortune 500 company. Rather, this is a bigger picture view of going into business for yourself. Tim Knox is all about entrepreneurship.

Although this book isn't really a how-to manual with lots of nuts-and-bolts details, it would be a really good book for someone who is thinking of going into business for himself. It even begins with a bunch of reasons why someone shouldn't go into business. If after reading the book, someone still wanted to go into business, he should probably read a few more books before jumping in. (The E-Myth Revisited: Why Most Small Businesses Don't Work and What to Do About It is next on my list to read.)

The book is pretty short and humorous, and the chapters are nice little bite-sized chunks. Here's an attempt at his sense of humor: this would be a great book to keep next to the toilet - each chapter is about one visit long. (I have a special place in my heart for books like that.) I suspect the book was easy for Tim to write since he's been writing newspaper articles for some time. I can easily imagine that many of the chapters are extensions of articles he wrote for the paper, bless his heart.

One minor nit I could pick with the title is that there isn't all that much mention of stuff his mama told him. Rather, there's a fair amount of common sense that one's mother might impart.

I bought this book because I've been listening to Tim on Dan Miller's radio show - see www.48days.com. Dan is a career coach and wrote the book 48 Days to the Work You Love. The two of them spend a lot of time telling people to quit jobs they hate and move towards work they love. The radio show recently ended, and they've switched to an Internet format. The radio shows and the new Internet shows are available as podcasts.

enjoy,
Charles.

Thursday, March 01, 2007

Book: Windows Forensics and Incident Recovery

Windows Forensics and Incident Recovery
by Harlan Carvey - ISBN 0-321-20098-5

This is a great book because I learned more than I thought I would from it. Coming from a command-line Unix background, I tend to view Windows as excessively GUI-centric (maybe that's why it's called Windows?) and full of opaque Microsoft voodoo. This book showed me that there are plenty of things to be learned from the Windows command line, and there are lots of transparent, open-source tools to expose the inner workings of Windows.

There are really three types of information in this book: how Windows works, tools to collect information about Windows, and the bigger task of forensic information extraction and processing. There is a lot of information about basic operating systems concepts (files, processes, etc.) and how they are implemented in Windows. I especially liked the presentation of user privileges - we typically only hear about those in the context of administrator versus non-administrator, but there is a listing of each of the individual privileges and what they mean. The tools that the author presents are primarily command-line tools, and many of them are written in Perl - very approachable for an old Unix hack. (A second edition of this book would benefit from a treatment of Microsoft's WMIC tool.) With the basic groundwork laid, the author presents a bigger picture of how to use all of the tools in a forensic investigation. He presents a series of dreams, which are a bit corny, but they serve as a sequence of case studies. He also provides a "forensic server" for storing all the little bits of information that get collected - a bit like "real" tools like EnCase.

Like the book File System Forensic Analysis, one of my favorite aspects of this book is that it provides a lot of practical information about applied operating systems - Windows. The author provides links to a lot of tools and web pages, so this book serves as an excellent starting point to learn a lot more about Windows and forensic data recovery. The text includes complete source code for the Perl tools, so a code-oriented reader can really see what the information is and where it comes from.

If I had to criticize something in this book, I'd say that Chapter 9 on scanners and sniffers drifts a bit from the central theme of the book, but then I've found that to be pretty common in security books because so many of the topics are interrelated; you start pulling one thread on the sweater, and the next thing you know, you've unraveled the whole thing.
All in all, this is a great starting point for learning about forensic data acquisition on the Windows platform.

Enjoy,
Charles.


Monday, June 26, 2006

Book: File System Forensic Analysis

File System Forensic Analysis by Brian Carrier. ISBN: 0321268172.


To my way of thinking, this is a really good book about file systems, that just happens to use forensics as a unifying theme and framework under which to study the file systems. The book provides the most detailed coverage of file systems I've seen, short of reading the source code. I used it as a textbook in an advanced operating systems class, but it is not really a textbook, per se.

The author begins with an introduction to the concepts behind digital forensic investigations. He continues with a ground-up introduction to disk drive technology and how disks are used in computer systems. The introductory material concludes with a generic framework for discussing the components and characteristics of file systems.

With all the groundwork laid, the meat of the book consists of detailed discussions of FAT, NTFS, Ext, and UFS file systems. Each file system is presented at a high level first, followed by a detailed description of the structures on disk. The high level information is presented with pictures and via output from the author's file system toolkit (The Sleuth Kit). The details are presented with tables of structure members without resorting to C code, which makes it easier to see the forest rather than the trees, especially for non-programmers.
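
To give a flavor of what a "table of structure members" turns into in practice, here is a small Python sketch that decodes the first fields of a FAT12/16 boot sector. The offsets come from my own understanding of the standard FAT layout rather than from the book's tables, the field names are my own, and the sample sector is synthetic, so treat it as an illustration and not a forensic tool.

```python
import struct

# First 24 bytes of a FAT12/16 boot sector, all little-endian:
#   offset  0: 3 bytes  jump instruction
#   offset  3: 8 bytes  OEM name
#   offset 11: uint16   bytes per sector
#   offset 13: uint8    sectors per cluster
#   offset 14: uint16   reserved sectors (before the first FAT)
#   offset 16: uint8    number of FATs
#   offset 17: uint16   maximum root directory entries
#   offset 19: uint16   total sectors (16-bit field)
#   offset 21: uint8    media descriptor
#   offset 22: uint16   sectors per FAT
BPB_FORMAT = "<3s8sHBHBHHBH"
FIELDS = (
    "jump", "oem_name", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "num_fats", "root_entries", "total_sectors_16",
    "media", "sectors_per_fat",
)


def parse_boot_sector(sector):
    """Return the leading boot sector fields as a dict."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot sector signature")
    return dict(zip(FIELDS, struct.unpack_from(BPB_FORMAT, sector, 0)))


# Build a synthetic sector with made-up values so the sketch runs anywhere.
sample = bytearray(512)
struct.pack_into(
    BPB_FORMAT, sample, 0,
    b"\xeb\x3c\x90", b"MSDOS5.0", 512, 4, 1, 2, 512, 0, 0xF8, 250,
)
sample[510:512] = b"\x55\xaa"

info = parse_boot_sector(bytes(sample))

# On FAT12/16 the root directory follows the reserved area and the FATs.
root_dir_start = info["reserved_sectors"] + info["num_fats"] * info["sectors_per_fat"]
print(info["oem_name"], info["bytes_per_sector"], root_dir_start)
```

To try it on a real image, replace the synthetic sample with the first 512 bytes read from the image file.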

I found the information about the Microsoft file systems (FAT and NTFS) especially useful, since there isn't much real documentation on those file systems available, and a lot of what is available seems like rumors spread at recess in a schoolyard.

In conclusion, this is a really good, perhaps even the best, book on file systems, even if you're not into forensics. If you're looking for serious details about file systems or forensic analysis of file systems, this is your book.

Enjoy,
Charles.