LEGOs, Play-Doh, and Programming

by coderek

by Jamis on November 09, 2008 @ 04:46 AM, link: http://weblog.jamisbuck.org/2008/11/9/legos-play-doh-and-programming

This article is based on a talk I gave at the 2008 RubyConf in Orlando, Florida, entitled “Recovering from Enterprise: how to embrace Ruby’s idioms and say goodbye to bad habits”.

The other day I went to Target with my son. Like most kids, I think, he’s convinced that Target is a toy store, which just happens to sell towels and shoes and cleaning supplies, too, so in his eyes it’d be criminal to not walk through the bare handful of toy aisles.

Besides, the toy section is across from the electronics section, which all geeks know is where the real toys are.

So, we went to the toy section and started browsing. I’ve always loved LEGO sets, and it’s a good thing they’re so expensive or I’d come home with a new box of bricks every time. At the Target near our home, they have half of an entire aisle devoted to boxes and boxes of LEGO sets. Need a battle-axe-wielding LEGO dwarf figure? A LEGO shark? How about a giant LEGO skull, a la Indiana Jones? And who could pass a LEGO Star Wars’ Star Destroyer model without a wistful thought or two?

It struck me at that time, though, how incredibly specific so many of these pieces are. With all of those sets in your possession, you could build a secret agent headquarters with a boulder trap that crushes angry battle-axe-wielding dwarves as they drive by in Martian exploration buggies. Which themselves are adorned with flower beds and creeper vines. And you could do all that in under 10 LEGO bricks! (Or, maybe a few more than that.)

Did you know that LEGO currently produces over 900 distinct LEGO pieces, or “elements” as they call them? Over the course of their history, there have been almost 13,000 distinct elements created. Now, that number includes variations in color and material, but even if you exclude those permutations, you’re still left with a staggering 2,800 different elements in the LEGO line.

It’s interesting that LEGO tends to encourage the use of specific pieces, rather than letting you build those pieces from more fundamental parts. It means that in order to master LEGO brick building, you have to know all of the pieces available to you, and have a good intuitive feel for how and when they should be used. That’s…a lot of information to keep tabs on. Myself, I just keep to the standard rectangular blocks and plug an exotic or two on as an afterthought when I see one that looks cool.

Also, if you’ve built up a model, and decide later that you want to change or extend some part of the model, you’ll often have to dismantle part (or all!) of it in order to do so. Kind of a pain.

Regardless, I still love building with LEGO bricks, and I suspect I always will.


Now, my son being all of 6 years old, his attention span requires us to spend no more than a few minutes in any one toy aisle. So, long before I was ready to tear my eyes away from the LEGO sets, we found ourselves in the next aisle over. This was a much more colorful aisle, with bright pastels coloring various pre-school toys. My son, though, has nothing against pre-school toys, and was more than willing to drag me through them.

My eyes caught on the Play-Doh section.

The Play-Doh section at this Target is small, maybe 8 different hangers and a few square feet of shelf-space. You can get Play-Doh in as many as 50 different colors, but regardless of color, it’s all still the same thing: a bucket of malleable dough that you can pound, press, pinch, roll, and sculpt. (And rub into the carpet. And hair. And clothes. But we won’t get into that.)

Honestly, Play-Doh has a bad rap as a pre-schooler toy. It’s remarkably fun to play with. You can do all kinds of things with Play-Doh that you just can’t do with LEGO bricks. For example, the other day I built an arch out of cubes of Play-Doh that were held together only by friction. (You may not be impressed, but my 6-year-old was.)

The best part is that it doesn’t require so much memorization to become proficient in Play-Doh, though it might require more of an artistic streak than LEGO bricks do. Since I’m more engineer than artist, my Play-Doh creations tend to come out blocky and functional, rather than elegant and designed, but then, so do my LEGO creations.

Also, where LEGO models require significant work to alter or extend, Play-Doh models are dead-simple. If you want to add something to the base of your model, just graft more Play-Doh onto it. Want to change the shape of the keystone of your arch? Just pinch and mold in place. Simple!

Interestingly, I’ve found that while you can’t build with LEGO bricks using Play-Doh construction techniques, you can build with Play-Doh using LEGO construction techniques. Just build bricks out of Play-Doh. It’s unwieldy and impractical, but it can be done. The real question is: why would you want to? It’s pretty obvious that to build with Play-Doh, you should just embrace Play-Doh’s own strengths and run with it.

As obvious as that may seem, the lesson didn’t click for me for a long time. It’s not that I went about building Indiana Jones sets out of Play-Doh, one brick at a time. Rather, I didn’t realize that the same lesson applied to programming languages.

Java and LEGOs

Consider Java. Most would consider it the poster child of “enterprise” environments (though .NET is giving it a run for its money). And would you believe, Java and LEGO bricks have several things in common?

As of Java 1.6, there are well over 11,000 different classes and interfaces available to programmers in the standard library. (That’s not even counting the inner and anonymous classes that are usually not publicly documented.) Eleven. Thousand. Classes.

This is readily apparent when you consider the set of collection implementations that Java ships with.

Collection Interfaces:

  • Collection
  • Set
  • List
  • Queue
  • Deque
  • Map
  • SortedSet
  • SortedMap
  • NavigableSet
  • NavigableMap
  • BlockingQueue
  • BlockingDeque
  • ConcurrentMap
  • ConcurrentNavigableMap

General-Purpose Collection Implementations:

  • HashSet
  • TreeSet
  • LinkedHashSet
  • ArrayList
  • ArrayDeque
  • LinkedList
  • PriorityQueue
  • HashMap
  • TreeMap
  • LinkedHashMap

Special-Purpose Collection Implementations:

  • WeakHashMap
  • IdentityHashMap
  • CopyOnWriteArrayList
  • CopyOnWriteArraySet
  • EnumSet
  • EnumMap

Concurrent Collection Implementations:

  • ConcurrentLinkedQueue
  • LinkedBlockingQueue
  • ArrayBlockingQueue
  • PriorityBlockingQueue
  • DelayQueue
  • SynchronousQueue
  • LinkedBlockingDeque
  • ConcurrentHashMap
  • ConcurrentSkipListSet
  • ConcurrentSkipListMap

Abstract Collection Implementations:

  • AbstractCollection
  • AbstractSet
  • AbstractList
  • AbstractSequentialList
  • AbstractQueue
  • AbstractMap

Yes, that is FORTY-SIX different interfaces and implementations related to collections. Now, just like LEGO construction, having this volume of distinct elements on hand affects how you architect things. Writing software becomes more of a smorgasbord, where you pick and choose the specialized bricks you need, fitting them together just so. It also means that, in order to master Java, you need to have that intuitive grasp of how and when to use those thousands of classes. When do you use a HashSet versus a TreeSet? When would you use an ArrayDeque, and when would you want to subclass an AbstractQueue? It’s all part of the job.

Also, IDEs are popular with Java in part because of the pain of refactoring. If you want to extend or modify a Java application, it can involve (like LEGO models) a lot of dismantling and reassembling.

Ruby as Play-Doh

But if Java is the LEGO of programming languages, then it could be argued that Ruby is the Play-Doh. Just as Play-Doh has been typically considered a pre-school toy, so Ruby has had a bad rap as a “toy” language, not fit for the “real world”. Also, compared to Java’s library of 11,000 classes, Ruby’s meager 1,400 classes (a number that does include internal and anonymous ones, but not modules) seem paltry. And collections! Look what Ruby has to offer:

  • Enumerable
  • Comparable (for elements within a collection)
  • Hash
  • Array
  • Set
  • SortedSet

Just 6 options, to Java’s 46. What if you need a queue? Well, Ruby’s Array class has a queue-like interface; you could just use that. What about a sorted map? In that case, you might need to make do with a sorted set, or you could write your own, but it’s not hard. Most data structures are not rocket science, and for those that are, you can bet someone else has implemented it already.
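
As a rough illustration of both points, the sketch below drains an Array in first-in, first-out order and hand-rolls a tiny sorted map on top of Hash. The names `drain_in_order` and `TinySortedMap` are hypothetical, invented for this example, and the map re-sorts on every iteration, which is fine for small collections but not a tuned data structure.

```ruby
# Array already behaves like a FIFO queue: push onto the tail, shift off the head.
def drain_in_order(items)
  queue = []
  items.each { |item| queue.push(item) }
  drained = []
  drained << queue.shift until queue.empty?
  drained
end

# A hand-rolled "sorted map" (hypothetical): a Hash that iterates in key order.
class TinySortedMap
  include Enumerable

  def initialize
    @data = {}
  end

  def []=(key, value)
    @data[key] = value
  end

  def [](key)
    @data[key]
  end

  # Yield [key, value] pairs in ascending key order.
  def each(&block)
    @data.sort_by { |key, _value| key }.each(&block)
  end
end

ages = TinySortedMap.new
ages["carol"] = 41
ages["alice"] = 29
ages.to_a   # => [["alice", 29], ["carol", 41]]
```

Including Enumerable means `map`, `select`, `first` and friends come along for free once `each` is defined.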

But when you need to extend or modify your application, Ruby is a dream. Like Play-Doh, you can often just “pinch and mold” in place, grafting new code on or pulling old code out.

Ruby’s philosophy is like Play-Doh’s: provide a basic set of tools and make it relatively simple to build something complex with them. The Ruby language itself is designed for this: closures, super-simple introspection of objects, runtime modification of existing objects, and the use of modules for extending classes and objects all tend to result in an environment that is simple, malleable, and extensible.
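
A minimal sketch of those four features follows. `Greeter`, `Shouting` and `make_counter` are made-up names used only for illustration.

```ruby
# Closure: the lambda keeps access to `count` after make_counter returns.
def make_counter
  count = 0
  -> { count += 1 }
end

class Greeter
  def greet(name)
    "hello, #{name}"
  end
end

# Runtime modification: reopen the existing class and graft on a method.
class Greeter
  def farewell(name)
    "goodbye, #{name}"
  end
end

# A module that can extend a class or a single object.
module Shouting
  def shout(name)
    greet(name).upcase + "!"
  end
end

counter = make_counter
counter.call                      # => 1
counter.call                      # => 2

greeter = Greeter.new
greeter.extend(Shouting)          # only this one object gains #shout
greeter.shout("ruby")             # => "HELLO, RUBY!"
Greeter.instance_methods(false)   # introspection: lists :greet and :farewell
Greeter.new.respond_to?(:shout)   # => false
```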

And just as you cannot use Play-Doh construction techniques with LEGO bricks, you also really cannot use Ruby programming techniques with Java. Using closures for delayed execution, or iteration, is tricky (at best) in Java, when it’s possible at all. Extending objects at runtime typically requires bytecode modification. And Ruby’s use of modules to extend classes and objects, while similar to both inheritance and interfaces, is slightly different from (and arguably more powerful than) either.

You can write Ruby programs using Java programming techniques, but just as using LEGO techniques with Play-Doh is unwieldy and overcomplicated, so is mimicking Java in Ruby.

This is the lesson that I was slow to learn.


Consider exhibit A, from my Copland library.

# copland/configuration/loader.rb (method bodies collapsed)
module Copland
  module Configuration
    class Loader
      attr_reader :search_paths
      attr_reader :loaders

      def initialize( search_paths=[] ); end  # ...
      def add_search_path( *paths ); end      # ...
      def add_loader( loader ); end           # ...
      def load( options={} ); end             # ...
      def load_path( path, options ); end     # ...
      def use_library( name ); end            # ...
    end
  end
end

Copland was my first stab at a dependency injection (DI) framework, and is more-or-less a feature-for-feature port of the HiveMind project in Java. (Ironically, it was the subject of my first presentation at a Ruby conference, in 2004!)

It was designed to automatically scan directories in the load path for YAML configuration files (I’ll mention those shortly), and load them up and parse them. The thing is, I imagined a case where someone might want to use XML instead of YAML. I couldn’t just leave these folks behind! So I made the whole configuration loading framework extendible. Want XML config files? Fine! Just implement an XML parser system and register it with the configuration loader framework, and you’re good to go!

That’s just wrong on so many levels. Always, always, always build just what you need, and only when you need it. You’re in Ruby, the Play-Doh of programming languages, and the cost of adding features later is really, really low. Remember YAGNI! Obviously, this principle holds in Java, too, but it really seems like the opposite philosophy has become the standard among many Java projects. It’s too bad, because it has contributed to a bad reputation that Java probably doesn’t entirely deserve.

Here’s a classic Java pattern that just really doesn’t translate to Ruby:

# copland/class-factory.rb
require 'singleton'

module Copland
  # NoSuchPoolException and NoSuchRegisteredClassException are defined
  # elsewhere in Copland.
  class ClassFactory
    include Singleton

    def initialize
      @pools, @constructors = Hash.new, Hash.new
    end

    # Create a named pool; the optional block builds instances for that pool.
    def create_pool( name, &block )
      block ||= proc { |k,*args| k.new( *args ) }
      @pools[ name ] = Hash.new
      @constructors[ name ] = block
    end

    def get_pool( name )
      pool = @pools[ name ] or raise NoSuchPoolException, name
      return pool
    end

    # Register a class under a name within a pool.
    def register( pool_name, name, klass )
      pool = get_pool( pool_name )
      pool[ name ] = klass
    end

    # Look up a registered class and instantiate it via the pool's constructor.
    def get( pool_name, name, *args )
      pool = get_pool( pool_name )
      klass = pool[ name ]
      raise NoSuchRegisteredClassException, "#{pool_name}:#{name}" unless klass
      constructor = @constructors[ pool_name ]
      return constructor.call( klass, *args )
    end
  end
end

This is an implementation of a class factory. In Ruby. The HiveMind project had a class factory, so the Copland project needed one, too!

But you know, class factories are absolutely pointless in Ruby. There are plenty of reasons for these in Java, but they just aren’t necessary in Ruby. Want a namespace? Declare the class in a module. Want the class to exist in multiple namespaces? Use constant assignment within whatever modules you desire. Need a dynamic lookup? Try #const_get. In the very worst case, just use a Hash if you need to map arbitrary strings to classes.

module A
  class B
  end
end

# method #1, use const_get to dynamically look up classes
name = "B"
klass = A.const_get(name)
object = klass.new

# method #2, use a hash to map arbitrary strings to classes
map = { "bimpl" => A::B }

Seriously. You don’t need explicit class factories in Ruby, because anything can be a class factory, implicitly.
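
The remaining idiom, making one class visible in several namespaces by constant assignment, can be sketched the same way. The `Storage`, `Cache`, `BACKENDS` and `build_store` names are hypothetical, chosen only for this example, which also combines the Hash and `const_get` lookups into one helper.

```ruby
module Storage
  class MemoryStore
    def name
      "memory"
    end
  end
end

# The same class exposed in a second namespace by plain constant assignment.
module Cache
  Store = Storage::MemoryStore
end

# Arbitrary string keys go through a Hash; anything else is treated as a
# constant name inside Storage.
BACKENDS = { "mem" => Storage::MemoryStore }.freeze

def build_store(key)
  klass = BACKENDS.fetch(key) { Storage.const_get(key) }
  klass.new
end

Cache::Store.equal?(Storage::MemoryStore)   # => true, one class, two names
build_store("mem").name                     # => "memory"
build_store("MemoryStore").name             # => "memory"
```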

I’ll mention one more painful Javaism that I ported to Copland. It’s so painful that I won’t even bother pasting it here—if you’re following along, look at examples/solitaire-cipher/lib/package.yml in the copland distribution.

If you do, what you’ll see are 106 lines of YAML describing how different Ruby objects in a simple, 250-line program should be initialized and connected. Yes. 106 lines of YAML. For 250 lines of Ruby.

Now, don’t get me wrong. YAML can be great for configuration. Rails, for instance, uses it for database connection information. The problem here, in Copland, was that I was using a static configuration for what would be better served with a block of code. Ruby reads elegantly; a YAML configuration file does not.

Fortunately, those wiser than myself showed me the way.

RubyConf 2004

I still remember Rich Kilmer, sitting in the front row in the October 2004 RubyConf. As I wrapped up my presentation on Copland and dependency injection, I asked if there were any questions.

Rich raised his hand. “Why didn’t you just use Ruby?”

I was confused by his question, and he had to explain. Why did I use YAML instead of just doing the configuration in Ruby code?

I think I mumbled something like “that would be a neat idea”. To me, it was a novel concept. I’d never heard of it before. You’d never see a Java program that was configured by writing Java code. That screams “hard coding”! But Ruby, you see, is different.

Ruby lets you write these beautiful little mini-languages. You’ll hear them called “Domain Specific Languages”, or DSLs. They are subsets of the Ruby language, and you’ll find them in Rake, Capistrano, RSpec, shoulda, and more. They’re really everywhere in Ruby, to varying degrees.

Although Rich tried to open my eyes, I think I would have continued to try and push Copland if it weren’t for Jim Weirich. Jim took the idea of a Ruby-ish DSL for dependency injection and made something concrete of it. A few days after the conference he forwarded me a draft of an article he was writing, in which he described dependency injection and gave a very simple (and very elegant) implementation of a DI framework in Ruby. Instead of static configuration, he’d written a basic DSL for declaring how the dependencies related to each other.
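
To give a feel for what declaring dependencies in Ruby instead of YAML can look like, here is a minimal sketch. It is not Jim Weirich’s implementation and not Needle’s API; `TinyContainer`, `Database`, `UserRepo` and `build_app` are all hypothetical names invented for this example.

```ruby
# A tiny container: services are declared as blocks and built lazily, once.
class TinyContainer
  def initialize
    @definitions = {}
    @instances = {}
  end

  # Declare a service. The block receives the container so it can look up
  # the other services it depends on.
  def register(name, &definition)
    @definitions[name] = definition
  end

  # Fetch a service, building and caching it on first use.
  def [](name)
    @instances.fetch(name) do
      definition = @definitions.fetch(name) do
        raise ArgumentError, "no such service: #{name}"
      end
      @instances[name] = definition.call(self)
    end
  end
end

Database = Struct.new(:url)
UserRepo = Struct.new(:database)

# The "configuration file" is just Ruby code.
def build_app
  app = TinyContainer.new
  app.register(:database) { Database.new("sqlite::memory:") }
  app.register(:users)    { |c| UserRepo.new(c[:database]) }
  app
end

app = build_app
app[:users].database.equal?(app[:database])   # => true, wired through the container
```

Because each declaration is a closure, nothing is instantiated until it is first requested, and the wiring reads as ordinary code rather than as a separate static format.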

It was a moment of epiphany for me. Suddenly, I got it. I understood what DSLs were about. I asked Jim for permission to take his simple implementation and build upon it.

The result was Needle.


Now, I’m much prouder of Needle than of Copland, because it is much closer to Ruby’s philosophy than to Java’s. There are some pretty cool designs in there, too, though I use the term “cool” here to mean “neat without having any real practical application.”

Needle, though better, was still far from the mark.

As an example of why it misses the mark, consider Needle’s “pipeline” concept. Conceptually, it allowed you to specify a sequence of post-processors that operated on an object, allowing you to wrap code around it and mimicking (among other things) AOP-like operations. It also let me (as the author of the library) easily implement things like deferred instantiation, singleton services, and the like.

For example, suppose you wanted to declare a “deferred singleton” service that logged all accesses to one of the methods. Underneath, Needle will create a pipeline of processors that operate on the service, returning a proxy object. The first time the proxy is accessed, it will check to see if the object has been instantiated yet. If it hasn’t, it’ll instantiate it (and cache it). The instantiation, though, actually just hands control to the next element in the pipeline, which in this case checks to see that the “singleton” constraint is enforced (i.e., all requests for this service return the same object, rather than instantiating a new object). The next pipeline element in the chain will wrap the interceptor code around the method in question, and yet another pipeline element would perform the actual object instantiation.
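
The general shape of such a pipeline can be sketched with nested lambdas. This is only an illustration of the idea, not Needle’s actual code or API: `build_pipeline`, `SINGLETON` and `logging_stage` are hypothetical names, the deferred proxy is left out, and the logging stage records requests for the service rather than calls to one of its methods.

```ruby
# Each stage takes the next stage and returns a new callable that wraps it.
# The innermost callable performs the actual instantiation.
def build_pipeline(stages, &instantiate)
  stages.reverse.inject(instantiate) do |next_stage, stage|
    stage.call(next_stage)
  end
end

# Stage: cache the first result so every request returns the same object.
SINGLETON = lambda do |next_stage|
  cached = nil
  -> { cached ||= next_stage.call }
end

# Stage: note every request in `log` before passing control along.
def logging_stage(log)
  lambda do |next_stage|
    lambda do
      log << :requested
      next_stage.call
    end
  end
end

log = []
service = build_pipeline([logging_stage(log), SINGLETON]) { Object.new }
first  = service.call
second = service.call
first.equal?(second)   # => true, the singleton stage cached the instance
log.size               # => 2, the logging stage saw both requests
```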

Pipelines really were pretty slick in Needle.

The problem, though, is that instead of leaving them as an implementation detail, I advertised them as one of Needle’s features. “Implement your own service models!” I cried. But, how often, really, is that likely to happen? Instead of exposing only the bare minimum of Needle’s API, I exposed as much of it as I could, because I could.

That’s a bad idea. Expose only what you need. The rest can be there, available, but not formally exposed. Only when (and if) you discover a need to expose more, should you expose more. This helps for several reasons.

  • A smaller API is easier to describe, document, and support.
  • A smaller API is easier for people to learn.
  • A smaller API is easier for you to test.
  • Extending a small API is much less onerous on your users than changing or restricting a larger API.
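
In Ruby, “available, but not formally exposed” maps naturally onto `private`. The sketch below uses a hypothetical `ReportBuilder` class with a single public entry point. Its helpers remain reachable through `send` for anyone who really needs them, but they are not part of the documented surface.

```ruby
class ReportBuilder
  # The one documented entry point.
  def build(rows)
    render(summarize(rows))
  end

  private

  # Implementation details: free to change without breaking callers.
  def summarize(rows)
    { count: rows.size, total: rows.sum }
  end

  def render(summary)
    "#{summary[:count]} rows, total #{summary[:total]}"
  end
end

builder = ReportBuilder.new
builder.build([1, 2, 3])                      # => "3 rows, total 6"
ReportBuilder.public_instance_methods(false)  # => [:build]
builder.send(:summarize, [1, 2, 3])           # still there, just not advertised
```

Promoting `summarize` to the public API later is a one-line change. Taking it away after people depend on it is not.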

Net::SSH 1.x

Now, I’ve since come to my senses, but at one time I was completely head-over-heels in love with dependency injection. Like any schoolboy crush, it embarrasses me now to think about it, but there’s no denying it. The proof is everywhere in my project history.

Net::SSH, in particular.

At the time, I was looking for a good demonstration of the flexibility and power of dependency injection, and since Net::SSH was another of my pet projects at the time, it seemed like the perfect candidate.

I was still stuck in the “just in case” mindset, though, and Net::SSH 1.x reflected that. Badly. For instance, I isolated all the OpenSSL crypto interfaces into their own module, because “what if someone wanted to plug in a different crypto lib?” Never mind that there was no other crypto lib for Ruby (and still isn’t, 4 years later). But WHAT IF?!?

Now, separation of concerns and modularity are good things, when used in moderation. But like any design pattern, it becomes evil when taken to extremes. Too much modularity and you wind up with component soup (and I hope you’re hungry, because you’re going to have a lot of it). With lots of tiny components, the interactions between those components can become difficult to test.

It also fuzzes the line between the public, documented API and the internal, private API. When you have two large components, it is very easy to say “A is public, and B is private”, but when you have two hundred components, where do you draw the line? It’s far too easy to let the “public” boundary meander a bit further into “private” territory than it should.

Even worse, when I added dependency injection to the mix, it became very, very difficult to follow the flow of the program, and to understand the dependencies. Pull up Net::SSH 1.1.4, for instance, and find net/ssh.rb. Just try and figure out how a connection session is instantiated. It’s a mess. Unless you’re familiar with Needle, it’ll probably take you a long time to discover that the actual services are configured in the various services.rb files, but even after you figure that out, you still have to figure out how the different services interrelate. It’s a mess.

But, isn’t that the opposite of what DI is supposed to do? Isn’t DI supposed to improve the maintainability and testability of your code? Yeah. The problem, though, was three-fold.

First, Net::SSH, though complex in its way, was not really complex enough to need a dependency injection framework. DI itself adds complexity, and a framework for doing dependency injection adds even more, so before you go that route you need to be very sure that the trade-off in complexity is worth it. If your project is too small, you’ll actually increase the complexity of your project by adding a framework for doing DI.

Secondly, I was using a DI framework at a level that was really too granular. I was using the framework to wire together everything. No component was too small! No object too insignificant! I was on the dependency injection horse, and riding it for all it was worth. If I’d taken the time to really understand the pattern, though, I would have learned that though the pattern itself may be applied at the micro level, using a framework to do so is like nuking a mosquito—it works, but it leaves a mess behind.

Which leads to the last problem with Net::SSH’s use of Needle: it is really only appropriate for wiring together components of an application. Very, very few (Ruby) libraries will ever be complex enough, in themselves, to justify adding a dependency injection framework to them. Rather, let the application wire the libraries together as (and when) it needs to. Any more granular than that, and you’ll run into the same quagmire I did, I promise you.

Dependency Injection in Ruby

So, is there no room for DI in Ruby? There definitely is. I use DI nearly every day in Ruby, but I do not use a DI framework. Ruby itself has sufficient power to represent any day-to-day DI idioms you need. Consider this one:

class A
end

class B
  def new_client(with=A)
    with.new
  end
end

Here, B declares a factory method for generating new client objects. Because Ruby lets you declare default values for method arguments, you can let the default client implementation be A, which is the common case. But for testing, you can easily inject a mock into that method by passing an explicit parameter.
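
A usage sketch of that idiom, with hypothetical names: `DefaultClient` stands in for A, `Service` for B, and `MockClient` is a hand-written test double.

```ruby
class DefaultClient
  def fetch
    "real response"
  end
end

class Service
  # `with` defaults to the common case; callers may pass any class instead.
  def new_client(with = DefaultClient)
    with.new
  end
end

# A hand-written mock used only in tests.
class MockClient
  def fetch
    "canned response"
  end
end

service = Service.new
service.new_client.fetch               # => "real response"
service.new_client(MockClient).fetch   # => "canned response"
```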

For cases where that doesn’t work, you can use a second factory method:

class A
end

class B
  def new_client
    client.new
  end

  def client
    A
  end
end

Then, in your tests, you can subclass B, overriding the client method to return your mock client implementation. It’s dependency injection, Jim, but probably not as you’ve known it.
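
Sketched out with hypothetical names (`Mailer` for A, `Notifier` for B), the test-side subclass might look like this:

```ruby
class Mailer
  def deliver(message)
    "sent: #{message}"
  end
end

class Notifier
  def new_client
    client.new
  end

  # The seam: subclasses override this to swap the implementation.
  def client
    Mailer
  end
end

# Test double and the subclass that injects it.
class RecordingMailer
  def deliver(message)
    "recorded: #{message}"
  end
end

class TestNotifier < Notifier
  def client
    RecordingMailer
  end
end

Notifier.new.new_client.deliver("hi")       # => "sent: hi"
TestNotifier.new.new_client.deliver("hi")   # => "recorded: hi"
```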

Hashes, too, are your friend. You can allow optional arguments via hashes to specify implementation classes, defaulting to the standard implementation classes but allowing clients to inject their own implementations where needed:

class A
end

class B
  def initialize(options={})
    @client_impl = options[:client] || A
  end

  def new_client
    @client_impl.new
  end
end
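
On Ruby 2.0 and later, the same idea can be written with keyword arguments instead of an options hash. The sketch below assumes hypothetical `Session`, `TcpTransport` and `FakeTransport` classes.

```ruby
class TcpTransport; end   # hypothetical default implementation
class FakeTransport; end  # hypothetical test double

class Session
  # The keyword default plays the role of `options[:client] || A`.
  def initialize(transport: TcpTransport)
    @transport_impl = transport
  end

  def new_transport
    @transport_impl.new
  end
end

Session.new.new_transport.class                            # => TcpTransport
Session.new(transport: FakeTransport).new_transport.class  # => FakeTransport
```

One practical difference is that a misspelled keyword raises ArgumentError, where a misspelled hash key would be silently ignored and the default used.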

“Loose coupling” and “high cohesion” are terms you’ll hear bandied about in defense of dependency injection, and those traits are certainly desirable. But strike a balance with pragmatism. There will be some who call me heretic for saying this, but don’t be afraid to introduce tighter coupling when it makes sense. Loose coupling everywhere is what I had with Net::SSH 1.x, and the result was nearly unmaintainable.

Be wise. You’re competent. Trust your instincts.

Lessons learned

If you read nothing else from this article, take to heart these bite-sized bullet-points:

  • Direct translations are rarely accurate. Try using the Google translator to translate a paragraph from English, to Italian, to Japanese, and back to English, and you’ll see what I mean. The same is true of programming languages. Each language has its own idioms, and trying to take what works well in one language and force it directly into another language is doomed to fail, more often than not.
  • Use your environment efficiently. Try as you might, you’ll never make a ball out of a LEGO brick by rolling it between your hands. You’ll just bloody your palms. Learn what your environment is capable of. Reading other people’s code is a great way to do this.
  • DSLs, not static configuration. Ruby excels at representing DSLs. Whenever you can, consider using a DSL instead of static configuration for your applications. You’ll find it will simplify a lot more than it complicates.
  • DI frameworks are unnecessary. In more rigid environments, they have value. In agile environments like Ruby, not so much. The patterns themselves may still be applicable, but beware of falling into the trap of thinking you need a special tool for everything. Ruby is Play-Doh, remember! Let’s keep it that way.
  • Just in time. Not just in case. Don’t play “what if” games when you’re coding. Practice discipline, and implement only what you need, when you need it. You’ll wind up with tighter, more testable code that is easier to maintain in the long run.

Learning to program is a journey, and I’m still learning, myself. I’m not perfect at applying the rules above, but I’ve found that when I do, I’m much happier. I think you will be, too.
