Generators in Go

Go doesn't have built-in support for generators, and honestly, it doesn't need to. The language has enough functionality for the general use cases, and given its philosophy of keeping things simple, generators could be a distraction that causes more problems than it's worth. That doesn't mean we can't build our own implementation for experimental purposes, though.

The core functionality of a generator is to "generate" data (often very large data) lazily and asynchronously. The two main operations are yield and next. In languages like JavaScript, Python, or Ruby, the syntax looks like this:

function* generateRandoms() {
    while (true) {
        yield Math.floor(Math.random() * 42);
    }
}

let generator = generateRandoms();
console.log(generator.next().value);

The above generates random numbers indefinitely, but it doesn't hang the process or exhaust memory. In languages like Java, one can create a similar abstraction on top of the built-in concurrency support:

Generator<Integer> generator = new Generator<Integer>() {
    public void run() throws InterruptedException {
        Random rand = new Random();
        while (true) {
            yield(rand.nextInt(42));
        }
    }
};

System.out.println(generator.next());
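The Generator class used here isn't part of the JDK; the snippet assumes a small abstraction like the sketch below, built on a SynchronousQueue so that each yield blocks until the value is consumed (a minimal sketch, error handling elided):

import java.util.concurrent.SynchronousQueue;

public abstract class Generator<T> {
    private final SynchronousQueue<T> queue = new SynchronousQueue<T>();

    public Generator() {
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    Generator.this.run(); // the user-supplied generator body
                } catch (InterruptedException ignored) {
                }
            }
        });
        producer.setDaemon(true); // don't keep the JVM alive
        producer.start();
    }

    public abstract void run() throws InterruptedException;

    // blocks until next() consumes the value
    protected void yield(T value) throws InterruptedException {
        queue.put(value);
    }

    // blocks until the generator yields a value
    public T next() throws InterruptedException {
        return queue.take();
    }
}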

The same idea applies to Go, using goroutines and channels. For a start, yielding is similar to starting a goroutine that sends values to a channel, blocking until each value is consumed.

func generateRandom() chan int {
    ch := make(chan int)
    go func() {
        for {
            ch <- rand.Intn(42)
        }
    }()
    return ch
}

Then reading the next value is equivalent to receiving from the channel:

gen := generateRandom()  
num := <-gen  
fmt.Println(num)  

However, this syntax doesn't feel like the generators we're used to, so let's make it more pleasing:

package generator

// Generator pulls values lazily from a generator function
// running in its own goroutine.
type Generator struct {
    yielder Yielder
}

// Yielder is handed to the generator function so it can emit values.
type Yielder struct {
    ch chan interface{}
}

type GeneratorFunction func(yielder Yielder)

// Next blocks until the next value is yielded. The boolean is false
// once the generator function has returned and the channel is closed.
func (gen Generator) Next() (interface{}, bool) {
    val, open := <-gen.yielder.ch
    return val, open
}

// Yield sends a value to the consumer, blocking until it's consumed.
func (yielder Yielder) Yield(val interface{}) {
    yielder.ch <- val
}

// New starts fn in a goroutine and closes the channel when fn returns.
func New(fn GeneratorFunction) Generator {
    yielder := Yielder{make(chan interface{})}
    gen := Generator{yielder}

    go func() {
        fn(yielder)
        close(gen.yielder.ch)
    }()
    return gen
}

Now creating a generator is as straightforward as:

gen := generator.New(func(yielder generator.Yielder) {
    for {
        yielder.Yield(rand.Intn(42))
    }
})

fmt.Println(gen.Next())  

Benefits of generators

The key features of generators are laziness and asynchrony, which enable a whole new way of writing code. For example, if we want to generate the first 100 prime numbers, an imperative algorithm could be:

  • Generate numbers starting from 2, indefinitely.
  • For each number, add it to the collection if it's a prime.
  • Repeat until 100 primes are collected.

An implementation in imperative programming looks like this:

func generatePrimes(limit int) []int {
    var primes []int
    num := 2

    for len(primes) < limit {
        isPrime := true
        for i := 2; i <= int(math.Sqrt(float64(num))); i++ {
            if num%i == 0 {
                isPrime = false
                break
            }
        }
        if isPrime {
            primes = append(primes, num)
        }
        num++
    }

    return primes
}

The code above mixes three concerns: number generation, prime checking, and limit checking. We programmers are trained to read logic like that and understand what it does, but regardless of how good you are, it's not always straightforward to grasp the meaning of the code right away. With generators, the three concerns can be separated, and the code looks a lot more pleasant.

func generateNumbers() generator.Generator {
    return generator.New(func(yielder generator.Yielder) {
        num := 0
        for {
            num++
            yielder.Yield(num)
        }
    })
}

type filterFn func(val interface{}) bool

func filter(seq generator.Generator, fn filterFn) generator.Generator {
    return generator.New(func(yielder generator.Yielder) {
        for val, open := seq.Next(); open; val, open = seq.Next() {
            if fn(val) {
                yielder.Yield(val)
            }
        }
    })
}

func take(limit int, seq generator.Generator) generator.Generator {  
    return generator.New(func(yielder generator.Yielder) {
        for i := 0; i < limit; i++ {
            val, open := seq.Next()
            if open {
                yielder.Yield(val)
            } else {
                break
            }
        }
    })
}

primes := take(100, filter(generateNumbers(), checkPrime))
for num, open := primes.Next(); open; num, open = primes.Next() {
    fmt.Println(num)
}
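checkPrime isn't shown above; it's the same trial-division test from the imperative version, adapted to the filterFn signature (the type assertion is needed because the generator channel carries interface{} values):

func checkPrime(val interface{}) bool {
    num := val.(int)
    if num < 2 {
        return false
    }
    for i := 2; i <= int(math.Sqrt(float64(num))); i++ {
        if num%i == 0 {
            return false
        }
    }
    return true
}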

This implementation generates numbers one by one and lets them "flow" through the chain of generators. We can generate numbers without hanging the program or allocating more memory than necessary. This opens up a whole new paradigm of reactive, or data-flow, programming.

Issues with generators

The biggest drawback of this implementation is performance. In my measurements, the generator-based solution to the prime problem above is about 30% slower than the imperative code, mainly due to the overhead of creating goroutines and passing data around via channels. This makes generators impractical for widespread use in Go just yet.
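The exact figure will vary by machine, but the comparison can be reproduced with Go's built-in benchmarking in a _test.go file. A sketch (note that the goroutines abandoned by take never exit, which is itself part of the cost of this design):

func BenchmarkImperative(b *testing.B) {
    for i := 0; i < b.N; i++ {
        generatePrimes(100)
    }
}

func BenchmarkGenerator(b *testing.B) {
    for i := 0; i < b.N; i++ {
        primes := take(100, filter(generateNumbers(), checkPrime))
        // drain the pipeline so both versions do the same work
        for _, open := primes.Next(); open; _, open = primes.Next() {
        }
    }
}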

Another issue is that once a generator is used, it has to be used with other generators as well. In asynchronous programming, the code is either completely lazy or not lazy at all. That means all the logic you write has to take generators as input and return generators as output. Sometimes this makes things more complicated than they would be with plain imperative code.
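One way to limit the spread is to drain a finite generator back into a plain slice at the boundary where the lazy code ends. A minimal sketch:

// toSlice collects a finite generator (e.g. one capped by take) into a
// slice; calling it on an infinite generator never returns.
func toSlice(seq generator.Generator) []interface{} {
    var vals []interface{}
    for val, open := seq.Next(); open; val, open = seq.Next() {
        vals = append(vals, val)
    }
    return vals
}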

Conclusions

Generators are a lot of fun to play around with. They make code cleaner and more modular. However, since there's no native implementation yet, the current workaround with goroutines and channels is not fast enough to be practical. I hope there will be a native implementation some day, but given Go's philosophy of simplicity, it's quite unlikely.

No More Inheritance

Let me be honest: I hate inheritance. No other programming feature annoys me as much as one class inheriting from another. And it's not just classical inheritance. Prototypal inheritance, which I used to think was a good idea, is no exception either. I have now nearly stopped using any kind of inheritance in the projects I work on, and I hope I can convince you to do the same.

What's wrong with inheritance

Inheritance creates the "banana-gorilla" problem. If you're trying to acquire a banana, but that banana is being held by a gorilla, and the gorilla, in turn, is holding on to a tree in the jungle, you end up getting the entire jungle. That's what inheritance does to your program - classes and modules have access to far more responsibilities than they need. The worst part is that you can't opt out of the unwanted inherited properties. The jungle is always there, whether you use it or not.

It's also easy to over-engineer when inheritance is available in a programming language. Most of the time in software development, what we need is a concrete implementation of a feature, not a bunch of taxonomies. If the requirement is to build a Dog and a Gorilla class, it's very tempting to create an Animal class for both of them to inherit from. If a Human class is added in the future, we may not resist the itch to add a Hominidae abstract class as well. I just want a dog and a gorilla, but I end up with the entire animal kingdom. This is how Object-Oriented Programming is taught in school, mainly because it's easier for beginners to relate to real-world concepts. But as we mature, it's sad that we keep those early habits and keep trying to make sense of software systems that way. Inheritance is not meant for code organization or code reuse. It's mainly a polymorphic tool, and a lot of the time, the need for polymorphism arises from our own desire to use it.
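For contrast with the composition version shown in the next section, the tempting inheritance-based design would look something like this:

public class Animal {
  public String walk() {
    return "walking";
  }

  public String stand() {
    return "standing";
  }
}

public class Dog extends Animal {
  public String bark() {
    return "Dog is barking";
  }
}

public class Gorilla extends Animal {
  public String holdBanana() {
    return "Gorilla is holding a banana. Don't try to get the banana!";
  }
}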

I hate it the most when reading code that abuses inheritance. Finding the particular method that handles the behavior I'm looking for requires jumping back and forth between classes in the inheritance chain. I have to keep a mental model of the inherited attributes and track the communication between overridden and inherited methods. It's nearly impossible to do without a proper IDE, which is absurd: I should only need an IDE to write code, not to read other people's code. The implementation of a feature should be easy to trace, and the responsibilities of a class or module obvious at a glance. I shouldn't have to search the entire jungle just to find something inside a banana.

Prototypal inheritance doesn't make things easier either. JavaScript is the only language I know with prototypal inheritance built in, which is already a huge improvement over classical languages. It eliminates the unnecessary concept of classes and instead relies on constructor functions. However, that in turn makes prototypal inheritance itself very complex. Just take a look at the code snippet below, taken from https://developer.mozilla.org/en-US/docs/Web/JavaScript/Inheritance_and_the_prototype_chain:

function A(a){
  this.varA = a;
}

A.prototype = {
  varA : null,  
  doSomething : function(){
    // ...
  }
};

function B(a, b){
  A.call(this, a);
  this.varB = b;
}
B.prototype = Object.create(A.prototype, {
  varB : {
    value: null, 
    enumerable: true, 
    configurable: true, 
    writable: true 
  },
  doSomething : { 
    value: function(){ // override
      A.prototype.doSomething.apply(this, arguments); // call super
      // ...
    },
    enumerable: true,
    configurable: true, 
    writable: true
  }
});
B.prototype.constructor = B;

var b = new B();
b.doSomething();

That's a lot of work just to implement "B inherits from A". The code is so ugly that every JS library out there would feel incomplete if it didn't include some syntactic sugar to make it more pleasant. When writing JavaScript, I don't really want to use external libraries unless they're absolutely necessary. So either I write the wrapper myself or stay away from prototypal inheritance altogether. I always choose the latter.

Replacement for inheritance

So what can we do instead of inheritance? You probably know the answer already - composition. "Composition over inheritance" has become a very popular ideology nowadays, but the majority of programmers I know, especially Java programmers, are still unfamiliar with it. The idea is simple: if you have a piece of code that should be shared among objects, extract it into a common module. Any object that needs the functionality can grab the methods from that common module. Let's go back to the Dog and Gorilla example and implement it with composition:

public class Dog {
  private Animal animal;

  public Dog(Animal animal) {
    this.animal = animal;
  }  

  public String walk() {
    return animal.walk();
  }

  public String bark() {
    return "Dog is barking";
  }
}

public class Gorilla {
  private Animal animal;

  public Gorilla(Animal animal) {
    this.animal = animal;
  }

  public String stand() {
    return animal.stand();
  }

  public String holdBanana() {
    return "Gorilla is holding a banana. Don't try to get the banana!";
  }
}

public class Animal {
  public String walk() {
    return "walking";
  }

  public String stand() {
    return "standing";
  }
}

The Animal class now becomes just a collection of common animal behaviors. Dog and Gorilla each hold an instance of Animal and can define their own behaviors using what it provides. It's a much better approach because now, instead of getting the behaviors directly through inheritance, Dog and Gorilla declare their own interfaces and use the general implementation in Animal where needed. We don't have to jump back and forth along an inheritance chain to tell what the responsibilities of the classes are. It may not fit the Has-A and Is-A relationship model that OOP design promotes, but that shouldn't matter. Software is not the real world. Stop trying to force real-world common sense onto it.
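Wiring it up is plain dependency injection:

Dog dog = new Dog(new Animal());
Gorilla gorilla = new Gorilla(new Animal());

System.out.println(dog.walk());           // "walking", delegated to Animal
System.out.println(gorilla.holdBanana()); // Gorilla's own behavior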

The same idea can easily be applied to other languages. In JavaScript, here is one way to implement the example:

function animalConstructor() {
  return Object.freeze({
    walk: function() {
      return "walk";
    },

    stand: function() {
      return "stand";
    }
  });
}

function dogConstructor(animal) {
  return Object.freeze({
    bark: function() {
      return "Dog is barking";
    },

    walk: function() {
      return animal.walk();
    }
  });
}

function gorillaConstructor(animal) {
  return Object.freeze({
    holdBanana: function() {
      return "Gorilla is holding a banana. Don't try to get the banana!";
    },

    stand: function() {
      return animal.stand();
    }
  });
}

var dog = dogConstructor(animalConstructor());
var gorilla = gorillaConstructor(animalConstructor());

There's no more hideous messing around with the .prototype property. We have a constructor function for each object type, which accepts its dependencies as arguments. Initializing the dependencies directly inside the constructors may also be preferable, as sketched below; in either case, the implementation is much more pleasant to reason about, and it's easy to write this code ourselves without the help of any external libraries.
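For example, the Dog constructor with its Animal dependency created internally rather than injected:

function dogConstructor() {
  // the dependency is initialized inside the constructor
  var animal = animalConstructor();

  return Object.freeze({
    bark: function() {
      return "Dog is barking";
    },

    walk: function() {
      return animal.walk();
    }
  });
}

var dog = dogConstructor();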

What about memory usage?

You may have noticed that this pattern requires a new instance of the common class/module for each object. Each dog and gorilla object gets its own animal object, which definitely uses more memory than traditional inheritance with shared parent objects. That may have been a big deal 10-20 years ago, but not anymore. Memory has become so abundant that saving a few object allocations hardly ever matters. Just look at the estimated memory usage for the above example in both Java and JavaScript:

Number of Dog objects: 10,000
Number of Gorilla objects: 10,000
Java memory usage with inheritance: ~600KB
Java memory usage with composition: ~1.2MB
JavaScript memory usage with inheritance: ~700KB
JavaScript memory usage with composition: ~1MB

On today's hardware, that's peanuts. Unless there's a memory leak somewhere in the application, the garbage collector should be able to handle those objects efficiently. Composition can actually be better for CPU usage, since the interpreter doesn't need to perform lookups along the inheritance chain. But again, on today's hardware, those micro-optimizations don't really matter.

Conclusion

I have stopped using inheritance when writing new code. The only reason inheritance still exists in my life is legacy code written by other people. I'm hoping that by writing this, I can convince some of you to at least give the idea a try. I feel that software engineering, and object-oriented programming in particular, needs to evolve to a new era, and we can't get there while inheritance is still around.

ReactJS server-side rendering with browser dependencies

Server-side rendering is one of the most powerful features of React. With renderToString() and browserify, a React component can be pre-rendered on the server to save page rendering time in the browser. There's an excellent example by Pete Hunt demonstrating this pattern here.

However, if your React component depends on browser-only libraries like jQuery, Bootstrap, or ValidateJS, things get a bit more complicated. Those libraries complain about the missing window and document objects, so requiring them in the server-side script will fail. That means something like this doesn't work:

client.js

var React = require('react');
var MyComponent = require('./my_component');

React.renderComponent(MyComponent(), document.getElementById('placeholder'));

server.js

var MyComponent = require('./my_component');
var React = require('react');
var markup = React.renderToString(MyComponent());
// render markup...

my_component.js

var React = require('react');

// this library needs a `window` object to initialize, so requiring it
// breaks on the server.
var FormValidator = require('./vendor/validate.js');

module.exports = React.createClass({
  componentDidMount: function() {
    // Set up form validation...
  },

  render: function() {
    return <form>My Form</form>;
  }
});

One solution to this issue is to inject the browser dependencies into the React component as props and mock them during server rendering:

my_component.js

var React = require('react');

var FormValidator;
module.exports = React.createClass({
  componentWillMount: function() {
    // grab the dependency from props; on the server this is null
    FormValidator = this.props.FormValidator;
  },

  componentDidMount: function() {
    // Set up form validation...
  },

  render: function() {
    return <form>My Form</form>;
  }
});

client.js

var MyComponent = require('./my_component');
var FormValidator = require('./vendor/validate.js');
var React = require('react');

React.renderComponent(MyComponent({
  FormValidator: FormValidator
}), document.getElementById('placeholder'));

In the server-side script, we just pass in null for the dependency when calling renderToString:

var MyComponent = require('./my_component');
var React = require('react');
var markup = React.renderToString(MyComponent({
  FormValidator: null
}));

It works, but it quickly becomes tedious as the number of frontend libraries increases. So instead of declaring each dependency directly in every React component, we can use a common module to hold references to all of them:

browser_dependencies.js

var dependencies = {};
module.exports = {
  setDependency: function(name, module) {
    dependencies[name] = module;
  },

  getDependency: function(name) {
    return dependencies[name];
  }
};

my_component.js

var React = require('react');
var FormValidator = require('./browser_dependencies').getDependency('validator');

module.exports = React.createClass({
  componentDidMount: function() {
    // Set up form validation with FormValidator...
  },

  render: function() {
    return <form>My Form</form>;
  }
});

And the client script is responsible for registering the dependencies, before requiring the components that read them:

var React = require('react');
var validator = require('./vendor/validate.js');
var browserDependencies = require('./browser_dependencies');

// register the dependency before requiring any component that reads it
browserDependencies.setDependency('validator', validator);

var MyComponent = require('./my_component');

React.renderComponent(MyComponent(), document.getElementById('placeholder'));

There's no need to do anything on the server side - just render your component to a string as normal.
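The server entry point stays as simple as before, since getDependency just returns undefined for anything that was never registered:

server.js

var MyComponent = require('./my_component');
var React = require('react');

// no setDependency calls needed on the server
var markup = React.renderToString(MyComponent());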

Note that any setup code that references the browser dependencies MUST be put in componentDidMount, not componentWillMount. componentWillMount is called on both the server and the client, so it will fail during renderToString.

You can find the example code for this pattern here.

Why I'm no longer using Rails

A lot of people have asked me why I stopped doing Rails development, especially after finally making my contribution to the framework with web_console. While it's true that my full-time job no longer involves Rails, that's not the real reason. The real reason is that I no longer find the Rails philosophy suitable for me.

Rails is too big. As a framework, it tries to include everything the community thinks is useful, a lot of which ends up being unnecessary. I often use Rails without core components like ActiveRecord or default gems like jQuery and CoffeeScript. Whenever there's something I don't need, I remove it from the runtime environment to gain more control and avoid overhead later on. While that works well for me, most Rails developers I work with don't like having convenient stuff taken away from them. I usually end up in pointless arguments about which components are needed and whether they should be removed. Those arguments could be avoided if we used lightweight frameworks that enforce manual inclusion of necessary components instead of removal of unwanted ones. The less opinionated a framework is, the less headache there is to put up with.

Another source of my frustration with Rails is the abundance and overuse of gems. Pure Ruby gems are generally well written, and some are fantastic at what they do. Rails gems, however, are the opposite. There are so many of them, almost one for every little thing you can think of in web development. In my previous projects, a Gemfile with 100+ gems was quite normal, carrying a whole bunch of features I'd never need. It's a nightmare to clean up when the project gets bigger. Moreover, I always try to stay away from gems that alter the database structure or the production environment in some way, including extremely popular ones like devise or paperclip. While they're undeniably convenient, they're not very transparent about what they do behind the scenes. If the inner workings of a gem are hard to tell, explaining the application structure built on top of it becomes a challenge. To keep the code maintainable, it's often best to implement the features yourself and reach for gems only when they're really needed. Unfortunately, lots of Rails developers don't understand this or don't have the time budget for it. Technical debt is then unavoidable.

Breaking down a monolithic Rails application is another thing I hate. At some point a Rails application becomes too big and hard to scale horizontally. The best strategy at that stage is to break it down into smaller services, each of which can be scaled individually in a distributed environment. That's not overnight work, and it requires a lot of commitment from both engineering and business. All the time saved by rapid development at the start of the project will probably be eaten up by this process, not to mention the frustration that goes along with any big refactoring. That's why nowadays I much prefer a non-monolithic, service-oriented approach. It gives me more control over the whole stack and the flexibility to adapt to changes later on.

To be clear, Rails is not bad at all. In fact, it's a fantastic framework for building conventional web applications and quick prototypes. But I've experienced the bad side of the framework, and its philosophy doesn't work for me anymore. I'm sure there are better ways to do Rails development, and people are having huge success with the framework. For now, though, I'd like to explore different options and find out what works best for each project I do. I'm still contributing to web_console, but it'll be a while before I build web applications with Rails again.

DustJS share variables between client and server

Sharing variables between the client and the server is one of the most powerful features of NodeJS. However, it took me quite a bit of time to figure out how to do it properly with the Dust templating engine (the LinkedIn fork).

Suppose we want to share the product object below with the client-side JavaScript:

app.get('/product/:id', function(req, res) {
  var product = store.getProduct(req.params.id);
  res.render('product', { product: product });
});

All you need to do is put this in your product.dust file:

<script>var product = {product|s|js};</script>

The expression {product|s|js} renders the product object as a JSON string without any escaping. Since JSON is valid JavaScript, the client script gets product back as a normal JavaScript object. s and js are two of the many filters LinkedIn Dust offers. You can get a list of all the available filters here.
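After that, the client script can use product like any other object. For example (the field names here are just illustrative):

// client.js - `product` is a global defined by the rendered <script> tag
console.log(product.id);
console.log(product.name);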