« Back to home

Generator in Go

Go doesn't have built-in support for generators. To be honest, it doesn't have to. Go has enough functionalities for the general use cases, and with the philosophy of simplifying everything, generators can be a distraction and cause more problems than they're worth. However, it doesn't mean we can't have our own implementation of generator for experimental purposes.

The core functionality of generator is to "generate" data (often very large data) lazily and asynchronously. The 2 main operators are yield and next. In languages like JavaScript, Python or Ruby, the syntax looks like this:

function generateRandoms() {  
    while (true) {
        yield Math.floor(Math.random() * 42);
    }
}

let generator = generateRandoms();  
console.log(gen.next());  

The above generates random numbers indefinitely but doesn't hang the process or cause memory overflow. In languages like Java, one can create an abstraction on top of built-in concurrency support:

Generator<Integer> generator = new Generator<Integer>() {  
    public void run() throws InterruptedException {
        while(true) {
          Random rand = new Random();
          yield(rand.nextInt(42);
        }
    }
};

System.out.println(generator.next());  

The same thing applies to Go with Goroutines and channels. For a start, yielding is similar to starting a goroutine and sending the value to a channel, blocking the goroutine until it's consumed.

func generateRandom() {  
    ch := make(chan int)
    go func() {
      for true {
        ch <- rand.Intn(42)
      }
    }()
    return ch
}

Then reading the next value is equivalent to reading from the channel

gen := generateRandom()  
num := <-gen  
fmt.Println(num)  

However, this syntax doesn't feel like the generator we're used to, so let's make it more pleasing:

package generator

type Generator struct {  
    yielder Yielder
}

type Yielder struct {  
    ch chan interface{}
}

type GeneratorFunction func(yielder Yielder)

func (gen Generator) Next() (interface{}, bool) {  
    val, open := <-gen.yielder.ch
    return val, open
}

func (yielder Yielder) Yield(val interface{}) {  
    yielder.ch <- val
}

func New(fn GeneratorFunction) Generator {  
    yielder := Yielder{make(chan interface{})}
    gen := Generator{yielder}

    go func() {
        fn(yielder)
        close(gen.yielder.ch)
    }()
    return gen
}

So now using creating a generator is as straight forward as:

gen := generator.New(func(yielder generator.Yielder) {  
    for true {
      yielder.Yield(rand.Intn(42))
    }
})

fmt.Println(gen.Next())  

Benefits of generators

The key features of generators are laziness and asynchrony, enabling a whole new way of writing code. For example if we want to generate the first 100 prime numbers, an imperative algorithm can be:

  • Generate numbers from 2 until indefinitely.
  • For each number, add it to the collection if it's a prime.
  • Repeat until 100 primes are collected.

An implementation in imperative programming looks like this:

func generatePrimes(limit int) []int {  
    var primes []int
    num := 2 

    for true {
        if len(primes) >= limit {
            return primes
        }

        isPrime := true
        for i := 2; i <= int(math.Sqrt(float64(num))); i++ {
            if num % i == 0 {
                isPrime = false
                break
            }
        }  
        if isPrime {
            primes = append(primes, num)
        } 
        num ++
    }

    return primes
}

The code above mixes 3 concerns: number generation, prime checking, and limit checking. We programmers are trained to look at logic like that and understand what it does, but regardless of how good you are, it's not always straigth forward to grasp the meaning of the code right away. With generators, the 3 concerns can be separated and the code looks a lot more pleasant.

func generateNumbers() generator.Generator {  
    return generator.New(func(yielder generator.Yielder) {
        num := 0
        for true {
            num++
            yielder.Yield(num)
        }
    })
}

func filter(seq generator.Generator, fn filterFn) generator.Generator {  
    return generator.New(func(yielder generator.Yielder) {
        val, open := seq.Next()
        for ; open; val, open = seq.Next() {
            if fn(val) {
                yielder.Yield(val)
            }
        }
    })
}

func take(limit int, seq generator.Generator) generator.Generator {  
    return generator.New(func(yielder generator.Yielder) {
        for i := 0; i < limit; i++ {
            val, open := seq.Next()
            if open {
                yielder.Yield(val)
            } else {
                break
            }
        }
    })
}

primes := take(filter(generateNumbers(), checkPrime), 100)  
num, open := primes.Next()  
for ; open; num, open = primes.Next() {  
    fmt.Println(num)
}

This implemenatation generates numbers one by one and let the numbers "flow" to the next generators. We can generate numbers without hanging the program and allocating more memory than neccessary. This opens a whole new paradigm of reactive or data flow programming.

Issues with generators

The biggest drawback with this implementation is performance. According to my calculation, the solution for the prime problem above using generators is 30% slower than imperative code. It is mainly caused by the overhead of creating goroutines and passing data around via channels. This makes generators not very practical to be widely used in Go just yet.

Another issue is that once a generator is used, it has to be used with other generators as well. In asynchronous programming, the code is either completely lazy or not at all. That means all the logic that you write has to take generators as input and return other generators as output. Sometimes this make things more complicated than it could be with imperative programming.

Conclusions

Generators are very fun to play around with. They make the code a lot more clean and modularized. However, because there hasn't been any native implementation yet, the current workaround with goroutines and channels is not fast enough to be practical. I really hope there'll be a native implementation some time soon, but given the Go's simple philosophy, it's quite unlikely.

Comments

comments powered by Disqus