Property-based testing in Golang

(Image: a simple house, a.k.a. a property. We will talk about properties! Wait, no… actually about testing software.)

Testing software is ubiquitous, and people naturally expect it to be part of any software development process. It can take many different forms:

  • at the most rudimentary level: ad-hoc testing;
  • integration testing;
  • synthetic testing;
  • and many others

One of the more novel forms is property-based testing. Essentially, the idea is to check whether the software you have produced exhibits certain characteristics under inputs that have certain distinctive qualities. It sounds very similar to ordinary unit testing; the catch is that the inputs come from a random number generator. You can also think of it as fuzzing, but for ordinary code rather than binary interfaces. It lets you run a lot of test cases very quickly and find the edge cases under which your code might not work as expected.

Unfortunately, just pushing random data into your functions is not very useful by itself. That is why the process of “shrinking” was invented: it is a mechanism by which a random failing input is reduced to a minimal test case that shows which property fails and on what input.

There is quite a comprehensive book on this topic, “PropEr Testing” by Fred Hebert, which covers QuickCheck-style testing. I recommend it; you can find a lot of information there. Here, however, we will focus on how to do this in the Go programming language. For that, we will use the featureful gopter library, which includes all the necessary batteries for property-based testing. You can still use the book as a reference, because the library tries “to bring the goodness of QuickCheck to Go”. Let’s begin by running through the parlance of property-based testing.

Terminology

Generators are simply things that generate data for the functions under test. gopter has a bunch of generators ready for you to use in the gen package. You can probably find anything you would ever want to generate in there.
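To get a feel for them, here is a minimal sketch of drawing a value from a generator by hand; normally gopter does this for you when it evaluates a property, and I am assuming the Gen.Sample helper here:

package main

import (
	"fmt"

	"github.com/leanovate/gopter/gen"
)

func main() {
	// Draw a single value from a generator directly; gopter normally
	// does this for us while evaluating properties.
	g := gen.IntRange(0, 100)
	if value, ok := g.Sample(); ok {
		fmt.Println(value) // prints some int between 0 and 100
	}
}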

Near the bottom of the gen package documentation there is also what is called a “weighted generator”: you can pass it a bunch of generators, each with its own weight, which specifies the probability that that generator will be used. It is useful when your function, for example, accepts an interface{} argument and does type assertions inside.
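As a rough sketch (the exact Weighted/WeightedGen names and fields are an assumption from memory of the gen package docs), generating ints roughly three times as often as strings could look like this:

// mixedGen produces ints about three times as often as strings; treat
// the Weighted/WeightedGen API shown here as an assumption.
var mixedGen = gen.Weighted([]gen.WeightedGen{
	{Weight: 1, Gen: gen.AlphaString()},
	{Weight: 3, Gen: gen.IntRange(0, 100)},
})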

The gen package also contains shrinkers, which were partially described in the previous section. To repeat: shrinkers reduce a failing random input to a smaller input that still exhibits the failure. For example, a uint64 shrinker first shrinks the value to 0, and then produces candidates by subtracting from the original value the original divided by 2, 4, 8 and so on (effectively a bit shift). Thus, we eventually land on a small value that still demonstrates the problem, if there is one.
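To make the halving strategy concrete, here is the scheme described above written out; this is only an illustration, not gopter’s actual implementation:

// shrinkCandidates lists the values a halving shrinker would try for a
// given original value: 0, then original-original/2, original-original/4, …
// This mirrors the description above, not gopter's internals.
func shrinkCandidates(original uint64) []uint64 {
	var candidates []uint64
	for half := original; half > 0; half /= 2 {
		candidates = append(candidates, original-half)
	}
	return candidates
}

For an original value of 100 this yields 0, 50, 75, 88, 94, 97 and 99, so a failing case gets pulled towards the smallest value that still fails.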

Another interesting part of gopter is the commands package. It helps you implement stateful property-based tests. Ordinarily you would be testing functions that store no state anywhere; as we know, in real life that is not always the case. For that, the package contains nice and easy-to-use helpers such as ProtoCommands (a full example follows later in this article). You can find more information here.

The arbitrary package, in turn, provides ways to combine multiple generators together using reflection, which we touched on just a bit before. For example, it can derive a generator for a composite type, such as a struct, from the generators for its parts.
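A rough sketch of what that can look like (the Point type and the property are made up for illustration, and the DefaultArbitraries/ForAll calls are assumed from the arbitrary package docs):

import (
	"math"
	"testing"

	"github.com/leanovate/gopter"
	"github.com/leanovate/gopter/arbitrary"
)

// Point is a hypothetical struct whose fields gopter fills in via
// reflection when the property below is evaluated.
type Point struct {
	X int
	Y int
}

func TestPointDistance(t *testing.T) {
	parameters := gopter.DefaultTestParameters()
	properties := gopter.NewProperties(parameters)
	arbitraries := arbitrary.DefaultArbitraries()

	// The parameter types of the condition function tell the arbitrary
	// package what to generate.
	properties.Property("distance is never negative", arbitraries.ForAll(
		func(p Point) bool {
			dx, dy := float64(p.X), float64(p.Y)
			return math.Sqrt(dx*dx+dy*dy) >= 0
		},
	))

	properties.TestingRun(t)
}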

Finally, the main gopter package ties all of these fun things together so that you can use them in your tests. It also has some niche features that I will not look at in this article, but which you should try if needed: bi-directional mapping, combining different generators, and the meaning of the different test outcomes (undecided, exhausted, and so on).

Examples

The documentation of gopter already has a lot of examples, so please feel free to explore them. With the knowledge you have now, that should be easy. Still, let me give you two examples so that you can hit the ground running and start using it in your projects in no time!

Fibonacci numbers

Perhaps you have some code in your program which calculates Fibonacci numbers. Let me remind you that the Fibonacci numbers form a sequence in which each number is the sum of the two previous ones, and it starts with 1 and 1.

Let’s get back to the code. For example, there could be a function fib(n uint) []int which returns a slice of length n containing the first n Fibonacci numbers. It could look something like this:

// fib returns a slice with the first n Fibonacci numbers.
func fib(n uint) []int {
	ret := []int{}
	a, b := 1, 1
	for n > 0 {
		ret = append(ret, a)
		a, b = b, a+b
		n--
	}
	return ret
}

Such code lends itself very nicely to property-based testing, since the returned data has to follow the property mentioned before (for example, fib(5) returns [1 1 2 3 5]). Let’s use a uint generator and the main gopter package to write a simple property-based test:

func TestFib(t *testing.T) {
	parameters := gopter.DefaultTestParameters()
	parameters.Rng.Seed(2000)
	parameters.MinSuccessfulTests = 20000
	properties := gopter.NewProperties(parameters)

	properties.Property("correct data", prop.ForAll(
		func(n uint) bool {
			r := fib(n)
			switch len(r) {
			case 0:
				return true
			case 1:
				return r[0] == 1
			case 2:
				return r[0] == 1 && r[1] == 1
			default:
				for i := 2; i < len(r); i++ {
					if r[i] != r[i-1]+r[i-2] {
						return false
					}
				}
				return true
			}
		},
		gen.UIntRange(0, 5555),
	))

	properties.TestingRun(t)
}
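Before we unpack the test: if you put both snippets into a single fib_test.go, the imports it needs would look roughly like this (the gopter packages live under github.com/leanovate/gopter):

import (
	"testing"

	"github.com/leanovate/gopter"
	"github.com/leanovate/gopter/gen"
	"github.com/leanovate/gopter/prop"
)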

There is a lot to unpack. First, we set up the test parameters object: the random number generator seed is set to a constant so that we get the same results each time, and the minimum number of successful tests is bumped up so that our function is exercised more thoroughly. Without that, the default minimum number of successful tests is 200.

Then the properties are set up. There is only one: that correct data is returned. Inside it, we create a function which takes the generator’s value and returns a bool: true if the returned data satisfies the property, and false if not.

A generator which produces uint values in the range from 0 to 5555 is used. If the function satisfies the property for those inputs, it most likely works for all kinds of inputs, no matter what they are.

In the end, we run the property tests. You can execute all of this by putting the content of these two blocks into one file with those imports and running: go test -v fib_test.go.

=== RUN   TestFib
+ correct data: OK, passed 20000 tests.
Elapsed time: 346.031585ms
--- PASS: TestFib (0.35s)
PASS
ok  	command-line-arguments	0.352s

Thanos case

I have been a maintainer of Thanos for some time now; it is quite a big Go project. Recently I successfully used gopter to catch a bug in one function. That function is pretty important: it selects which blocks of data to download depending on the selected maximum resolution of the data and on the time range (minimum/maximum time).

I have written two different property-based tests there.

As the test data, a typical production-grade state was embedded to cover all the relevant cases. And the tests caught one serious error: sometimes the function did not select blocks that it should have. Essentially, the getFor() function (the one under test) only selected the lowest-resolution data and then went “to the sides” (in terms of time, if we imagine it laid out from left to right) to get the higher-resolution data, but never in the middle. The property-based tests quickly caught this mistake because the results sometimes did not satisfy the “fullness” criteria.
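To give a flavour of what such a “fullness” check boils down to (the types below are simplified stand-ins, not Thanos’ actual ones), it is essentially verifying that the selected blocks cover the requested time range without gaps:

// block is a simplified stand-in for a block's metadata.
type block struct {
	minTime, maxTime int64
}

// coversRange reports whether the selected blocks, assumed to be sorted
// by minTime, cover the range [mint, maxt] without any gaps.
func coversRange(selected []block, mint, maxt int64) bool {
	covered := mint
	for _, b := range selected {
		if b.minTime > covered {
			return false // a gap before this block
		}
		if b.maxTime > covered {
			covered = b.maxTime
		}
	}
	return covered >= maxt
}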

Example stateful test

Last but not least, let’s look at stateful tests. You will not always be lucky enough to have stateless functions in your code, like the previous one, that you can easily test. That’s why the gopter library has a nice commands package with the functions needed to test such code as well.

Imagine that you have some code in your program which decides whether to hand out a pizza to someone who has requested one, and there is also another action: someone can bake a new pizza. The code could look like this:

type Pizza struct{}

type Pizzeria struct {
	pizzasLeft int
	n          int
}

// NewPizzeria creates a pizzeria with n pizzas available.
func NewPizzeria(n int) *Pizzeria {
	return &Pizzeria{pizzasLeft: n, n: 0}
}

// GetOut hands out a pizza if one is available, or nil otherwise.
func (p *Pizzeria) GetOut() *Pizza {
	p.n++
	if p.n > 3 {
		return nil
	}
	if p.pizzasLeft > 0 {
		return &Pizza{}
	}
	return nil
}

// Bake makes one more pizza available.
func (p *Pizzeria) Bake() {
	p.pizzasLeft++
}

You can spot the bug in the GetOut() code: it will not give out any more pizzas after three have been taken out. We will try to catch it with property-based tests, by testing the property that we can always take out a pizza once one has been baked.

Let’s say that there are two commands: a Bake() command which bakes a new pizza and a GetOut() command which takes one out (if possible).

The commands are defined by using the commands.ProtoCommand struct. Here is their code:

var GetOutCommand = &commands.ProtoCommand{
	Name: "GET",
	RunFunc: func(systemUnderTest commands.SystemUnderTest) commands.Result {
		return systemUnderTest.(*Pizzeria).GetOut()
	},
	PostConditionFunc: func(state commands.State, result commands.Result) *gopter.PropResult {
		if state.(int) > 0 && result.(*Pizza) == nil {
			return &gopter.PropResult{Status: gopter.PropFalse}
		}
		return &gopter.PropResult{Status: gopter.PropTrue}
	},
	NextStateFunc: func(state commands.State) commands.State {
		return state.(int) - 1
	},
}

var BakeCommand = &commands.ProtoCommand{
	Name: "BAKE",
	RunFunc: func(systemUnderTest commands.SystemUnderTest) commands.Result {
		systemUnderTest.(*Pizzeria).Bake()
		return nil
	},
	NextStateFunc: func(state commands.State) commands.State {
		return state.(int) + 1
	},
}

var pizzeriaCommands = &commands.ProtoCommands{
	NewSystemUnderTestFunc: func(initialState commands.State) commands.SystemUnderTest {
		return NewPizzeria(0)
	},
	InitialStateGen: gen.Const(0),
	InitialPreConditionFunc: func(state commands.State) bool {
		return state.(int) == 0
	},
	GenCommandFunc: func(state commands.State) gopter.Gen {
		return gen.OneConstOf(BakeCommand, GetOutCommand)
	},
}

That is a lot to unpack. Most of the struct members are self-explanatory, but here are their descriptions:

  • RunFunc executes the command against the real system under test
  • PostConditionFunc gets called when gopter wants to check whether the expected conditions still hold after executing the command. In our case, we check that we got a pizza if we had baked something
  • NextStateFunc gets called when gopter wants to compute the next model state of the system under test – in this case the state is increased or decreased by 1 because we have just baked or taken out one pizza
  • commands.ProtoCommands lets us define what must be true at the beginning, before executing the tests, how the system-under-test object is constructed, and which commands are available

Finally, let’s tie everything together and run the tests like this:

parameters := gopter.DefaultTestParameters()
parameters.Rng.Seed(1234)

properties := gopter.NewProperties(parameters)

properties.Property("pizzeria", commands.Prop(pizzeriaCommands))

properties.Run(gopter.ConsoleReporter(false))

You would get results like the following:

! pizzeria: Falsified after 11 passed tests.
ARG_0: initialState=0 sequential=[BAKE BAKE GET BAKE BAKE GET BAKE GET GET]
ARG_0_ORIGINAL (2 shrinks): initialState=0 sequential=[BAKE BAKE GET BAKE
   BAKE GET BAKE GET GET BAKE BAKE]

We can see that we got the command sequence BAKE BAKE GET BAKE BAKE GET BAKE GET GET after the original argument was shrunk 2 times. Indeed, after the 4th GET we did not get a pizza as we expected, even though we had baked 5 pizzas beforehand 😢
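By the way, if you would rather run the stateful property from go test like the Fibonacci example, the same pieces fit into an ordinary test function (with the same imports plus testing); a minimal sketch:

func TestPizzeria(t *testing.T) {
	parameters := gopter.DefaultTestParameters()
	parameters.Rng.Seed(1234)

	properties := gopter.NewProperties(parameters)
	properties.Property("pizzeria", commands.Prop(pizzeriaCommands))
	properties.TestingRun(t)
}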

Conclusion

As you can see, property-based testing is a really powerful concept that you should leverage in your own projects where appropriate. Please do comment if you have found any mistakes or if you want to discuss this further. Thanks for reading this far!

Introducing SAM: Similar Alerts Manager (My Side-Project)

 _______  _______  _______ 
(  ____ \(  ___  )(       )
| (    \/| (   ) || () () |   SIMILAR
| (_____ | (___) || || || |
(_____  )|  ___  || |(_)| |   ALERTS
      ) || (   ) || |   | |
/\____) || )   ( || )   ( |   MANAGER
\_______)|/     \||/     \|

(sorry, I do not have a professional designer at my disposal)

Why?

At the moment, Prometheus only supports a rudimentary way to look up which alerts have been firing in the past, and there is no way to “persist” that information: there is a synthetic metric called ALERTS which shows which alerts were firing, but that is not enough to tell globally what has been firing once Prometheus has restarted. You could use a full-fledged solution like Thanos to solve this problem, however that is very cumbersome if you only want this small feature and…

Furthermore, it is hard to tell how different alerts are related to each other. Right now, alerts are differentiated by their label sets. Yes, we could look at a graph of the different ALERTS series, but it would take a lot of squinting and deducing to see how different alerts are related, because some of them might have fired repeatedly in the past and thus be related to different alerts at different points in time, and so on. Plus, if you are using something like Thanos and different Prometheus instances have the same alert rules, you might end up with almost identical metrics in that graph, which makes things even harder.

Thus, something was needed that would look at those historical alerts, persist the data, and aggregate it so that it would be easy to look up similar alerts. This is where the Similar Alerts Manager comes in.

What?

Similar Alerts Manager, or SAM for short, is a daemon which sits in the background, periodically retrieves new alert information, parses it, saves it into a cache, and provides an HTTP API which lets users access this information.

It is a side project and thus not very polished, however it does its job. It sits somewhere between having nothing in this regard and using a fully fledged solution provided by a start-up such as SignifAI.

New alert information is retrieved from an Elasticsearch cluster specified by the user. New alerts are added there via alertmanager2es, which hooks into AlertManager. We could retrieve alerts directly from AlertManager by implementing an HTTP server ourselves, but I feel that pushing everything to an Elasticsearch cluster makes the data easier to explore through other means, such as a Kibana instance.

Related alerts are found by always keeping a stack of hashes of the label sets of the currently firing alerts: whenever a new firing alert comes in, all of the currently firing alerts are marked as related to it (see the sketch below).
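A minimal sketch of that idea (the types and names here are hypothetical, not SAM’s actual code):

// tracker is a hypothetical illustration of the relatedness bookkeeping
// described above.
type tracker struct {
	firing  map[uint64]struct{} // label-set hashes of currently firing alerts
	related map[uint64][]uint64 // label-set hash -> hashes of related alerts
}

func newTracker() *tracker {
	return &tracker{
		firing:  map[uint64]struct{}{},
		related: map[uint64][]uint64{},
	}
}

// observe records one incoming alert, identified by the hash of its label set.
func (t *tracker) observe(labelsHash uint64, isFiring bool) {
	if !isFiring {
		delete(t.firing, labelsHash)
		return
	}
	// Everything that is firing right now is considered related to the
	// newly firing alert.
	for other := range t.firing {
		if other != labelsHash {
			t.related[labelsHash] = append(t.related[labelsHash], other)
		}
	}
	t.firing[labelsHash] = struct{}{}
}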

For the persistence layer, I chose to try out Redis, which I had never used before. It is indeed very elegant, and its command model fits the use case very nicely. However, in the current version only a single string with all of the information is saved into Redis; in the future this might be reworked so that proper hash-map commands are used instead of the brain-dead way it is done right now.

I chose the Go programming language for this project as I feel it makes it easy to produce understandable, concurrent code, and it is easy to become productive with. Maybe for the next project I will choose something else, like Elixir or Rust, to try new things 🙂 I know Go is controversial in some regards, such as error handling and generics, but I still like it because it feels like a spiritual successor to the C programming language.

Here is what SAM’s architecture looks like:

(architecture diagram)

Lessons learned

Honestly, the first lesson I learned is that for a high-level language such as Go there are a lot of libraries and frameworks you should really try out before reinventing the wheel. It is amazing how much good free software is out there, and you should not be afraid to reuse it in your side projects.

Secondly, Go’s visibility rules are sometimes a pain. For example, I want to have a separate package responsible for the cache-related functions, but because it needs to read all of the state’s members, I cannot make them unexported. We could write a custom marshaler for that type, but that is painful, so the proper fix is to move the saving of the alert information into a completely separate part and put it into a hash map in Redis. This way we can hide the alert information and reduce RAM usage, because we would not have to store everything.
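For the curious, the custom-marshaler workaround mentioned above would look roughly like this (a hypothetical sketch using encoding/json, not SAM’s actual types):

// state is a hypothetical type with unexported fields that another
// package (for example, a cache package) cannot read directly.
type state struct {
	alerts map[string][]string
}

// MarshalJSON exposes the unexported data in a controlled way, so other
// packages only ever see the serialized form.
func (s *state) MarshalJSON() ([]byte, error) {
	return json.Marshal(struct {
		Alerts map[string][]string `json:"alerts"`
	}{Alerts: s.alerts})
}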

Thirdly, Travis CI is a pretty cool tool. I feel like it sits somewhere between Drone CI and a full-blown system that lets you do arbitrary custom things, like Jenkins or TeamCity. Travis already covers most use cases with ready-made templates for projects in different programming languages. For example, for Go you may only need to specify language: go and that will be it; everything else is handled for you.

Lastly, Elasticsearch can be tricky sometimes because it lets the user specify the refresh interval, i.e. the time between updates of the “view” of the data. This means that you can, theoretically, push new data and not see it in Kibana immediately. This is controlled by the refresh_interval option. The lesson is that Elasticsearch is quite advanced software, and there may be knobs you never even suspected existed.

What now?

Even though SAM works right now, it needs a lot of polish and improvements. How much attention it gets will, of course, depend on whether other people find it useful. I will fix the issues mentioned above soon.

Besides that, SAM is already usable and you can try it out. Just grab it from Docker Hub by running these commands:

docker pull stag1e/sam
docker run --rm -it -p 9888:9888 stag1e/sam --elasticsearch 'http://127.0.0.1:1234' --redis '127.0.0.1:3333'

You must tell SAM where Redis and Elasticsearch live with these options:

  • -l / --elasticsearch: the URL of the Elasticsearch server
  • -r / --redis: IP and port pair of the Redis server

You can use docker-compose to automatically prepare a simple deployment for testing purposes:

cd docker/
docker-compose -f docker-compose-dev.yml up -d

That will set up the following things in a simple configuration:

  • ElasticSearch
  • Kibana
  • AlertManager
  • Alertmanager2es
  • Redis

With that, you will be able to try out SAM. Afterwards, add the index template by running ./scripts/add_es_template.sh, and then you can run SAM itself. For brevity, I will not repeat here the instructions that are already available in the repository.

As always, pull requests and bug reports are welcome! Thank you for reading and happy hacking!