I have been using Puppet on and off for almost a couple of years now, and I have seen it transform from a domain-specific language with lots of warts into a pretty and functional language. However, not all is fine and dandy. I feel that in some ways it could still be improved a lot.
Namespace support for modules in environments
Puppet has reasonably good support for namespaces in code through the special :: delimiter. It has a magical meaning: when Puppet sees it in the name of a class or a define, it expects to find the code one directory level deeper. From now on, I will only talk about class objects and not the other things which work similarly from the parser's point of view. However, all top-level names of class definitions (the part before ::, if it exists) live in the same, top-most scope. In a very big environment, you can easily run into a situation where two different modules (competing implementations) configure the same piece of software and have identical names.
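To make the collision concrete, here is a minimal Python sketch of the standard module layout that class names map onto (init.pp for the module's main class, one file per further name segment). The function name manifest_path is made up for this illustration and is not part of Puppet.

```python
from pathlib import PurePosixPath


def manifest_path(class_name: str) -> PurePosixPath:
    """Return the manifest file, relative to the modulepath, where a class
    with this name is expected to live (hypothetical helper)."""
    # A leading "::" only anchors the name at the top scope, so drop it.
    segments = class_name.lstrip(":").split("::")
    module, rest = segments[0], segments[1:]
    if not rest:
        # The class named after the module itself lives in init.pp.
        return PurePosixPath(module, "manifests", "init.pp")
    # Every further segment is one directory level deeper; the last is the file.
    return PurePosixPath(module, "manifests", *rest[:-1], rest[-1] + ".pp")


print(manifest_path("nginx"))
print(manifest_path("nginx::vhost"))
```

Two competing modules that both define a class called nginx resolve to the very same relative path, which is why only one of them can live in an environment at a time.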
This StackOverflow question shows that such a problem is not uncommon at all. The only “solution” that you have right now is to fork the module into your own repository and rename all of the class names to something like foo_nginx or bar_sssd.
Puppet’s code already expects a somewhat rigid structure in your environment, so it probably would not be painful to add another special separator to the include or contain statements. For example, it could look like this: modulename/nginx::vhost. Such syntax would follow the same naming rules as in Puppetfiles.
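As a rough idea of how little parsing this would need, here is a Python sketch that splits such a name into its parts. To be clear, this syntax does not exist in Puppet today; the names QualifiedName and parse_qualified are invented for the example, and I am assuming that the part before the slash is a single namespace segment.

```python
from typing import NamedTuple, Optional


class QualifiedName(NamedTuple):
    namespace: Optional[str]  # hypothetical prefix before "/", None if absent
    class_name: str           # ordinary Puppet class name, e.g. "nginx::vhost"


def parse_qualified(name: str) -> QualifiedName:
    """Split 'namespace/class::name' into its two parts (proposed syntax only)."""
    namespace, sep, class_name = name.partition("/")
    if not sep:
        # No slash: a plain class name, exactly as Puppet understands it today.
        return QualifiedName(None, name)
    if not namespace or not class_name or "/" in class_name:
        raise ValueError(f"malformed qualified name: {name!r}")
    return QualifiedName(namespace, class_name)


print(parse_qualified("foo/nginx::vhost"))
print(parse_qualified("nginx::vhost"))
```

Names without a slash pass through untouched, so existing manifests would keep working under this assumption.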
Better tooling
Puppet could use some better tooling. The server itself runs on JRuby, the JVM implementation of Ruby. I am not completely sure why it was chosen, and the JVM generally feels sluggish to me, but there is nothing inherently bad in that. The issue is that RSpec, the de facto standard Ruby testing framework, is not completely thread-safe. For example, this issue has been open for a few years by now. I hope that it will not turn into something like MySQL’s bug #11472, which is 10+ years old now!
The problem is that if you want to test Puppet code, you have to use the same JRuby because some things work a bit differently in it, especially things which call into C libraries, e.g. openssl. This means that all of your rspec-puppet tests need to be executed sequentially, i.e. they need to be written and run one by one in separate it blocks!
On the same note, the popular r10k environment deployment tool performs all of its actions sequentially. That makes deploying new environments take a very, very long time: minutes. Fortunately, smart people have re-implemented it so that the deployment works concurrently: g10k.
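The gain from deploying concurrently is easy to see in a toy Python sketch. This is not how r10k or g10k are actually implemented; fetch_module is a stand-in I made up that only sleeps to simulate a network round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def fetch_module(name: str) -> str:
    """Hypothetical stand-in for cloning or downloading one module."""
    time.sleep(0.05)  # pretend this is network latency
    return f"{name}: ok"


def deploy_all(modules: Iterable[str], max_workers: int = 8) -> List[str]:
    """Fetch all modules concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_module, modules))


print(deploy_all(["nginx", "sssd", "ntp"]))
```

Since fetching modules is mostly waiting on the network, the total time moves from the sum of all fetches towards the duration of the slowest one.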
r10k also has a complementary r10k-webhook, which adds batteries to r10k and permits deploying new environments on, for example, new commits to some repository. Unfortunately, it has lots of problems as well, like:
- new environments are exposed to the Puppet server before they are fully deployed, which can lead to spurious errors;
- the operations happen synchronously, so the client (e.g. GitHub) can cancel the request if it takes too long, which can easily happen in big environments.
The community can write replacements for these tools that solve these problems elegantly; however, it would be nice if they came from Puppetlabs themselves.
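The first problem in that list, half-deployed environments becoming visible, is usually avoided by building into a private directory and publishing with a single atomic rename. Here is a Python sketch of that idea under a few assumptions of mine: environments are exposed as symlinks, the deployment runs on a POSIX system, and the build callback stands in for whatever actually populates the environment. None of this is r10k-webhook code.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def publish_environment(envs: Path, releases: Path, name: str,
                        build: Callable[[Path], None]) -> Path:
    """Build an environment out of sight, then expose it in one atomic step.

    Assumes envs/<name> is either absent or a symlink, never a real directory.
    Old releases are not cleaned up in this sketch.
    """
    releases.mkdir(parents=True, exist_ok=True)
    # The slow part happens in a directory the server never looks at.
    release = Path(tempfile.mkdtemp(prefix=f"{name}-", dir=releases)).resolve()
    os.chmod(release, 0o755)  # mkdtemp creates the directory as 0700
    build(release)

    # Create the new symlink under a temporary name, then rename it over the
    # live one. rename() is atomic on POSIX, so readers see either the old
    # release or the new one, never a partial tree.
    tmp_link = envs / f".{name}.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(release, tmp_link)
    os.replace(tmp_link, envs / name)
    return release
```

The second problem is a separate concern: the webhook would have to acknowledge the request right away and run the deployment in the background.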
The movement towards “immutable” infrastructure
Last but not least, let’s talk about the movement towards “immutable infrastructure” within the wider DevOps movement, with tools like Terraform riding the wave. Configuration management only ensures that certain actions are performed, which leads to a certain configuration; it does not check for and revert everything that has previously been done manually. This is where “immutable infrastructure” comes in.
But… do tools like Puppet and Salt still have a place in the modern IT world when we have such things as Kubernetes? I would say that the answer is yes – something still needs to stand up the machines and images which run those Kubernetes clusters. Even if we are just standing up stateless machines with images built from somewhere else – those images still need to be built repeatably and fast.
This is where software like Packer comes into play. It has support for lots of different provisioners and one of them is Puppet. Thus, as time goes on, we will still have competitors and innovation in this space.
I have included this section in this post because I feel that configuration management does not encourage “immutable infrastructure” enough. Sure, we have more automation on top like Spinnaker which helps out with tearing down everything and pulling up everything again but I feel that this has not been emphasized enough.
As time goes on, the state of your machines will inevitably diverge from what you have in your code. Of course, Puppet does not go that deep, but I feel that such tools should be more sophisticated and somehow nudge their users to at least periodically build everything from scratch from the code in their repositories. Nightly tests of Puppet profiles/roles and chaos monkey tests are one possible answer to this; however, not everyone does them. I think that it would be cool for tools such as Packer to get support for overlay filesystems: the underlying, read-only filesystem would be prepared by a provisioner like Puppet, and then all of the mutable things would be performed on an “overlay” over it. Time will tell.
Thanks for reading and let me know your thoughts!