Equivocal Puppet Class Parameters

While working with Puppet recently I have noticed that there is some funny business going on with the rules of parameter naming. The Puppet’s documentation states:

Parameter names begin with a dollar sign prefix ($). The parameter name after the prefix:

Must begin with a lowercase letter.

Can include lowercase letters.

Can include digits.

Can include underscores.

Let’s see if this is true. Tried applying this manifest with Puppet 5.5.10 which is available on Ubuntu Focal Fossa (20.04):

class test(
  String $_content
){
  file {'/tmp/helloworld':
    content =>  "${_content}\n",
  }
}


class { 'test':
  _content => "123"
}

And it does not work as expected:

$ puppet apply hello.pp
Error: Could not parse for environment production: Syntax error at '_content' (file: /home/gstatkevicius/dev/puppet_testing/manifest/hellofile.pp, line: 11, column: 3) on node gstatkevicius-desktop

Now, let’s try creating a simple hiera configuration:

---
:backends:
  - yaml
:yaml:
  :datadir: /home/gstatkevicius/dev/puppet_testing/hieradata
:hierarchy:
  - test
:logger: console
$ cat hieradata/test.yaml 
test::_a: "Hello World!"

It looks like the hiera part works:

hiera -c ./hiera.yaml -d test::_a
DEBUG: 2020-11-03 23:39:00 +0200: Hiera YAML backend starting
DEBUG: 2020-11-03 23:39:00 +0200: Looking up test::_a in YAML backend
DEBUG: 2020-11-03 23:39:00 +0200: Looking for data source test
DEBUG: 2020-11-03 23:39:00 +0200: Found test::_a in test
Hello World!

Now let’s see if applying the manifest with Puppet works:

$ puppet apply --hiera_config=hiera.yaml manifest/hellofile.pp
Notice: Applied catalog in 0.01 seconds
$ cat /tmp/helloworld
Hello World!

Oh, so now everything is OK? It seems that for APL – automatic parameter look-up – the rules are a bit different. My guess at this point would be that they are treated as regular variables instead. I personally haven’t found a way to instantiate a class where one parameter starts with an underscore. Thus, I think we can formulate one lesson:

To prevent your class from ever being instantiated by other classes in Puppet with explicit arguments, start at least one class parameter with an underscore.

But, one question remains – why is it actually considered a syntax error? What makes the underscore character forbidden in names of class parameters whereas it works for regular variables?

Now, I’m not an expert in how Puppet parsing but let’s take a short trip down Puppet’s code-base and see what’s happening.

A quick grep of Syntax error at shows that there is some kind of function SYNTAX_ERROR that gets used whenever there is a need to print a message that a syntax error has occurred.

Digging a bit deeper, it seems that there is some kind of parser being generated from a grammar. Other Puppet developers have kindly documented this process for us in docs/parser_work.md. We are finally able to find the grammar in lib/puppet/pops/parser/egrammar.ra.

Valid name of a variable seems to be expressed here in lexer2.rb:

PATTERN_DOLLAR_VAR     = %r{\$(::)?(\w+::)*\w+}

I believe that what is a valid argument passed to a class is defined here:

#---ATTRIBUTE OPERATIONS (Not an expression)
#
attribute_operations
  :                                                { result = [] }
  | attribute_operation                            { result = [val[0]] }
  | attribute_operations COMMA attribute_operation { result = val[0].push(val[2]) }

  # Produces String
  # QUESTION: Why is BOOLEAN valid as an attribute name?
  #
  attribute_name
    : NAME
    | keyword

A valid NAME is defined here in lib/puppet/pops/patterns.rb:

NAME = %r{\A((::)?[a-z]\w*)(::[a-z]\w*)*\z}

So, it seems like the argument’s name is rejected because it does not follow this regular expression even though it is accepted by the lexer. To be fair, Puppet’s documentation also states:

CAUTION: In some cases, names containing unsupported characters might still work. Such cases are bugs and could cease to work at any time. Removal of these bug cases is not limited to major releases.

All in all, it’s probably best to follow the letter of the law laid out in Puppet’s documentation as it says here but if you want to forbid the users of your class to pass arguments explicitly, start one of them with an underscore 🙃.

Crash-Only Software In Practice With Filebeat On Kubernetes

Some time ago I read an article called Crash-only software: More than meets the eye. It’s about an idea that sometimes it is easier and faster to just crash and restart the whole process than handle the error appropriately. Sometimes, obviously, it is even impossible to handle errors as in, for example, Linux kernel panics. Nowadays programming languages especially tend to make it so that it would be very hard, almost impossible to not handle errors. For instance, Rust has a type that either represents success or an error. You have to explicitly unwrap that type and decide what to do in either case. But, apart from those obvious cases, I haven’t seen any examples of crash-only software. Until recently.

There is this popular software project called Beats (Filebeat). It lets you ship various kinds of logs from different sources to a lot of different receivers or “sinks”. I ran into this issue recently that hasn’t been solved for quite some time. A problem occurs when using autodiscovery of containers in Kubernetes, and then some state gets mishandled leading to this error message:

[autodiscover] Error creating runner from config: Can only start an input when all related states are finished

Then, the logs stop being shipped. And it seems like it won’t be solved for still some time because the input sub-system of Beats is written rewritten as I understand it. Something needed to be figured out because moving to another project for performing this is very time consuming and that problem had been occurring then, at that point in time i.e. we were loosing logs. Plus, Filebeat does its job well albeit its benchmarks don’t shine. Vector definitely looks very promising and will be looked into when the k8s integration lands fully.

Anyway, while looking through the comments, this comment here reminded me of the term “crash-only software”. It seems like such a solution is an implementation of “crash-only software” because when Filebeat ships logs, it stores the offsets of files it is reading and so on in a thing called a registry. That permits it to quickly restart in case the process gets killed. That’s how the problem was worked around at least for the time being and I wanted to share this information in case it will be useful for someone.

We will implement this by making a custom liveness probe for our pod in k8s.

At first, you should disable the -e flag if you have it enabled as that disables logging to a file. We will need to log everything to a file because our liveness probe will try reading it to see if that error has occurred. Newer versions have this option but I found that it does not work well in practice. YMMV.

Then, we should enable logging to a file. Do that by adding the following to your Filebeat configuration:

logging.to_files: true
logging.files:
  keepfiles: 2

The only two options which are relevant to us are those. First of all, let’s turn on logging to files by logging.to_files. Then, we also want to keep a minimal number of files because they won’t be shipped anywhere, they will only be used for the liveness probe. Do that with the keepfiles option. Obviously, feel free to modify other options if needed for your use-case.

The final part is the actual liveness probe. To do that, modify your container’s specification by adding this block:

--- 
livenessProbe: 
  exec: 
    command: 
      - /bin/sh
      - "-c"
      - "! (/usr/bin/grep 'Error creating runner from config: Can only start an input when all related states are finished' /usr/share/filebeat/logs/filebeat)"
  initialDelaySeconds: 60
  periodSeconds: 10

I recommend setting the initial delay to about 30 seconds or so to give Filebeat enough time to create the log file and populate it with initial data. Depending on your other logging configuration and volume, you might want to either increase or decrease the sensitivity of this check by either making the period smaller or reduce the number of times (failureThreshold) the command has to fail before Kubernetes makes a conclusion that the container does not work anymore.

I’m sure that this is not the only case of liveness probes being thought of and used like that. Hopefully, this workaround will not be an instance of the old adage “there is nothing more permanent than a temporary solution”. I am certain that the Filebeat developers will fix this problem in the near future. It’s a good piece of software. Let me know if you encounter any problems with this solution or if you have any comments.