Dangers of NotEmpty() With Enums and FluentValidation

I recently ran into a small issue while using the FluentValidation library with enums in C#: the NotEmpty() validator can behave unexpectedly on them. I’m not an expert on C#, but I will try my best to explain the issue.

Let’s start with a recap of how enums work. In essence, they let you associate names with numeric values. If you do not give a member an explicit value, the compiler takes the value of the previous member in the same enum and adds one to it. If there is no previous member, that is, for the first one, it starts at 0.
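As a quick illustration of those numbering rules, here is a minimal sketch. The Level enum and its members are made up for this example and are not part of the program discussed later.

```csharp
using System;

// Hypothetical enum used only to show how values are assigned.
public enum Level
{
    Low,        // first member, no explicit value: 0
    Medium,     // previous member + 1: 1
    High = 10,  // explicit value: 10
    Critical    // previous member + 1: 11
}

public static class Program
{
    public static void Main()
    {
        // Print every member together with its underlying numeric value.
        foreach (Level level in Enum.GetValues(typeof(Level)))
        {
            Console.WriteLine($"{level} = {(int)level}");
        }
        // Prints:
        // Low = 0
        // Medium = 1
        // High = 10
        // Critical = 11
    }
}
```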

Now, let’s take a look at this small program that exhibits this problem. It has been adapted from this tutorial. Here is the code:

using System;
using FluentValidation;
using FluentValidation.Results;

namespace test_fluentvalidation
{
    public enum CustomerType
    {
        VeryImportant,
        NotSoImportant
    }
    public class Customer
    {
        public int Id { get; set; }
        public string Surname { get; set; }
        public string Forename { get; set; }
        public decimal Discount { get; set; }
        public string Address { get; set; }
        public CustomerType CustomerType { get; set; }
    }

    public class CustomerValidator : AbstractValidator<Customer>
    {
        public CustomerValidator()
        {
            RuleFor(customer => customer.CustomerType).IsInEnum().NotEmpty();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Customer customer = new Customer();
            customer.CustomerType = CustomerType.VeryImportant;
            CustomerValidator validator = new CustomerValidator();

            ValidationResult results = validator.Validate(customer);

            if (!results.IsValid)
            {
                foreach (var failure in results.Errors)
                {
                    Console.WriteLine("Property " + failure.PropertyName + " failed validation. Error was: " + failure.ErrorMessage);
                }
            }
        }
    }
}

Can you guess the output? It is actually:

Property CustomerType failed validation. Error was: 'Customer Type' must not be empty.

Could you predict why that happened? NotEmpty() treats the default value of a value type as “empty”, and for an enum the default is the numeric value 0. As you can see in the code, CustomerType.VeryImportant has the value 0 because it is the first member of the type and has no explicit value. This caught me off guard at first and it took me quite some time to figure out.
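To make the idea more concrete, here is a rough sketch of the kind of comparison involved. LooksEmpty is a hypothetical helper written for this post, not FluentValidation’s actual implementation. It only assumes that, for value types, “empty” means “equal to the type’s default value”.

```csharp
using System;
using System.Collections.Generic;

public enum CustomerType
{
    VeryImportant,   // 0, which is also default(CustomerType)
    NotSoImportant   // 1
}

public static class Program
{
    // Hypothetical helper (an assumption, not library code):
    // a value "looks empty" when it equals the default of its type.
    static bool LooksEmpty<T>(T value) =>
        EqualityComparer<T>.Default.Equals(value, default(T));

    public static void Main()
    {
        Console.WriteLine(LooksEmpty(CustomerType.VeryImportant));  // True
        Console.WriteLine(LooksEmpty(CustomerType.NotSoImportant)); // False
        Console.WriteLine(LooksEmpty(0));                           // True
    }
}
```

The first enum member is indistinguishable from “nothing was set” under such a check, which matches the failure seen above.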

But how do you fix this? One easy way is to assign an explicit value higher than 0 to the first member of the enum. If the first value is lower than 0, then with enough members the automatically assigned values eventually count up to 0 again and you’d run into the same problem.

To illustrate this, let’s look at this program. It prints nothing and works as expected:

using System;
using FluentValidation;
using FluentValidation.Results;

namespace test_fluentvalidation
{
    public enum CustomerType
    {
        VeryImportant = 1,
        NotSoImportant
    }
    public class Customer
    {
        public int Id { get; set; }
        public string Surname { get; set; }
        public string Forename { get; set; }
        public decimal Discount { get; set; }
        public string Address { get; set; }
        public CustomerType CustomerType { get; set; }
    }

    public class CustomerValidator : AbstractValidator<Customer>
    {
        public CustomerValidator()
        {
            RuleFor(customer => customer.CustomerType).IsInEnum().NotEmpty();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Customer customer = new Customer();
            customer.CustomerType = CustomerType.VeryImportant;
            CustomerValidator validator = new CustomerValidator();

            ValidationResult results = validator.Validate(customer);

            if (!results.IsValid)
            {
                foreach (var failure in results.Errors)
                {
                    Console.WriteLine("Property " + failure.PropertyName + " failed validation. Error was: " + failure.ErrorMessage);
                }
            }
        }
    }
}

The program now passes because CustomerType.VeryImportant is 1, which FluentValidation treats as non-empty. You can play around with the program and print out the values of CustomerType.VeryImportant and CustomerType.NotSoImportant after casting them to int. I leave this as an exercise for the reader.
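For the impatient, one possible take on that exercise is sketched below. It also shows the caveat about starting below 0. The Temperature enum is hypothetical and exists only to demonstrate that caveat.

```csharp
using System;

public enum CustomerType
{
    VeryImportant = 1,
    NotSoImportant      // 1 + 1 = 2
}

// Hypothetical enum: starting below 0 brings a later member back to 0.
public enum Temperature
{
    Freezing = -1,
    Mild,               // -1 + 1 = 0, the default value again
    Hot                 // 1
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine((int)CustomerType.VeryImportant);          // 1
        Console.WriteLine((int)CustomerType.NotSoImportant);         // 2
        Console.WriteLine((int)Temperature.Mild);                    // 0
        Console.WriteLine(Temperature.Mild == default(Temperature)); // True
    }
}
```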

Another way of fixing this is to make the property inside Customer nullable. You can find documentation on that here. Let’s make CustomerType nullable by appending ? to the type’s name and watch the variable customer pass the checks even though CustomerType.VeryImportant still has the value 0:

using System;
using FluentValidation;
using FluentValidation.Results;

namespace test_fluentvalidation
{
    public enum CustomerType
    {
        VeryImportant,
        NotSoImportant
    }
    public class Customer
    {
        public int Id { get; set; }
        public string Surname { get; set; }
        public string Forename { get; set; }
        public decimal Discount { get; set; }
        public string Address { get; set; }
        public CustomerType? CustomerType { get; set; }
    }

    public class CustomerValidator : AbstractValidator<Customer>
    {
        public CustomerValidator()
        {
            RuleFor(customer => customer.CustomerType).IsInEnum().NotEmpty();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Customer customer = new Customer();
            customer.CustomerType = CustomerType.VeryImportant;
            CustomerValidator validator = new CustomerValidator();

            ValidationResult results = validator.Validate(customer);

            if (!results.IsValid)
            {
                foreach (var failure in results.Errors)
                {
                    Console.WriteLine("Property " + failure.PropertyName + " failed validation. Error was: " + failure.ErrorMessage);
                }
            }
        }
    }
}

If you removed this line, validation would fail as before:

            customer.CustomerType = CustomerType.VeryImportant;

Property CustomerType failed validation. Error was: 'Customer Type' must not be empty.

This time CustomerType is null: the property was never assigned, and the default value of a nullable value type is null, as mentioned on this page. In my humble opinion it is much clearer what is going on in this case, because one might not expect 0 to be treated as empty when it can be a perfectly meaningful enum value.
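The difference between the two defaults can be seen without FluentValidation at all. In this minimal sketch the property names Required and Optional are made up for illustration:

```csharp
using System;

public enum CustomerType
{
    VeryImportant,
    NotSoImportant
}

public class Customer
{
    // Hypothetical property names, chosen only for this example.
    public CustomerType Required { get; set; }   // defaults to the member with value 0
    public CustomerType? Optional { get; set; }  // defaults to null
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer();

        Console.WriteLine(customer.Required);          // VeryImportant
        Console.WriteLine(customer.Optional == null);  // True

        // Once assigned, the nullable property holds a real value, even if it is 0.
        customer.Optional = CustomerType.VeryImportant;
        Console.WriteLine(customer.Optional.HasValue);    // True
        Console.WriteLine((int)customer.Optional.Value);  // 0
    }
}
```

With the nullable property, “not set” and “set to the first member” are two distinct states, which is what lets the validator tell them apart.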

I hope that this has helped someone. Feel free to ask any questions in the box down below. Arrivederci!

Crash-Only Software In Practice With Filebeat On Kubernetes

Some time ago I read an article called Crash-only software: More than meets the eye. It is about the idea that sometimes it is easier and faster to just crash and restart the whole process than to handle an error properly. Sometimes it is not even possible to handle the error, as with Linux kernel panics. Modern programming languages, on the other hand, tend to make it very hard to ignore errors. For instance, Rust has a Result type that represents either success or an error, and you have to explicitly unwrap it and decide what to do in either case. Apart from the obvious cases, though, I hadn’t seen any examples of crash-only software. Until recently.

There is a popular software project called Beats (Filebeat). It lets you ship various kinds of logs from different sources to a lot of different receivers or “sinks”. I recently ran into an issue in it that has been open for quite some time. When using autodiscovery of containers in Kubernetes, some state gets mishandled, which leads to this error message:

[autodiscover] Error creating runner from config: Can only start an input when all related states are finished

Then, the logs stop being shipped. And it seems that it won’t be fixed for some time, because the input sub-system of Beats is being rewritten, as I understand it. Something had to be figured out: moving to another project is very time-consuming, and the problem was happening right then, i.e. we were losing logs. Plus, Filebeat does its job well, even if its benchmarks don’t shine. Vector definitely looks very promising and will be looked into when the k8s integration lands fully.

Anyway, while I was looking through the comments, this comment here reminded me of the term “crash-only software”. The solution proposed there looks like an application of that idea: when Filebeat ships logs, it stores the offsets of the files it is reading, among other things, in what it calls a registry. That lets it pick up where it left off when the process gets killed and restarted. That’s how the problem was worked around, at least for the time being, and I wanted to share it in case it is useful to someone.

We will implement this by making a custom liveness probe for our pod in k8s.

First, you should remove the -e flag if you have it enabled, as it disables logging to a file. We will need to log everything to a file because our liveness probe will read it to see if that error has occurred. Newer versions have this option but I found that it does not work well in practice. YMMV.

Then, we should enable logging to a file. Do that by adding the following to your Filebeat configuration:

logging.to_files: true
logging.files:
  keepfiles: 2

Only these two options are relevant to us. logging.to_files turns on logging to files. keepfiles keeps the number of retained log files small, because they won’t be shipped anywhere; they are only used by the liveness probe. Feel free to modify other options if your use case needs it.

The final part is the actual liveness probe. To do that, modify your container’s specification by adding this block:

livenessProbe:
  exec:
    command:
      - /bin/sh
      - "-c"
      - "! (/usr/bin/grep 'Error creating runner from config: Can only start an input when all related states are finished' /usr/share/filebeat/logs/filebeat)"
  initialDelaySeconds: 60
  periodSeconds: 10

I recommend an initial delay of at least 30 seconds or so (the example above uses 60) to give Filebeat enough time to create the log file and populate it with initial data. Depending on your other logging configuration and volume, you might want to make this check more or less sensitive by changing the period (periodSeconds) or the number of times (failureThreshold) the command has to fail before Kubernetes concludes that the container is no longer working.
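As a sketch of where those knobs live, the fragment below shows only the timing fields of the probe. The values are placeholders rather than recommendations, and the exec command stays exactly as in the block above.

```yaml
livenessProbe:
  # exec/command unchanged from the probe shown earlier
  initialDelaySeconds: 60   # time for Filebeat to create and fill its log file
  periodSeconds: 10         # how often the grep runs
  failureThreshold: 1       # restart on the first failed check (Kubernetes defaults to 3)
```

Since the error line stays in the log file once it has been written, the check keeps failing after the first match. A higher failureThreshold therefore mostly delays the restart, by roughly periodSeconds times failureThreshold.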

I’m sure that this is not the only case of liveness probes being thought of and used like that. Hopefully, this workaround will not be an instance of the old adage “there is nothing more permanent than a temporary solution”. I am certain that the Filebeat developers will fix this problem in the near future. It’s a good piece of software. Let me know if you encounter any problems with this solution or if you have any comments.