“Observability Engineering” Book Review

A great new book, “Observability Engineering”, came out very recently and I had to jump on reading it. Since it is very closely related to my work, I devoured the pages and read the book in about a day (505 pages). While doing so I wrote down some thoughts that I want to share with you today. They might or might not be true; I am only speaking about the book from my own perspective. Feel free to share your own thoughts!

Overall, the book really resonated with me and it makes me very happy to see literature being written about this topic. Observability is a relatively novel concept in computing that I think will only become more popular in the future. I’d rate the book 4/5 in general, but it is 5/5 among books on the same topic.

Here are my thoughts.

  • First of all, it is interesting to see tracing used in CI processes to reduce flakiness. But this probably only matters at a huge scale that most companies will not achieve. At least I haven’t worked at companies so far where that is the case. This has also reminded me of a project that exports Kubernetes events as spans. Check it out if you’re interested. I hope to work on distributed tracing projects in the near future because it’s a really exciting topic.
  • Chapters by Slack engineers sometimes felt a bit like an advertisement for Honeycomb. The chapter about telemetry pipelines and their bespoke solutions felt a bit too simplistic because we have things like Vector nowadays, not to mention Filebeat and so on. What’s more, Slack engineers have created their own format for storing spans. A lot of companies nowadays seem to suffer from the “not invented here” syndrome, and that looks like the case here. I would be surprised if they don’t migrate to the OpenTelemetry (OTel) data format in the near future.
  • Authors spent lots of time talking about and praising OTel. Given that traces are specifically formatted logs, it’s not surprising to see the popularity of OTel. It’s a really exciting project. But we have to keep thinking about events in a system that mutates its state. Traces are only a way of expressing those changes in state.
  • The chapters about finding observability allies are enlightening. I have never thought about customer support and other people as allies that could help one instill a culture of observability in a company.
  • The observability maturity model is great and I could foresee it being used extensively.
  • Event-based service level objectives (SLOs) should be preferred to time-based ones because with distributed systems partial outages are more common than complete blackouts. With event-based SLOs, you count the good events and the bad events in a window and divide the number of good events by the total number of events. With time-based SLOs, you divide the time during which some threshold was exceeded by the total amount of time in the window. Event-based SLOs also reflect reality better: instead of judging each period of time as either bad or good, they make it possible to tell precisely how much error budget we’ve burned. Somehow, even though I’ve worked with monitoring systems for a long time, these two different points of view had escaped me. I will always try to prefer event-based monitoring now.
  • At my previous companies, I saw the same bad practices as outlined in the book. If there are barely any requests in the middle of the night, then one or two failures don’t mean much and there is no need to alert on those conditions. I am talking about, for example, payment failures in the middle of the night when most of your clients are in one or several related timezones. What’s more, I have experienced a bunch of alerts based on symptoms that don’t scale. For example, there are such alerts as “RAM/CPU is used too much”. Just like the authors, I would be in favor of removing them because they are pretty much useless and are reminiscent of the old way of using monitoring systems. I guess this is associated with the observability maturity model that is outlined in the book. My anecdotal data says that many companies are still in their infancy in terms of observability.
  • Lots of text about arbitrarily wide structured events. In an ideal world, we could deduce the internal state of a service through them, but I believe that they are not the be-all and end-all signal. They are just one of many. If instrumentation is not perfect, then it is a lossy compression of the state space of your application. And with too much instrumentation there is a risk of high storage costs and too much noise. Sometimes it sounds like a solution to a problem that should be solved in other ways: making services with clearer boundaries and less state. Or, in other words, reducing the sprawling complexity by keeping non-essential complexity to a minimum.
  • I agree with the small section about AIOps (artificial intelligence operations). In general, I feel that it applies to anomaly-based alerting as well. How can computers tell whether some anomaly is bad or not? Instead, we should let computers sift through piles of data and humans should attach meaning to events.
  • I agree with the authors’ arguments about monitoring: again, I believe it’s a cheap signal that is easy to start with, and in my opinion, that’s why so many people rely on it or start with it. It is the same with logs. It is very simple to start emitting them. Distributed tracing takes a lot more effort because you not only have to think about your own state but also about how your service interacts with others. But that’s where all of the most important observations lie in the cloud-native world.
  • The book is missing a comparison of different types of signals. The authors really drive home the point about arbitrarily wide events, but I feel like that isn’t the silver bullet. What about continuous profiling and other emerging signals? This is probably not surprising given how much the authors talk about this topic on Twitter.
  • The example of how a columnar database works didn’t convince me and it felt out of place. It probably just needs a better explanation and/or a longer chapter. I would probably recommend you pick up a different book to understand the intricacies of different types of databases.
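
To make the difference between event-based and time-based SLOs from my notes above more concrete, here is a minimal sketch in Python. It is not taken from the book: the names (`Minute`, `event_based_sli`, `time_based_sli`, `max_error_ratio`) and the one-minute bucket size are my own assumptions for illustration. For the time-based variant I report the fraction of good minutes, which is simply one minus the "time above threshold divided by window time" ratio described above.

```python
from dataclasses import dataclass


@dataclass
class Minute:
    """Hypothetical one-minute bucket of request counts."""
    good: int
    bad: int


def event_based_sli(window: list[Minute]) -> float:
    """Good events divided by all events in the window."""
    good = sum(m.good for m in window)
    total = sum(m.good + m.bad for m in window)
    # An empty window has burned no error budget.
    return good / total if total else 1.0


def time_based_sli(window: list[Minute], max_error_ratio: float) -> float:
    """Fraction of minutes whose error ratio stayed within the threshold."""
    def is_good(m: Minute) -> bool:
        total = m.good + m.bad
        # A minute without traffic is counted as good (an assumption).
        return total == 0 or m.bad / total <= max_error_ratio

    if not window:
        return 1.0
    return sum(1 for m in window if is_good(m)) / len(window)


# Three healthy minutes and one partial outage where half the requests fail.
window = [Minute(100, 0), Minute(100, 0), Minute(100, 0), Minute(50, 50)]

print(event_based_sli(window))
print(time_based_sli(window, max_error_ratio=0.01))
```

With this made-up window, the event-based calculation gives 0.875 while the time-based one gives 0.75: the partially failing minute is written off as entirely bad even though half of its requests succeeded, which is exactly the loss of precision I was describing.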

Of course, my notes here can’t represent all of the content of the book. I’d recommend that you read it yourself! It’s really great. Let me know what you think about it in the comments.

2018 in books

Intro

2018 is coming to a close and so I thought that it would be a good idea to again look back on the books that I have read in 2018 and to share what books I have liked the most with my readers. I will mark the most liked books in bold like Aaron Swartz did in his lists. Also, I will expand a bit under the books which I liked the most and which are kind of controversial.

Perhaps this will be inspiring to someone, or they will recommend similar books to me that I definitely must read. As I have written already, reading (programming) books is a really rewarding hobby since it helps you grow holistically as a person. You can find my 2016 reads here.

2018 list

I have gone through a total of 14 books in 2018, which is 4 more than last year’s count. I guess that is because I spent a lot of time in 2017 on my education. They have 4843 pages in total, which is not bad.

  • “The Pragmatic Programmer” by Andrew Hunt and David Thomas;

A pretty important book in its own right. It is not opinionated, but it talks about various lessons that the authors have learned over tens of years of experience in the computer software field. It is one of those books that you would read before falling asleep. The exercises are kind of simple but they make you think about what you just read.

  • “To Kill a Mockingbird” by Harper Lee;

A very important piece on topics such as racism and discrimination; totally immersive writing which makes you feel like you are there, and it hooks you into reading more and more. I read the Lithuanian version for a change. Atticus Finch is really a hero. My review as posted on Goodreads:

Honestly, 5/5. No comments. No wonder Atticus’ story has inspired thousands of attorneys and it has been rated by British librarians as a book to read before you die even above the Bible.

  • “Surely You’re Joking, Mr. Feynman!” by Richard P. Feynman;

Adventures of a curious character, as the book says. Indeed, popular scientists like Feynman sometimes might seem like they are not human beings anymore, as if they had transcended us. However, books like this give us a glimpse into such a life and we can see that, after all, we are not so different and that we too can achieve such things. In general, it is a very inspiring book. Some people complain about the egoistic tone in some places, but I think that if you look past it, you can definitely find a very captivating story.

  • “Italian Short Stories for Beginners” by Olly Richards;
  • “Murder on the Orient Express” by Agatha Christie;
  • “Algorithms to Live By” by Brian Christian and Tom Griffiths;
  • “Inside the Nudge Unit” by David Halpern;

Essentially it is a book about bringing the scientific method back to governmental decisions. It is very interesting to read about this, and before this I never knew that such units even existed. They mostly started by applying a form of A/B testing to governmental communication to improve its efficacy. And, surprisingly enough, it improved a lot and barely any money was spent on this. It shows that sometimes we do not have to look deeply to find issues which are not hard to fix and would bring a lot of benefits if fixed.

  • “The Soul of a New Machine” by Tracy Kidder;
  • “Rich Dad, Poor Dad” by Robert T. Kiyosaki;

Essentially this book is about a change of mindset: you should view your money as an asset, not as a liability. I loved this since I, like the majority of other people, get stuck in this rut of life where we can only see the short-term goals and we cannot wait until the next pay-check. This book will teach you how to change that thinking and could be a good start toward acquiring more and more assets which would generate more and more money for you. The main idea is to make money work for you instead of working for money.

  • “Meditations” by Marcus Aurelius;
  • “Dataclysm” by Christian Rudder;
  • “Code Complete” by Steve McConnell;

You can find the top list of things that I have learned from this book here. I wrote that blog post at the beginning of 2018… that is how good the book was 🙂 I feel like this is a gentle step forward after “The Pragmatic Programmer”. Afterward, I would recommend that everyone delve into a book specific to some programming language or paradigm to learn the nitty-gritty details. Or, you should read through an algorithms book and do its exercises.

  • “Never Split the Difference” by Chris Voss;

The stories of a former FBI negotiator and the lessons drawn from them. You might think you know human psychology, but our real selves actually show in stressful, life-or-death situations such as those that were a daily encounter for Chris Voss. It contains a lot of golden advice for negotiation and for day-to-day life in general because we actually negotiate all the time, even though sometimes we might not recognize it. My Goodreads review:

Everything you wanted to know about negotiation. The author writes in a very clear, lucid style. Reminded me of Richard Dawkins’ style of writing. This book will introduce you to the concepts of labelling, mirroring, all kinds of leverage that you could use to your advantage, and of course – the black swans. If you always thought that negotiation is something only people in the FBI and other agencies do – you are wrong. It is worthwhile for everyone to pick up this book because negotiations happen all of the time in our lives. As the author puts it (I’m paraphrasing here) – conflicts happen each day and you cannot avoid them. So stop thinking of your partner as your adversary and think of them as your counterpart. The adversary is always the problem or idea being discussed. Very awesome book.

  • “The Power of Now” by Eckhart Tolle

A controversial book. Kind of reminds me of Stoicism. The point that nothing exists besides now is kind of appealing to me and I think that it has a kernel of truth. Ignore the religion stuff while reading it. My review from Goodreads:

I think that the mindfulness parts and the idea that nothing exists besides what is now have a kernel of truth to them. Practicing such a view of life definitely helps to be more calm and view things as they are. In my opinion, this is related to the Stoic view of life in which it is being said that the only thing that you can influence is you yourself and your reactions. However, I did not like some of the explanations that involved “God” or “the Lord” himself because they, to be frank, just do not make any logical sense and because “God” is dead. On the other hand, just like the author says: “words only convey some kind of meaning, they themselves are worthless and completely made up” (paraphrasing).

To read in 2019

One of the first books that I want to read in 2019 is “The Design of Everyday Things” by Donald Norman. I feel that it is useful for programmers and everyone involved to read a bit of literature about usability and user experience. I saw it recommended in a lot of lists.

Afterward, I will try to read something related to algorithms like “Introduction to Algorithms” (CLRS). I feel that algorithms and estimating the complexity of our programs are very important, so I will brush up on that.

Finally, I want to read something related to machine learning or artificial intelligence. These fields are very fascinating to me and, honestly, I do not have that much knowledge of them. We will see what actual books I will choose.