giedrius – Page 9 – Giedrius Statkevičius

Things Learned From Speaking at a Physical Conference The First Time

Devoxx UK 2021 was a great conference that has just taken place. It was my first time ever speaking at a physical conference; before that, I had only spoken at virtual ones. I had the great honor of representing Thanos at Devoxx. I was supposed to give this presentation with another Thanos maintainer, Prem Saraswat, but he could not make it. Since it was my first time speaking live, I learned a bunch of lessons for the next time I get to do this. This post is about my journey there and my takeaways from it. I hope that you will be able to learn something from it as well.

Me talking at Devoxx UK 2021, a screenshot

Journey

My journey began in Lithuania. Since I traveled to the UK before Omicron, everyone was still fairly relaxed, and not many people were wearing masks. Funnily enough, fewer people were wearing masks in the UK than in Lithuania. My guess is that elderly people are less vaccinated in Lithuania than in the UK, which could explain the difference in attitude: older people in Lithuania are more likely to get seriously ill and take up hospital beds. Anyway, this post is not about that.

London now has a cool public transportation system in which it is no longer necessary to buy an Oyster card (I am not sure when this was introduced; my previous visit to London was a few years earlier). One can just touch their debit or credit card upon entering a bus or a tram. With so many people living in London, I imagine that it was quite an improvement in reducing congestion, i.e. more convenience means more people on public transportation, which means fewer cars on the roads.

Picture of the business design centre showing the DevoxxUK logo

Also, it was easy to take the COVID test right after landing since everything is conveniently located in London Luton airport. The whole process was actually smooth; I had expected everything to take much longer. Note that this was when it was enough to do an antigen test upon landing.

Be careful with night buses, though. The flight home was in the very early morning, so we had to take a night bus. Even though it was the middle of the night, people were still outside, enjoying the nightlife. So, don’t expect to find a seat for yourself. And if you are traveling back to the airport with a private company in the middle of the night, then I’d recommend booking tickets in advance. We ran into a problem where all of the seats were taken on the bus that was supposed to take us to the airport. The next one was an hour later, which would have been too late. We had to order a taxi through Bolt, which was quite expensive.

Now let’s move on to the lessons that I have learned during this trip.

Lesson 1 – Take Care of Everything in Advance

It is better to get as many things sorted out as possible before your talk to reduce the amount of stress. In my case, I had taken care of more or less everything before the day arrived except one thing: the electrical plug converter. My hotel had “schuko” type sockets, i.e. the same ones used in Lithuania, so I charged everything there, and I used battery packs to keep my phone charged during the day. However, I needed a converter for the presentation itself, and I had some issues with that. I didn’t know about the “schuko” type beforehand and was misled by false advertising. I bought a “Europe to UK” plug converter, but it didn’t work because it only worked with plugs used in Western Europe, i.e. plugs of type E which have an extra prong. So, I had to take two trips to a shop to get the correct converter. This led to extra, unneeded stress. So, always do your homework and come prepared.

Lesson 2 – Do Not Think Too Much About What Your Audience Thinks While Presenting

During my presentation, I had three jokes. After the first one, I had the natural urge to check whether the listeners understood it and whether they were enjoying my presentation. However, staring at people’s faces is fraught with peril. In my experience, it is way too easy to start overthinking what the audience thinks. I started staring at people’s faces way too much. A few times, this made me lose confidence in what I was saying, which escalated into a bit of stuttering.

Picture of the room showing my point of view before the presentation

I think the lesson here is that it is OK to look at your audience a little bit, but do not get too nervous or think too much about it. Always keep the topic in mind and do not let your mind jump to other things.

Lesson 3 – Understand Your Audience

My talk was aimed at the intermediate skill level: people who are not new to the Prometheus ecosystem but who are not experts yet. I feel like some of the technical concepts that I explained went over the listeners’ heads. I would say that it is probably always better to err on the side of simplicity and focus instead of trying to fit too much into one presentation. My mistake was probably that the talk had two parts: an introduction to Thanos and an overview of what we had been working on. If it is aimed at moderately knowledgeable people, then maybe it would have been better to skip the introductory part. On the other hand, it is a Java-oriented (or at least it was?) conference, so perhaps it would’ve been better to avoid the advanced stuff altogether and to focus only on the introductory part?

This is just anecdotal data, but another fact points to this: there were some questions after the presentation which were really about core stuff, i.e. how Thanos works. I think this means that I failed to properly explain to the listeners what the StoreAPI is and so on.

All in all, I can’t say right now which option would have been better, but it certainly would have been better to focus on either the introductory part or the advanced part instead of both of them at the same time.

Lesson 4 – Always Turn Off Redlight and Disable Notifications

Unless you want everything to look like this:

I suggest turning off the screen’s color temperature adjustment. In Gnome, you can do that via the status bar (“Night light”). Also, I would strongly recommend turning off notifications in case someone texts you in the middle of the presentation. You don’t want that to end up in the video as well.

Lesson 5 – It Is Okay Not To Know Something

After the presentation, someone asked me what the preferred way to deploy Thanos on Kubernetes is. Truth be told, there are quite a lot of different options. Just to name a few: kube-prometheus-stack, goatlas, prometheus-operator, etc. I haven’t tried all of them, so I cannot reasonably tell someone which one is the best. That is exactly what I told them. In addition, I told them to use the one that fits their use case best. And I think that is perfectly fine. It is OK not to know everything.

If I were planning to start using Kubernetes for deploying Thanos in the near future, I could have told them that I would be able to provide an informed opinion later and that we could talk about it then. But that’s not the case. So far, I have only deployed Thanos on bare metal. Hence it is better not to lie and not to pretend that you are an expert in something.

That’s all from me on this post! Let me know if you have any comments or suggestions. I hope you’ve learned something.

Custom Metric Retention Periods on Thanos

Setting custom metric retention periods on Thanos is one of the longest-standing feature requests that we have had: https://github.com/thanos-io/thanos/issues/903. It seems like there is still no solution in sight, but it is actually already possible to have custom metric retention periods. It is quite a simple idea but could be hard to implement if you do not have comfortable deployment tooling in place. You can achieve custom retention periods for different metrics in the following way:

Designate retention as a special (external) label that controls how long the metrics should be kept, i.e. ensure that no metrics themselves carry a label with this name
Send metrics that need longer retention over remote write to Thanos Receive instances whose retention external label is set to that retention period
Set up multiple Thanos Compactor instances with different retention periods, each of them picking up only the blocks with the respective retention external label
Add retention as another deduplication label on Thanos Query

In the end, all of your blocks should have some kind of retention as an external label, and you should have one Thanos Compactor for each value of the retention label.

Note that this whole setup assumes that you will not want to change the default retention for a large number of metrics. In my experience, this is true in most cases. It is just anecdotal data, but most of the time you’ll want around 30 to 60 days of retention by default, with some people wanting about a year’s worth of retention if they are doing some kind of analytics on that data, e.g. trying to predict the number of requests. If you want to change the retention of a large number of metrics, then this simple setup will not work and you will need to scale the receiving side, i.e. the Receivers. But that is out of the scope of this article.

Also, ideally you would want to avoid having to remote write anything at all and let Sidecar do its work with multiple Prometheus+Sidecar pairs, each having their own retention label. However, it might not be so easy to do for most people who do not have advanced configuration management set up on their systems.
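
As a rough sketch of that Sidecar-based variant, two Prometheus instances could each carry their own retention external label, so that their Sidecars upload blocks already tagged with the right value. The file names, job names, and target paths below are hypothetical; only the retention external label comes from the idea described above.

```yaml
# Two separate Prometheus configuration files, shown together for brevity.
# File names, job names, and target paths are made up for illustration.

# prometheus-1mo.yml: targets that only need the default retention
global:
  external_labels:
    retention: 1mo
scrape_configs:
  - job_name: default-retention
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/1mo/*.yml
---
# prometheus-12mo.yml: targets whose metrics should be kept for a year
global:
  external_labels:
    retention: 12mo
scrape_configs:
  - job_name: long-retention
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/12mo/*.yml
```

Each instance would run next to its own Thanos Sidecar, and the Compactors would select blocks by the retention label in the same way as in the remote write setup.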

The rest of the article focuses on a hacky way to achieve multiple retention periods for different metrics with the constraint that only one Prometheus node is in the picture.

Here is what this setup looks like as a diagram:

Let’s walk through the most important parts:

External labels and metric_relabel_configs configuration on Prometheus. First, we need to set the external label retention to a value such as 1mo, which indicates the default retention for metrics. There may be other external labels as well; that does not matter in our case. Specify the default retention with:

global:
  external_labels:
    retention: 1mo

Set up Thanos Receive with “tenants” such as 12mo:

--receive.tenant-label-name="retention" --receive.default-tenant-id="12mo" --label=...

Add your extra external labels such as fqdn to identify this Thanos Receive node.

Set up remote writing to Thanos Receive in the Prometheus configuration. For example:

remote_write:
    - url: http://localhost:19291/api/v1/receive

Edit your Thanos Query to include retention as the deduplication label:

query ... --query.replica-label=retention

Set up multiple Thanos Compactors for each different retention with their own relabel configs. Here is an example for 12mo:

    - source_labels:
      - retention
      regex: "12mo"
      action: keep

And then you need to have the respective retention configuration on that Thanos Compactor:

--retention.resolution-1h=365d --retention.resolution-raw=365d --retention.resolution-5m=365d --selector.relabel-config=...

This assumes that there are 365 days in a year.

Repeat this configuration for each different retention external label that you might have.
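
For instance, a second Compactor handling the default stream might use a selector like the one below. This is only a sketch: the 1mo value matches the default external label used earlier, and treating a month as 30 days is an assumption made here, not something the setup prescribes.

```yaml
# Selector for a hypothetical second Compactor that only picks up
# blocks whose external label retention equals 1mo.
# It would be paired with retention flags such as
#   --retention.resolution-raw=30d --retention.resolution-5m=30d --retention.resolution-1h=30d
# (assuming a month is 30 days).
- source_labels:
    - retention
  regex: "1mo"
  action: keep
```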

At this point, every metric is stored twice: locally with the default retention and, through remote write, with the extended retention. To avoid that, consider following the last, optional step in this post.

(Optional) Add metric_relabel_configs to your scrape targets to avoid ingesting metrics with certain label names or values. Alternatively, you can use write_relabel_configs so that only metrics matching certain patterns are sent to remote write storage. For example, to send only metrics with the label tenant="Team Very Important" to external storage with 12mo retention, add the following configuration:

remote_write:
    - url: http://localhost:19291/api/v1/receive
      write_relabel_configs:
      - source_labels: [tenant]
        regex: Team Very Important
        action: keep
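
The same mechanism can select by metric name instead of by label. The sketch below assumes a hypothetical family of metrics whose names start with http_requests_; the regex is matched against the whole value, so the trailing .* is needed.

```yaml
remote_write:
    - url: http://localhost:19291/api/v1/receive
      write_relabel_configs:
      # __name__ holds the metric name; Prometheus anchors the regex
      # at both ends, hence the explicit .* suffix.
      - source_labels: [__name__]
        regex: "http_requests_.*"   # hypothetical metric name prefix
        action: keep
```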

You could also work around this problem by having separate scrapers and some external system that feeds targets into each Prometheus instance according to the desired retention, using file_sd_configs or some other mechanism, as mentioned at the beginning of the article.

As the last alternative, consider using the Prometheus Agent to have minimal storage on disk, and to send everything over remote write to Thanos Receivers.
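
A minimal sketch of what that could look like follows, assuming two Thanos Receive instances with different default tenants (and therefore different retention labels). The second URL and its port, the job name, and the scrape target are hypothetical; the tenant value is reused from the example above. Agent mode itself is switched on with a command-line flag rather than in this file: as far as I know, that is --enable-feature=agent on Prometheus 2.32 and later 2.x releases, and --agent on 3.x.

```yaml
# Configuration for a Prometheus running in agent mode: it scrapes targets,
# keeps only a write-ahead log on disk, and forwards samples over remote write.
scrape_configs:
  - job_name: example                  # hypothetical job
    static_configs:
      - targets: ["localhost:9100"]    # hypothetical target

remote_write:
  # Receive instance started with --receive.default-tenant-id="12mo"
  - url: http://localhost:19291/api/v1/receive
    write_relabel_configs:
      - source_labels: [tenant]
        regex: Team Very Important
        action: keep
  # Hypothetical second Receive instance whose default tenant is 1mo;
  # it gets everything that the first one does not.
  - url: http://localhost:19292/api/v1/receive
    write_relabel_configs:
      - source_labels: [tenant]
        regex: Team Very Important
        action: drop
```

Since the agent keeps no queryable local storage, all queries would have to go through Thanos Query.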

I hope this helps. Let me know if you have any comments or suggestions!

Author: giedrius

