Conrad Hoffmann 94ca073cd1 Alert on increase in unconfirmed registrations
There is always some fallout, but over the past two weeks the ratio was
never above 0.0001. It did go up to 0.0004 when there was an issue with
email delivery, so 0.0002 seems to be a decent value to trigger an
investigation.
2024-04-09 10:58:59 +02:00
.build.yml .build.yml: upgrade to 3.17 2023-01-19 11:49:38 +01:00
backup_rules.yml Loosen up backup rules 2024-01-08 14:59:22 +01:00
build_rules.yml build_rules.yml: correct name of builds submitted metric 2023-10-04 11:03:41 +02:00
chat_rules.yml chat: add alarm for synIRC 2023-10-24 13:32:49 +02:00
LICENSE Add LICENSE 2020-01-05 13:15:14 -05:00
meta_rules.yml Alert on increase in unconfirmed registrations 2024-04-09 10:58:59 +02:00
node_rules.yml node_rules: take all CPU modes into account 2024-04-02 15:58:26 +02:00
postgres_rules.yml Add postgres_rules.yml 2023-01-19 11:49:12 +01:00
process_rules.yml Add alert for process open FDs 2022-01-18 19:13:10 +01:00
README.md Update README.md 2020-01-06 10:04:15 -05:00
service_rules.yml Fix High number of 500 errors alert to work instance-wide 2022-02-14 16:50:14 +01:00
ssl_rules.yml Fix summary in SSL alarm 2020-02-25 12:23:28 -05:00
test_rules.yml Reschedule weekly test alarm to CEST window 2021-07-29 09:18:59 +02:00

metrics.sr.ht

This repository tracks our Prometheus alert rules. They are available as a package from mirror.sr.ht (for Alpine only) as metrics.sr.ht-rules.

Our Prometheus instance is public:

https://metrics.sr.ht

Usage instructions

  1. Install our package
  2. Add rule_files entries to your prometheus.yml for each set of rules you wish to use
  3. Configure alertmanager accordingly
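
As a rough sketch of step 2, a prometheus.yml fragment might look like the one below. The file names come from this repository, but the directory /etc/prometheus/rules/ is only an assumption: check where the package actually installs its files and adjust the paths.

```yaml
# prometheus.yml (fragment)
# NOTE: /etc/prometheus/rules/ is a hypothetical install location,
# not something the package is known to use; adjust as needed.
rule_files:
  # List only the rule sets you want to load.
  - /etc/prometheus/rules/node_rules.yml
  - /etc/prometheus/rules/ssl_rules.yml
  - /etc/prometheus/rules/postgres_rules.yml
```

Entries in rule_files may also be globs (for example a path ending in *_rules.yml), which loads every matching rule set instead of a hand-picked subset.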

Our alerts are categorized into three severity groups:

  • interesting alerts are worth noting: they may help identify trends over time, support forensic analysis after an outage, or flag issues to address on a rainy day. Upstream, we send these to our IRC channel.
  • important alerts are likely to be actionable, but do not require immediate attention.
  • urgent alerts require immediate attention.
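
One way to act on these groups is to route by severity in Alertmanager. The fragment below is a minimal sketch, not our upstream configuration. It assumes the rules attach a label named severity with the values interesting, important and urgent, and the receiver names are hypothetical placeholders that you would fill in with real notification settings.

```yaml
# alertmanager.yml (fragment)
# Assumption: alerts carry a "severity" label matching the three groups.
# Receiver names are hypothetical; add notification configs to each.
route:
  receiver: default            # fallback for alerts that match no route
  routes:
    - matchers:
        - severity="urgent"
      receiver: pager          # something that reaches a person right away
    - matchers:
        - severity="important"
      receiver: email          # actionable, but can wait
    - matchers:
        - severity="interesting"
      receiver: irc            # low-noise channel for later review

receivers:
  - name: default
  - name: pager
  - name: email
  - name: irc
```

The matchers key requires a reasonably recent Alertmanager (0.22 or later); older versions use match and match_re instead.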