Swap class and type attributes in stock alarm configurations (#11240)

* swap type and class

* edit REFERENCE.md
This commit is contained in:
Emmanuel Vasilakis 2021-06-14 13:56:23 +03:00 committed by GitHub
parent 59af90b08c
commit f6ec79cfb8
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
73 changed files with 592 additions and 578 deletions

View File

@ -59,9 +59,9 @@ Netdata parses the following lines. Beneath the table is an in-depth explanation
| --------------------------------------------------- | --------------- | ------------------------------------------------------------------------------------- |
| [`alarm`/`template`](#alarm-line-alarm-or-template) | yes | Name of the alarm/template. |
| [`on`](#alarm-line-on) | yes | The chart this alarm should attach to. |
| [`class`](#alarm-line-class) | no | The general classification of the alarm. |
| [`component`](#alarm-line-component) | no | Specify the component of the class of the alarm. |
| [`type`](#alarm-line-type) | no | The type of error the alarm monitors. |
| [`class`](#alarm-line-class) | no | The general alarm classification. |
| [`type`](#alarm-line-type) | no | What area of the system the alarm monitors. |
| [`component`](#alarm-line-component) | no | Specific component of the type of the alarm. |
| [`os`](#alarm-line-os) | no | Which operating systems to run this chart. |
| [`hosts`](#alarm-line-hosts) | no | Which hostnames will run this alarm. |
| [`plugin`](#alarm-line-plugin) | no | Restrict an alarm or template to only a certain plugin. |
@ -136,17 +136,38 @@ If you create a template using the `disk.io` context, it will apply an alarm to
#### Alarm line `class`
Specify the classification of the alarm or template.
Class can be used to indicate the broader area of the system that the alarm applies to. For example, under the general `Database` class, you can group together alarms that operate on various database systems, like `MySQL`, `CockroachDB`, `CouchDB` etc. Example:
This indicates the type of error (or general problem area) that the alarm or template applies to. For example, `Latency` can be used for alarms that trigger on latency issues on network interfaces, web servers, or database systems. Example:
```yaml
class: Database
class: Latency
```
<details>
<summary>Netdata's stock alarms use the following `class` attributes by default:</summary>
| Class |
| ----------------|
| Errors |
| Latency |
| Utilization |
| Workload |
</details>
`class` will default to `Unknown` if the line is missing from the alarm configuration.
#### Alarm line `type`
Type can be used to indicate the broader area of the system that the alarm applies to. For example, under the general `Database` type, you can group together alarms that operate on various database systems, like `MySQL`, `CockroachDB`, `CouchDB` etc. Example:
```yaml
type: Database
```
<details>
<summary>Netdata's stock alarms use the following `class` attributes by default, but feel free to adjust for your own requirements.</summary>
<summary>Netdata's stock alarms use the following `type` attributes by default, but feel free to adjust for your own requirements.</summary>
| Class | Description |
| Type | Description |
| ------------------------ | ------------------------------------------------------------------------------------------------ |
| Ad Filtering | Services related to Ad Filtering (like pi-hole) |
| Certificates | Certificates monitoring related |
@ -162,7 +183,7 @@ class: Database
| Linux | Services specific to Linux (e.g. systemd) |
| Messaging | Alerts for message passing services (e.g. vernemq) |
| Netdata | Internal Netdata components monitoring |
| Other | Use as a general class of alerts |
| Other | When an alert doesn't fit in other types. |
| Power Supply | Alerts from power supply related services (e.g. apcupsd) |
| Search engine | Alerts for search services (e.g. elasticsearch) |
| Storage | Class for alerts dealing with storage services (storage devices typically live under `System`) |
@ -174,26 +195,16 @@ class: Database
</details>
If an alarm configuration is missing the `class` line, its value will default to `Unknown`.
If an alarm configuration is missing the `type` line, its value will default to `Unknown`.
#### Alarm line `component`
Component can be used to narrow down what the previous `class` value specifies for each alarm or template. Continuing from the previous example, `component` might include `MySQL`, `CockroachDB`, `MongoDB`, all under the same `Database` classification. Example:
Component can be used to narrow down what the previous `type` value specifies for each alarm or template. Continuing from the previous example, `component` might include `MySQL`, `CockroachDB`, `MongoDB`, all under the same `Database` type. Example:
```yaml
component: MySQL
```
As with the `class` line, if `component` is missing from the configuration, its value will default to `Unknown`.
#### Alarm line `type`
This indicates the type of error (or general problem area) that the alarm or template applies to. For example, `Latency` can be used for alarms that trigger on latency issues in network interfaces, web servers, or database systems. Example:
```yaml
type: Latency
```
`type` will also (as with `class` and `component`) default to `Unknown` if the line is missing from the alarm configuration.
As with the `class` and `type` line, if `component` is missing from the configuration, its value will default to `Unknown`.
#### Alarm line `os`

View File

@ -3,9 +3,9 @@
template: adaptec_raid_ld_status
on: adaptec_raid.ld_status
class: System
class: Errors
type: System
component: RAID
type: Errors
lookup: max -10s foreach *
units: bool
every: 10s
@ -18,9 +18,9 @@ component: RAID
template: adaptec_raid_pd_state
on: adaptec_raid.pd_state
class: System
class: Errors
type: System
component: RAID
type: Errors
lookup: max -10s foreach *
units: bool
every: 10s

View File

@ -2,9 +2,9 @@
template: anomalies_anomaly_probabilities
on: anomalies.probability
class: Netdata
class: Errors
type: Netdata
component: ML
type: Errors
lookup: average -2m foreach *
every: 1m
warn: $this > 50
@ -14,9 +14,9 @@ component: ML
template: anomalies_anomaly_flags
on: anomalies.anomaly
class: Netdata
class: Errors
type: Netdata
component: ML
type: Errors
lookup: sum -2m foreach *
every: 1m
warn: $this > 10

View File

@ -2,9 +2,9 @@
template: apcupsd_10min_ups_load
on: apcupsd.load
class: Power Supply
class: Utilization
type: Power Supply
component: UPS
type: Utilization
os: *
hosts: *
lookup: average -10m unaligned of percentage
@ -20,9 +20,9 @@ component: UPS
# Fire the alarm as soon as it's going on battery (99% charge) and clear only when full.
template: apcupsd_ups_charge
on: apcupsd.charge
class: Power Supply
class: Errors
type: Power Supply
component: UPS
type: Errors
os: *
hosts: *
lookup: average -60s unaligned of charge
@ -36,9 +36,9 @@ component: UPS
template: apcupsd_last_collected_secs
on: apcupsd.load
class: Power Supply
class: Latency
type: Power Supply
component: UPS device
type: Latency
calc: $now - $last_collected_t
every: 10s
units: seconds ago

View File

@ -1,9 +1,9 @@
# Alert that backends subsystem will be disabled soon
alarm: backend_metrics_eol
on: netdata.backend_metrics
class: Netdata
class: Errors
type: Netdata
component: Exporting engine
type: Errors
units: boolean
calc: $now - $last_collected_t
every: 1m
@ -16,9 +16,9 @@ component: Exporting engine
alarm: backend_last_buffering
on: netdata.backend_metrics
class: Netdata
class: Latency
type: Netdata
component: Exporting engine
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s
@ -30,9 +30,9 @@ component: Exporting engine
alarm: backend_metrics_sent
on: netdata.backend_metrics
class: Netdata
class: Workload
type: Netdata
component: Exporting engine
type: Workload
units: %
calc: abs($sent) * 100 / abs($buffered)
every: 10s

View File

@ -1,9 +1,9 @@
template: bcache_cache_errors
on: disk.bcache_cache_read_races
class: System
class: Errors
type: System
component: Disk
type: Errors
lookup: sum -1m unaligned absolute
units: errors
every: 1m
@ -16,9 +16,9 @@ component: Disk
template: bcache_cache_dirty
on: disk.bcache_cache_alloc
class: System
class: Utilization
type: System
component: Disk
type: Utilization
calc: $dirty + $metadata + $undefined
units: %
every: 1m

View File

@ -2,9 +2,9 @@
template: beanstalk_server_buried_jobs
on: beanstalk.current_jobs
class: Messaging
class: Workload
type: Messaging
component: Beanstalk
type: Workload
calc: $buried
units: jobs
every: 10s

View File

@ -1,8 +1,8 @@
template: bind_rndc_stats_file_size
on: bind_rndc.stats_size
class: DNS
class: Utilization
type: DNS
component: BIND
type: Utilization
units: megabytes
every: 60
calc: $stats_size

View File

@ -3,9 +3,9 @@
# Warn on any compute errors encountered.
template: boinc_compute_errors
on: boinc.states
class: Computing
class: Errors
type: Computing
component: BOINC
type: Errors
os: *
hosts: *
families: *
@ -21,9 +21,9 @@ component: BOINC
# Warn on lots of upload errors
template: boinc_upload_errors
on: boinc.states
class: Computing
class: Errors
type: Computing
component: BOINC
type: Errors
os: *
hosts: *
families: *
@ -39,9 +39,9 @@ component: BOINC
# Warn on the task queue being empty
template: boinc_total_tasks
on: boinc.tasks
class: Computing
class: Utilization
type: Computing
component: BOINC
type: Utilization
os: *
hosts: *
families: *
@ -57,9 +57,9 @@ component: BOINC
# Warn on no active tasks with a non-empty queue
template: boinc_active_tasks
on: boinc.tasks
class: Computing
class: Utilization
type: Computing
component: BOINC
type: Utilization
os: *
hosts: *
families: *

View File

@ -1,9 +1,9 @@
template: btrfs_allocated
on: btrfs.disk
class: System
class: Utilization
type: System
component: File system
type: Utilization
os: *
hosts: *
families: *
@ -18,9 +18,9 @@ component: File system
template: btrfs_data
on: btrfs.data
class: System
class: Utilization
type: System
component: File system
type: Utilization
os: *
hosts: *
families: *
@ -35,9 +35,9 @@ component: File system
template: btrfs_metadata
on: btrfs.metadata
class: System
class: Utilization
type: System
component: File system
type: Utilization
os: *
hosts: *
families: *
@ -52,9 +52,9 @@ component: File system
template: btrfs_system
on: btrfs.system
class: System
class: Utilization
type: System
component: File system
type: Utilization
os: *
hosts: *
families: *

View File

@ -2,9 +2,9 @@
template: ceph_cluster_space_usage
on: ceph.general_usage
class: Storage
class: Utilization
type: Storage
component: Ceph
type: Utilization
calc: $used * 100 / ($used + $avail)
units: %
every: 1m

View File

@ -3,9 +3,9 @@
template: cgroup_10min_cpu_usage
on: cgroup.cpu_limit
class: Cgroups
class: Utilization
type: Cgroups
component: CPU
type: Utilization
os: linux
hosts: *
lookup: average -10m unaligned
@ -19,9 +19,9 @@ component: CPU
template: cgroup_ram_in_use
on: cgroup.mem_usage
class: Cgroups
class: Utilization
type: Cgroups
component: Memory
type: Utilization
os: linux
hosts: *
calc: ($ram) * 100 / $memory_limit

View File

@ -3,9 +3,9 @@
template: cockroachdb_used_storage_capacity
on: cockroachdb.storage_used_capacity_percentage
class: Database
class: Utilization
type: Database
component: CockroachDB
type: Utilization
calc: $capacity_used_percent
units: %
every: 10s
@ -17,9 +17,9 @@ component: CockroachDB
template: cockroachdb_used_usable_storage_capacity
on: cockroachdb.storage_used_capacity_percentage
class: Database
class: Utilization
type: Database
component: CockroachDB
type: Utilization
calc: $capacity_usable_used_percent
units: %
every: 10s
@ -33,9 +33,9 @@ component: CockroachDB
template: cockroachdb_unavailable_ranges
on: cockroachdb.ranges_replication_problem
class: Database
class: Utilization
type: Database
component: CockroachDB
type: Utilization
calc: $ranges_unavailable
units: num
every: 10s
@ -48,9 +48,9 @@ component: CockroachDB
template: cockroachdb_open_file_descriptors_limit
on: cockroachdb.process_file_descriptors
class: Database
class: Utilization
type: Database
component: CockroachDB
type: Utilization
calc: $sys_fd_open/$sys_fd_softlimit * 100
units: %
every: 10s

View File

@ -3,9 +3,9 @@
template: 10min_cpu_usage
on: system.cpu
class: System
class: Utilization
type: System
component: CPU
type: Utilization
os: linux
hosts: *
lookup: average -10m unaligned of user,system,softirq,irq,guest
@ -19,9 +19,9 @@ component: CPU
template: 10min_cpu_iowait
on: system.cpu
class: System
class: Utilization
type: System
component: CPU
type: Utilization
os: linux
hosts: *
lookup: average -10m unaligned of iowait
@ -35,9 +35,9 @@ component: CPU
template: 20min_steal_cpu
on: system.cpu
class: System
class: Latency
type: System
component: CPU
type: Latency
os: linux
hosts: *
lookup: average -20m unaligned of steal
@ -52,9 +52,9 @@ component: CPU
## FreeBSD
template: 10min_cpu_usage
on: system.cpu
class: System
class: Utilization
type: System
component: CPU
type: Utilization
os: freebsd
hosts: *
lookup: average -10m unaligned of user,system,interrupt

View File

@ -3,9 +3,9 @@
alarm: 10min_dbengine_global_fs_errors
on: netdata.dbengine_global_errors
class: Netdata
class: Errors
type: Netdata
component: DB engine
type: Errors
os: linux freebsd macos
hosts: *
lookup: sum -10m unaligned of fs_errors
@ -18,9 +18,9 @@ component: DB engine
alarm: 10min_dbengine_global_io_errors
on: netdata.dbengine_global_errors
class: Netdata
class: Errors
type: Netdata
component: DB engine
type: Errors
os: linux freebsd macos
hosts: *
lookup: sum -10m unaligned of io_errors
@ -33,9 +33,9 @@ component: DB engine
alarm: 10min_dbengine_global_flushing_warnings
on: netdata.dbengine_global_errors
class: Netdata
class: Errors
type: Netdata
component: DB engine
type: Errors
os: linux freebsd macos
hosts: *
lookup: sum -10m unaligned of pg_cache_over_half_dirty_events
@ -49,9 +49,9 @@ component: DB engine
alarm: 10min_dbengine_global_flushing_errors
on: netdata.dbengine_long_term_page_stats
class: Netdata
class: Errors
type: Netdata
component: DB engine
type: Errors
os: linux freebsd macos
hosts: *
lookup: sum -10m unaligned of flushing_pressure_deletions

View File

@ -11,9 +11,9 @@
template: disk_space_usage
on: disk.space
class: System
class: Utilization
type: System
component: Disk
type: Utilization
os: linux freebsd
hosts: *
families: !/dev !/dev/* !/run !/run/* *
@ -28,9 +28,9 @@ component: Disk
template: disk_inode_usage
on: disk.inodes
class: System
class: Utilization
type: System
component: Disk
type: Utilization
os: linux freebsd
hosts: *
families: !/dev !/dev/* !/run !/run/* *
@ -136,9 +136,9 @@ component: Disk
template: 10min_disk_utilization
on: disk.util
class: System
class: Utilization
type: System
component: Disk
type: Utilization
os: linux freebsd
hosts: *
families: *
@ -158,9 +158,9 @@ component: Disk
template: 10min_disk_backlog
on: disk.backlog
class: System
class: Latency
type: System
component: Disk
type: Latency
os: linux
hosts: *
families: *

View File

@ -3,9 +3,9 @@
template: dns_query_time_query_time
on: dns_query_time.query_time
class: DNS
class: Latency
type: DNS
component: DNS
type: Latency
lookup: average -10s unaligned foreach *
units: ms
every: 10s

View File

@ -2,9 +2,9 @@
template: dnsmasq_dhcp_dhcp_range_utilization
on: dnsmasq_dhcp.dhcp_range_utilization
class: DHCP
class: Utilization
type: DHCP
component: Dnsmasq
type: Utilization
every: 10s
units: %
calc: $used

View File

@ -1,8 +1,8 @@
template: docker_unhealthy_containers
on: docker.unhealthy_containers
class: Containers
class: Errors
type: Containers
component: Docker
type: Errors
units: unhealthy containers
every: 10s
lookup: average -10s

View File

@ -3,9 +3,9 @@
template: elasticsearch_last_collected
on: elasticsearch.cluster_health_status
class: Search engine
class: Latency
type: Search engine
component: Elasticsearch
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s

View File

@ -5,9 +5,9 @@
alarm: lowest_entropy
on: system.entropy
class: System
class: Utilization
type: System
component: Cryptography
type: Utilization
os: linux
hosts: *
lookup: min -5m unaligned

View File

@ -1,22 +1,25 @@
template: exporting_last_buffering
families: *
on: exporting_data_size
calc: $now - $last_collected_t
units: seconds ago
every: 10s
warn: $this > (($status >= $WARNING) ? ($update_every) : ( 5 * $update_every))
crit: $this > (($status == $CRITICAL) ? ($update_every) : (60 * $update_every))
delay: down 5m multiplier 1.5 max 1h
info: number of seconds since the last successful buffering of exporting data
to: dba
template: exporting_last_buffering
families: *
on: exporting_data_size
class: Latency
type: Netdata
component: Exporting engine
calc: $now - $last_collected_t
units: seconds ago
every: 10s
warn: $this > (($status >= $WARNING) ? ($update_every) : ( 5 * $update_every))
crit: $this > (($status == $CRITICAL) ? ($update_every) : (60 * $update_every))
delay: down 5m multiplier 1.5 max 1h
info: number of seconds since the last successful buffering of exporting data
to: dba
template: exporting_metrics_sent
families: *
on: exporting_data_size
class: Netdata
class: Workload
type: Netdata
component: Exporting engine
type: Workload
units: %
calc: abs($sent) * 100 / abs($buffered)
every: 10s

View File

@ -2,9 +2,9 @@
template: fping_last_collected_secs
families: *
on: fping.latency
class: Other
class: Latency
type: Other
component: Network
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s
@ -17,9 +17,9 @@ component: Network
template: fping_host_reachable
families: *
on: fping.latency
class: Other
class: Errors
type: Other
component: Network
type: Errors
calc: $average != nan
units: up/down
every: 10s
@ -31,9 +31,9 @@ component: Network
template: fping_host_latency
families: *
on: fping.latency
class: Other
class: Latency
type: Other
component: Network
type: Latency
lookup: average -10s unaligned of average
units: ms
every: 10s
@ -48,9 +48,9 @@ component: Network
template: fping_packet_loss
families: *
on: fping.quality
class: System
class: Errors
type: System
component: Network
type: Errors
lookup: average -10m unaligned of returned
calc: 100 - $this
green: 1

View File

@ -1,9 +1,9 @@
template: fronius_last_collected_secs
families: *
on: fronius.power
class: Power Supply
class: Latency
type: Power Supply
component: Solar
type: Latency
calc: $now - $last_collected_t
every: 10s
units: seconds ago

View File

@ -1,9 +1,9 @@
template: gearman_workers_queued
on: gearman.single_job
class: Computing
class: Latency
type: Computing
component: Gearman
type: Latency
lookup: average -10m unaligned match-names of Queued
units: workers
every: 10s

View File

@ -3,9 +3,9 @@
template: go.d_job_last_collected_secs
on: netdata.go_plugin_execution_time
class: Netdata
class: Error
type: Netdata
component: go.d.plugin
type: Error
module: *
calc: $now - $last_collected_t
units: seconds ago

View File

@ -1,8 +1,8 @@
template: haproxy_backend_server_status
on: haproxy_hs.down
class: Web Proxy
class: Errors
type: Web Proxy
component: HAProxy
type: Errors
units: failed servers
every: 10s
lookup: average -10s
@ -12,9 +12,9 @@ component: HAProxy
template: haproxy_backend_status
on: haproxy_hb.down
class: Web Proxy
class: Errors
type: Web Proxy
component: HAProxy
type: Errors
units: failed backend
every: 10s
lookup: average -10s
@ -24,9 +24,9 @@ component: HAProxy
template: haproxy_last_collected
on: haproxy_hb.down
class: Web Proxy
class: Latency
type: Web Proxy
component: HAProxy
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s

View File

@ -3,9 +3,9 @@
template: hdfs_capacity_usage
on: hdfs.capacity
class: Storage
class: Utilization
type: Storage
component: HDFS
type: Utilization
calc: ($used) * 100 / ($used + $remaining)
units: %
every: 10s
@ -20,9 +20,9 @@ component: HDFS
template: hdfs_missing_blocks
on: hdfs.blocks
class: Storage
class: Errors
type: Storage
component: HDFS
type: Errors
calc: $missing
units: missing blocks
every: 10s
@ -34,9 +34,9 @@ component: HDFS
template: hdfs_stale_nodes
on: hdfs.data_nodes
class: Storage
class: Errors
type: Storage
component: HDFS
type: Errors
calc: $stale
units: dead nodes
every: 10s
@ -48,9 +48,9 @@ component: HDFS
template: hdfs_dead_nodes
on: hdfs.data_nodes
class: Storage
class: Errors
type: Storage
component: HDFS
type: Errors
calc: $dead
units: dead nodes
every: 10s
@ -64,9 +64,9 @@ component: HDFS
template: hdfs_num_failed_volumes
on: hdfs.num_failed_volumes
class: Storage
class: Errors
type: Storage
component: HDFS
type: Errors
calc: $fsds_num_failed_volumes
units: failed volumes
every: 10s

View File

@ -3,9 +3,9 @@
template: httpcheck_web_service_up
families: *
on: httpcheck.status
class: Web Server
class: Utilization
type: Web Server
component: HTTP endpoint
type: Utilization
lookup: average -1m unaligned percentage of success
calc: ($this < 75) ? (0) : ($this)
every: 5s
@ -16,9 +16,9 @@ component: HTTP endpoint
template: httpcheck_web_service_bad_content
families: *
on: httpcheck.status
class: Web Server
class: Workload
type: Web Server
component: HTTP endpoint
type: Workload
lookup: average -5m unaligned percentage of bad_content
every: 10s
units: %
@ -32,9 +32,9 @@ component: HTTP endpoint
template: httpcheck_web_service_bad_status
families: *
on: httpcheck.status
class: Web Server
class: Workload
type: Web Server
component: HTTP endpoint
type: Workload
lookup: average -5m unaligned percentage of bad_status
every: 10s
units: %
@ -48,9 +48,9 @@ component: HTTP endpoint
template: httpcheck_web_service_timeouts
families: *
on: httpcheck.status
class: Web Server
class: Latency
type: Web Server
component: HTTP endpoint
type: Latency
lookup: average -5m unaligned percentage of timeout
every: 10s
units: %
@ -59,9 +59,9 @@ component: HTTP endpoint
template: httpcheck_no_web_service_connections
families: *
on: httpcheck.status
class: Other
class: Errors
type: Other
component: HTTP endpoint
type: Errors
lookup: average -5m unaligned percentage of no_connection
every: 10s
units: %
@ -71,9 +71,9 @@ component: HTTP endpoint
template: httpcheck_web_service_unreachable
families: *
on: httpcheck.status
class: Web Server
class: Errors
type: Web Server
component: HTTP endpoint
type: Errors
calc: ($httpcheck_no_web_service_connections >= $httpcheck_web_service_timeouts) ? ($httpcheck_no_web_service_connections) : ($httpcheck_web_service_timeouts)
units: %
every: 10s
@ -87,9 +87,9 @@ component: HTTP endpoint
template: httpcheck_1h_web_service_response_time
families: *
on: httpcheck.responsetime
class: Other
class: Latency
type: Other
component: HTTP endpoint
type: Latency
lookup: average -1h unaligned of time
every: 30s
units: ms
@ -98,9 +98,9 @@ component: HTTP endpoint
template: httpcheck_web_service_slow
families: *
on: httpcheck.responsetime
class: Web Server
class: Latency
type: Web Server
component: HTTP endpoint
type: Latency
lookup: average -3m unaligned of time
units: ms
every: 10s

View File

@ -1,9 +1,9 @@
template: ioping_disk_latency
families: *
on: ioping.latency
class: System
class: Latency
type: System
component: Disk
type: Latency
lookup: average -10s unaligned of average
units: ms
every: 10s

View File

@ -3,9 +3,9 @@
alarm: semaphores_used
on: system.ipc_semaphores
class: System
class: Utilization
type: System
component: IPC
type: Utilization
os: linux
hosts: *
calc: $semaphores * 100 / $ipc_semaphores_max
@ -19,9 +19,9 @@ component: IPC
alarm: semaphore_arrays_used
on: system.ipc_semaphore_arrays
class: System
class: Utilization
type: System
component: IPC
type: Utilization
os: linux
hosts: *
calc: $arrays * 100 / $ipc_semaphores_arrays_max

View File

@ -1,9 +1,9 @@
template: ipfs_datastore_usage
on: ipfs.repo_size
class: Data Sharing
class: Utilization
type: Data Sharing
component: IPFS
type: Utilization
calc: $size * 100 / $avail
units: %
every: 10s

View File

@ -1,8 +1,8 @@
alarm: ipmi_sensors_states
on: ipmi.sensors_states
class: System
class: Errors
type: System
component: IPMI
type: Errors
calc: $warning + $critical
units: sensors
every: 10s
@ -14,9 +14,9 @@ component: IPMI
alarm: ipmi_events
on: ipmi.events
class: System
class: Utilization
type: System
component: IPMI
type: Utilization
calc: $events
units: events
every: 10s

View File

@ -6,9 +6,9 @@
template: kubelet_node_config_error
on: k8s_kubelet.kubelet_node_config_error
class: Kubernetes
class: Errors
type: Kubernetes
component: Kubelet
type: Errors
calc: $kubelet_node_config_error
units: bool
every: 10s
@ -22,9 +22,9 @@ component: Kubelet
template: kubelet_token_requests
lookup: sum -10s of token_fail_count
on: k8s_kubelet.kubelet_token_requests
class: Kubernetes
class: Errors
type: Kubernetes
component: Kubelet
type: Errors
units: failed requests
every: 10s
warn: $this > 0
@ -37,9 +37,9 @@ component: Kubelet
template: kubelet_operations_error
lookup: sum -1m
on: k8s_kubelet.kubelet_operations_errors
class: Kubernetes
class: Errors
type: Kubernetes
component: Kubelet
type: Errors
units: errors
every: 10s
warn: $this > (($status >= $WARNING) ? (0) : (20))
@ -64,9 +64,9 @@ component: Kubelet
template: kubelet_1m_pleg_relist_latency_quantile_05
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -1m unaligned of kubelet_pleg_relist_latency_05
units: microseconds
every: 10s
@ -74,9 +74,9 @@ component: Kubelet
template: kubelet_10s_pleg_relist_latency_quantile_05
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -10s unaligned of kubelet_pleg_relist_latency_05
calc: $this * 100 / (($kubelet_1m_pleg_relist_latency_quantile_05 < 1000)?(1000):($kubelet_1m_pleg_relist_latency_quantile_05))
every: 10s
@ -92,9 +92,9 @@ component: Kubelet
template: kubelet_1m_pleg_relist_latency_quantile_09
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -1m unaligned of kubelet_pleg_relist_latency_09
units: microseconds
every: 10s
@ -102,9 +102,9 @@ component: Kubelet
template: kubelet_10s_pleg_relist_latency_quantile_09
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -10s unaligned of kubelet_pleg_relist_latency_09
calc: $this * 100 / (($kubelet_1m_pleg_relist_latency_quantile_09 < 1000)?(1000):($kubelet_1m_pleg_relist_latency_quantile_09))
every: 10s
@ -120,9 +120,9 @@ component: Kubelet
template: kubelet_1m_pleg_relist_latency_quantile_099
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -1m unaligned of kubelet_pleg_relist_latency_099
units: microseconds
every: 10s
@ -130,9 +130,9 @@ component: Kubelet
template: kubelet_10s_pleg_relist_latency_quantile_099
on: k8s_kubelet.kubelet_pleg_relist_latency_microseconds
class: Kubernetes
class: Latency
type: Kubernetes
component: Kubelet
type: Latency
lookup: average -10s unaligned of kubelet_pleg_relist_latency_099
calc: $this * 100 / (($kubelet_1m_pleg_relist_latency_quantile_099 < 1000)?(1000):($kubelet_1m_pleg_relist_latency_quantile_099))
every: 10s

View File

@ -2,9 +2,9 @@
template: linux_power_supply_capacity
on: powersupply.capacity
class: Power Supply
class: Utilization
type: Power Supply
component: Battery
type: Utilization
calc: $capacity
units: %
every: 10s

View File

@ -6,9 +6,9 @@
# minute, with a special case for a single CPU of setting the trigger at 2.
alarm: load_cpu_number
on: system.load
class: System
class: Utilization
type: System
component: Load
type: Utilization
os: linux
hosts: *
calc: ($active_processors == nan or $active_processors == inf or $active_processors < 2) ? ( 2 ) : ( $active_processors )
@ -22,9 +22,9 @@ component: Load
alarm: load_average_15
on: system.load
class: System
class: Utilization
type: System
component: Load
type: Utilization
os: linux
hosts: *
lookup: max -1m unaligned of load15
@ -37,9 +37,9 @@ component: Load
alarm: load_average_5
on: system.load
class: System
class: Utilization
type: System
component: Load
type: Utilization
os: linux
hosts: *
lookup: max -1m unaligned of load5
@ -52,9 +52,9 @@ component: Load
alarm: load_average_1
on: system.load
class: System
class: Utilization
type: System
component: Load
type: Utilization
os: linux
hosts: *
lookup: max -1m unaligned of load1

View File

@ -1,8 +1,8 @@
template: mdstat_last_collected
on: md.disks
class: System
class: Latency
type: System
component: RAID
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s
@ -13,9 +13,9 @@ component: RAID
template: mdstat_disks
on: md.disks
class: System
class: Errors
type: System
component: RAID
type: Errors
units: failed devices
every: 10s
calc: $down
@ -26,9 +26,9 @@ component: RAID
template: mdstat_mismatch_cnt
on: md.mismatch_cnt
class: System
class: Errors
type: System
component: RAID
type: Errors
families: !*(raid1) !*(raid10) *
units: unsynchronized blocks
calc: $count
@ -40,9 +40,9 @@ component: RAID
template: mdstat_nonredundant_last_collected
on: md.nonredundant
class: System
class: Latency
type: System
component: RAID
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s

View File

@ -3,9 +3,9 @@
template: megacli_adapter_state
on: megacli.adapter_degraded
class: System
class: Errors
type: System
component: RAID
type: Errors
lookup: max -10s foreach *
units: boolean
every: 10s
@ -18,9 +18,9 @@ component: RAID
template: megacli_pd_predictive_failures
on: megacli.pd_predictive_failure
class: System
class: Errors
type: System
component: RAID
type: Errors
lookup: sum -10s foreach *
units: predictive failures
every: 10s
@ -31,9 +31,9 @@ component: RAID
template: megacli_pd_media_errors
on: megacli.pd_media_error
class: System
class: Errors
type: System
component: RAID
type: Errors
lookup: sum -10s foreach *
units: media errors
every: 10s
@ -46,9 +46,9 @@ component: RAID
template: megacli_bbu_relative_charge
on: megacli.bbu_relative_charge
class: System
class: Workload
type: System
component: RAID
type: Workload
lookup: average -10s
units: percent
every: 10s
@ -59,9 +59,9 @@ component: RAID
template: megacli_bbu_cycle_count
on: megacli.bbu_cycle_count
class: System
class: Workload
type: System
component: RAID
type: Workload
lookup: average -10s
units: cycles
every: 10s

View File

@ -3,9 +3,9 @@
template: memcached_cache_memory_usage
on: memcached.cache
class: KV Storage
class: Utilization
type: KV Storage
component: Memcached
type: Utilization
calc: $used * 100 / ($used + $available)
units: %
every: 10s
@ -20,9 +20,9 @@ component: Memcached
template: memcached_cache_fill_rate
on: memcached.cache
class: KV Storage
class: Utilization
type: KV Storage
component: Memcached
type: Utilization
lookup: min -10m at -50m unaligned of available
calc: ($this - $available) / (($now - $after) / 3600)
units: KB/hour
@ -34,9 +34,9 @@ component: Memcached
template: memcached_out_of_cache_space_time
on: memcached.cache
class: KV Storage
class: Utilization
type: KV Storage
component: Memcached
type: Utilization
calc: ($memcached_cache_fill_rate > 0) ? ($available / $memcached_cache_fill_rate) : (inf)
units: hours
every: 10s

View File

@ -3,9 +3,9 @@
alarm: 1hour_ecc_memory_correctable
on: mem.ecc_ce
class: System
class: Errors
type: System
component: Memory
type: Errors
os: linux
hosts: *
lookup: sum -10m unaligned
@ -18,9 +18,9 @@ component: Memory
alarm: 1hour_ecc_memory_uncorrectable
on: mem.ecc_ue
class: System
class: Errors
type: System
component: Memory
type: Errors
os: linux
hosts: *
lookup: sum -10m unaligned
@ -33,9 +33,9 @@ component: Memory
alarm: 1hour_memory_hw_corrupted
on: mem.hwcorrupt
class: System
class: Errors
type: System
component: Memory
type: Errors
os: linux
hosts: *
calc: $HardwareCorrupted

View File

@ -3,9 +3,9 @@
template: mysql_10s_slow_queries
on: mysql.queries
class: Database
class: Latency
type: Database
component: MySQL
type: Latency
lookup: sum -10s of slow_queries
units: slow queries
every: 10s
@ -21,9 +21,9 @@ component: MySQL
template: mysql_10s_table_locks_immediate
on: mysql.table_locks
class: Database
class: Utilization
type: Database
component: MySQL
type: Utilization
lookup: sum -10s absolute of immediate
units: immediate locks
every: 10s
@ -32,9 +32,9 @@ component: MySQL
template: mysql_10s_table_locks_waited
on: mysql.table_locks
class: Database
class: Latency
type: Database
component: MySQL
type: Latency
lookup: sum -10s absolute of waited
units: waited locks
every: 10s
@ -43,9 +43,9 @@ component: MySQL
template: mysql_10s_waited_locks_ratio
on: mysql.table_locks
class: Database
class: Latency
type: Database
component: MySQL
type: Latency
calc: ( ($mysql_10s_table_locks_waited + $mysql_10s_table_locks_immediate) > 0 ) ? (($mysql_10s_table_locks_waited * 100) / ($mysql_10s_table_locks_waited + $mysql_10s_table_locks_immediate)) : 0
units: %
every: 10s
@ -61,9 +61,9 @@ component: MySQL
template: mysql_connections
on: mysql.connections_active
class: Database
class: Utilization
type: Database
component: MySQL
type: Utilization
calc: $active * 100 / $limit
units: %
every: 10s
@ -79,9 +79,9 @@ component: MySQL
template: mysql_replication
on: mysql.slave_status
class: Database
class: Errors
type: Database
component: MySQL
type: Errors
calc: ($sql_running <= 0 OR $io_running <= 0)?0:1
units: ok/failed
every: 10s
@ -92,9 +92,9 @@ component: MySQL
template: mysql_replication_lag
on: mysql.slave_behind
class: Database
class: Latency
type: Database
component: MySQL
type: Errors
calc: $seconds
units: seconds
every: 10s
@ -111,9 +111,9 @@ component: MySQL
template: mysql_galera_cluster_size_max_2m
on: mysql.galera_cluster_size
class: Database
class: Utilization
type: Database
component: MySQL
type: Utilization
lookup: max -2m absolute
units: nodes
every: 10s
@ -122,9 +122,9 @@ component: MySQL
template: mysql_galera_cluster_size
on: mysql.galera_cluster_size
class: Database
class: Utilization
type: Database
component: MySQL
type: Utilization
calc: $nodes
units: nodes
every: 10s
@ -138,9 +138,9 @@ component: MySQL
template: mysql_galera_cluster_state
on: mysql.galera_cluster_state
class: Database
class: Errors
type: Database
component: MySQL
type: Errors
calc: $state
every: 10s
warn: $this == 2 OR $this == 3
@ -155,9 +155,9 @@ component: MySQL
template: mysql_galera_cluster_status
on: mysql.galera_cluster_status
class: Database
class: Errors
type: Database
component: MySQL
type: Errors
calc: $wsrep_cluster_status
every: 10s
crit: $mysql_galera_cluster_state != nan AND $this != 0

View File

@ -6,9 +6,9 @@
template: interface_speed
on: net.net
class: System
class: Latency
type: System
component: Network
type: Latency
os: *
hosts: *
families: *
@ -19,9 +19,9 @@ component: Network
template: 1m_received_traffic_overflow
on: net.net
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
families: *
@ -36,9 +36,9 @@ component: Network
template: 1m_sent_traffic_overflow
on: net.net
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
families: *
@ -63,9 +63,9 @@ component: Network
template: inbound_packets_dropped
on: net.drops
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: !net* *
@ -76,9 +76,9 @@ component: Network
template: outbound_packets_dropped
on: net.drops
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: !net* *
@ -89,9 +89,9 @@ component: Network
template: inbound_packets_dropped_ratio
on: net.packets
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: !net* !wl* *
@ -106,9 +106,9 @@ component: Network
template: outbound_packets_dropped_ratio
on: net.packets
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: !net* !wl* *
@ -123,9 +123,9 @@ component: Network
template: wifi_inbound_packets_dropped_ratio
on: net.packets
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: wl*
@ -140,9 +140,9 @@ component: Network
template: wifi_outbound_packets_dropped_ratio
on: net.packets
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: wl*
@ -160,9 +160,9 @@ component: Network
template: interface_inbound_errors
on: net.errors
class: System
class: Errors
type: System
component: Network
type: Errors
os: freebsd
hosts: *
families: *
@ -176,9 +176,9 @@ component: Network
template: interface_outbound_errors
on: net.errors
class: System
class: Errors
type: System
component: Network
type: Errors
os: freebsd
hosts: *
families: *
@ -200,9 +200,9 @@ component: Network
template: 10min_fifo_errors
on: net.fifo
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
families: *
@ -225,9 +225,9 @@ component: Network
template: 1m_received_packets_rate
on: net.packets
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux freebsd
hosts: *
families: *
@ -238,9 +238,9 @@ component: Network
template: 10s_received_packets_storm
on: net.packets
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux freebsd
hosts: *
families: *

View File

@ -3,9 +3,9 @@
alarm: netfilter_conntrack_full
on: netfilter.conntrack_sockets
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
lookup: max -10s unaligned of connections

View File

@ -3,9 +3,9 @@
template: pihole_blocked_queries
on: pihole.dns_queries_percentage
class: Ad Filtering
class: Errors
type: Ad Filtering
component: Pi-hole
type: Errors
every: 10s
units: %
calc: $blocked
@ -21,9 +21,9 @@ component: Pi-hole
template: pihole_blocklist_last_update
on: pihole.blocklist_last_update
class: Ad Filtering
class: Errors
type: Ad Filtering
component: Pi-hole
type: Errors
every: 10s
units: seconds
calc: $ago
@ -36,9 +36,9 @@ component: Pi-hole
template: pihole_blocklist_gravity_file
on: pihole.blocklist_last_update
class: Ad Filtering
class: Errors
type: Ad Filtering
component: Pi-hole
type: Errors
every: 10s
units: boolean
calc: $file_exists
@ -52,9 +52,9 @@ component: Pi-hole
template: pihole_status
on: pihole.unwanted_domains_blocking_status
class: Ad Filtering
class: Errors
type: Ad Filtering
component: Pi-hole
type: Errors
every: 10s
units: boolean
calc: $enabled

View File

@ -3,9 +3,9 @@
template: portcheck_service_reachable
families: *
on: portcheck.status
class: Other
class: Workload
type: Other
component: TCP endpoint
type: Workload
lookup: average -1m unaligned percentage of success
calc: ($this < 75) ? (0) : ($this)
every: 5s
@ -16,9 +16,9 @@ component: TCP endpoint
template: portcheck_connection_timeouts
families: *
on: portcheck.status
class: Other
class: Errors
type: Other
component: TCP endpoint
type: Errors
lookup: average -5m unaligned percentage of timeout
every: 10s
units: %
@ -31,9 +31,9 @@ component: TCP endpoint
template: portcheck_connection_fails
families: *
on: portcheck.status
class: Other
class: Errors
type: Other
component: TCP endpoint
type: Errors
lookup: average -5m unaligned percentage of no_connection,failed
every: 10s
units: %

View File

@ -2,9 +2,9 @@
alarm: active_processes
on: system.active_processes
class: System
class: Workload
type: System
component: Processes
type: Workload
hosts: *
calc: $active * 100 / $pidmax
units: %

View File

@ -3,9 +3,9 @@
template: python.d_job_last_collected_secs
on: netdata.pythond_runtime
class: Netdata
class: Error
type: Netdata
component: python.d.plugin
type: Error
module: *
calc: $now - $last_collected_t
units: seconds ago

View File

@ -3,9 +3,9 @@
alarm: used_ram_to_ignore
on: system.ram
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: linux freebsd
hosts: *
calc: ($zfs.arc_size.arcsz = nan)?(0):($zfs.arc_size.arcsz - $zfs.arc_size.min)
@ -15,9 +15,9 @@ component: Memory
alarm: ram_in_use
on: system.ram
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: linux
hosts: *
# calc: $used * 100 / ($used + $cached + $free)
@ -32,9 +32,9 @@ component: Memory
alarm: ram_available
on: mem.available
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: linux
hosts: *
calc: ($avail + $system.ram.used_ram_to_ignore) * 100 / ($system.ram.used + $system.ram.cached + $system.ram.free + $system.ram.buffers)
@ -61,9 +61,9 @@ component: Memory
## FreeBSD
alarm: ram_in_use
on: system.ram
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: freebsd
hosts: *
calc: ($active + $wired + $laundry + $buffers - $used_ram_to_ignore) * 100 / ($active + $wired + $laundry + $buffers - $used_ram_to_ignore + $cache + $free + $inactive)
@ -77,9 +77,9 @@ component: Memory
alarm: ram_available
on: system.ram
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: freebsd
hosts: *
calc: ($free + $inactive + $used_ram_to_ignore) * 100 / ($free + $active + $inactive + $wired + $cache + $laundry + $buffers)

View File

@ -2,9 +2,9 @@
template: redis_bgsave_broken
families: *
on: redis.bgsave_health
class: KV Storage
class: Errors
type: KV Storage
component: Redis
type: Errors
every: 10s
crit: $rdb_last_bgsave_status != 0
units: ok/failed
@ -15,9 +15,9 @@ component: Redis
template: redis_bgsave_slow
families: *
on: redis.bgsave_now
class: KV Storage
class: Latency
type: KV Storage
component: Redis
type: Latency
every: 10s
warn: $rdb_bgsave_in_progress > 600
crit: $rdb_bgsave_in_progress > 1200

View File

@ -3,9 +3,9 @@
template: retroshare_dht_working
on: retroshare.dht
class: Data Sharing
class: Utilization
type: Data Sharing
component: Retroshare
type: Utilization
calc: $dht_size_all
units: peers
every: 1m

View File

@ -2,9 +2,9 @@
# Warn if a list keys operation is running.
template: riakkv_list_keys_active
on: riak.core.fsm_active
class: Database
class: Utilization
type: Database
component: Riak KV
type: Utilization
calc: $list_fsm_active
units: state machines
every: 10s
@ -17,9 +17,9 @@ component: Riak KV
# KV GET
template: riakkv_1h_kv_get_mean_latency
on: riak.kv.latency.get
class: Database
class: Latency
type: Database
component: Riak KV
type: Latency
calc: $node_get_fsm_time_mean
lookup: average -1h unaligned of time
every: 30s
@ -29,9 +29,9 @@ component: Riak KV
template: riakkv_kv_get_slow
on: riak.kv.latency.get
class: Database
class: Latency
type: Database
component: Riak KV
type: Latency
calc: $mean
lookup: average -3m unaligned of time
units: ms
@ -47,9 +47,9 @@ component: Riak KV
# KV PUT
template: riakkv_1h_kv_put_mean_latency
on: riak.kv.latency.put
class: Database
class: Latency
type: Database
component: Riak KV
type: Latency
calc: $node_put_fsm_time_mean
lookup: average -1h unaligned of time
every: 30s
@ -59,9 +59,9 @@ component: Riak KV
template: riakkv_kv_put_slow
on: riak.kv.latency.put
class: Database
class: Latency
type: Database
component: Riak KV
type: Latency
calc: $mean
lookup: average -3m unaligned of time
units: ms
@ -81,9 +81,9 @@ component: Riak KV
# On systems observed, this is < 2000, but may grow depending on load.
template: riakkv_vm_high_process_count
on: riak.vm
class: Database
class: Utilization
type: Database
component: Riak KV
type: Utilization
calc: $sys_process_count
units: processes
every: 10s

View File

@ -3,9 +3,9 @@
template: scaleio_storage_pool_capacity_utilization
on: scaleio.storage_pool_capacity_utilization
class: Storage
class: Utilization
type: Storage
component: ScaleIO
type: Utilization
calc: $used
units: %
every: 10s
@ -20,9 +20,9 @@ component: ScaleIO
template: scaleio_sdc_mdm_connection_state
on: scaleio.sdc_mdm_connection_state
class: Storage
class: Utilization
type: Storage
component: ScaleIO
type: Utilization
calc: $connected
every: 10s
warn: $this != 1

View File

@ -5,9 +5,9 @@
alarm: 1min_netdev_backlog_exceeded
on: system.softnet_stat
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
lookup: average -1m unaligned absolute of dropped
@ -21,9 +21,9 @@ component: Network
alarm: 1min_netdev_budget_ran_outs
on: system.softnet_stat
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
lookup: average -1m unaligned absolute of squeezed
@ -38,9 +38,9 @@ component: Network
alarm: 10min_netisr_backlog_exceeded
on: system.softnet_stat
class: System
class: Errors
type: System
component: Network
type: Errors
os: freebsd
hosts: *
lookup: average -1m unaligned absolute of qdrops

View File

@ -1,9 +1,9 @@
template: stiebeleltron_last_collected_secs
families: *
on: stiebeleltron.heating.hc1
class: Other
class: Latency
type: Other
component: Sensors
type: Latency
calc: $now - $last_collected_t
every: 10s
units: seconds ago

View File

@ -3,9 +3,9 @@
alarm: 30min_ram_swapped_out
on: system.swapio
class: System
class: Workload
type: System
component: Memory
type: Workload
os: linux freebsd
hosts: *
lookup: sum -30m unaligned absolute of out
@ -20,9 +20,9 @@ component: Memory
alarm: used_swap
on: system.swap
class: System
class: Utilization
type: System
component: Memory
type: Utilization
os: linux freebsd
hosts: *
calc: $used * 100 / ( $used + $free )

View File

@ -4,9 +4,9 @@
## Service units
template: systemd_service_units_state
on: systemd.service_units_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -18,9 +18,9 @@ component: Systemd units
## Socket units
template: systemd_socket_units_state
on: systemd.socket_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -32,9 +32,9 @@ component: Systemd units
## Target units
template: systemd_target_units_state
on: systemd.target_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -46,9 +46,9 @@ component: Systemd units
## Path units
template: systemd_path_units_state
on: systemd.path_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -60,9 +60,9 @@ component: Systemd units
## Device units
template: systemd_device_units_state
on: systemd.device_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -74,9 +74,9 @@ component: Systemd units
## Mount units
template: systemd_mount_units_state
on: systemd.mount_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -88,9 +88,9 @@ component: Systemd units
## Automount units
template: systemd_automount_units_state
on: systemd.automount_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -102,9 +102,9 @@ component: Systemd units
## Swap units
template: systemd_swap_units_state
on: systemd.swap_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -116,9 +116,9 @@ component: Systemd units
## Scope units
template: systemd_scope_units_state
on: systemd.scope_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s
@ -130,9 +130,9 @@ component: Systemd units
## Slice units
template: systemd_slice_units_state
on: systemd.slice_unit_state
class: Linux
class: Errors
type: Linux
component: Systemd units
type: Errors
lookup: max -1s min2max
units: ok/failed
every: 10s

View File

@ -7,9 +7,9 @@
alarm: tcp_connections
on: ipv4.tcpsock
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
calc: (${tcp_max_connections} > 0) ? ( ${connections} * 100 / ${tcp_max_connections} ) : 0

View File

@ -20,9 +20,9 @@
alarm: 1m_tcp_accept_queue_overflows
on: ip.tcp_accept_queue
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
lookup: average -60s unaligned absolute of ListenOverflows
@ -38,9 +38,9 @@ component: Network
# CHECK: https://github.com/netdata/netdata/issues/3234#issuecomment-423935842
alarm: 1m_tcp_accept_queue_drops
on: ip.tcp_accept_queue
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
lookup: average -60s unaligned absolute of ListenDrops
@ -63,9 +63,9 @@ component: Network
alarm: 1m_tcp_syn_queue_drops
on: ip.tcp_syn_queue
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
lookup: average -60s unaligned absolute of TCPReqQFullDrop
@ -80,9 +80,9 @@ component: Network
alarm: 1m_tcp_syn_queue_cookies
on: ip.tcp_syn_queue
class: System
class: Workload
type: System
component: Network
type: Workload
os: linux
hosts: *
lookup: average -60s unaligned absolute of TCPReqQFullDoCookies

View File

@ -8,9 +8,9 @@
alarm: tcp_memory
on: ipv4.sockstat_tcp_mem
class: System
class: Utilization
type: System
component: Network
type: Utilization
os: linux
hosts: *
calc: ${mem} * 100 / ${tcp_mem_high}

View File

@ -9,9 +9,9 @@
alarm: tcp_orphans
on: ipv4.sockstat_tcp_sockets
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
calc: ${orphan} * 100 / ${tcp_max_orphans}

View File

@ -6,9 +6,9 @@
alarm: 1m_ipv4_tcp_resets_sent
on: ipv4.tcphandshake
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
lookup: average -1m at -10s unaligned absolute of OutRsts
@ -18,9 +18,9 @@ component: Network
alarm: 10s_ipv4_tcp_resets_sent
on: ipv4.tcphandshake
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
lookup: average -10s unaligned absolute of OutRsts
@ -40,9 +40,9 @@ component: Network
alarm: 1m_ipv4_tcp_resets_received
on: ipv4.tcphandshake
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux freebsd
hosts: *
lookup: average -1m at -10s unaligned absolute of AttemptFails
@ -52,9 +52,9 @@ component: Network
alarm: 10s_ipv4_tcp_resets_received
on: ipv4.tcphandshake
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux freebsd
hosts: *
lookup: average -10s unaligned absolute of AttemptFails

View File

@ -5,9 +5,9 @@
alarm: system_clock_sync_state
on: system.clock_sync_state
os: linux
class: System
class: Error
type: System
component: Clock
type: Error
calc: $state
units: synchronization state
every: 10s

View File

@ -6,9 +6,9 @@
alarm: 1m_ipv4_udp_receive_buffer_errors
on: ipv4.udperrors
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux freebsd
hosts: *
lookup: average -1m unaligned absolute of RcvbufErrors
@ -24,9 +24,9 @@ component: Network
alarm: 1m_ipv4_udp_send_buffer_errors
on: ipv4.udperrors
class: System
class: Errors
type: System
component: Network
type: Errors
os: linux
hosts: *
lookup: average -1m unaligned absolute of SndbufErrors

View File

@ -3,9 +3,9 @@
template: unbound_request_list_overwritten
on: unbound.request_list_jostle_list
class: DNS
class: Errors
type: DNS
component: Unbound
type: Errors
lookup: average -60s unaligned absolute match-names of overwritten
units: queries
every: 10s
@ -16,9 +16,9 @@ component: Unbound
template: unbound_request_list_dropped
on: unbound.request_list_jostle_list
class: DNS
class: Errors
type: DNS
component: Unbound
type: Errors
lookup: average -60s unaligned absolute match-names of dropped
units: queries
every: 10s

View File

@ -1,8 +1,8 @@
alarm: varnish_last_collected
on: varnish.uptime
class: Web Proxy
class: Latency
type: Web Proxy
component: Varnish
type: Latency
calc: $now - $last_collected_t
units: seconds ago
every: 10s

View File

@ -8,9 +8,9 @@
template: vcsa_system_health
on: vcsa.system_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of system
units: status
every: 10s
@ -30,9 +30,9 @@ component: VMware vCenter
template: vcsa_swap_health
on: vcsa.components_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of swap
units: status
every: 10s
@ -45,9 +45,9 @@ component: VMware vCenter
template: vcsa_storage_health
on: vcsa.components_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of storage
units: status
every: 10s
@ -60,9 +60,9 @@ component: VMware vCenter
template: vcsa_mem_health
on: vcsa.components_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of mem
units: status
every: 10s
@ -75,9 +75,9 @@ component: VMware vCenter
template: vcsa_load_health
on: vcsa.components_health
class: Virtual Machine
class: Utilization
type: Virtual Machine
component: VMware vCenter
type: Utilization
lookup: max -10s unaligned of load
units: status
every: 10s
@ -90,9 +90,9 @@ component: VMware vCenter
template: vcsa_database_storage_health
on: vcsa.components_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of database_storage
units: status
every: 10s
@ -105,9 +105,9 @@ component: VMware vCenter
template: vcsa_applmgmt_health
on: vcsa.components_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of applmgmt
units: status
every: 10s
@ -127,9 +127,9 @@ component: VMware vCenter
template: vcsa_software_updates_health
on: vcsa.software_updates_health
class: Virtual Machine
class: Errors
type: Virtual Machine
component: VMware vCenter
type: Errors
lookup: max -10s unaligned of software_packages
units: status
every: 10s

View File

@ -3,9 +3,9 @@
template: vernemq_socket_errors
on: vernemq.socket_errors
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: sum -1m unaligned absolute of socket_error
units: errors
every: 1m
@ -18,9 +18,9 @@ component: VerneMQ
template: vernemq_queue_message_drop
on: vernemq.queue_undelivered_messages
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute of queue_message_drop
units: dropped messages
every: 1m
@ -31,9 +31,9 @@ component: VerneMQ
template: vernemq_queue_message_expired
on: vernemq.queue_undelivered_messages
class: Messaging
class: Latency
type: Messaging
component: VerneMQ
type: Latency
lookup: average -1m unaligned absolute of queue_message_expired
units: expired messages
every: 1m
@ -44,9 +44,9 @@ component: VerneMQ
template: vernemq_queue_message_unhandled
on: vernemq.queue_undelivered_messages
class: Messaging
class: Latency
type: Messaging
component: VerneMQ
type: Latency
lookup: average -1m unaligned absolute of queue_message_unhandled
units: unhandled messages
every: 1m
@ -59,9 +59,9 @@ component: VerneMQ
template: vernemq_average_scheduler_utilization
on: vernemq.average_scheduler_utilization
class: Messaging
class: Utilization
type: Messaging
component: VerneMQ
type: Utilization
lookup: average -10m unaligned
units: %
every: 1m
@ -75,9 +75,9 @@ component: VerneMQ
template: vernemq_cluster_dropped
on: vernemq.cluster_dropped
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: sum -1m unaligned
units: KiB
every: 1m
@ -88,9 +88,9 @@ component: VerneMQ
template: vernemq_netsplits
on: vernemq.netsplits
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: sum -1m unaligned absolute of netsplit_detected
units: netsplits
every: 10s
@ -103,9 +103,9 @@ component: VerneMQ
template: vernemq_mqtt_connack_sent_reason_unsuccessful
on: vernemq.mqtt_connack_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -118,9 +118,9 @@ component: VerneMQ
template: vernemq_mqtt_disconnect_received_reason_not_normal
on: vernemq.mqtt_disconnect_received_reason
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute match-names of !normal_disconnect,*
units: packets
every: 1m
@ -131,9 +131,9 @@ component: VerneMQ
template: vernemq_mqtt_disconnect_sent_reason_not_normal
on: vernemq.mqtt_disconnect_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !normal_disconnect,*
units: packets
every: 1m
@ -146,9 +146,9 @@ component: VerneMQ
template: vernemq_mqtt_subscribe_error
on: vernemq.mqtt_subscribe_error
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute
units: failed ops
every: 1m
@ -159,9 +159,9 @@ component: VerneMQ
template: vernemq_mqtt_subscribe_auth_error
on: vernemq.mqtt_subscribe_auth_error
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute
units: attempts
every: 1m
@ -174,9 +174,9 @@ component: VerneMQ
template: vernemq_mqtt_unsubscribe_error
on: vernemq.mqtt_unsubscribe_error
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute
units: failed ops
every: 1m
@ -189,9 +189,9 @@ component: VerneMQ
template: vernemq_mqtt_publish_errors
on: vernemq.mqtt_publish_errors
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute
units: failed ops
every: 1m
@ -202,9 +202,9 @@ component: VerneMQ
template: vernemq_mqtt_publish_auth_errors
on: vernemq.mqtt_publish_auth_errors
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute
units: attempts
every: 1m
@ -217,9 +217,9 @@ component: VerneMQ
template: vernemq_mqtt_puback_received_reason_unsuccessful
on: vernemq.mqtt_puback_received_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -230,9 +230,9 @@ component: VerneMQ
template: vernemq_mqtt_puback_sent_reason_unsuccessful
on: vernemq.mqtt_puback_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -243,9 +243,9 @@ component: VerneMQ
template: vernemq_mqtt_puback_unexpected
on: vernemq.mqtt_puback_invalid_error
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute
units: messages
every: 1m
@ -258,9 +258,9 @@ component: VerneMQ
template: vernemq_mqtt_pubrec_received_reason_unsuccessful
on: vernemq.mqtt_pubrec_received_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -271,9 +271,9 @@ component: VerneMQ
template: vernemq_mqtt_pubrec_sent_reason_unsuccessful
on: vernemq.mqtt_pubrec_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -284,9 +284,9 @@ component: VerneMQ
template: vernemq_mqtt_pubrec_invalid_error
on: vernemq.mqtt_pubrec_invalid_error
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute
units: messages
every: 1m
@ -299,9 +299,9 @@ component: VerneMQ
template: vernemq_mqtt_pubrel_received_reason_unsuccessful
on: vernemq.mqtt_pubrel_received_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -312,9 +312,9 @@ component: VerneMQ
template: vernemq_mqtt_pubrel_sent_reason_unsuccessful
on: vernemq.mqtt_pubrel_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -327,9 +327,9 @@ component: VerneMQ
template: vernemq_mqtt_pubcomp_received_reason_unsuccessful
on: vernemq.mqtt_pubcomp_received_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -340,9 +340,9 @@ component: VerneMQ
template: vernemq_mqtt_pubcomp_sent_reason_unsuccessful
on: vernemq.mqtt_pubcomp_sent_reason
class: Messaging
class: Errors
type: Messaging
component: VerneMQ
type: Errors
lookup: average -1m unaligned absolute match-names of !success,*
units: packets
every: 1m
@ -353,9 +353,9 @@ component: VerneMQ
template: vernemq_mqtt_pubcomp_unexpected
on: vernemq.mqtt_pubcomp_invalid_error
class: Messaging
class: Workload
type: Messaging
component: VerneMQ
type: Workload
lookup: average -1m unaligned absolute
units: messages
every: 1m

View File

@ -6,9 +6,9 @@
template: vsphere_vm_mem_usage
on: vsphere.vm_mem_usage_percentage
class: Virtual Machine
class: Utilization
type: Virtual Machine
component: Memory
type: Utilization
hosts: *
calc: $used
units: %
@ -23,9 +23,9 @@ component: Memory
template: vsphere_host_mem_usage
on: vsphere.host_mem_usage_percentage
class: Virtual Machine
class: Utilization
type: Virtual Machine
component: Memory
type: Utilization
hosts: *
calc: $used
units: %
@ -39,9 +39,9 @@ component: Memory
template: vsphere_inbound_packets_errors
on: vsphere.net_errors_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of rx
@ -51,9 +51,9 @@ component: Network
template: vsphere_outbound_packets_errors
on: vsphere.net_errors_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of tx
@ -65,9 +65,9 @@ component: Network
template: vsphere_inbound_packets_errors_ratio
on: vsphere.net_packets_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of rx
@ -81,9 +81,9 @@ component: Network
template: vsphere_outbound_packets_errors_ratio
on: vsphere.net_packets_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of tx
@ -100,9 +100,9 @@ component: Network
template: vsphere_cpu_usage
on: vsphere.cpu_usage_total
class: Virtual Machine
class: Utilization
type: Virtual Machine
component: CPU
type: Utilization
hosts: *
lookup: average -10m unaligned match-names of used
units: %
@ -117,9 +117,9 @@ component: CPU
template: vsphere_inbound_packets_dropped
on: vsphere.net_drops_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of rx
@ -129,9 +129,9 @@ component: Network
template: vsphere_outbound_packets_dropped
on: vsphere.net_drops_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of tx
@ -143,9 +143,9 @@ component: Network
template: vsphere_inbound_packets_dropped_ratio
on: vsphere.net_packets_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of rx
@ -159,9 +159,9 @@ component: Network
template: vsphere_outbound_packets_dropped_ratio
on: vsphere.net_packets_total
class: Virtual Machine
class: Errors
type: Virtual Machine
component: Network
type: Errors
hosts: *
families: *
lookup: sum -10m unaligned absolute match-names of tx

View File

@ -11,9 +11,9 @@
template: 1m_requests
on: web_log.response_statuses
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned
calc: ($this == 0)?(1):($this)
@ -23,9 +23,9 @@ component: Web log
template: 1m_successful
on: web_log.response_statuses
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned of successful_requests
calc: $this * 100 / $1m_requests
@ -39,9 +39,9 @@ component: Web log
template: 1m_redirects
on: web_log.response_statuses
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned of redirects
calc: $this * 100 / $1m_requests
@ -54,9 +54,9 @@ component: Web log
template: 1m_bad_requests
on: web_log.response_statuses
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of bad_requests
calc: $this * 100 / $1m_requests
@ -69,9 +69,9 @@ component: Web log
template: 1m_internal_errors
on: web_log.response_statuses
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of server_errors
calc: $this * 100 / $1m_requests
@ -94,9 +94,9 @@ component: Web log
template: 1m_total_requests
on: web_log.response_codes
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned
calc: ($this == 0)?(1):($this)
@ -106,9 +106,9 @@ component: Web log
template: 1m_unmatched
on: web_log.response_codes
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of unmatched
calc: $this * 100 / $1m_total_requests
@ -131,9 +131,9 @@ component: Web log
template: 10m_response_time
on: web_log.response_time
class: System
class: Latency
type: System
component: Web log
type: Latency
families: *
lookup: average -10m unaligned of avg
units: ms
@ -142,9 +142,9 @@ component: Web log
template: web_slow
on: web_log.response_time
class: Web Server
class: Latency
type: Web Server
component: Web log
type: Latency
families: *
lookup: average -1m unaligned of avg
units: ms
@ -171,9 +171,9 @@ component: Web log
template: 5m_successful_old
on: web_log.response_statuses
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: average -5m at -5m unaligned of successful_requests
units: requests/s
@ -182,9 +182,9 @@ component: Web log
template: 5m_successful
on: web_log.response_statuses
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: average -5m unaligned of successful_requests
units: requests/s
@ -193,9 +193,9 @@ component: Web log
template: 5m_requests_ratio
on: web_log.response_codes
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
calc: ($5m_successful_old > 0)?($5m_successful * 100 / $5m_successful_old):(100)
units: %
@ -224,9 +224,9 @@ component: Web log
template: web_log_1m_total_requests
on: web_log.requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned
calc: ($this == 0)?(1):($this)
@ -236,9 +236,9 @@ component: Web log
template: web_log_1m_unmatched
on: web_log.excluded_requests
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of unmatched
calc: $this * 100 / $web_log_1m_total_requests
@ -261,9 +261,9 @@ component: Web log
template: web_log_1m_requests
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned
calc: ($this == 0)?(1):($this)
@ -273,9 +273,9 @@ component: Web log
template: web_log_1m_successful
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned of success
calc: $this * 100 / $web_log_1m_requests
@ -289,9 +289,9 @@ component: Web log
template: web_log_1m_redirects
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: sum -1m unaligned of redirect
calc: $this * 100 / $web_log_1m_requests
@ -304,9 +304,9 @@ component: Web log
template: web_log_1m_bad_requests
on: web_log.type_requests
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of bad
calc: $this * 100 / $web_log_1m_requests
@ -319,9 +319,9 @@ component: Web log
template: web_log_1m_internal_errors
on: web_log.type_requests
class: Web Server
class: Errors
type: Web Server
component: Web log
type: Errors
families: *
lookup: sum -1m unaligned of error
calc: $this * 100 / $web_log_1m_requests
@ -345,9 +345,9 @@ component: Web log
template: web_log_10m_response_time
on: web_log.request_processing_time
class: System
class: Latency
type: System
component: Web log
type: Latency
families: *
lookup: average -10m unaligned of avg
units: ms
@ -356,9 +356,9 @@ component: Web log
template: web_log_web_slow
on: web_log.request_processing_time
class: Web Server
class: Latency
type: Web Server
component: Web log
type: Latency
families: *
lookup: average -1m unaligned of avg
units: ms
@ -385,9 +385,9 @@ component: Web log
template: web_log_5m_successful_old
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: average -5m at -5m unaligned of success
units: requests/s
@ -396,9 +396,9 @@ component: Web log
template: web_log_5m_successful
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
lookup: average -5m unaligned of success
units: requests/s
@ -407,9 +407,9 @@ component: Web log
template: web_log_5m_requests_ratio
on: web_log.type_requests
class: Web Server
class: Workload
type: Web Server
component: Web log
type: Workload
families: *
calc: ($web_log_5m_successful_old > 0)?($web_log_5m_successful * 100 / $web_log_5m_successful_old):(100)
units: %

View File

@ -1,9 +1,9 @@
template: whoisquery_days_until_expiration
on: whoisquery.time_until_expiration
class: Other
class: Utilization
type: Other
component: WHOIS
type: Utilization
calc: $expiry
units: seconds
every: 60s

View File

@ -3,9 +3,9 @@
template: wmi_10min_cpu_usage
on: wmi.cpu_utilization_total
class: Windows
class: Utilization
type: Windows
component: CPU
type: Utilization
os: linux
hosts: *
lookup: average -10m unaligned match-names of dpc,user,privileged,interrupt
@ -22,9 +22,9 @@ component: CPU
template: wmi_ram_in_use
on: wmi.memory_utilization
class: Windows
class: Utilization
type: Windows
component: Memory
type: Utilization
os: linux
hosts: *
calc: ($used) * 100 / ($used + $available)
@ -38,9 +38,9 @@ component: Memory
template: wmi_swap_in_use
on: wmi.memory_swap_utilization
class: Windows
class: Utilization
type: Windows
component: Memory
type: Utilization
os: linux
hosts: *
calc: ($used) * 100 / ($used + $available)
@ -57,9 +57,9 @@ component: Memory
template: wmi_inbound_packets_discarded
on: wmi.net_discarded
class: Windows
class: Errors
type: Windows
component: Network
type: Errors
os: linux
hosts: *
families: *
@ -73,9 +73,9 @@ component: Network
template: wmi_outbound_packets_discarded
on: wmi.net_discarded
class: Windows
class: Errors
type: Windows
component: Network
type: Errors
os: linux
hosts: *
families: *
@ -89,9 +89,9 @@ component: Network
template: wmi_inbound_packets_errors
on: wmi.net_errors
class: Windows
class: Errors
type: Windows
component: Network
type: Errors
os: linux
hosts: *
families: *
@ -105,9 +105,9 @@ component: Network
template: wmi_outbound_packets_errors
on: wmi.net_errors
class: Windows
class: Errors
type: Windows
component: Network
type: Errors
os: linux
hosts: *
families: *
@ -124,9 +124,9 @@ component: Network
template: wmi_disk_in_use
on: wmi.logical_disk_utilization
class: Windows
class: Utilization
type: Windows
component: Disk
type: Utilization
os: linux
hosts: *
calc: ($used) * 100 / ($used + $free)

View File

@ -1,9 +1,9 @@
template: x509check_days_until_expiration
on: x509check.time_until_expiration
class: Certificates
class: Latency
type: Certificates
component: x509 certificates
type: Latency
calc: $expiry
units: seconds
every: 60s
@ -14,9 +14,9 @@ component: x509 certificates
template: x509check_revocation_status
on: x509check.revocation_status
class: Certificates
class: Errors
type: Certificates
component: x509 certificates
type: Errors
calc: $revoked
every: 60s
crit: $this != nan AND $this != 0

View File

@ -1,9 +1,9 @@
alarm: zfs_memory_throttle
on: zfs.memory_ops
class: System
class: Utilization
type: System
component: File system
type: Utilization
lookup: sum -10m unaligned absolute of throttled
units: events
every: 1m
@ -16,9 +16,9 @@ component: File system
template: zfs_pool_state_warn
on: zfspool.state
class: System
class: Errors
type: System
component: File system
type: Errors
calc: $degraded
units: boolean
every: 10s
@ -29,9 +29,9 @@ component: File system
template: zfs_pool_state_crit
on: zfspool.state
class: System
class: Errors
type: System
component: File system
type: Errors
calc: $faulted + $unavail
units: boolean
every: 10s