Reorg learn 021723 (#14556)

* Change titles of agent alert notifications

* Reintroduce netdata for iot

* Eliminate guides category, merge health config docs

* Rename setup to configuration

* Codacy fixes and move health config reference
Chris Akritidis 2023-02-17 12:20:08 -08:00 committed by GitHub
parent 851ce5a184
commit 1413b5bac3
62 changed files with 390 additions and 1208 deletions

View File

@ -6,7 +6,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/aclk/README.md"
sidebar_label: "Agent-Cloud link (ACLK)"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Agent-cloud link (ACLK)

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/claim/README.md
sidebar_label: "Connect Agent to Cloud"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Connect Agent to Cloud

View File

@ -4,7 +4,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/collectors/REFE
sidebar_label: "Collectors configuration"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Collectors configuration reference

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/database/README
sidebar_label: "Database"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Database

View File

@ -32,8 +32,6 @@ To enable ML on your Netdata Agent, you need to edit the `[ml]` section in your
At a minimum you just need to set `enabled = yes` to enable ML with default params. More details about configuration can be found in the [Netdata Agent ML docs](https://learn.netdata.cloud/docs/agent/ml#configuration).
**Note**: Follow [this guide](https://github.com/netdata/netdata/blob/master/docs/guides/step-by-step/step-04.md) if you are unfamiliar with making configuration changes in Netdata.
When you have finished your configuration, restart Netdata with a command like `sudo systemctl restart netdata` for the config changes to take effect. You can find more info on restarting Netdata [here](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md).
After a brief delay, you should see the number of `trained` dimensions start to increase on the "dimensions" chart of the "Anomaly Detection" menu on the Overview page. By default, the `minimum num samples to train = 3600` parameter means at least 1 hour of data is required to train the initial models, but you could set this to `900` if you want to train the initial models quicker, on less data. Over time, the models will retrain on up to `maximum num samples to train = 14400` (4 hours by default), but you could increase this if you want to train on more data.
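As a minimal sketch (using only the parameters mentioned above; see the linked ML docs for the full list of options), the relevant `netdata.conf` section could look like this:
```
[ml]
# enable machine learning with default params
enabled = yes
# require at least 1 hour of data before training the initial models
minimum num samples to train = 3600
# retrain over time on up to 4 hours of data
maximum num samples to train = 14400
```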

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/
sidebar_label: "Common configuration changes"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Common configuration changes

View File

@ -2,10 +2,10 @@
title: "Configure the Netdata Agent"
description: "Netdata is zero-configuration for most users, but complex infrastructures may require you to tweak some of the Agent's granular settings."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/nodes.md"
sidebar_label: "Setup"
sidebar_label: "Configuration"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
sidebar_position: 30
-->

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/configure/
sidebar_label: "Secure your nodes"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Secure your nodes

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/export/ena
sidebar_label: "Enable an exporting connector"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Enable an exporting connector

View File

@ -4,7 +4,7 @@ sidebar_label: "Monitor Nginx or Apache web server log files with Netdata"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/collect-apache-nginx-web-logs.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Miscellaneous"
-->
# Monitor Nginx or Apache web server log files with Netdata
@ -120,12 +120,5 @@ You can also edit this file directly with `edit-config`:
./edit-config health.d/weblog.conf
```
For more information about editing the defaults or writing new alarm entities, see our [health monitoring
documentation](https://github.com/netdata/netdata/blob/master/health/README.md).
## What's next?
Now that you have web log collection up and running, we recommend you take a look at the collector's [documentation](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md) for some ideas of how you can turn these rather "boring" logs into powerful real-time tools for keeping your servers happy.
Don't forget to give GitHub user [Wing924](https://github.com/Wing924) a big 👍 for his hard work in starting up the Go
refactoring effort.
For more information about editing the defaults or writing new alarm entities, see our
[health monitoring documentation](https://github.com/netdata/netdata/blob/master/health/README.md).

View File

@ -5,7 +5,7 @@ date: 2020-03-31
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/collect-unbound-metrics.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Miscellaneous"
-->
# Monitor Unbound DNS servers with Netdata

View File

@ -6,7 +6,7 @@ image: /img/seo/guides/configure/performance.png
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/configure/performance.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Configuration"
-->
# How to optimize the Netdata Agent's performance

View File

@ -1,162 +0,0 @@
<!--
title: "Netdata Longer Metrics Retention"
sidebar_label: "Netdata Longer Metrics Retention"
description: ""
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/longer-metrics-storage.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
-->
# Netdata Longer Metrics Retention
Metrics retention affects 3 aspects of a Netdata Agent's operation:
1. The disk space required to store the metrics.
2. The memory the Netdata Agent will require to have that retention available for queries.
3. The CPU resources that will be required to query longer time-frames.
As retention increases, the resources required to support that retention increase too.
Since Netdata Agents usually run at the edge, inside production systems, Netdata Agent **parents** should be considered. In a **parent - child** setup, the child (the Netdata Agent running on a production system) delegates all of its functions, including longer metrics retention and querying, to the parent node, which can dedicate more resources to this task. A single Netdata Agent parent can centralize multiple child Netdata Agents (dozens, hundreds, or even thousands, depending on its available resources).
## Ephemerality of metrics
The ephemerality of metrics plays an important role in retention. In environments where metrics stop being collected and new metrics are constantly being generated, we are interested in 2 parameters:
1. The **expected concurrent number of metrics** as an average for the lifetime of the database.
This affects mainly the storage requirements.
2. The **expected total number of unique metrics** for the lifetime of the database.
This affects mainly the memory requirements for having all these metrics indexed and available to be queried.
## Granularity of metrics
The granularity of metrics (the frequency at which they are collected and stored, i.e. their resolution) significantly affects retention.
Lowering the granularity from per second to every two seconds will double retention and halve the CPU requirements of the Netdata Agent, without affecting disk space or memory requirements.
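For example, a minimal sketch of the `netdata.conf` change that would lower the granularity to one collection every two seconds, using the same `[db]` section and `update every` setting shown in the examples below:
```
[db]
# collect and store data every 2 seconds instead of every second
update every = 2
```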
## Which database mode to use
Netdata Agents support multiple database modes.
The default mode `[db].mode = dbengine` has been designed to scale for longer retentions.
The other available database modes are designed to minimize resource utilization and should usually be considered on the child side of **parent - child** setups.
So,
* On a single-node setup, use `[db].mode = dbengine` to increase retention.
* On a **parent - child** setup, use `[db].mode = dbengine` on the parent to increase retention, and a more resource-efficient mode (like `save`, `ram` or `none`) on the child to minimize resource utilization.
To use `dbengine`, set this in `netdata.conf` (it is the default):
```
[db]
mode = dbengine
```
## Tiering
`dbengine` supports tiering. Tiering allows having up to 3 versions of the data:
1. Tier 0 is the high resolution data.
2. Tier 1 is the first tier that samples data every 60 data collections of Tier 0.
3. Tier 2 is the second tier that samples data every 3600 data collections of Tier 0 (60 of Tier 1).
To enable tiering set `[db].storage tiers` in `netdata.conf` (the default is 1, to enable only Tier 0):
```
[db]
mode = dbengine
storage tiers = 3
```
## Disk space requirements
Netdata Agents require about 1 byte on disk per database point on Tier 0, and 4 times more per point on the higher tiers (Tier 1 and 2). Higher tiers need 4 times more storage per point than Tier 0 because, for every point, they store `min`, `max`, `sum`, `count` and `anomaly rate` (5 values, but only 4 times the storage, since `count` and `anomaly rate` are 16-bit integers). The `average` is calculated on the fly at query time using `sum / count`.
### Tier 0 - per second for a week
For 2000 metrics, collected every second and retained for a week, Tier 0 needs: 1 byte x 2000 metrics x 3600 secs per hour x 24 hours per day x 7 days per week = 1100MB.
The setting to control this is in `netdata.conf`:
```
[db]
mode = dbengine
# per second data collection
update every = 1
# enable only Tier 0
storage tiers = 1
# Tier 0, per second data for a week
dbengine multihost disk space MB = 1100
```
By setting it to `1100` and restarting the Netdata Agent, this node will start maintaining about a week of data. But pay attention to the number of metrics. If you have more than 2000 metrics on a node, or you need more than a week of high-resolution metrics, you may need to adjust this setting accordingly.
### Tier 1 - per minute for a month
By default, Tier 1 samples the data every 60 points of Tier 0. If Tier 0 is per second, then Tier 1 is per minute.
Tier 1 needs 4 times more storage per point compared to Tier 0. So, for 2000 metrics, with per minute resolution, retained for a month, Tier 1 needs: 4 bytes x 2000 metrics x 60 minutes per hour x 24 hours per day x 30 days per month = 330MB.
Do this in `netdata.conf`:
```
[db]
mode = dbengine
# per second data collection
update every = 1
# enable only Tier 0 and Tier 1
storage tiers = 2
# Tier 0, per second data for a week
dbengine multihost disk space MB = 1100
# Tier 1, per minute data for a month
dbengine tier 1 multihost disk space MB = 330
```
Once `netdata.conf` is edited, the Netdata Agent needs to be restarted for the changes to take effect.
### Tier 2 - per hour for a year
By default, Tier 2 samples data every 3600 points of Tier 0 (60 of Tier 1). If Tier 0 is per second, then Tier 2 is per hour.
The storage requirements per point are the same as for Tier 1.
For 2000 metrics, with per hour resolution, retained for a year, Tier 2 needs: 4 bytes x 2000 metrics x 24 hours per day x 365 days per year = 67MB.
Do this in `netdata.conf`:
```
[db]
mode = dbengine
# per second data collection
update every = 1
# enable only Tier 0 and Tier 1
storage tiers = 3
# Tier 0, per second data for a week
dbengine multihost disk space MB = 1100
# Tier 1, per minute data for a month
dbengine tier 1 multihost disk space MB = 330
# Tier 2, per hour data for a year
dbengine tier 2 multihost disk space MB = 67
```
Once `netdata.conf` is edited, the Netdata Agent needs to be restarted for the changes to take effect.

View File

@ -4,7 +4,7 @@ sidebar_label: "Monitor CockroachDB metrics with Netdata"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-cockroachdb.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Miscellaneous"
-->
# Monitor CockroachDB metrics with Netdata
@ -29,7 +29,6 @@ Let's dive in and walk through the process of monitoring CockroachDB metrics wit
- [Configure the CockroachDB collector](#configure-the-cockroachdb-collector)
- [Manual setup for a local CockroachDB database](#manual-setup-for-a-local-cockroachdb-database)
- [Tweak CockroachDB alarms](#tweak-cockroachdb-alarms)
- [What's next?](#whats-next)
## Configure the CockroachDB collector
@ -117,23 +116,3 @@ cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /et
```
For more information about editing the defaults or writing new alarm entities, see our documentation on [configuring health alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md).
## What's next?
Now that you're collecting metrics from your CockroachDB databases, let us know how it's working for you! There's always
room for improvement or refinement based on real-world use cases. Feel free to [file an
issue](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml) with
your
thoughts.
Also, be sure to check out these useful resources:
- [Netdata's CockroachDB documentation](https://github.com/netdata/go.d.plugin/blob/master/modules/cockroachdb/README.md)
- [Netdata's CockroachDB configuration](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/cockroachdb.conf)
- [Netdata's CockroachDB alarms](https://github.com/netdata/netdata/blob/29d9b5e51603792ee27ef5a21f1de0ba8e130158/health/health.d/cockroachdb.conf)
- [CockroachDB homepage](https://www.cockroachlabs.com/product/)
- [CockroachDB documentation](https://www.cockroachlabs.com/docs/stable/)
- [`_status/vars` endpoint docs](https://www.cockroachlabs.com/docs/stable/monitoring-and-alerting.html#prometheus-endpoint)
- [Monitor CockroachDB with Prometheus](https://www.cockroachlabs.com/docs/stable/monitor-cockroachdb-with-prometheus.html)

View File

@ -4,7 +4,7 @@ sidebar_label: "Monitor a Hadoop cluster with Netdata"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-hadoop-cluster.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Miscellaneous"
-->
# Monitor a Hadoop cluster with Netdata
@ -188,20 +188,5 @@ sudo /etc/netdata/edit-config health.d/hdfs.conf
sudo /etc/netdata/edit-config health.d/zookeeper.conf
```
For more information about editing the defaults or writing new alarm entities, see our [health monitoring
documentation](https://github.com/netdata/netdata/blob/master/health/README.md).
## What's next?
If you're having issues with Netdata auto-detecting your HDFS/Zookeeper servers, or want to help improve how Netdata
collects or presents metrics from these services, feel free to [file an
issue](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml).
- Read up on the [HDFS configuration
file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/hdfs.conf) to understand how to configure
global options or per-job options, such as username/password, TLS certificates, timeouts, and more.
- Read up on the [Zookeeper configuration
file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/zookeeper.conf) to understand how to configure
global options or per-job options, timeouts, TLS certificates, and more.
For more information about editing the defaults or writing new alarm entities, see our
[health monitoring documentation](https://github.com/netdata/netdata/blob/master/health/README.md).

View File

@ -1,193 +0,0 @@
<!--
title: "Detect anomalies in systems and applications"
sidebar_label: "Detect anomalies in systems and applications"
description: "Detect anomalies in any system, container, or application in your infrastructure with machine learning and the open-source Netdata Agent."
image: /img/seo/guides/monitor/anomaly-detection.png
author: "Joel Hans"
author_title: "Editorial Director, Technical & Educational Resources"
author_img: "/img/authors/joel-hans.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/anomaly-detection-python.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
-->
# Detect anomalies in systems and applications
Beginning with v1.27, the [open-source Netdata Agent](https://github.com/netdata/netdata) is capable of unsupervised
[anomaly detection](https://en.wikipedia.org/wiki/Anomaly_detection) with machine learning (ML). As with all things
Netdata, the anomalies collector comes with preconfigured alarms and instant visualizations that require no query
languages or organizing metrics. You configure the collector to look at specific charts, and it handles the rest.
Netdata's implementation uses a handful of functions in the [Python Outlier Detection (PyOD)
library](https://github.com/yzhao062/pyod/tree/master), which periodically runs a `train` function that learns what
"normal" looks like on your node and creates an ML model for each chart, then utilizes the
[`predict_proba()`](https://pyod.readthedocs.io/en/latest/api_cc.html#pyod.models.base.BaseDetector.predict_proba) and
[`predict()`](https://pyod.readthedocs.io/en/latest/api_cc.html#pyod.models.base.BaseDetector.predict) PyOD functions to
quantify how anomalous certain charts are.
All these metrics and alarms are available for centralized monitoring in [Netdata Cloud](https://app.netdata.cloud). If
you choose to sign up for Netdata Cloud and [connect your nodes](https://github.com/netdata/netdata/blob/master/claim/README.md), you will have the ability to run
tailored anomaly detection on every node in your infrastructure, regardless of its purpose or workload.
In this guide, you'll learn how to set up the anomalies collector to instantly detect anomalies in an Nginx web server
and/or the node that hosts it, which will give you the tools to configure parallel unsupervised monitors for any
application in your infrastructure. Let's get started.
![Example anomaly detection with an Nginx web
server](https://user-images.githubusercontent.com/1153921/103586700-da5b0a00-4ea2-11eb-944e-46edd3f83e3a.png)
## Prerequisites
- A node running the Netdata Agent. If you don't yet have that, [install Netdata](https://github.com/netdata/netdata/blob/master/packaging/installer/README.md).
- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
- Familiarity with configuring the Netdata Agent with [`edit-config`](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md).
- _Optional_: An Nginx web server running on the same node to follow the example configuration steps.
## Install required Python packages
The anomalies collector uses a few Python packages, available with `pip3`, to run ML training. It requires
[`numba`](http://numba.pydata.org/), [`scikit-learn`](https://scikit-learn.org/stable/), and
[`pyod`](https://pyod.readthedocs.io/en/latest/), in addition to
[`netdata-pandas`](https://github.com/netdata/netdata-pandas), which is a package built by the Netdata team to pull data
from a Netdata Agent's API into a [Pandas](https://pandas.pydata.org/) DataFrame. Read more about `netdata-pandas` on its [package
repo](https://github.com/netdata/netdata-pandas) or in Netdata's [community
repo](https://github.com/netdata/community/tree/main/netdata-agent-api/netdata-pandas).
```bash
# Become the netdata user
sudo su -s /bin/bash netdata
# Install required packages for the netdata user
pip3 install --user netdata-pandas==0.0.38 numba==0.50.1 scikit-learn==0.23.2 pyod==0.8.3
```
> If the `pip3` command fails, you need to install it. For example, on an Ubuntu system, use `sudo apt install
> python3-pip`.
Use `exit` to become your normal user again.
## Enable the anomalies collector
Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory) and use `edit-config`
to open the `python.d.conf` file.
```bash
sudo ./edit-config python.d.conf
```
In the `python.d.conf` file, search for the `anomalies` line. If the line exists, set the value to `yes`. Add the line
yourself if it doesn't already exist. Either way, the final result should look like:
```conf
anomalies: yes
```
[Restart the Agent](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) with `sudo systemctl restart netdata`, or the [appropriate
method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start up the anomalies collector. By default, the
model training process runs every 30 minutes, and uses the previous 4 hours of metrics to establish a baseline for
health and performance across the default included charts.
> 💡 The anomalies collector may need 30-60 seconds to finish its initial training and have enough data to start
> generating anomaly scores. You may need to refresh your browser tab for the **Anomalies** section to appear in menus
> on both the local Agent dashboard and Netdata Cloud.
## Configure the anomalies collector
Open `python.d/anomalies.conf` with `edit-config`.
```bash
sudo ./edit-config python.d/anomalies.conf
```
The file contains many user-configurable settings with sane defaults. Here are some important settings that don't
involve tweaking the behavior of the ML training itself.
- `charts_regex`: Which charts to train models for and run anomaly detection on, with each chart getting a separate
model.
- `charts_to_exclude`: Specific charts, selected by the regex in `charts_regex`, to exclude.
- `train_every_n`: How often to train the ML models.
- `train_n_secs`: The number of historical observations to train each model on. The default is 4 hours, but if your node
doesn't have historical metrics going back that far, consider [changing the metrics retention
policy](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md) or reducing this window.
- `custom_models`: A way to define custom models that you want anomaly probabilities for, including multi-node or
streaming setups.
> ⚠️ Setting `charts_regex` with many charts or `train_n_secs` to a very large number will have an impact on the
> resources and time required to train a model for every chart. The actual performance implications depend on the
> resources available on your node. If you plan on changing these settings beyond the default, or what's mentioned in
> this guide, make incremental changes to observe the performance impact. Consider `train_max_n` to cap the number of
> observations actually used for training.
### Run anomaly detection on Nginx and log file metrics
As mentioned above, this guide uses an Nginx web server to demonstrate how the anomalies collector works. You must
configure the collector to monitor charts from the
[Nginx](https://github.com/netdata/go.d.plugin/blob/master/modules/nginx/README.md) and [web
log](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md) collectors.
`charts_regex` allows for some basic regex, such as wildcards (`*`) to match all contexts with a certain pattern. For
example, `system\..*` matches any chart with a context that begins with `system.` and ends in any number of other
characters (`.*`). Note the escape character (`\`) before the first period, which matches a literal period rather than
any character.
Change `charts_regex` in `anomalies.conf` to the following:
```conf
charts_regex: 'system\..*|nginx_local\..*|web_log_nginx\..*|apps.cpu|apps.mem'
```
This value tells the anomaly collector to train against every `system.` chart, every `nginx_local` chart, every
`web_log_nginx` chart, and specifically the `apps.cpu` and `apps.mem` charts.
![The anomalies collector chart with many
dimensions](https://user-images.githubusercontent.com/1153921/102813877-db5e4880-4386-11eb-8040-d7a1d7a476bb.png)
### Remove some metrics from anomaly detection
As you can see in the above screenshot, this node is now looking for anomalies in many places. The result is a single
`anomalies_local.probability` chart with more than twenty dimensions, some of which the dashboard hides at the bottom of
a scrollable area. In addition, training models and running anomaly detection on many charts might require more CPU
utilization than you're willing to give.
First, explicitly declare which `system.` charts to monitor, rather than all of them via the regex (`system\..*`).
```conf
charts_regex: 'system\.cpu|system\.load|system\.io|system\.net|system\.ram|nginx_local\..*|web_log_nginx\..*|apps.cpu|apps.mem'
```
Next, remove some charts with the `charts_to_exclude` setting. For this example, using an Nginx web server, focus on the
volume of requests/responses, not, for example, which type of 4xx response a user might receive.
```conf
charts_to_exclude: 'web_log_nginx.excluded_requests,web_log_nginx.responses_by_status_code_class,web_log_nginx.status_code_class_2xx_responses,web_log_nginx.status_code_class_4xx_responses,web_log_nginx.current_poll_uniq_clients,web_log_nginx.requests_by_http_method,web_log_nginx.requests_by_http_version,web_log_nginx.requests_by_ip_proto'
```
![The anomalies collector with fewer
dimensions](https://user-images.githubusercontent.com/1153921/102820642-d69f9180-4392-11eb-91c5-d3d166d40105.png)
Apply the ideas behind the collector's regex and exclude settings to any other
[system](https://github.com/netdata/netdata/blob/master/docs/collect/system-metrics.md), [container](https://github.com/netdata/netdata/blob/master/docs/collect/container-metrics.md), or
[application](https://github.com/netdata/netdata/blob/master/docs/collect/application-metrics.md) metrics you want to detect anomalies for.
## What's next?
Now that you know how to set up unsupervised anomaly detection in the Netdata Agent, using an Nginx web server as an
example, it's time to apply that knowledge to other mission-critical parts of your infrastructure. If you're not sure
what to monitor next, check out our list of [collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md) to see what kind of metrics Netdata
can collect from your systems, containers, and applications.
Keep on moving to [part 2](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/visualize-monitor-anomalies.md), which covers the charts and alarms
Netdata creates for unsupervised anomaly detection.
For a different troubleshooting experience, try out the [Metric
Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) feature in Netdata Cloud. Metric
Correlations helps you perform faster root cause analysis by narrowing a dashboard to only the charts most likely to be
related to an anomaly.
### Related reference documentation
- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)
- [Netdata Agent · Nginx collector](https://github.com/netdata/go.d.plugin/blob/master/modules/nginx/README.md)
- [Netdata Agent · web log collector](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md)
- [Netdata Cloud · Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md)

View File

@ -3,13 +3,9 @@ title: "Machine learning (ML) powered anomaly detection"
sidebar_label: "Machine learning (ML) powered anomaly detection"
description: "Detect anomalies in any system, container, or application in your infrastructure with machine learning and the open-source Netdata Agent."
image: /img/seo/guides/monitor/anomaly-detection.png
author: "Andrew Maguire"
author_title: "Analytics & ML Lead"
author_img: "/img/authors/andy-maguire.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/anomaly-detection.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Operations"
-->

View File

@ -1,185 +0,0 @@
<!--
title: "Use dimension templates to create dynamic alarms"
sidebar_label: "Use dimension templates to create dynamic alarms"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/dimension-templates.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
-->
# Use dimension templates to create dynamic alarms
Your ability to monitor the health of your systems and applications relies on your ability to create and maintain
the best set of alarms for your particular needs.
In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplify the process of
writing [alarm entities](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#health-entity-reference) for
charts with many dimensions.
Dimension templates can condense many individual entities into one—no more copy-pasting one entity and changing the
`alarm`/`template` and `lookup` lines for each dimension you'd like to monitor.
They are, however, an advanced health monitoring feature. For more basic instructions on creating your first alarm,
check out our [health monitoring documentation](https://github.com/netdata/netdata/blob/master/health/README.md), which also includes
[examples](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#example-alarms).
## The fundamentals of `foreach`
Our dimension templates update adds a new `foreach` parameter to the
existing [`lookup` line](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-lookup). This
is where the magic happens.
You use the `foreach` parameter to specify which dimensions you want to monitor with this single alarm. You can separate
them with a comma (`,`) or a pipe (`|`). You can also use
a [Netdata simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) to create
many alarms with a regex-like syntax.
The `foreach` parameter _has_ to be the last parameter in your `lookup` line, and if you have both `of` and `foreach` in
the same `lookup` line, Netdata will ignore the `of` parameter and use `foreach` instead.
Let's get into some examples so you can see how the new parameter works.
> ⚠️ The following entities are examples to showcase the functionality and syntax of dimension templates. They are not
> meant to be run as-is on production systems.
## Condensing entities with `foreach`
Let's say you want to monitor the `system`, `user`, and `nice` dimensions in your system's overall CPU utilization.
Before dimension templates, you would need the following three entities:
```yaml
alarm: cpu_system
on: system.cpu
lookup: average -10m percentage of system
every: 1m
warn: $this > 50
crit: $this > 80
alarm: cpu_user
on: system.cpu
lookup: average -10m percentage of user
every: 1m
warn: $this > 50
crit: $this > 80
alarm: cpu_nice
on: system.cpu
lookup: average -10m percentage of nice
every: 1m
warn: $this > 50
crit: $this > 80
```
With dimension templates, you can condense these into a single alarm. Take note of the `alarm` and `lookup` lines.
```yaml
alarm: cpu_template
on: system.cpu
lookup: average -10m percentage foreach system,user,nice
every: 1m
warn: $this > 50
crit: $this > 80
```
The `alarm` line specifies the naming scheme Netdata will use. You can use whatever naming scheme you'd like, with `.`
and `_` being the only allowed symbols.
The `lookup` line has changed from `of` to `foreach`, and we're now passing three dimensions.
In this example, Netdata will create three alarms with the names `cpu_template_system`, `cpu_template_user`, and
`cpu_template_nice`. Every minute, each alarm will use the same database query to calculate the average CPU usage for
the `system`, `user`, and `nice` dimensions over the last 10 minutes and send out alarms if necessary.
You can find these three alarms active by clicking on the **Alarms** button in the top navigation, and then clicking on
the **All** tab and scrolling to the **system - cpu** collapsible section.
![Three new alarms created from the dimension template](https://user-images.githubusercontent.com/1153921/66218994-29523800-e67f-11e9-9bcb-9bca23e2c554.png)
Let's look at some other examples of how `foreach` works so you can best apply it in your configurations.
### Using a Netdata simple pattern in `foreach`
In the last example, we used `foreach system,user,nice` to create three distinct alarms using dimension templates. But
what if you want to quickly create alarms for _all_ the dimensions of a given chart?
Use a [simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
(`*`).
Instead of monitoring system CPU usage, let's monitor per-application CPU usage using the `apps.cpu` chart. Passing a
wildcard as the simple pattern tells Netdata to create a separate alarm for _every_ process on your system:
```yaml
alarm: app_cpu
on: apps.cpu
lookup: average -10m percentage foreach *
every: 1m
warn: $this > 50
crit: $this > 80
```
This entity will now create alarms for every dimension in the `apps.cpu` chart. Given that most `apps.cpu` charts have
10 or more dimensions, using the wildcard ensures you catch every CPU-hogging process.
To learn more about how to use simple patterns with dimension templates, see
our [simple patterns documentation](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md).
## Using `foreach` with alarm templates
Dimension templates also work
with [alarm templates](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-alarm-or-template).
Alarm templates help you create alarms for all the charts with a given context—for example, all the cores of your
system's CPU.
By combining the two, you can create dozens of individual alarms with a single template entity. Here's how you would
create alarms for the `system`, `user`, and `nice` dimensions for every chart in the `cpu.cpu` context—or, in other
words, every CPU core.
```yaml
template: cpu_template
on: cpu.cpu
lookup: average -10m percentage foreach system,user,nice
every: 1m
warn: $this > 50
crit: $this > 80
```
On a system with a 6-core, 12-thread Ryzen 5 1600 CPU, this one entity creates alarms on the following charts and
dimensions:
- `cpu.cpu0`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- `cpu.cpu1`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- `cpu.cpu2`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- ...
- `cpu.cpu11`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
And here's how just a few of those dimension template-generated alarms look in the Netdata dashboard.
![A few of the created alarms in the Netdata dashboard](https://user-images.githubusercontent.com/1153921/66219669-708cf880-e680-11e9-8b3a-7bfe178fa28b.png)
All in all, this single entity creates 36 individual alarms. Much easier than writing 36 separate entities in your
health configuration files!
## What's next?
We hope you're excited about the possibilities of using dimension templates! Maybe they'll inspire you to build new
alarms that will help you better monitor the health of your systems.
Or, at the very least, simplify your configuration files.
For information about other advanced features in Netdata's health monitoring toolkit, check out
our [health documentation](https://github.com/netdata/netdata/blob/master/health/README.md). And if you have some cool
alarms you built using dimension templates, feel free to share them with the [Netdata community](https://community.netdata.cloud/).

View File

@ -3,13 +3,9 @@ title: "Kubernetes monitoring with Netdata: Overview and visualizations"
sidebar_label: "Kubernetes monitoring with Netdata: Overview and visualizations"
description: "Learn how to navigate Netdata's Kubernetes monitoring features for visualizing the health and performance of a Kubernetes cluster with per-second granularity."
image: /img/seo/guides/monitor/kubernetes-k8s-netdata.png
author: "Joel Hans"
author_title: "Editorial Director, Technical & Educational Resources"
author_img: "/img/authors/joel-hans.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/kubernetes-k8s-netdata.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Miscellaneous"
-->
# Kubernetes monitoring with Netdata: Overview and visualizations

View File

@ -3,13 +3,9 @@ title: "LAMP stack monitoring (Linux, Apache, MySQL, PHP) with Netdata"
sidebar_label: "LAMP stack monitoring (Linux, Apache, MySQL, PHP) with Netdata"
description: "Set up robust LAMP stack monitoring (Linux, Apache, MySQL, PHP) in just a few minutes using a free, open-source monitoring tool that collects metrics every second."
image: /img/seo/guides/monitor/lamp-stack.png
author: "Joel Hans"
author_title: "Editorial Director, Technical & Educational Resources"
author_img: "/img/authors/joel-hans.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/lamp-stack.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Miscellaneous"
-->
import { OneLineInstallWget } from '@site/src/components/OneLineInstall/'

View File

@ -5,8 +5,7 @@ description: "Monitor Pi-hole metrics, plus Raspberry Pi system metrics, in minu
image: /img/seo/guides/monitor/netdata-pi-hole-raspberry-pi.png
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/pi-hole-raspberry-pi.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Miscellaneous"
-->
import { OneLineInstallWget } from '@site/src/components/OneLineInstall/'

View File

@ -5,8 +5,7 @@ description: "Tap into Netdata's powerful collectors, with per-second utilizatio
image: /img/seo/guides/monitor/process.png
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/process.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Operations"
-->
# Monitor any process in real-time with Netdata
@ -268,45 +267,4 @@ relevant data.
`ebpf.plugin` visualizes additional eBPF metrics, which are system-wide and not per-process, under the **eBPF** section.
## What's next?
Now that you have `apps_groups.conf` configured correctly, and know where to find per-process visualizations throughout
Netdata's ecosystem, you can precisely monitor the health and performance of any process on your node using per-second
metrics.
For even more in-depth troubleshooting, see our guide
on [monitoring and debugging applications with eBPF](https://github.com/netdata/netdata/blob/master/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md).
If the process you're monitoring also has
a [supported collector](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md), now is a great time to
set
that up if it wasn't autodetected. With both process utilization and application-specific metrics, you should have every
piece of data needed to discover the root cause of an incident. See
the [collectors configuration reference](https://github.com/netdata/netdata/blob/master/collectors/REFERENCE.md) documentation for details.
[Create new dashboards](https://github.com/netdata/netdata/blob/master/docs/visualize/create-dashboards.md) in Netdata
Cloud using charts from `apps.plugin`,
`ebpf.plugin`, and application-specific collectors to build targeted dashboards for monitoring key processes across your
infrastructure.
Try
running [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md)
on a node that's running the process(es) you're monitoring. Even if nothing is going wrong at the moment, Netdata
Cloud's embedded intelligence helps you better understand how a MySQL database, for example, might influence a system's
volume of memory page faults. And when an incident is afoot, use Metric Correlations to reduce mean time to resolution (
MTTR) and cognitive load.
If you want more specific metrics from your custom application, check out
Netdata's [statsd support](https://github.com/netdata/netdata/blob/master/collectors/statsd.plugin/README.md). With statsd, you can send detailed metrics from your
application to Netdata and visualize them with per-second granularity. Netdata's statsd collector works with dozens of
[statsd server implementations](https://github.com/etsy/statsd/wiki#client-implementations), which work with most application
frameworks.
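As a purely illustrative sketch (the metric name `myapp.requests` is hypothetical, and this assumes Netdata's statsd collector is listening on its default UDP port 8125), sending a single counter increment from a shell could look like this:
```bash
# send one increment of a hypothetical counter metric to a local Netdata statsd listener
echo "myapp.requests:1|c" | nc -u -w 1 localhost 8125
```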
### Related reference documentation
- [Netdata Agent · `apps.plugin`](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md)
- [Netdata Agent · `ebpf.plugin`](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md)
- [Netdata Agent · Dashboards](https://github.com/netdata/netdata/blob/master/web/README.md#dimensions)
- [Netdata Agent · MySQL collector](https://github.com/netdata/go.d.plugin/blob/master/modules/mysql/README.md)

View File

@ -3,13 +3,9 @@ title: "Unsupervised anomaly detection for Raspberry Pi monitoring"
sidebar_label: "Unsupervised anomaly detection for Raspberry Pi monitoring"
description: "Use a low-overhead machine learning algorithm and an open-source monitoring tool to detect anomalous metrics on a Raspberry Pi."
image: /img/seo/guides/monitor/raspberry-pi-anomaly-detection.png
author: "Andy Maguire"
author_title: "Senior Machine Learning Engineer"
author_img: "/img/authors/andy-maguire.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/raspberry-pi-anomaly-detection.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
learn_rel_path: "Miscellaneous"
-->
We love IoT and edge at Netdata, and we also love machine learning. Even better if we can combine the two to ease the pain
@ -104,26 +100,4 @@ during training. By default, the anomalies collector, along with all other runni
![RAM utilization of anomaly detection on the Raspberry
Pi](https://user-images.githubusercontent.com/1153921/110149720-9e0d3280-7d9b-11eb-883d-b1d4d9b9b5e1.png)
## What's next?
So, all in all, with a little bit of extra setup and a small overhead on the Pi itself, the anomalies collector
looks like a potentially useful addition to enable unsupervised anomaly detection on your Pi.
See our two-part guide series for a more complete picture of configuring the anomalies collector, plus some best
practices on using the charts it automatically generates:
- [_Detect anomalies in systems and applications_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md)
- [_Monitor and visualize anomalies with Netdata_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/visualize-monitor-anomalies.md)
If you're using your Raspberry Pi for other purposes, like blocking ads/trackers with Pi-hole, check out our companion
Pi guide: [_Monitor Pi-hole (and a Raspberry Pi) with Netdata_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/pi-hole-raspberry-pi.md).
Once you've had a chance to give unsupervised anomaly detection a go, share your use cases and let us know of any
feedback on our [community forum](https://community.netdata.cloud/t/anomalies-collector-feedback-megathread/767).
### Related reference documentation
- [Netdata Agent · Get Netdata](https://github.com/netdata/netdata/blob/master/packaging/installer/README.md)
- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)

View File

@ -1,94 +0,0 @@
<!--
title: "Stop notifications for individual alarms"
sidebar_label: "Stop notifications for individual alarms"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/stop-notifications-alarms.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
-->
# Stop notifications for individual alarms
In this short tutorial, you'll learn how to stop notifications for individual alarms in Netdata's health
monitoring system. We also refer to this process as _silencing_ the alarm.
Why silence alarms? We designed Netdata's pre-configured alarms for production systems, so they might not be
relevant if you run Netdata on your laptop or a small virtual server. If they're not helpful, they can be a distraction
to real issues with health and performance.
Silencing individual alarms is an excellent solution for situations where you're not interested in seeing a specific
alarm but don't want to disable a [notification system](https://github.com/netdata/netdata/blob/master/health/notifications/README.md) entirely.
## Find the alarm configuration file
To silence an alarm, you need to know where to find its configuration file.
Let's use the `system.cpu` chart as an example. It's the first chart you'll see on most Netdata dashboards.
To figure out which file you need to edit, open up Netdata's dashboard and click the **Alarms** button at the top
of the dashboard, followed by clicking on the **All** tab.
In this example, we're looking for the `system - cpu` entity, which, when opened, looks like this:
![The system - cpu alarm
entity](https://user-images.githubusercontent.com/1153921/67034648-ebb4cc80-f0cc-11e9-9d49-1023629924f5.png)
In the `source` row, you see that this chart is getting its configuration from
`4@/usr/lib/netdata/conf.d/health.d/cpu.conf`. The relevant part begins at `health.d`: `health.d/cpu.conf`. That's
the file you need to edit if you want to silence this alarm.
For more information about editing or referencing health configuration files on your system, see the [Configure health alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md#edit-health-configuration-files) and the [Health configuration reference](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md) sections of our documentation.
## Edit the file to enable silencing
To edit `health.d/cpu.conf`, use `edit-config` from inside of your Netdata configuration directory.
```bash
cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /etc/netdata/
./edit-config health.d/cpu.conf
```
> You may need to use `sudo` or another method of elevating your privileges.
The beginning of the file looks like this:
```yaml
template: 10min_cpu_usage
on: system.cpu
os: linux
hosts: *
lookup: average -10m unaligned of user,system,softirq,irq,guest
units: %
every: 1m
warn: $this > (($status >= $WARNING) ? (75) : (85))
crit: $this > (($status == $CRITICAL) ? (85) : (95))
delay: down 15m multiplier 1.5 max 1h
info: average cpu utilization for the last 10 minutes (excluding iowait, nice and steal)
to: sysadmin
```
To silence this alarm, change `sysadmin` to `silent`.
```yaml
to: silent
```
Use one of the available [methods](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md#reload-health-configuration) to reload your health configuration and ensure you get no more notifications about that alarm.
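For example, one of those methods, assuming the `netdatacli` utility that ships with the Agent is available on your system:
```bash
# reload only the health configuration, without restarting the whole Agent
sudo netdatacli reload-health
```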
You can add `to: silent` to any alarm whose notifications you'd rather not receive.
## What's next?
You should now know the fundamentals behind silencing any individual alarm in Netdata.
To learn about _all_ of Netdata's health configuration possibilities, visit the [health reference
guide](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md), or check out other [tutorials on health monitoring](https://github.com/netdata/netdata/blob/master/health/README.md#guides).
Or, take better control over how you get notified about alarms via the [notification
system](https://github.com/netdata/netdata/blob/master/health/notifications/README.md).
You can also use Netdata's [Health Management API](https://github.com/netdata/netdata/blob/master/web/api/health/README.md#health-management-api) to control health
checks and notifications while Netdata runs. With this API, you can disable health checks during a maintenance window or
backup process, for example.

View File

@ -1,146 +0,0 @@
<!--
title: "Monitor and visualize anomalies with Netdata (part 2)"
sidebar_label: "Monitor and visualize anomalies with Netdata (part 2)"
description: "Using unsupervised anomaly detection and machine learning, get notified "
image: /img/seo/guides/monitor/visualize-monitor-anomalies.png
author: "Joel Hans"
author_title: "Editorial Director, Technical & Educational Resources"
author_img: "/img/authors/joel-hans.jpg"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/visualize-monitor-anomalies.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Monitor"
-->
Welcome to part 2 of our series of guides on using _unsupervised anomaly detection_ to detect issues with your systems,
containers, and applications using the open-source Netdata Agent. For an introduction to detecting anomalies and
monitoring associated metrics, see [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md), which covers prerequisites and
configuration basics.
With anomaly detection in the Netdata Agent set up, you will now want to visualize and monitor which charts have
anomalous data, when, and where to look next.
> 💡 In certain cases, the anomalies collector doesn't start immediately after restarting the Netdata Agent. If this
> happens, you won't see the dashboard section or the relevant [charts](#visualize-anomalies-in-charts) right away. Wait
> a minute or two, refresh, and look again. If the anomalies charts and alarms are still not present, investigate the
> error log with `less /var/log/netdata/error.log | grep anomalies`.
## Test anomaly detection
Time to see the Netdata Agent's unsupervised anomaly detection in action. To trigger anomalies on the Nginx web server,
use `ab`, otherwise known as [Apache Bench](https://httpd.apache.org/docs/2.4/programs/ab.html). Despite its name, it
works just as well with Nginx web servers. Install it on Ubuntu/Debian systems with `sudo apt install apache2-utils`.
> 💡 If you haven't followed the guide's example of using Nginx, an easy way to test anomaly detection on your node is
> to use the `stress-ng` command, which is available on most Linux distributions. Run `stress-ng --cpu 0` to create CPU
> stress or `stress-ng --vm 0` for RAM stress. Each test will cause some "collateral damage," in that you may see CPU
> utilization rise when running the RAM test, and vice versa.
The following test creates a minimum of 10,000,000 requests for Nginx to handle, with a maximum of 10 at any given time,
with a run time of 60 seconds. If your system can handle those 10,000,000 in less than 60 seconds, `ab` will keep
sending requests until the timer runs out.
```bash
ab -k -c 10 -t 60 -n 10000000 http://127.0.0.1/
```
Let's see how Netdata detects this anomalous behavior and propagates information to you through preconfigured alarms and
dashboards that automatically organize anomaly detection metrics into meaningful charts to help you begin root cause
analysis (RCA).
## Monitor anomalies with alarms
The anomalies collector creates two "classes" of alarms for each chart captured by the `charts_regex` setting. All these
alarms are preconfigured based on your [configuration in
`anomalies.conf`](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector). With the `charts_regex`
and `charts_to_exclude` settings from [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md) of this guide series, the
Netdata Agent creates 32 alarms driven by unsupervised anomaly detection.
The first class triggers warning alarms when the average anomaly probability for a given chart has stayed above 50% for
at least the last two minutes.
![An example anomaly probability
alarm](https://user-images.githubusercontent.com/1153921/104225767-0a0a9480-5404-11eb-9bfd-e29592397203.png)
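As a rough, purely illustrative sketch of what an entity of this kind could look like in Netdata's health configuration syntax (the name, lookup, and threshold here are assumptions, not the collector's shipped `health.d/anomalies.conf`):
```yaml
template: anomalies_probability_sketch
on: anomalies.probability
lookup: average -2m foreach *
every: 1m
warn: $this > 50
```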
The second class triggers warning alarms when the number of anomalies in the last two minutes hits 10 or higher.
![An example anomaly count
alarm](https://user-images.githubusercontent.com/1153921/104225769-0aa32b00-5404-11eb-95f3-7309f9429fe1.png)
If you see either of these alarms in Netdata Cloud, the local Agent dashboard, or on your preferred notification
platform, it's a safe bet that the node's current metrics have deviated from normal. That doesn't necessarily mean
there's a full-blown incident, depending on what application/service you're using anomaly detection on, but it's worth
further investigation.
As you use the anomalies collector, you may find that the default settings provide too many or too few genuine alarms.
In this case, [configure the alarm](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md) with `sudo ./edit-config
health.d/anomalies.conf`. Take a look at the `lookup` line syntax in the [health
reference](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-lookup) to understand how the anomalies collector automatically creates
alarms for any dimension on the `anomalies_local.probability` and `anomalies_local.anomaly` charts.
## Visualize anomalies in charts
In either [Netdata Cloud](https://app.netdata.cloud) or the local Agent dashboard at `http://NODE:19999`, click on the
**Anomalies** [section](https://github.com/netdata/netdata/blob/master/web/gui/README.md#sections) to see the pair of anomaly detection charts, which are
preconfigured to visualize per-second anomaly metrics based on your [configuration in
`anomalies.conf`](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector).
These charts have the contexts `anomalies.probability` and `anomalies.anomaly`. Together, these charts
create meaningful visualizations for immediately recognizing not only that something is going wrong on your node, but
also where to look next.
The `anomalies_local.probability` chart shows the probability that the latest observed data is anomalous, based on the
trained model. The `anomalies_local.anomaly` chart visualizes 0&rarr;1 predictions based on whether the latest observed
data is anomalous based on the trained model. Both charts share the same dimensions, which you configured via
`charts_regex` and `charts_to_exclude` in [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md).
In other words, the `probability` chart shows the amplitude of the anomaly, whereas the `anomaly` chart provides quick
yes/no context.
![Two charts created by the anomalies
collector](https://user-images.githubusercontent.com/1153921/104226380-ef84eb00-5404-11eb-9faf-9e64c43b95ff.png)
Before `08:32:00`, both charts show little in the way of verified anomalies. Based on the metrics the anomalies
collector has trained on, a certain percentage of anomaly probability score is normal, as seen in the
`web_log_nginx_requests_prob` dimension and a few others. What you're looking for is large deviations from the "noise"
in the `anomalies.probability` chart, or any increments to the `anomalies.anomaly` chart.
Unsurprisingly, the stress test that began at `08:32:00` caused significant changes to these charts. The three
dimensions that immediately shot to 100% anomaly probability, and remained there during the test, were
`web_log_nginx.requests_prob`, `nginx_local.connections_accepted_handled_prob`, and `system.cpu_pressure_prob`.
## Build an anomaly detection dashboard
[Netdata Cloud](https://app.netdata.cloud) features a drag-and-drop [dashboard
editor](https://github.com/netdata/netdata/blob/master/docs/visualize/create-dashboards.md) that helps you create entirely new dashboards with charts targeted for
your specific applications.
For example, here's a dashboard designed for visualizing anomalies present in an Nginx web server, including
documentation about why the dashboard exists and where to look next based on what you're seeing:
![An example anomaly detection
dashboard](https://user-images.githubusercontent.com/1153921/104226915-c6188f00-5405-11eb-9bb4-559a18016fa7.png)
Use the anomaly charts for instant visual identification of potential anomalies, then use the Nginx-specific charts in the
right column to validate whether the probability and anomaly counters are showing a real incident worth further
investigation. If they are, use [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) to narrow
the dashboard down to only the charts relevant to what you're seeing from the anomalies collector.
## What's next?
Between this guide and [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md), which covered setup and configuration, you
now have a fundamental understanding of how unsupervised anomaly detection in Netdata works, from root cause to alarms
to preconfigured or custom dashboards.
We'd love to hear your feedback on the anomalies collector. Hop over to the [community
forum](https://community.netdata.cloud/t/anomalies-collector-feedback-megathread/767), and let us know if you're already getting value from
unsupervised anomaly detection, or would like to see something added to it. You might even post a custom configuration
that works well for monitoring some other popular application, like MySQL, PostgreSQL, Redis, or anything else we
[support through collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md).
### Related reference documentation
- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)
- [Netdata Cloud · Build new dashboards](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/dashboards.md)

View File

@ -5,8 +5,7 @@ description: "Use Netdata's built-in eBPF metrics collector to monitor, troubles
image: /img/seo/guides/troubleshoot/monitor-debug-applications-ebpf.png
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Troubleshoot"
learn_rel_path: "Operations"
-->
# Monitor, troubleshoot, and debug applications with eBPF metrics
@ -251,23 +250,5 @@ systems.
Now that you can see eBPF metrics in Netdata Cloud, you can [invite your
team](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/invite-your-team.md) and share your findings with others.
## What's next?
Debugging and troubleshooting an application takes a special combination of practice, experience, and sheer luck. With
Netdata's eBPF metrics to back you up, you can rest assured that you see every minute detail of how your application
interacts with the Linux kernel.
If you're still trying to wrap your head around what we offer, be sure to read up on our accompanying documentation and
other resources on eBPF monitoring with Netdata:
- [eBPF collector](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md)
- [eBPF's integration with `apps.plugin`](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md#integration-with-ebpf)
- [Linux eBPF monitoring with Netdata](https://www.netdata.cloud/blog/linux-ebpf-monitoring-with-netdata/)
The scenarios described above are just the beginning when it comes to troubleshooting with eBPF metrics. We're excited
to explore others and see what our community dreams up. If you have other use cases, whether simulated or real-world,
we'd love to hear them: [info@netdata.cloud](mailto:info@netdata.cloud).
Happy troubleshooting!

View File

@ -5,7 +5,7 @@ description: "A simple guide to troubleshoot occurrences where the Agent is show
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides/Troubleshoot"
learn_rel_path: "Operations"
-->
# Troubleshoot Agent-Cloud connectivity issues

View File

@ -3,8 +3,7 @@ title: "Use host labels to organize systems, metrics, and alarms"
sidebar_label: "Use host labels to organize systems, metrics, and alarms"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/using-host-labels.md
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Guides"
learn_rel_path: "Operations"
-->
# Use host labels to organize systems, metrics, and alarms
@ -189,28 +188,4 @@ send automatic labels = yes
By applying labels to exported metrics, you can more easily parse historical metrics with the labels applied. To learn
more about exporting, read the [documentation](https://github.com/netdata/netdata/blob/master/exporting/README.md).
## What's next?
Host labels are a brand-new feature to Netdata, and yet they've already propagated deeply into some of its core
functionality. We're just getting started with labels, and will keep the community apprised of additional functionality
as it's made available. You can also track [issue #6503](https://github.com/netdata/netdata/issues/6503), which is where
the Netdata team first kicked off this work.
It should be noted that while the Netdata dashboard does not expose either user-configured or automatic host labels, API
queries _do_ showcase this information. As always, we recommend you secure Netdata
- [Expose Netdata only in a private LAN](https://github.com/netdata/netdata/blob/master/docs/netdata-security.md#expose-netdata-only-in-a-private-lan)
- [Enable TLS/SSL for web/API requests](https://github.com/netdata/netdata/blob/master/web/server/README.md#enabling-tls-support)
- Put Netdata behind a proxy
- [Use an authenticating web server in proxy
mode](https://github.com/netdata/netdata/blob/master/docs/netdata-security.md#use-an-authenticating-web-server-in-proxy-mode)
- [Nginx proxy](https://github.com/netdata/netdata/blob/master/docs/Running-behind-nginx.md)
- [Apache proxy](https://github.com/netdata/netdata/blob/master/docs/Running-behind-apache.md)
- [Lighttpd](https://github.com/netdata/netdata/blob/master/docs/Running-behind-lighttpd.md)
- [Caddy](https://github.com/netdata/netdata/blob/master/docs/Running-behind-caddy.md)
If you have issues or questions around using host labels, don't hesitate to [file an
issue](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml) on GitHub. We're
excited to make host labels even more valuable to our users, which we can only do with your input.

View File

@ -9,7 +9,7 @@ custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/metrics-st
sidebar_label: "Enable streaming between nodes"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Enable streaming between nodes

View File

@ -1,152 +0,0 @@
<!--
title: "Configure health alarms"
description: "Netdata's health monitoring watchdog is incredibly adaptable to your infrastructure's unique needs, with configurable health alarms."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/monitor/configure-alarms.md"
sidebar_label: "Configure health alarms"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
-->
# Configure health alarms
Netdata's health watchdog is highly configurable, with support for dynamic thresholds, hysteresis, alarm templates, and
more. You can tweak any of the existing alarms based on your infrastructure's topology or specific monitoring needs, or
create new entities.
You can use health alarms in conjunction with any of Netdata's [collectors](https://github.com/netdata/netdata/blob/master/collectors/README.md) (see
the [supported collector list](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md)) to monitor the health of your systems, containers, and
applications in real time.
While you can see active alarms both on the local dashboard and Netdata Cloud, all health alarms are configured _per
node_ via individual Netdata Agents. If you want to deploy a new alarm across your
[infrastructure](https://github.com/netdata/netdata/blob/master/docs/quickstart/infrastructure.md), you must configure each node with the same health configuration
files.
## Edit health configuration files
All of Netdata's [health configuration files](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#health-configuration-files) are in Netdata's config
directory, inside the `health.d/` directory. Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md) and
use `edit-config` to make changes to any of these files.
For example, to edit the `cpu.conf` health configuration file, run:
```bash
sudo ./edit-config health.d/cpu.conf
```
Each health configuration file contains one or more health _entities_, which always begin with `alarm:` or `template:`.
For example, here is the first health entity in `health.d/cpu.conf`:
```yaml
template: 10min_cpu_usage
on: system.cpu
os: linux
hosts: *
lookup: average -10m unaligned of user,system,softirq,irq,guest
units: %
every: 1m
warn: $this > (($status >= $WARNING) ? (75) : (85))
crit: $this > (($status == $CRITICAL) ? (85) : (95))
delay: down 15m multiplier 1.5 max 1h
info: average cpu utilization for the last 10 minutes (excluding iowait, nice and steal)
to: sysadmin
```
To tune this alarm to trigger warning and critical alarms at a lower CPU utilization, change the `warn` and `crit` lines
to the values of your choosing. For example:
```yaml
warn: $this > (($status >= $WARNING) ? (60) : (75))
crit: $this > (($status == $CRITICAL) ? (75) : (85))
```
Save the file and [reload Netdata's health configuration](#reload-health-configuration) to make your changes live.
### Silence an individual alarm
Instead of disabling an alarm altogether, or even disabling _all_ alarms, you can silence individual alarms by changing
one line in a given health entity. To silence any single alarm, change the `to:` line in its entity to `silent`.
```yaml
to: silent
```
## Write a new health entity
While tuning existing alarms may work in some cases, you may need to write entirely new health entities based on how
your systems, containers, and applications work.
Read Netdata's [health reference](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#health-entity-reference) for a full listing of the format,
syntax, and functionality of health entities.
To write a new health entity into a new file, navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md),
then use `touch` to create a new file in the `health.d/` directory. Use `edit-config` to start editing the file.
As an example, let's create a `ram-usage.conf` file.
```bash
sudo touch health.d/ram-usage.conf
sudo ./edit-config health.d/ram-usage.conf
```
For example, here is a health entity that triggers a warning alarm when a node's RAM usage rises above 80%, and a
critical alarm above 90%:
```yaml
alarm: ram_usage
on: system.ram
lookup: average -1m percentage of used
units: %
every: 1m
warn: $this > 80
crit: $this > 90
info: The percentage of RAM being used by the system.
```
Let's look into each of the lines to see how they create a working health entity.
- `alarm`: The name for your new entity. The name needs to follow these requirements:
- Any alphabet letter or number.
- The symbols `.` and `_`.
- Cannot be `chart name`, `dimension name`, `family name`, or `chart variable names`.
- `on`: Which chart the entity listens to.
- `lookup`: Which metrics the alarm monitors, the duration of time to monitor, and how to process the metrics into a
usable format.
- `average`: Calculate the average of all the metrics collected.
- `-1m`: Use metrics from 1 minute ago until now to calculate that average.
- `percentage`: Clarify that we're calculating a percentage of RAM usage.
- `of used`: Specify which dimension (`used`) on the `system.ram` chart you want to monitor with this entity.
- `units`: Use percentages rather than absolute units.
- `every`: How often to perform the `lookup` calculation to decide whether or not to trigger this alarm.
- `warn`/`crit`: The value at which Netdata should trigger a warning or critical alarm. This example uses simple
syntax, but most pre-configured health entities use
[hysteresis](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#special-use-of-the-conditional-operator) to avoid superfluous notifications.
- `info`: A description of the alarm, which will appear in the dashboard and notifications.
In human-readable format:
> This health entity, named **ram_usage**, watches the **system.ram** chart. It looks up the last **1 minute** of
> metrics from the **used** dimension and calculates the **average** of all those metrics in a **percentage** format,
> using a **% unit**. The entity performs this lookup **every minute**.
>
> If the average RAM usage percentage over the last 1 minute is **more than 80%**, the entity triggers a warning alarm.
> If the usage is **more than 90%**, the entity triggers a critical alarm.
When you finish writing this new health entity, [reload Netdata's health configuration](#reload-health-configuration) to
see it live on the local dashboard or Netdata Cloud.
## Reload health configuration
To make any changes to your health configuration live, you must reload Netdata's health monitoring system. To do that
without restarting all of Netdata, run `netdatacli reload-health` or `killall -USR2 netdata`.
## What's next?
With your health entities configured properly, it's time to [enable
notifications](https://github.com/netdata/netdata/blob/master/docs/monitor/enable-notifications.md) to get notified whenever a node reaches a warning or critical
state.
To build complex, dynamic alarms, read our guide on [dimension templates](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/dimension-templates.md).

View File

@ -2,9 +2,8 @@
title: "Netdata for IoT"
custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/netdata-for-IoT.md
sidebar_label: "Netdata for IoT"
learn_status: "Unpublished"
learn_topic_type: "References"
learn_rel_path: "Guides"
learn_status: "Published"
learn_rel_path: "Miscellaneous"
-->
# Netdata for IoT

View File

@ -4,7 +4,7 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/netdata-sec
sidebar_label: "Security Design"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
sidebar_position: 20
-->

View File

@ -5,7 +5,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/store/chan
sidebar_label: "Change how long Netdata stores metrics"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Change how long Netdata stores metrics

View File

@ -1,33 +1,160 @@
<!--
title: "Health configuration reference"
title: "Configure agent alerts"
sidebar_label: "Health"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/REFERENCE.md"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Operations/Alerts"
learn_rel_path: "Configuration"
-->
# Health configuration reference
# Configure agent alerts
Welcome to the health configuration reference.
Netdata's health watchdog is highly configurable, with support for dynamic thresholds, hysteresis, alarm templates, and
more. You can tweak any of the existing alarms based on your infrastructure's topology or specific monitoring needs, or
create new entities.
This guide contains information about editing health configuration files to tweak existing alarms or create new health
entities that are customized to the needs of your infrastructure.
You can use health alarms in conjunction with any of Netdata's [collectors](https://github.com/netdata/netdata/blob/master/collectors/README.md) (see
the [supported collector list](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md)) to monitor the health of your systems, containers, and
applications in real time.
To learn the basics of locating and editing health configuration files, see the [Configure health alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md) documentation.
While you can see active alarms both on the local dashboard and Netdata Cloud, all health alarms are configured _per
node_ via individual Netdata Agents. If you want to deploy a new alarm across your
[infrastructure](https://github.com/netdata/netdata/blob/master/docs/quickstart/infrastructure.md), you must configure each node with the same health configuration
files.
## Health configuration files
## Edit health configuration files
You can configure the Agent's health watchdog service by editing files in two locations:
- The `[health]` section in `netdata.conf`. By editing the daemon's behavior, you can disable health monitoring
altogether, run health checks more or less often, and more. See [daemon
configuration](https://github.com/netdata/netdata/blob/master/daemon/config/README.md#health-section-options) for a table of all the available settings, their
default values, and what they control.
altogether, run health checks more or less often, and more. See
[daemon configuration](https://github.com/netdata/netdata/blob/master/daemon/config/README.md#health-section-options) for a table of
all the available settings, their default values, and what they control.
- The individual `.conf` files in `health.d/`. These health entity files are organized by the type of metric they are
performing calculations on or their associated collector. You should edit these files using the `edit-config`
script. For example: `sudo ./edit-config health.d/cpu.conf`.
Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md) and
use `edit-config` to make changes to any of these files.
For example, to edit the `cpu.conf` health configuration file, run:
```bash
sudo ./edit-config health.d/cpu.conf
```
Each health configuration file contains one or more health _entities_, which always begin with `alarm:` or `template:`.
For example, here is the first health entity in `health.d/cpu.conf`:
```yaml
template: 10min_cpu_usage
on: system.cpu
os: linux
hosts: *
lookup: average -10m unaligned of user,system,softirq,irq,guest
units: %
every: 1m
warn: $this > (($status >= $WARNING) ? (75) : (85))
crit: $this > (($status == $CRITICAL) ? (85) : (95))
delay: down 15m multiplier 1.5 max 1h
info: average cpu utilization for the last 10 minutes (excluding iowait, nice and steal)
to: sysadmin
```
To tune this alarm to trigger warning and critical alarms at a lower CPU utilization, change the `warn` and `crit` lines
to the values of your choosing. For example:
```yaml
warn: $this > (($status >= $WARNING) ? (60) : (75))
crit: $this > (($status == $CRITICAL) ? (75) : (85))
```
Save the file and [reload Netdata's health configuration](#reload-health-configuration) to make your changes live.
### Silence an individual alarm
Instead of disabling an alarm altogether, or even disabling _all_ alarms, you can silence individual alarms by changing
one line in a given health entity. To silence any single alarm, change the `to:` line in its entity to `silent`.
```yaml
to: silent
```
## Write a new health entity
While tuning existing alarms may work in some cases, you may need to write entirely new health entities based on how
your systems, containers, and applications work.
Read the [health entity reference](#health-entity-reference) for a full listing of the format,
syntax, and functionality of health entities.
To write a new health entity into a new file, navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md),
then use `touch` to create a new file in the `health.d/` directory. Use `edit-config` to start editing the file.
As an example, let's create a `ram-usage.conf` file.
```bash
sudo touch health.d/ram-usage.conf
sudo ./edit-config health.d/ram-usage.conf
```
For example, here is a health entity that triggers a warning alarm when a node's RAM usage rises above 80%, and a
critical alarm above 90%:
```yaml
alarm: ram_usage
on: system.ram
lookup: average -1m percentage of used
units: %
every: 1m
warn: $this > 80
crit: $this > 90
info: The percentage of RAM being used by the system.
```
Let's look into each of the lines to see how they create a working health entity.
- `alarm`: The name for your new entity. The name needs to follow these requirements:
  - Use only letters and numbers.
  - Use only the symbols `.` and `_`.
  - It cannot be the same as a chart name, dimension name, family name, or chart variable name.
- `on`: Which chart the entity listens to.
- `lookup`: Which metrics the alarm monitors, the duration of time to monitor, and how to process the metrics into a
usable format.
- `average`: Calculate the average of all the metrics collected.
- `-1m`: Use metrics from 1 minute ago until now to calculate that average.
- `percentage`: Clarify that we're calculating a percentage of RAM usage.
- `of used`: Specify which dimension (`used`) on the `system.ram` chart you want to monitor with this entity.
- `units`: Use percentages rather than absolute units.
- `every`: How often to perform the `lookup` calculation to decide whether or not to trigger this alarm.
- `warn`/`crit`: The value at which Netdata should trigger a warning or critical alarm. This example uses simple
syntax, but most pre-configured health entities use
[hysteresis](#special-use-of-the-conditional-operator) to avoid superfluous notifications.
- `info`: A description of the alarm, which will appear in the dashboard and notifications.
In human-readable format:
> This health entity, named **ram_usage**, watches the **system.ram** chart. It looks up the last **1 minute** of
> metrics from the **used** dimension and calculates the **average** of all those metrics in a **percentage** format,
> using a **% unit**. The entity performs this lookup **every minute**.
>
> If the average RAM usage percentage over the last 1 minute is **more than 80%**, the entity triggers a warning alarm.
> If the usage is **more than 90%**, the entity triggers a critical alarm.
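Most pre-configured entities use hysteresis in their `warn` and `crit` expressions, as noted in the list above. Here is a sketch of how the same hypothetical `ram_usage` entity could adopt it; the thresholds are illustrative:
```yaml
 alarm: ram_usage
    on: system.ram
lookup: average -1m percentage of used
 units: %
 every: 1m
  warn: $this > (($status >= $WARNING)  ? (75) : (80))
  crit: $this > (($status == $CRITICAL) ? (85) : (90))
  info: The percentage of RAM being used by the system.
```
With this version, the warning triggers at 80% but only clears once usage drops back below 75%, which avoids a flood of notifications when the value hovers around a single threshold.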
When you finish writing this new health entity, [reload Netdata's health configuration](#reload-health-configuration) to
see it live on the local dashboard or Netdata Cloud.
## Reload health configuration
To make any changes to your health configuration live, you must reload Netdata's health monitoring system. To do that
without restarting all of Netdata, run `netdatacli reload-health` or `killall -USR2 netdata`.
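For reference, both options as shell commands; depending on how you installed Netdata you may not need `sudo`:
```bash
# Reload only the health component, leaving data collection running
sudo netdatacli reload-health
# Alternative: signal the running netdata daemon directly
sudo killall -USR2 netdata
```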
## Health entity reference
The following reference contains information about the syntax and options of _health entities_, which Netdata attaches
@ -326,7 +453,8 @@ Everything is the same with [badges](https://github.com/netdata/netdata/blob/mas
- `foreach DIMENSIONS` is optional, will always be the last parameter, and uses the same `,`/`|`
rules as the `of` parameter. Each dimension you specify in `foreach` will use the same rule
to trigger an alarm. If you set both `of` and `foreach`, Netdata will ignore the `of` parameter
and replace it with one of the dimensions you gave to `foreach`.
and replace it with one of the dimensions you gave to `foreach`. This option allows you to
[use dimension templates to create dynamic alarms](#use-dimension-templates-to-create-dynamic-alarms).
The result of the lookup will be available as `$this` and `$NAME` in expressions.
The timestamps of the timeframe evaluated by the database lookup is available as variables
@ -1020,7 +1148,166 @@ expression.
It's currently not possible to schedule notifications from within the alarm template. For those scenarios where you need
to temporarily disable notifications (for instance, when running backups triggers a disk alert) you can disable or silence
notifications are runtime. The health checks can be controlled at runtime via the [health management
api](https://github.com/netdata/netdata/blob/master/web/api/health/README.md).
notifications at runtime. The health checks can be controlled at runtime via the
[health management API](https://github.com/netdata/netdata/blob/master/web/api/health/README.md).
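As a rough sketch of what that looks like in practice, assuming the Agent listens on `localhost:19999` and the API token sits in its default location (check the health management API documentation for the exact commands your version supports):
```bash
# Read the management token (default location; adjust for your install)
TOKEN="$(sudo cat /var/lib/netdata/netdata.api.key)"
# Silence notifications for all alarms while health checks keep running
curl -s "http://localhost:19999/api/v1/manage/health?cmd=SILENCE%20ALL" -H "X-Auth-Token: ${TOKEN}"
# Return health management to its default state
curl -s "http://localhost:19999/api/v1/manage/health?cmd=RESET" -H "X-Auth-Token: ${TOKEN}"
```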
## Use dimension templates to create dynamic alarms
In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplify the process of
writing [alarm entities](#health-entity-reference) for
charts with many dimensions.
Dimension templates can condense many individual entities into one—no more copy-pasting one entity and changing the
`alarm`/`template` and `lookup` lines for each dimension you'd like to monitor.
### The fundamentals of `foreach`
The dimension templates feature adds a new `foreach` parameter to the
existing [`lookup` line](#alarm-line-lookup). This
is where the magic happens.
You use the `foreach` parameter to specify which dimensions you want to monitor with this single alarm. You can separate
them with a comma (`,`) or a pipe (`|`). You can also use
a [Netdata simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) to create
many alarms with a regex-like syntax.
The `foreach` parameter _has_ to be the last parameter in your `lookup` line, and if you have both `of` and `foreach` in
the same `lookup` line, Netdata will ignore the `of` parameter and use `foreach` instead.
Let's get into some examples so you can see how the new parameter works.
> ⚠️ The following entities are examples to showcase the functionality and syntax of dimension templates. They are not
> meant to be run as-is on production systems.
### Condensing entities with `foreach`
Let's say you want to monitor the `system`, `user`, and `nice` dimensions in your system's overall CPU utilization.
Before dimension templates, you would need the following three entities:
```yaml
alarm: cpu_system
on: system.cpu
lookup: average -10m percentage of system
every: 1m
warn: $this > 50
crit: $this > 80
alarm: cpu_user
on: system.cpu
lookup: average -10m percentage of user
every: 1m
warn: $this > 50
crit: $this > 80
alarm: cpu_nice
on: system.cpu
lookup: average -10m percentage of nice
every: 1m
warn: $this > 50
crit: $this > 80
```
With dimension templates, you can condense these into a single alarm. Take note of the `alarm` and `lookup` lines.
```yaml
alarm: cpu_template
on: system.cpu
lookup: average -10m percentage foreach system,user,nice
every: 1m
warn: $this > 50
crit: $this > 80
```
The `alarm` line specifies the naming scheme Netdata will use. You can use whatever naming scheme you'd like, with `.`
and `_` being the only allowed symbols.
The `lookup` line has changed from `of` to `foreach`, and we're now passing three dimensions.
In this example, Netdata will create three alarms with the names `cpu_template_system`, `cpu_template_user`, and
`cpu_template_nice`. Every minute, each alarm will use the same database query to calculate the average CPU usage for
the `system`, `user`, and `nice` dimensions over the last 10 minutes and send out alarms if necessary.
You can find these three alarms active by clicking on the **Alarms** button in the top navigation, and then clicking on
the **All** tab and scrolling to the **system - cpu** collapsible section.
![Three new alarms created from the dimension template](https://user-images.githubusercontent.com/1153921/66218994-29523800-e67f-11e9-9bcb-9bca23e2c554.png)
Let's look at some other examples of how `foreach` works so you can best apply it in your configurations.
### Using a Netdata simple pattern in `foreach`
In the last example, we used `foreach system,user,nice` to create three distinct alarms using dimension templates. But
what if you want to quickly create alarms for _all_ the dimensions of a given chart?
Use a [simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
(`*`).
Instead of monitoring system CPU usage, let's monitor per-application CPU usage using the `apps.cpu` chart. Passing a
wildcard as the simple pattern tells Netdata to create a separate alarm for _every_ process on your system:
```yaml
alarm: app_cpu
on: apps.cpu
lookup: average -10m percentage foreach *
every: 1m
warn: $this > 50
crit: $this > 80
```
This entity will now create alarms for every dimension in the `apps.cpu` chart. Given that most `apps.cpu` charts have
10 or more dimensions, using the wildcard ensures you catch every CPU-hogging process.
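Partial patterns work as well. As a sketch, assuming your `apps.cpu` chart has dimensions whose names start with `python` (the alarm name and thresholds here are illustrative):
```yaml
 alarm: app_cpu_python
    on: apps.cpu
lookup: average -10m percentage foreach python*
 every: 1m
  warn: $this > 50
  crit: $this > 80
```
Netdata would then create one alarm per matching dimension, using the same `alarm name + dimension` naming scheme described above, and leave every other process alone.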
To learn more about how to use simple patterns with dimension templates, see
our [simple patterns documentation](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md).
### Using `foreach` with alarm templates
Dimension templates also work
with [alarm templates](#alarm-line-alarm-or-template).
Alarm templates help you create alarms for all the charts with a given context—for example, all the cores of your
system's CPU.
By combining the two, you can create dozens of individual alarms with a single template entity. Here's how you would
create alarms for the `system`, `user`, and `nice` dimensions for every chart in the `cpu.cpu` context—or, in other
words, every CPU core.
```yaml
template: cpu_template
on: cpu.cpu
lookup: average -10m percentage foreach system,user,nice
every: 1m
warn: $this > 50
crit: $this > 80
```
On a system with a 6-core, 12-thread Ryzen 5 1600 CPU, this one entity creates alarms on the following charts and
dimensions:
- `cpu.cpu0`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- `cpu.cpu1`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- `cpu.cpu2`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
- ...
- `cpu.cpu11`
- `cpu_template_user`
- `cpu_template_system`
- `cpu_template_nice`
Here is how just a few of those dimension template-generated alarms look in the Netdata dashboard.
![A few of the created alarms in the Netdata dashboard](https://user-images.githubusercontent.com/1153921/66219669-708cf880-e680-11e9-8b3a-7bfe178fa28b.png)
All in all, this single entity creates 36 individual alarms. Much easier than writing 36 separate entities in your
health configuration files!

View File

@ -1,5 +1,5 @@
<!--
title: "alerta.io"
title: "Alerta agent alert notifications"
sidebar_label: "Alerta"
description: "Send alarm notifications to Alerta to see the latest health status updates from multiple nodes in a single interface."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/alerta/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# alerta.io
# Alerta agent alert notifications
The [Alerta](https://alerta.io) monitoring system is a tool used to
consolidate and de-duplicate alerts from multiple sources for quick

View File

@ -1,5 +1,5 @@
<!--
title: "Amazon SNS"
title: "Amazon SNS agent alert notifications"
sidebar_label: "Amazon SNS"
description: "hello"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/awssns/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Amazon SNS
# Amazon SNS agent alert notifications
As part of its AWS suite, Amazon provides a notification broker service called 'Simple Notification Service' (SNS). Amazon SNS works similarly to Netdata's own notification system, allowing you to dispatch a single notification to multiple subscribers of different types. While Amazon SNS supports sending differently formatted messages for different delivery methods, Netdata does not currently support this functionality.
Among other things, SNS supports sending notifications to:

View File

@ -1,5 +1,5 @@
<!--
title: "Custom"
title: "Custom agent alert notifications"
sidebar_label: "Custom endpoint"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/custom/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Custom
# Custom agent alert notifications
Netdata allows you to send custom notifications to any endpoint you choose.

View File

@ -1,5 +1,5 @@
<!--
title: "Discord.com"
title: "Discord agent alert notifications"
sidebar_label: "Discord"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/discord/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Discord.com
# Discord agent alert notifications
This is what you will get:

View File

@ -1,5 +1,5 @@
<!--
title: "Dynatrace"
title: "Dynatrace agent alert notifications"
sidebar_label: "Dynatrace Events"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/dynatrace/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Dynatrace
# Dynatrace agent alert notifications
Dynatrace allows you to receive notifications using their Events REST API.

View File

@ -1,5 +1,5 @@
<!--
title: "Email"
title: "Email agent alert notifications"
sidebar_label: "Email"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/email/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': True, 'part_of_agent': True}"
-->
# Email
# Email agent alert notifications
You need a working `sendmail` command for email alerts to work. Almost all MTAs provide a `sendmail` interface.

View File

@ -1,5 +1,5 @@
<!--
title: "Flock"
title: "Flock agent alert notifications"
sidebar_label: "Flock"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/flock/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Flock
# Flock agent alert notifications
This is what you will get:

View File

@ -1,5 +1,5 @@
<!--
title: "Send notifications to Gotify"
title: "Gotify agent alert notifications"
description: "Send alerts to your Gotify instance when an alert gets triggered in Netdata."
sidebar_label: "Gotify"
custom_edit_url: https://github.com/netdata/netdata/edit/master/health/notifications/gotify/README.md
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Send notifications to Gotify
# Gotify agent alert notifications
[Gotify](https://gotify.net/) is a self-hosted push notification service created for sending and receiving messages in real time.

View File

@ -1,5 +1,5 @@
<!--
title: "Send notifications to Google Hangouts"
title: "Google Hangouts agent alert notifications"
description: "Send alerts to Send notifications to Google Hangouts any time an anomaly or performance issue strikes a node in your infrastructure."
sidebar_label: "Google Hangouts"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/hangouts/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Send notifications to Google Hangouts
# Google Hangouts agent alert notifications
[Google Hangouts](https://hangouts.google.com/) is a cross-platform messaging app developed by Google. You can configure
Netdata to send alarm notifications to a Hangouts room in order to stay aware of possible health or performance issues

View File

@ -1,5 +1,5 @@
<!--
title: "IRC"
title: "IRC agent alert notifications"
sidebar_label: "IRC"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/irc/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# IRC
# IRC agent alert notifications
This is what you will get:

View File

@ -1,5 +1,5 @@
<!--
title: "Kavenegar"
title: "Kavenegar agent alert notifications"
sidebar_label: "Kavenegar"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/kavenegar/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Kavenegar
# Kavenegar agent alert notifications
[Kavenegar](https://kavenegar.com/), a service for software developers based in Iran, provides APIs for sending and receiving SMS and placing voice calls.

View File

@ -1,15 +1,14 @@
<!--
title: "Send Netdata notifications to Matrix network rooms"
title: "Matrix agent alert notifications"
description: "Stay aware of warning or critical anomalies by sending health alarms to Matrix network rooms with Netdata's health monitoring watchdog."
sidebar_label: "Matrix"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/matrix/README.md"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Matrix
# Matrix agent alert notifications
Send notifications to [Matrix](https://matrix.org/) network rooms.

View File

@ -1,14 +1,13 @@
<!--
title: "Messagebird"
title: "Messagebird agent alert notifications"
sidebar_label: "Messagebird"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/messagebird/README.md"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Messagebird
# Messagebird agent alert notifications
The messagebird notifications will look like this on your Android device:

View File

@ -1,5 +1,5 @@
<!--
title: "Microsoft Teams"
title: "Microsoft Teams agent alert notifications"
sidebar_label: "Microsoft Teams"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/msteams/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Microsoft Teams
# Microsoft Teams agent alert notifications
This is what you will get:
![image](https://user-images.githubusercontent.com/1122372/92710359-0385e680-f358-11ea-8c52-f366a4fb57dd.png)

View File

@ -1,5 +1,5 @@
<!--
title: "Send notifications to Opsgenie"
title: "Opsgenie agent alert notifications"
description: "Send alerts to your Opsgenie incident response account any time an anomaly or performance issue strikes a node in your infrastructure."
sidebar_label: "Opsgenie"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/opsgenie/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Send notifications to Opsgenie
# Opsgenie agent alert notifications
[Opsgenie](https://www.atlassian.com/software/opsgenie) is an alerting and incident response tool. It is designed to
group and filter alarms, build custom routing rules for on-call teams, and correlate deployments and commits to

View File

@ -1,5 +1,5 @@
<!--
title: "Send alert notifications to PagerDuty"
title: "PagerDuty agent alert notifications"
description: "Send alerts to your PagerDuty dashboard any time an anomaly or performance issue strikes a node in your infrastructure."
sidebar_label: "PagerDuty"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/pagerduty/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Send alert notifications to PagerDuty
# PagerDuty agent alert notifications
[PagerDuty](https://www.pagerduty.com/company/) is an enterprise incident resolution service that integrates with ITOps
and DevOps monitoring stacks to improve operational reliability and agility. From enriching and aggregating events to

View File

@ -1,5 +1,5 @@
<!--
title: "Prowl"
title: "Prowl agent alert notifications"
sidebar_label: "Prowl"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/prowl/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Prowl
# Prowl agent alert notifications
[Prowl](https://www.prowlapp.com/) is a push notification service for iOS devices. Netdata
supports delivering notifications to iOS devices through Prowl.

View File

@ -1,5 +1,5 @@
<!--
title: "PushBullet"
title: "PushBullet agent alert notifications"
sidebar_label: "PushBullet"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/pushbullet/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# PushBullet
# PushBullet agent alert notifications
PushBullet notifications will look like this in your browser:
![image](https://cloud.githubusercontent.com/assets/4300670/19109636/278b1c0c-8aee-11e6-8a09-7fc94fdbfec8.png)

View File

@ -1,5 +1,5 @@
<!--
title: "PushOver"
title: "PushOver agent alert notifications"
sidebar_label: "PushOver"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/pushover/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# PushOver
# PushOver agent alert notifications
pushover.net allows you to receive push notifications on your mobile phone. The service appears to be free for up to 7,500 messages per month.

View File

@ -1,5 +1,5 @@
<!--
title: "Rocket.Chat"
title: "Rocket.Chat agent alert notifications"
sidebar_label: "Rocket Chat"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/rocketchat/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Rocket.Chat
# Rocket.Chat agent alert notifications
This is what you will get:
![Netdata on RocketChat](https://i.imgur.com/Zu4t3j3.png)

View File

@ -1,5 +1,5 @@
<!--
title: "Slack"
title: "Slack agent alert notifications"
sidebar_label: "Slack"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/slack/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Slack
# Slack agent alert notifications
This is what you will get:
![image](https://cloud.githubusercontent.com/assets/2662304/18407116/bbd0fee6-7710-11e6-81cf-58c0defaee2b.png)

View File

@ -1,5 +1,5 @@
<!--
title: "SMS Server Tools 3"
title: "SMS Server Tools 3 agent alert notifications"
sidebar_label: "SMS server"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/smstools3/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# SMS Server Tools 3
# SMS Server Tools 3 agent alert notifications
[SMS Server Tools 3](http://smstools3.kekekasvi.com/) is an SMS gateway application that can send and receive short messages through GSM modems and mobile phones.

View File

@ -1,5 +1,5 @@
<!--
title: "Send notifications to StackPulse"
title: "StackPulse agent alert notifications"
description: "Send alerts to your StackPulse Netdata integration any time an anomaly or performance issue strikes a node in your infrastructure."
sidebar_label: "StackPulse"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/stackpulse/README.md"
@ -9,7 +9,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Send notifications to StackPulse
# StackPulse agent alert notifications
[StackPulse](https://stackpulse.com/) is a software-as-a-service platform for site reliability engineering.
It helps SREs, DevOps Engineers and Software Developers reduce toil and alert fatigue while improving reliability of

View File

@ -1,5 +1,5 @@
<!--
title: "Syslog"
title: "Syslog agent alert notifications"
sidebar_label: "Syslog"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/syslog/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Syslog
# Syslog agent alert notifications
You need a working `logger` command for this to work. This is the case on pretty much every Linux system in existence, and most BSD systems.

View File

@ -1,5 +1,5 @@
<!--
title: "Telegram"
title: "Telegram agent alert notifications"
sidebar_label: "Telegram"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/telegram/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Telegram
# Telegram agent alert notifications
[Telegram](https://telegram.org/) is a messaging app with a focus on speed and security; it's super-fast, simple and free. You can use Telegram on all your devices at the same time — your messages sync seamlessly across any number of your phones, tablets or computers.

View File

@ -1,5 +1,5 @@
<!--
title: "Twilio"
title: "Twilio agent alert notifications"
sidebar_label: "Twilio"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/twilio/README.md"
learn_status: "Published"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Twilio
# Twilio agent alert notifications
Twilio notifications will look like this on your Android device:

View File

@ -1,6 +1,6 @@
<!--
title: "Pop up"
sidebar_label: "Pop up notifications"
title: "Browser pop up agent alert notifications"
sidebar_label: "Browser pop ups"
custom_edit_url: "https://github.com/netdata/netdata/edit/master/health/notifications/web/README.md"
learn_status: "Published"
learn_topic_type: "Tasks"
@ -8,7 +8,7 @@ learn_rel_path: "Integrations/Notify/Agent alert notifications"
learn_autogeneration_metadata: "{'part_of_cloud': False, 'part_of_agent': True}"
-->
# Pop up notifications
# Browser pop up agent alert notifications
The Netdata dashboard shows HTML notifications, when it is open.

View File

@ -5,7 +5,7 @@ description: "This is an in-depth look at how Netdata uses ML to detect anomalie
sidebar_label: "Configure machine learning (ML) powered anomaly detection"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Setup"
learn_rel_path: "Configuration"
-->
# Machine learning (ML) powered anomaly detection