netdata


Monitor your servers, containers, and applications,
in high-resolution and in real-time.

Visit the Project's Home Page

Netdata collects metrics per second and presents them in beautiful low-latency dashboards. It is designed to run on all of your physical and virtual servers, cloud deployments, Kubernetes clusters, and edge/IoT devices, to monitor your systems, containers, and applications.

It scales nicely from just a single server to thousands of servers, even in complex multi/mixed/hybrid cloud environments, and given enough disk space it can keep your metrics for years.

WHAT CAN BE MONITORED WITH NETDATA:

Netdata monitors all the following:

Component	Linux	FreeBSD	macOS	Windows*
System Resources CPU, Memory and system shared resources	Full	Yes	Yes	Yes
Storage Disks, Mount points, Filesystems, RAID arrays	Full	Basic	Basic	Basic
Network Network Interfaces, Protocols, Firewall, etc	Full	Basic	Basic	Basic
Hardware & Sensors Fans, Temperatures, Controllers, GPUs, etc	Full	Some	Some	Some
O/S Services Resources, Performance and Status	Yes `systemd`	-	-	Basic
Logs	Yes `systemd`-journal	-	-	-
Processes Resources, Performance, OOM, and more	Yes	Yes	Yes	Yes
Network Connections Live TCP and UDP sockets per PID	Yes	-	-	-
Containers Docker/containerd, LXC/LXD, Kubernetes, etc	Yes	-	-	-
VMs (from the host) KVM, qemu, libvirt, Proxmox, etc	Yes `cgroups`	-	-	Yes `Hyper-V`
Synthetic Checks Test APIs, TCP ports, Ping, Certificates, etc	Yes	Yes	Yes	Yes
Packaged Applications nginx, apache, postgres, redis, mongodb, and hundreds more	Yes	Yes	Yes	Yes
Custom Applications OpenMetrics, StatsD	Yes	Yes	Yes	Yes

When Netdata runs on Linux, it monitors every kernel feature available, providing full coverage of all kernel technologies that can be monitored.

Netdata provides full enterprise hardware coverage, monitoring all components that provide hardware error reporting, like PCI AER, RAM EDAC, IPMI, S.M.A.R.T., NVMe, Fans, Power, Voltages, and more.

* Netdata runs on Linux, FreeBSD and macOS. For Windows, we rely on Windows Exporter (so a Netdata running on Linux, FreeBSD or macOS is required, next to the monitored Windows servers).

KEY CHARACTERISTICS:

💥 Collects data from 800+ integrations
Operating system metrics, container metrics, virtual machines, hardware sensors, applications metrics, OpenMetrics exporters, StatsD, and logs.
💪 Real-Time, Low-Latency, High-Resolution
All data are collected per second and are on the dashboard immediately after data collection.
😶‍🌫️ Unsupervised Anomaly Detection
Trains multiple Machine-Learning (ML) models for each metric and uses AI to detect anomalies based on the past behavior of each metric.
🔥 Powerful Visualization
Fully automated dashboard providing correlated visualization of all metrics, allowing you to understand any dataset at first sight, but also to filter, slice and dice the data directly on the dashboards, without the need to learn a query language.
🔔 Out-of-the-box Alerts
Comes with hundreds of alerts out of the box to detect common issues and pitfalls, revealing issues that can easily go unnoticed. It supports several notification methods to let you know when your attention is needed.
😎 Low Maintenance
Fully automated in every aspect: automated dashboards, out-of-the-box alerts, auto-detection and auto-discovery of metrics, zero-touch machine-learning, easy scalability and high availability, and CI/CD friendly.
⭐ Open and Extensible
Netdata is a modular platform that can be extended in all possible ways and it also integrates nicely with other monitoring solutions.

💥 NEW: Network Connections Explorer 💥

Network Connections viewer is currently in the nightly builds of Netdata!

This tool visualizes all the sockets each server has (IPv4 and IPv6, TCP and UDP). It can classify them as inbound, outbound, listen and local, and allows filtering on them.

The visualization has 4 sides:

public (i.e. public IPs),
private (i.e. private and reserved IPs),
servers (i.e. listening and inbound sockets),
clients (i.e. sockets towards other servers).

The position of each application on the chart is determined by the classification of the sockets it has. Clients are at the top, servers at the bottom, internet-facing applications on the right, and internal network applications on the left.

The size of each application in the chart is determined by the number of sockets it has, and each application is a pie chart representing the percentage of each kind of sockets it has.

⭐ Netdata is the most energy-efficient monitoring tool ⭐

Dec 11, 2023: University of Amsterdam published a study related to the impact of monitoring tools for Docker based systems, aiming to answer 2 questions:

What is the impact of monitoring tools on the energy efficiency of Docker-based systems?
What is the impact of monitoring tools on the performance of Docker-based systems?

🚀 Netdata excels in energy efficiency: "... Netdata being the most energy-efficient tool ...", as the study says.
🚀 Netdata excels in CPU Usage, RAM Usage and Execution Time, and has a similar impact in Network Traffic as Prometheus.

The study did not normalize the results based on the number of metrics collected. Given that Netdata usually collects significantly more metrics than the other tools, Netdata managed to outperform the other tools, while ingesting a much higher number of metrics. Read the full study here.

On the same workload, Netdata uses 35% less CPU, 49% less RAM, 12% less bandwidth, and 98% less disk I/O, and is 75% more disk space efficient on high resolution metrics storage. It also provides more than a year of overall retention on the same disk footprint on which Prometheus offers 7 days of retention. Read the full analysis in our blog.

NEW: Netdata and LOGS ! 🥳

Check the systemd-journal plugin of Netdata, which allows you to view, explore, analyze and query systemd journal logs!

Netdata actively supports and is a member of the Cloud Native Computing Foundation (CNCF)

...and due to your love ❤️, it is one of the most ⭐'d projects in the CNCF landscape!

Below is an animated image, but you can see Netdata live!
FRANKFURT | NEWYORK | ATLANTA | SANFRANCISCO | TORONTO | SINGAPORE | BANGALORE
They are clustered Netdata Parents. They all have the same data. Select the one closer to you.
All these run with the default configuration. We only clustered them to have multi-node dashboards.

Important 💡
People get addicted to Netdata. Once you use it on your systems, there's no going back!

What's New and Coming?

Click to see our immediate development plans and a summary view of the last 12 months' releases...

What	Description	When	Status
Netdata Cloud On-Prem	Netdata Cloud available for On-Prem installation!	available	fill this form
State manager monitor	Centralized and immediate visibility to the state of your apps and services.	soon	planned
More Customizable	Set default settings for all charts and views!	soon	in progress
AWS Integrated billing	Run Netdata on your AWS instances and get your billing integrated on your AWS account.	soon	in progress
Alert Silence Manager R2	Improvements to the Alert Silencing Manager with recurring schedules and more!	soon	in progress
Okta SSO	Facilitate the integration of Netdata into your organization's user management process.	soon	in progress
Prometheus/OpenMetrics improvements	Allow users to configure how metrics should be ingested and presented.	soon	in progress
Loki logs	Another Logs integration, bring your Loki logs onto the UI!	soon	in progress
UCUM Units	Migrate all metrics to the Unified Code for Units of Measure.	soon	in progress
Dynamic Configurations	Configure Alerts and Data Collectors from the UI!	soon	Beta release v1.45 - in progress
WebRTC	Browser to Agent communication via WebRTC.	later	interrupted
Advanced Troubleshooting	Expanded view of dashboard charts integrating Metrics Correlations, Anomaly Advisor, and many more.	later	interrupted
Homelab plan	Unlimited Netdata plan targeted for homelabbers or students.	Feb 2024	v1.45
Easy Custom Dashboards	Drag and drop charts to create custom dashboards on the fly, while troubleshooting!	Feb 2024	v1.45
Netdata Notifications Mobile App	You can receive and manage alert and reachability notifications from your subscribed spaces.	Jan 2024	v1.45
`systemd` journal	View the `systemd` journal logs of your systems on the dashboard.	Oct 2023	v1.43
Integrations	Netdata Integrations Marketplace!	Aug 2023	v1.42
New Agent UI	Now Netdata Cloud and Netdata Agent share the same dashboard!	Jul 2023	v1.41
Summary Dashboards	High level tiles everywhere!	Jun 2023	v1.40
Machine Learning	Multiple ML models per metric.	Jun 2023	v1.40
SSL	Netdata Agent gets a new SSL layer.	Jun 2023	v1.40
New Cloud UI	Filter, slice and dice any dataset from the UI! ML-first!	May 2023	v1.39
Microsoft Windows	Monitor Windows hosts and apps!	May 2023	v1.39
Virtual Nodes	Go collectors can now be assigned to virtual nodes!	May 2023	v1.39
DBENGINE v2	Faster, more reliable, far more scalable!	Feb 2023	v1.38
Netdata Functions	Netdata beyond metrics! Monitoring anything!	Feb 2023	v1.38
Events Feed	Live feed of events about topology changes and alerts.	Feb 2023	v1.38
Role Based Access Control	More roles, offering finer control over access to infrastructure.	Feb 2023	v1.38
Infinite Scalability	Streaming compression. Replication. Active-active clustering.	Nov 2022	v1.37
Grafana Plugin	Netdata Cloud as a data source for Grafana.	Nov 2022	v1.37
PostgreSQL	Completely rewritten, to reveal all the info, even at the table level.	Nov 2022	v1.37
Metrics Correlations	Advanced algorithms to find the needle in the haystack.	Aug 2022	v1.36
Database Tiering	Netdata gets unlimited retention!	Aug 2022	v1.36
Kubernetes	Monitor your Kubernetes workloads.	Aug 2022	v1.36
Machine Learning	Anomaly Rate information on every chart.	Aug 2022	v1.36
Machine Learning	Anomaly Advisor! Bottom-up unsupervised anomaly detection.	Jun 2022	v1.35
Machine Learning	Metrics Correlation on the Agent.	Jun 2022	v1.35

Getting Started

1. Install Netdata everywhere ✌️

Netdata can be installed on all Linux, macOS, and FreeBSD systems. We provide binary packages for the most popular operating systems and package managers.

Install on Ubuntu, Debian, CentOS, Fedora, Suse, Red Hat, Arch, Alpine, Gentoo, even BusyBox.
Install with Docker.
Netdata is a Verified Publisher on DockerHub and our users enjoy free unlimited DockerHub pulls 😍.
Install on macOS 🤘.
Install on FreeBSD and pfSense.
Install from source
For Kubernetes deployments check here.

Check also the Netdata Deployment Guides to decide how to deploy it in your infrastructure.

By default, you will have immediately available a local dashboard. Netdata starts a web server for its dashboard at port 19999. Open up your web browser of choice and navigate to http://NODE:19999, replacing NODE with the IP address or hostname of your Agent. If installed on localhost, you can access it through http://localhost:19999.
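
A quick way to confirm that the Agent is answering on that port is to query it from a script. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the Agent exposes an `/api/v1/info` endpoint on the dashboard port and that NODE is a hostname or IPv4 address.

```python
import http.client
import json
import urllib.error
import urllib.request


def dashboard_url(node: str, port: int = 19999) -> str:
    """Build the local dashboard URL for a node (hostname or IPv4 address)."""
    return f"http://{node}:{port}"


def agent_info(node: str, port: int = 19999, timeout: float = 2.0):
    """Return the parsed JSON from the Agent's info endpoint, or None if unreachable.

    Assumption: the Agent serves /api/v1/info on the dashboard port.
    """
    url = dashboard_url(node, port) + "/api/v1/info"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, bad HTTP response or invalid JSON.
        return None
```

Calling `agent_info("localhost")` returns a dictionary when the Agent is up and `None` otherwise, which is enough for a simple post-install check.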

2. Configure Collectors 💥

Netdata auto-detects and auto-discovers most operating system data sources and applications. However, many data sources require some manual configuration, usually to allow Netdata to get access to the metrics.

For a detailed list of the 800+ collectors available, check this guide.
To monitor Windows servers and applications use this guide.
To monitor SNMP devices check this guide.

3. Configure Alert Notifications 🔔

Netdata comes with hundreds of pre-configured alerts that automatically check your metrics as soon as they start being collected.

Netdata can dispatch alert notifications to multiple third party systems, including: email, Alerta, AWS SNS, Discord, Dynatrace, flock, gotify, IRC, Matrix, MessageBird, Microsoft Teams, ntfy, OPSgenie, PagerDuty, Prowl, PushBullet, PushOver, RocketChat, Slack, SMS tools, Syslog, Telegram, Twilio.

By default, Netdata will send e-mail notifications, if there is a configured MTA on the system.

4. Configure Netdata Parents 👪

Optionally, configure one or more Netdata Parents. A Netdata Parent is a Netdata Agent that has been configured to accept streaming connections from other Netdata agents.

Netdata Parents provide:

Infrastructure level dashboards, at http://parent.server.ip:19999/.
Each Netdata Agent has an API listening at the TCP port 19999 of each server. When you hit that port with a web browser (e.g. http://server.ip:19999/), the Netdata Agent UI is presented. When the Netdata Agent is also a Parent, the UI of the Parent includes data for all nodes that stream metrics to that Parent.

Increased retention for all metrics of all your nodes.
Each Netdata Agent maintains its own database of metrics. But Parents can be given additional resources to maintain a much longer database than individual Netdata Agents.

Central configuration of alerts and dispatch of notifications.
Using Netdata Parents, all the alert notification integrations need to be configured only once, at the Parent, and they can be disabled at the Netdata Agents.

You can also use Netdata Parents to:

Offload your production systems (the parents run ML, alerts, queries, etc. for all their children)
Secure your production systems (the parents accept user connections, for all their children)

5. Connect to Netdata Cloud ☁️

Optionally, sign in to Netdata Cloud and claim your Netdata Agents and Parents. If you connect your Netdata Parents, there is no need to connect your Netdata Agents. They will be connected via the Parents.

When your Netdata nodes are connected to Netdata Cloud, you can (on top of the above):

Access your Netdata agents from anywhere
Access sensitive Netdata agent features (like "Netdata Functions": processes, systemd-journal)
Organize your infra in spaces and rooms
Create, manage, and share custom dashboards
Invite your team and assign roles to them (Role Based Access Control - RBAC)
Get infinite horizontal scalability (multiple independent Netdata Agents are viewed as one infra)
Configure alerts from the UI (coming soon)
Configure data collection from the UI (coming soon)
Netdata Mobile App notifications (coming soon)

🤟 Netdata Cloud does not prevent you from using your Netdata Agents and Parents directly, and vice versa.

👌 Your metrics are still stored in your network when you connect your Netdata Agents and Parents to Netdata Cloud.

How it works

Netdata is built around a modular metrics processing pipeline.

Click to see more details about this pipeline...

Each Netdata Agent can perform the following functions:

COLLECT metrics from their sources
Uses internal and external plugins to collect data from their sources.

Netdata auto-detects and collects almost everything from the operating system: including CPU, Interrupts, Memory, Disks, Mount Points, Filesystems, Network Stack, Network Interfaces, Containers, VMs, Processes, systemd units, Linux Performance Metrics, Linux eBPF, Hardware Sensors, IPMI, and more.

It collects metrics from applications such as PostgreSQL, MySQL/MariaDB, Redis, MongoDB, Nginx, Apache, and hundreds more.

Netdata also collects your custom application metrics by scraping OpenMetrics exporters, or via StatsD.

It can convert web server log files to metrics and apply ML and alerts to them, in real-time.

And it also supports synthetic tests / white box tests, so you can ping servers, check API responses, or even check filesystem files and directories to generate metrics, train ML and run alerts and notifications on their status.
STORE metrics to a database
Uses database engine plugins to store the collected data, either in memory and/or on disk. We have developed our own dbengine for storing the data in a very efficient manner, allowing Netdata to have less than 1 byte per sample on disk and amazingly fast queries.
LEARN the behavior of metrics (ML)
Trains multiple Machine-Learning (ML) models per metric to learn the behavior of each metric individually. Netdata uses the kmeans algorithm and creates by default a model per metric per hour, based on the values collected for that metric over the last 6 hours. The trained models are persisted to disk.
DETECT anomalies in metrics (ML)
Uses the trained machine learning (ML) models to detect outliers and mark collected samples as anomalies. Netdata stores anomaly information together with each sample and also streams it to Netdata Parents so that the anomaly is also available at query time for the whole retention of each metric.
CHECK metrics and trigger alert notifications
Uses its configured alerts (you can configure your own) to check the metrics for common issues and uses notifications plugins to send alert notifications.
STREAM metrics to other Netdata Agents
Push metrics in real-time to Netdata Parents.
ARCHIVE metrics to 3rd party databases
Export metrics to industry standard time-series databases, like Prometheus, InfluxDB, OpenTSDB, Graphite, etc.
QUERY metrics and present dashboards
Provide an API to query the data and present interactive dashboards to users.
SCORE metrics to reveal similarities and patterns
Score the metrics according to the given criteria, to find the needle in the haystack.
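
As an illustration of the COLLECT step for custom applications, the sketch below sends a StatsD sample over UDP. The function names and the metric name are hypothetical, and it assumes a StatsD listener on the conventional port 8125 of the local machine; adjust host and port to your setup.

```python
import socket


def statsd_line(name: str, value, metric_type: str = "g") -> bytes:
    """Format one StatsD sample as `name:value|type` (g = gauge, c = counter)."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type="g", host="localhost", port=8125):
    """Send one sample over UDP (fire and forget) and return the payload sent.

    Assumption: a StatsD server is listening on host:port.
    """
    payload = statsd_line(name, value, metric_type)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

For example, `send_metric("myapp.queue_length", 42)` pushes a gauge. Because UDP is connectionless, the call does not tell you whether anything received it.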

When using Netdata Parents, all the functions of a Netdata Agent (except data collection) can be delegated to Parents to offload production systems.
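
To give an intuition for the LEARN and DETECT steps, here is a deliberately simplified sketch of k-means based outlier detection on a single metric. It is not Netdata's implementation: the feature construction (windows of 3 consecutive values), the deterministic centroid initialization, and the "farther than anything seen in training" threshold are all assumptions chosen to keep the example short.

```python
import math


def make_features(values, lag=3):
    """Turn a series into overlapping windows of `lag` consecutive values."""
    return [tuple(values[i:i + lag]) for i in range(len(values) - lag + 1)]


def _dist(a, b):
    """Euclidean distance between two equally sized tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k=2, iterations=20):
    """Plain Lloyd's k-means with deterministic, evenly spaced initial centroids."""
    step = max(1, len(points) // k)
    centroids = [points[min(i * step, len(points) - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


def train(values, lag=3, k=2):
    """Return (centroids, largest nearest-centroid distance seen in training)."""
    points = make_features(values, lag)
    centroids = kmeans(points, k)
    max_dist = max(min(_dist(p, c) for c in centroids) for p in points)
    return centroids, max_dist


def is_anomalous(window, model):
    """Flag a window that lies farther from every centroid than any training window."""
    centroids, max_dist = model
    return min(_dist(tuple(window), c) for c in centroids) > max_dist
```

Training one such model per metric on a recent window of data, and retraining periodically, mirrors the idea described above: each metric is judged only against its own past behavior.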

The core of Netdata is developed in C. We have our own libnetdata, that provides:

DICTIONARY
A high-performance algorithm to maintain both indexed and ordered pools of structures Netdata needs. It uses JudyHS arrays for indexing, although it is modular: any hashtable or tree can be integrated into it. Despite being in C, dictionaries follow object-oriented programming principles, so there are constructors, destructors, automatic memory management, garbage collection, and more. For more see here.
ARAL
ARray ALlocator (ARAL) is used to minimize the system allocations made by Netdata. ARAL is optimized for maximum multi-threaded performance. It also allows all structures that use it to be allocated in memory-mapped files (shared memory) instead of RAM. For more see here.
PROCFILE
A high-performance /proc (but also any) file parser and text tokenizer. It achieves its performance by keeping files open and adjusting its buffers to read the entire file in one call (which is also required by the Linux kernel). For more see here.
STRING
A string interning mechanism, for string deduplication and indexing (using JudyHS arrays), optimized for multi-threaded usage. For more see here.
ARL
Adaptive Resortable List (ARL) is a very fast list iterator that keeps the expected items on the list in the same order they are found in the input list. So, the first iteration is somewhat slower, but all the following iterations are perfectly aligned for best performance. For more see here.
BUFFER
A flexible text buffer management system that allows Netdata to automatically handle dynamically sized text buffer allocations. The same mechanism is used for generating consistent JSON output by the Netdata APIs. For more see here.
SPINLOCK
Like POSIX MUTEX and RWLOCK but a lot faster, based on atomic operations, with significantly smaller memory impact, while being portable.
PGC
A caching layer that can be used to cache any kind of time-related data, with automatic indexing (based on a tree of JudyL arrays), memory management, evictions, flushing, pressure management. This is extensively used in dbengine. For more see here.
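
The ARL idea described above can be illustrated with a toy model. This is a sketch in Python rather than the C code in libnetdata; the class and method names are hypothetical, and it only demonstrates the reordering trick: expected keys are moved to the position where the input actually presents them, so later passes match without searching.

```python
class AdaptiveList:
    """Toy adaptive list: expected keys are reordered to match the input order."""

    def __init__(self, handlers):
        # handlers: dict mapping an expected key to a callback taking the value
        self.handlers = dict(handlers)
        self.order = list(handlers)   # current guess of the input order
        self.pos = 0                  # index of the key expected next
        self.fast_hits = 0            # lookups answered without searching
        self.slow_hits = 0            # lookups that needed a search

    def begin(self):
        """Start a new pass over the input."""
        self.pos = 0
        self.fast_hits = 0
        self.slow_hits = 0

    def check(self, key, value):
        """Feed one (key, value) pair from the input; return True if it was expected."""
        if self.pos < len(self.order) and self.order[self.pos] == key:
            self.fast_hits += 1
        else:
            try:
                found = self.order.index(key, self.pos)
            except ValueError:
                return False          # not an expected key (or already seen this pass)
            # Move the key to the position where the input actually has it.
            self.order.insert(self.pos, self.order.pop(found))
            self.slow_hits += 1
        self.handlers[key](value)
        self.pos += 1
        return True
```

When parsing a file whose lines keep the same order between reads, the first pass pays for the searches and every later pass hits the fast path for each expected key.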

The above, and many more, allow Netdata developers to work on the application fast and with confidence. Most of the business logic in Netdata is a work of mixing the above.

Netdata data collection plugins can be developed in any language. Most of our application collectors though are developed in Go.

FAQ

🛡️ Is Netdata secure?

Of course it is! We do our best to ensure it is!

Click to see detailed answer ...

We understand that Netdata is a piece of software installed on millions of production systems across the world. So, it is important to us that Netdata is as secure as possible:

We follow the Open Source Security Foundation best practices.
We have given great attention to detail when it comes to security design. Check out our security design.
Netdata is a popular open-source project and is frequently tested by many security analysts.
Check also our security policies and advisories published so far.

🌀 Will Netdata consume significant resources on my servers?

No. It will not! We promise this will be fast!

Click to see detailed answer ...

Although each Netdata Agent is a complete monitoring solution packed into a single application, and despite the fact that Netdata collects every metric every single second and trains multiple ML models per metric, you will find that Netdata has amazing performance! In many cases, it outperforms other monitoring solutions that have significantly fewer features or far smaller data collection rates.

This is what you should expect:

For production systems, each Netdata Agent with default settings (everything enabled, ML, Health, DB) should consume about 5% CPU utilization of one core and about 150 MiB of RAM.

By using a Netdata parent and streaming all metrics to that parent, you can disable ML & health and use an ephemeral DB mode (like alloc) on the children, leading to utilization of about 1% CPU of a single core and 100 MiB of RAM. Of course, these depend on how many metrics are collected.
For Netdata Parents, for about 1 to 2 million metrics, all collected every second, we suggest a server with 16 cores and 32GB RAM. Less than half of it will be used for data collection and ML. The rest will be available for queries.

Netdata has extensive internal instrumentation to help us reveal how the resources consumed are used. All these are available in the "Netdata Monitoring" section of the dashboard. Depending on your use case, there are many options to optimize resource consumption.

Even if you need to run Netdata on extremely weak embedded or IoT systems, you will find that Netdata can be tuned to be very performant.

📜 How much retention can I have?

As much as you need!

Click to see detailed answer ...

Netdata supports tiering, to downsample past data and save disk space. With default settings, it has 3 tiers:

tier 0, with high resolution, per-second, data.
tier 1, mid-resolution, per minute, data.
tier 2, low-resolution, per hour, data.

All tiers are updated in parallel during data collection. Just increase the disk space you give to Netdata to get a longer history for your metrics. Tiers are automatically chosen at query time depending on the time frame and the resolution requested.
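
The following sketch illustrates the idea of tiering with plain lists. It is not how dbengine stores data: the fields kept per downsampled point (min, max, sum, count) and the tier-selection rule in `pick_tier` are assumptions made for the example.

```python
def downsample(samples, factor):
    """Aggregate consecutive groups of `factor` samples into summary points."""
    points = []
    for start in range(0, len(samples), factor):
        group = samples[start:start + factor]
        points.append({
            "min": min(group),
            "max": max(group),
            "sum": sum(group),
            "count": len(group),
        })
    return points


def average(point):
    """Average of the original samples behind one summary point."""
    return point["sum"] / point["count"]


def pick_tier(window_seconds, max_points, resolutions=(1, 60, 3600)):
    """Pick the finest tier (seconds per point) that covers the window within max_points.

    Hypothetical rule: prefer high resolution unless it would return too many points.
    """
    for seconds_per_point in resolutions:
        if window_seconds / seconds_per_point <= max_points:
            return seconds_per_point
    return resolutions[-1]
```

With per-second input, `downsample(tier0, 60)` yields per-minute points, and applying it again with a factor of 60 on the sums would yield per-hour points. Keeping min, max, sum and count lets a query still report averages and extremes from the coarser tiers.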

🚀 Does it scale? I really have a lot of servers!

Netdata is designed to scale and can handle large volumes of data.

Click to see detailed answer ...

Netdata is a distributed monitoring solution. You can scale it to infinity by spreading Netdata servers across your infrastructure.

With the streaming feature of the Agent, we can support monitoring ephemeral servers but also allow the creation of "monitoring islands" where metrics are aggregated to a few servers (Netdata Parents) for increased retention, or for offloading production systems.

✈️ Netdata Parents provide great vertical scalability, so you can have as big parents as the CPU, RAM and Disk resources you can dedicate to them. In our lab we constantly stress test Netdata Parents with several million metrics collected per second, to ensure they are reliable, stable, and robust at scale.
🚀 In addition, Netdata Cloud provides virtually unlimited horizontal scalability. It "merges" all the Netdata parents you have into one unified infrastructure at query time. Netdata Cloud itself is probably the biggest single installation monitoring platform ever created, currently monitoring about 100k online servers with about 10k servers changing state (added/removed) per day!

Example: the following chart comes from a single Netdata Parent. As you can see on it, 244 nodes stream to it metrics of about 20k running containers. On this specific chart there are 3 dimensions per container, so a total of about 60k time-series queries are needed to present it.

💾 My production servers are very sensitive to disk I/O. Can I use Netdata?

Yes, you can!

Click to see detailed answer ...

Netdata has been designed to spread disk writes across time. Each metric is flushed to disk every 17 minutes, but metrics are flushed evenly across time, at an almost constant rate. Also, metrics are packed into bigger blocks we call extents and are compressed with LZ4 before saving them, to minimize the number of I/O operations made.

Netdata also employs direct I/O for all its database operations, ensuring optimized performance. By managing its own caches, Netdata avoids overburdening system caches, facilitating a harmonious coexistence with other applications.

Single node Agents (not Parents), should have a constant rate of about 50 KiB/s or less, with some spikes above that every minute (flushing of tier 1) and higher spikes every hour (flushing of tier 2).

Health Alerts and Machine-Learning run queries to evaluate their expressions and learn from the metrics' patterns. These are also spread over time, so there should be an almost constant read rate too.

To make Netdata not use the disks at all, we suggest the following:

Use database mode alloc or ram to disable writing metric data to disk.
Configure streaming to push in real-time all metrics to a Netdata Parent. The Netdata Parent will maintain metrics on disk for this node.
Disable ML and health on this node. The Netdata Parent will do them for this node.
Use the Netdata Parent to access the dashboard.

Using the above, the Netdata Agent on your production system will not use the disk at all.

🤨 How is Netdata different from a Prometheus and Grafana setup?

Netdata is a "ready to use" monitoring solution. Prometheus and Grafana are tools to build your own monitoring solution.

Netdata is also a lot faster, requires significantly fewer resources and puts almost no stress on the server it runs on. For a performance comparison check this blog.

Click to see detailed answer ...

First, we have to say that Prometheus as a time-series database and Grafana as a visualizer are excellent tools for what they do.

However, we believe that such a setup is missing a key element: a Prometheus and Grafana setup assumes that you know everything about the metrics you collect and that you understand deeply how they are structured and how they should be queried and visualized.

In reality, this setup has a lot of problems. The vast number of technologies, operating systems, and applications we use in our modern stacks, makes it impossible for any single person to know and understand everything about anything. We get testimonials regularly from Netdata users across the biggest enterprises, that Netdata manages to reveal issues, anomalies and problems they were not aware of and they didn't even have the means to find or troubleshoot.

So, the biggest difference between Netdata and a Prometheus and Grafana setup is that we decided that the tool needs to have a much better understanding of the components, the applications, and the metrics it monitors.

When compared to Prometheus, Netdata needs for each metric much more than just a name, some labels, and a value over time. A metric in Netdata is a structured entity that correlates with other metrics in a certain way and has specific attributes that depict how it should be organized, treated, queried, and visualized. We call this the NIDL (Nodes, Instances, Dimensions, Labels) framework.

Maintaining such an index is a challenge: first, because the raw metrics collected do not provide this information, so we have to add it, and second, because we need to maintain this index for the lifetime of each metric, which, with our current database retention, is usually more than a year.

At the same time, Netdata provides better retention than Prometheus due to database tiering, scales more easily than Prometheus due to streaming, supports anomaly detection, and has a metrics scoring engine to find the needle in the haystack when needed.
When compared to Grafana, Netdata is fully automated. Grafana has more customization capabilities than Netdata, but Netdata presents fully functional dashboards by itself and most importantly it gives you the means to understand, analyze, filter, slice and dice the data without the need for you to edit queries or be aware of any peculiarities the underlying metrics may have.

Furthermore, to help you when you need to find the needle in the haystack, Netdata has advanced troubleshooting tools provided by the Netdata metrics scoring engine, which scores metrics based on their anomaly rate and their differences or similarities for any given time frame.

Still, if you are already familiar with Prometheus and Grafana, Netdata integrates nicely with them, and we have reports from users who use Netdata with Prometheus and Grafana in production.

🤨 How is Netdata different from DataDog, New Relic, Dynatrace, X SaaS Provider?

With Netdata your data are always on-prem and your metrics are always high-resolution.

Click to see detailed answer ...

Most commercial monitoring providers face a significant challenge: they centralize all metrics to their infrastructure and this is, inevitably, expensive. It leads them to one or more of the following:

be unrealistically expensive
limit the number of metrics they collect
limit the resolution of the metrics they collect

As a result, they try to find a balance: collect the least possible data, but collect enough to have something useful out of it.

We, at Netdata, see monitoring in a completely different way: monitoring systems should be built bottom-up and be rich in insights, so we focus on each component individually to collect, store, check and visualize everything related to each of them, and we make sure that all components are monitored. Each metric is important.

This is why Netdata trains multiple machine-learning models per metric, each based exclusively on the metric's own past (no sampling of data, no sharing of trained models), to detect anomalies based on the specific use case and workload each component is used for.

This is also why Netdata alerts are attached to components (instances) and are configured with dynamic thresholds and rolling windows, instead of static values.
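As an illustration of the difference between a static value and a dynamic threshold over a rolling window, consider the sketch below. It is not Netdata's alerting engine or its configuration syntax: the `RollingThreshold` class, the window size, and the "mean plus k standard deviations" rule are assumptions chosen to keep the example short.

```python
from collections import deque
from statistics import mean, pstdev


class RollingThreshold:
    """Hypothetical alert whose threshold follows the recent values of one instance."""

    def __init__(self, window: int, k: float = 3.0):
        self.values = deque(maxlen=window)  # rolling window of recent samples
        self.k = k  # how many standard deviations above the mean count as a breach

    def check(self, value: float) -> bool:
        """Return True if value breaches the current dynamic threshold, then record it."""
        breached = False
        if len(self.values) == self.values.maxlen:
            baseline = mean(self.values)
            spread = pstdev(self.values)
            breached = value > baseline + self.k * spread
        self.values.append(value)
        return breached


alert = RollingThreshold(window=5, k=3.0)
results = [alert.check(v) for v in [10, 11, 9, 10, 10, 10.5, 40]]
```

Here the threshold is derived from the last five samples of the same instance, so a value of 40 is flagged only because it is far from what this particular component has recently been doing, not because it crosses a fixed number.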

The distributed nature of Netdata helps scale this approach: your data is spread inside your infrastructure, as close to the edge as possible. Netdata is not one data lane. Each Netdata Agent is a data lane and all of them together build a massive distributed metrics processing pipeline that ensures all your infrastructure components and applications are monitored and operating as they should.

🤨 How is Netdata different from Nagios, Icinga, Zabbix, etc?

Netdata offers real-time, comprehensive monitoring, with a user-friendly interface and the ability to monitor everything, without any custom configuration required.

Click to see detailed answer ...

While Nagios, Icinga, Zabbix, and other similar tools are powerful and highly customizable, they can be complex to set up and manage. Their flexibility often comes at the cost of ease-of-use, especially for users who are not systems administrators or do not have extensive experience with these tools. Additionally, these tools generally require you to know what you want to monitor in advance and configure it explicitly.

Netdata, on the other hand, takes a different approach. It provides a "ready to use" monitoring solution with a focus on simplicity and comprehensiveness. It automatically detects and starts monitoring many different system metrics and applications out-of-the-box, without any need for custom configuration.

In comparison to these traditional monitoring tools, Netdata:

Provides real-time, high-resolution metrics, as opposed to the often minute-level granularity that tools like Nagios, Icinga, and Zabbix provide.
Automatically generates meaningful, organized, and interactive visualizations of the collected data. Unlike other tools, where you have to manually create and organize graphs and dashboards, Netdata takes care of this for you.
Applies machine learning to each individual metric to detect anomalies, providing more insightful and relevant alerts than static thresholds.
Is designed to be distributed, so your data is spread inside your infrastructure, as close to the edge as possible. This approach is more scalable and avoids the potential bottleneck of a single centralized server.
Has a more modern and user-friendly interface, making it easy for anyone, not just experienced administrators, to understand the health and performance of their systems.

Even if you're already using Nagios, Icinga, Zabbix, or similar tools, you can use Netdata alongside them to augment your existing monitoring capabilities with real-time insights and user-friendly dashboards.

😳 I feel overwhelmed by the amount of information in Netdata. What should I do?

Netdata is designed to provide comprehensive insights, but we understand that the richness of information might sometimes feel overwhelming. Here are some tips on how to navigate and utilize Netdata effectively...

Click to see detailed answer ...

Netdata is indeed a very comprehensive monitoring tool. It's designed to provide you with as much information as possible about your system and applications, so that you can understand and address any issues that arise. However, we understand that the sheer amount of data can sometimes be overwhelming.

Here are some suggestions on how to manage and navigate this wealth of information:

Start with the Overview Dashboard
Netdata's Overview Dashboard provides a high-level summary of your system's status. We have added summary tiles to almost every section to surface the most important information. This is a great place to start, as it can help you identify any major issues or trends at a glance.
Use the Search Feature
If you're looking for specific information, you can use the search feature to find the relevant metrics or charts. This can help you avoid scrolling through all the data.
Customize your Dashboards
Netdata allows you to create custom dashboards, which can help you focus on the metrics that are most important to you. Sign in to Netdata to create and keep your custom dashboards (coming soon to the agent dashboard too).
Leverage Netdata's Anomaly Detection
Netdata uses machine learning to detect anomalies in your metrics. This can help you identify potential issues before they become major problems. We have added an AR button above the dashboard table of contents to reveal the anomaly rate per section so that you can easily spot what could need your attention.
Take Advantage of Netdata's Documentation and Blogs
Netdata has extensive documentation that can help you understand the different metrics and how to interpret them. You can also find tutorials, guides, and best practices there.

Remember, it's not necessary to understand every single metric or chart right away. Netdata is a powerful tool, and it can take some time to fully explore and understand all of its features. Start with the basics and gradually delve into more complex metrics as you become more comfortable with the tool.

☁️ Do I have to subscribe to Netdata Cloud?

Subscribing to Netdata Cloud is optional, but many users find it enhances their experience with Netdata.

Click to see detailed answer ...

The Netdata Agent dashboard and the Netdata Cloud dashboard are the same. Still, Netdata Cloud provides additional features that the Netdata Agent alone cannot offer. These include:

Access your infrastructure from anywhere.
Have SSO to protect sensitive features.
Customizable (custom dashboards and other settings are persisted when you are signed in to Netdata Cloud)
Configuration of Alerts and Data Collection from the UI (coming soon)
Security (role-based access control - RBAC).
Horizontal Scalability ("blend" multiple independent parents in one uniform infrastructure)
Central Dispatch of Alert Notifications (even when multiple independent parents are involved)
Mobile App for Alert Notifications (coming soon)

So, although it is not required, you can get the most out of your Netdata setup by using Netdata Cloud.

We encourage you to support Netdata by buying a Netdata Cloud subscription. A successful Netdata is a Netdata that evolves and improves to provide simpler, faster and easier monitoring for all of us.

For organizations that need a fully on-prem solution, we provide Netdata Cloud for on-prem installation. Contact us for more information.

🔎 What does the anonymous telemetry collected by Netdata entail?

Your privacy is our utmost priority. As part of our commitment to improving Netdata, we rely on anonymous telemetry data from our users who choose to leave it enabled. This data greatly informs our decision-making processes and contributes to the future evolution of Netdata.

Should you wish to disable telemetry, instructions for doing so are provided in our installation guides.

Click to see detailed answer ...

Netdata is in a constant state of growth and evolution. The decisions that guide this development are ideally rooted in data. By analyzing anonymous telemetry data, we can answer questions such as: "What features are being used frequently?", "How do we prioritize between potential new features?" and "What elements of Netdata are most important to our users?"

By leaving anonymous telemetry enabled, users indirectly contribute to shaping Netdata's roadmap, providing invaluable information that helps us prioritize our efforts for the project and the community.

We are aware that for privacy or regulatory reasons, not all environments can allow telemetry. To cater to this, we have simplified the process of disabling telemetry:

During installation, you can append --disable-telemetry to our kickstart.sh script, or
Create the file /etc/netdata/.opt-out-from-anonymous-statistics and then restart Netdata.

These steps will disable the anonymous telemetry for your Netdata installation.
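The second option above can be scripted. The following is a minimal sketch, assuming the configuration directory is `/etc/netdata` as stated above (it may differ on some installations); the function name `opt_out_of_telemetry` is hypothetical and not part of Netdata.

```python
from pathlib import Path

# File name taken from the instructions above.
OPT_OUT_FILENAME = ".opt-out-from-anonymous-statistics"


def opt_out_of_telemetry(config_dir: str = "/etc/netdata") -> Path:
    """Create the empty opt-out marker file and return its path.

    Netdata still has to be restarted afterwards for the change to apply.
    """
    marker = Path(config_dir) / OPT_OUT_FILENAME
    marker.touch(exist_ok=True)  # an empty file is enough; its presence is the signal
    return marker
```

Calling `opt_out_of_telemetry()` with the default directory usually requires root privileges, and restarting Netdata remains a separate step.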

Please note, even with telemetry disabled, Netdata still requires a Netdata Registry for alert notifications' Call To Action (CTA) functionality. When you click an alert notification, it redirects you to the Netdata Registry, which then directs your web browser to the specific Netdata Agent that issued the alert for further troubleshooting. The Netdata Registry learns the URLs of your agents when you visit their dashboards.

Any Netdata Agent can act as a Netdata Registry. Simply designate one Netdata Agent as your registry, and our global Netdata Registry will no longer be in use. For further information on this, please refer to this guide.

😏 Who uses Netdata?

Netdata is a widely adopted project...

Click to see detailed answer ...

Browse the Netdata stargazers on GitHub to discover users from renowned companies and enterprises, such as ABN AMRO Bank, AMD, Amazon, Baidu, Booking.com, Cisco, Delta, Facebook, Google, IBM, Intel, Logitech, Netflix, Nokia, Qualcomm, Realtek Semiconductor Corp, Redhat, Riot Games, SAP, Samsung, Unity, Valve, and many others.

Netdata also enjoys significant usage in academia, with notable institutions including New York University, Columbia University, New Jersey University, Seoul National University, University College London, among several others.

Netdata is also used by numerous governmental organizations worldwide.

In a nutshell, Netdata proves invaluable for:

Infrastructure intensive organizations
Such as hosting/cloud providers and companies with hundreds or thousands of nodes, who require a high-resolution, real-time monitoring solution for a comprehensive view of all their components and applications.
Technology operators
Those in need of a standardized, comprehensive solution for round-the-clock operations. Netdata not only facilitates operational automation and provides controlled access for their operations engineers, but also enhances skill development over time.
Technology startups
Who seek a feature-rich monitoring solution from the get-go.
Freelancers
Who seek a simple, efficient and straightforward solution without sacrificing performance and outcomes.
Professional SysAdmins and DevOps
Who appreciate the fine details and understand the value of holistic monitoring from the ground up.
Everyone else
All of us, who are tired of the inefficiency in the monitoring industry and would love a refreshing change and a breath of fresh air. 🙂

🌐 Is Netdata open-source?

The Netdata Agent back-end is entirely open-source. We ship 3 different versions of the UI: 2 open-source versions and 1 closed-source version.

Click to see detailed answer ...

The entire back-end of the Netdata Agent is open-source, licensed under GPLv3+. We don't develop a separate enterprise version. All users, including commercial ones, use the same Netdata Agent.

The Netdata Agent is shipped with multiple UI versions:

http://agent.ip:19999/v0/, the original open-source single-node UI, GPLv3+.
http://agent.ip:19999/v1/, the latest open-source single-node UI, GPLv3+.
http://agent.ip:19999/v2/, a snapshot of the latest Netdata Cloud UI as it was at the time the agent was released, licensed to be distributed with Netdata Agents under NCUL1.

When you access a Netdata Agent via http://agent.ip:19999/ a splash screen attempts to use the latest live version of Netdata Cloud UI (downloaded from Cloudflare). This only happens when the web browser has internet connectivity and Netdata Cloud is not disabled in the agent configuration. Otherwise, it falls back to http://agent.ip:19999/v2/.

The Netdata Cloud UI is not open-source. However, we believe it benefits the community to allow everyone to use it directly with Netdata Agents, for free, even if Netdata Cloud is not used.
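The UI selection rules above can be summarized in a few lines of code. This is only a sketch of the described behavior, not the agent's actual implementation: the function names and the two boolean inputs are assumptions, while the port and paths are the ones listed above.

```python
# Paths of the bundled UI versions, as listed above.
UI_PATHS = {"v0": "/v0/", "v1": "/v1/", "v2": "/v2/"}


def ui_url(agent_host: str, version: str, port: int = 19999) -> str:
    """Build the URL of a bundled UI version on a given agent."""
    return f"http://{agent_host}:{port}{UI_PATHS[version]}"


def splash_target(agent_host: str, browser_online: bool, cloud_disabled: bool) -> str:
    """Describe which UI the splash screen at / ends up using."""
    if browser_online and not cloud_disabled:
        return "latest live Netdata Cloud UI"
    # No internet access, or Netdata Cloud disabled: use the bundled snapshot.
    return ui_url(agent_host, "v2")
```

For example, `splash_target("agent.ip", browser_online=False, cloud_disabled=False)` returns the `/v2/` URL, which matches the fallback described above.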

💰 What is your monetization strategy?

Netdata generates revenue through subscriptions to advanced features of Netdata Cloud and sales of on-premise and private versions of Netdata Cloud.

Click to see detailed answer ...

Netdata generates revenue from these activities:

Netdata Cloud Subscriptions
Direct funding for our project's vision comes from users subscribing to Netdata Cloud's advanced features.
Netdata Cloud On-Prem or Private
Purchasing the on-premises or private versions of Netdata Cloud supports our financial growth.

Our open-source community and the free access to Netdata Cloud contribute to Netdata in the following ways:

Netdata Cloud Community Use
The free usage of Netdata Cloud demonstrates its market relevance. While this doesn't generate revenue, it reinforces trust among new users and aids in securing appropriate project funding.
User Feedback
Feedback, especially issues and bug reports, is invaluable. It steers us towards a more resilient and efficient product. This, too, isn't a revenue source but is pivotal for our project's evolution.
Anonymous Telemetry Insights
Users who keep anonymous telemetry enabled help us make data-informed decisions in refining and enhancing Netdata. This isn't a revenue stream, but knowing which features are used, and how, contributes to building a better product for everyone.

We don't monetize, directly or indirectly, users' or "device heuristics" data. Any data collected from community members are exclusively used for the purposes stated above.

Netdata grows financially when technology-intensive organizations and operators need, due to regulatory or business requirements, the entire Netdata suite (including Netdata Cloud) on-prem or private, bundled with top-tier support. It is a win-win for all parties involved: these companies get a battle-tested, robust and reliable solution, while the broader community that helps us build this product enjoys it at no cost.

📖 Documentation

Netdata's documentation is available at Netdata Learn.

This site also hosts a number of guides to help newer users better understand how to collect metrics, troubleshoot via charts, export to external databases, and more.

🎉 Community

Netdata is an inclusive open-source project and community. Please read our Code of Conduct.

Join the Netdata community:

Chat with us and other community members on Discord.
Start a discussion on GitHub discussions.
Open a topic on our community forums.

Meet Up 🧑‍🤝‍🧑🧑‍🤝‍🧑🧑‍🤝‍🧑
The Netdata team and community members have regular online meetups, usually every 2 weeks.
You are welcome to join us! Click here for the schedule.

🙏 Contribute

Contributions are essential to the success of open-source projects. In other words, we need your help to keep Netdata great!

What is a contribution? All the following are highly valuable to Netdata:

Let us know of the best practices you believe should be standardized
Netdata should detect as many infrastructure issues as possible out of the box. By sharing your knowledge and experiences, you help us build a monitoring solution with infrastructure monitoring best practices baked in.
Let us know if Netdata is not perfect for your use case
We aim to support as many use cases as possible and your feedback can be invaluable. Open a GitHub issue, or start a GitHub discussion about it, to discuss how you want to use Netdata and what you need.

Although we can't implement everything imaginable, we try to prioritize development on use cases that are common to our community, point in the same direction we want Netdata to evolve, and are aligned with our roadmap.
Support other community members
Join our community on GitHub, Discord and Reddit. Generally, Netdata is relatively easy to set up and configure, but still people may need a little push in the right direction to use it effectively. Supporting other members is a great contribution by itself!
Add or improve integrations you need
Integrations tend to be easier and simpler to develop. If you would like to contribute your code to Netdata, we suggest that you start with the integrations you need, which Netdata does not currently support.

General information about contributions:

Check our Security Policy.
Found a bug? Open a GitHub issue.
Read our Contributing Guide, which contains all the information you need to contribute to Netdata, such as improving our documentation, engaging in the community, and developing new features. We've made it as frictionless as possible, but if you need help, just ping us on our community forums!

Package maintainers should read the guide on building Netdata from source for instructions on building each Netdata component from source and preparing a package.

License

Netdata is released under GPLv3+. Netdata re-distributes other open-source tools and libraries. Please check the third party licenses.

The latest Netdata UI is distributed under NCUL1. It also uses third party open source components. Check the UI third party licenses.