====== Server Monitoring ======

See also: [[/informatique/web/webstats|/informatique/web/webstats]]

Further reading:
  * [[http://www.thegeekstuff.com/2011/03/linux-performance-monitoring-intro/|Linux Performance Monitoring and Tuning Introduction]]
  * [[https://www.libertycenterone.com/blog/five-of-the-best-free-network-monitoring-software-tools/|Five Of The Best Free Network Monitoring Software Tools]]
  * [[https://www.webpagefx.com/blog/web-design/10-free-server-network-monitoring-tools-that-kick-ass/|10 Free Server & Network Monitoring Tools that Kick Ass]]
===== Tools =====

  * [[http://htop.sourceforge.net/|htop]] monitors server activity like top, but with a more ergonomic, improved interface. It lists processes, load average, RAM and swap usage, and more.
  * [[http://www.atcomputing.nl/Tools/atop/index.html|atop]] also monitors the activity of a Linux server: processes, disk activity, CPU load, network usage, and memory (RAM and swap).
  * [[http://freshmeat.net/projects/apachetop|apachetop]] works on the same principle as top, but for the Apache web server. It shows requests per second, bytes per second, the most popular URL, and so on.
  * [[http://jeremy.zawodny.com/mysql/mytop/|mytop]] monitors MySQL queries and performance. It supports server versions 3.22.x, 3.23.x, 4.x and 5.x. Abandoned?
  * [[http://mtop.sourceforge.net/|mtop]] traces and diagnoses running queries and their progress, which helps in tracking down badly written queries. Abandoned?
  * [[http://ptop.projects.postgresql.org/|ptop or pg_top]] is similar to top for PostgreSQL. It shows running queries, SQL query execution plans, locks, and statistics.
  * [[http://dns.measurement-factory.com/tools/dnstop/|dnstop]] displays the traffic of a DNS server; in particular, it helps identify unwanted queries.
  * [[http://www.ex-parrot.com/pdw/iftop/|iftop]] displays bandwidth usage on a network interface.

  - top - Process Activity Command
  - vmstat - System Activity, Hardware and System Information
  - w - Find Out Who Is Logged on And What They Are Doing
  - uptime - Tell How Long The System Has Been Running
  - ps - Displays The Processes
  - free - Memory Usage
  - mpstat - Multiprocessor Usage (package "sysstat")
  - pmap - Process Memory Usage
  - netstat - Network Statistics
  - ss - Network Statistics
  - iptraf - Real-time Network Statistics (package "iptraf")
  - tcpdump - Detailed Network Traffic Analysis
  - strace - System Calls
  - /proc file system - Various Kernel Statistics
  - lsof - list open files, network connections and much more.
  - nmap - scan your server for open ports.
  - ntop (web-based) - network traffic monitoring software that shows network usage much as top shows processes. It reports network status and the per-protocol distribution of traffic for UDP, TCP, DNS, HTTP and other protocols.
  - mtr - combines the functionality of traceroute and ping in a single network diagnostic tool. See [[http://www.cyberciti.biz/tips/finding-out-a-bad-or-simply-overloaded-network-link-with-linuxunix-oses.html|Finding out a bad or simply overloaded network link with Linux/UNIX oses]]

Overview of these tools:
  * http://www.cyberciti.biz/tips/top-linux-monitoring-tools.html
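
As a small illustration of the /proc file system entry in the list above, here is a minimal sketch that reads the load averages and the available memory directly from /proc. The function names and the optional path argument are illustrative additions, not part of any tool listed here; the sketch assumes a Linux kernel recent enough (3.14 or later) to report ''MemAvailable''.

```shell
#!/bin/sh
# Print the 1-, 5- and 15-minute load averages from a loadavg-style file.
# The optional path argument is an illustrative addition; it defaults to /proc/loadavg.
load_averages() {
    awk '{ printf "load1=%s load5=%s load15=%s\n", $1, $2, $3 }' "${1:-/proc/loadavg}"
}

# Print the available memory in MiB from a meminfo-style file (default /proc/meminfo).
# MemAvailable is given in kB, so the value is divided by 1024.
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Calling ''load_averages'' and ''mem_available_mib'' without arguments reads the live values of the current machine.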


==== Status page ====

=== Cachet ===

https://cachethq.io/

The open source status page system. Beautifully crafted, Translated, JSON API, Scheduled maintenance, Metrics, Two-factor authentication.


==== sysstat ====

With sar you can monitor the performance of various Linux subsystems (CPU, memory, I/O, ...) in real time. You can also collect performance data on an ongoing basis, store it, and analyze the history to identify bottlenecks. sar is part of the sysstat package.

Further reading:
  * [[http://sebastien.godard.pagesperso-orange.fr/documentation.html|Documentation]]
  * [[http://www.thegeekstuff.com/2011/03/sar-examples/|10 Useful Sar (Sysstat) Examples for UNIX / Linux Performance Monitoring]]
  * [[http://www.tecmint.com/sysstat-commands-to-monitor-linux/|20 Useful Commands of ‘Sysstat’ Utilities (mpstat, pidstat, iostat and sar) for Linux Performance Monitoring]]

=== On Debian ===

  sudo apt-get install sysstat

The package installs its default settings in:
  * /etc/init.d/sysstat
  * /etc/default/sysstat
  * /etc/cron.d/sysstat
  * /etc/cron.daily/sysstat

Enable data collection:
  sudo vi /etc/default/sysstat
  # Should sadc collect system activity informations? Valid values
  # are "true" and "false". Please do not put other values, they
  # will be overwritten by debconf!
  ENABLED="false"
Set ''ENABLED="true"''
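
The same change can be scripted instead of edited by hand. This is a minimal sketch, assuming GNU sed (the default on Debian) and a file that still contains the stock ''ENABLED="false"'' line; the function name and its path argument are illustrative additions.

```shell
#!/bin/sh
# Switch ENABLED="false" to ENABLED="true" in a sysstat defaults file.
# The path argument is an illustrative addition; on Debian the real file
# is /etc/default/sysstat and editing it requires root.
enable_sysstat_collection() {
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$1"
}
```

Run as root, ''enable_sysstat_collection /etc/default/sysstat'' followed by ''/etc/init.d/sysstat restart'' should start the collection. Afterwards ''sar -u 1 3'' prints CPU utilization three times at one-second intervals.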

==== MRTG ====

See [[informatique:mrtg|/informatique/MRTG]]

==== Nagios ====

[[http://www.nagios.org/|Nagios]]

Server and network monitoring. Nagios is a popular open source application for monitoring computer systems and networks. It can monitor hosts, network equipment and services, and it sends alerts when things go wrong and again when they recover.

[[http://fannagioscd.sourceforge.net/drupal/|FAN]] stands for "Fully Automated Nagios". FAN aims to provide a Nagios installation that includes most of the tools offered by the Nagios community. It ships as a CD-ROM image in the standard ISO format, which makes installing a Nagios server easy. The distribution also bundles a wide range of additional tools to improve the user experience around Nagios.

==== Netdata ====

https://github.com/firehol/netdata/wiki

netdata is a scalable, distributed, real-time performance and health monitoring solution for Linux, FreeBSD and macOS. It is also open source.
Out of the box, it collects 1k to 5k metrics per server per second. It is the equivalent of top, vmstat, iostat, iotop, sar, systemd-cgtop and a dozen more console tools running in parallel. netdata does this efficiently: the daemon needs just 1% to 3% CPU of a single core, even when it runs on IoT devices.

Many people view netdata as a collectd + graphite + grafana alternative, or compare it with cacti or munin. All these are really great tools, but they are not netdata. Let's see why.

My primary goal when I was designing netdata was to help us find why our systems and applications are slow or misbehaving. To provide a system that could kill the console for performance monitoring.

To do this, I decided that:
  * high-resolution metrics are more important than long history
  * the more metrics collected, the better - we should not fear to add 1k metrics more
  * effective monitoring starts with monitoring everything about each node

==== Grafana ====

http://grafana.org

==== Prometheus ====

[[/informatique/system_admin/Prometheus|Prometheus]]

Prometheus differs from Loki by focusing on **metrics instead of logs**, and collecting them via **pull**, instead of **push**.

==== Grafana Loki ====

https://github.com/grafana/loki

Loki differs from Prometheus by focusing on **logs instead of metrics**, and delivering logs via **push**, instead of **pull**.
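
The push versus pull distinction can be sketched from the command line. The function below only builds a JSON body for Loki's ''/loki/api/v1/push'' endpoint; the function name, the ''job'' label, and the host and port shown in the comments are assumptions made for illustration. It relies on GNU date for the nanosecond timestamp and does no JSON escaping, so it assumes the label value and log line contain no quotes or backslashes.

```shell
#!/bin/sh
# Build a minimal JSON body for Loki's push endpoint (/loki/api/v1/push).
# Assumption: $1 (job label) and $2 (log line) contain no quotes or backslashes.
loki_push_body() {
    job=$1
    line=$2
    ts=$(date +%s%N)   # Unix time in nanoseconds (GNU date)
    printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$job" "$ts" "$line"
}

# Push: the client sends the log line to Loki (hypothetical local instance):
#   curl -s -H 'Content-Type: application/json' -X POST \
#        --data "$(loki_push_body demo 'disk almost full')" \
#        http://localhost:3100/loki/api/v1/push
#
# Pull: Prometheus itself fetches metrics over HTTP from each target, much as
#   curl -s http://localhost:9100/metrics
# would (port 9100 is assumed here, as commonly used by node_exporter).
```

With push, the monitored host decides when to send data; with pull, the monitoring server decides when to fetch it.
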
==== Glances ====

Glances is free software (distributed under the LGPL license) for monitoring a GNU/Linux or BSD operating system from a text interface. It uses the libstatgrab library to retrieve system information and is written in Python.

  * Author: [[http://blog.nicolargo.com|nicolargo]], [[http://blog.nicolargo.com/2011/12/presentation-complete-de-glances.html|Presentation]] (in French).

==== Shinken ====

A Python Nagios® Core total rewrite.

http://shinken-monitoring.org

  * [[http://blog.nicolargo.com/2012/11/installation-pas-a-pas-dun-serveur-de-supervision-shinken.html|Installation pas à pas d’un serveur de supervision Shinken]] (step-by-step installation of a Shinken monitoring server, in French) by [[http://blog.nicolargo.com|nicolargo]].

> There is also Shinken, but it works more or less how it wants and when it wants. Since then, it uses an enormous amount of RAM for the service it provides.

==== Zabbix ====

[[http://www.zabbix.com/|Zabbix]]

>(sd-basic@ml.ovh.net) It works rather well, but older versions use a huge amount of CPU on the server (supposedly fixed in 1.6, but I have not tested it yet)

  * [[https://github.com/zabbixcn/zabbix-nexmo|zabbix-nexmo]] Simple bash scripts for sending Zabbix alerts to SMS and telephone via Nexmo.

==== SNM ====

http://snm.sourceforge.net/

==== CACTI ====

Web-based Monitoring Tool. Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged in users, Apache, DNS servers and much more. See [[http://www.cyberciti.biz/faq/fedora-rhel-install-cacti-monitoring-rrd-software/|how to install and configure Cacti network graphing tool under CentOS / RHEL]].


http://www.cacti.net/

==== Ovh ====

OCO

RTM

==== Snort ====

Snort SP

Snort RNA

[[informatique:Snort|informatique/Snort]]

==== OpenPOM ====

This web user interface lets you view almost everything about Nagios and Icinga on a single page: alerts, acknowledgements, downtimes, comments. You can also interact with Nagios and Icinga through the ack, downtime, comment, disable and reset buttons.

  * [[http://code.google.com/p/openpom]]
  * [[http://www.exosec.fr/POM-plateforme-open-source-monitoring.html]]

==== zenoss ====

http://www.zenoss.org/

==== icinga ====

https://www.icinga.org/
===== Services =====

==== UpTimeRobot ====

[[https://uptimerobot.com|https://uptimerobot.com]]

Free plan

==== HetrixTools ====

[[https://hetrixtools.com|https://hetrixtools.com]]

Free plan


==== Server Density ====

http://serverdensity.com/

  * Graphical Data
  * Server resource usage (lightweight python agent)
  * Email / SMS alert
  * iPhone application

Pricing:
  * Free for 1 server
  * £10 to £15 / server / month

==== pingdom ====

http://www.pingdom.com/

==== newrelic ====

https://newrelic.com/

==== LogMatic ====

LogMatic.io