Difference between revisions of "Configuring nrpe based internal service checks"

From Notes_Wiki

Revision as of 07:51, 22 January 2019

Configuring nrpe based internal service checks

We can use nrpe to monitor the status of processes, hard-disk usage, CPU usage, etc. on a host. Since these things cannot be checked remotely, we run the nrpe agent on the machine whose parameters we want to check.


Install nrpe

Install nrpe using yum on the machine that is to be monitored remotely. This also happens automatically if we use 'yum -y install nagios*'.


Configure nrpe or xinetd for nagios server

Since it is not safe to give out host information to everyone, we have to configure the IPs of the nagios servers that are allowed to poll for information.

  • Xinetd
    Edit file '/etc/xinetd.d/nrpe', set 'disable = no' and add the IP of the nagios server to the 'only_from' directive.
  • Nrpe
    Edit file '/etc/nagios/nrpe.cfg' and add the nagios server IPs as comma-separated values to the line 'allowed_hosts=127.0.0.1'
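
The allowed_hosts edit can also be scripted. The following is a minimal sketch, not something shipped with nrpe: the helper name add_allowed_host and the address 192.0.2.10 are placeholders chosen for illustration, and the file path is passed as an argument so that the helper can be tried on a copy of nrpe.cfg first.

```shell
# Hypothetical helper: append one IP to the allowed_hosts line of an nrpe.cfg file.
# Usage: add_allowed_host <cfg_file> <ip>
add_allowed_host() {
    cfg="$1"
    ip="$2"
    # Escape the dots so that the IP is matched literally in the regular expression.
    ip_re=$(printf '%s' "$ip" | sed 's/\./\\./g')
    # Do nothing if the IP is already present on the allowed_hosts line.
    if grep -Eq "^allowed_hosts=(.*,)?${ip_re}(,.*)?$" "$cfg"; then
        return 0
    fi
    # '&' is the whole matched line; the new IP is appended after a comma.
    sed "s/^allowed_hosts=.*/&,${ip}/" "$cfg" > "$cfg.new" &&
        cat "$cfg.new" > "$cfg" &&
        rm -f "$cfg.new"
}

# Example (placeholder address, run against a copy of the file):
#   add_allowed_host ./nrpe.cfg.copy 192.0.2.10
```

Calling the helper twice with the same IP leaves the line unchanged the second time, so it is safe to run repeatedly.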


Start xinetd service

Start the xinetd service using 'service xinetd start' if xinetd will be used to run nrpe. Also do 'chkconfig xinetd on' on the host to be monitored, so that the host responds to nrpe queries even after a reboot.


Configure commands in nrpe server

We can configure additional commands that should be supported in file /etc/nagios/nrpe.cfg. One command that you may want to adjust on a per-server basis is 'check_hda1'. Not all servers have a disk 'hda1', and there can be several partitions on each disk of a server. Hence we can configure more check commands.

command[check_md0]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/md0

The above command can be used to check device /dev/md0 in case the server being monitored uses RAID.
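
When a server has several partitions, the same kind of line is needed once per device. The following sketch prints such lines in the format shown above; the function name nrpe_disk_command and the device /dev/sda1 are hypothetical, and the plugin path is simply the one used in the example above.

```shell
# Hypothetical helper: print an nrpe.cfg command definition for one disk device.
# Usage: nrpe_disk_command <name> <device> <warn_percent> <crit_percent>
nrpe_disk_command() {
    # %% prints a literal percent sign after each threshold.
    printf 'command[check_%s]=/usr/lib64/nagios/plugins/check_disk -w %s%% -c %s%% -p %s\n' \
        "$1" "$3" "$4" "$2"
}

# Reproduces the md0 line from above, then a second (assumed) partition.
nrpe_disk_command md0 /dev/md0 20 10
nrpe_disk_command sda1 /dev/sda1 20 10
```

The printed lines can be reviewed and then appended to the nrpe configuration file on the host being monitored.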


Allowing connecting to nrpe in firewall

We can add rule like

iptables -I INPUT -s <nagios_server_IP> -p tcp -m tcp --dport 5666 -j ACCEPT

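to allow nagios to connect to the nrpe client. A rule inserted this way takes effect immediately. To keep it across restarts, save it with 'service iptables save', or add it to /etc/sysconfig/iptables and do 'service iptables restart' on the host.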


Checking nrpe connectivity to remote hosts

We can check whether nagios can communicate with the nrpe client using the command

/usr/lib64/nagios/plugins/check_nrpe -H <host_to_be_monitored>

The command should be run on the nagios server. The file 'check_nrpe' can be in a different location; use the updatedb and locate combination to find it. The above command should return the NRPE version. If it returns an error, then something is blocking connections or the nrpe client is not running on the destination host. Try temporarily disabling SELinux on the destination host to check whether SELinux is causing the problem.
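
The connectivity check can be wrapped in a small function when several hosts need to be tested. This is only a sketch: the function name probe_nrpe and the CHECK_NRPE variable are made up for this example, and the default path is the one used above, which may differ on other systems.

```shell
# Location of check_nrpe; an assumption, override it if the file lives elsewhere.
CHECK_NRPE="${CHECK_NRPE:-/usr/lib64/nagios/plugins/check_nrpe}"

# Hypothetical helper: run check_nrpe against one host and report the result.
# Usage: probe_nrpe <host_to_be_monitored>
probe_nrpe() {
    host="$1"
    # check_nrpe exits with 0 when the nrpe client answered.
    if out=$("$CHECK_NRPE" -H "$host" 2>&1); then
        echo "OK: $host answered: $out"
    else
        echo "FAIL: $host did not answer: $out"
        return 1
    fi
}

# Example (labpc is the sample host name used elsewhere on this page):
#   probe_nrpe labpc
```

A FAIL line points back to the causes listed above: firewall, nrpe or xinetd not running, allowed hosts, or SELinux.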


Configure nrpe in nagios server commands.cfg file

We can configure the 'check_nrpe' command in the commands.cfg file. Other check commands can then be passed as arguments to this check_nrpe command to be run on the remote host. The lines that can be added to the commands.cfg file on the nagios server to enable the 'check_nrpe' command are

define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }


Configuring remote checks using nrpe in nagios hosts configuration file

If the nagios server can connect to the client using 'check_nrpe', then we can configure the host definition file on the server to monitor parameters like disk space, processes, etc. A sample definition which uses nrpe to check the load on the destination machine is

define service{
          use                 generic-service
          host_name           labpc
          service_description CPU Load
          check_command       check_nrpe!check_load
          }

Note that for this to work, the command check_nrpe should be configured in the nagios server commands.cfg file and check_load should be configured in the destination host's /etc/nagios/nrpe.cfg file.
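
To see how the two files fit together, it helps to spell out what nagios ends up running for a check_command such as 'check_nrpe!check_load'. The sketch below only imitates the macro substitution for this one command definition; it is not how nagios is implemented. The function name expand_check_nrpe, the plugin directory and the address 192.0.2.20 are assumptions for illustration.

```shell
# Hypothetical helper: show the command line that results from
# "command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$".
# Usage: expand_check_nrpe <plugin_dir> <host_address> <check_command>
expand_check_nrpe() {
    plugin_dir="$1"      # stands in for $USER1$
    host_address="$2"    # stands in for $HOSTADDRESS$
    check_command="$3"   # for example check_nrpe!check_load
    arg1=${check_command#*!}   # drop the command name and the first '!'
    arg1=${arg1%%!*}           # keep only the first argument ($ARG1$)
    printf '%s/check_nrpe -H %s -c %s\n' "$plugin_dir" "$host_address" "$arg1"
}

expand_check_nrpe /usr/lib64/nagios/plugins 192.0.2.20 'check_nrpe!check_load'
# prints: /usr/lib64/nagios/plugins/check_nrpe -H 192.0.2.20 -c check_load
```

The name after '-c' is then looked up on the destination host, which is why check_load has to be defined in its nrpe configuration.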


Other sample remote check configurations

define service{
          use                 generic-service
          host_name           labpc
          service_description Current Users
          check_command       check_nrpe!check_users
          }


define service{
          use                 generic-service
          host_name           labpc
          service_description /dev/md0 Free Space
          check_command       check_nrpe!check_md0
          }

define service{
          use                 generic-service
          host_name           labpc
          service_description Total Processes
          check_command       check_nrpe!check_total_procs
          }


Changing warning parameters

Note that you can change the threshold values for warning and critical in the configuration file /etc/nagios/nrpe.cfg on the destination host, where the commands are defined. For example, to warn when there are more than 200 processes instead of the default 150, we can change the values 150, 200 to 200, 250, resulting in a configuration line like

command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 200 -c 250 
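
The effect of the two values can be illustrated with a small sketch. It is not the code of check_procs; it only mirrors the usual plugin convention that a plain number N as threshold raises the alert when the measured value is above N, and that the exit codes 0, 1 and 2 mean OK, WARNING and CRITICAL. The function name classify_procs is made up for this example.

```shell
# Hypothetical helper: classify a process count against -w and -c style thresholds.
# Usage: classify_procs <count> <warn> <crit>
classify_procs() {
    count="$1"
    warn="$2"
    crit="$3"
    if [ "$count" -gt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$count" -gt "$warn" ]; then
        echo "WARNING"
        return 1
    else
        echo "OK"
        return 0
    fi
}

# With -w 200 -c 250, a host running 220 processes is reported as a warning:
#   classify_procs 220 200 250
```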


Automated nagios configuration

For automated nagios configuration refer to:

