Nagios: open source system monitoring software
This page has my notes from the conference.
To let Nagios run thousands of alert checks per minute, we needed to tune a few things.
First, we moved the database and program directories (/var/lib and /usr/local) to a high-speed EMC disk array.
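We also set up a 500 MB tmpfs RAM disk. The /etc/fstab entry:
none /mnt/ram tmpfs size=500M 0 0
Then, as root, create the mount point and mount it:
# mkdir -p /mnt/ram; mount /mnt/ram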
NOTE: These instructions apply to a system that has not previously had NRPE installed on it. If NRPE has been previously installed, you will need to back up the nrpe.cfg file to ensure it is not lost.
NOTE2: The system you are installing this on MUST have access to your Nagios server on port 80. You can verify this by typing "telnet nagiosserver.mycompany.com 80".
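Where telnet is not installed, the same reachability test can be sketched in Python. This is only an illustration: the function name can_reach and the 5 second timeout are assumptions, not part of the original procedure. It confirms that a TCP connection opens, nothing more.

```python
import socket


def can_reach(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling can_reach("nagiosserver.mycompany.com") from the system being set up should return True before you continue with the install.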
1. Install RPMForge on the Linux system
If you have not already installed it, refer to this document: http://wiki.centos.org/AdditionalResources/Repositories/RPMForge
2. Next, install nagios-nrpe on the system
rpm -q nagios-nrpe || yum --enablerepo=rpmforge install nagios-nrpe
*NOTE* This script WILL record passwords.
Each hostname will have a fully qualified domain name.
Please refer to JMX::Jmx4Perl::Manual for instructions on how to deploy the agent servlet (which can be found in the distribution as agent/j4p.war).
Java Management Extensions (JMX) technology is a new feature in version 5.0 of the Java 2 Platform, Standard Edition (J2SE). If you are already familiar with JMX technology, see Appendix A, "JMX Technology Versions" for version information.
We need to monitor part of a page. The check must match content on the page using regular expressions, and it must be able to log into the page first. Example of expression: /Temp_P18.*\n.*\n.*;(.*) Deg. C.*\n.*\n.*\n.*;(.*) %/
We need to capture the values in the "()" groups above and run checks against those values.
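The capture-and-compare requirement above can be sketched in Python, whose re module treats this expression the same way ("." does not match a newline by default). Everything except the expression itself is an assumption made for illustration: the SAMPLE_PAGE fragment, the check_page function, and the threshold arguments are hypothetical, and fetching the page and logging in are left out.

```python
import re

# Expression from the notes above, unchanged.
PATTERN = re.compile(r"Temp_P18.*\n.*\n.*;(.*) Deg. C.*\n.*\n.*\n.*;(.*) %")

# Hypothetical page fragment, invented only to exercise the expression.
SAMPLE_PAGE = (
    "<td>Temp_P18</td>\n"
    "<td>Temperature</td>\n"
    "<td>&nbsp;23.5 Deg. C</td>\n"
    "<td>Humidity_P18</td>\n"
    "<td>Humidity</td>\n"
    "<td>&nbsp;41 %</td>\n"
)

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_page(text, temp_warn, temp_crit):
    """Capture both values and compare the temperature to the thresholds.

    Returns (exit_code, message). The thresholds are hypothetical parameters.
    """
    match = PATTERN.search(text)
    if match is None:
        return UNKNOWN, "UNKNOWN: pattern not found on page"
    try:
        temp = float(match.group(1))      # first "()" group
        humidity = float(match.group(2))  # second "()" group
    except ValueError:
        return UNKNOWN, "UNKNOWN: captured values are not numeric"
    detail = f"temperature {temp} Deg. C, humidity {humidity} %"
    if temp >= temp_crit:
        return CRITICAL, "CRITICAL: " + detail
    if temp >= temp_warn:
        return WARNING, "WARNING: " + detail
    return OK, "OK: " + detail
```

A real plugin would print the message and exit with the returned code, which is how Nagios reads a check result. The humidity value is captured but only reported here. It could be compared against its own thresholds in the same way.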
We're seeing some monitors report an HTTP-FORBIDDEN error from the check_http scripts.
nsca - Daemon and client program for sending passive check results across the network