Although it is not very well documented, the integration between Puppet and Nagios can be very useful, especially in a highly modularized environment where components are shared between multiple instances, such as a web farm.
I intend to write a short tutorial page on this site with my recent insights on Puppet and Nagios integration, but one thing I do want to mention now is how to deal with large numbers of Puppet-defined service checks.
By default, all collected nagios_service resources are written to a single config file, /etc/nagios/nagios_service.cfg. Once you reach several hundred defined service checks, however, the Puppet agent on your Nagios server gets slower and slower; a Puppet run that takes as much as an hour is no longer exceptional. This is easy to explain: Puppet has to verify whether each service check definition has changed since the previous run, and it does so by scanning that one file again for each and every defined service.
The best way to overcome this issue is to use a config directory instead of a single config file. To do this, pass a “target” parameter to each nagios_service resource:
@@nagios_service { "PING-${fqdn}":
  ensure    => present,
  host_name => $fqdn,
  ...
  # one small config file per service check instead of one huge shared file
  target    => "/etc/nagios/nagios_services.d/ping-${fqdn}.cfg",
}
This way, for each defined service check, Puppet no longer has to scan a text file containing thousands of lines. It only has to verify that the small config file exists and that its content (just a couple of lines) is still up to date.
In my experience, this gives a speedup in the 50-fold range.
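The per-file targets only help if the Nagios server actually has the directory and reads it. The following is a minimal sketch of the collecting side, under a few assumptions that are not from the setup above: the service is called 'nagios', exported resources (storeconfigs) are already enabled, and nagios.cfg contains a cfg_dir line pointing at the directory.

```puppet
# Assumption: nagios.cfg on this server contains
#   cfg_dir=/etc/nagios/nagios_services.d
# so that Nagios loads every .cfg file in the directory.

# The target directory must exist before any service file is written.
# Do not purge it: the files written by nagios_service are not File
# resources, so a purge would remove them again.
file { '/etc/nagios/nagios_services.d':
  ensure => directory,
}

# Assumption: the Nagios service is named 'nagios' on this platform.
service { 'nagios':
  ensure => running,
}

# Collect every exported service check and reload Nagios on changes.
Nagios_service <<| |>> {
  require => File['/etc/nagios/nagios_services.d'],
  notify  => Service['nagios'],
}
```

Depending on the Puppet version, the generated files may not be readable by the Nagios user, so it is worth checking their ownership and mode after the first run.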
Of course, you can do the same thing with nagios_host, nagios_command and the other Nagios resource types, but you’ll find that the largest gain by far comes from the service check optimisation.
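As an illustration of the same pattern for hosts, here is a small sketch. The directory name nagios_hosts.d and the 'generic-host' template are assumptions made up for this example; adapt them to your own layout, and add a matching cfg_dir line to nagios.cfg.

```puppet
# Hypothetical example: one config file per exported host definition.
@@nagios_host { $fqdn:
  ensure  => present,
  address => $ipaddress,
  # assumption: a host template with this name exists in your Nagios config
  use     => 'generic-host',
  # assumption: this directory exists and is listed as a cfg_dir
  target  => "/etc/nagios/nagios_hosts.d/${fqdn}.cfg",
}
```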