AIX Error Logging Overview
errreporter script. Download the sample errreporter.conf configuration file.

The AIX Error Logging Facility
Sandor W. Sklar
http://web.stanford.edu/~ssklar/errreporter/article.html
http://www.unixmantra.com/2013/08/aix-error-logging.html

The primary goal of every UNIX systems administrator is to ensure that the systems they are responsible for are functioning smoothly and with the best performance possible, 100% of the time. File systems running out of space, applications dumping core, and Ethernet adapter failures are just a sample of the types of things that can trip up a system, impacting that goal. Therefore, it is critical that the people responsible for a system are aware of anything that might have an impact on attaining that 100% system availability.

One of the things that makes AIX my favorite flavor of UNIX is that, besides all the standard tools, daemons, and configuration files that are present in all flavors of UNIX, IBM has provided a number of enhancements that make the monitoring, reliability, and administration of RS/6000 systems second to none. This article will focus on one of those tools: the error logging facility. I'll show you how the AIX error logging facility works, then I'll present a program I wrote that checks the log for error messages, filters out any error messages you wish to ignore, and sends an email to the systems administrator.

The Error Logging Subsystem

On most UNIX systems, information and errors from system events and processes are managed by the syslog daemon (syslogd); depending on settings in the configuration file /etc/syslog.conf, messages are passed from the operating system, daemons, and applications to the console, to log files, or to nowhere at all. AIX includes the syslog daemon, and it is used in the same way that other UNIX-based operating systems use it.
In addition to syslog, though, AIX also contains another facility for the management of hardware, operating system, and application messages and errors. This facility, while simple in its operation, provides unique and valuable insight into the health and happiness of an RS/6000 system.

The AIX error logging facility components are part of the bos.rte and the bos.sysmgt.serv_aid packages, both of which are automatically placed on the system as part of the base operating system installation. Some of these components are shown in Table 1.

Unlike the syslog daemon, which performs no logging at all in its default configuration as shipped, the error logging facility requires no configuration before it can provide useful information about the system. The errdemon is star
steps that are required to check the AIX error log.

After the terminal window is open, simply type:

  errpt | pg

(pg lets you display the output one page at a time.)

The columns that you need to review are:

TIMESTAMP (Column #2)
Provides a compact form of the date and time, in the format MMDDhhmmYY.
Example: 0827041513
  0827 = August 27th
  0415 = 4:15 AM, or 04:15 hours
  13 = the year, 2013

T (Column #3)
Displays the type code for the error:
  I = Informational
  P = Permanent
  T = Temporary
Warning: If a type code of P (Permanent) appears, contact the appropriate AIX support team.

C (Column #4)
Displays the class of the error, that is, the general source of the error. The codes are:
  H = Hardware
  S = Software
  O = Operator (informational messages, such as those logged with the errlogger command)

A Few Example Commands

To display a complete summary report, enter:
  errpt

To display a complete detailed report, enter:
  errpt -a

To display a detailed report of all errors logged for the error identifier E19E094F, enter:
  errpt -a -j E19E094F

To list error-record templates for which logging is turned off for any error-log entries, enter:
  errpt -t -F log=0

To view all entries from the alternate error-log file /var/adm/ras/errlog.alternate, enter:
  errpt -i /var/adm/ras/errlog.alternate

To view all hardware entries from the error-log file, enter:
  errpt -d H

To view all software entries from the error-log file, enter:
  errpt -d S

To display a detailed report of all errors logged for the error label ERRLOG_ON, enter:
  errpt -a -J ERRLOG_ON

To display a detailed report of all errors and group duplicate errors, enter:
  errpt -aD

To display the error-record template repository, enter:
  errpt -t

To disable the reporting of the ERRLOG_OFF event (error ID 192AC071), type the following:
  errupdate
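
The summary-report columns described earlier (identifier, TIMESTAMP, T, C, followed by the resource name and description) can also be read by a script. The following is a minimal Python sketch, not part of any AIX tooling: the names ErrptEntry, decode_timestamp, and parse_summary_line are hypothetical, the sample line is illustrative rather than captured from a real system, and the code assumes that two-digit years fall in the 2000s and that fields are separated by whitespace.

```python
from datetime import datetime
from typing import NamedTuple


class ErrptEntry(NamedTuple):
    """One line of an errpt summary report (hypothetical container)."""
    identifier: str
    timestamp: datetime
    type_code: str
    class_code: str
    resource: str
    description: str


def decode_timestamp(stamp: str) -> datetime:
    """Decode an errpt TIMESTAMP laid out as MMDDhhmmYY."""
    if len(stamp) != 10 or not stamp.isdigit():
        raise ValueError(f"unexpected timestamp: {stamp!r}")
    # Split the ten digits into five two-digit fields.
    month, day, hour, minute, year = (
        int(stamp[i:i + 2]) for i in range(0, 10, 2)
    )
    # Assumption: two-digit years belong to the 2000s.
    return datetime(2000 + year, month, day, hour, minute)


def parse_summary_line(line: str) -> ErrptEntry:
    """Split one summary line into its six columns."""
    # maxsplit=5 keeps the free-text description together as one field.
    identifier, stamp, type_code, class_code, resource, description = (
        line.split(None, 5)
    )
    return ErrptEntry(
        identifier,
        decode_timestamp(stamp),
        type_code,
        class_code,
        resource,
        description.strip(),
    )


# Illustrative line only; the timestamp is the example value used in the text.
sample = "192AC071   0827041513 T O errdemon       ERROR LOGGING TURNED OFF"
entry = parse_summary_line(sample)
print(entry.timestamp.isoformat(), entry.type_code, entry.class_code)
```

A script built on this could skip the header line of the report and then filter entries on type_code or class_code, for example to flag only permanent (P) hardware (H) errors.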