Group 2 PCI Bus Uncorrectable Error
PCI system event errors - IBM BladeCenter HS22, HS22V

Source
RETAIN tip: H196737

Symptom
When a Peripheral Component Interconnect (PCI) controller has an error, the System Event Log (SEL) reports only a PCIe error and does not point to the specific controller. The PCI error logs will be similar to the following:

- Chassis (NMI State) software NMI
- Group 2, connector (One of PCI Error) PCI slot 0 fault
- Group 2 (PCIs) bus uncorrectable error

Note that slot 0 does not map to a physical PCIe device; this ID is used when the PCIe device that generated the error cannot be identified.

Affected configurations
The system may be any of the following IBM servers:

- BladeCenter HS22, type 1936, any model
- BladeCenter HS22, type 7870, any model
- BladeCenter HS22V, type 1949, any model
- BladeCenter HS22V, type 7871, any model

This tip is not software specific. This tip is not option specific. The system has the symptom described above.

Workaround
Unified Extensible Firmware Interface (UEFI) logs the failed PCIe device information in its Firmware Service Record associated with the System Event Log (SEL). To check this, select: F1 Setup --> System Event Logs --> System Event Log. Under the Intelligent Platform Management Interface (IPMI), the Standard PCI event will be the firmware service record ("---> S.xxxxx"). Press the Enter key on this event to view the Vendor/Device IDs and details regarding the failure. All vendors can be determined using PCIDatabase at the following URL: http://www.pcidatabase.com/index.php

For more information, see pages 22 - 23 of 'Introducing UEFI-Compliant Firmware on IBM System x and BladeCenter Servers' available at the following URL: http://www.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5083207

If the PCIe controller that caused the fault cannot be determined from the IPMI SEL, follow the suggested actions in the order they are listed below until the issue is solved.

Error code: 0x806F0021
Error message: System board, connector (PCIe Status) fault

1. Verify that the system is running the latest UEFI firmware (see Firmware updates in Problem Determination and Service Guide).
2. If an expansion card is installed in the blade server, verify that the firmware for each expansion card is up to date.
3. Run the Setup utility and restore system setting to defaults (see Using the Setup utility in Problem Determination and Service Guid
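As a rough illustration of the slot 0 rule described in this tip, the following Python sketch pulls the slot number out of a SEL message and flags slot 0 as an unidentified device. The function names and the regular expression are assumptions made for this example and are not part of any IBM tool.

```python
import re
from typing import Optional

# Matches text such as "PCI slot 0 fault" or "PCI express slot 0 fault".
# The pattern is an assumption based on the sample messages in the tip.
SLOT_FAULT = re.compile(r"PCI (?:express )?slot (\d+) fault", re.IGNORECASE)


def faulting_slot(message: str) -> Optional[int]:
    """Return the slot number named in a PCI slot fault message, or None."""
    match = SLOT_FAULT.search(message)
    return int(match.group(1)) if match else None


def describe(message: str) -> str:
    """Summarize what a SEL message says about the faulting slot."""
    slot = faulting_slot(message)
    if slot is None:
        return "not a PCI slot fault"
    if slot == 0:
        # Per the tip, slot 0 is not a physical device: the source of the
        # error could not be identified, so check the firmware service record.
        return "device not identified; check the firmware service record"
    return f"fault reported for slot {slot}"


print(describe("Group 2, connector (One of PCI Error) PCI slot 0 fault"))
```

A message that names a nonzero slot (for example a hypothetical "PCI express slot 3 fault") would be reported against that slot instead.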
NMI and Group 2 PCI errors in AMM event log - IBM BladeCenter HS22V (1949, 7871)

Source
RETAIN tip: H201791

Symptom
BladeCenter HS22V with a Compact Form Factor Horizontal (CFFh) card installed hangs or powers off and posts the following errors in the Advanced Management Module (AMM) event log:

4 INFO Blade_05 09/17/10 03:07:06 N 0x806F0813 (vspherewb21)Recovery Group 2, (Expansion Card 2-3) (PCIs) bus uncorrectable error
5 INFO Blade_05 09/17/10 03:06:57 N 0x806F0009 (vspherewb21)System board, (Power Unit Stat) power off
8 INFO Blade_05 09/17/10 03:04:11 N 0x806F0313 (vspherewb21)Recovery Chassis, (NMI State) software NMI
9 ERR Blade_05 09/17/10 03:03:48 C 0x806F0813 (vspherewb21)Group 2, (Expansion Card 2-3) (PCIs) bus uncorrectable error
12 ERR Blade_05 09/17/10 03:02:58 C 0x806F0021 (vspherewb21)Group 2, (Expansion Card 2-3) connector (One of PCI Error) PCI express slot 0 fault
13 ERR Blade_05 09/17/10 03:02:47 N 0x806F0313 (vspherewb21)Chassis, (NMI State) software NMI

Affected configurations
The system may be any of the following IBM servers:

- BladeCenter HS22V, type 1949, any model
- BladeCenter HS22V, type 7871, any model

The system is configured with one or more of the following IBM Options:

- QLogic 2-Port 10 Gigabit Converged Network Adapter (CFFh) for IBM BladeCenter, Option part number 42C1830, any replacement part number

This tip is not software specific.

Workaround
The PCIe speed for the CFFh adapter should be changed to Gen1 using the following procedure:

1. Enter F1 Setup.
2. Select System Settings, Devices and I/O Ports, PCIe Gen1/Gen2 Speed Selection. There may be multiple selections in this menu depending on the Unified Extensible Firmware Interface (UEFI) level. The I/O Expander card refers to the CIOv card slot. The Blade Expander card refers to the CFFh card slot; select this option.
3. Save and reboot the blade.

Additional information
BladeCenter HS22V uses Gen2 as the default PCIe bus
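The AMM event log entries quoted in the symptom can be split into fields for filtering or sorting. The sketch below is one possible parser in Python. The field names are labels chosen for this example (in particular `flag` for the single-letter column and `name` for the text in parentheses), because the tip does not define the columns, and the date is assumed to be month/day/year.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Column layout inferred from the sample entries in the tip (an assumption).
LINE = re.compile(
    r"^(?P<index>\d+)\s+(?P<severity>\w+)\s+(?P<source>\S+)\s+"
    r"(?P<date>\d{2}/\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<flag>\S)\s+(?P<event_id>0x[0-9A-Fa-f]{8})\s+"
    r"\((?P<name>[^)]*)\)\s*(?P<message>.*)$"
)


@dataclass
class AmmEntry:
    index: int
    severity: str
    source: str
    timestamp: datetime
    flag: str
    event_id: str
    name: str
    message: str


def parse_line(line: str) -> Optional[AmmEntry]:
    """Parse one AMM event log line; return None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    # Assumes MM/DD/YY, which fits the sample date 09/17/10.
    stamp = datetime.strptime(f"{m['date']} {m['time']}", "%m/%d/%y %H:%M:%S")
    return AmmEntry(
        index=int(m["index"]),
        severity=m["severity"],
        source=m["source"],
        timestamp=stamp,
        flag=m["flag"],
        event_id=m["event_id"],
        name=m["name"],
        message=m["message"],
    )


sample = (
    "9 ERR Blade_05 09/17/10 03:03:48 C 0x806F0813 "
    "(vspherewb21)Group 2, (Expansion Card 2-3) (PCIs) bus uncorrectable error"
)
entry = parse_line(sample)
print(entry.event_id, entry.severity, entry.message)
```

With entries parsed this way, sorting by `timestamp` puts the sequence in chronological order, which makes it easier to see that the NMI and slot 0 fault precede the bus uncorrectable error and the power off.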
at the User's Guide for Integrated Management Module. Note: Deassertive events not listed in this table are informational only.

IMM Events that automatically notify Support
You can configure the Integrated Management Module II (IMM2) to automatically notify Support (also known as call home) if certain types of errors are encountered. If you have configured this function, see the table for a list of events that automatically notify Support.

40000001-00000000 Management Controller [arg1] Network Initialization Complete.
40000002-00000000 Certificate Authority [arg1] has detected a [arg2] Certificate Error.
40000003-00000000 Ethernet Data Rate modified from [arg1] to [arg2] by user [arg3].
40000004-00000000 Ethernet Duplex setting modified from [arg1] to [arg2] by user [arg3].
40000005-00000000 Ethernet MTU setting modified from [arg1] to [arg2] by user [arg3].
40000006-00000000 Ethernet locally administered MAC address modified from [arg1] to [arg2] by user [arg3].
40000007-00000000 Ethernet interface [arg1] by user [arg2].
40000008-00000000 Hostname set to [arg1] by user [arg2].
40000009-00000000 IP address of network interface modified from [arg1] to [arg2] by user [arg3].
4000000a-00000000 IP subnet mask of network interface modified from [arg1] to [arg2] by user [arg3].
4000000b-00000000 IP address of default gateway modified from [arg1] to [arg2] by user [arg3].
4000000c-00000000 OS Watchdog response [arg1] by [arg2].
4000000d-00000000 DHCP[[arg1]] failure, no IP address assigned.
4000000e-00000000 Remote Login Successful. Login ID: [arg1] from [arg2] at IP address [arg3].
4000000f-00000000 Attempting to [arg1] server [arg2] by user [arg3].
40000010-00000000 Security: Userid: [arg1] had [arg2] login failures from WEB client at IP address [arg3].
40000011-00000000 Security: Login ID: [arg1] had [arg2] login failures from CLI at [arg3].
40000012-00000000 Remote access attempt failed. Invalid userid or password received. Userid is [arg1] from WEB browser at IP address [arg2].
40000013-00000000 Remote access attempt failed. Invalid userid or password received. Userid is [arg1] from TELNET client at IP address [arg2].
40000014-00000000 The [arg1] on system [arg2] cleared by user [arg3].
40000015-00000000 Management Controller [arg
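The event texts in this list are templates whose `[arg1]`, `[arg2]`, ... placeholders are filled in when an event is logged. As a minimal sketch of that substitution, assuming the placeholders are numbered from 1 and using made-up argument values, a Python helper could look like this. The `render` function is hypothetical and only illustrates how to read the templates.

```python
import re

# Placeholders of the form [arg1], [arg2], ... as they appear in the list.
PLACEHOLDER = re.compile(r"\[arg(\d+)\]")


def render(template: str, *args: str) -> str:
    """Fill [argN] placeholders with the N-th positional value.

    Placeholders with no matching value are left unchanged.
    """
    def substitute(match: "re.Match[str]") -> str:
        position = int(match.group(1)) - 1  # [arg1] maps to args[0]
        if 0 <= position < len(args):
            return args[position]
        return match.group(0)

    return PLACEHOLDER.sub(substitute, template)


# The host name, user ID, and interface name below are hypothetical values.
print(render("Hostname set to [arg1] by user [arg2].", "blade05", "USERID"))
print(render("DHCP[[arg1]] failure, no IP address assigned.", "eth0"))
```

In the DHCP template the outer brackets are literal text, so only the inner `[arg1]` is replaced.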
the blade server and posts events in the IMM event log. In addition, most events are also sent to the advanced management module event log. The following table lists IMM error messages that are displayed in the advanced management module event log and suggested actions to correct the detected problems. These events, in a slightly different format, are also displayed in the IMM event log.

Note: An updated list of IMM error messages and corrective actions is available on the IBM® website at http://www.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5079339&brandind=5000008.

Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. See Parts listing - BladeCenter HX5 to determine which components are CRUs and which components are FRUs. If an action step is preceded by "(Trained service technician only)," that step must be performed only by a trained service technician.

Error Code: 0x80010200
Note: Multiple events can be displayed for this error code. Be sure to read the message text to determine the appropriate recovery actions.

Type: Error
Error Message: System board (Planar 12V) voltage under critical threshold. Reading: X, Threshold: Y
Action:
1. If the under voltage problem is occurring on all blade servers, look for other events in the log related to power and resolve those events (see Event logs).
2. View the event log provided by the advanced management module for your BladeCenter® chassis and resolve any power related errors that might be displayed.
3. If other modules or blade servers are logging the same issue, check the power supply for the BladeCenter chassis.
4. (Trained service technician only) Replace the system-board assembly (see Removing the system-board assembly - BladeCenter HX5 and Installing the system-board assembly - BladeCenter HX5).

Type: Error
Error Message: System board (Planar 5V) voltage under critical threshold. Reading: X, Threshold: Y
Action:
1. Remove all expansion cards from the blade server (see Removing an I/O expansion card).
2. Remove all storage drives from the blade ser