Memory Device Status Is Critical Single Bit Error Logging Disabled
On Thu, 2008-10-09 at 11:20 +0000, Arnar Þórarinsson wrote:
> Hello,
>
> Could somebody please explain these error messages to me. I've been
> trying to find some info on this but have found nothing.
>
> Severity : Critical
> ID : 1404
> Date and Time : Fri Oct 3 19:57:10 2008
> Category : Instrumentation Service
> Description : Memory device status is critical Memory device
> location: DIMM2_B Possible memory module event cause:Single bit
> warning error rate exceeded,Single bit error logging disabled
>
> Severity : Non-Critical
> ID : 1403
> Date and Time : Fri Oct 3 18:01:02 2008
> Category : Instrumentation Service
> Description : Memory device status is non-critical Memory device
> location: DIMM2_B Possible memory module event cause:Single bit
> warning error rate exceeded
>
> /Arnar Thorarinsson

Single-bit warning errors by themselves mean very little other than that the memory found an error and corrected it. However, if you see many of these errors, there is a more serious issue: it indicates a bad DIMM or a bad DIMM card. To test, swap DIMM2_B with another DIMM and see whether the error follows the DIMM or stays with the slot. If it stays with the slot, you need a new DIMM card/MB; if it follows the DIMM, you need a new DIMM.

Again, a few of these warnings mean nothing other than that the ECC for your memory is working as designed. Many of these warnings mean you have bad memory or a bad memory riser/MB.

-- Damon L. Chesser
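To make the reply's advice concrete, here is a minimal Python sketch that parses alert entries laid out like the two quoted in the message and counts them per DIMM location, then encodes the swap test as a simple decision. The function names (`parse_alerts`, `count_by_location`, `swap_verdict`) and the assumption that entries arrive as blank-line-separated `Key : value` text are illustrative only; this is not an OpenManage API, and real exported logs may be laid out differently.

```python
import re
from collections import Counter

# Sample input copied from the two alerts quoted in the message.
# Assumption: entries are plain text, separated by a blank line.
SAMPLE = """\
Severity : Critical
ID : 1404
Date and Time : Fri Oct 3 19:57:10 2008
Category : Instrumentation Service
Description : Memory device status is critical Memory device
location: DIMM2_B Possible memory module event cause:Single bit
warning error rate exceeded,Single bit error logging disabled

Severity : Non-Critical
ID : 1403
Date and Time : Fri Oct 3 18:01:02 2008
Category : Instrumentation Service
Description : Memory device status is non-critical Memory device
location: DIMM2_B Possible memory module event cause:Single bit
warning error rate exceeded
"""

FIELD = re.compile(r"^(Severity|ID|Date and Time|Category|Description)\s*:\s*(.*)$")


def parse_alerts(text):
    """Turn blank-line-separated alert entries into a list of dicts."""
    alerts = []
    for block in re.split(r"\n\s*\n", text.strip()):
        entry, last = {}, None
        for line in block.splitlines():
            m = FIELD.match(line.strip())
            if m:
                last = m.group(1)
                entry[last] = m.group(2).strip()
            elif last:
                # Wrapped continuation of the previous field (the Description).
                entry[last] += " " + line.strip()
        desc = entry.get("Description", "")
        loc = re.search(r"location:\s*(\S+)", desc)
        cause = re.search(r"event cause:\s*(.*)$", desc)
        entry["location"] = loc.group(1) if loc else None
        entry["causes"] = (
            [c.strip() for c in cause.group(1).split(",")] if cause else []
        )
        alerts.append(entry)
    return alerts


def count_by_location(alerts):
    """Count alerts per DIMM location; many hits on one DIMM is the warning sign."""
    return dict(Counter(a["location"] for a in alerts))


def swap_verdict(slot_before_swap, slot_after_swap):
    """Encode the swap test: same slot still failing points at the slot side."""
    if slot_before_swap == slot_after_swap:
        return "error stayed with the slot: suspect DIMM card/riser/MB"
    return "error followed the DIMM: suspect the DIMM"


if __name__ == "__main__":
    alerts = parse_alerts(SAMPLE)
    print(count_by_location(alerts))
    print(swap_verdict("DIMM2_B", "DIMM2_B"))
```

How many alerts count as "many" is left to the reader, since the reply does not give a threshold.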
http://lists.us.dell.com/pipermail/linux-poweredge/2008-October/037484.html

Thread: RAM - Single bit error logging disabled
http://pressf1.pcworld.co.nz/showthread.php?110223-RAM-Single-bit-error-logging-disabled

08-06-2010, 08:33 PM #1 WarNox
RAM - Single bit error logging disabled

Hey! I'm getting the 'Single bit error logging disabled' error on a RAM stick in a server (ECC RAM). I've already replaced the stick once and it's still happening, regardless of which slot I put it in. The RAM is second-hand, so it is possible that both were faulty. I have 2 questions.

1. Since the server still picks up the RAM, is there any downside to continuing to use it?
2. Is it possible that the server is somehow causing the RAM stick to fail? This is what the supplier thinks, so he's offered me a refund but doesn't want to supply any more sticks as they might fail too.

Not too fussed about the refund as it was rather cheap, so if the RAM still works I'll just keep it with the errors. Thanks

08-06-2010, 08:39 PM #2 Sweep
Re: RAM - Single bit error logging disabled

So what motherboard is the RAM being used in? Is it the one in your sig?

08-06-2010, 09:25 PM #3 KarameaDa
Hi,

Maybe this link will be useful:
http://forum.us.dell.com/supportforums/board/message?board.id=pes_oms&message.id=5381

mcclnx mcc

https://docs.oracle.com/cd/E19121-01/sf.x4140/820-3067-14/error_handling.html
Handling of Uncorrectable Errors

This section lists facts and considerations about how the server handles uncorrectable errors.

Note - The BIOS ChipKill feature must be disabled if you are testing for failures of multiple bits within a DRAM (ChipKill corrects for the failure of a four-bit wide DRAM).

- The BIOS logs the error to the SP system event log (SEL) through the board management controller (BMC).
- The SP's SEL is updated with the failing DIMM pair's particular bank address.
- The system reboots.
- The BIOS logs the error in DMI.

Note - If the error is on low 1MB, the BIOS freezes after rebooting. Therefore, no DMI log is recorded.

An example of the error reported by the SEL through IPMI 2.0 is as follows:

When low memory is erroneous, the BIOS is frozen on pre-boot low memory test because the BIOS cannot decompress itself into faulty DRAM and execute the following items:

    ipmitool> sel list
    100 | 08/26/2005 | 11:36:09 | OEM #0xfb |
    200 | 08/26/2005 | 11:36:12 | System Firmware Error | No usable system memory
    300 | 08/26/2005 | 11:36:12 | Memory | Memory Device Disabled | CPU 0 DIMM 0

When the faulty DIMM is beyond the BIOS's low 1MB extraction space, proper boot happens:

    ipmitool> sel list
    100 | 08/26/2005 | 05:04:04 | OEM #0xfb |
    200 | 08/26/2005 | 05:04:09 | Memory | Memory Device Disabled | CPU 0 DIMM 0

Note the following considerations for this revision:

- Uncorrectable ECC Memory Error is not reported.
- Multi-bit ECC errors are reported as Memory Device Disabled.
- On first reboot, BIOS logs a HyperTransport Error in the DMI log.
- The BIOS disables the DIMM.
- The BIOS sends the SEL records to the BMC.
- The BIOS reboots again.
- The BIOS skips the faulty DIMM on the next POST memory test.
- The BIOS reports available memory,