ECC Uncorrectable Error Detected Bank 1
Thread: [RESOLVED] Dell Power Edge 860 reports uncorrectable ECC error, works fine?

January 2nd, 2012, 02:54 PM #1
shazbot (Virtual Intern, joined Oct 2001, 352 posts)

A few days ago, I pulled a Power Edge 860 out for a client because it was only booting sporadically after a power outage. Once I got it to boot regularly, it came back with two problems: a degraded RAID array, and an "uncorrectable ECC error". The former was an easy fix, but I'm having trouble with the latter. Dell's built-in diagnostic program gives the full description of the error:

IPMI system event log check
Error code 2900:0221
Uncorrectable ECC error Bank #1

Dell's documentation for the server says that this indicates a problem with one of the DIMMs, either with the memory installed or with the slot itself. I've switched the modules around, but the error still shows up in the scan. In spite of this error, however, the server boots just fine and seems to be working normally, all the installed memory is showing up, and it doesn't show me any error messages unless I run the diagnostic program. Is this indicative of a problem that may pop up sometime in the future, or is there something wrong right now that I'm missing?

January 3rd, 2012, 02:19 AM #2
jdc2000 (Site Moderator, joined Feb 2000, Idaho Falls, Idaho, USA, 14,377 posts)

Possibly useful links:
http://boardreader.com/thread/amber_...okvXazsm8.html
http://comments.gmane.org/gmane.linu...oweredge/36609

January 3rd, 2012, 12:24 PM #3
shazbot (Virtual Intern, joined Oct 2001, 352 posts)

Yeah, that did the trick. I was so focused on figuring out the error itself that I forgot you have to clear out the logs after one occurs; otherwise the diagnostic will just continue to report it. I installed the Dell System E-Support Tool (DSET) and cleared the log. Now it's entirely back to normal.
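The point of the last post, that a logged event keeps being reported until the log is cleared, can be checked by looking at what is actually in the System Event Log. Below is a minimal Python sketch, assuming the log has been saved as text in the pipe-separated layout that `ipmitool sel list` typically prints. The sample entries and the function name `find_ecc_events` are made up for illustration and do not come from the thread.

```python
# Hypothetical SEL dump; real entries will differ by platform and tool version.
SAMPLE_SEL = """\
   1 | 12/28/2011 | 09:14:02 | Power Supply #0x64 | Power Supply AC lost | Asserted
   2 | 12/28/2011 | 09:20:41 | Memory #0x01 | Uncorrectable ECC | Asserted
   3 | 01/02/2012 | 14:31:10 | Drive Slot #0x80 | Drive Present | Asserted
"""


def find_ecc_events(sel_text):
    """Return the SEL entries whose description mentions ECC."""
    events = []
    for line in sel_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        record_id, date, time, sensor, description = fields[:5]
        if "ecc" in description.lower():
            events.append(
                {
                    "id": record_id,
                    "date": date,
                    "time": time,
                    "sensor": sensor,
                    "description": description,
                }
            )
    return events


if __name__ == "__main__":
    for event in find_ecc_events(SAMPLE_SEL):
        print(event["date"], event["time"], event["sensor"], event["description"])
```

An old entry stays in the list until the log is cleared (the poster used DSET; `ipmitool sel clear` is another common way). If the filter returns nothing after clearing the log and rerunning the diagnostic, that suggests no new ECC event has been recorded since.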
Related links:
http://discussions.virtualdr.com/showthread.php?251101-RESOLVED-Dell-Power-Edge-860-reports-uncorrectable-ECC-error-works-fine
http://lists.us.dell.com/pipermail/linux-poweredge/2010-December/043898.html

... version of IPMI tool. If you are using the version released by Dell, it contains some additional formatting that should give a little more detail on the SEL messages. I don't recall the exact syntax you would need to use to get the Dell log formatting code to be invoked. You could also use OMSA or the DRAC.

In general, uncorrectable errors should cause a crash, but I suspect the SEL display is incorrect, which is the other reason to use one of the Dell tools.

As for the support issues, the approach is reasonable. I'm just a software guy, but I have been involved in a lot of discussion around memory health and error reporting. In general, a significant number of memory issues are caused by improperly seated DIMMs or DIMMs that have dirty contacts. Reseating or moving the DIMMs usually fixes a lot of these issues. Also, depending on the specific memory error, the problem could be in the motherboard or in the DIMM. Moving the DIMM to a different slot allows tech support to determine whether it is a DIMM or motherboard problem.

Different PowerEdge models report memory errors slightly differently. Which model do you have?

Wayne Weilnau
Systems Management Technologist
Dell | OpenManage Software Development
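The slot-swap reasoning in the reply above can be written down as a small decision rule. This is only a sketch, assuming errors can be attributed to a slot and that exactly one suspect DIMM was moved; the function name `diagnose_swap` and the slot labels are hypothetical and not part of any Dell tool.

```python
def diagnose_swap(error_slot_before, error_slot_after, moved_from, moved_to):
    """Interpret the result of moving one suspect DIMM to another slot.

    error_slot_before / error_slot_after: slot that reported the error
    before and after the move (None if no error was reported).
    moved_from / moved_to: slots the suspect DIMM was moved between.
    """
    if error_slot_before != moved_from:
        return "inconclusive: the moved DIMM was not in the failing slot"
    if error_slot_after is None:
        return "no error after move: likely poor seating or dirty contacts"
    if error_slot_after == moved_to:
        return "error followed the DIMM: suspect the DIMM"
    if error_slot_after == moved_from:
        return "error stayed with the slot: suspect the motherboard or slot"
    return "inconclusive: error appeared in an unrelated slot"


if __name__ == "__main__":
    # Suspect DIMM moved from slot "DIMM1_A" to "DIMM2_A" (made-up labels).
    print(diagnose_swap("DIMM1_A", "DIMM2_A", "DIMM1_A", "DIMM2_A"))
```

The "no error after move" branch reflects the observation that reseating alone often clears the problem, so it is kept separate from the DIMM and motherboard verdicts.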
-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of Dave Sparks
Sent: Thursday, December 16, 2010 6:33 PM
To: linux-poweredge-Lists
Subject: ECC errors

I have a couple of questions I hope someone here can help me with. One of our servers had some ECC errors recently.
Related links:
https://docs.oracle.com/cd/E19469-01/819-4363-12/dimms_x4540.html
http://unixadminschool.com/blog/2011/03/deal-with-memory-errors-correctable-and-uncorrectable/

It includes the following sections:

- DIMM Population Rules
- Supported DIMM Configurations
- DIMM Replacement Policy
- How DIMM Errors Are Handled by the System
- Isolating and Correcting DIMM ECC Errors

DIMM Population Rules

The DIMM population rules for the server are as follows:

- Each CPU can support a maximum of eight DIMMs.
- The DIMM slots are paired and the DIMMs must be installed in pairs (0-1, 2-3, 4-5, and 6-7). See FIGURE 10-1. The memory sockets are colored black or white to indicate which slots are paired by matching colors.
- DIMMs are populated starting from the outside (away from the CPU) and working toward the inside.
- CPUs with only a single pair of DIMMs must have those DIMMs installed in that CPU's outside white DIMM slots (6 and 7). See FIGURE 10-1.
- Only DDR2 800 MHz, 667 MHz, and 533 MHz DIMMs are supported.
- Each pair of DIMMs must be identical (same manufacturer, size, and speed).

Supported DIMM Configurations

TABLE 10-1 lists the supported DIMM configurations for the Sun Fire X4500/X4540 servers.

TABLE 10-1 Supported DIMM Configurations

Slot 3   Slot 2   Slot 1   Slot 0   Total Memory Per CPU
0        2 GB     0        2 GB     4 GB
2 GB     2 GB     2 GB     2 GB     8 GB
4 GB     4 GB     4 GB     4 GB     16 GB

DIMM Replacement Policy

Replace a DIMM when one of the following events takes place:

- The DIMM fails memory testing under BIOS due to Uncorrectable Memory Errors (UCEs).
- UCEs occur and investigation shows that the errors originated from memory.

In addition, a DIMM should be replaced whenever more than 24 Correctable Errors (CEs) originate in 24 hours from a single DIMM and no other DIMM is showing further CEs. If more than one DIMM has experienced multiple CEs, other possible causes of CEs have to be ruled out by a qualified Sun Support specialist before replacing any DIMMs.
Retain copies of the logs showing the memory errors per the above r
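The correctable-error threshold in the replacement policy (more than 24 CEs in 24 hours from a single DIMM, with no other DIMM showing CEs) can be expressed as a short check. This is a sketch under stated assumptions: the event list format, the DIMM labels, and the function name `dimm_to_replace` are hypothetical, and "no other DIMM is showing further CEs" is read here as "no other DIMM logged any CE in the same 24-hour window".

```python
from collections import Counter
from datetime import datetime, timedelta

CE_LIMIT = 24                  # replace only when MORE than this many CEs
WINDOW = timedelta(hours=24)   # look-back window from the policy


def dimm_to_replace(ce_events, now):
    """Return the DIMM label to replace, or None.

    ce_events: iterable of (timestamp, dimm_label) correctable-error records.
    now: reference time; only events in the preceding 24 hours are counted.
    """
    recent = Counter(
        dimm for timestamp, dimm in ce_events
        if timedelta(0) <= now - timestamp <= WINDOW
    )
    if len(recent) != 1:
        # No CEs at all, or several DIMMs involved: the policy calls for
        # a support specialist to rule out other causes first.
        return None
    dimm, count = next(iter(recent.items()))
    return dimm if count > CE_LIMIT else None


if __name__ == "__main__":
    now = datetime(2011, 3, 24, 12, 0)
    events = [(now - timedelta(minutes=i), "CPU0_D6") for i in range(25)]
    print(dimm_to_replace(events, now))
```

Returning None when several DIMMs report CEs mirrors the policy's instruction to escalate rather than replace parts in that case.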
Solaris Troubleshooting: Deal with Memory Errors – Correctable and Uncorrectable

March 24, 2011, by Ramdev

Memory errors are quite common hardware-related errors in enterprise environments. Here we discuss two common types of errors.

Correctable Memory Errors

Symptoms: Your system may have one or more of the following symptoms.

- The system may have received CE, ECC errors, or recoverable memory errors.
- The system may be described as having reported CPU or memory errors.

Example error messages which may have been reported are shown below: