Processor Uncorrectable Error 670
Compaq Analyze formats and displays binary event log entries for the following: Compaq AlphaServer DS10, DS10L, DS20, DS20e, ES40, GS80, GS160, and GS320 systems; Memory Channel II; Common Access Method (CAM) error log entries; and logged message entries.
How seriously should I take ECC correctable error warnings?
(Server Fault: http://serverfault.com/questions/144151/how-seriously-should-i-take-ecc-correctable-error-warnings)

I have a pile of Sun X2200-M2 servers. These servers have ECC memory. On some of them, the eLOM is logging "correctable ECC errors detected" warnings, e.g.:

    # ssh regress11 ipmitool sel elist
    1 | 05/20/2010 | 14:20:27 | Memory CPU0 DIMM2 | Correctable ECC | Asserted
    2 | 05/20/2010 | 14:33:47 | Memory CPU0 DIMM2 | Correctable ECC | Asserted

...some more frequently than others.

The kernel on this particular system is throwing EDAC errors as well, although far more frequently than the eLOM is recording ECC events:

    EDAC k8 MC0: general bus error: participating processor(local node response), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
    MC0: CE page 0x42a194, offset 0x60, grain 8, syndrome 0xf654, row 4, channel 1, label "": k8_edac
    MC0: CE - no information available: k8_edac Error Overflow set
    EDAC k8 MC0: extended error code: ECC chipkill x4 error
    EDAC k8 MC0: general bus error: participating processor(local node response), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(generic)
    MC0: CE page 0x48cb94, offset 0x10, grain 8, syndrome 0xf654, row 5, channel 1, label "": k8_edac
    MC0: CE - no informatio
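To compare how often each DIMM or memory row shows up in the two logs quoted above, the entries can be tallied with a short script. The following is a minimal sketch, not a vendor tool: the function names `count_correctable_ecc` and `count_edac_ce` are hypothetical, and the parsing assumes the log lines look exactly like the samples in the question (pipe-separated SEL records, and EDAC `CE page ... row N, channel N` lines).

```python
import re
from collections import Counter

def count_correctable_ecc(sel_text):
    """Count asserted 'Correctable ECC' events per sensor in SEL-style text."""
    counts = Counter()
    for line in sel_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # not a SEL record in the assumed six-field layout
        sensor, event, state = fields[3], fields[4], fields[5]
        if event == "Correctable ECC" and state == "Asserted":
            counts[sensor] += 1
    return counts

# Pattern written against the k8_edac sample lines quoted in the question.
CE_RE = re.compile(r"CE page (0x[0-9a-fA-F]+), offset (0x[0-9a-fA-F]+), "
                   r"grain (\d+), syndrome (0x[0-9a-fA-F]+), "
                   r"row (\d+), channel (\d+)")

def count_edac_ce(kernel_log):
    """Count EDAC correctable-error lines per (row, channel) pair."""
    counts = Counter()
    for match in CE_RE.finditer(kernel_log):
        row, channel = int(match.group(5)), int(match.group(6))
        counts[(row, channel)] += 1
    return counts

sel_sample = """\
1 | 05/20/2010 | 14:20:27 | Memory CPU0 DIMM2 | Correctable ECC | Asserted
2 | 05/20/2010 | 14:33:47 | Memory CPU0 DIMM2 | Correctable ECC | Asserted
"""

edac_sample = """\
MC0: CE page 0x42a194, offset 0x60, grain 8, syndrome 0xf654, row 4, channel 1, label "": k8_edac
MC0: CE page 0x48cb94, offset 0x10, grain 8, syndrome 0xf654, row 5, channel 1, label "": k8_edac
"""

sel_counts = count_correctable_ecc(sel_sample)
edac_counts = count_edac_ce(edac_sample)
print(sel_counts)
print(edac_counts)
```

Fed with the full SEL and kernel logs, a tally like this makes it easier to see whether the errors cluster on one DIMM or one row/channel pair. How EDAC rows map to physical DIMM slots depends on the board and is not covered here.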
(tru64-unix-managers: http://marc.info/?l=tru64-unix-managers&m=96079072816119&w=2)

Hi,

We have a machine reporting CPU exceptions (a list of recent exceptions is attached at the end of the message). What do these exceptions mean? As far as I can tell, they have caused the machine to crash at least once. After the crash, Compaq replaced the motherboard unit (which has everything except the main memory and PCI cards), however the CPU exceptions persist. Is the main memory faulty?

Type of machine: Digital Personal WorkStation 600au
OS: Digital Unix 4.0D PK3
Firmware revision: 7.0-10
Memory: 4 * 256Mb (total 1Gb)
Disks: 3 channel SWXCR RAID controller + internal SCSI bus

Scott

--------

Near the time of the crash the exceptions were happening more often, perhaps ten or twenty that day, but now they are occurring once every few days. dia reports the exceptions like this (the values in the Entry body field change sometimes):

******************************** ENTRY 1 ********************************
Logging OS                   2. Digital UNIX
System Architecture          2. Alpha
Event sequence number        2.
Timestamp of occurrence      16-SEP-1999 05:17:26
Host name                    bluejay
System type register         x0000001E Systype 30. (Miata)
Number of CPUs (mpnum)       x00000001
CPU logging event (mperr)    x00000000
Event validity               1. O/S claims event is valid
Event severity               1. Severe Priority
Entry type                   100. CPU Machine Check Errors
CPU Minor class              3. Processor Correctable Error (630)

Entry Body Size: x00000068
Entry body:
      15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order
0000: 00000038 00000018 80000000 00000068 *h...........8...*
0010: FFFFFF00 33C8CF4F 00000000 00000086 *........O..3....*
0020: FFFFFFF0 C5FFFFFF 00000000 00001A00 *................*
0030: 00000000 00000000 00000001 00000000 *................*
0040: 00000000 00000000 00000000 00000000 *................*
0050: 00000000 00000000 00000000 00000000 *................*
0060: 5E3C7E25 00000000 * ....%~<^*

At the time of the crash:

******************************** ENTRY 27 ********************************
Logging OS                   2. Digital UNIX
System Architecture          2. Alpha
Event sequence number        14.
Timestamp of occurrence      02-SEP-1999 18:33:33
Host name                    bluejay
System type register         x0000001E Systype 30. (Miata)
Number