Buffer I/O Error on Device dm-10
CentOS server freeze/crash on megaraid rebuild, analysis and
Posted by jamesNJ » 2015/07/24 17:25:21

Hello all,

I have a problem with a large CentOS 7 server hosting an LSI MegaRAID controller with 16x 1 TB SAS drives. The server goes dead at night, requiring a forced reboot or power cycle to restore service. If it matters, this server has 1 large RAID-6 volume with 1 global hot spare available.

I believe I have narrowed this issue down to the MegaRAID controller being busy with a RAID rebuild, and some automated action occurring at night that confuses LVM into oblivion. The issue is difficult to narrow down because this "automated action" seems to result in all file systems being marked read-only. Syslog seems to continue working, but obviously cannot write useful data out to disk. Hence I have only been able to collect data on those rare times that I can actually log in when this issue occurs. I was able to capture 2 points of data that seem to start out with the same error condition.

This only seems to occur when a drive fails and the MegaRAID rebuilds to the global hot spare, or if I force some action on the RAID which causes a drive failure and a rebuild to an alternate disk (I had a few disks get SMART predictive failures and have been working to replace these with new ones). I initially thought this issue was related to smartd warning messages; however, when I replaced the last drive with predictive failures, the rebuild triggered the same behavior.

So what seems to be the pattern is that I kick off a rebuild (which takes many hours), and then sometime around midnight a systemd-udevd process kicks in and the system eventually ends up unresponsive. From the 2 times I was able to get on, these messages seem to be in common right at the time file systems go read-only:

Jul 15 00:45:44 server1 kernel: megaraid_sas: scanning for scsi0...
Jul 15 00:45:44 server1 systemd-udevd: failed to execute '/sbin/mdadm' '/sbin/mdadm -If sda5 --path pci-0000:09:00.0-scsi-0:2:0:0': Input/output error
Jul 15 00:45:44 server1 systemd-udevd: failed to execute '/sbin/mdadm' '/sbin/mdadm -If sda4 --path pci-0000:09:00.0-scsi-0:2:0:0': Input/o
Source: http://www.centos.org/forums/viewtopic.php?t=53481

Bad disk?
Source: https://www.redhat.com/archives/linux-lvm/2010-November/msg00011.html
Date: Wed, 10 Nov 2010 09:39:00 -0500

Yesterday I added a hard drive (to put extra stuff on it) to my Ubuntu 10.10 box and set up LVM on it. Then I copied some files to it and restarted the machine to see if it would mount at the right mountpoint. It didn't. So I decided to see if it was there (the VG in question is export):

raub strangepork:~$ sudo vgscan
  Reading all physical volumes. This may take a while...
  /dev/dm-0: read failed after 0 of 4096 at 429496664064: Input/output error
  /dev/dm-0: read failed after 0 of 4096 at 429496721408: Input/output error
  /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-0: read failed after 0 of 4096 at 4096: Input/output error
  /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
  Found volume group "export" using metadata type lvm2
  Found volume group "root" using metadata type lvm2
raub strangepork:~$

Those dm-0 messages do not make me happy. dmesg and vgchange make me think the problem is on the new drive:

[ 268.024593] scsi 0:0:0:0: Direct-Access ATA ST3500320NS SN04 PQ: 0 ANSI: 5
[ 268.024900] sd 0:0:0:0: [sdc] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 268.024918] sd 0:0:0:0: Attached scsi generic sg2 type 0
[ 268.024996] sd 0:0:0:0: [sdc] Write Protect is off
[ 268.025003] sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 268.025046] sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 268.025377] sdc: sdc1
[ 268.049853] sd 0:0:0:0: [sdc] Attached SCSI disk
[ 335.467482] quiet_error: 3 callbacks suppressed
[ 335.467492] Buffer I/O error on device dm-0, logical block 104857584
[ 335.467540] Buffer I/O error on device dm-0, logical block 104857584
[ 335.467589] Buffer I/O error on device dm-0, logical block 104857598
[ 335.467615] Buffer I/O error on device dm-0, logical block 104857598
[ 335.467647] Buffer I/O error on device dm-0, logical bloc
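
The byte offsets in the vgscan output and the logical block numbers in dmesg appear to describe the same locations on dm-0. A minimal sketch of the arithmetic, assuming the kernel is counting blocks in 4096-byte units here (an assumption, though it is consistent with the `0 of 4096` read size that vgscan prints). The function name `block_to_offset` is made up for illustration:

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; not stated in the log itself


def block_to_offset(logical_block, block_size=BLOCK_SIZE):
    """Convert a 'logical block N' number from the kernel log to a byte offset."""
    return logical_block * block_size


# Block numbers taken from the dmesg lines in the post above.
for block in (104857584, 104857598):
    offset = block_to_offset(block)
    print(f"logical block {block} -> byte offset {offset}")
```

Under that assumption, blocks 104857584 and 104857598 map to byte offsets 429496664064 and 429496721408, which are the first two offsets vgscan reported. Both sit within the last 64 KiB below 429496729600 bytes (400 GiB), so the failing reads seem to be the ones LVM issues near the end of the device as well as at offsets 0 and 4096.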

I/O errors on Linux LVM
Source: http://unix.stackexchange.com/questions/98208/i-o-errors-on-linux-lvm
Asked by Gregg Leventhal, Oct 30 '13 at 15:06. Tags: linux, lvm, fsck

I have a CentOS 6 box with LVM set up, and one of the PVs is a USB disk (I know). One of them is getting the error:

Oct 30 10:57:07 alpha01 kernel: lost page write due to I/O error on dm-3
Oct 30 10:57:07 alpha01 kernel: Buffer I/O error on device dm-3, logical block 4

Which is causing problems with all of the LVs on it. pvs shows the PV as unknown device. I can ls to the logical volumes and they show up in lvdisplay, but first I get a bunch of I/O errors. I made sure the cables to the USB drive are secure. What should I do to get this back up and running in the meantime? Should I unmount each LV and run fsck.ext4 on each one, like fsck.ext4 -y /dev/vg1/lv_logvolname?

Comment (rickhg12hs, Oct 30 '13 at 15:44): In addition to fsck, if the external drive is SMART capable, checking the drive status/health and running the drive self tests may be useful. Backing up all the data may also be important.

Accepted answer:

I usually don't go the route of running an fsck; I assume the disk is failing or has bad sectors.
I definitely wouldn't run the fsck using the -y, since this will give f

Buffer I/O error on device dm-3 after kernel upgrade to 2.6.18-348.1.el5.i686
Source: https://forums.contribs.org/index.php?topic=50231.0
Contribs.org Forums > SME Server 8.x
Posted by jufra on September 28, 2013, 12:18:12 PM

Hi, had a few updates coming through today, including a kernel upgrade. After this I found the following messages in my log:

Sep 28 16:57:28 bionfileserver kernel: scsi 2:0:0:0: rejecting I/O to dead device
Sep 28 16:57:33 bionfileserver last message repeated 30 times
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383936
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383937
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383938
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383939
Sep 28 16:57:33 bionfileserver kernel: scsi 2:0:0:0: rejecting I/O to dead device
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383936
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383937
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383938
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383939
Sep 28 16:57:33 bionfileserver kernel: scsi 2:0:0:0: rejecting I/O to dead device
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 0
Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 1
Sep 28 16:57:33 bionfileserver kernel: scsi 2:0:0:0: rejecting I/O to dead device
Sep 28 16:57:43 bionfileserver last message repeated 3 times
Sep 28
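
Logs like the one above are easier to read once the repeats are tallied and the dm-N name is tied back to something recognisable. The following is a rough sketch rather than a definitive tool: it counts `Buffer I/O error` lines per device, collects the SCSI addresses the kernel reports as dead, and then looks up each dm-N node in sysfs. The helper names `summarize` and `describe_dm` are made up for illustration, and the sysfs lookup assumes the usual Linux layout of `/sys/block/dm-N/dm/name` and `/sys/block/dm-N/slaves/`.

```python
import re
from collections import Counter
from pathlib import Path

BUFFER_ERR = re.compile(r"Buffer I/O error on device ([^\s,]+), logical block (\d+)")
DEAD_DEV = re.compile(r"scsi (\d+:\d+:\d+:\d+): rejecting I/O to dead device")


def summarize(lines):
    """Count buffer I/O errors per device and collect SCSI addresses reported dead."""
    errors = Counter()
    dead = set()
    for line in lines:
        match = BUFFER_ERR.search(line)
        if match:
            errors[match.group(1)] += 1
            continue
        match = DEAD_DEV.search(line)
        if match:
            dead.add(match.group(1))
    return errors, dead


def describe_dm(dm_name, sysfs_root="/sys/block"):
    """Return (mapped name, underlying devices) for a dm-N node, read from sysfs.

    Returns (None, []) when the node is not present on this machine.
    """
    base = Path(sysfs_root) / dm_name
    name_file = base / "dm" / "name"
    mapped = name_file.read_text().strip() if name_file.exists() else None
    slaves_dir = base / "slaves"
    slaves = sorted(p.name for p in slaves_dir.iterdir()) if slaves_dir.is_dir() else []
    return mapped, slaves


# A few lines copied from the log in the post above.
sample = [
    "Sep 28 16:57:28 bionfileserver kernel: scsi 2:0:0:0: rejecting I/O to dead device",
    "Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383936",
    "Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 488383937",
    "Sep 28 16:57:33 bionfileserver kernel: Buffer I/O error on device dm-3, logical block 0",
]

errors, dead = summarize(sample)
for device, count in errors.items():
    print(device, count, describe_dm(device))
print("dead SCSI devices:", sorted(dead))
```

Syslog collapses repeats into `last message repeated N times` lines, which this sketch does not expand, so the counts are lower bounds. On a machine where the underlying disk has already dropped off the bus, the `slaves` directory may well be empty, which would itself be a hint that the problem is below LVM.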