Buffer I/O Error on Device dm-2
hp com>, Alasdair G Kergon
Kernel Buffer I/O Error on Device dm-2, Logical Block
We used the dt tool to exercise I/O.

4. Disable the host ports of host H2, or any port of array A2, one after the other (a few times), OR disable and enable the same port of the other host a few times (perhaps 4-5 times).
5. The application (dt tool) aborts with an I/O error on host H1.

===== Snippet of syslog output (while doing I/O on /dev/dm-0)
Feb 1 11:47:14 apwtest52 kernel: SCSI error : <2 0 0 1> return code = 0x20000
Feb 1 11:47:14 apwtest52 kernel: end_request: I/O error, dev sda, sector 1584600
Feb 1 11:47:14 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:0. <=================path failed, after disabling/enabling the H2 host port 1
Feb 1 11:47:14 apwtest52 kernel: end_request: I/O error, dev sda, sector 1584608
Feb 1 11:47:45 apwtest52 kernel: SCSI error : <3 0 1 1> return code = 0x20000
Feb 1 11:47:45 apwtest52 kernel: end_request: I/O error, dev sdg, sector 861400
Feb 1 11:47:45 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:96. <=================path failed, after disabling/enabling the H2 host port 2
Feb 1 11:47:45 apwtest52 kernel: end_request: I/O error, dev sdg, sector 861408
Feb 1 11:47:45 apwtest52 kernel: SCSI error : <3 0 0 1> return code = 0x20000
Feb 1 11:47:45 apwtest52 kernel: end_request: I/O error, dev sde, sector 452760
Feb 1 11:47:45 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:64. <=================path failed after disabling/enabling the H2 host port 1
Feb 1 11:47:45 apwtest52 kernel:
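The "Failing path 8:0", "8:96" and "8:64" entries in the log are major:minor device numbers. For SCSI disks on major 8, each disk owns 16 minors, so the minor divided by 16 gives the disk index (0 is sda, 6 is sdg, and so on), which matches the sda, sdg and sde errors logged next to them. A minimal Python sketch of that mapping follows; the function names `sd_name` and `failed_paths` are made up for illustration, and it only handles major 8 (sda through sdp).

```python
import re

# Matches the dm-multipath lines shown in the log snippet above.
FAILING_PATH = re.compile(r"dm-multipath: Failing path (\d+):(\d+)")


def sd_name(major, minor):
    """Translate a major:minor pair on SCSI disk major 8 into an sdX name.

    Hypothetical helper: only major 8 (sda..sdp) is handled here.
    """
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    disk, part = divmod(minor, 16)  # 16 minors per disk: whole disk + 15 partitions
    name = "sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


def failed_paths(log_text):
    """Return the sdX names of every path dm-multipath reported as failing."""
    return [
        sd_name(int(m.group(1)), int(m.group(2)))
        for m in FAILING_PATH.finditer(log_text)
    ]


sample = """\
Feb 1 11:47:14 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:0.
Feb 1 11:47:45 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:96.
Feb 1 11:47:45 apwtest52 kernel: device-mapper: dm-multipath: Failing path 8:64.
"""

print(failed_paths(sample))  # ['sda', 'sdg', 'sde']
```

Running this over a full syslog makes it easier to see which paths failed and in what order when ports are toggled repeatedly.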
Buffer I/O Error on Device dm-3
https://www.redhat.com/archives/dm-devel/2006-March/msg00160.html
http://www.centos.org/forums/viewtopic.php?t=53481

CentOS server freeze/crash on megaraid rebuild, analysis
Post by jamesNJ » 2015/07/24 17:25:21

Hello all,

I have a problem with a large CentOS 7 server hosting an LSI MegaRAID controller with 16x 1 TB SAS drives. The server goes dead at night, requiring a forced reboot or power cycle to restore service. If it matters, this server has 1 large RAID-6 volume with 1 global hot spare available.

I believe I have narrowed this issue down to the MegaRAID controller being busy with a RAID rebuild, and some automated action occurring at night that confuses LVM into oblivion. The issue is difficult to narrow down because this "automated action" seems to result in all file systems being marked read-only. Syslog seems to continue working, but obviously cannot write useful data out to disk. Hence I have only been able to collect data on those rare times that I can actually log in when this issue occurs. I was able to capture 2 points of data that seem to start out with the same error condition.

This only seems to occur when a drive fails and the MegaRAID rebuilds to the global hot spare, or if I force some action on the RAID which causes a drive failure and a rebuild to an alternate disk (I had a few disks get SMART predictive failures and have been working to replace these with new ones). I initially thought this issue was related to smartd warning messages; however, when I replaced the last drive with predictive failures, the rebuild triggered the same behavior.

So what seems to be the pattern is that I kick off a rebuild (which takes many hours) and then sometime around midnight a systemd-udevd process kicks in and the system eventually ends up unres
http://unix.stackexchange.com/questions/98208/i-o-errors-on-linux-lvm

I/O errors on Linux LVM

I have a CentOS 6 box with LVM set up, and one of the PVs is a USB disk (I know). One of them is getting the error:

Oct 30 10:57:07 alpha01 kernel: lost page write due to I/O error on dm-3
Oct 30 10:57:07 alpha01 kernel: Buffer I/O error on device dm-3, logical block 4

which is causing problems with all of the LVs on it. pvs shows the PV as unknown device. I can ls the logical volumes and they show up in lvdisplay, but first I get a bunch of I/O errors. I made sure the cables to the USB drive are secure. What should I do to get this back up and running in the meantime? Should I unmount each LV and run fsck.ext4 on each one, like fsck.ext4 -y /dev/vg1/lv_logvolname?

asked Oct 30 '13 at 15:06 by Gregg Leventhal

Comment (rickhg12hs, Oct 30 '13 at 15:44): In addition to fsck, if the external drive is SMART capable, checking the drive status/health and running the drive self tests may be useful. Backing up all the data may also be important.

Accepted answer (slm, answered Oct 30 '13 at 16:19, edited Aug 4 '15 at 22:28):

I usually don't go the route of running an fsck, and instead assume the disk is failing or has bad sectors. I definitely wouldn't run fsck with -y, since that gives fsck permission to start moving blocks, which may exacerbate the problem. Instead I'll run a tool such as Spinrite (commercial) or HDAT2 (freeware) on the disk to do the analysis and potential repair.

What else? See my other answers to these questions for additional methods:
fsck -cc /dev/sdb1 gives this result. Is everything okay?
Detect damaged audio CD

The 2nd link provides details about another tool, safecopy, for attempting to recover data from a failed/failing drive. It doesn't attempt to do any repairing of the HDD.

Is there a linux spinrite that can be run from
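Before deciding between fsck and a disk-level tool, it can help to see which device-mapper devices are actually reporting errors and at which logical blocks. The following is a rough Python sketch that groups the "Buffer I/O error" lines quoted in the question by device; the function name `blocks_by_device` is made up for illustration, and the pattern assumes the exact kernel message format shown above.

```python
import re
from collections import defaultdict

# Assumes the kernel message format quoted in the question above.
BUFFER_ERR = re.compile(r"Buffer I/O error on device ([\w-]+), logical block (\d+)")


def blocks_by_device(log_text):
    """Group the logical block numbers from buffer I/O errors by device name."""
    result = defaultdict(list)
    for match in BUFFER_ERR.finditer(log_text):
        result[match.group(1)].append(int(match.group(2)))
    return dict(result)


sample = """\
Oct 30 10:57:07 alpha01 kernel: lost page write due to I/O error on dm-3
Oct 30 10:57:07 alpha01 kernel: Buffer I/O error on device dm-3, logical block 4
"""

print(blocks_by_device(sample))  # {'dm-3': [4]}
```

On systems that expose device-mapper attributes in sysfs, the name behind a dm-N device can usually be read from /sys/block/dm-N/dm/name, which helps tie the errors back to a specific logical volume; whether that file is present on a given kernel is an assumption to verify locally.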