I was startled today by a message in syslog that seems to point to a problem with my RAID1 volumes:
Mar 1 01:13:54 server mdadm[961]: RebuildFinished event detected on md device /dev/md3, component device mismatches found: 512
This value is reflected in the following counter:
root:/etc/mdadm# cat /sys/block/md3/md/mismatch_cnt
512
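If you have several arrays, reading the counter for each one by hand gets tedious. Here is a minimal sketch that loops over all md devices and prints their counters. The function name md_mismatch_report and its optional base-directory argument are my own invention for illustration, not part of mdadm.

```shell
#!/bin/sh
# md_mismatch_report [SYSFS_BASE]
# Print "<device>: <mismatch_cnt>" for every md array below SYSFS_BASE.
# SYSFS_BASE defaults to /sys/block; the parameter exists only so the
# function can be tried against a scratch directory (an assumption of
# this sketch).
md_mismatch_report() {
    base="${1:-/sys/block}"
    for f in "$base"/md*/md/mismatch_cnt; do
        # Skip the unexpanded glob pattern when no md arrays exist
        [ -r "$f" ] || continue
        dev="${f#"$base"/}"   # strip the base directory prefix
        dev="${dev%%/*}"      # keep only the device name, e.g. md3
        printf '%s: %s\n' "$dev" "$(cat "$f")"
    done
}

# Usage: md_mismatch_report
```

Called without arguments it reads from /sys/block, so on the machine above it would presumably include a line for md3 along with the other arrays.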
I googled around to find out what this means, but found no definitive answer as to whether it is an actual problem. However, I did find a way to resync the MD components so that no mismatches remain:
root:/etc/mdadm# echo repair >> /sys/block/md3/md/sync_action
After the repair has run, you will notice that the counter shows the same number of mismatches again:
root:/etc/mdadm# cat /sys/block/md3/md/mismatch_cnt
512
This is to be expected, because the counter reports the mismatches that the repair encountered (and corrected). So, if you force a check again, the counter should be down to 0:
root:/etc/mdadm# echo check >> /sys/block/md3/md/sync_action
root:/etc/mdadm# watch cat /proc/mdstat
[wait until check is finished]
root:/etc/mdadm# cat /sys/block/md3/md/mismatch_cnt
0
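Instead of watching /proc/mdstat by hand, the waiting step can be scripted. This is only a sketch: it polls sync_action until it reads "idle" and then prints the counter. The function name wait_md_idle and its parameters are made up for this example, and it assumes the check or repair has already been started before it is called.

```shell
#!/bin/sh
# wait_md_idle DEVICE [SYSFS_BASE] [INTERVAL_SECONDS]
# Block until DEVICE's sync_action reads "idle", then print mismatch_cnt.
# SYSFS_BASE (default /sys/block) and INTERVAL_SECONDS (default 10) are
# illustrative parameters of this sketch.
wait_md_idle() {
    dev="$1"
    base="${2:-/sys/block}"
    interval="${3:-10}"
    action="$base/$dev/md/sync_action"

    # Refuse to loop forever on a device that does not exist
    if [ ! -r "$action" ]; then
        echo "wait_md_idle: cannot read $action" >&2
        return 1
    fi

    # sync_action reads "idle" once no check, repair or resync is running
    while [ "$(cat "$action")" != "idle" ]; do
        sleep "$interval"
    done

    cat "$base/$dev/md/mismatch_cnt"
}

# Usage (after "echo check" has been written to sync_action):
#   wait_md_idle md3
```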
4 replies on “Startled by “component device mismatches” on RAID1 volumes”
Update: I found the following statement from Neil Brown which seems to be a good explanation why these differences may happen:
Neil Brown also says that the problem is mostly found with swap on RAID1. The whole discussion is at: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518834
I saw this error again a couple of days ago. Since it startled me again, I googled for it and came across my own article above as the first hit. I had simply forgotten that I had already seen and investigated this error a while ago… 🙂
In my case the mismatch count was exactly 128, and mismatches appeared only for /boot partitions on RAID1. I think this is pretty normal, because the sequence of events was as follows:
1) OS installed on 2 HDDs with /boot and / partitions on RAID1 via mdadm
2) /dev/sdb at some point became faulty and was replaced
3) sfdisk -d /dev/sda | sfdisk /dev/sdb
4) mdadm /dev/md0 --add /dev/sdb1
5) echo -e "root (hd1,0)\nsetup (hd1)\nquit" | grub --batch
At stage 5) grub writes directly to the HDD, bypassing the RAID1 layer, so /dev/sda1 and /dev/sdb1 get out of sync and the weekly cron job 99-raid-check complains about mismatches.
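To see where two components actually differ, one rough approach is to compare them byte by byte and group the differences by sector. This is a sketch under assumptions: the function name diff_sectors is hypothetical, the 512-byte sector size is a default I picked, and on real devices the md superblock area will normally differ between components anyway, so not every reported sector is a data mismatch. It is also best run on an array that is not being written to.

```shell
#!/bin/sh
# diff_sectors A B [SECTOR_SIZE]
# Print the 0-based numbers of the sectors in which A and B differ.
# SECTOR_SIZE defaults to 512 bytes (an assumption of this sketch).
diff_sectors() {
    a="$1"
    b="$2"
    sector="${3:-512}"
    # cmp -l prints one line per differing byte:
    #   <1-based byte offset> <octal value in A> <octal value in B>
    # Offsets arrive in ascending order, so remembering the last sector
    # is enough to print each sector only once.
    cmp -l "$a" "$b" | awk -v s="$sector" '
        { n = int(($1 - 1) / s) }
        NR == 1 || n != last { print n }
        { last = n }
    '
}

# Usage (read-only, needs root): diff_sectors /dev/sda1 /dev/sdb1
```

If grub really is the culprit, I would expect the reported sectors to cluster in the small region of the partition where it wrote its files, though I have not verified this.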