btrfs degraded csum error

Let's say you're using RAID5/6 (extra experimental) and have popped a drive out of your btrfs array. You can still mount it with the degraded option:

sudo mount -o degraded,noatime,compress=lzo /dev/mapper/crypt1 /srv

Now you just add a device and delete the missing one (replace doesn't work yet). Easy, right?
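
The add step isn't shown in this post. A minimal sketch of it, assuming the replacement disk is already unlocked as /dev/mapper/crypt2. That device name is hypothetical, as are the NEW_DEV, MNT and DRY_RUN variables and the run helper. It defaults to a dry run, so nothing touches the array until you flip the switch:

```shell
#!/bin/sh
# Hypothetical names: NEW_DEV is the replacement device, MNT the degraded mount.
NEW_DEV=${NEW_DEV:-/dev/mapper/crypt2}
MNT=${MNT:-/srv}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = really run it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Add the new device to the mounted, degraded filesystem.
add_replacement_device() {
    run sudo btrfs device add "$NEW_DEV" "$MNT"
}

add_replacement_device
```

Run it once as-is to check the printed command, then again with DRY_RUN=0 before moving on to the delete.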

sudo btrfs dev del missing /srv/
ERROR: error removing the device 'missing' - Input/output error

Whoa, an error. What does the log say?

allen@work:~$ tail -100 /var/log/kern.log | grep BTRFS

Apr 28 08:13:06 work kernel: [  144.573284] BTRFS info (device dm-6): allowing degraded mounts
Apr 28 08:13:06 work kernel: [  144.573290] BTRFS info (device dm-6): disk space caching is enabled
Apr 28 08:13:06 work kernel: [  144.590842] BTRFS warning (device dm-6): devid 7 missing
Apr 28 08:14:25 work kernel: [  223.823027] BTRFS info (device dm-6): relocating block group 18495618154496 flags 129
Apr 28 08:17:45 work kernel: [  424.113495] BTRFS info (device dm-6): csum failed ino 257 off 4316004352 csum 3998390505 expected csum 983723006
Apr 28 08:17:45 work kernel: [  424.113562] BTRFS info (device dm-6): csum failed ino 257 off 4316004352 csum 3998390505 expected csum 983723006

Uh oh. We've mounted the array without devid 7, so the filesystem is reconstructing the missing data from parity, and now it's hitting checksum errors.

You can't scrub
A scrub is just a way to find these types of errors in advance and fix them using the other copy of the data (assuming you're on raid1) before they accumulate. But we're in degraded mode, so there is no other copy. Plus, raid5/6 has no scrub support as of mid-2014 anyway. Let's start one to see what happens: the task hangs. At least you can still reboot.

sudo btrfs scrub start /srv
scrub started on /srv, fsid dfea5d9a-90d3-4906-b504-623b2523944a (pid=2732)

Apr 28 09:33:14 work kernel: [  601.705845] INFO: task btrfs:2734 blocked for more than 120 seconds.
Apr 28 09:33:14 work kernel: [  601.705873]       Not tainted 3.14.1-031401-generic #201404141220

You can't re-balance
A re-balance would normally work around the lack of scrub support (so I've read), but since you have no other copy of the data, you're in the same boat as above.

You can't always just delete what's affected
So let's see which file is affected, and whether we can just delete it and move on.

sudo btrfs inspect-internal inode-resolve -v 257 /srv
ioctl ret=0, bytes_left=4048, bytes_missing=0, cnt=1, missed=0
/srv/nfs

Crap - that's the root of all the NFS files and so deleting all the files isn't really a useful solution.
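
Rather than reading inode numbers out of the log by eye, you can extract them. A rough sketch, assuming the two "csum failed ino N ..." message variants that appear in this post (other kernel versions may word them differently); the function name csum_failed_inodes is made up for illustration:

```shell
#!/bin/sh
# Print the unique inode numbers named in "csum failed" kernel messages.
# Reads log text on stdin; matches both the "ino N off ..." and
# "ino N extent ..." variants seen in this post.
csum_failed_inodes() {
    sed -n 's/.*csum failed ino \([0-9][0-9]*\) .*/\1/p' | sort -un
}

# Example use (not run here):
#   grep BTRFS /var/log/kern.log | csum_failed_inodes |
#   while read -r ino; do
#       sudo btrfs inspect-internal inode-resolve "$ino" /srv
#   done
```

Keep in mind that btrfs inode numbers are only unique within a subvolume, so the path you pass to inode-resolve has to be inside the subvolume the error came from.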

You can't just copy the data
When you hit a csum error on a degraded raid5/6, it triggers a kernel BUG and whatever you were doing stops. All other file operations seem to stop too. You can't unmount; all you can do is shut down as much as you can and then go down hard. Note: this is still present as of kernel 3.15-rc2.

cp -a /srv/nfs /mnt/
cp: error reading ‘someFile’: Input/output error
cp: failed to extend ‘someFile’: Input/output error

Apr 28 08:56:19 work kernel: [ 2740.362340] BTRFS info (device dm-6): csum failed ino 5018 extent 594936590336 csum 3164482640 wanted 2176410131 mirror 1
Apr 28 08:56:19 work kernel: [ 2740.362667] ------------[ cut here ]------------
Apr 28 08:56:19 work kernel: [ 2740.362714] kernel BUG at /home/apw/COD/linux/fs/btrfs/raid56.c:1831!
Apr 28 08:56:19 work kernel: [ 2740.362766] invalid opcode: 0000 [#1] SMP
...
...
init 1
...
reboot
...
(press and hold power button!)
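
After the hard reboot it helps to confirm that it really was the raid56 BUG that took the machine down, rather than something else. A small sketch that pulls the file:line out of any "kernel BUG at" lines; the function name kernel_bug_sites is made up, and it assumes the messages reached the log on disk before the power went:

```shell
#!/bin/sh
# Print each distinct "file:line" reported by a "kernel BUG at" message.
# Reads kernel log text on stdin.
kernel_bug_sites() {
    sed -n 's/.*kernel BUG at \([^ ]*\)!.*/\1/p' | sort -u
}

# Example use (not run here):
#   kernel_bug_sites < /var/log/kern.log
```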

You can't restore data
Normally you'd only reach for btrfs restore when the file system won't mount, and here we can mount it and see the files. You can run it anyway, but you're in the same boat as above: you get some data and then get dumped.

sudo btrfs restore -i /dev/mapper/crypt1 /mnt

Check tree block failed, want=18398258737152, have=18398258802688
read block failed check_tree_block
Error searching -5
Ignoring transid failure
btrfs: ctree.c:1514: leaf_space_used: Assertion `!(data_len < 0)' failed.

You can't run btrfsck
btrfsck is normally the last resort, but it seems to be unequipped to handle degraded raid5/6. The very last resort is to redo the csum tree, so we just accept the errors and hope the files aren't too badly damaged.

sudo btrfs check /dev/mapper/crypt1

Checking filesystem on /dev/mapper/crypt1
UUID: dfea5d9a-90d3-4906-b504-623b2523944a
checking extents
Check tree block failed, want=18398052323328, have=18398052388864
read block failed check_tree_block
btrfs: ctree.c:1514: leaf_space_used: Assertion `!(data_len < 0)' failed.

sudo btrfs check --init-csum-tree /dev/mapper/crypt1

Reinit crc root
Unable to find block group for 0
btrfs: extent-tree.c:288: find_search_start: Assertion `!(1)' failed.


There is no way to recover
All you can do is copy one directory at a time and, when you hit a bad csum, note the inode, reboot, and delete the affected file.

cp -an /some /mnt
(Ctrl-C when it hangs)
tail -100 /var/log/kern.log
Apr 28 09:51:12 work kernel: [  820.120817] BTRFS info (device dm-6): csum failed ino 5458 extent 17267683328 csum 2933540263 wanted 2620942209 mirror 0
Apr 28 09:51:12 work kernel: [  820.123734] BTRFS info (device dm-6): csum failed ino 5458 extent 17267683328 csum 1050126411 wanted 2620942209 mirror 1
sudo reboot
sudo btrfs inspect-internal inode-resolve -v 5458 /srv
sudo rm (whatever the above indicates)

Now repeat
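
The loop above can be made a little less painful by copying file by file and writing each path to a journal before reading it, so that after a forced power-off the last journal line names the file that was in flight. This is only a sketch and does nothing to prevent the kernel BUG. The function name copy_with_journal and its arguments are made up, the journal has to live on a different filesystem than the broken one, and file names containing newlines are not handled:

```shell
#!/bin/sh
# copy_with_journal SRC DST JOURNAL
# Copy regular files one at a time, logging each path to JOURNAL first.
copy_with_journal() {
    src=$1 dst=$2 journal=$3
    find "$src" -type f | while IFS= read -r f; do
        rel=${f#"$src"/}
        # Skip files copied on an earlier pass (same idea as cp -n).
        [ -e "$dst/$rel" ] && continue
        printf '%s\n' "$f" >> "$journal"
        # Flush the journal to disk; syncing a single file needs
        # coreutils 8.24 or newer, older versions sync everything.
        sync "$journal" 2>/dev/null || true
        mkdir -p "$(dirname "$dst/$rel")"
        # On a read error, drop the partial copy so it is retried later.
        cp -a "$f" "$dst/$rel" || {
            rm -f "$dst/$rel"
            printf 'FAILED %s\n' "$f" >> "$journal"
        }
    done
}

# Example use (not run here):
#   copy_with_journal /srv/nfs /mnt/nfs /root/copy.journal
#   tail -1 /root/copy.journal    # after the reboot: the suspect file
```

Once the suspect file has been deleted, re-running the same command picks up where it left off, because files already present in the destination are skipped.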
