Server instability issues, another update
I've checked all the disks in the RAID array connected to my SAS controller for bad blocks: none were found, which in itself is a good thing, but it still doesn't explain the server freezes. As it's a RAID 5 array, I could only take out one disk at a time, check it for bad blocks, re-add it to the array, and wait for it to re-sync with the array before I could go ahead with the next disk. That all takes quite some time.
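For reference, if the array is Linux software RAID managed with mdadm, the cycle per disk looks roughly like this; /dev/md0 and /dev/sdb(1) are just placeholder names, not my actual devices:

    # Pull one member out of the array (device names are placeholders)
    mdadm /dev/md0 --fail /dev/sdb1
    mdadm /dev/md0 --remove /dev/sdb1

    # Read-only bad block scan of the whole disk: -s shows progress, -v reports any bad blocks found
    badblocks -sv /dev/sdb

    # Re-add the disk and wait for the re-sync to finish before starting on the next one
    mdadm /dev/md0 --add /dev/sdb1
    watch cat /proc/mdstat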
It's also possible to check for bad blocks with a read/write test instead of a read-only one. That test takes a whole lot longer to complete, but should I have done it anyway? What's the advantage over the read-only test? Does it find bad blocks that would otherwise go undetected?
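For reference, these are the three modes badblocks offers. The read/write modes exercise the write path as well, so they can in principle catch sectors that only fail when written to, but the destructive variant wipes whatever is on the disk. Again, /dev/sdb is just a placeholder:

    # Read-only test: only reads every block, safe on a disk that still holds data
    badblocks -sv /dev/sdb

    # Non-destructive read/write test: reads each block, writes test patterns,
    # verifies them, then restores the original contents (slow, but data survives)
    badblocks -nsv /dev/sdb

    # Destructive write test: writes and verifies the patterns 0xaa, 0x55, 0xff, 0x00,
    # overwriting everything on the disk (only an option on a disk pulled from the array)
    badblocks -wsv /dev/sdb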
I've been running with the Samba daemon stopped for a few weeks now. No freezes anymore, but making do without Samba is a real pain, as it means there's no NAS on the LAN anymore. It's also why checking for bad blocks took over two weeks to complete: without the share, I was copying a folder of more than 2 TiB over SFTP to my desktop PC, and that made the re-syncs of the RAID array very, very slow. Samba does seem to be the culprit, but I still refuse to believe the freezes are caused by software rather than hardware, so in my opinion there are only two possibilities left: the SAS controller or the CPU, with the SAS controller being the most likely candidate. Sigh, decisions, decisions…