Server still unstable Sat, Apr 30. 2016
Unfortunately, replacing the motherboard hasn't changed anything. The server is still unstable, although I must say it hasn't hung. The only problems were the COM port that stopped working and kernel messages about a stuck process on one of the CPU cores.
I've also run a memtest: all RAM passed without errors.
In syslog I see that the stuck process is always smbd (the Samba daemon). For now I've stopped Samba, but that still doesn't answer why the COM port stops working or what causes the stuck process. It could be hardware, such as the SAS controller or one of the SATA hard disks, or it could be software. The software was updated several days before the problems began, though, so it would be strange for the server to stay stable for days and only then start failing. For now I'm a bit at a loss and don't really know how to proceed.
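For anyone who wants to check their own logs for the same pattern, here is a rough sketch of how the stuck-process messages could be tallied per process name. It assumes the kernel reports them as soft lockups in the usual "BUG: soft lockup - CPU#1 stuck for 22s! [smbd:1234]" shape; the exact wording differs between kernel versions, so the regular expression may need adjusting. The function name count_lockups is made up for this example.

```python
import re
from collections import Counter

# Assumed message shape (varies by kernel version):
#   ... BUG: soft lockup - CPU#1 stuck for 22s! [smbd:1234]
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^:\]]+):(\d+)\]"
)

def count_lockups(lines):
    """Count soft-lockup messages per process name (hypothetical helper)."""
    by_process = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            # Group 3 is the process name inside the square brackets.
            by_process[match.group(3)] += 1
    return by_process
```

Feeding it the lines of the syslog file (for example via open("/var/log/syslog", errors="replace"), if that is where your distribution keeps it) gives a quick view of whether one process dominates the lockups.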
Maybe the CPU is the culprit? I will probably have to run some stress tests to find out.
To be continued, that's for sure.
Server unstable as hell Tue, Apr 26. 2016
As you may have noticed (but probably not), my server is very unstable and hangs seemingly at random. The problems started around last Sunday evening.
At first I thought an update was causing the problems (although I've never seen a Linux server hang completely because of user space software), but that doesn't seem to be the case. Looking at statistics about my system, such as CPU, memory, disk and network usage and temperature sensor readings, didn't reveal any clues either. Finally, yesterday the serial port (which is used to communicate with an old but perfectly fine APC UPS) just stopped working (it worked again after restarting the server), and I also saw strange kernel errors in syslog. That's when it hit me that it must be a hardware issue, most probably the motherboard.
Sending the motherboard back for repairs and then waiting for what would probably be weeks isn't an option, so today I ordered a new motherboard. It's the same model, so hopefully this instability after less than a year isn't a symptom of sub-par components being used systematically. The new motherboard is expected to arrive in a few days' time. It will probably be next week before the server is running stably again, I hope. For now, it's sadly a question of keeping an eye on the server and switching it off and on again (just like Roy from The IT Crowd) when it hangs.