Monday, August 15, 2011

How Linux handles hardware problems

I usually write about issues in Windows, because there are so many and they are so frequent. I hardly ever run across major issues with Linux. That changed with the one I will describe here, which involves a kernel lockup. As we should all know, kernel lockups in Linux are very rare; in fact, I can count the instances I've seen over 14 years and keep the total under 10. They usually happen because of hardware problems, when the kernel can no longer run. Recently, I saw a 10-year-old server running Red Hat Linux 7.1 lock up completely. And yes, the OS was installed on the Dell PowerEdge 2400 10 years ago, back in 2001, and has been running just fine for many years. Never any file corruption, slowdowns, or other issues like we see with old Windows installations. Then the server was shut off abruptly due to an extended power outage. After that, it would run for roughly a week at a time and then lock up. One time the console screen was black, and another time it showed a kernel dump.

After booting the server back up, I noticed this entry in /var/log/messages:


