My saga this week: my server has been quite unstable, locking up at seemingly random times. My first attempt at a fix was replacing the power supply. Monday morning I noticed the server was completely off and wouldn’t power up, so I yanked 2 of the 3 hard drives from it and it started. The conclusion (which I now believe was wrong) was that the power supply couldn’t handle the 3 drives, so I replaced it with a high-efficiency one that is supposed to be quiet.
Tuesday I still had problems, and they seemed related to when I used one of my backup hard drives. So I replaced the drive (which wasn’t a bad thing, as it was left over from my old server). I also finally figured out how to get lm-sensors to work with my motherboard. I had spent time over the last year trying to get it working, to no avail. One of the things I was curious about was the temperature of the CPU and case, and now that lm-sensors was working, I had this information.
When I looked at the CPU temperature, it ranged from 50 to 80 degrees Celsius. This told me there was a problem, as I had read that the Pentium 4 chip I have should usually be below 40 degrees Celsius. Things started clicking… the times when the system froze were when I was backing up to my secondary drive. My backups use rsync and tar/gzip, both of which are very processor intensive. So it looked like when the processor got hit hard, its temperature rose and the system froze. Ah ha! I may have found the cause.
So it was another trip to Fry’s to pick up a new CPU cooler, a new case fan, and a new hard drive cable (for good measure). I put everything in and almost fell over when I saw the temperature. It was ranging from 15 to 25 degrees Celsius! Under maximum CPU load (during my backup), it wasn’t getting above 25 degrees Celsius, and it hasn’t locked up yet (knock on wood).
I ended up getting a Cooler Master Hyper L3 as it looked like it would fit (my case has an airflow vent on top of the CPU and some of the coolers are way too big) and the heat pipe technology seems to make sense.
Hopefully all this work will actually solve this issue.
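
A minimal sketch of how the temperature check described above could be scripted, so readings can be watched while a backup runs. It assumes a Linux kernel that exposes sensor values under `/sys/class/hwmon` (files named `temp*_input`, in millidegrees Celsius). The function names and the 40-degree warning threshold are illustrative assumptions, the threshold being borrowed from the "below 40" figure mentioned in the post; this is not the exact method used here.

```python
from pathlib import Path

# Assumed warning threshold, taken from the "below 40 degrees" figure in the post.
WARN_CELSIUS = 40.0


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs temperature string (millidegrees Celsius) to degrees."""
    return int(raw.strip()) / 1000.0


def read_temperatures(hwmon_root: str = "/sys/class/hwmon") -> dict:
    """Return {"hwmonN/tempM_input": degrees} for every readable sensor file."""
    readings = {}
    for sensor_file in sorted(Path(hwmon_root).glob("hwmon*/temp*_input")):
        try:
            value = parse_millidegrees(sensor_file.read_text())
        except (OSError, ValueError):
            # Skip sensors that cannot be read or return garbage.
            continue
        readings[f"{sensor_file.parent.name}/{sensor_file.name}"] = value
    return readings


def too_hot(readings: dict, limit: float = WARN_CELSIUS) -> dict:
    """Keep only the sensors whose reading exceeds the limit."""
    return {name: temp for name, temp in readings.items() if temp > limit}


if __name__ == "__main__":
    current = read_temperatures()
    for name, temp in current.items():
        flag = "  <-- above threshold" if temp > WARN_CELSIUS else ""
        print(f"{name}: {temp:.1f} C{flag}")
```

Running something like this in a loop during an rsync or tar/gzip backup would show whether temperature climbs with CPU load, which is the pattern that pointed to the cooling problem.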