Computer Freezes, probably excess heat

dwchang
Sad Boy on Site
Joined: Mon Mar 04, 2002 12:22 am
Location: Madison, WI

Post by dwchang » Thu Mar 13, 2003 11:26 am

Hello, I have recently been having problems with my computer freezing up somewhere in Windows. It's sometimes random, but it usually happens near the beginning. I hadn't run into this problem in the 6 months I've had the computer, but after a little investigating, I'm pretty sure it's heat. If I have the case open, the CPU is fine and stays "alive." When the case is on, the CPU freezes. So is this enough to say it's heat?

I have:
-Dual Athlon 1.733 GHz
-1024 MB DDR 266 RAM
-120 GB HD
-80 GB HD
-GeForce4 TI4200
-Two Copper (?) ATX Heatsinks
-3 fans, two near the back, one near the bottom of the front.

First off, do you guys think it is heat? Secondly, what's a good and CHEAP cooling device? When I run SiSandra (it does some system checks), it says "Warning: CPU Heat too high" and the same for the motherboard. Lastly, why has this only started showing up recently?

Thanks!
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space

FurryCurry
Joined: Sun Jul 14, 2002 8:41 pm

Post by FurryCurry » Thu Mar 13, 2003 11:38 pm

It sure sounds like overheating.

I find it worthwhile to use a can of that computer duster stuff every 6 or so months, and give my machines a good air blasting, particularly the CPU heatsink, as they do tend to be dust magnets.

What kind of thermal interface material are you using? Did the heatsinks come with pads? A tube of goo?

Thermalright makes excellent heatsinks, but they are not cheap.

I suppose I'd recommend Coolermaster for good but lower priced units.

Arctic Silver is your friend, too.

TaranT
Joined: Wed May 16, 2001 11:20 pm

Post by TaranT » Fri Mar 14, 2003 2:32 am

Sounds like overheating (!). Make sure all five fans are running at a decent RPM. I'd check this visually, too; don't just trust the numbers. FurryCurry's idea of cleaning the fans is good, too.

Also, check the mounting of the heatsinks. Maybe the clamps that secure one or both to the sockets have come loose. You might want to re-seat the heatsinks, too. Don't use too much of the thermal grease. That's as bad as not using enough.

Make sure the inlet holes through the case are clean and not blocked by books or whatever. I've also known people to drill out those holes in some cases.

Post by dwchang » Fri Mar 14, 2003 2:59 pm

Quote:
What kind of thermal interface material are you using? Did the heatsinks come with pads? A tube of goo?
I am using an... ATT (I think) with copper. I was told by the people at Fry's they would be sufficient. As for thermal grease, yeah, I put that on the die. I also put 4 pads on the corners. These are easy to get when you're an engineer at AMD :-P
Quote:
Also, check the mounting of the heatsinks. Maybe the clamps that secure one or both to the sockets have come loose. You might want to re-seat the heatsinks, too. Don't use too much of the thermal grease. That's as bad as not using enough.

Make sure the inlet holes through the case are clean and not blocked by books or whatever. I've also known people to drill out those holes in some cases.
Would "small" things like this cause that big of a problem? I checked the temperatures with SiSandra, and the board and two CPUs are running at 77 and 79 C respectively. Now these numbers sound high compared to 25 C, but like I said, I'm an engineer at AMD (a product engineer at that, so this is my job) and I know our chips can withstand much higher temperatures than 79 C.

Somewhat of an update: my computer won't boot now. I tried securing the heatsinks better last night, and it would get to Windows but not finish loading. I went to Fry's, picked up the BEST heatsinks (I think Thermacool or something), and after putting those on, I can't even get to the video card information. I *believe* the video card info is the first thing that loads up and then the BIOS stuff (like CPU, RAM, etc.). This means something is VERY wrong :(. However, I do think that this new problem is my processors getting cracked. The new heatsinks were VERY difficult to get on (and very tight). I know from AMD's end that we get a lot of customer returns for die chipping, so perhaps I killed them with the force exerted :(. I should be able to find that out soon though. If that's the case, I can always get new ones for free.
-Daniel

Post by FurryCurry » Fri Mar 14, 2003 5:53 pm

I can think of two possibilities based on your last post.

1) You used an excessive amount of thermal grease. I recommend following the instructions here for any sort of thermal grease:
http://www.arcticsilver.com/arctic_silv ... ctions.htm

2) The pads you added reduced/eliminated the contact area between the heatsink and the die, causing the problem. That page I linked shows a picture of a Duron about halfway down, with the factory-installed pads in place. If you installed pads on your heatsink as well, you probably screwed up the contact between your heatsink and die.

*Never use both thermal transfer tape (that pink or black square of stuff that comes pre-applied to the bottom of the heatsink) and thermal grease. Use one or the other.

*Never re-use thermal tape. To re-use a heatsink, clean off the old material with a solvent like alcohol or something stronger, and some sort of wood, cardboard, or plastic scraper (an expired credit card or something like that is great). Be careful not to scratch the bottom of the heatsink, and make sure it is completely clean before using it.

*you may well have killed one or both of your processors, or will shortly, if you keep trying to boot when there's obviously a problem.

*If you have damaged the die, there is usually some visible sign of that having happened, or you may have heard a nasty cracking noise while trying to install one of the heatsinks.

Search for info on how to do all this stuff at places like Ars Technica, 2cpu.com, Hard OCP, etc.
Getting the 'sinks on right seems to be a universal hurdle for newbie builders to get over.



I don't mean to be rude, but what kind of engineer for a company like AMD doesn't know how to do stuff like this? I'm just sort of surprised...

Post by dwchang » Sun Mar 16, 2003 2:32 am

Well, I'm pretty sure I figured it out. One of my fans was pointing the wrong way :(. I had an intake and an exhaust right next to each other. This would in essence do nothing (i.e. not cool) since they would cancel each other out. Would this cause the problem I described? Like I said, it would initially just reboot randomly (only about once a week, so no big deal I thought) and then after 6 months it started stalling (until I opened the case). Do you think something like the fan thing could cause a problem that takes time to get worse? If so, I think I fixed it.

As you mentioned, I did kill my two processors :(, but no big deal since I work for AMD and this may be a blessing in disguise. I think I'm getting two Bartons for free now, so that means 400 more MHz and 512 KB of cache instead of 256 KB. Basically, when putting on the much tighter heatsink, I'm pretty sure I cracked the die. Funny that I see these customer returns and now I've created two :(.

I think you are also correct about thermal grease since I put A LOT on which would reduce contact and in essence create somewhat of a wall.

I *think* fixing my fans will fix it, but I won't know till Monday. I'm running a single Palomino I'm borrowing from a friend (it sux0rs going from dual to single :() and it hasn't stalled (although having two processors creates A LOT more heat).

The strange thing is that SiSandra is still saying my CPU and MB are running at 77 C. This is equivalent to ~170 F. I've touched the board and heatsink and they are NOWHERE close to 170 F. Perhaps it's wrong...I sure hope so.
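
For anyone who wants to double-check the Celsius-to-Fahrenheit figure above, here is a minimal Python sketch. The function name is made up for illustration; the readings are the ones quoted in this thread (77 C and 79 C from SiSandra, 25 C for room temperature).

```python
def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


# Readings mentioned in the thread: the two reported values and room temperature.
for reading_c in (77.0, 79.0, 25.0):
    print(f"{reading_c:.0f} C = {celsius_to_fahrenheit(reading_c):.1f} F")
```

So 77 C works out to roughly 170 F, which matches the estimate in the post.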

Thanks A lot!
-Daniel

Post by dwchang » Sun Mar 16, 2003 2:36 am

Quote:
I don't mean to be rude, but what kind of engineer for a company like AMD doesn't know how to do stuff like this? I'm just sort of surprised...
Don't worry, I'm not at all insulted, but I'm fairly sure the stuff like thermal grease and whatnot is fine. As for the heatsink (which caused the die cracking), this is pretty common since we get a ton of customer returns for it. Trust me, this heatsink was TIGHT, and if you were to slip even a little bit, that small amount of force could easily crack your die (since it is exposed).

Although, I hate to feign ignorance, but I'm not a systems engineer. I'm a design and product engineer. Systems engineers would do the things you talk about, but as product engineers, we find things like speed paths, L2 fails, etc. As for design... that's obviously self-explanatory. I hate to admit it, but I'm not that "hands-on." I know the stereotype is that engineers are like that, but that is very untrue. I studied advanced computer architecture, VLSI, Verilog/VHDL, etc., not anything like this. This is more of a technician's job. I guess I'm trying to make excuses, but in any case, don't worry... I haven't f'ed up AMD yet :-P. In fact, if I could talk....
-Daniel

Post by TaranT » Sun Mar 16, 2003 3:47 am

77 C sounds high, but I'm not familiar with AMD chips. My editing system right now is running at 60 C (CPU) and 40 C (system). This is from the BIOS/mobo status monitors, and I've been told that my CPU at that temp is actually higher than it should be. This is an Intel P4 (2.53 GHz) in a very tightly packed box (Shuttle X-PC). Judging from my numbers, I would have thought there would be more of a difference between your CPU and mobo temps.

Maybe SiSandra is not scaling the thermo voltages properly? Or maybe the mobo is defective (?).

I think the too-thick grease was more of a problem than the fan, although synching the fans is still important. The real test is to let it run and see what happens.

A quick Google search turned up this:
The 2100+ is no exception, being the very last 0.18 micron CPU. During our testing, normal operating temperature, with a pretty big sized heatsink and a loud fan, is around 50-55 degrees for this chip. Compare that to the P4 2.4, which hardly got over 35 degrees with the stock Intel cooler!
This was at http://www.hardcoreware.net/reviews/pro ... p4_2/1.php

Post by dwchang » Sun Mar 16, 2003 4:09 am

Quote:
77 C sounds high, but I'm not familiar with AMD chips. My editing system right now is running at 60 C (CPU) and 40 C (system). This is from the BIOS/mobo status monitors, and I've been told that my CPU at that temp is actually higher than it should be. This is an Intel P4 (2.53 GHz) in a very tightly packed box (Shuttle X-PC). Judging from my numbers, I would have thought there would be more of a difference between your CPU and mobo temps.

Maybe SiSandra is not scaling the thermo voltages properly? Or maybe the mobo is defective (?).

I think the too-thick grease was more of a problem than the fan, although synching the fans is still important. The real test is to let it run and see what happens.
I see. Well, the thing that seems to back up the software being wrong is that when I open up my case, it spits out the exact same values for temperature. My understanding of thermodynamics (I did get a B, so perhaps this is bad understanding haha) is that the air from my apartment (room temperature ~25 C) would cool them below 77 C. Also, I can touch the motherboard and it doesn't feel much hotter than 100 F. The northbridge (which should be the hottest part of the board) is also probably around 120 F. I *hope* it's something with the software, but like you said, we'll see very soon since that's the true test.

You think the thermal grease would cause something to degrade that slowly (I guess six months isn't that slow though)? My understanding is that too much thermal grease causes the processor to not make adequate contact with the heatsink (makes sense from a physical standpoint of layers and whatnot), but wouldn't that become more apparent earlier? (shrug)

Shuttle PC for editing? Nice. I bet you could bring those to conventions if you wanted. I guess I can't do that with a dual, but the raw power is nice :-P
A quick Google search turned up this:
Quote:
The 2100+ is no exception, being the very last 0.18 micron CPU. During our testing, normal operating temperature, with a pretty big sized heatsink and a loud fan, is around 50-55 degrees for this chip. Compare that to the P4 2.4, which hardly got over 35 degrees with the stock Intel cooler!
Wow thanks for the link! Well I don't have a huge and loud fan/heatsink so perhaps 77 is correct. Well, it's not big news that AMD processors (namely Thunderbird) are (were) hot. I know from Product Development that they are built to withstand temperatures above 79 C, however my understanding is that the higher the temperature, the more you degrade the life of the processor. Hopefully it's not to the point of 6 months though.

Thanks TaranT!
-Daniel

Post by dwchang » Sun Mar 16, 2003 4:20 am

OK something is definitely up with the software because in my motherboard hardware monitor, the CPU temperature is 40 C (more reasonable). I'm under the impression that although the CPU isn't doing as much when it posts and in the BIOS, this 40 C is fairly accurate.
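
The cross-check described above (comparing the motherboard hardware monitor against the software reading) can be written down as a tiny Python sketch. This is only an illustration: the function name and the 10 C tolerance are assumptions, not anything SiSandra or the BIOS provides, and the two readings are the ones reported in this thread.

```python
def readings_disagree(bios_temp_c: float,
                      software_temp_c: float,
                      tolerance_c: float = 10.0) -> bool:
    """Return True when two temperature sources differ by more than the tolerance.

    tolerance_c is a hypothetical threshold picked for illustration only.
    """
    return abs(bios_temp_c - software_temp_c) > tolerance_c


# 40 C from the motherboard hardware monitor vs. 77 C from the software.
if readings_disagree(bios_temp_c=40.0, software_temp_c=77.0):
    print("The two sources disagree; at least one reading is suspect.")
```

A large gap does not say which source is wrong. The BIOS reading is taken at light load, so some difference is expected, which is why the tolerance is left as a parameter.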

I know right now I am running a ton more programs and there are a lot of operations going on, but I don't think it would double the temperature and as I said, it doesn't change if I open the case or not. Argh stupid software :-P
I guess I'll find out in about 6 months after installing my new (hopefully) Bartons (or Thoroughbreds) this week. Putting in another CPU will increase the temperature, but I *think* everything after Palomino (T-bred and Barton) runs cooler. I've only worked with the company for 8 months so I know almost nothing about their designs. (Shrug) I can only hope.

BTW you should see what a part looks like after it's in an oven at 140 C for 2 weeks :-P
-Daniel

Post by FurryCurry » Sun Mar 16, 2003 1:18 pm

Those 40 C temps look a lot more realistic than what SiSandra was reporting. I've never really thought of Sandra as a useful temp monitoring solution, but somehow the fact you were using it breezed past me before.

The best way to deal with case airflow, IME, is to have things set up to flow in one direction, ie front fans blowing in, rear fans and the PSU pulling air out.
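
As a rough illustration of the one-direction airflow rule above, here is a small Python sketch that flags fans blowing against the front-to-rear flow. Everything in it is hypothetical: the `Fan` class, the position names, and the example layout (the thread says one fan was reversed but not which one, so the reversed rear fan below is an assumption).

```python
from dataclasses import dataclass


@dataclass
class Fan:
    position: str   # "front" or "rear"
    direction: str  # "in" (intake) or "out" (exhaust)


# Front fans should pull air in; rear fans should push it out.
EXPECTED_DIRECTION = {"front": "in", "rear": "out"}


def misdirected_fans(fans: list[Fan]) -> list[Fan]:
    """Return the fans whose direction does not match the front-to-rear flow.

    Fans at a position not listed in EXPECTED_DIRECTION are also returned,
    since there is no expected direction to compare against.
    """
    return [fan for fan in fans
            if EXPECTED_DIRECTION.get(fan.position) != fan.direction]


# Hypothetical layout: one front fan and two rear fans, one rear fan reversed.
case_fans = [Fan("front", "in"), Fan("rear", "out"), Fan("rear", "in")]
for fan in misdirected_fans(case_fans):
    print(f"Check the {fan.position} fan: it is blowing {fan.direction}.")
```

The PSU fan is left out of the sketch; it normally exhausts out the rear as well.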

A couple other things came to mind, one would be the capacity of your power supply. Those types of duallie rigs REALLY suck down the power. My dual MP 1900 system killed a 400 watt psu (fizzing, smoky death! hehe!) after about 8 months, replacing it with a high-quality 550 watt Antec solved that problem.

Also, some boards (like my now evicted Tyan Tiger) have issues with the number of banks of memory installed. That thing would never run stable with four DIMMs installed (1024 MB); I had to settle for running three (768 MB). It's possible one or more of your DIMMs has worked itself a bit loose, or is beginning to fail. Reseating your memory and testing with Memtest86 should help you check this.

Some (a lot, actually) of those dual AMD boards are quirky in their own strange ways, and one other resource I can point you to for more specific info would be 2cpu.com, particularly the forums there, for info on your board, and dual systems in particular.

Good luck with getting your system stable, nothing about computing is more personally demoralizing to me than trying to deal with flaky, unstable hardware you can't trust not to suddenly crash on you when you need it the most.

Oh, if you post what actual brand and model of system board you have, I or someone else might know of a particular issue with that board. (boy, do I know about weirdness with the Tiger. -_-; )

Post by dwchang » Sun Mar 16, 2003 1:51 pm

Quote:
The best way to deal with case airflow, IME, is to have things set up to flow in one direction, ie front fans blowing in, rear fans and the PSU pulling air out.
That's what I have now.
Quote:
A couple other things came to mind, one would be the capacity of your power supply. Those types of duallie rigs REALLY suck down the power. My dual MP 1900 system killed a 400 watt psu (fizzing, smoky death! hehe!) after about 8 months, replacing it with a high-quality 550 watt Antec solved that problem.
I have a 420 W power supply from Antec. I have been told 400 W or greater would be sufficient for a dual set-up with an AGP video card.
Quote:
Oh, if you post what actual brand and model of system board you have, I or someone else might know of a particular issue with that board. (boy, do I know about weirdness with the Tiger. -_-; )
I have a Tyan Tiger 760 MPX. I am using banks 0 and 2 with two 512 MB ECC registered DIMMs. I have heard about the issues with the Tyan Tiger (something about the 12 pin connector heating up, etc.), but I was hoping the MPX had these things fixed.

Thanks!
-Daniel

CaTaClYsM
Joined: Fri Jul 26, 2002 3:54 am

Post by CaTaClYsM » Sun Mar 16, 2003 9:11 pm

OK, I know this is going to be a stupid question, but exactly what programs do you have running? I mean, when you press CTRL+ALT+DELETE, what do you typically have in the background?

Post by dwchang » Sun Mar 16, 2003 9:26 pm

Quote:
OK, I know this is going to be a stupid question, but exactly what programs do you have running? I mean, when you press CTRL+ALT+DELETE, what do you typically have in the background?
I close almost everything except for ZoneAlarm (firewall). Of course there are always the other things running like rundll32, etc.
-Daniel