you: yes, like I said fifty billion times before, clock speeds are just as important as efficiency. But I like efficiency better than high clock speeds.
me: But you still haven't explained WHY!!! AAAAURGH!!! *head explodes*
you: okay, the real reason I hate the Intel is because they were smart enough to make a chip that was just as fast as ours but looked a lot faster, and we wish we'd thought of that first.
Misc:
For the record, a 3.06 GHz P4 rates at 18.4 gigacraps - a bit lower than your 20.25 gigacraps Athlon XP.
According to PCmech.com, the PIII executes 5 instructions per clock cycle, and the P4 executes 6. So how the hell could a PIII at a given clock speed outperform a P4 at the same clock speed? you said yourself that performance = etc. etc....
Does RAM have anything to do with the processor or MB?
- the Black Monarch
- Joined: Tue Jul 09, 2002 1:29 am
- Location: The Stellar Converter on Meklon IV
- dwchang
- Sad Boy on Site
- Joined: Mon Mar 04, 2002 12:22 am
- Location: Madison, WI
Argh! Let the topic die!!!! Final response...I swear.
the Black Monarch wrote:
you: yes, like I said fifty billion times before, clock speeds are just as important as efficiency. But I like efficiency better than high clock speeds.
me: But you still haven't explained WHY!!! AAAAURGH!!! *head explodes*
you: okay, the real reason I hate the Intel is because they were smart enough to make a chip that was just as fast as ours but looked a lot faster, and we wish we'd thought of that first.
Why do I prefer a more efficient solution? Because I've been taught to design that way. Granted, it's not exactly the best approach with respect to money, since it doesn't cater to the speed = performance misconception. In school we are taught to design the most efficient and most elegant solutions. That's just how it works...
Oh and it's not best to continually take cheap shots when you're trying to get info.
the Black Monarch wrote:
According to PCmech.com, the PIII executes 5 instructions per clock cycle, and the P4 executes 6. So how the hell could a PIII at a given clock speed outperform a P4 at the same clock speed? you said yourself that performance = etc. etc....
First of all, I question your source, but upon going to the website I found this article, which seems to contradict what the site supposedly says. Let me quote the line that is of importance:
"Along comes the Pentium 4, equipped with an instruction pipeline that is twice as long as it's predecessor, the Pentium iii, 20 stages to be exact. For a 1.5GHz P4, which is what Intel plans on releasing it at, the pipeline will take (assuming one instruction per stage per clock cycle.) the P4 will still pump out 1 instruction per pipeline per clock cycle, but each individual instruction will take 1.333*10^-8 seconds to get out of the pipeline. That's 0.333*10^-8 seconds longer than it's predecessor, which isn't good.
But, because the P4 will still pump out one instruction per clock cycle, which is still good. It will just take longer to do individual instructions. In processor heavy benchmarks, such as 3D games, the Pentium 4 at 1.4GHz is actually shown to be lesser than or equal to the speed of the Pentium !!!. Please be aware that the Pentium 4 tested was a pre-release sample, and meant only to improve the final product by finding faults before they go on sale, just like Beta Software.
Before I go any further chastising the Pentium 4 for being mentally slow, I must say there are a few advantages to having a longer member...err...pipeline. A longer pipeline gives the chip the ability to be ramped up to higher clock speeds. This helps to offset the obvious disadvantage of a longer pipeline at the same clock speed."
Now let me say that the numbers are VERY incorrect, in that the P4 DOES NOT churn out one instruction per clock. However, his statements about it being "mentally slow" ARE correct. As he also states, they did this so they could bump up the frequency.
Now of course they ASSUME one instruction per clock, but even if it were 6 vs. 5 (which I'm not sure is true), it's obvious the 5 would come out quicker due to a more efficient core, which is what I've been saying. Now again, if you pump up the clock, you can close this efficiency gap, as stated earlier.
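The latency figures in the quoted PCmech passage can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: the function names are made up for illustration, and the PIII figures (10 stages at 1 GHz) are an assumption inferred from the quote's "twice as long" remark and its stated latency difference, not numbers the quote gives directly. Like the quote, it assumes one stage per clock cycle.

```python
def instruction_latency_s(stages: int, clock_hz: float) -> float:
    """Time for one instruction to travel the whole pipeline,
    assuming it advances exactly one stage per clock cycle."""
    return stages / clock_hz


def peak_throughput_ips(clock_hz: float, instr_per_cycle: float = 1.0) -> float:
    """Best-case instructions completed per second once the pipeline is full."""
    return clock_hz * instr_per_cycle


# P4 figures come from the quoted passage: 20 stages at 1.5 GHz.
p4_latency = instruction_latency_s(stages=20, clock_hz=1.5e9)

# ASSUMPTION: 10 stages at 1 GHz for the PIII (inferred, not quoted).
p3_latency = instruction_latency_s(stages=10, clock_hz=1.0e9)

print(f"P4 latency:   {p4_latency:.3e} s")
print(f"PIII latency: {p3_latency:.3e} s")
print(f"difference:   {p4_latency - p3_latency:.3e} s")
print(f"P4 peak throughput at 1 instr/cycle: {peak_throughput_ips(1.5e9):.2e} instr/s")
```

Latency (how long one instruction spends in the pipe) and throughput (how many instructions finish per second) are separate things here: the deeper pipeline raises the first without lowering the best-case value of the second.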
Here is a fairly technical article that goes through it. Granted, 99% of it is not relevant to this argument (I'm not saying the info isn't correct, just that most of it is not related), so I'll point out a few quotes:
"and anyone who was paying attention during that time learned at least one, major lesson: clock speed sells. Intel was definitely paying attention, and as the Willamette team labored away in Hillsboro, Oregon they kept MHz foremost in their minds. This singular focus is evident in everything from Intel's Pentium 4 promotional and technical literature down to the very last detail of the processor's design. "
"This article will examine the tradeoffs and design decisions that the P4's architects made in their effort to build a MHz monster"
"As we'll see, the Pentium 4 makes quite a few sacrifices for clock speed, and although Intel tries to spin it differently, an extraordinarily deep pipeline is one of those sacrifices. "
"The P4's long pipeline means that bubbles take a long time to propagate off the CPU, so a single bubble results in a lot of wasted cycles...So a single bubble in the P4's 20 stage pipeline wastes at least 20 clock cycles (more if it's a bubble in one of the longer FPU pipelines),...20 clock cycles is a lot of wasted work, and even if the P4's clock is running twice as fast as the G4e's it still takes a larger performance hit for each pipeline bubble than the shorter machine."
Now I know that's pretty technical, but all I'm getting at is that the architecture IS more inefficient regardless of what you've read. At the same time, it runs quite fast, as previously stated. I will of course admit that QUITE a bit of the article is unrelated (hence the quotes instead of making you read it). I will also state that later on it does talk about how the P4 handles some of these deficiencies (like the scheduler). I admit that just in case of a rebuttal (since I'm trying to kill the thread...tired of writing so much).

I don't really have much more to say since well...it's a fairly well-accepted fact that the P3 is more efficient than the P4. Also, Ars Technica is a fairly well-respected site with regards to this stuff.
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
- alternatefutures
- Joined: Mon May 14, 2001 2:43 am
Well, the real killer for the P4 is not that it takes 20 cycles for a single instruction to go through, since you can pretty much have an instruction being executed in every stage simultaneously. It's when the P4 screws up in its branch prediction, which forces it to empty everything in that pipeline and start anew, that really hurts it (especially since those extra pipeline stages require even more branch predictions, and thus more likelihood for an f-up). That's why I'm interested when Intel says Prescott will have improved branch prediction, but I haven't found anything that would tell me exactly why it does or approximately how much better it is supposed to be.
- dwchang
- Sad Boy on Site
- Joined: Mon Mar 04, 2002 12:22 am
- Location: Madison, WI
alternatefutures wrote:
Well, the real killer for the P4 is not that it takes 20 cycles for a single instruction to go through, since you can pretty much have an instruction being executed in every stage simultaneously. It's when the P4 screws up in its branch prediction, which forces it to empty everything in that pipeline and start anew, that really hurts it (especially since those extra pipeline stages require even more branch predictions, and thus more likelihood for an f-up). That's why I'm interested when Intel says Prescott will have improved branch prediction, but I haven't found anything that would tell me exactly why it does or approximately how much better it is supposed to be.
GRRRR! Did I not put that in my post? Argh! I even had that line highlighted in my mind from the Ars Technica article.
You are correct though, since if you mispredict a branch, you have to squash that instruction and start all 20 stages over again. Sometimes it's worse, since you might've fetched a bunch of other wrong instructions.
Regardless, thanks for bringing that up since well..it's the truth
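To put rough numbers on the misprediction point, here is a small sketch of the usual textbook-style model where each mispredicted branch costs a full pipeline refill. Everything in it is an assumption for illustration: the function names, the 20% branch fraction, the 5% misprediction rate, the 10-stage "short" pipeline, and the clock speeds are made-up values, not measurements of any real chip.

```python
def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, flush_penalty_cycles: int) -> float:
    """Average cycles per instruction once misprediction flushes are counted.

    Each mispredicted branch is charged `flush_penalty_cycles`, taken here
    to equal the pipeline depth (the whole pipe has to refill).
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


def instructions_per_second(clock_hz: float, cpi: float) -> float:
    """Sustained throughput for a given clock and average CPI."""
    return clock_hz / cpi


# HYPOTHETICAL workload: 1 in 5 instructions is a branch, 5% are mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

short_cpi = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, flush_penalty_cycles=10)
deep_cpi = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, flush_penalty_cycles=20)

# Clock ratio the deep pipeline needs just to match the short one.
break_even_clock_ratio = deep_cpi / short_cpi

print(f"short pipeline CPI: {short_cpi:.2f}")
print(f"deep pipeline CPI:  {deep_cpi:.2f}")
print(f"deep pipeline needs {break_even_clock_ratio:.3f}x the clock to break even")
print(f"short @ 1.0 GHz: {instructions_per_second(1.0e9, short_cpi):.3e} instr/s")
print(f"deep  @ 1.4 GHz: {instructions_per_second(1.4e9, deep_cpi):.3e} instr/s")
```

Under these made-up numbers the deep pipeline loses at equal clocks but comes out ahead once it is clocked high enough, which is the tradeoff described in this thread. A worse misprediction rate pushes the break-even clock higher.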

-Daniel
- the Black Monarch
- Joined: Tue Jul 09, 2002 1:29 am
- Location: The Stellar Converter on Meklon IV
dwchang wrote:
Why do I prefer a more efficient solution....because I've been taught to design that way. In school we are taught to design the most efficient designs and the more elegant solutions. That's just how it works...
Ah, I knew you'd tell me the answer if I bugged you enough.
dwchang wrote:
each individual instruction will take 1.333*10^-8 seconds to get out of the pipeline. That's 0.333*10^-8 seconds longer than it's predecessor, which isn't good.
Ah, so THAT's how the P6 core outperforms the P7 core! I didn't bother trying to understand that other crap; this was all the info that I really needed.
Ask me about my secret stash of videos that can't be found anywhere anymore.