Speed (MHz) != Performance


Postby dj-ohki » Tue Nov 11, 2003 10:59 pm

dwchang wrote:Wow 0 for 3 in the three posts i've replied to.

The Athlon 64 has a 256k L2 cache. I can guarantee you on that. You don't need a whitepaper for that. Just go to any major technical website or vendor.


referenced from http://www.amd.com/us-en/assets/content ... agram3.gif

128k L1 (64 data, 64 instruction), '1152KB effective cache' (which is where i got the 1mb cache from)

the athlon fx has its own reference page, with http://www.amd.com/us-en/assets/content ... agram3.gif

note, 128 L1, 1152KB effective. thus, 1meg on each.
The Athlon FX has a 1MB L2 cache. Although you are right about the HT links. It only has one while the Opteron has 3. Then again, why would a consumer need more than one HT link which runs at over 4 GB/s xfer on the north-bridge.



6.4Gb/sec, which is 0.8GB/sec. and if the FX is posed to be a prosumer cpu, i would have expected it to have at least 2 HT links (2 way glueless SMP).
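for anyone checking the units, here is a rough sketch of the arithmetic. the link parameters in the usage note (16 bits wide in each direction, 1600 mega-transfers per second) are assumptions for illustration, not figures taken from this thread, and the function names are made up.

```python
def gbit_to_gbyte(gigabits_per_sec):
    """Convert gigabits/s to gigabytes/s (8 bits per byte, decimal units)."""
    return gigabits_per_sec / 8


def link_bandwidth(width_bits, mega_transfers_per_sec, directions=2):
    """Return (gigabits/s, gigabytes/s) for a link.

    width_bits: link width per direction (assumed value, see lead-in)
    mega_transfers_per_sec: transfers per second in millions (assumed)
    directions: 2 counts both directions together, 1 counts one way only
    """
    gigabits = width_bits * mega_transfers_per_sec * directions / 1000
    return gigabits, gbit_to_gbyte(gigabits)


if __name__ == "__main__":
    print(gbit_to_gbyte(6.4))        # the conversion used in the post
    print(link_bandwidth(16, 1600))  # hypothetical link, both directions
```

under those assumed parameters the aggregate comes out in gigabytes rather than gigabits, so whether a quoted 6.4 means Gb or GB changes the answer by a factor of eight.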

2 MB cache? Are you out of your mind. That would make the die nearly 60% larger than it already is. Who do you think we are? Intel? People who just throw bigger caches at a performance problem (*cough* Pentium 4 Extreme Edition *cough*).


because the opterion is posed to strike the Xeons, though the 2MB/l2 is prolly from a very old tech spec. but, i would have liked to see a 2MB or even a 8 or 16 MB version of the opertion or 64FX, being memory starved as they are. or perhaps even a 256, 512 or even 1024 bit memory bus when multiple DDR slots are occupied (like nvidia's memory crossbar in the FX video cards). die size be damned, a 1024bit DDR3200 memory controller in a opeterion core with 8MB L2 would FLY.
dj-ohki
 
Joined: 17 Apr 2001

Postby dwchang » Wed Nov 12, 2003 2:43 pm

dj-ohki wrote:
dwchang wrote:Wow 0 for 3 in the three posts i've replied to.

The Athlon 64 has a 256k L2 cache. I can guarantee you on that. You don't need a whitepaper for that. Just go to any major technical website or vendor.


referenced from http://www.amd.com/us-en/assets/content ... agram3.gif

128k L1 (64 data, 64 instruction), '1152KB effective cache' (which is where i got the 1mb cache from)

the athlon fx has its own reference page, with http://www.amd.com/us-en/assets/content ... agram3.gif

note, 128 L1, 1152KB effective. thus, 1meg on each.


uhm...I don't mean to sound arrogant, but I don't blame you since you wouldn't know this.

Nobody in industry talks about "total cache" (which includes the L3). When you talk cache, you talk about the L2 and in both cases, my statements are still correct. You'll notice I explicitly said L2.

FX = 1 MB L2
Athlon 64 = 256k L2

dj-ohki wrote:
The Athlon FX has a 1MB L2 cache. Although you are right about the HT links. It only has one while the Opteron has 3. Then again, why would a consumer need more than one HT link which runs at over 4 GB/s xfer on the north-bridge.



6.4Gb/sec, which is 0.8GB/sec. and if the FX is posed to be a prosumer cpu, i would have expected it to have at least 2 HT links (2 way glueless SMP).


Again, the FX still has only one HT link (it even says on that site "A Hypertransport...). I believe the memory interface is 128 bit though.

dj-ohki wrote:
2 MB cache? Are you out of your mind. That would make the die nearly 60% larger than it already is. Who do you think we are? Intel? People who just throw bigger caches at a performance problem (*cough* Pentium 4 Extreme Edition *cough*).


because the opterion is posed to strike the Xeons, though the 2MB/l2 is prolly from a very old tech spec. but, i would have liked to see a 2MB or even a 8 or 16 MB version of the opertion or 64FX, being memory starved as they are. or perhaps even a 256, 512 or even 1024 bit memory bus when multiple DDR slots are occupied (like nvidia's memory crossbar in the FX video cards). die size be damned, a 1024bit DDR3200 memory controller in a opeterion core with 8MB L2 would FLY.


Uhm...Memory Starved? Do you understand Computer Architecture? 1 MB of L2 cache is A LOT...well when you consider die-size.

1) L2 increasing has diminishing returns. Sure if you increase it, you get performance, but at a certain point, it's not worth it. The returns just aren't that good.

2) Increasing the L2 will increase the die size. Do you really wanna have a chip *that* big? Oh and go tell the Motherboard manufacturers that they gotta go design *another* motherboard. Yeah right.

3) Increasing L2 size will lead to less yield. L2's aren't that easy to fabricate (or rather large ones). As I said with diminishing returns in performance, you will also get diminishing returns in yield since you won't get that many chips out. Go make an 8 MB L2 cache and see just how many of these (giant) processors will yield at final sort. I doubt even one would out of a possible 250+.

Again, increasing the L2 is just an easy way to increase performance and everyone knows it's the lazy approach. It's *much* better to just make optimizations to your pipeline and various other components. It A) won't increase die-size, B) won't destroy yields and C) will lead to much better performance increases.
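To put rough numbers on points 2 and 3, here is a minimal sketch using the simple Poisson yield model (yield = exp(-area x defect density)). Everything numeric in it is an assumption for illustration: the defect density, the die areas, and the wafer area are hypothetical, and the model ignores edge loss and any cache redundancy or repair.

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def good_dies_per_wafer(wafer_area_cm2, die_area_cm2, defects_per_cm2):
    """Rough count of working dies: candidate dies times yield.

    Ignores dies lost at the wafer edge, so it overestimates slightly.
    """
    candidates = wafer_area_cm2 // die_area_cm2
    return candidates * poisson_yield(die_area_cm2, defects_per_cm2)


if __name__ == "__main__":
    defect_density = 0.5   # defects per cm^2 (hypothetical)
    wafer_area = 300.0     # cm^2 of usable wafer (hypothetical)
    base_die = 1.0         # cm^2 (hypothetical)
    bigger_die = 1.6       # 60% larger, echoing the figure in the thread
    for area in (base_die, bigger_die):
        print(area, good_dies_per_wafer(wafer_area, area, defect_density))
```

The bigger die loses twice in this model: fewer candidates fit on the wafer, and each one is more likely to catch a defect.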
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby dj-ohki » Thu Nov 13, 2003 1:09 am

dwchang wrote:uhm...I don't mean to sound arrogant, but I don't blame you since you wouldn't know this.

Nobody in industry talks about "total cache" (which includes the L3). When you talk cache, you talk about the L2 and in both cases, my statements are still correct. You'll notice I explicitly said L2.


just going from whats on your guys' site. im not trying to argue, im just pointing out where i got that info. since you corrected me, i stand corrected

FX = 1 MB L2
Athlon 64 = 256k L2


so you're saying the ath64 has 768 kb of l3 and the FX has 0 l3? FWIW, IIRC, there is 0 off chip cache in any 64 bit athlon variant.

dwchang wrote:
dj-ohki wrote:
The Athlon FX has a 1MB L2 cache. Although you are right about the HT links. It only has one while the Opteron has 3. Then again, why would a consumer need more than one HT link which runs at over 4 GB/s xfer on the north-bridge.



6.4Gb/sec, which is 0.8GB/sec. and if the FX is posed to be a prosumer cpu, i would have expected it to have at least 2 HT links (2 way glueless SMP).


Again, the FX still has only one HT link (it even says on that site "A Hypertransport...). I believe the memory interface is 128 bit though.


yup, i know it only has 1 HT link, im waxing rhetorical on the idea that having 2 HT links would have been really nice for a prosumer chip, due to the fact it would allow for 2 way glueless SMP.

dwchang wrote:
dj-ohki wrote:
2 MB cache? Are you out of your mind. That would make the die nearly 60% larger than it already is. Who do you think we are? Intel? People who just throw bigger caches at a performance problem (*cough* Pentium 4 Extreme Edition *cough*).


because the opterion is posed to strike the Xeons, though the 2MB/l2 is prolly from a very old tech spec. but, i would have liked to see a 2MB or even a 8 or 16 MB version of the opertion or 64FX, being memory starved as they are. or perhaps even a 256, 512 or even 1024 bit memory bus when multiple DDR slots are occupied (like nvidia's memory crossbar in the FX video cards). die size be damned, a 1024bit DDR3200 memory controller in a opeterion core with 8MB L2 would FLY.


Uhm...Memory Starved? Do you understand Computer Architecture? 1 MB of L2 cache is A LOT...well when you consider die-size.

1) L2 increasing has diminishing returns. Sure if you increase it, you get performance, but at a certain point, it's not worth it. The returns just aren't that good.

2) Increasing the L2 will increase the die size. Do you really wanna have a chip *that* big? Oh and go tell the Motherboard manufacturers that they gotta go design *another* motherboard. Yeah right.

3) Increasing L2 size will lead to less yield. L2's aren't that easy to fabricate (or rather large ones). As I said with diminishing returns in performance, you will also get diminishing returns in yield since you won't get that many chips out. Go make an 8 MB L2 cache and see just how many of these (giant) processors will yield at final sort. I doubt even one would out of a possible 250+.

Again, increasing the L2 is just an easy way to increase performance and everyone knows it's the lazy approach. It's *much* better to just make optimizations to your pipeline and various other components. It A) won't increase die-size, B) won't destroy yields and C) will lead to much better performance increases.


again, waxing rhetorical, based on the fact that the power3 has 8 meg, the power 4 has 1.4 meg, and the RS64-3 has 16 meg. im not arguing here.
dj-ohki
 
Joined: 17 Apr 2001

Postby dwchang » Thu Nov 13, 2003 2:15 am

dj-ohki wrote:so you're saying the ath64 has 768 kb of l3 and the FX has 0 l3? FWIW, IIRC, there is 0 off chip cache in any 64 bit athlon variant.


Actually both don't have an L3 period. It's just as I stated with the FX at 1 MB L2 and 64 at 256 KB. Both have the same L1.

dj-ohki wrote:yup, i know it only has 1 HT link, im waxing rhetorical on the idea that having 2 HT links would have been really nice for a prosumer chip, due to the fact it would allow for 2 way glueless SMP.


True. That would definitely help, but I imagine it would've been quite a bit more difficult to get in there.

dj-ohki wrote:again, waxing rhetorical, based on the fact that the power3 has 8 meg, the power 4 has 1.4 meg, and the RS64-3 has 16 meg. im not arguing here.


Well you're right *but* the big difference is that these are high-end server chips. We're talking (or so I thought) about consumer desktop chips and the L2 wouldn't need to be that big for consumer based applications and more importantly, they wouldn't want to pay that much for them. As I said earlier, it's hard to yield with a larger cache and thus the price also goes up (to make up for that cost).
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby dj-ohki » Thu Nov 13, 2003 9:56 am

dwchang wrote:
dj-ohki wrote:so you're saying the ath64 has 768 kb of l3 and the FX has 0 l3? FWIW, IIRC, there is 0 off chip cache in any 64 bit athlon variant.


Actually both don't have an L3 period. It's just as I stated with the FX at 1 MB L2 and 64 at 256 KB. Both have the same L1.


then the over 1 meg 'effective' cache listing on their site is pretty much a load of crap. or a really stretched spin.

dj-ohki wrote:yup, i know it only has 1 HT link, im waxing rhetorical on the idea that having 2 HT links would have been really nice for a prosumer chip, due to the fact it would allow for 2 way glueless SMP.


True. That would definitely help, but I imagine it would've been quite a bit more difficult to get in there.


true, but since ath64 is NUMA and not SMP, you wouldnt have to worry about cache coherency and all that rot. *shrug* oh well, there's always operterion.

dj-ohki wrote:again, waxing rhetorical, based on the fact that the power3 has 8 meg, the power 4 has 1.4 meg, and the RS64-3 has 16 meg. im not arguing here.


Well you're right *but* the big difference is that these are high-end server chips. We're talking (or so I thought) about consumer desktop chips and the L2 wouldn't need to be that big for consumer based applications and more importantly, they wouldn't want to pay that much for them. As I said earlier, it's hard to yield with a larger cache and thus the price also goes up (to make up for that cost).


i thought we were having 2 distinct conversations, one about ath64/ath64fx (consumer/prosumer) and one about the opterion, which is amd's foray into the high end server market. 64 bit NUMA architecture with 8 way glueless MP, thats pretty high end.

and no, consumer level programs have no need for 8 meg l2s. it would be nice to be able to fit an entire filter chain + video frame in cache, but again, not needed.

anyway, im hoping to get a hold of a dualie sledgehammer in the next few months, once the price goes down. either that, or a dualie P4EE (quad dispatch engines is pretty sweet, and before you jump on that, i know it only helps poorly written code) all depends on how things sit in the future.
dj-ohki
 
Joined: 17 Apr 2001

Postby dwchang » Thu Nov 13, 2003 4:49 pm

dj-ohki wrote:then the over 1 meg 'effective' cache listing on their site is pretty much a load of crap. or a really stretched spin.


My guess is that it's a typo or someone being lazy (probably copied and pasted the FX chart). The Athlon 64 has 256k L2 and 128k (64/64) of L1. I know that. Oh and the FX *does* have over 1 MB of effective cache since 1 MB of L2 and 128k of L1 (= 1152 KB). That's the reason for my suspicion of a typo since the FX equates correctly.
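Here is the sum spelled out, as a quick sketch. It assumes "effective cache" simply means L1 data + L1 instruction + L2 added together, which is how the 1152KB figure quoted from AMD's diagram earlier in the thread works out; the function name is made up, and the 256 KB L2 input is just the figure claimed in this thread.

```python
def effective_cache_kb(l1_data_kb, l1_instr_kb, l2_kb):
    """Total of L1 data, L1 instruction, and L2, in KB.

    Assumes the levels do not hold duplicate data, so sizes simply add.
    """
    return l1_data_kb + l1_instr_kb + l2_kb


if __name__ == "__main__":
    # 64 KB + 64 KB of L1 with a 1 MB (1024 KB) L2
    print(effective_cache_kb(64, 64, 1024))
    # same L1 with a 256 KB L2, the figure claimed in this thread
    print(effective_cache_kb(64, 64, 256))
```

With a 1 MB L2 the total is 1152 KB; with a 256 KB L2 it would be 384 KB, so the same "1152KB effective" label cannot be right for both configurations.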

dj-ohki wrote:true, but since ath64 is NUMA and not SMP, you wouldnt have to worry about cache coherency and all that rot. *shrug* oh well, there's always operterion.


What would cache coherency have to do with a single CPU? The Athlon 64 isn't a multi-processor die period (whether SMP or whatever). There wouldn't be any cache coherency problems since it is all internal.

Also why would the HT link have anything to do with the cache coherency if it's not on an MP processor. Effectively for an Athlon 64, it's just a really fast bus on the northbridge. I imagine that's why there is only one HT link...since it's not necessary to have more. But don't quote me on that...I didn't design the thing..just making logical guesses.

dj-ohki wrote:i thought we were having 2 distinct conversations, one about ath64/ath64fx (consumer/prosumer) and one about the opterion, which is amd's foray into the high end server market. 64 bit NUMA architecture with 8 way glueless MP, thats pretty high end.

and no, consumer level programs have no need for 8 meg l2s. it would be nice to be able to fit an entire filter chain + video frame in cache, but again, not needed.

anyway, im hoping to get a hold of a dualie sledgehammer in the next few months, once the price goes down. either that, or a dualie P4EE (quad dispatch engines is pretty sweet, and before you jump on that, i know it only helps poorly written code) all depends on how things sit in the future.


No, I thought we were only talking about Desktop :P. If you talk about servers well...I imagine an 8-way Opteron can maul a Power4 or anything with an 8 MB cache. 8-way system vs. 8 MB L2...that's an easy choice in terms of efficiency and performance ne?

As you have already stated though, an 8 MB cache is ridiculous for consumers and as for high-end servers...I have already presented an alternative in the 2, 4 and 8-way systems. That obviously would be more cost efficient than something that is very difficult to fabricate and in the end less processing power. I imagine fabricating 8 chips with a smaller cache is *a lot* easier than fabricating one with 8 MB cache. Especially when our previous chips have had nothing above 512, the jump to 8 MB would be ridiculous both on our end (for product verification/fabrication) and for motherboard manufacturers and so on. It probably would require an entire revamping of the infrastructure....which is ultimately stupid.

The p4EE can go dual? I didn't know that. I mean I know all it is is a Xeon (which Intel denies ha), so I guess it could since the Xeon can go Dual. At the same time, benchmarks clearly show the Opteron mauling the P4EE...oh and the P4EE doesn't have 64-bit capability. You have to *laugh* Itanium for that...haha.
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby dj-ohki » Fri Nov 14, 2003 3:36 pm

dwchang wrote:
dj-ohki wrote:then the over 1 meg 'effective' cache listing on their site is pretty much a load of crap. or a really stretched spin.


My guess is that it's a typo or someone being lazy (probably copied and pasted the FX chart). The Athlon 64 has 256k L2 and 128k (64/64) of L1. I know that. Oh and the FX *does* have over 1 MB of effective cache since 1 MB of L2 and 128k of L1 (= 1152 KB). That's the reason for my suspicion of a typo since the FX equates correctly.

the ath64 pic was up before the FX was announced. so dunno. we'll end this part cause it is most likely a typo.

dj-ohki wrote:true, but since ath64 is NUMA and not SMP, you wouldnt have to worry about cache coherency and all that rot. *shrug* oh well, there's always operterion.


What would cache coherency have to do with a single CPU? The Athlon 64 isn't a multi-processor die period (whether SMP or whatever). There wouldn't be any cache coherency problems since it is all internal.


im talking about NUMA in a machine. and by the design of the whole ath64 arch, being a numa system, it doesnt have to deal with cache coherency period when in a MP environment.

Also why would the HT link have anything to do with the cache coherency if it's not on an MP processor. Effectively for an Athlon 64, it's just a really fast bus on the northbridge. I imagine that's why there is only one HT link...since it's not necessary to have more. But don't quote me on that...I didn't design the thing..just making logical guesses.


^. see above. when the opterion is in a MP environment, the HT link is used for interprocessor communication. since there are 3 HT links, it supports up to 8 way glueless (no support needed from the host chipset). a ath64 with 2 HT links would be able to go dualie with no added support needed from the chipset. looks like the ath64 is gonna be a strictly uniprocessor solution unless something changes. this is surprising to me for the FX, cause of its prosumer nature. a lot of prosumers prefer MP systems, cause of all the added benefits of MP, which is not possible with the ath64 at this time, but is possible with the P4EE.

dj-ohki wrote:i thought we were having 2 distinct conversations, one about ath64/ath64fx (consumer/prosumer) and one about the opterion, which is amd's foray into the high end server market. 64 bit NUMA architecture with 8 way glueless MP, thats pretty high end.

and no, consumer level programs have no need for 8 meg l2s. it would be nice to be able to fit an entire filter chain + video frame in cache, but again, not needed.

anyway, im hoping to get a hold of a dualie sledgehammer in the next few months, once the price goes down. either that, or a dualie P4EE (quad dispatch engines is pretty sweet, and before you jump on that, i know it only helps poorly written code) all depends on how things sit in the future.


No, I thought we were only talking about Desktop :P. If you talk about servers well...I imagine an 8-way Opteron can maul a Power4 or anything with an 8 MB cache. 8-way system vs. 8 MB L2...that's an easy choice in terms of efficiency and performance ne?


but can it maul a 8 way power4. or a 4 way power4? dont know what the price/performance on that chip is ATM.

As you have already stated though, an 8 MB cache is ridiculous for consumers and as for high-end servers...I have already presented an alternative in the 2, 4 and 8-way systems. That obviously would be more cost efficient than something that is very difficult to fabricate and in the end less processing power. I imagine fabricating 8 chips with a smaller cache is *a lot* easier than fabricating one with 8 MB cache. Especially when our previous chips have had nothing above 512, the jump to 8 MB would be ridiculous both on our end (for product verification/fabrication) and for motherboard manufacturers and so on. It probably would require an entire revamping of the infrastructure....which is ultimately stupid.


define high end. for most servers, yea, 1meg is perfect. some could use 2 meg, but thats picking nits. there are some specialized cases where 8 meg would be VERY useful, but they are far between.


but yea, the infrastructure change required to implement that would be staggering, and thus pointless.

The p4EE can go dual? I didn't know that. I mean I know all it is is a Xeon (which Intel denies ha), so I guess it could since the Xeon can go Dual. At the same time, benchmarks clearly show the Opteron mauling the P4EE...oh and the P4EE doesn't have 64-bit capability. You have to *laugh* Itanium for that...haha.


true, whats the price points on both though?

and this whole 'OMG 64 bits! it makes everything better' attitude i keep seeing all over the internet irks me. since the ath64 is nice in that 32 bit code executes at the same speed as 64 bit code, the whole 64 bit part is pointless at this stage in time. how many people do you know work with apps that use over 4 gig of memory?
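for reference, the 4 gig figure is just what 32 address bits can reach. a tiny sketch of that arithmetic, with a made-up function name; the 40-bit case is only an example of a chip wiring up fewer than 64 address bits, not a spec quoted in this thread.

```python
def addressable_gib(address_bits):
    """Bytes reachable with the given number of address bits, in GiB."""
    return 2 ** address_bits / 2 ** 30


if __name__ == "__main__":
    print(addressable_gib(32))  # flat 32-bit address space
    print(addressable_gib(40))  # example of a wider, but not full 64-bit, bus
    print(addressable_gib(64))  # theoretical full 64-bit space
```

so the 64 bit part only starts to matter once a single app wants more than the 32-bit ceiling.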
dj-ohki
 
Joined: 17 Apr 2001

Postby Savia » Fri Nov 14, 2003 3:37 pm

:shock:

That's the last time I wander into Video Hardware threads.
"A creator needs only one enthusiast to justify him." - Man Ray
"Restrictions breed creativity." - Mark Rosewater

A Freudian slip is where you say one thing, but mean your mother.
Savia
Chocolate teapot
 
Joined: 02 Apr 2003
Location: Reading, UK

Postby dwchang » Fri Nov 14, 2003 6:46 pm

dj-ohki wrote:^. see above. when the opterion is in a MP environment, the HT link is used for interprocessor communication. since there are 3 HT links, it supports up to 8 way glueless (no support needed from the host chipset). a ath64 with 2 HT links would be able to go dualie with no added support needed from the chipset. looks like the ath64 is gonna be a strictly uniprocessor solution unless something changes. this is surprising to me for the FX, cause of its prosumer nature. a lot of prosumers prefer MP systems, cause of all the added benefits of MP, which is not possible with the ath64 at this time, but is possible with the P4EE.


No, I understand the 3 HT links thing (2^3 = 8) and all that, but again, the FX and 64 are 1 HT link which would still provide an MP solution (2 processor...2^1 = 2). At the same time, currently they are single processor as you have concluded. And yes, I do agree that MP has a lot of benefits for prosumer...I should know since I run a dual Athlon at home :).

I might be mistaken on the HT part (since I'm not *that* familiar with it), but even still....I imagine that a Motherboard/chipset could handle the MP part...although slower. It's quite possible chipsets wanted it that way so they could make $$$ *shrug*.

dj-ohki wrote:but can it maul a 8 way power4. or a 4 way power4? dont know what the price/performace on that chip is ATM.


From what I've heard, I'd say yes. The Power4 is pretty outdated (well not *that* badly)...that's the reason IBM has Power5. And like you said...price. Opterons (btw you keep saying Opterions...there's no i in it) are much cheaper. I hear you can get an 8-way for pretty darn cheap. Hell I think Phade is looking into a two/four-way for the .org. He messaged me about it :)

dj-ohki wrote:define high end. for most servers, yea, 1meg is perfect. some could use 2 meg, but thats picking nits. there are some specialized cases where 8 meg would be VERY useful, but they are far between.
but yea, the infrastructure change required to implement that would be staggering, and thus pointless.


I'd say high-end are major business. They have major loads to support. If it's any vote of confidence, some major banks and business have already bought 4-way or higher Opteron servers.

And yeah the infrastructure change alone would make it stupid...and again..yields (which are VERY important in the PC industry).


dj-ohki wrote:true, whats the price points on both though?


Well considering the P4EE isn't even out yet (even though it was paper launched)...it's a moot point. HOWEVER, I hear it's 990 or so. Again, it's just a P4 Xeon, so a bit overpriced for something that they've had for years.

dj-ohki wrote:and this whole 'OMG 64 bits! it makes everything better' attitude i keep seeing all over the internet irks me. since the ath64 is nice in that 32 bit code executes at the same speed as 64 bit code, the whole 64 bit part is pointless at this stage in time. how many people do you know work with apps that use over 4 gig of memory?


I agree to a degree. Sure 64-bit hasn't caught on yet (gotta recompile things, but it is starting..like I said in another thread...Windows XP 64 is already in beta and longhorn after that)...HOWEVER the big selling point is the future. Businesses can buy this system and have great 32-bit capability now and when *they* feel like it, they can make the migration at their own pace. The major selling point is customer centric...that they can choose to migrate when they feel like it.

It's quite an awesome deal...get two generations of processors and change when you feel like it. And it's also comparable to the prices of *just* 32-bit processors from Intel. Seems like a good deal to me :).

As for over 4 gigs of memory..I know plenty of people even on this board who would want over 4 gigs of memory. Scary ain't it? :)
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby dj-ohki » Sat Nov 15, 2003 3:46 pm

dwchang wrote:No, I understand the 3 HT links thing (2^3 = 8) and all that, but again, the FX and 64 are 1 HT link which would still provide an MP solution (2 processor...2^1 = 2). At the same time, currently they are single processor as you have concluded. And yes, I do agree that MP has a lot of benefits for prosumer...I should know since I run a dual Athlon at home :).

I might be mistaken on the HT part (since I'm not *that* familiar with it), but even still....I imagine that a Motherboard/chipset could handle the MP part...although slower. It's quite possible chipsets wanted it that way so they could make $$$ *shrug*.


this is the only part of the post im gonna reply to, cause i agree with the rest of it.

a single HT link cannot support a glueless MP solution. the single HT link is from core to chipset. you require 2 HT links to do 2 way MP. C1 <-> HT <-> C2 <-> chipset. with 3, you can do up to 8 way with a hop count of 3.

and after half an hour digging, cant find the diagram. basically you've got a grid, 2 x 4. col 2, row 3 chip has a HT that goes to the chipset, and the rest are connected to their row neighbor and column neighbor via a HT link, except the one connected to the chipset, which is missing a link to its row neighbor. or something like that. there's a diagram floating on the internet somewhere..

sure you can do MP with a single HT link, you're just gonna need a MP chipset, which none of the chipset makers are producing, nor have plans that i know to do so.
dj-ohki
 
Joined: 17 Apr 2001

Postby dwchang » Sun Nov 16, 2003 7:18 pm

dj-ohki wrote:
dwchang wrote:No, I understand the 3 HT links thing (2^3 = 8) and all that, but again, the FX and 64 are 1 HT link which would still provide an MP solution (2 processor...2^1 = 2). At the same time, currently they are single processor as you have concluded. And yes, I do agree that MP has a lot of benefits for prosumer...I should know since I run a dual Athlon at home :).

I might be mistaken on the HT part (since I'm not *that* familiar with it), but even still....I imagine that a Motherboard/chipset could handle the MP part...although slower. It's quite possible chipsets wanted it that way so they could make $$$ *shrug*.


this is the only part of the post im gonna reply to, cause i agree with the rest of it.

a single HT link cannot support a glueless MP solution. the single HT link is from core to chipset. you require 2 HT links to do 2 way MP. C1 <-> HT <-> C2 <-> chipset. with 3, you can do up to 8 way with a hop count of 3.

and after half an hour digging, cant find the diagram. basically you've got a grid, 2 x 4. col 2, row 3 chip has a HT that goes to the chipset, and the rest are connected to their row neighbor and column neighbor via a HT link, except the one connected to the chipset, which is missing a link to its row neighbor. or something like that. there's a diagram floating on the internet somewhere..

sure you can do MP with a single HT link, you're just gonna need a MP chipset, which none of the chipset makers are producing, nor have plans that i know to do so.


Yeah, but with 3 HT links, how could you achieve 8-way? I imagine you need at least 3 links since 2^3 = 8. Again, I'm not *that* familiar with HT, but if you're right, you're right.

In any case, I still stand by my statement that I imagine it will be on the chipset/motherboard end. I mean I'm sure they would *love* to be able to do that and charge more. It's also easier on our end *shrug*. And even though people haven't said anything doesn't mean they're not doing it. I imagine Tyan is doing something (they seem to do well with the MP line). OR you could just go buy an opteron and then you'll be fine (although a lot more expensive).
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby Quu » Mon Nov 17, 2003 10:50 am

i know this one ^_^

with three HT links you are forgetting the memory interface

with three HT links you can do basically infinite glueless SMP... imagine a ladder

two base Opteron CPUs are at the bottom... the southmost HT link on one goes to a chipset... and the southmost on the other also goes to a chipset (IE one a direct link to a HT aware SCSI controller ^_^)

then they connect to each other... and above them... like a ladder...

its something we investigate here at work for scalability
<table>
<tr><td></td><td></td><td>PCI-X</td><td></td><td>PCI-Exprs</td><td></td><td></td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td>Mem</td><td>-</td><td align="center">CPU</td><td align="center">----</td><td align="center">CPU</td><td>-</td><td>Mem</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td>Mem</td><td>-</td><td align="center">CPU</td><td align="center">----</td><td align="center">CPU</td><td>-</td><td>Mem</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td>Mem</td><td>-</td><td align="center">CPU</td><td align="center">----</td><td align="center">CPU</td><td>-</td><td>Mem</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td>Mem</td><td>-</td><td align="center">CPU</td><td align="center">----</td><td align="center">CPU</td><td>-</td><td>Mem</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td>Mem</td><td>-</td><td align="center">CPU</td><td align="center">----</td><td align="center">CPU</td><td>-</td><td>Mem</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td></td><td></td><td align="center">SCSI</td><td></td><td>Chipset</td><td>-</td><td>Video</td></tr>
<tr><td></td><td></td><td align="center">|</td><td></td><td align="center">|</td><td></td><td></td></tr>
<tr><td></td><td></td><td>Hard Drive</td><td></td><td>South Brdg</td><td></td><td></td></tr>
</table>
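the ladder can also be sketched in code to check two things: that no CPU needs more than three links to other CPUs, and how many hops the worst-case trip takes. this is only a model of the drawing above, assuming a plain two-column ladder where every CPU links to its neighbor across the rung and to the CPUs directly above and below; the function names are made up.

```python
from collections import deque


def build_ladder(rows):
    """Adjacency sets for a 2-column ladder of CPUs; nodes are (row, col)."""
    links = {(r, c): set() for r in range(rows) for c in range(2)}
    for r in range(rows):
        # rung: the two CPUs on the same row
        links[(r, 0)].add((r, 1))
        links[(r, 1)].add((r, 0))
        # rails: each CPU to the one above it in the same column
        if r + 1 < rows:
            for c in range(2):
                links[(r, c)].add((r + 1, c))
                links[(r + 1, c)].add((r, c))
    return links


def max_hops(links):
    """Worst-case shortest path between any two CPUs (breadth-first search)."""
    worst = 0
    for start in links:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in links[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    ladder = build_ladder(5)  # five rows, as in the drawing
    print(max(len(peers) for peers in ladder.values()))  # CPU links needed
    print(max_hops(ladder))
```

in this model the CPUs at the ends of the ladder only use two links for other CPUs, which leaves their third free for I/O, and the worst-case hop count grows by one for every row added.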
Lead me not to temptation, for I have deadlines
Quu
 
Joined: 26 Dec 2000
Location: Atlanta, GA

Postby dwchang » Mon Nov 17, 2003 2:50 pm

Whoa Quu! That drawing makes a lot of sense. Ok I figured *all* the processors integrated to each other instead of indirectly linked. I guess that would be *a lot* more difficult, but it'd kick ass ;).

Either way, I still imagine Tyan or some other company will come up with an MP solution for non-Opterons. I mean the market is obviously there. And let's not forget that MP did take awhile to catch on with Athlons and even PIII's.

Oh and we just announced a deal with Sun. w00t!
-Daniel
Newest Video: Through the Years and Far Away aka Sad Girl in Space
dwchang
Sad Boy on Site
 
Joined: 04 Mar 2002
Location: Madison, WI

Postby Quu » Mon Nov 17, 2003 4:01 pm

well... with the HyperTransport controllers being interconnected on the die... when an HT packet is bound for a CPU further along in the chain it simply gets passed on, without the local CPU needing to interfere... same with memory requests from a foreign cpu... it comes across the HT bus, and is processed by the memory controller... without the local cpu on that bus having to handle it... really makes it scary expandable....

if you wanted to make an Athlon 64 or FX multiprocessor then you could make a northbridge with two HT links... and it becomes the "middle point" for the two CPUs... doing pass through like normal...

the problem is why would they.... the Opteron 2XX series is specced and priced for the dual CPU workstation market.... its the 8XX thats for quads and higher...

I believe that the 2XX series only has two HT links enabled on the chip.... and the 1XX series opteron has only one enabled...

you can't put a 1XX opteron chip in a multi processor motherboard... at least i don't think...

i think
Lead me not to temptation, for I have deadlines
Quu
 
Joined: 26 Dec 2000
Location: Atlanta, GA
