Posted by Phil Weldon on April 23, 2007, 5:22 pm
 

What does PC1066 mean, and what advantage does a 1:1 ratio confer?

I consider the question definitively settled.  The CPU clock : memory clock
ratio is identical to the FSB : memory bus ratio.  The nomenclature is
murky, but DDR2 PC1066 memory is qualified to run with a memory bus of 1066
MHz.  The CPU clock : memory clock ratio as it appears on nVidia 680i SLI
motherboards represents the FSB : memory bus ratio.  With the FSB at 1066
MHz, DDR2 PC1066 memory is required to operate at a 1:1 FSB : memory bus
ratio (unless lower-rated memory is overclocked).

For this system
    E4300/ EVGA 680i / Patriot SLI-Ready DDR2 PC1066
    FSB at 1200 MHz for CPU speed of 2.7 GHz

Three memory benchmarks in SiSoft Sandra 2007 ver 2007.4.11.22
(Memory Latency, Cache and Memory, Memory Bandwidth)
 with memory timing held constant for all memory bus speeds

    (Memory timing settings in EVGA 680i BIOS)
        SLI Memory [Disabled]
        tCL:     5
        tRCD:  5
        tRP:     5
        tRAS:  16
        CMD:  2T
        tRRD:   3
        tRC:     21
        tWR:    9
        tREF:   7.8 ns

Gave the following results with memory bus speeds of 400 MHz, 600 MHz, 800
MHz, and 1200 MHz:
__________
Memory bus = 400 MHz

    **Memory Latency**
    Random 16 MByte 126.6 ns / 341.7 clocks
    Linear 16 MByte 15.4 ns / 41.6 clocks

    **Cache and Memory**
    Combined Index 12548
    Speed factor 104.6

    **Memory Bandwidth**
    Int. Buffered 4401
    Float Buffered 4368
    Est. Efficiency 46%
____________
    Memory bus = 600 MHz

    **Memory Latency**
    Random 16 MByte 91.8 ns / 247.8 clocks
    Linear 16 MByte 11.7 ns / 31.6 clocks

    **Cache and Memory**
    Combined Index 15075
    Speed factor 68.7

    **Memory Bandwidth**
    Int. Buffered 5567
    Float Buffered 5091
    Est. Efficiency 58%
__________
    Memory bus = 800 MHz

    **Memory Latency**
    Random 16 MByte 81.9 ns / 221.2 clocks
    Linear 16 MByte 11.1 ns / 29.9 clocks

    **Cache and Memory**
    Combined Index 166384
    Speed factor 53.4

    **Memory Bandwidth**
    Int. Buffered 6042
    Float Buffered 6021
    Est. Efficiency 63%
__________
    Memory bus = 1200 MHz

    **Memory Latency**
    Random 16 MByte 63.5 ns / 171.3 clocks
    Linear 16 MByte 9.3 ns / 25.4 clocks

    **Cache and Memory**
    Combined Index 19725
    Speed factor 36.9

    **Memory Bandwidth**
    Int. Buffered 6438
    Float Buffered 6442
    Est. Efficiency 67%
__________
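
As a rough cross-check on the figures above, here is a minimal Python sketch (not part of the original post) that tabulates the four runs against the 400 MHz baseline. The names are illustrative. It assumes the bandwidth figures are in MB/s (the post does not state units) and that the FSB peak is 1200 million transfers per second times 8 bytes.

```python
# Figures copied from the Sandra results above.
# memory bus (MHz) -> (random latency ns, random latency clocks,
#                      Int. Buffered bandwidth, reported Est. Efficiency %)
runs = {
    400:  (126.6, 341.7, 4401, 46),
    600:  (91.8, 247.8, 5567, 58),
    800:  (81.9, 221.2, 6042, 63),
    1200: (63.5, 171.3, 6438, 67),
}

# Assumption: FSB peak = 1200 million transfers/s * 8 bytes per transfer,
# with the bandwidth figures taken to be MB/s.
FSB_PEAK_MB_S = 1200 * 8


def summarize(runs, baseline=400):
    """Compare each run with the baseline memory bus speed."""
    base_ns, _, base_bw, _ = runs[baseline]
    rows = []
    for bus, (ns, clocks, bw, eff) in sorted(runs.items()):
        rows.append({
            "bus_mhz": bus,
            # clocks per nanosecond is the CPU clock in GHz
            "implied_cpu_ghz": round(clocks / ns, 2),
            "latency_vs_base": round(ns / base_ns, 2),
            "bandwidth_vs_base": round(bw / base_bw, 2),
            "bw_over_fsb_peak_pct": round(100 * bw / FSB_PEAK_MB_S),
            "reported_eff_pct": eff,
        })
    return rows


for row in summarize(runs):
    print(row)
```

Under those assumptions, tripling the memory bus speed from 400 to 1200 roughly halves the random latency, while Int. Buffered bandwidth rises by a little under 50%. The reported efficiency figures also line up with bandwidth divided by 9600, which suggests (but does not prove) that the benchmark measures efficiency against the FSB peak.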

Hope this helps.

Phil Weldon



Posted by Paul on April 23, 2007, 8:00 pm
 

Phil Weldon wrote:

The processor FSB is 64 bits wide. If operating at FSB1066, data transfer
rate is a maximum of 1066 * 8 bytes = 8528MB/sec.

In a dual channel setup, you have DDR2-1066 (PC2-8500) on each channel.
As the number implies, that means each channel transfers at 8500MB/sec,
and two channels transfer at 17000MB/sec. That is twice the rate that
the FSB can handle.
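
The arithmetic above can be sketched in a few lines of Python. This is only an illustration with made-up names; it uses the nominal 1066 figure, whereas "PC2-8500" is the rounded module name, so the results come out slightly above 8500 and 17000.

```python
BYTES_PER_TRANSFER = 8  # a 64-bit bus moves 8 bytes per transfer


def peak_mb_per_s(mega_transfers_per_s, channels=1):
    """Theoretical peak rate in MB/s for a 64-bit bus."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels


fsb = peak_mb_per_s(1066)                        # FSB1066
one_channel = peak_mb_per_s(1066)                # one DDR2-1066 channel
dual_channel = peak_mb_per_s(1066, channels=2)   # two channels

print(fsb, one_channel, dual_channel, dual_channel / fsb)
```

These are peak figures only; what a benchmark actually measures will be lower.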

So, what of it ? The Intel architecture features an external memory
controller. The memory controller is located on the Northbridge
chip. In addition to the connection of the processor FSB and the
memory channels, there are also the PCI Express lanes for the video
card. This could be, for example, PCI Express x16, at 4000MB/sec
transmit and 4000MB/sec receive. So you could have the processor doing
a burst, and the video card doing a bidirectional burst (if such a
thing is possible), and that would more or less fill the memory bus.
The Northbridge also has the DMI interface (hub bus), which could be
another 4 PCI Express x1 lanes worth.

So, in all of that, is there something magic about the clocks on
the memory and FSB ?

Actually, due to the strap in the Northbridge, there is a bit of
unpredictability, about what will happen to performance as you
overclock. In fact, there is a difference in overclock results,
between "nominal BIOS/clockgen overclock" versus "overclock via BIOS".
And that is due to how the Northbridge strap is set up by the BIOS.
Since I like to back up these enthusiast concepts, with a trip to
the datasheet, I was disappointed to find no mention of any of the
details of any "Strap" in the Intel docs. Nor of any "latency setting"
in the Northbridge, that apparently the BIOS sets up.  But people
did do enough testing and presentation of their results, to show
there is an appreciable difference between the two overclock methods,
which lends credibility to the strap concept. Even if the proponent of
the strap theory is not able to explain it very well (i.e. in a way
that a hardware designer would understand).

So there are days of reading material ahead of you, if you wish
to learn the details of Core2 overclocking. You have to slog
through a lot of enthusiast chatter, to get nuggets of information.

In case you missed the point of the above two paragraphs, it is
this. You should *benchmark* your overclocking setup, and not
stare at the clocks. The memory and core clock on a Core2 Duo setup,
don't tell the whole story. In fact, you may find a counterintuitive
result, where a setup with a lower set of clock values, is giving
a higher benchmark like SuperPI. Thus, on Core2 Duo, you don't stop
and crack open a beer, after just cranking the clock. There is more
to it than that. And kudos to the guys who took the time to test
and figure it out. I doubt I would have bothered.

Anandtech did some testing here, and in these results, the biggest
"jump" might be at DDR2-533. I believe the top five results are
with a constant core clock, while the bottom three are different.

http://www.anandtech.com/memory/showdoc.aspx?i=2732&p=4

I would say, rather than "the question definitively settled", you
are now on a "journey of discovery".

Very little of this is explained in datasheets, which annoys me
greatly. I expected better of Intel. I'm not even sure there
is a nice tutorial anywhere, that sums up all the results
collected so far.

    Paul

Posted by Phil Weldon on April 23, 2007, 8:54 pm
 

| The processor FSB is 64 bits wide. If operating at FSB1066, data transfer
| rate is a maximum of 1066 * 8 bytes = 8528MB/sec.
|
| In a dual channel setup, you have DDR2-1066 (PC2-8500) on each channel.
| As the number implies, that means each channel transfers at 8500MB/sec,
| and two channels transfer at 17000MB/sec. That is twice the rate that
| the FSB can handle.
|
| So, what of it ?
.
.
| So, in all of that, is there something magic about the clocks on
| the memory and FSB ?

All of what?  You deleted almost the entire original post.

| So there are days of reading material ahead of you, if you wish
| to learn the details of Core2 overclocking. You have to slog
| through a lot of enthusiast chatter, to get nuggets of information.
.
.
I too am annoyed by the murky nomenclature.  My post is part of an ongoing
discussion in this newsgroup about the utility of DDR2 memory with ratings
above PC533.  The numbers I posted are an aid to understanding that using a
1:1 FSB : memory bus ratio when the FSB speed is 1066 MHz requires DDR2
memory that will operate at PC1066 levels.  Nothing more.

 You are welcome to YOUR voyage of discovery, but I see it as a quest
separate from the FSB : memory bus ratio.  It also does not aid a discussion
to delete almost the entire original post when you reply.

Phil Weldon





Posted by Paul on April 23, 2007, 11:19 pm
 

Phil Weldon wrote:

I thought your post had something to do with synchronous transfer, as if there
was something magic about the 1:1 ratio. The bandwidth ratio is
2:1 between dual channel memory and the processor, for your stated case
of DDR2-1066 and FSB1066.

Clock, strictly speaking, is a physical signal connected to a chip. On the
processor, the input clock is 266MHz. The FSB is quad pumped. It means there
are four data phases per clock cycle. As far as I know, there isn't an actual
clock passed between the processor and northbridge at 1066MHz.  So there are
1066 million transfers per second of 8 bytes per transfer, for 8528MB/sec
on the FSB. But the clock fed to both the processor and the northbridge, is
at the lower rate of 266MHz.

According to the P965 datasheet, the Northbridge puts out a 266, 333, or 400MHz
clock to each DIMM. (Corresponding to DDR2-533, DDR2-667, and DDR2-800.) If
we extrapolate to the overclocked condition, that means the memory clock
is 533MHz when the memory is DDR2-1066.

So the ratio between memory clock and processor clock is 2:1, and the
reason for that, is the difference between quad pumped on the FSB
versus double data rate on the memory interface.

So, by all means, divide 1066 by 1066. The units in each case are
"million transfers per second" and not megahertz, as megahertz
applies to clocks. FSB1066 and DDR2-1066 apply to the data busses
in their respective cases and their transfer rates.

1) The clock ratio is 2:1
2) The bandwidth ratio is 2:1 (assuming dual channel as the norm)
3) The "bus transfer rate" ratio is 1:1
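
The three ratios listed above can be checked with a short Python sketch. It is only an illustration: the exact clock values (266.67 and 533.33 MHz, quoted as 266 and 533) are my assumption about what the rounded figures stand for, and the variable names are made up.

```python
from fractions import Fraction

# Assumed exact clocks behind the rounded figures in the text.
FSB_CLOCK_MHZ = Fraction(800, 3)     # ~266.67 MHz, quoted as 266 MHz
MEM_CLOCK_MHZ = Fraction(1600, 3)    # ~533.33 MHz, quoted as 533 MHz

BYTES_PER_TRANSFER = 8   # 64-bit data bus
MEMORY_CHANNELS = 2      # dual channel assumed as the norm

# Million transfers per second on each bus.
fsb_rate = FSB_CLOCK_MHZ * 4   # quad pumped: four data phases per clock
mem_rate = MEM_CLOCK_MHZ * 2   # double data rate: two transfers per clock

clock_ratio = MEM_CLOCK_MHZ / FSB_CLOCK_MHZ
transfer_ratio = mem_rate / fsb_rate
bandwidth_ratio = (mem_rate * BYTES_PER_TRANSFER * MEMORY_CHANNELS) / (
    fsb_rate * BYTES_PER_TRANSFER
)

print(clock_ratio, bandwidth_ratio, transfer_ratio)
```

Each ratio here is memory side over FSB side. Fractions are used so that the rounded 266 and 533 figures do not introduce an apparent mismatch.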

I snipped the rest of your post, because I was answering the 1:1
conclusion for clocks, which is not correct.

    Paul

Posted by Michel R. Carleer on April 23, 2007, 10:19 pm
 

I would like to emphasize that using dual channel memory does not mean that
you increase the memory bandwidth by a factor of 2, because it does not mean
that you double the width from 64 bits to 128. It means that you read one
piece of data (64 bits) from one bank, and the next piece of data from the
other bank, in order to overcome, at least partly, the latency problem.

Michka


