System Buses

The components inside your computer talk to each other in various different ways. Most of the internal system components, including the processor, cache, memory, expansion cards and storage devices, talk to each other over one or more "buses".

A bus, in computer terms, is simply a channel over which information flows between two or more devices (technically, a bus with only two devices on it is considered by some a "port" instead of a bus). A bus normally has access points, or places into which a device can tap to become part of the bus, and devices on the bus can send to, and receive information from, other devices. The bus concept is rather common, both inside the PC and outside in the real world as well. In fact, your home telephone wiring is a bus: information flows through the wiring that goes through your house, and you can tap into the "bus" by installing a phone jack, plugging in the phone and picking it up. All the phones can share the "information" (voice) on the bus.

This whole section focuses specifically on the system I/O (input/output) buses, also called expansion buses. First the buses and their characteristics are discussed, and then the most common types of I/O buses found on the PC are described with details on their features.

 

System Bus Functions and Features

 

Bus Hierarchy

The PC has a hierarchy, in a way, of different buses. Most modern PCs have at least four buses. I consider them a hierarchy because each bus is to some extent further removed from the processor; each one connects to the level above it, integrating the various parts of the PC together. Each one is also generally slower than the one above it (for the pretty obvious reason that the processor is the fastest device in a modern PC):

The system chipset is the conductor that controls this orchestra of communication, and makes sure that every device in the system is talking properly to every other one.

Some newer PCs actually use an additional "bus" that is specifically designed for graphics communications only. The word "bus" is in quotes because it isn't actually a bus, it's a port: the Accelerated Graphics Port (AGP). The distinction between a bus and a port is that a bus is generally designed for multiple devices to share the medium, while a port is only for two devices.


Data and Address Buses

Every bus is composed of two distinct parts: the data bus and the address bus. The data bus is what most people refer to when talking about a bus; these are the lines that actually carry the data being transferred. The address bus is the set of lines that carry information about where in memory the data is to be transferred to or from.

In addition, there are a number of control lines that, well, control how the bus functions, and allow users of the bus to signal when data is available. These are sometimes referred to as the control bus, though often they are simply not mentioned.


Bus Width

A bus is a channel over which information flows. The wider the bus, the more information can flow over the channel, much as a wider highway can carry more cars than a narrow one. The original ISA bus on the IBM PC was 8 bits wide; the universal ISA bus used now is 16 bits. The other I/O buses (including VLB and PCI) are 32 bits wide. The memory and processor buses on Pentium and higher PCs are 64 bits wide.

The address bus width can be specified independently of the data bus width. The width of the address bus dictates how many different memory locations that bus can transfer information to or from.
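To make the relationship concrete: an address bus with n lines can select 2^n distinct locations. The short Python sketch below works this out for a few widths. The widths chosen (20, 24 and 32 bits) and the function name are illustrative assumptions, not figures taken from the text above.

```python
def addressable_locations(address_bits):
    """Number of distinct addresses that a bus with this many address lines can select."""
    return 2 ** address_bits


# Illustrative widths only (assumed for this example).
for bits in (20, 24, 32):
    locations = addressable_locations(bits)
    # If each location holds one byte, this is also the addressable memory in bytes.
    print(f"{bits}-bit address bus: {locations:,} locations "
          f"({locations // 2**20:,} MBytes if each location is one byte)")
```

Each extra address line doubles the number of locations, which is why the address bus width can matter independently of how wide the data bus is.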


Bus Speed

The speed of the bus reflects how many bits of information can be sent across each wire each second. This would be analogous to how fast the cars are driving on our highway. Most buses transmit one bit of data per line, per clock cycle, although newer high-performance buses like AGP may actually move two bits of data per clock cycle, doubling performance. Conversely, older buses like the ISA bus may take two clock cycles to move one bit, halving performance.


Bus Bandwidth

Bandwidth, also called throughput, refers to the total amount of data that can theoretically be transferred on the bus in a given unit of time. Using the highway analogy, if the bus width is the number of lanes, and the bus speed is how fast the cars are driving, then the bandwidth is the product of these two and reflects the amount of traffic that the channel can convey per second.

The table below shows the theoretical bandwidth of most of the common I/O buses on PCs today. Note the emphasis on the word "theoretical"; most buses can't actually transmit anywhere near these maximum numbers because of command overhead and other factors. This is especially true of older buses. For example, the theoretical bandwidth of the 8-bit ISA bus might be about 7.9 MBytes/sec, but in reality there are wait states inserted during I/O that drop this figure down dramatically.

Most of these buses can run at many different speeds; the speed listed is the one most commonly used for the bus type.

Bus                 Width (bits)    Bus Speed (MHz)    Bus Bandwidth (MBytes/sec)

8-bit ISA           8               8.3                7.9
16-bit ISA          16              8.3                15.9
EISA                32              8.3                31.8
VLB                 32              33                 127.2
PCI                 32              33                 127.2
64-bit PCI 2.1      64              66                 508.6
AGP                 32              66                 254.3
AGP (x2 mode)       32              66x2               508.6
AGP (x4 mode)       32              66x4               1,017.3

Note: You may be somewhat confused by the bandwidth numbers I have listed in the table above. For example, shouldn't the bandwidth of standard PCI be 32/8*33.3=133.3 MB/sec? This is how most people and even companies write it, but this is not technically correct, because of the old problem of different definitions of what "M" stands for. The "M" in "MHz" is 1,000,000 (10^6), but the "M" in "MBytes/second" is 1,048,576 (2^20). So the bandwidth of the PCI bus is more properly stated as 32/8*33.3*1,000,000/1,048,576=127.2 MBytes/second.
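The arithmetic in the note can be checked with a few lines of Python. This is only a sketch: the function and constant names are made up for the example, and it assumes that the rounded clock speeds in the table stand for exact thirds (8.33..., 33.33... and 66.66... MHz), since those values reproduce the table's bandwidth column.

```python
MHZ = 1_000_000     # the "M" in MHz: 10^6 cycles per second
MBYTE = 1_048_576   # the "M" in MBytes as used in this section: 2^20 bytes


def bus_bandwidth_mbytes(width_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical bandwidth in MBytes/sec (2^20 bytes per second)."""
    bytes_per_second = (width_bits / 8) * clock_mhz * MHZ * transfers_per_clock
    return bytes_per_second / MBYTE


# (name, width in bits, clock in MHz, transfers per clock cycle)
# Assumption: the clocks are exact thirds rather than the rounded table values.
BUSES = [
    ("8-bit ISA", 8, 25 / 3, 1),
    ("16-bit ISA", 16, 25 / 3, 1),
    ("EISA", 32, 25 / 3, 1),
    ("VLB", 32, 100 / 3, 1),
    ("PCI", 32, 100 / 3, 1),
    ("64-bit PCI 2.1", 64, 200 / 3, 1),
    ("AGP", 32, 200 / 3, 1),
    ("AGP (x2 mode)", 32, 200 / 3, 2),
    ("AGP (x4 mode)", 32, 200 / 3, 4),
]

for name, width, clock, transfers in BUSES:
    print(f"{name:16s} {bus_bandwidth_mbytes(width, clock, transfers):8.1f} MBytes/sec")
```

Swapping MBYTE for 1,000,000 gives the more commonly quoted figures, such as 133.3 for standard PCI.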

A few words on the last four entries. In theory, the PCI bus can be extended to 64 bits in width, and 66 MHz in speed. However (here it comes again) for compatibility reasons almost all PCI buses, and the devices that run on them, are rated for only 33 MHz at 32 bits. AGP is based upon this theoretical standard and does run at 66 MHz, but remains only 32 bits wide. AGP has additional modes, dubbed x2 and x4, that allow the port to perform data transfers two or four times per clock cycle, respectively, leading to an effective bus speed of 133 or 266 MHz.

Bus Interfacing

On a system that has multiple buses, circuitry must be provided by the chipset to connect the buses and allow devices on one to talk to devices on the other. This device is called a "bridge", the same name used to refer to a piece of networking hardware that connects two dissimilar networks. By far the most commonly found bridge is the PCI-ISA bridge, which is part of the system chipset on a Pentium or Pentium Pro PC. The PCI bus also has a bridge to the processor bus; you can see these devices under "System devices" in the Device Manager in Windows 95.


Bus Mastering

On the higher-bandwidth buses, a great deal of information is flowing through the channel every second. Normally, the processor is required to control the transfer of this information. In essence, the processor is a "middleman", and as with many similar cases in the real world, it is far more efficient to "cut out" the middleman and perform the transfer directly. This is done by having capable devices take control of the bus and do the work themselves; devices that can do this are called bus masters. In theory, the processor can do other work simultaneously; in practice there are several complicating factors. In order to do bus mastering properly, a facility to arbitrate between requests to "take over the bus" must exist; this is provided by the chipset. Bus mastering is also called "first party" DMA since the work is controlled by the device doing the transfer.

Currently most bus mastering in the PC world is done on the PCI bus; in addition, support has been added for IDE/ATA hard disk drives to do bus mastering on PCI under certain conditions.

 

The Local Bus Concept

The switch from character-based applications to graphics-based ones began in earnest at the start of the 90s, with the rapid growth in popularity of the Windows operating system. The increase in the amount of information that must be moved between the processor, memory, video and hard disks when using a graphical operating system compared to a text-based one is tremendous. A complete, standard screen of monochrome text is just 4,000 bytes of information (2,000 bytes for the characters, and 2,000 bytes for screen attributes). However, a standard 256-color Windows screen requires over 300,000 bytes, about 75 times as much! (Out of interest, the highest-end resolution and color depth generally used today, 1600x1200 at 16 million colors, requires 5.8 million bytes of information per screen!).
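The screen sizes quoted above can be reproduced with a little arithmetic. The Python sketch below assumes an 80x25 text screen with one character byte and one attribute byte per cell, a 640x480 graphics screen for the "standard 256-color" case, and 24 bits per pixel for 16 million colors. Those dimensions are assumptions chosen because they match the byte counts in the text, which does not state them explicitly.

```python
def text_screen_bytes(columns=80, rows=25, bytes_per_cell=2):
    """Bytes for a text screen: one character byte plus one attribute byte per cell."""
    return columns * rows * bytes_per_cell


def graphics_screen_bytes(width, height, bits_per_pixel):
    """Bytes for one full graphics frame at the given resolution and color depth."""
    return width * height * bits_per_pixel // 8


text = text_screen_bytes()                          # assumed 80x25 text mode
vga_256 = graphics_screen_bytes(640, 480, 8)        # assumed 640x480, 256 colors
high_end = graphics_screen_bytes(1600, 1200, 24)    # 1600x1200, 16 million colors

print(f"Text screen:          {text:>9,} bytes")
print(f"256-color screen:     {vga_256:>9,} bytes ({vga_256 / text:.1f}x the text screen)")
print(f"1600x1200, 24-bit:    {high_end:>9,} bytes")
```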

The transformation of the software world from text to graphics also meant much larger programs and more storage requirements. From an I/O standpoint, much more I/O bandwidth was needed to handle the additional data going to and from the video card and the increasingly larger and faster hard disks. By this time, Intel had also moved on to the 80486 processor that provided many times the performance of earlier CPUs. The ISA bus, still running at the same speed and bus width that it did on the IBM AT, was finally and totally outmatched by these increasing demands and became a major bottleneck to improving system performance. Increasing the speed of the processor accomplished little if it was always waiting for the slow system bus to transmit data.

The solution was to create a new, faster bus, that would augment the ISA bus and be used especially for high-bandwidth devices such as video cards. This new bus would be put on (or near) the processor's much faster memory bus, to let it run at or near the external speed of the processor, and to allow data to flow between these devices and the processor without having to go through the much slower ISA bus. By placing these devices "local" to the processor, the local bus was born.

The first local bus was the VESA local bus. The current local bus of choice on modern computers is the Peripheral Component Interconnect or PCI bus.

 

System Bus Types

This section looks at the various input/output buses used on PCs, focusing most of the attention on the ones most commonly found on a modern machine.

 

Older Bus Types

While most of the attention today is given to the current local bus standard, PCI, and the new AGP port that is likely to become the next standard interface for video, these have evolved from a series of older buses that you will still find in service today on older PCs. The oldest one, ISA, is in fact still used even on the newest PCs! This section takes a look at these older bus types in some detail.


Industry Standard Architecture (ISA) Bus

The most common bus in the PC world, ISA stands for Industry Standard Architecture, and unlike many uses of the word "standard", in this case it actually fits. The ISA bus is still a mainstay in even the newest computers, despite the fact that it is largely unchanged since it was expanded to 16 bits in 1984! The ISA bus eventually became a bottleneck to performance and was augmented with additional high-speed buses, but ISA persists because of the truly enormous base of existing peripherals using the standard. Also, there are still many devices for which the ISA's speed is more than sufficient, and will be for some time to come (standard modems being an example).

(As a side note, after 17 years it appears that ISA may finally be going the way of the dodo. Market leaders Intel and Microsoft want to move the industry away from the use of the ISA bus in new machines. My personal prediction is that they will succeed in this effort, but that it will take at least five years to do it fully. There are few standards in the PC world as pervasive as ISA, and the hundreds of millions of existing ISA cards will ensure that ISA sticks around for some time.)

The choices made in defining the main characteristics of the ISA bus--its width and speed--can be seen by looking at the processors with which it was paired on early machines. The original ISA bus on the IBM PC was 8 bits wide, reflecting the 8 bit data width of the Intel 8088 processor's system bus, and ran at 4.77 MHz, again, the speed of the first 8088s. In 1984 the IBM AT was introduced using the Intel 80286; at this time the bus was doubled to 16 bits (the 80286's data bus width) and increased to 8 MHz (the maximum speed of the original AT, which came in 6 MHz and 8 MHz versions).

Later, the AT processors of course got faster, and eventually data buses got wider, but by this time the desire for compatibility with existing devices led manufacturers to resist change to the standard, and it has remained pretty much identical since that time. The ISA bus provides reasonable throughput for low-bandwidth devices and virtually assures compatibility with almost every PC on the market.

Many expansion cards, even modern ones, are still only 8-bit cards (you can tell by looking at the edge connector on the card; 8-bit cards use only the first part of the ISA slot, while 16-bit cards use both parts). Generally, these are cards for which the lower performance of the ISA bus is not a concern. However, access to IRQs 9 through 15 is provided through wires in the 16-bit portion of the bus slots. This is why most modems, for example, cannot be set to the higher-number IRQs. IRQs cannot be shared among ISA devices.

 

Micro Channel Architecture (MCA) Bus

The MCA bus (also called the Micro Channel bus; MCA stands for "Micro Channel Architecture") was IBM's attempt to replace the ISA bus with something "bigger and better". When the 80386DX was introduced in the mid-80s with its 32-bit data bus, IBM decided (much like it did with the AT) to create a bus to match this width. MCA is 32 bits wide, and offers several significant improvements over ISA. (One of MCA's disadvantages was rather poor DMA controller circuitry.)

The MCA bus has some pretty impressive features considering that it was introduced in 1987, a full seven years before the PCI bus made similar features common on the PC. In some ways it was ahead of its time, because back then the ISA bus really wasn't a major performance limiting factor:

MCA had a great deal of potential. Unfortunately, IBM made two decisions that would doom MCA to utter failure in the marketplace. First, they made MCA incompatible with ISA; this means ISA cards will not work at all in an MCA system, one of the few categories of PCs for which this is true. The PC market is very sensitive to backwards-compatibility issues, as evidenced by the number of older standards that persist to this day (such as ISA!). Second, IBM decided to make the MCA bus proprietary. It in fact did this with ISA as well; however in 1981 IBM could afford to flex its muscles in this manner, while by this time the clone makers were starting to come into their own and weren't interested in bending to IBM's wishes.

These two factors, combined with the increased cost of MCA systems, led to the demise of the MCA bus. With the PS/2 now discontinued, MCA is dead on the PC platform, though it is still used by IBM on some of its RS/6000 UNIX servers. It is one of the classical examples in the field of computing of how non-technical issues often dominate over technical ones.

 

Extended Industry Standard Architecture (EISA) Bus

EISA stands for Extended Industry Standard Architecture. Unlike ISA, here the name is not indicative of reality, for the EISA bus never became widely used and cannot by any stretch of the imagination be considered an industry standard. EISA began as Compaq's answer to IBM's MCA bus, and followed a similar path of development--with very similar results.

When it developed EISA, Compaq avoided the two key mistakes that IBM had made with MCA. First, they made it compatible with the ISA bus. Second, they opened the design to all manufacturers instead of keeping it proprietary, by forming the non-profit EISA committee to manage the design of the standard. EISA was similar to MCA both in terms of technology and market acceptance: it had significant technical advantages over ISA, and it never caught on with the PC-buying public.

Some of the key features of the EISA bus:

EISA-based systems have today been mostly relegated to a specialty role; they are sometimes found in network fileservers. The EISA bus is virtually non-existent on desktop systems for several reasons. First, EISA-based systems tend to be much more expensive than other types of systems. Second, there are few EISA-based cards available. Finally, the performance of this bus is quite low compared to the popular local buses like the VESA Local Bus and PCI. EISA is not totally dead as a platform the way MCA is, but it is pretty close.


VESA Local Bus (VLB)

The first local bus to gain popularity, the VESA local bus (also called VL-Bus or VLB for short) was introduced in 1992. VESA stands for the Video Electronics Standards Association, a standards group that was formed in the late eighties to address video-related issues in personal computers. Indeed, the major reason for the development of VLB was to improve video performance in PCs.

The VLB is a 32-bit bus which is in a way a direct extension of the 486 processor/memory bus. A VLB slot is a 16-bit ISA slot with third and fourth slot connectors added on the end. The VLB normally runs at 33 MHz, although higher speeds are possible on some systems. Since it is an extension of the ISA bus, an ISA card can be used in a VLB slot, although it makes sense to use the regular ISA slots first and leave the (small number of) VLB slots open for VLB cards, which won't work in an ISA slot of course. Use of a VLB video card and I/O controller greatly increases system performance over an ISA-only system.

While VLB was extremely popular during the reign of the 486, with the introduction of the Pentium and its PCI local bus in 1994, wholesale abandonment of the VLB began in earnest. While Intel pushing PCI was one reason why this happened, there were also several key problems with the VLB implementation. First, the design was strongly based on the 486 processor, and adapting it to the Pentium caused a host of compatibility and other problems. Second, the bus itself was tricky electrically; for example, the number of cards that could be used on the bus was low (often only two or even one), and occasionally there could be timing problems on the bus when more than one card was used. Finally, the bus did not support bus mastering properly since there was no good arbitration scheme, and did not support Plug and Play.

Today VLB is obsolete for new systems; even the latest 486 motherboards use PCI, and all Pentiums and higher use PCI. However, these systems do still offer reasonable performance, and are now plentiful and very inexpensive--if you can still find them.

 

Peripheral Component Interconnect (PCI) Local Bus

Currently by far the most popular local I/O bus, the Peripheral Component Interconnect (PCI) bus was developed by Intel and introduced in 1993. It is geared specifically to fifth- and sixth-generation systems, although the latest generation 486 motherboards use PCI as well.

Like the VESA Local Bus, PCI is a 32-bit bus that normally runs at a maximum of 33 MHz. The key to PCI's advantages over its predecessor, the VESA local bus, lies in the chipset that controls it. The PCI bus is controlled by special circuitry in the chipset that is designed to handle it, where the VLB was basically just an extension of the 486 processor bus. PCI is not married to the 486 in this manner, and its chipset provides proper bus arbitration and control facilities, to enable PCI to do much more than VLB ever could. PCI is also used outside the PC platform, providing a degree of universality and allowing manufacturers to save on design costs.


PCI Bus Performance

The PCI bus provides superior performance to the VESA local bus; in fact, PCI is the highest performance general I/O bus currently used on PCs. This is due to several factors:

The speed of the PCI bus can be set synchronously or asynchronously, depending on the chipset and motherboard. In a synchronized setup (used by most PCs), the PCI bus runs at half the memory bus speed; since the memory bus is usually 50, 60 or 66 MHz, the PCI bus would run at 25, 30 or 33 MHz respectively. In an asynchronous setup the speed of the PCI bus can be set independently of the memory bus speed. This is normally controlled through jumpers on the motherboard, or BIOS settings. Overclocking the system bus on a PC that uses synchronous PCI will cause PCI peripherals to be overclocked as well, often leading to system stability problems.
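The synchronous case is just a fixed division of the memory bus clock, which is why overclocking the memory bus drags the PCI bus along with it. A minimal Python sketch follows; the function name is made up for the example, and the 75 MHz overclocked setting is an assumed value used only for illustration.

```python
def synchronous_pci_mhz(memory_bus_mhz, divider=2):
    """PCI clock when it is derived from the memory bus clock by a fixed divider."""
    return memory_bus_mhz / divider


# Standard memory bus speeds from the text above.
for memory_bus in (50, 60, 66):
    print(f"Memory bus {memory_bus} MHz -> PCI bus {synchronous_pci_mhz(memory_bus):g} MHz")

# Assumed overclocked setting: the PCI bus now runs above its rated 33 MHz.
overclocked = 75
print(f"Memory bus {overclocked} MHz -> PCI bus {synchronous_pci_mhz(overclocked):g} MHz")
```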

 

PCI Expansion Slots

The PCI bus offers more expansion slots than most VLB implementations, without the electrical problems that plagued the VESA bus. Most PCI systems support 3 or 4 PCI slots, with some going significantly higher than this.

Note: In some systems, not all of the slots are capable of bus mastering. This is now far less common than it was, but with early systems the motherboard manual should be checked to see if this is the case.

The PCI bus offers a great variety of expansion cards compared to VLB. The most commonly found cards are video cards (of course), SCSI host adapters, and high-speed networking cards. (Hard disk drives are also on the PCI bus but are normally connected directly to the motherboard on a PCI system). However, it should be noted that certain functions cannot be provided on the PCI bus. For example, serial and parallel ports must remain on the ISA bus. Fortunately, the ISA bus still has more than enough speed to handle these devices, even today.

 

PCI Internal Interrupts

The PCI bus uses its own internal interrupt system for dealing with requests from the cards on the bus. These interrupts are often called "#A", "#B", "#C" and "#D" to avoid confusion with the normal numbered system IRQs, though they are sometimes called "#1" through "#4" as well. These interrupt levels are not generally seen by the user except in the BIOS setup screen for PCI, where they can be used to control how PCI cards operate.

These interrupts, if needed by cards in the slots, are mapped to regular interrupts, normally IRQ9 through IRQ12. The PCI slots in most systems can be mapped to at most four regular IRQs. In systems that have more than four PCI slots, or that have four slots and a USB controller (which uses PCI), two or more of the PCI devices share an IRQ.

If you are using Windows 95 OEM SR2, you may see additional entries for your PCI devices under the Device Manager. Each device may have an additional entry entitled "IRQ Holder for PCI Steering". PCI steering is in fact a feature that is part of the Plug and Play portions of the system, and enables the IRQ used for PCI devices to be controlled by the operating system to avoid resource problems. Having this listed in addition to another device under the IRQ list does not mean you have a resource conflict.

 

PCI Bus Mastering

As discussed in the section on system bus functions and features, bus mastering is the capability of devices on the PCI bus (other than the system chipset, of course) to take control of the bus and perform transfers directly. The PCI bus is the first bus to popularize bus mastering; probably in part because for the first time there are operating systems and software that are really capable of taking advantage of it.

PCI supports full device bus mastering, and provides bus arbitration facilities through the system chipset. PCI's design allows bus mastering of multiple devices on the bus simultaneously, with the arbitration circuitry working to ensure that no device on the bus (including the processor!) locks out any other device. At the same time though, it allows any given device to use the full bus throughput if no other device needs to transfer anything. In a way, the PCI bus acts like a tiny "local area network" inside the computer, in which multiple devices can each talk to each other, sharing a communication channel that is managed by the chipset.
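To illustrate the two properties described above (no device is locked out, yet a lone device gets the whole bus), here is a toy arbiter in Python. It uses a simple round-robin scheme, which is an assumption made purely for illustration; the text does not say which arbitration algorithm real chipsets use, and the device names are hypothetical.

```python
def round_robin_grants(pending):
    """Return the order in which the bus is granted, one transfer per grant.

    pending maps a device name to the number of transfers it wants to make.
    Devices are visited in a fixed rotation; a device with nothing left to
    send is skipped, so it never holds up the others.
    """
    remaining = dict(pending)
    order = list(remaining)
    grants = []
    position = 0
    while any(remaining.values()):
        device = order[position % len(order)]
        if remaining[device] > 0:
            remaining[device] -= 1
            grants.append(device)
        position += 1
    return grants


# Three hypothetical bus masters competing for the bus.
print(round_robin_grants({"video": 3, "scsi": 1, "network": 2}))

# A single requester gets every grant, i.e. the full bus throughput.
print(round_robin_grants({"video": 3}))
```

With several requesters the grants interleave, so no device waits indefinitely; with one requester every grant goes to it.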

 

PCI IDE Bus Mastering

The PCI bus also allows you to set up compatible IDE/ATA hard disk drives to be bus masters. Under the correct conditions this can increase performance over the use of PIO modes, which are the default way that IDE/ATA hard disks transfer data to and from the system. When PCI bus mastering is used, IDE/ATA devices use DMA modes to transfer data instead of PIO; IDE/ATA DMA modes are described in detail here.

Since this capability was made available to newer machines, it has been one of the most talked about (and most misunderstood) functions of the modern PC. There is a lot of confusion amongst PC users about what PCI IDE bus mastering does and how it works. In particular, there are a lot of misconceptions about its performance advantages. In addition, there have been a lot of problems with compatibility in getting this new technology to work.

IDE bus mastering requires all of the following in order to function at all:

Getting this all set up can be a great deal of work. In particular, the following are common problems encountered when trying to set up bus mastering:

Assuming that you get bus mastering IDE to work, you will see improvement if you are using a true multi-tasking operating system, and you are running multiple applications that are disk-access-intensive. This would not generally include most regular Windows 95 users, for example. Bus mastering IDE will not help at all in the following situations:

Especially: IDE bus mastering will not really speed up Windows 95 in general. Windows 95 does not do "true" multitasking and in many cases the processor will be held up waiting for the transfer to complete even if bus mastering is employed. So even though the processor in theory is freed up to do other things, it doesn't really do other things. Also, most people multitask by switching between applications that are open, but rarely have anything running in two or more simultaneously.

For most people, IDE bus mastering is not worth the effort and problems, and I now do not bother with it on new installs of Windows 95. This may be somewhat controversial, but in my opinion it is very overrated as a potential system improvement, given how much effort it requires. You're better off working overtime for a few hours and buying another 16 MB of RAM. If you feel like trying it, contacting the company that made your motherboard for a driver set is a good place to start. You can also try Intel for a generic driver that may work on your Intel-chipset system. I'd recommend that you back up your hard disk first before trying any of these... refer to this section of the Troubleshooting Expert for more help resolving problems with these drivers if you have difficulties with them.

I am hopeful that in time, bus mastering over the IDE/ATA interface will be improved and these problems will be just a distant memory. With the creation of Ultra ATA and the DMA-33 high-speed transfer mode, it appears that the future lies in the use of PCI bus mastering with the IDE/ATA interface. There is just some work to do until this support is both universal and well-implemented.

 

PCI Plug and Play

The PCI bus is part of the Plug and Play standard developed by Intel, with cooperation from Microsoft and many other companies. PCI systems were the first to popularize the use of Plug and Play. The PCI chipset circuitry handles the identification of cards and works with the operating system and BIOS to automatically set resource allocations for compatible peripheral cards.

 

Accelerated Graphics Port (AGP)

The need for increased bandwidth between the main processor and the video subsystem originally led to the development of the local I/O bus on PCs, starting with the VESA local bus and eventually leading to the popular PCI bus. This trend continues, with the need for video bandwidth now starting to push up against the limits of even the PCI bus.

Much as was the case with the ISA bus before it, traffic on the PCI bus is starting to become heavy on high-end PCs, with video, hard disk and peripheral data all competing for the same I/O bandwidth. To combat the eventual saturation of the PCI bus with video information, a new interface has been pioneered by Intel, designed specifically for the video subsystem. It is called the Accelerated Graphics Port or AGP.

AGP was developed in response to the trend towards greater and greater performance requirements for video. As software evolves and computer use continues into previously unexplored areas such as 3D acceleration and full-motion video playback, both the processor and the video chipset need to process more and more information. The PCI bus is reaching its performance limits in these applications, especially with hard disks and other peripherals also in there fighting for the same bandwidth.

Another issue has been the increasing demands for video memory. As 3D computing becomes more mainstream, much larger amounts of memory become required, not just for the screen image but also for doing the 3D calculations. This traditionally has meant putting more memory on the video card for doing this work. There are two problems with this:

AGP gets around these problems by allowing the video processor to access the main system memory for doing its calculations. This is more efficient because this memory can be shared dynamically between the system processor and the video processor, depending on the needs of the system.

The idea behind AGP is simple: create a faster, dedicated interface between the video chipset and the system processor. The interface is only between these two devices; this has three major advantages: it makes it easier to implement the port, makes it easier to increase AGP in speed, and makes it possible to put enhancements into the design that are specific to video.

AGP is considered a port, and not a bus, because it only involves two devices (the processor and video card) and is not expandable. One of the great advantages of AGP is that it isolates the video subsystem from the rest of the PC so there isn't nearly as much contention over I/O bandwidth as there is with PCI. With the video card removed from the PCI bus, other PCI devices will also benefit from improved bandwidth.

AGP is a new technology and was just introduced to the market in the third quarter of 1997. The first support for this new technology will be from Intel's 440LX Pentium II chipset. More information on AGP will be forthcoming as it becomes more mainstream and is seen more in the general computing market. Interestingly, one of Intel's goals with AGP was supposed to be to make high-end video more affordable without requiring sophisticated 3D video cards. If this is the case, it really makes me wonder why they are only making AGP available for their high-end, very expensive Pentium II processor line. Originally, AGP was rumored to be a feature on the 430TX Pentium socket 7 chipset, but it did not materialize. Via and other companies are carrying the flag for future socket 7 chipset development now that Intel has dropped it, and several non-Intel AGP-capable chipsets will be entering the market in 1998.

 

AGP Interface

The AGP interface is in many ways still quite similar to PCI. The slot itself is similar physically in shape and size, but is offset further from the edge of the motherboard than PCI slots are. The AGP specification is in fact based on the PCI 2.1 specification, which includes a high-bandwidth 66 MHz speed that was never implemented on the PC. AGP motherboards have a single expansion card slot for the AGP video card, and usually one less PCI slot, and are otherwise quite similar to PCI motherboards.

 

AGP Bus Width, Speed and Bandwidth

The AGP bus is 32 bits wide, just the same as PCI is, but instead of running at half of the system (memory) bus speed the way PCI does, it runs at full bus speed. This means that on a standard Pentium II motherboard AGP runs at 66 MHz instead of the PCI bus's 33 MHz. This of course immediately doubles the bandwidth of the port; instead of PCI's limit of 127.2 MB/s, AGP in its lowest speed mode has a bandwidth of 254.3 MB/s. On top of that, of course, AGP has the benefit of not having to share bandwidth with other PCI devices.

In addition to doubling the speed of the bus, AGP has defined a 2X mode, which uses special signaling to allow twice as much data to be sent over the port at the same clock speed. What the hardware does is to send information on both the rising and falling edges of the clock signal. Each cycle, the clock signal transitions from "0", to "1" ("rising edge"), and back to "0" ("falling edge"). While PCI for example only transfers data on one of these transitions each cycle, AGP transfers data on both. The result is that the performance doubles again, to 508.6 MB/s theoretical bandwidth. There is also a plan to implement a 4X mode, which will perform four transfers per clock cycle: a whopping 1,017 MB/s of bandwidth!
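The arithmetic behind these figures can be sketched in a few lines of Python. This is only a rough illustration. The function and constant names are my own. The "33 MHz" and "66 MHz" clocks are assumed to be the nominal 33.33 and 66.67 MHz, and "MB" is assumed to mean 2^20 bytes, since those assumptions are what reproduce the numbers quoted above.

```python
# Assumption: "MB" in the text means 2**20 bytes.
BYTES_PER_MB = 1024 * 1024

# Assumption: nominal clocks are 33.33 MHz and 66.67 MHz.
PCI_CLOCK_HZ = 100e6 / 3
AGP_CLOCK_HZ = 200e6 / 3


def peak_bandwidth_mb_s(width_bits, clock_hz, transfers_per_clock=1):
    """Theoretical peak bandwidth in MB/s for a bus or port.

    width_bits          -- data width of the bus in bits
    clock_hz            -- clock frequency in Hz
    transfers_per_clock -- 1 for PCI and AGP 1X, 2 for AGP 2X
                           (both clock edges), 4 for the planned 4X mode
    """
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / BYTES_PER_MB


pci = peak_bandwidth_mb_s(32, PCI_CLOCK_HZ)
agp_1x = peak_bandwidth_mb_s(32, AGP_CLOCK_HZ, transfers_per_clock=1)
agp_2x = peak_bandwidth_mb_s(32, AGP_CLOCK_HZ, transfers_per_clock=2)
agp_4x = peak_bandwidth_mb_s(32, AGP_CLOCK_HZ, transfers_per_clock=4)

for name, value in [("PCI", pci), ("AGP 1X", agp_1x),
                    ("AGP 2X", agp_2x), ("AGP 4X", agp_4x)]:
    print(f"{name:7s} {value:8.1f} MB/s")
```

Each step up simply doubles the previous figure: once for the faster clock, once for using both clock edges, and once more for the planned 4X mode. These are theoretical peaks only, not what a real system would sustain.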

This is certainly very exciting, but we must temper this excitement somewhat (and not just because AGP is new and we don't have much that is practical to evaluate yet). It's great fun to talk about 1 GB/s bandwidth for the video card, but there's only one problem: this is more than the bandwidth of the entire system bus of a modern PC! If you recall, the data bus of a Pentium class or later PC is 64 bits wide and runs at 66 MHz. This gives a total of 508.6 MB/s bandwidth, so the 1 GB/s maximum isn't going to do much good until we get the data bus running much faster than 66 MHz. Future motherboard chipsets will take the system bus to 100 MHz, which will increase total memory bandwidth to 763 MB/s, a definite step in the right direction, but still not enough to make 4X transfers feasible.
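To make the comparison concrete, here is a small sketch that sets each AGP mode against the system (memory) bus at 66 MHz and at 100 MHz. As before, this is just an illustration under stated assumptions: the names are hypothetical, "66 MHz" is taken as the nominal 66.67 MHz, and "MB" is taken as 2^20 bytes.

```python
# Assumption: "MB" in the text means 2**20 bytes.
BYTES_PER_MB = 1024 * 1024


def peak_mb_s(width_bits, clock_hz, transfers_per_clock=1):
    """Theoretical peak bandwidth in MB/s."""
    return (width_bits // 8) * clock_hz * transfers_per_clock / BYTES_PER_MB


AGP_CLOCK_HZ = 200e6 / 3  # assumption: nominal 66.67 MHz

# 64-bit system data bus at today's 66 MHz and at the future 100 MHz.
system_bus = {
    "66 MHz": peak_mb_s(64, 200e6 / 3),
    "100 MHz": peak_mb_s(64, 100e6),
}

# 32-bit AGP port in each transfer mode.
agp = {f"{n}X": peak_mb_s(32, AGP_CLOCK_HZ, n) for n in (1, 2, 4)}

# Fraction of the whole system bus that each AGP mode could consume.
share = {
    (mode, bus): agp[mode] / system_bus[bus]
    for mode in agp
    for bus in system_bus
}

for (mode, bus), fraction in share.items():
    note = "exceeds the system bus" if fraction > 1 else "fits"
    print(f"AGP {mode} on a {bus} system bus: {fraction:.0%} ({note})")
```

Under these assumptions, 2X mode by itself equals the entire bandwidth of a 66 MHz system bus, and 4X mode would still ask for about a third more than a 100 MHz system bus can deliver.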

Also worth remembering is that the CPU needs access to the system memory too, not just the video subsystem. If all 508.6 MB/s of system bandwidth is taken up by video over AGP, what is the processor going to do? Here again, going to 100 MHz system speed will help immensely. In practical terms, the jury is still out on AGP and will be for a while, though there can be no denying its tremendous promise.

 

AGP Video Pipelining

One performance enhancing benefit of AGP is its ability to pipeline requests for data. Pipelining was first used by modern processors as a way to improve performance by letting the sequential parts of tasks overlap. With AGP, the video chipset can use a similar technique when requesting information from memory, which improves performance.
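The general benefit of pipelining can be shown with a toy timing model. This is not AGP's actual protocol timing. The function names and the cycle counts below are made-up values chosen only to illustrate how overlapping requests hides latency.

```python
def non_pipelined_cycles(n_requests, latency, transfer):
    """Each request waits for the previous one to finish completely:
    pay the full latency plus the transfer time for every request."""
    return n_requests * (latency + transfer)


def pipelined_cycles(n_requests, latency, transfer):
    """Idealized pipelining: new requests are issued while earlier ones
    are still in flight, so the latency is paid only once up front and
    the data transfers then follow back to back."""
    return latency + n_requests * transfer


# Hypothetical numbers, for illustration only.
N_REQUESTS = 8   # memory requests issued by the video chipset
LATENCY = 10     # cycles from issuing a request to the first data
TRANSFER = 4     # cycles to move the data for one request

slow = non_pipelined_cycles(N_REQUESTS, LATENCY, TRANSFER)
fast = pipelined_cycles(N_REQUESTS, LATENCY, TRANSFER)

print(f"non-pipelined: {slow} cycles")
print(f"pipelined:     {fast} cycles")
print(f"speedup:       {slow / fast:.2f}x")
```

The longer the latency is relative to the transfer time, the more the overlap helps in this simplified model.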


AGP Video System Memory Access

One of the key features of AGP is the ability to share the main system memory with the video chipset. The reason that this has been incorporated into the design is to allow the video subsystem to have access to larger amounts of memory for 3D and other processing, without requiring that large quantities of special video memory be put on the video card for this purpose. Right now the memory on the video card is shared between the frame buffer and the other uses that the video card has for memory. Since the frame buffer requires high-performance memory technologies, most cards use this better technology (such as VRAM) for all of the memory on the video card, which is usually overkill for the parts other than the frame buffer.

Note that AGP is not the same as the ill-fated unified memory architecture (UMA). Under UMA, all of the video card's memory, including the frame buffer, is taken from main system memory. Under AGP, the frame buffer remains on the video card, where it belongs. The frame buffer is the most important part of the video memory and it requires the highest performance, so it makes sense to leave it on the video card so special video-specific technologies like VRAM can be used.

What AGP does is to allow the video processor to access the system memory for other tasks that require memory, such as texturing and other 3D operations. The theory is that this memory isn't as crucial as the frame buffer, and designing it this way means that higher-end video cards can be made less expensively, by saving on the cost of dedicated video card memory that is only being used for these operations.

 

AGP Requirements

There are a number of different requirements in order to allow a system to take advantage of AGP:

Operating system support is going to be especially problematic. I am not really sure how this will end up being handled in the early days of AGP, but I suspect that it may not be until Microsoft provides AGP support in the upcoming Windows 98 that we will see AGP's popularity really start to pick up.