Techno Page - By Harendra Alwis
Hyper-Threading
The last time you clicked on an icon on your desktop or your favourite
song in your Winamp queue, a lot of 'things' happened before the
application opened up or your song was played. With the mouse-click,
your operating system or your application sends out a series of
instructions in the form of a bunch of compiled code called a 'thread'
to the CPU. These tell the CPU what to do according to your input.
A register within the CPU called the Programme Counter (PC) tells
the CPU what to execute next by holding the memory address of the
next instruction in the current 'thread' of code, stepping through
it one instruction at a time.
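The idea of the OS dispatching separate 'threads' of work can be sketched in a few lines of Python. This is only a toy illustration of the concept, not the compiled machine code the article describes; the function names are invented for the example:

```python
import threading

# Each function below stands in for a 'thread' of instructions that
# the OS dispatches to the CPU; the scheduler decides when each runs.
def open_application():
    return "application opened"

def play_song():
    return "song played"

results = {}

def run(name, work):
    # Record what each thread of execution produced.
    results[name] = work()

t1 = threading.Thread(target=run, args=("app", open_application))
t2 = threading.Thread(target=run, args=("song", play_song))
t1.start()
t2.start()
t1.join()  # wait for both threads to finish
t2.join()
print(results)
```

On a single conventional CPU, these two threads merely take turns; the point of the technologies discussed below is to let them genuinely overlap.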
The process
is a touch more complex than that, but an inherent problem with
CPUs has been that they could work on only one 'thread'
of instructions at a time. It is often said that the human
mind is never used to its full potential, and this is very true for
most of its creations as well, the classic example in this instance
being the microprocessor. A thread in execution does not
utilize all the resources of the CPU at any given instant, so all
those unused resources are, in a sense, wasted.
Being able to
dispatch multiple execution threads to hardware is generally referred
to as multithreading. For a few years, giant chip maker Intel had
been working on Simultaneous Multi-Threading (SMT) technology, code-named
"Jackson", but it was not until fall last year that
their work was revealed. Later they gave it a more sensible
name, Hyper-Threading, and its implications are startling!
In the past,
multi-threading could only be done with multiple processors, but
this had many drawbacks. Until the release of the AMD 760MP chipset,
all x86 platforms with multi-processor support split the available
FSB bandwidth between all the CPUs, and there was considerable
overhead in managing the resources for the multiple processors,
apart from the fact that it was quite expensive to have many processors
in the first place! Even then, the operating system and the applications
had to be capable of supporting this kind of execution.
Instruction-level
parallelism (ILP) is where multiple instructions are executed
simultaneously because of a CPU's ability to fill its multiple
parallel execution units. For this to work, the instructions being
executed in parallel must use resources of the CPU that are independent
of one another; otherwise instruction-level parallelism is not
achieved. The fact is that there are very few instructions that
can be executed in this way, in parallel with one another, so
about 60%-70% of the CPU's resources still go to waste in
most CPUs anyway. If we had two of these CPUs in our system, then
both threads could execute simultaneously. This exploits what is
known as thread-level parallelism (TLP), but it is also a very costly
approach to improving performance.
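The dependence rule behind ILP can be seen in a toy Python example. The values and operations here are invented purely to show the difference between a dependent chain and independent operations:

```python
# Dependent chain: each step needs the previous result, so a CPU
# cannot overlap these instructions -- no instruction-level parallelism.
a = 2 + 3      # a = 5
b = a * 4      # must wait for a
c = b - 1      # must wait for b

# Independent operations: x, y and z share no inputs or outputs, so a
# superscalar CPU could, in principle, issue all three to separate
# execution units in the same clock cycle.
x = 2 + 3
y = 7 * 4
z = 9 - 1
```

Real programs look far more like the first sequence than the second, which is why so many execution units sit idle.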
Currently, the
way most CPU manufacturers improve performance within a CPU family
is by increasing clock speeds and cache sizes. If there were a way
for us to execute multiple threads at once, we could make more efficient
use of the CPU's resources. This is exactly what Intel's Hyper-Threading
technology does.
Hyper-Threading
is the marketing name applied to a technology that has been around
outside the x86 realm for a little while now: Simultaneous Multi-Threading
(SMT). The idea behind SMT is simple: the single physical CPU appears
to the OS as two logical processors, and the OS does not see any
difference between one SMT CPU and two regular CPUs. In both cases
the OS dispatches two threads to the "two" CPUs and the
hardware takes it from there.
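You can see this from the software side with Python's standard library: the OS simply reports a count of logical processors, with no indication of whether they are real chips or SMT siblings. (On a Hyper-Threading machine this count is double the number of physical CPUs.)

```python
import os

# os.cpu_count() reports logical processors as the OS sees them.
# It cannot tell a physical CPU apart from a Hyper-Threading
# 'logical' one -- which is exactly the point of SMT.
logical_cpus = os.cpu_count()
print(f"The OS can dispatch threads to {logical_cpus} logical CPUs")
```
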
In a Hyper-Threading
enabled CPU, each logical processor has its own set of registers
(including a separate Programme Counter), but in order to minimize
the complexity of the technology, Intel's Hyper-Threading does not
attempt to simultaneously fetch and decode instructions from
two threads. Instead, the CPU alternates the fetch and decode
stages between the two logical CPUs and only attempts to execute
'operations' from two threads simultaneously, thus making maximum
use of the execution units of the CPU.
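The alternating front end can be sketched as a simple round-robin in Python. This is a toy model of the scheme just described, not Intel's actual design, and the instruction names are made up:

```python
# Two logical CPUs, each with its own stream of instructions.
thread_a = ["load", "add", "store"]
thread_b = ["mul", "sub", "jump"]

# A toy SMT front end: fetch/decode slots alternate between the two
# logical CPUs, feeding one shared pool of execution units.
fetched = []
for op_a, op_b in zip(thread_a, thread_b):
    fetched.append(("cpu0", op_a))  # this slot goes to logical CPU 0
    fetched.append(("cpu1", op_b))  # the next slot to logical CPU 1

print(fetched)
```

Once decoded, operations from both streams sit in the same queue, so the execution units can pick whatever is ready, regardless of which logical CPU it came from.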
The technology
has not officially debuted on a CPU yet; however, those who have
experienced the new Xeon processors and used them on boards with
updated BIOSes seem to have been surprised by an interesting option
- to enable or disable Hyper-Threading. For now, Intel will be leaving
Hyper-Threading disabled by default; all that is necessary in
order to enable it is the presence of a BIOS option
to control it. But why would Intel want to leave this performance-enhancing
feature disabled?
Now we are getting
to the dark side of the story. If you were to enable Hyper-Threading
on a desktop PC, you may not see a performance increase but rather a
decrease of up to 10%! The unfortunate reality here
is that when the two logical CPUs try to execute similar operations
at the same time, there is additional overhead
in managing the shared resources and dealing with what happens when
you run out of one type of execution unit. At the end of the day,
you have twice as many instructions competing for the execution units!
The area where
performance gains are most likely today is said to be in
server applications, because of the varied nature of the operations
sent to the CPU. According to the reviews, transactional database
server applications have shown a 20-30% boost in performance just
by enabling Hyper-Threading.
Although some
people got extremely excited when Hyper-Threading was rumoured to
be on all current Pentium 4/Xeon processors, it was not the free
performance-for-all that they wished for. The technology has a long
way to go before we will be able to take full advantage of it,
especially on desktops. Hyper-Threading will be absent from the desktop market
for a while, but given proper developer support it can make its way
down from the server level to the desktop level.