Techno Page - By Harendra Alwis
Hyper-Threading
The last time you clicked on an icon on your desktop or your favourite
song in your Winamp queue, a lot of 'things' happened before the
application opened up or your song was played. With the mouse-click,
your operating system or your application sends out a series of
instructions in the form of a bunch of compiled code called a 'thread'
to the CPU. These tell the CPU what to do according to your input.
A register within the CPU called the Programme Counter (PC) tells
the CPU what to execute next by holding the memory address of the
next instruction in the current 'thread' of code, stepping through
it one instruction at a time.
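The idea of the OS dispatching separate 'threads' of work can be sketched in a few lines of Python. This is only a toy illustration of the concept, not the compiled machine code the article describes; the function names are invented for the example:

```python
import threading

# Each function below stands in for a 'thread' of instructions that
# the OS dispatches to the CPU; the scheduler decides when each runs.
def open_application():
    return "application opened"

def play_song():
    return "song played"

results = {}

def run(name, work):
    # Record what each thread of execution produced.
    results[name] = work()

t1 = threading.Thread(target=run, args=("app", open_application))
t2 = threading.Thread(target=run, args=("song", play_song))
t1.start()
t2.start()
t1.join()  # wait for both threads to finish
t2.join()
print(results)
```

On a single conventional CPU, these two threads merely take turns; the point of the technologies discussed below is to let them genuinely overlap.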
The process
is a touch more complex than that, but an inherent problem with
CPUs has been that they could work on only one 'thread'
of instructions at a time. It is often said that the human
mind is never used to its full potential, and this is very true for
most of its creations as well, the classic example in this instance
being the microprocessor. A thread in execution does not
utilize all the resources of the CPU at any given instant, so all
those unused resources are, in a sense, wasted.
Being able to
dispatch multiple execution threads to hardware is generally referred
to as multithreading. For a few years, giant chip maker Intel had
been working on Simultaneous Multi-Threading (SMT) technology, code-named
"Jackson", but it was not until fall last year that
their work was revealed. Later they gave it a more sensible
name, Hyper-Threading, and its implications are startling!
In the past,
multi-threading could only be done with multiple processors, but
this had many drawbacks. Until the release of the AMD 760MP chipset,
all x86 platforms with multi-processor support split the available
FSB bandwidth between all the CPUs, and there was considerable
overhead in managing the resources for the multiple processors,
apart from the fact that it was quite expensive to have many processors
in the first place! Even then, the operating system and the applications
had to be capable of supporting this kind of execution.
Instruction-level
parallelism (ILP) is where multiple instructions are executed
simultaneously because of a CPU's ability to fill its multiple
parallel execution units. For this to work, the instructions being
executed in parallel must use resources of the CPU that are independent
of one another; otherwise instruction-level parallelism is not
achieved. The fact is that there are very few instructions that
can be executed in this way, in parallel with one another, so
about 60%-70% of the CPU's resources still go to waste in
most CPUs anyway. If we had two of these CPUs in our system, then
both threads could execute simultaneously. This exploits what is
known as thread-level parallelism (TLP), but it is also a very costly
approach to improving performance.
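The dependence rule behind ILP can be seen in a toy Python example. The values and operations here are invented purely to show the difference between a dependent chain and independent operations:

```python
# Dependent chain: each step needs the previous result, so a CPU
# cannot overlap these instructions -- no instruction-level parallelism.
a = 2 + 3      # a = 5
b = a * 4      # must wait for a
c = b - 1      # must wait for b

# Independent operations: x, y and z share no inputs or outputs, so a
# superscalar CPU could, in principle, issue all three to separate
# execution units in the same clock cycle.
x = 2 + 3
y = 7 * 4
z = 9 - 1
```

Real programs look far more like the first sequence than the second, which is why so many execution units sit idle.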
Currently, the
way most CPU manufacturers improve performance within a CPU family
is by increasing clock speeds and cache sizes. If there were a way
for us to execute multiple threads at once, we could make more efficient
use of the CPU's resources. This is exactly what Intel's Hyper-Threading
technology does.
Hyper-Threading
is the marketing name applied to a technology that has been around
outside the x86 realm for a little while now: Simultaneous Multi-Threading
(SMT). The idea behind SMT is simple: the single physical CPU appears
to the OS as two logical processors, and the OS does not see any
difference between one SMT CPU and two regular CPUs. In both cases
the OS dispatches two threads to the "two" CPUs and the
hardware takes it from there.
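You can see this from the software side with Python's standard library: the OS simply reports a count of logical processors, with no indication of whether they are real chips or SMT siblings. (On a Hyper-Threading machine this count is double the number of physical CPUs.)

```python
import os

# os.cpu_count() reports logical processors as the OS sees them.
# It cannot tell a physical CPU apart from a Hyper-Threading
# 'logical' one -- which is exactly the point of SMT.
logical_cpus = os.cpu_count()
print(f"The OS can dispatch threads to {logical_cpus} logical CPUs")
```
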
In a Hyper-Threading
enabled CPU, each logical processor has its own set of registers
(including a separate Programme Counter), but in order to minimize
the complexity of the technology, Intel's Hyper-Threading does not
attempt to simultaneously fetch and decode instructions from
two threads. Instead, the CPU alternates the fetch and decode
stages between the two logical CPUs and only attempts to execute
'operations' from two threads simultaneously, thus making maximum
use of the execution units of the CPU.
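The alternating front end can be sketched as a simple round-robin in Python. This is a toy model of the scheme just described, not Intel's actual design, and the instruction names are made up:

```python
# Two logical CPUs, each with its own stream of instructions.
thread_a = ["load", "add", "store"]
thread_b = ["mul", "sub", "jump"]

# A toy SMT front end: fetch/decode slots alternate between the two
# logical CPUs, feeding one shared pool of execution units.
fetched = []
for op_a, op_b in zip(thread_a, thread_b):
    fetched.append(("cpu0", op_a))  # this slot goes to logical CPU 0
    fetched.append(("cpu1", op_b))  # the next slot to logical CPU 1

print(fetched)
```

Once decoded, operations from both streams sit in the same queue, so the execution units can pick whatever is ready, regardless of which logical CPU it came from.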
The technology
has not officially debuted on a CPU yet; however, those who have
experienced the new Xeon processors and used them on boards with
updated BIOSes seem to have been surprised by an interesting option
- to enable or disable Hyper-Threading. For now, Intel will be leaving
Hyper-Threading disabled by default; all that is necessary in
order to enable it is the presence of a BIOS option
to control it. But why would Intel want to leave this performance-enhancing
feature disabled?
Now we are getting
to the dark side of the story. If you were to enable Hyper-Threading
on a desktop PC, you may not see a performance increase but rather a
decrease of up to 10%! The unfortunate reality here
is that when the two logical CPUs try to execute similar operations
at the same time, there is additional overhead
in managing the shared resources and dealing with what happens when
you run out of one type of execution unit. At the end of the day,
you have twice as many instructions competing for the execution units!
The area where
performance gains are most likely today is said to be in
server applications, because of the varied nature of the operations
sent to the CPU. According to the reviews, transactional database
server applications have shown a 20-30% boost in performance just
by enabling Hyper-Threading.
Although some
people got extremely excited when Hyper-Threading was rumoured to
be on all current Pentium 4/Xeon processors, it was not the free
performance-for-all that they wished for. The technology has a long
way to go before we will be able to take full advantage of it,
especially on desktops. Hyper-Threading will be absent from the desktop market
for a while, but given proper developer support it can make its way
down from the server level to the desktop level.