
The history of Intel CPUs

By Silvio CASSARO. Published on 17/07/2017 at 18:56:02.
Approved by the editorial review committee


Intel Corporation is the largest multinational manufacturer of semiconductor devices (microprocessors, memory devices, telecommunication support circuits and computer components), based in Santa Clara, California. It was founded in 1968 by Gordon Moore and Robert Noyce.

Here is the story of its CPUs.

Intel 4004

The first microprocessor sold by Intel was the 4-bit 4004 in 1971. It was designed to work with three other chips: the 4001 ROM, the 4002 RAM, and the 4003 shift register. The 4004 performed the calculations, while the others were essential to operating the processor.

The 4004 was mainly used in calculators and similar devices and was not intended to end up inside computers. Its maximum frequency was 740 kHz. The 4004 was followed by a similar processor, the 4040, an improved variant with extended performance and additional instructions.

8008 and 8080

The 4004 allowed Intel to make a name for itself in the microprocessor industry. To capitalize on the situation, Intel introduced a new line of 8-bit processors. The 8008 came in 1972, followed by the 8080 in 1974 and the 8085 in 1975. Although the 8008 was the first 8-bit processor produced by Intel, it was not as significant as its predecessor or its successors.

It was faster than the 4004 thanks to its ability to process 8-bit data, but it had a rather conservative frequency of between 200 and 800 kHz, so its performance was not entirely convincing. The 8008 was produced with 10-micrometre transistors. The 8080 was more successful: it was essentially an 8008 shrunk to 6 micrometres, with new instructions added.

This allowed the company to more than double the frequencies: the best 8080 chips arrived in 1974 at 2 MHz. The 8080 was integrated into many devices, which led many software developers, such as the then-young Microsoft, to focus on Intel processors.


8086

The 8086 was the first x86 processor. This 16-bit chip could address 1 MB of memory using a 20-bit external bus. Its initial clock frequency was 4.77 MHz, fairly low considering that by the end of its career this processor reached 10 MHz.

Thanks to its 16-bit design, it could handle two 8-bit operations simultaneously. The 8086 introduced the first revision of the x86 ISA, still used by AMD and Intel today.
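As a quick sanity check on these address-bus figures (an illustrative sketch, not code from the article), the memory each bus width can address is simply two raised to the width in bits:

```python
# Addressable memory is 2^(address-bus width) bytes.
def addressable_mib(bus_width_bits: int) -> int:
    """Return the addressable memory in MiB for a given address-bus width."""
    return 2 ** bus_width_bits // 2 ** 20

# Bus widths cited in the text for each processor:
print(addressable_mib(20))  # 8086, 20-bit bus   -> 1 (1 MB)
print(addressable_mib(24))  # 80286, 24-bit bus  -> 16 (16 MB)
print(addressable_mib(32))  # 80386, 32-bit bus  -> 4096 (4 GB)
print(addressable_mib(36))  # PAE, 36-bit        -> 65536 (64 GB)
```

The same arithmetic explains the later 80286, 80386 and Pentium Pro PAE limits mentioned below.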

80186 and 80188

The 8086 was followed by several other processors based on the same 16-bit architecture. The first was the 80186. Intel moved various hardware components usually located on the motherboard into the CPU, such as the clock generator, the interrupt controller and the timer. By integrating these components, the 80186 proved several times faster than the 8086. Intel also increased the frequency to extract more performance. The 80188 was a cheaper variant with a halved data bus.


80286

Introduced in the same year as the 80186, the 80286 was three times faster than the 8086 at the same frequency. It could address 16 MB of memory through a 24-bit address bus. It was the first x86 processor with a memory management unit (MMU), which made virtual memory possible. Like the 8086, it had no floating-point unit (FPU) and relied on an x87 co-processor (the 80287). The maximum frequency reached 12.5 MHz.

iAPX 432

The iAPX 432 was Intel's first attempt to move away from x86. The company expected performance several times better than its x86 solutions, but the processor had no luck due to major architectural problems. The x86 processors were relatively complex, but the iAPX 432 took CISC to a whole new level of complexity.

Intel had to split the design across two separate dies. The processor was also rather data-hungry and did not offer good performance without extremely high bandwidth. The iAPX 432 did manage to outperform the 8080 and 8086, but it took little for the x86 line to pull ahead again.


i960

Intel created its first RISC processor, the i960, in 1984. It was not designed to compete with x86 processors, but as a microcontroller for embedded applications. Internally, it was a 32-bit superscalar architecture that used concepts from the Berkeley RISC project.

The first i960 processors had relatively low frequencies, with the slowest model running at 10 MHz, but over the years the design moved to more advanced processes that allowed it to touch 100 MHz. It supported 4 GB of protected memory. The i960 was used in military systems as well as in business systems.


80386

The Intel 80386 was the first x86 processor with a 32-bit architecture. It debuted in 1985. One of the main advantages of this design was its 32-bit address bus, which allowed it to support up to 4 GB of memory. Although this was far more memory than systems used at the time, it should not be forgotten that limited RAM had often held back the performance of previous x86 processors.

Unlike with modern CPUs, more RAM at that time almost always translated into more performance. Intel also implemented several architectural enhancements that allowed the 80386 to outperform the 80286, even with the same amount of RAM. The chip also supported virtual 8086 mode, which improved multitasking support.

Several versions came onto the market, such as the well-known 80386SX, which had a 16-bit data bus. It still supported 4 GB of RAM, but the reduced bus bandwidth hurt performance.


i860

In 1989, Intel tried again to move away from x86. It created a new RISC CPU named the i860. Unlike the i960, this project was meant to be a high-performance solution competing in the desktop market. Since Intel still makes x86 chips today, you can guess how it went.

The biggest problem was that the processor's performance depended entirely on the compiler, which was responsible for ordering the instructions before execution. Intel wanted to keep die size and complexity down this way, but it was nearly impossible for the compiler to schedule every instruction correctly when compiling a program. As a result, the CPU constantly stalled while trying to work around the problem.


80486

The 80486 was, for many, the gateway to the computing world. The key to its success was the tight integration of components: the 80486 was the first x86 CPU to offer an L1 cache. The first 80486 models had 8 KB and were made on a 1000-nanometre process. The design later moved to 600 nanometres and the L1 cache doubled to 16 KB.

Intel also integrated the FPU into the CPU; until then it had been a separate calculation unit. Moving this component into the processor reduced latency considerably. The 80486 also used a faster FSB interface to increase bandwidth, and the core was improved to raise IPC. These changes unlocked high performance: high-end models were several times faster than the old 80386.

The first 80486 chips operated at 50 MHz, and the subsequent 600-nanometre models reached 100 MHz. Intel also introduced a low-cost 80486 called the 80486SX, which had its FPU disabled.

P5 (Pentium)

The first Pentium arrived in 1993 and did not follow the previous naming scheme. Internally, the Pentium used an architecture known as P5, Intel's first superscalar x86 design. Although the Pentium was generally faster than the 80486 in every respect, its most important feature was a significantly improved FPU.

The Pentium's FPU was more than ten times faster than the 80486's. This was a big step forward, and the feature gained prominence in later years with the arrival of the Pentium MMX. That processor had an architecture identical to the first Pentium's, but it supported the new MMX SIMD instruction set, which dramatically increased performance.

Compared to the 80486, Intel increased the size of the L1 cache: the first Pentium had 16 KB, while the Pentium MMX moved to 32 KB. Naturally, these processors also ran at higher frequencies. The first Pentium used 800-nanometre transistors and could reach 60 MHz, but later revisions moved to 250 nanometres and hit 300 MHz.

P6, Pentium Pro

The Pentium was followed by the Pentium Pro, built on the P6 architecture. The new processor was considerably faster than the Pentium in 32-bit operations thanks to its out-of-order (OoO) design.

The internal architecture was deeply revised to decode instructions into micro-operations, which were then executed on general-purpose execution units. It also used an extended 14-stage pipeline with additional decoding hardware.

As the first Pentium Pro processors were aimed at the server market, Intel extended the address bus to 36 bits and added PAE technology, supporting up to 64 GB of RAM. That was far more than the average user needed, but supporting more RAM was important in the server world.

The cache was also revised. The L1 cache was limited to two segmented 8 KB caches, one for instructions and one for data. To make up for the 16 KB deficit compared to the Pentium MMX, Intel placed between 256 KB and 1 MB of L2 cache on a separate chip in the CPU package, connected to the CPU via a back-side bus (BSB).

Initially, Intel wanted to push the Pentium Pro into the consumer market, but it then decided to limit it to the server world. The chip had several revolutionary features, but in performance terms it struggled against the Pentium and Pentium MMX. The older Pentiums were significantly faster with 16-bit instructions, and 16-bit software was still the most widespread. The processor also lacked MMX support, which let the Pentium MMX beat the Pentium Pro in software optimized for that instruction set.

The Pentium Pro might have had a chance in the consumer market, but in addition to the issues above it was quite expensive due to the separate L2 cache chip. The fastest Pentium Pro ran at 200 MHz and was made on processes between 500 and 350 nanometres.

P6, Pentium II

In 1997 it was the turn of the Pentium II, designed to remedy just about all of the Pentium Pro's weak points. The basic architecture was similar to the Pentium Pro's and continued to use a 14-stage pipeline, with several improvements to increase IPC. The L1 cache grew to 16 KB for data and 16 KB for instructions.

Intel switched to cheaper cache chips mounted on a larger package to reduce production costs. It was an effective way to cut the price, but these memory modules could not run at full CPU speed: the L2 cache operated at half the frequency, which was still enough to boost performance. Intel also added support for the MMX instruction set.

The Pentium II, code-named "Klamath" or "Deschutes" depending on the core, also found a place in the server market as the Xeon and the Pentium II Overdrive. The best models integrated 512 KB of L2 cache and ran at 450 MHz.

P6, Pentium III

Intel was already developing the Netburst architecture, but it was not yet ready, so it introduced the Pentium III. The first processors, code-named Katmai, were quite similar to the Pentium II because they used a lower-grade L2 cache running at half the CPU frequency. The architecture did offer some significant changes: several parts of the 14-stage pipeline were fused together, reducing it to 10 stages. Thanks to the updated pipeline and the increased frequency, the first Pentium III models exceeded their Pentium II counterparts by a small margin.

Katmai was produced with 250-nanometre transistors. By switching to 180 nanometres, Intel was able to greatly increase the Pentium III's performance. The new version, code-named Coppermine, moved the L2 cache inside the processor and halved its capacity to 256 KB. As it ran at the processor's full frequency, performance increased.

Coppermine competed with AMD's Athlon in the gigahertz race. Intel attempted to ship a 1.13 GHz model but was forced to recall it after Tom's Hardware founder Dr. Thomas Pabst found it unstable. The last Pentium III core was called Tualatin. Moving to 130 nanometres pushed the frequency up to 1.4 GHz, and the L2 cache doubled to 512 KB.

P5 and P6, Celeron and Xeon

The Pentium years also saw the arrival of other product lines called Celeron and Xeon. These chips used the same cores as the Pentium II or Pentium III, but with varying amounts of cache. The first Celerons were based on the Pentium II and had no L2 cache at all, which crippled their performance. Pentium III-based models had half the L2 cache disabled compared to their Pentium III counterparts.

This led to Celeron processors with the Coppermine core and only 128 KB of L2 cache; the later models based on the Tualatin core had 256 KB.

These half-cache derivatives were also known as Coppermine-128 and Tualatin-256. They ran at frequencies similar to the Pentium III in order to compete with AMD's Duron. Microsoft used a 733 MHz Celeron Coppermine-128 in the first Xbox.

The first Xeons offered more L2 cache. Pentium II-based Xeon chips contained up to 2 MB.


Netburst and the pipeline

Before discussing the Pentium 4 and the Netburst architecture, it is worth dwelling on the concept of a pipeline: the path along which instructions move through the core. Pipeline stages often carry out multiple operations, but sometimes they are dedicated to individual functions.

By adding new hardware or splitting a stage into multiple segments, the pipeline can be lengthened. It can also be shortened by removing hardware or combining several stages into one.

The length, or depth, of the pipeline has a direct impact on the architecture's latency, IPC, frequency and throughput. Longer pipelines usually require greater bandwidth, but as long as the pipeline stays fed, every stage remains busy. Processors with longer pipelines can usually run at higher frequencies.

The trade-off is much higher internal latency, since data passing through must stop at each stage for a given number of clock cycles. Processors with a long pipeline therefore tend to have a lower IPC, which is why they rely on higher frequencies to deliver performance. Over the years there have been successful processors based on both philosophies; neither approach is inherently wrong.
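This trade-off can be sketched with a toy throughput model (all numbers below are invented for illustration, not measurements of any real chip): a deeper pipeline buys clock speed but pays an IPC penalty whenever it must be refilled, for instance after a branch misprediction.

```python
def effective_ipc(base_ipc: float, depth: int, flush_rate: float) -> float:
    """Toy model: each pipeline flush costs roughly one refill (= depth cycles)."""
    return base_ipc / (1 + flush_rate * depth)

def throughput(freq_ghz: float, depth: int, flush_rate: float = 0.05) -> float:
    """Instructions per nanosecond ~ frequency x effective IPC."""
    return freq_ghz * effective_ipc(1.0, depth, flush_rate)

# A hypothetical short pipeline at a modest clock vs a long one at double the clock:
print(round(throughput(1.0, depth=10), 2))  # short pipeline  -> 0.67
print(round(throughput(2.0, depth=31), 2))  # long pipeline   -> 0.78
```

With these (made-up) parameters the deep design wins only because of its higher clock, which is exactly the bet Netburst made.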

Netburst, P4 Willamette and Northwood

In 2000 the Netburst architecture was finally ready and arrived on the market as the Pentium 4. The first version was called Willamette and lasted about two years, and the chip struggled to beat the Pentium III. Netburst reached much higher frequencies (Willamette touched 2 GHz), but the 1.4 GHz Pentium III was faster in some operations. AMD's Athlon processors enjoyed a comfortable lead over this period.

The problem was that with Willamette Intel extended the pipeline to 20 stages, believing it could push well past 2 GHz; power consumption and temperatures prevented it from reaching that goal. The situation improved with the 130-nanometre version, known as Northwood, which climbed to 3.2 GHz and doubled the L2 cache from 256 to 512 KB. Netburst's consumption and heat problems persisted, but Northwood behaved considerably better and competed more effectively with AMD's solutions.

On high-end models, Intel introduced Hyper-Threading to improve resource utilization in multitasking environments, although the benefits were much smaller than they are today. Willamette and Northwood were also used to create Celeron and Xeon CPUs. As with previous Celeron and Xeon generations, Intel reduced or increased the size of the L2 cache.

Netburst, Prescott

After Northwood came Prescott, a new version with many improvements. Produced at 90 nanometres, it allowed Intel to bring the L2 cache to 1 MB. Intel also introduced the new LGA 775 socket with DDR2 support and an improved quad-pumped FSB.

These changes guaranteed a much higher bandwidth than Northwood, vital to boosting Netburst's performance. Prescott was also the first 64-bit Intel x86 processor.

Unfortunately, it was a failure. Intel extended the pipeline to 31 stages, hoping to raise frequencies enough to compensate, but managed to reach only 3.8 GHz. Prescott ran too hot and consumed too much power. Intel hoped the 90-nanometre shift would alleviate the problem, but the greater density made cooling even harder.

Adding up the improvements and the larger cache, Prescott was, at best, on par with Northwood. Meanwhile AMD's K8 processors moved to a smaller production process, reached higher frequencies, and dominated the market.

P6, Pentium M

The Netburst architecture was not suited to the mobile market, so in 2003 Intel created its first notebook-oriented design: the Pentium M. The chips were based on the P6 architecture but had a longer pipeline (12 to 14 stages). It was also the first time Intel adopted a variable-length pipeline.

This meant that instructions could complete after 12 stages, but only if the data required by the instruction was already in the cache; otherwise they passed through two additional stages.

The Pentium M was built with 130-nanometre transistors and contained 1 MB of L2 cache. It reached up to 1.8 GHz while consuming just 24.5 watts. A subsequent revision, Dothan, moved to 90 nanometres.

This allowed Intel to increase the L2 cache to 2 MB and improve IPC thanks to some core tweaks. The frequency rose to 2.27 GHz, with a slight increase in consumption (27 watts).

Netburst, Pentium D

In 2005 everyone was racing to ship the first dual-core processor for the consumer market. AMD announced the dual-core Athlon 64, but Intel got to market first by using a multi-chip module (MCM) containing two Prescott dies. The company named its first dual-core the Pentium D, code-named Smithfield.

The Pentium D was not well received: it faced the same problems as Prescott. The heat and power consumption of the two dies limited the frequency to 3.2 GHz, and since the architecture had limited bandwidth, Smithfield's IPC suffered because the throughput was split between the two cores. AMD's first dual-core CPU, built on a single die, was superior.

Smithfield was followed by Presler, built at 65 nanometres. It contained two Cedar Mill dies on an MCM. The smaller process allowed the processor to reduce temperature and consumption, and the frequency went up to 3.8 GHz. The Pentium D Presler had a 125-watt TDP, which later fell to 95 watts thanks to a new stepping. It also doubled the L2 cache, with 2 MB per die. Some high-end models offered Hyper-Threading, allowing the CPU to handle four threads simultaneously. All Pentium D processors supported 64-bit software and could use more than 4 GB of RAM.

Core, Core 2 Duo

Intel decided to abandon the Netburst architecture and create a new one based on what had been done with the P6 and Pentium M designs: thus the Core project was born. Like the Pentium M, it had a 12/14-stage pipeline, much shorter than Prescott's 31 stages.

Core proved highly scalable, covering everything from mobile systems with a 5-watt TDP to high-end 130-watt servers. Intel led the market with the Core 2 Duo and Core 2 Quad, but Core was also at the heart of the Core Solo, Celeron, Pentium and Xeon lines. The die integrated two cores, and quad-core designs used two dual-core dies on an MCM. Single-core versions had one core disabled. The L2 cache size ranged from 512 KB to 2 MB.

Thanks to the new design, Intel was able to compete with AMD again. The Core architecture put Intel back on the right path, and it is thanks to this breakthrough that the Atom processors were born.

Bonnell, the first Atom

The Bonnell architecture of the first Atom was not designed from scratch; its roots lay in the P5 architecture. The first die, Silverthorne, had a 3-watt TDP that allowed Intel to enter markets that had previously been out of reach. The performance was not high, but the experience was good enough and, above all, cheap. Silverthorne was succeeded by Diamondville, with a lower frequency but 64-bit support. In the following years, Pineview, Cedarview, Silvermont and Airmont followed in turn.

Nehalem, the first Core i7

With the Intel processor market in full ferment, the company went back to work on the Core architecture, creating Nehalem, a design with numerous improvements. The cache controller was redesigned and the L2 cache dropped to 256 KB per core, a decision that did not hurt performance, as Intel added between 4 and 12 MB of L3 cache shared across all cores. Nehalem-based CPUs had between one and four cores and were made at 45 nanometres.

Intel also redefined the connections between the CPU and the rest of the system. The old FSB had been in use since the 1980s, so Intel replaced it with QuickPath Interconnect (QPI) on high-end systems and DMI elsewhere. This allowed Intel to move the memory controller (updated to support DDR3) and the PCIe controller into the processor. These changes sharply increased bandwidth while latency collapsed.

Once again, Intel extended the processor's pipeline, this time to 20-24 stages. Frequencies did not rise, however, and Nehalem worked at clocks similar to Core's. Nehalem was also the first Intel processor to implement Turbo Boost: although the base clock of the best Nehalem chip was 3.33 GHz, the processor could accelerate to 3.6 GHz for short periods thanks to the new technology.

Nehalem also marked the return of Hyper-Threading. Thanks to this and many other changes, the design achieved performance up to two times higher than Core 2 under heavy loads that exploited many threads.


Westmere

After Nehalem it was the turn of Westmere, a die shrink to 32 nanometres. The basic architecture changed little, but Intel took advantage of the smaller die to add more components to the CPU. Instead of just four x86 cores, Westmere could hold up to ten (Westmere-EX) and up to 30 MB of shared L3 cache.

Arrandale and Clarkdale, the mainstream mobile and desktop parts, adopted a GPU integrated on the CPU package and produced at 45 nanometres. The HD Graphics core was similar to the GMA 4500, except for two additional EUs.

From the glorious Sandy Bridge to the present day

Sandy Bridge

With Sandy Bridge, Intel made a clear step forward, so much so that more than a few enthusiasts still fondly remember the Core 2000 series. The pipeline was reduced to 14-19 stages, and a micro-op cache was implemented that could hold up to 1,500 decoded micro-ops, enabling instructions to bypass five stages when the requested micro-op was already cached. If it was not, the instruction had to go through all 19 stages.
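The effect of the micro-op cache on effective pipeline depth is simple expected-value arithmetic (the hit rates below are hypothetical; the 14- and 19-stage figures come from the text):

```python
def average_stages(hit_rate: float, hit_stages: int = 14,
                   miss_stages: int = 19) -> float:
    """Expected pipeline depth: cached micro-ops skip the five decode stages."""
    return hit_rate * hit_stages + (1 - hit_rate) * miss_stages

for rate in (0.0, 0.5, 0.9):
    print(rate, average_stages(rate))  # 0.0 -> 19.0, 0.5 -> 16.5, 0.9 -> 14.5
```

The better the cache hit rate on hot code, the closer the pipeline behaves to its short 14-stage case.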

The processor also brought other improvements, such as DDR3 support for higher performance. The GPU was integrated into the CPU die, and the various subsystems were connected internally by a ring bus that allowed data exchange with very high bandwidth.

Intel also upgraded the integrated graphics. Instead of a single HD Graphics model across all CPUs, it created three versions. The HD Graphics 3000, the best version, had 12 EUs clocked up to 1.35 GHz and also contained the Quick Sync transcoding engine. The HD Graphics 2000 was identical but had only six EUs. The low-end HD Graphics also had six EUs but lacked some value-added features.

Ivy Bridge

Sandy Bridge was followed by Ivy Bridge, a "Tick+" in Intel's then-current "Tick-Tock" cadence. Ivy Bridge's IPC was slightly better than Sandy Bridge's, but the major enhancements were in energy efficiency.

The architecture adopted 22-nanometre three-dimensional transistors that reduced consumption significantly. The mainstream Core i7 Sandy Bridge had a 95-watt TDP, while the comparable Ivy Bridge went down to 77 watts. This improvement mattered particularly in the mobile world: Intel managed to create a quad-core mobile Ivy Bridge with a 35-watt TDP, whereas previously all Intel quad-core mobile CPUs had a TDP of at least 45 watts.

Intel used the reduced die size to expand the integrated GPU. The HD Graphics 4000 integrated 16 EUs, and a new graphics architecture improved the performance of each EU. Thanks to these changes, the HD Graphics 4000 delivered performance up to 200% higher than its predecessor.


Haswell

A year after Ivy Bridge came Haswell. It was, again, an evolution rather than a revolution. The AMD processors facing Sandy and Ivy Bridge were not fast enough to trouble the high end, so Intel had little incentive to boost performance: Haswell was about 10% faster than Ivy Bridge.

With Haswell, too, the company focused on energy efficiency and the integrated GPU. Haswell incorporated a voltage regulator into the processor, allowing the CPU to manage consumption more precisely. The integrated regulator caused the CPU to produce more heat, but the Haswell platform as a whole was more efficient.

To fight AMD's APUs, Intel put up to 40 EUs in the Haswell iGPU. The company also increased the bandwidth available to the fastest graphics chips by providing 128 MB of eDRAM (L4 cache), which dramatically improved their performance.


Broadwell

The architecture after Haswell was named Broadwell. Designed for mobile systems, it debuted at the end of 2014 using 14-nanometre transistors. The first Broadwell processor was the Core M, a dual-core with Hyper-Threading and a TDP of 3 to 6 watts.

Other Broadwell mobile processors arrived later, but on the desktop front there were only two Broadwell CPUs. The Core i7 had the fastest integrated graphics ever seen on an Intel desktop processor: six sub-slices with 8 EUs each, for a total of 48 EUs. The GPU also had access to 128 MB of eDRAM (L4 cache), which solved the bandwidth problem integrated chips usually ran into. In gaming tests, the GPU surpassed AMD's best APU and delivered good performance in modern titles.


Skylake

In 2015, shortly after Broadwell, came Skylake. The CPU was faster, but the platform changes were more important: Skylake was the first consumer CPU compatible with DDR4, which is more efficient and offers greater bandwidth than DDR3. The Skylake platform also brought several enhancements, such as a new DMI interface, an updated PCIe controller, and broader connectivity support.

Of course, Skylake also had a better integrated GPU. The most prominent model, Iris Pro Graphics 580, was used in a few high-end parts and had 72 EUs plus 128 MB of eDRAM (L4 cache). Most of the CPUs had a 24-EU GPU based on a graphics architecture similar to Broadwell's.

Kaby Lake

With Skylake and Kaby Lake, Intel ended the Tick-Tock development cadence in favor of the new tick-tock-tock, referred to as Process-Architecture-Optimization (PAO).

Intel will now spend more time on a single production process before developing a new one, and as a result will also introduce more substantial architectural changes along the way.

Kaby Lake, therefore, can be seen as an optimized variant of Skylake. Intel used a refined process called 14nm+ with various modifications to improve energy efficiency and performance.

The architecture changed very little, but it supports DDR4 at 2400 MHz. Kaby Lake integrates an improved GPU, the HD Graphics 630, with support for the latest encoding and decoding codecs as well as 4K video playback.
