1.1.2 Types of processor
We have many different types of processor, all designed to complete different tasks. Some processors are designed to accelerate specific tasks, while others are designed to be general purpose.
CPUs can be separated into two families: CISC and RISC.
Processor architectures
Arising from different design philosophies, modern processors fall into two groups: RISC and CISC. Each has different properties, advantages, and drawbacks that may make one architecture more suitable than the other.
RISC
RISC (Reduced Instruction Set Computer) is one of the two main CPU architectures in use. It is present in most mobile devices, embedded computers, and microcontrollers, and is growing in use in laptops and servers, particularly through Apple's M-series and Qualcomm's Snapdragon X lineup.
RISC processors maintain the philosophy of one operation per one instruction within one clock cycle.
This results in RISC processors having far fewer and simpler instructions than CISC, since each instruction performs less work and fewer operations need to be given a dedicated instruction.
Something simple such as addition can take several instructions: loading each value from memory into a register, calling the add instruction, and then storing the result back to memory.
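As a minimal sketch of this idea, here is a tiny hypothetical RISC-style machine written in Rust (the `Instr` enum, `run` function, and register/memory layout are all illustrative, not any real instruction set). Even a simple addition decomposes into four separate instructions, each doing exactly one operation:

```rust
// A hypothetical RISC-style machine, for illustration only.
enum Instr {
    Load { reg: usize, addr: usize },       // reg <- memory[addr]
    Add { dst: usize, a: usize, b: usize }, // dst <- a + b
    Store { reg: usize, addr: usize },      // memory[addr] <- reg
}

fn run(program: &[Instr], memory: &mut [i64]) {
    let mut regs = [0i64; 8]; // 8 general-purpose registers
    for instr in program {
        match instr {
            Instr::Load { reg, addr } => regs[*reg] = memory[*addr],
            Instr::Add { dst, a, b } => regs[*dst] = regs[*a] + regs[*b],
            Instr::Store { reg, addr } => memory[*addr] = regs[*reg],
        }
    }
}

fn main() {
    let mut memory = vec![0i64; 4];
    memory[0] = 2;
    memory[1] = 3;
    // memory[2] = memory[0] + memory[1] takes four instructions:
    let program = [
        Instr::Load { reg: 0, addr: 0 },
        Instr::Load { reg: 1, addr: 1 },
        Instr::Add { dst: 2, a: 0, b: 1 },
        Instr::Store { reg: 2, addr: 2 },
    ];
    run(&program, &mut memory);
    println!("{}", memory[2]); // 5
}
```

Because each instruction is this small and predictable, a real RISC core can pipeline them easily, which is the first trait listed below.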
Additional traits of RISC:
- Easier to pipeline since each instruction is predictable and simple
- RISC programs require more RAM to store the additional instructions
- RISC processors are generally more power and heat efficient
- Fewer abstractions
CISC
CISC (Complex Instruction Set Computer) is the CPU architecture most commonly used in desktops and laptops.
Note
For additional reading on CISC instructions, see this video on strange x86 CPU instructions
CISC processors are significantly more complex (hence the name) compared to RISC processors, as these processors are capable of having single instructions that can perform multiple tasks.
Generally, these instructions wrap around the simpler instructions.
For example, the instruction DPPS (Dot Product of Packed Single Precision Floating-Point Values) from the SSE4.1 instruction set performs significantly more than just one operation.
CISC processors have larger instruction sets than RISC, and a single instruction may be allocated multiple clock cycles to complete its work.
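To give a feel for what an instruction like DPPS bundles together, here is a sketch in Rust (the `dot4` function is illustrative, not the actual hardware operation): one conceptual "dot product" step hides four multiplications and three additions, which a RISC-style program would issue as separate instructions.

```rust
// A DPPS-style dot product of two 4-element vectors: one conceptual
// operation that internally performs four multiplies and three adds.
fn dot4(a: [f32; 4], b: [f32; 4]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

fn main() {
    // (1*4) + (2*3) + (3*2) + (4*1) = 20
    println!("{}", dot4([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]));
}
```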
Additional traits of CISC:
- Simplifies compilation because the instructions can more closely resemble higher level statements
- Could also make the optimization stage more difficult
- Less heat/power efficient
- Physically larger
- Lower memory usage
Parallel processing
Parallel processing is where multiple tasks are executed simultaneously on separate cores or processors, so that all of the tasks complete in a shorter time than if they were executed sequentially.
Systems with more cores are capable of executing more tasks in parallel.
For systems without multiple cores, a single core can make use of threading, which is a form of concurrency on a single core.
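A minimal sketch of this in Rust (the `parallel_sum` function and its chunking scheme are illustrative assumptions, not from the specification): the range is split into chunks, one thread per chunk. On a multi-core system the chunks genuinely run in parallel; on a single core the operating system interleaves the threads instead, which is concurrency.

```rust
use std::thread;

// Sum 1..=upper by splitting the range across `workers` threads.
fn parallel_sum(upper: u64, workers: u64) -> u64 {
    let chunk = upper / workers;
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let start = w * chunk + 1;
            let end = if w == workers - 1 { upper } else { (w + 1) * chunk };
            // Each thread sums its own chunk independently of the others.
            thread::spawn(move || (start..=end).sum::<u64>())
        })
        .collect();
    // Joining waits for every thread, then the partial sums are combined.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // 1 + 2 + ... + 1_000_000 = 500_000_500_000
    println!("{}", parallel_sum(1_000_000, 4));
}
```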
Caution
Concurrency and parallelisation ARE DIFFERENT.
See this video (2:04-3:07) for the explanation, though I highly recommend watching the whole video.
Feel free to ignore the Rust-specific parts. From what I know, this is on the specification.
The negatives of parallel processing
Parallel processing does not guarantee a speed increase. It can often introduce bugs, and occasionally it even causes slowdowns.
When performing parallel processing, the task needs to be allocated to the different cores/processors, which induces a small overhead.
In the cases of small tasks being parallelised, the overhead of allocating many different tasks could overshadow the speed benefit of parallelising the task in the first place.
Parallel processing is also significantly harder to program and utilise.
The Rust Programming Language Book phrases this very well:
Splitting the computation in your program into multiple threads to run multiple tasks at the same time can improve performance, but it also adds complexity. Because threads can run simultaneously, there’s no inherent guarantee about the order in which parts of your code on different threads will run. This can lead to problems, such as:
- Race conditions, in which threads are accessing data or resources in an inconsistent order
- Deadlocks, in which two threads are waiting for each other, preventing both threads from continuing
- Bugs that only happen in certain situations and are hard to reproduce and fix reliably
These possible circumstances make development and testing with parallelism very difficult in comparison to single threaded programming. It’s up to the programmer to determine whether it is worth it or not to implement parallelism/concurrency.
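As a small sketch of avoiding the race condition described above (the `locked_count` function is illustrative, not taken from the Rust Book): ten threads all increment one shared counter. Without synchronisation the read-modify-write steps could interleave and lose updates; wrapping the counter in a `Mutex` makes each increment exclusive, so the final total is always correct.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// `threads` threads each add `increments` to one shared counter.
// The Mutex ensures only one thread updates the counter at a time.
fn locked_count(threads: u32, increments: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1; // lock held for one increment
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap(); // wait for every thread to finish
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", locked_count(10, 1000)); // always 10000
}
```

Rust makes this explicit: shared mutable state must go through a synchronisation type like `Mutex`, which is one reason the Rust Book discusses these hazards so directly.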
Coprocessors and Accelerators
Along with our primary processor (generally the CPU), there can also be additional processors that tasks can be delegated to.
Co-processors are specialised processors that are capable of doing specific tasks much faster than the CPU can. Things like floating point arithmetic, cryptography, matrix multiplication, and other easily parallelised tasks are frequently offloaded to such coprocessors.
Hardware acceleration will often use coprocessors such as a GPU or NPU/TPU, offloading computationally intensive tasks onto the additional device. Mathematically intensive tasks like rendering and ML workloads will almost always be offloaded to the GPU, as it is able to perform these tasks orders of magnitude faster than the CPU.
Here are some common coprocessors:
Note
You don’t need to know any of these except for the GPU.
| Acronym | Full name | Task |
|---|---|---|
| GPU | Graphics Processing Unit | Rendering graphics, Parallelised arithmetic |
| NPU | Neural Processing Unit | AI and machine learning |
| TPU | Tensor Processing Unit | Variant of an NPU by Google for neural networks |
| QPU | Quantum Processing Unit | Quantum computing |
GPUs
The GPU (Graphics Processing Unit) is a specialised processor originally designed to accelerate the computation of graphics and 3D scenes.
In comparison to CPUs, GPUs instead pack the die with thousands of simpler cores to maximise their parallel processing capability.
Since the only thing the GPU will be doing is completing tasks sent from the CPU, it does not need nearly as much silicon space for administrative parts.
| GPU | CPU |
|---|---|
| More cores | Fewer cores |
| Simpler cores | More complicated cores |
| Specialised for parallel processing | Specialised for sequential processing |
| Computes more specialised tasks | Computes more general tasks |