By Víctor Mayoral-Vilches and Giulio Corradi, Xilinx Corporation

Part 2: An industrial analogy for CPUs, GPUs, ASICs, and FPGAs: which is best suited to robot computing?

CPUs and general-purpose GPUs (GPGPUs) are two widely used commercial computing platforms, prized for their high availability and generality. That versatility is what makes them of particular interest to roboticists. But generality comes at a price:

1. The fixed architecture of a general-purpose platform is difficult to adapt to new robotic scenarios. Additional functionality often requires additional hardware, which in turn means spending time integrating that new hardware into the system.

2. Generality inevitably compromises timing behavior, undermining determinism and making strict real-time requirements hard to meet.

3. Their power consumption is typically one to two orders of magnitude higher than that of dedicated computing architectures such as FPGAs or ASICs (1).

4. Their fixed, inflexible architecture makes them less resilient to cybersecurity threats and malicious behavior. Attacks such as Meltdown and Spectre show that, without the ability to reconfigure their data pipelines, computing platforms eventually become insecure.

In general, fixed-architecture devices such as CPUs, GPUs, and ASICs offer advantages, but they also impose costs on developers. Their lack of flexibility and adaptability hurts timing behavior and increases energy consumption. And because their hardware architecture cannot be reconfigured for resilience, they are more vulnerable to cyber threats.

Industrial analogy for CPUs

Figure 1 presents an industrial analogy for the CPU, picturing it as a series of workshops, each staffed by a single highly skilled worker.


Figure 1: Industrial analogy for CPUs

Each of these workers can produce almost any product using general-purpose tools. Applying different tools in sequence, each worker turns raw materials into finished products, one product at a time. Depending on the nature of the task, this serial production process may involve a large number of steps. Leaving buffering aside, these workshops are essentially independent of one another, so workers can concentrate on different tasks without worrying about interference or coordination.

Although a CPU is flexible, its underlying hardware is fixed. CPUs still follow the von Neumann architecture (or, more precisely, the stored-program computer model): data is read from memory into the processor, operated on, and written back to memory. A CPU essentially operates serially, one instruction at a time. The architecture is centered on the arithmetic logic unit (ALU), and every operation must move data into and out of the ALU.

Industrial analogy for GPUs

GPUs can also be compared to workshops and workers, but there are far more of them, and the workers are much more specialized, as shown in Figure 2.


Figure 2: Industrial analogy for GPUs

GPU workers are limited to specific tools, and each can perform a much smaller variety of tasks, but they are very efficient at getting them done. GPU workers are most efficient when they repeat the same small set of tasks, especially when they are all doing the same thing at the same time. GPUs thus address one of the CPU's main drawbacks: the inability to process large amounts of data in parallel.

Although GPUs have many more cores than CPUs, they still use a fixed hardware architecture, and each GPU core still contains some type of von Neumann processor. A single instruction can process a thousand or more pieces of data, although the same operation must usually be performed on every piece of data being processed at that moment. The atomic processing elements operate on vectors of data rather than on the single data points of a CPU, but each ALU still executes one fixed instruction at a time. Data must therefore still travel from memory to these processing units through a fixed data path. Like CPUs, GPUs are built on fixed hardware: the basic architecture and data flow are the same for every robotics application.

Industrial analogy for FPGAs

If CPUs and GPUs are shop floors where workers sequentially process inputs into outputs, FPGAs are flexible, adaptive factories that create assembly lines and conveyor belts tailored to the specific task at hand (see Figure 3).


Figure 3: Industrial analogy for FPGAs

This flexibility means that instead of relying on generic tools, FPGA architects can build factories, assembly lines, and workstations tailored to the tasks they need to accomplish. In these factories, raw materials are progressively processed into finished products by teams of workers assigned to the assembly lines. Each worker performs the same task repeatedly, while semi-finished products are passed between workers on a conveyor belt. This significantly increases throughput and ensures optimal use of resources and power. In this analogy, the factory is the OpenCL acceleration kernel, the assembly line is the data pipeline, and each workstation is an OpenCL compute function.

Industrial analogy for ASICs

Like FPGAs, ASICs also build factories, but an ASIC's factories are final and cannot be altered (see Figure 4). In other words, these factories are staffed entirely by robots, with no human adaptability inside: the assembly lines and conveyor belts are fixed, and the automated processes do not allow changes. This dedicated, fixed architecture gives ASICs extremely high energy efficiency and the lowest unit cost in high-volume mass production. Unfortunately, ASIC development often takes years and admits no changes afterwards, so the up-front investment can quickly fall behind future productivity-enhancing improvements.


Figure 4: Industrial analogy for ASICs
