Traditional high-performance DSP platforms, which are based on general-purpose DSP processors and run algorithms developed in C, are moving toward the use of FPGA preprocessors and/or coprocessors. This shift can give products significant performance, power, and cost advantages.

Despite these advantages, teams accustomed to designing processor-based systems often avoid FPGAs because they lack the hardware skills needed to use FPGAs as coprocessors (Figure 1). Unfamiliarity with traditional hardware design languages such as VHDL and Verilog limits or prevents the use of FPGAs, which often results in designs that are too expensive and consume too much power. ESL (electronic system level) design, a new class of design tools, addresses this challenge. It helps processor-based designers use programmable logic to accelerate their designs while keeping conventional hardware and software design methods.

Boost performance with FPGA coprocessing

Designers can exploit the parallelism of the FPGA fabric to dramatically improve the performance of DSP systems. Common examples include FIR filtering, FFTs, digital downconversion, and forward error correction (FEC) blocks.

The Xilinx Virtex™-4 and Virtex-5 architectures provide up to 512 parallel multipliers capable of running at speeds in excess of 500 MHz, for a peak DSP performance of 256 GMAC/s. Implementing high-speed parallel processing in the FPGA and high-speed serial processing in the DSP processor optimizes the performance of the entire DSP system while reducing its power requirements.
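The peak figure quoted above follows directly from the multiplier count and the clock rate. As a minimal sketch, assuming each multiplier completes one multiply-accumulate per clock cycle (an idealized upper bound, and a function name chosen here for illustration only):

```c
#include <assert.h>
#include <stdint.h>

/* Peak multiply-accumulate rate in MAC/s.
   Assumption: every multiplier finishes one MAC on every clock cycle,
   which is an upper bound rather than a sustained rate. */
uint64_t peak_macs_per_second(uint32_t multipliers, uint64_t clock_hz)
{
    assert(multipliers > 0 && clock_hz > 0);
    return (uint64_t)multipliers * clock_hz;
}
```

Calling `peak_macs_per_second(512, 500000000ULL)` gives 256,000,000,000 MAC/s, which is the 256 GMAC/s figure. Sustained throughput in a real design would normally be lower, since it depends on how well the algorithm keeps all multipliers busy.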

Reduce costs with FPGA-embedded processing

A DSP hardware system with an FPGA coprocessor offers many implementation options for the operations in a C algorithm, because the algorithm can be partitioned among DSP processors, FPGA configurable logic blocks (CLBs), and FPGA embedded processors. Virtex-4 devices offer two embedded processors: the MicroBlaze™ soft-core processor, typically used for system control, and the higher-performance PowerPC™ hard-core processor. The parallel operations implemented in the FPGA fabric can be used directly in the DSP datapath or configured as a hardware accelerator for an embedded processor.

The challenge for designers is how to divide the DSP system's operations among the available hardware resources in the most efficient and cost-effective way. The biggest benefit of an FPGA embedded processor is not always obvious: it can significantly reduce the overall cost of the system. FPGA embedded processors make it possible to move all non-critical operations into software running on the embedded processor, thereby minimizing the total amount of hardware resources the system requires.

C program to system gates

In FPGA applications, the term “C program to system gates” refers to one of two implementation options: implementing a DSP block directly in the FPGA fabric, or creating a hardware accelerator for a MicroBlaze or PowerPC 405 embedded processor (Figure 2).

Implementing an operation as a DSP block directly in the DSP datapath achieves the highest performance. This method first synthesizes the C code into RTL code and then instantiates the resulting module in the DSP datapath. You can instantiate it using traditional HDL design methods or through system-level tools such as Xilinx System Generator for DSP. This direct instantiation lets developers achieve the highest performance with minimal overhead.

Mainstream C synthesis tools can achieve performance comparable to handwritten RTL, but doing so requires a thorough understanding of how the tools work and of the coding style they expect. Reaching the required performance often means modifying the code and adding inline synthesis directives that introduce parallelism and pipeline stages. Even with these modifications, design productivity can still improve considerably, because the C system model remains the primary driver of the design process.
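The kind of code modification described above can be sketched in plain C. The example below is illustrative only: `NUM_TAPS`, `mac_serial`, and `mac_parallel` are hypothetical names, and no tool-specific directive is shown because the syntax differs from one C synthesis tool to another. The first function is written the way a software developer normally would; the second uses fixed loop bounds and independent partial sums, which exposes four multiplications per iteration that a tool could map onto parallel multipliers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TAPS 8  /* hypothetical length, fixed at compile time */

#if NUM_TAPS % 4 != 0
#error "NUM_TAPS must be a multiple of 4 for the unrolled version"
#endif

/* Software-style version: a single accumulator, so each addition
   depends on the previous one and the loop is inherently serial. */
int64_t mac_serial(const int16_t x[NUM_TAPS], const int16_t h[NUM_TAPS])
{
    assert(x != NULL && h != NULL);
    int64_t acc = 0;
    for (int i = 0; i < NUM_TAPS; i++)
        acc += (int32_t)x[i] * h[i];
    return acc;
}

/* Synthesis-style version: four independent partial sums per
   iteration, combined in a small adder tree at the end. */
int64_t mac_parallel(const int16_t x[NUM_TAPS], const int16_t h[NUM_TAPS])
{
    assert(x != NULL && h != NULL);
    int64_t p0 = 0, p1 = 0, p2 = 0, p3 = 0;
    /* A tool-specific pipeline or unroll directive would typically be
       placed on this loop; it is omitted here on purpose. */
    for (int i = 0; i < NUM_TAPS; i += 4) {
        p0 += (int32_t)x[i]     * h[i];
        p1 += (int32_t)x[i + 1] * h[i + 1];
        p2 += (int32_t)x[i + 2] * h[i + 2];
        p3 += (int32_t)x[i + 3] * h[i + 3];
    }
    return (p0 + p1) + (p2 + p3);
}
```

Both functions return the same value for the same inputs, since integer addition can be regrouped freely when no overflow occurs. Whether a given tool actually infers parallel hardware from the second form depends on the tool and its settings.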

As an alternative, creating a hardware accelerator for a Xilinx embedded processor is usually a simpler approach. In this method, the processor still runs most of the C program, but the operations that have a significant impact on performance are moved into the FPGA logic and executed there as hardware accelerators. This is a more software-centric approach to design, although it sacrifices some performance. As in the DSP block approach, the C program is synthesized into RTL code, except that the top-level entity is wrapped in interface logic so that it can be connected to the bus of the Xilinx embedded processor. The result is a hardware accelerator that can be imported into the Xilinx EDK environment and called from software-friendly C programs.
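From the software side, calling such an accelerator can look like an ordinary C function call. The sketch below is a hedged illustration, not the interface generated by any particular tool: the register layout, the bit definitions, and the names `accel_regs_t`, `accel_model_step`, and `accel_multiply` are all assumptions made for this example. So that it runs on a host PC, the registers live in ordinary memory and a small software model stands in for the hardware block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register map of a bus-attached multiply accelerator.
   On real hardware this struct would overlay a memory-mapped address;
   here it is just a variable in ordinary memory. */
typedef struct {
    volatile uint32_t operand_a;
    volatile uint32_t operand_b;
    volatile uint32_t control;   /* bit 0: start */
    volatile uint32_t status;    /* bit 0: done  */
    volatile uint32_t result;
} accel_regs_t;

#define ACCEL_START 0x1u
#define ACCEL_DONE  0x1u

/* Software stand-in for the hardware block: when the start bit is
   set, compute the product, clear start, and raise done. */
void accel_model_step(accel_regs_t *regs)
{
    if (regs->control & ACCEL_START) {
        regs->result   = regs->operand_a * regs->operand_b;
        regs->control &= ~ACCEL_START;
        regs->status  |= ACCEL_DONE;
    }
}

/* Driver function: write the operands, start the block, wait for the
   done flag, and read back the result. */
uint32_t accel_multiply(accel_regs_t *regs, uint32_t a, uint32_t b)
{
    assert(regs != NULL);
    regs->status    = 0;
    regs->operand_a = a;
    regs->operand_b = b;
    regs->control   = ACCEL_START;
    while (!(regs->status & ACCEL_DONE))
        accel_model_step(regs);  /* on real hardware: only poll status */
    return regs->result;
}
```

With a zero-initialized `accel_regs_t regs`, the call `accel_multiply(&regs, 6, 7)` returns 42. The application code sees only the driver function, which is what keeps the flow software-friendly; the bus traffic behind it is also where some of the performance cost of this approach comes from.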

The performance requirements for mapping C programs to hardware accelerators are generally less demanding. The goal here is to improve on a pure software implementation while maintaining a software-friendly design flow. Coding techniques and inline synthesis directives are still available, but it is often possible to achieve the required performance gains without them.
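One way to see why a modest accelerator is often sufficient is Amdahl's law, which is brought in here as an illustration and is not part of the original flow description. It estimates the overall speedup when only a fraction of the run time is accelerated. The function name and the numbers in the usage note are hypothetical.

```c
#include <assert.h>

/* Overall speedup from Amdahl's law.
   accelerated_fraction: share of the original run time spent in the
   code moved to hardware (0.0 to 1.0).
   kernel_speedup: how much faster that code runs in hardware. */
double overall_speedup(double accelerated_fraction, double kernel_speedup)
{
    assert(accelerated_fraction >= 0.0 && accelerated_fraction <= 1.0);
    assert(kernel_speedup > 0.0);
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / kernel_speedup);
}
```

If, for example, 80% of the run time is moved to an accelerator that is 10 times faster, `overall_speedup(0.8, 10.0)` is about 3.57, while even an infinitely fast accelerator could not exceed 5. Most of the attainable gain therefore arrives well before the accelerator itself is heavily optimized.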

Design Methodology – Barriers to Adopting FPGA Coprocessing

Properly partitioning and implementing a complex DSP system requires skills that take considerable time and effort to master. In 2005, the market research firm Forward Concepts conducted a survey to determine the most important criteria for FPGA selection in DSP designs. The survey found that development tools were the most important selection criterion, as shown in Figure 3.

The survey results show that users fully recognize the advantages of implementing DSP hardware systems with FPGA coprocessors, but for traditional DSP designers the current state of the development tools is an obstacle to adopting this design approach.

Xilinx ESL Program

ESL design tools raise the abstraction level of digital design one step above RTL. Some of these tools are dedicated to mapping system models developed in C/C++ onto DSP systems containing FPGAs and DSP processors. The aim is to make the hardware platform transparent to software designers (Figure 4).

This year, Xilinx and major ESL tool vendors joined forces to launch a collaborative effort called the ESL Program to address these obstacles. Its primary goal is to let designers with software programming skills implement their ideas in programmable hardware easily, without having to learn traditional hardware design skills. By incorporating innovations from its member companies, the program accelerates product development and encourages designers to adopt the most advanced design methodologies.

Conclusion

Bringing together the tools of the Xilinx ESL partners enables a broad set of complementary solutions optimized for a variety of products, platforms, and end users. Xilinx is also concentrating on complementary technologies. For example, AccelDSP synthesis provides a hardware implementation for algorithms developed in floating-point MATLAB, while Xilinx System Generator for DSP allows modules developed in ESL designs to be combined easily with Xilinx IP and embedded processors. Leveraging the work of multiple innovative partners is the quickest way to achieve the FPGA design flow that programmers expect.

Responsible editor: gt
