Flexible Architectures for Next-Generation Iterative MIMO Receivers
Flexibility requirements, when coupled with low-cost and short time-to-market constraints, make the development of a mobile device highly complicated and challenging. Software defined radios (SDRs) with cognitive capabilities are gaining prominence as potential candidates to meet the future requirements of mobile wireless devices. In this project, we investigate SDR development approaches that can provide a comprehensive design framework encompassing platforms, architectures, software, methodology and design tools.
This research project is carried out at the Ultra High-Speed Mobile Information and Communication (UMIC) research centre as part of the Nucleus project, with the support of the Fraunhofer Institute for Communication, Information Processing and Ergonomics (FKIE), Germany.
Key Challenges in SDR Development
The paradigm of SDR poses new challenges or makes current design challenges more stringent. The most relevant challenges are:
- Flexibility in modern radio devices is required to serve different goals, such as efficient multi-modal or multi-waveform transmission to comply with new and old air interfaces, interoperability, support for multiple algorithms, and cost reduction via modular and parametric design. In this context, the term waveform (WF) denotes a complete wireless standard, e.g. GSM, UMTS, etc.
- Efficiency with respect to area and energy is essential in order to decrease the power/energy consumption and extend the battery life. This in turn requires a highly efficient waveform implementation.
- Trade-offs between flexibility and efficiency become challenging because the two goals are contradictory and because waveforms impose several hard real-time constraints such as low latency and high throughput. This makes heterogeneous multi-processor system-on-chips (MPSoCs) an inevitable candidate as the hardware (HW) platform for implementing a waveform.
Heterogeneous MPSoCs with specialized processing elements (PEs) can pave the way to resolving the conflicting demands of high computational performance on the one hand and energy efficiency on the other. However, designing such a system is highly complex, tedious and error-prone, and it requires extensive design space exploration and early verification. What is needed is a description method that can lead to a (semi-)automatic generation of a waveform implementation directly from the specification. Therefore, a methodology is required that raises the abstraction level of waveform design to make it manageable.
Library Based Waveform Development
Raising the abstraction level leads to library based approaches for waveform development, where efficient implementations of basic components are available and can be assembled to implement the complete waveform. A library based approach also enables efficient utilization of heterogeneous MPSoCs. However, existing library based approaches are either waveform-based or hardware-based: a library supports only one waveform or only one processing element. Moreover, due to the proprietary nature of existing libraries, reusability across vendors is very limited. Therefore, more open approaches that can support seamless waveform development for SDRs are needed.
The Nucleus Methodology
To overcome the drawbacks of traditional library based approaches, a new classification of library elements called Nuclei is proposed, targeting reusability, portability and implementation efficiency. Considering the important aspects of library based WF development, the Nucleus concept approaches system design through the following key principles:
- Limit HW flexibility to the minimum required level (e.g. architecture of PEs, communications and memories)
- Maximize area and energy efficiency
- Manage/exploit flexibility by means of high level programmability
The figure illustrates the Nucleus concept and the waveform development environment (WDE) using the Nucleus methodology. A Nucleus is defined as a critical, demanding, flexible, algorithmic kernel that captures common functionalities within and/or among WFs. As shown in the figure, the Nucleus library is used while describing a waveform. The waveform description then consists of Nuclei kernels and computationally light non-Nuclei tasks such as the modulator.
One of the main aspects of the Nucleus methodology [4, 6, 7] is the standardization of the Nucleus library. Because the details of Nuclei and their interfaces are available, different vendors can provide efficient implementations of some or all Nuclei for some or all processing elements of a given hardware platform as a board support package (BSP). We define a Flavor as an efficient and optimized implementation of one Nucleus. As shown in the figure, there can be several Flavors for one Nucleus on one PE. Flavors can be based on different implementation algorithms, and each Flavor can have tunable parameters. Note that Flavors and configuration parameters provide flexibility in implementing a waveform even on a fixed HW platform.
Unlike traditional approaches, where a waveform is mapped onto a HW platform, the Nucleus methodology reduces mapping to a choice of Flavor for each Nucleus (see figure). In other words, the HW platform has been abstracted to the Nucleus level because Flavors are bundled with a PE. When a Flavor for a Nucleus is not available in a BSP, the Nucleus can still be implemented using traditional development approaches. Since the non-Nuclei tasks are computationally light, they can be mapped onto a general purpose processor.
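The relationship between Nuclei, Flavors and a BSP can be sketched as a small data model. This is only an illustration of the idea described above, not the project's actual tooling: the class names, the `map_waveform` helper, the choice of the first available Flavor, and all example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flavor:
    """One optimized implementation of a Nucleus, bundled with one PE."""
    nucleus: str         # name of the Nucleus this Flavor implements
    pe: str              # processing element the Flavor runs on
    algorithm: str       # implementation algorithm used by the Flavor
    params: tuple = ()   # tunable parameters as (name, value) pairs


@dataclass
class BoardSupportPackage:
    """Hypothetical container for the Flavors a vendor ships for a platform."""
    platform: str
    flavors: list = field(default_factory=list)

    def flavors_for(self, nucleus):
        return [f for f in self.flavors if f.nucleus == nucleus]


def map_waveform(tasks, nuclei, bsp, gpp="GPP"):
    """Map every task of a waveform description.

    Nuclei are mapped to a Flavor (here simply the first one found; None
    means no Flavor exists and traditional development is needed).
    Non-Nuclei tasks are assumed light enough for a general purpose processor.
    """
    mapping = {}
    for task in tasks:
        if task in nuclei:
            candidates = bsp.flavors_for(task)
            mapping[task] = candidates[0] if candidates else None
        else:
            mapping[task] = gpp
    return mapping


# Illustrative use with invented names and values.
bsp = BoardSupportPackage(
    platform="demo_board",
    flavors=[Flavor("qrd", "ASIP", "givens", (("data_width", 16),))],
)
mapping = map_waveform(["qrd", "modulator"], nuclei={"qrd"}, bsp=bsp)
```

In this sketch the waveform description never refers to a PE directly; the PE only appears through the Flavor that was selected.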
The Nucleus methodology offers several advantages. Due to the waveform-independent and HW-independent nature of the Nuclei library, a waveform is highly portable. Efficiency is guaranteed by the optimized nature of Flavors, which exploit the architecture of a PE. Furthermore, flexibility vs. efficiency trade-offs can be made even on a fixed HW platform by changing the Flavors and their configuration parameters, such as input data-width. Because such Nuclei kernels also exist in general purpose applications, computer science researchers are pursuing work in the same direction.
Our investigations have shown that Flavors and configuration parameters like input data-width, scaling, etc. have a huge influence on performance properties like BER, latency, energy efficiency, etc. This motivates tool assistance that can efficiently explore the mapping among the available Flavors when implementing a waveform [7, 8].
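A tool-assisted mapping exploration of this kind could, in its simplest form, enumerate the Flavor choices and keep the cheapest one that meets a constraint. The sketch below assumes that each Flavor comes with a latency and an energy estimate and that both simply add up along the waveform; the Nucleus names, Flavor names and all numbers are invented for illustration and are not measurements from the project.

```python
from itertools import product

# Hypothetical estimates per Flavor: (latency in microseconds, energy in microjoules).
CANDIDATES = {
    "qrd": {
        "givens_asip_16bit": (4.0, 1.5),
        "givens_dsp_32bit": (9.0, 4.0),
    },
    "fft": {
        "radix2_dsp": (6.0, 3.0),
        "radix4_asip": (3.0, 3.5),
    },
}


def explore(candidates, max_latency_us):
    """Exhaustively search for the lowest-energy Flavor choice per Nucleus.

    Returns (energy, latency, {nucleus: flavor_name}) or None if no
    combination meets the latency bound. Latency and energy are assumed
    to be additive, which ignores real cascading effects.
    """
    nuclei = sorted(candidates)
    best = None
    for choice in product(*(candidates[n].items() for n in nuclei)):
        latency = sum(est[0] for _, est in choice)
        energy = sum(est[1] for _, est in choice)
        if latency > max_latency_us:
            continue  # violates the waveform constraint
        if best is None or energy < best[0]:
            selection = {n: name for n, (name, _) in zip(nuclei, choice)}
            best = (energy, latency, selection)
    return best


tight = explore(CANDIDATES, max_latency_us=8.0)     # forces the faster FFT Flavor
relaxed = explore(CANDIDATES, max_latency_us=12.0)  # allows the lower-energy one
```

Exhaustive enumeration grows exponentially with the number of Nuclei, so a real tool would likely need analytical models or heuristics instead.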
In the context of the Nucleus methodology, MIMO-OFDM systems are interesting because they combine high-throughput and low-latency requirements with high implementation complexity. MIMO detection schemes such as successive interference cancellation (SIC) and sphere decoding use QR decomposition (QRD). Since QRD is computationally intensive, its implementation is challenging, which has made it a key research topic in the recent past [9, 10, 11]. Currently, in this project, algorithms for performing QR decomposition are investigated as a case study for the Nucleus methodology. In general, the investigations relating to the Nucleus methodology address the following topics:
- How can Nuclei be identified in a waveform? What should be the flexibility of a Nucleus?
- How much does the performance, e.g. energy efficiency, of an algorithm vary across different architectures such as application specific instruction-set processors (ASIPs), digital signal processors (DSPs), general purpose processors (GPPs), etc.?
- Does a Nucleus (algorithm) exhibit sufficient numerical stability for fixed point implementations? What are the finite word length effects of Flavors on performance properties?
- What is the cost/influence of algorithmic flexibility on performance properties like energy efficiency? Can this influence be characterized / modeled, e.g. using equations, for a tool-assisted mapping?
- In a waveform with several Nuclei, what are the cascading effects of Flavors on performance properties, e.g. latency? How can such effects be modeled in a tool?
- How can the flexibility of a Nucleus be represented in a tool? Should configuration parameters like input data-width be part of a Nucleus interface?
- What constitutes the interface of a Nucleus? What details should be provided to the tool for a constraint-driven mapping that guarantees meeting all the waveform constraints?
- How much automation can the WDE support? For example, is it possible to generate glue code automatically?
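To make the questions on QRD and finite word length effects more concrete, the following sketch decomposes a small real matrix with Givens rotations and optionally rounds every intermediate value to a fixed number of fractional bits. It is an illustrative assumption, not one of the Flavors studied in the project: it handles real values only, models rounding but not saturation, and uses Givens rotations merely as one common way to compute a QRD.

```python
import math


def quantize(x, frac_bits):
    """Round x to a grid with frac_bits fractional bits (None keeps full precision)."""
    if frac_bits is None:
        return x
    scale = 1 << frac_bits
    return round(x * scale) / scale


def givens_qr(a, frac_bits=None):
    """QR-decompose a real m x n matrix (list of rows) so that a is close to q * r."""
    m, n = len(a), len(a[0])
    r = [[quantize(v, frac_bits) for v in row] for row in a]
    qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]  # holds Q^T
    for col in range(n):
        for row in range(m - 1, col, -1):
            x, y = r[col][col], r[row][col]
            norm = math.hypot(x, y)
            if norm == 0.0:
                continue  # entry is already zero, nothing to rotate
            c = quantize(x / norm, frac_bits)
            s = quantize(y / norm, frac_bits)
            # Apply the same rotation to R and to the accumulated Q^T.
            for mat in (r, qt):
                for k in range(len(mat[0])):
                    top, bot = mat[col][k], mat[row][k]
                    mat[col][k] = quantize(c * top + s * bot, frac_bits)
                    mat[row][k] = quantize(-s * top + c * bot, frac_bits)
    q = [[qt[j][i] for j in range(m)] for i in range(m)]  # transpose of Q^T
    return q, r


def reconstruction_error(a, frac_bits=None):
    """Largest absolute entry of (q * r - a), a simple accuracy measure."""
    q, r = givens_qr(a, frac_bits)
    m, n = len(a), len(a[0])
    return max(
        abs(sum(q[i][k] * r[k][j] for k in range(m)) - a[i][j])
        for i in range(m)
        for j in range(n)
    )


channel = [[0.5, -0.25], [0.75, 0.125]]            # made-up example matrix
error_float = reconstruction_error(channel)        # full-precision reference
error_8bit = reconstruction_error(channel, 8)      # 8 fractional bits
```

Comparing `error_float` with `error_8bit` for different word lengths gives a first impression of how a data-width parameter could trade accuracy against implementation cost.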
Venkatesh Ramakrishnan, Uwe Deidersen, Torsten Kempf, Daniel Günther
[1] Wimax TI Library, "http://www.ti.com/corp/docs/landing/wimax/index.htm", May 2009.
[2] CMSIS-DSP Library, "http://www.onarm.com", Dec 2010.
[3] T. Langguth et al., "SDR based Waveform Development", in 5th Karlsruhe Workshop on Software Radios (WSR 2008), Karlsruhe, Germany, 2008.
[4] V. Ramakrishnan et al., "Efficient and Portable SDR Waveform Development: The Nucleus Concept", in IEEE Military Communications Conference (MILCOM 2009), 2009.
[5] "The Landscape of Parallel Computing Research: A View From Berkeley", May 2009. www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
[6] V. Ramakrishnan et al., "SDR Waveform Development: Towards Tool Assisted Mapping and Evaluation in the Nucleus Concept", in SDR 09 Technical Conference and Product Exposition (SDR 09), 2009.
[7] V. Ramakrishnan et al., "Efficient Implementations From Libraries: Analyzing the Influence of Configuration Parameters on Key Performance Properties", in IEEE Personal, Indoor and Mobile Radio Communications Symposium 2009 (PIMRC 09), 2009.
[8] V. Ramakrishnan et al., "A High Level Performance Estimation: Modeling the Effects of Parameters On Performance Properties for a Tool Assisted SDR Development", in IEEE International Conference on Communications, 2010.
[9] P. Luethi et al., "VLSI Implementation of a High-Speed Iterative Sorted MMSE QR Decomposition", in IEEE International Symposium on Circuits and Systems, 2007.
[10] T. Haustein et al., "Real-time signal processing for multiantenna systems: algorithms, optimization, and implementation on an experimental test-bed", in EURASIP Journal on Applied Signal Processing, Volume 2006, Jan. 2006.
[11] Z. Nikolic et al., "Design and implementation of numerical linear algebra algorithms on fixed point DSPs", in EURASIP Journal on Applied Signal Processing, Volume 2007, Issue 2, June 2007.