MIRA: Micro-Architectural Reliability Analysis for Deep Submicron Technology

Introduction

The last few decades have witnessed continuous scaling of CMOS technology, guided by Moore's Law, to support devices with higher speed, smaller area and lower power. Though there are varying arguments on how long the scaling can continue, it is undisputed that classical physics can support deterministic circuit behavior only up to a limit, which is ultimately set by the thickness of an atom. The current deep submicron CMOS technology generation already faces several challenges, resulting in a broad class of problems grouped under the term reliability. According to the International Technology Roadmap for Semiconductors (ITRS), reliability and resilience across all design layers constitute a long-term grand challenge. The unreliability of a device can be masked by conservative design decisions, but these reduce the achievable performance. An alternative to such performance degradation is to accept the unreliability and expose it to all layers of computing. For example, aggressive voltage scaling of the device may lead to higher runtime performance at the cost of timing errors, which can be corrected by circuit or micro-architectural techniques.

A key ingredient of successful exploration of reliability against other design constraints (e.g. power, temperature, speed) is to accurately model the faults prevalent in deep submicron technologies and to develop a smooth tool-flow on a high-level design platform for analyzing the effect of such faults. A conceptual tool-flow, under development, is shown in the figure.
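To make the idea of analyzing fault effects at a high level concrete, the following is a minimal sketch and not part of the MIRA tool-flow. It assumes a single-bit-flip fault model, a toy workload (counting 16-bit words that reach a threshold), and hypothetical function names chosen purely for illustration. It compares the faulty result against a fault-free "golden" result and reports the fraction of injected faults that become visible at the output.

```python
import random

WORD_BITS = 16  # assumed word width of the toy data path


def workload(words, threshold=0x8000):
    """Toy workload: count how many 16-bit words reach the threshold."""
    return sum(1 for w in words if w >= threshold)


def inject_bit_flip(words, index, bit):
    """Return a copy of words with one bit inverted (assumed fault model)."""
    faulty = list(words)
    faulty[index] ^= 1 << bit
    return faulty


def exhaustive_failure_fraction(words):
    """Inject every possible single-bit flip and count visible output errors."""
    golden = workload(words)
    failures = 0
    for index in range(len(words)):
        for bit in range(WORD_BITS):
            if workload(inject_bit_flip(words, index, bit)) != golden:
                failures += 1
    return failures / (len(words) * WORD_BITS)


def sampled_failure_fraction(words, trials, rng):
    """Monte Carlo estimate of the same fraction from random fault sites."""
    golden = workload(words)
    failures = 0
    for _ in range(trials):
        index = rng.randrange(len(words))
        bit = rng.randrange(WORD_BITS)
        if workload(inject_bit_flip(words, index, bit)) != golden:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    data = [0x0001, 0x7FFF, 0x8000, 0xFFFF]
    print("exhaustive:", exhaustive_failure_fraction(data))
    print("sampled:   ", sampled_failure_fraction(data, 2000, random.Random(0)))
```

In this toy workload only a flip of the most significant bit changes the comparison, so most injected faults are masked. Real designs differ in which faults are masked, and the fault sites and fault models would come from the fault library rather than from a uniform random choice.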

 
 

Challenges

We are tackling multiple challenges in developing the reliability estimation and exploration framework. These are identified as follows.

  • Generic, technology-independent, parameterizable fault library construction: State-of-the-art fault characterization is not sufficient in view of emerging devices, and a better understanding of their behavior is required. Furthermore, architectural reliability estimation needs a logical representation that corresponds to the physical defects, which is a challenging problem.
  • Fast Reliability Estimation Flow: Reliability estimation at a high level of design abstraction is approximate but fast, whereas detailed circuit-level simulation is accurate but painfully slow. This accuracy-performance trade-off is well known and acceptable. However, with increasing design complexity, even the fast simulation setup of a high-level architecture description takes a significantly long time (hours) to provide a reliability estimate. This challenge can be approached by, first, analytical modeling of architectural reliability and, second, determining representative simulation vectors instead of using full applications.
  • High-level Estimation of Physical Parameters: The dependence of several reliability measures (e.g. Mean-Time-To-Failure) on physical system parameters such as die area and temperature is well known. An accurate estimate of reliability therefore requires an accurate estimate of such physical parameters.
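The sensitivity of Mean-Time-To-Failure (MTTF) to the physical parameters mentioned in the last item can be illustrated with a small sketch. It is not the model used in this project. It assumes an Arrhenius-type temperature dependence, as commonly used for wear-out mechanisms, an illustrative activation energy of 0.7 eV, and the simplifying assumption that the failure rate grows linearly with die area. All function and parameter names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_mttf_ratio(t_ref_k, t_k, activation_energy_ev):
    """Return MTTF(t_k) / MTTF(t_ref_k) for an Arrhenius temperature dependence."""
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_k - 1.0 / t_ref_k)
    return math.exp(exponent)


def scaled_mttf(mttf_ref_hours, t_ref_k, t_k, area_ref_mm2, area_mm2,
                activation_energy_ev=0.7):
    """Scale a reference MTTF to a different temperature and die area.

    Assumption (not from the source): failure rate is proportional to die
    area, so MTTF scales with area_ref_mm2 / area_mm2.
    """
    temperature_factor = arrhenius_mttf_ratio(t_ref_k, t_k, activation_energy_ev)
    area_factor = area_ref_mm2 / area_mm2
    return mttf_ref_hours * temperature_factor * area_factor


if __name__ == "__main__":
    # Effect of a 10 K error in the estimated temperature (350 K vs. 360 K).
    ratio = arrhenius_mttf_ratio(350.0, 360.0, 0.7)
    print(f"MTTF at 360 K relative to 350 K: {ratio:.3f}")
```

Under these assumptions, a 10 K error in the temperature estimate changes the predicted MTTF by roughly a factor of two, which is why the high-level estimate of such parameters needs to be accurate.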

Contact

Zheng Wang, Anupam Chattopadhyay

Publications

Eusse, J. F., Murillo, L. G., McGirr, C., Leupers, R. and Ascheid, G.: Application-Specific Architecture Exploration Based on Processor-Agnostic Performance Estimation, in Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems (SCOPES15)(St. Goar, Germany), Jun. 2015


Odendahl, M., Goens, A., Leupers, R., Ascheid, G. and Henriksson, T.: Buffer Allocation based On-Chip Memory Optimization for Many-Core Platforms, in Fifth International Workshop on Parallel Computing and Optimization, pp. 1119-1124, May. 2015, ISBN: 978-1-46737-684-6, 10.1109/IPDPSW.2015.67 ©2015 IEEE


Ries, B., Odendahl, M. and Leupers, R.: A Heuristic for Logical Data Buffer Allocation in Multicore Platforms, in International Performance Computing and Communications Conference, Poster Session, Dec. 2014, 10.1109/PCCC.2014.7017040 ©2014 IEEE


Cosmin, C.-G., Marcu, M., Wang, Z., Chattopadhyay, A., Amaricai, A., Fedeac, S., Ghenea, M., Weinstock, J. H. and Leupers, R.: Direct FPGA-based Power Profiling for a RISC Processor, in IEEE International Instrumentation and Measurement Technology Conference (I2MTC)(Pisa, Italy), pp. 1578 - 1583, May. 2015, 10.1109/I2MTC.2015.7151514 ©2015 IEEE


Wang, Z.: High-level Modeling, Estimation and Exploration of Reliability for MPSoC, 2015, 10.1109/IWCIT.2015.7140217


Ishaque, A. and Ascheid, G.: On the Sensitivity of SMT Systems to Oscillator Phase Noise over Doubly-Selective Channels, in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC)(New Orleans, LA USA), pp. 545 - 550, Mar. 2015, 10.1109/WCNC.2015.7127528 ©2015 IEEE


Eusse, J. F., Williams, C., Murillo, L. G., Leupers, R. and Ascheid, G.: Pre-architectural Performance Estimation for ASIP Design Based on Abstract Processor Models, in SAMOS 2014(Samos, Greece), Jul. 2014, 10.1109/SAMOS.2014.6893204 ©2014 IEEE


Sheng, W., Schürmans, S., Odendahl, M., Bertsch, M., Volevach, V., Leupers, R. and Ascheid, G.: A compiler infrastructure for embedded heterogeneous MPSoCs, in Parallel Computing, pp. 51-68, Elsevier, Feb. 2014, http://dx.doi.org/10.1016/j.parco.2013.11.007


Wang, Z., Li, R. and Chattopadhyay, A.: Opportunistic Redundancy for Improving Reliability of Embedded Processors, in 8th IEEE International Design & Test Symposium (IDT)(Marrakesh, Morocco), pp. 1-6, Dec. 2013, 10.1109/IDT.2013.6727090


Chen, X., Li, S., Schleifer, J., Coenen, T., Chattopadhyay, A., Ascheid, G. and Noll, T.: High-Level Modeling and Synthesis for Embedded FPGAs, in Proceedings of the Conference on Design, Automation & Test in Europe (DATE), Mar. 2013


Ishaque, A. and Ascheid, G.: A Blind-ML Method for Frequency-Selective I/Q Mismatch Compensation in Low-IF Receivers, in 2013 IEEE Wireless Communications and Networking Conference (WCNC), 7-10 April 2013, (Shanghai, China), pp. 2513 - 2518, Apr. 2013, ISBN: 978-1-46735-938-2, ISSN: 1525-3511, 10.1109/WCNC.2013.6554956 ©2013 IEEE


Jordan, M., Gong, X., Dimofte, A. and Ascheid, G.: Conversion from Uplink to Downlink Spatio-Temporal Correlation with Cubic Splines, in IEEE 69th Vehicular Technology Conference: VTC Spring 2009(Barcelona, Spain), Apr. 2009