

Author: Shahbahrami, Asadollah; Juurlink, Ben; Borodin, Demid; Vassiliadis, Stamatis
Publisher: Springer Publishing Company
ISSN: 0885-7458
Source: International Journal of Parallel Programming, Vol. 34, Iss. 3, 2006-06, pp. 237-260
Abstract
Single-Instruction Multiple-Data (SIMD) instructions provide an inexpensive way to exploit the data-level parallelism in multimedia applications. However, the performance improvement obtained by employing SIMD instructions is often limited, because many overhead instructions are frequently required to bring data into a form amenable to SIMD processing. In this paper, we employ two techniques to overcome this limitation. The first technique, extended subwords, uses four extra bits for every byte in a media register. This allows many SIMD operations to be performed without overflow and avoids packing/unpacking conversion overhead. The second technique, the Matrix Register File (MRF), allows flexible row-wise as well as column-wise access to the register file. It is useful for many two-dimensional multimedia algorithms such as the (Inverse) Discrete Cosine Transform, the 2 × 2 Haar Transform, and pixel padding. In addition, we propose a few new media instructions. Experimental results obtained by extending the SimpleScalar toolset show that these techniques improve performance by up to a factor of 4.5 compared to a conventional SIMD instruction set extension.
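The two techniques can be illustrated with a small software model. This is only a sketch under stated assumptions, not the instruction set defined in the paper: lanes are treated as unsigned, an extended subword is modeled as 8 + 4 = 12 bits, and every name here (`add_wrapping`, `MatrixRegisterFile`, `read_row`, `read_col`) is hypothetical.

```python
# Illustrative model only. Lane widths, unsigned arithmetic, and all names
# are assumptions made for this sketch, not the paper's actual design.

def add_wrapping(a, b, bits):
    """Lane-wise add where each lane keeps only its low `bits` bits."""
    mask = (1 << bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]

pixels_a = [200, 100, 250, 30]
pixels_b = [100, 100, 10, 40]

# Conventional 8-bit subwords: sums above 255 wrap around, so bytes would
# normally have to be unpacked to 16-bit lanes before adding.
sum8 = add_wrapping(pixels_a, pixels_b, 8)

# Extended 12-bit subwords (8 data bits + 4 extra bits): the intermediate
# sums fit, so no unpack/pack conversion is needed before the shift.
sum12 = add_wrapping(pixels_a, pixels_b, 12)
average = [v >> 1 for v in sum12]


class MatrixRegisterFile:
    """Toy register file whose contents can be read by row or by column."""

    def __init__(self, rows):
        self.regs = [list(r) for r in rows]

    def read_row(self, i):
        # Row-wise access: one register as stored.
        return list(self.regs[i])

    def read_col(self, j):
        # Column-wise access: element j of every register, which replaces
        # an explicit transpose built from rearrangement instructions.
        return [r[j] for r in self.regs]


mrf = MatrixRegisterFile([[1, 2], [3, 4]])
print(sum8, sum12, average)
print(mrf.read_row(0), mrf.read_col(0))
```

In this model the 8-bit sums lose information in two of the four lanes, while the 12-bit lanes hold every intermediate result. The column read returns the first element of each register without any separate transposition step.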