# VLSI Design Handbook

Volume I

Edited by **Martin Limestone**

Published by Clanrye International, 55 Van Reypen Street, Jersey City, NJ 07306, USA
www.clanryeinternational.com

© 2015 Clanrye International

International Standard Book Number: 978-1-63240-519-7 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Copyright for all individual chapters remains with the respective authors as indicated. A wide variety of references are listed. Permission and sources are indicated; for detailed attributions, please refer to the permissions page. Reasonable efforts have been made to publish reliable data and information, but the authors, editors and publisher cannot assume any responsibility for the validity of all materials or the consequences of their use.

The publisher's policy is to use permanent paper from mills that operate a sustainable forestry policy. Furthermore, the publisher ensures that the text paper and cover boards used have met acceptable environmental accreditation standards.

**Trademark Notice:** Registered trademarks of products or corporate names are used only for explanation and identification without intent to infringe.

Printed in China.

#### **Preface**

The abbreviation VLSI stands for Very Large Scale Integration. Integrated circuit technology allows billions of transistors to be fabricated on a single chip. The groundwork for this technology was laid in the twentieth century, beginning in the mid-1920s, when several inventors tried to build devices that would turn solid-state diodes into triodes by controlling current. However, it was only after the creation of the transistor at Bell Labs in 1947 that solid-state devices began to replace vacuum tubes.
Moore's Law has so far held up as a predictor of the exponential growth in complexity and performance of integrated circuits. Most semiconductor companies face severe difficulties in keeping every aspect of the production process under control during chip design. These issues range from scientific research in discovering novel materials and devices to advanced technology developments and finding new killer applications.

This book has been compiled in order to emphasize the latest developments in the vast field of VLSI design. The contributors have made no attempt to be comprehensive on the topics. Instead, they have tried to present some promising concepts, such as the problems and challenges in introducing new-generation electronic design automation tools, optimization, modeling and simulation methodologies, thermal and power reduction and management, parasitic interconnects, etc.

I would like to thank all the authors for their excellent contributions in different applications of VLSI. Despite the rapid advances in the field, I believe that the examples provided here will give readers a view of some of the main lines of research. I hope that this book will prove to be a worthy contribution to the field of VLSI.

I also wish to thank the publisher and the publishing team for their outstanding support at every level of the editing process. Lastly, I wish to convey my regards to my friends and family for supporting me in every endeavor of my life.

Editor

### Contents

| Chapter | Title | Authors |
|------------|-------|---------|
| | Preface | |
| Chapter 1 | Design of Low Power Multiplier with Energy Efficient Full Adder Using DPTAAL | A. Kishore Kumar, D. Somasundareswari, V. Duraisamy and T. Shunbaga Pradeepa |
| Chapter 2 | Performance Analysis of High Speed Hybrid CMOS Full Adder Circuits for Low Voltage VLSI Design | Subodh Wairya, Rajendra Kumar Nagaria and Sudarshan Tiwari |
| Chapter 3 | A Signature-Based Power Model for MPSoC on FPGA | Roberta Piscitelli and Andy D. Pimentel |
| Chapter 4 | Low Complexity Submatrix Divided MMSE Sparse-SQRD Detection for MIMO-OFDM with ESPAR Antenna Receiver | Diego Javier Reinoso Chisaguano and Minoru Okada |
| Chapter 5 | Optimized Architecture Using a Novel Subexpression Elimination on Loeffler Algorithm for DCT-Based Image Compression | Maher Jridi, Ayman Alfalou and Pramod Kumar Meher |
| Chapter 6 | Design a Bioamplifier with High CMRR | Yu-Ming Hsiao, Miin-Shyue Shiau, Kuen-Han Li, Jing-Jhong Hou, Heng-Shou Hsu, Hong-Chong Wu and Don-Gey Liu |
| Chapter 7 | Verification of Mixed-Signal Systems with Affine Arithmetic Assertions | Carna Radojicic, Christoph Grimm, Florian Schupfer and Michael Rathmair |
| Chapter 8 | Low Cost Design of a Hybrid Architecture of Integer Inverse DCT for H.264, VC-1, AVS, and HEVC | Muhammad Martuza and Khan A. Wahid |
| Chapter 9 | 9T Full Adder Design in Subthreshold Region | Shiwani Singh, Tripti Sharma, K. G. Sharma and B. P. Singh |
| Chapter 10 | Automatic Generation of Optimized and Synthesizable Hardware Implementation from High-Level Dataflow Programs | Khaled Jerbi, Mickaël Raulet, Olivier Déforges and Mohamed Abid |
| Chapter 11 | A Graph-Based Approach to Optimal Scan Chain Stitching Using RTL Design Descriptions | Lilia Zaourar, Yann Kieffer and Chouki Aktouf |
| Chapter 12 | A 0.6-V to 1-V Audio $\Delta\Sigma$ Modulator in 65 nm CMOS with 90.2 dB SNDR at 0.6-V | Liyuan Liu, Dongmei Li and Zhihua Wang |
| Chapter 13 | A New Length-Based Algebraic Multigrid Clustering Algorithm | L. Rakai, A. Farshidi, L. Behjat and D. Westwick |
| Chapter 14 | An Efficient Multi-Core SIMD Implementation for H.264/AVC Encoder | M. Bariani, P. Lambruschini and M. Raggio |
| Chapter 15 | Design Example of Useful Memory Latency for Developing a Hazard Preventive Pipeline High-Performance Embedded-Microprocessor | Ching-Hwa Cheng |
| Chapter 16 | Homogeneous and Heterogeneous MPSoC Architectures with Network-On-Chip Connectivity for Low-Power and Real-Time Multimedia Signal Processing | Sergio Saponara and Luca Fanucci |
| Chapter 17 | Fast and Near-Optimal Timing-Driven Cell Sizing under Cell Area and Leakage Power Constraints Using a Simplified Discrete Network Flow Algorithm | Huan Ren and Shantanu Dutt |
| | Permissions | |
| | List of Contributors | |

# Design of Low Power Multiplier with Energy Efficient Full Adder Using DPTAAL

#### A. Kishore Kumar, D. Somasundareswari, V. Duraisamy, and T. Shunbaga Pradeepa

- <sup>1</sup> ECE Department, Hindusthan College of Engineering & Technology, Coimbatore 641 032, India
- <sup>2</sup> Department of Electrical Sciences, Adithya Institute of Technology, Coimbatore 641 107, India
- <sup>3</sup> Maharaja Institute of Technology, Coimbatore 641 407, India

Correspondence should be addressed to A. Kishore Kumar; kishore\_hindusthan@yahoo.in

Academic Editor: Wieslaw Kuzmicz

Asynchronous adiabatic logic (AAL) is a novel low-power design technique which combines the energy saving benefits of asynchronous systems with adiabatic benefits. In this paper, an energy efficient full adder using double pass transistor with asynchronous adiabatic logic (DPTAAL) is used to design a low power multiplier. Asynchronous adiabatic circuits are very low power circuits that preserve energy for reuse, which reduces the amount of energy drawn directly from the power supply. In this work, an $8 \times 8$ multiplier using DPTAAL is designed and simulated, which exhibits low power and reliable logical operations. To improve the circuit performance at reduced voltage level, double pass transistor logic (DPL) is introduced. The power results of the proposed multiplier design are compared with the conventional CMOS implementation. Simulation results show significant improvement in power for clock rates ranging from 100 MHz to 300 MHz.

#### 1. Introduction

Over the past few decades, low power design has steadily moved up the list of researchers' design concerns, prompting new methods for designing low power, low noise VLSI circuits. Moore's law, the empirical observation that the component density and performance of integrated circuits double roughly every two years, describes the growing transistor budget of VLSI designs. Transistor count is a primary concern which largely affects the design complexity of many function units such as the multiplier and the arithmetic logic unit (ALU).
Multiplier design is central to digital computing. Multipliers play a significant role in arithmetic operations in DSP applications. Recent developments in processor designs also focus on the use of low power multiplier architectures in their circuits. Two significant yet often conflicting design criteria are power consumption and speed. Taking these constraints into consideration, the design of low power multipliers is of great interest. As reported in [1], to meet the power and area requirements of computationally complex VLSI circuits, the length and width of transistors are shrunk into the deep submicron region, a task handled by process engineering.

In recent years, the literature has identified several types and designs of adiabatic circuits. For instance, 2N2N2P, PFAL, pass transistor adiabatic logic, clocked adiabatic logic, improved pass-gate adiabatic logic, and adiabatic differential switch logic were designed and achieved considerable energy savings compared with conventional CMOS design [3–9]. In [10], a complementary pass transistor adiabatic logic circuit was discussed, in which the nonadiabatic energy loss of output loads has been completely eliminated. In [11], adiabatic CPL circuits using two-phase power clocks were presented. In [12], an energy saving design technique achieved by latched pass transistor with adiabatic logic was presented. Many research methods in adiabatic logic have been attempted to reduce the power dissipation of VLSI circuits, as reported in [9–16].

<sup>4</sup> ECE Department, Coimbatore Institute of Technology, Coimbatore 641 014, India

Many research efforts in multiplier design have been introduced to obtain energy efficiency in VLSI circuits. In [17], a 1.5 ns 32-b CMOS ALU in double pass-transistor logic was proposed to improve the circuit performance at reduced supply voltage ranges.
In [18], a low power multiplier using a 4-2 compressor based on an adiabatic CPL circuit is described. Under the scaling rules set by Dennard, smart optimization can be achieved by the timely introduction of new processing techniques in device structures and materials [19]. In [1], a low power multiplier design using complementary pass transistor asynchronous adiabatic logic is investigated, which exhibits low power and reliable logical operations. In this paper, the design of a low power multiplier with an energy efficient full adder using double pass transistor asynchronous adiabatic logic (DPTAAL) is proposed and discussed in the following sections.

#### 2. Adiabatic Logic Design

"Adiabatic" is a term of Greek origin that has been associated with classical thermodynamics for most of its history. It refers to a system in which a transition occurs without energy (usually in the form of heat) being either lost to or gained from the system. In the context of electronic systems, it is electronic charge that is preserved rather than heat. Adiabatic logic draws on the thermodynamics of computation. By taking this branch of physics, which usually deals with mechanical engines, and applying it to computing engines, research areas such as reversible computation and adiabatic logic have been developed. By moving to a computing paradigm that is reversible, energy can be recovered from a computing engine and reused to perform further calculations. This style of logic differs from CMOS circuits, which dissipate energy during switching.

To reduce the dynamic power, there are some conventional approaches such as reducing the supply voltage, decreasing the physical capacitance, and reducing the switching activity. These approaches are not sufficient to meet today's power requirements. For this reason, much research has focused on building adiabatic logic, which is a promising design style for low power applications.
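The three conventional levers named above map directly onto the standard first-order expression for the dynamic power of a CMOS gate. The relation below is the usual textbook form rather than one stated in this chapter; the symbols $\alpha$ (switching activity), $C_L$ (switched load capacitance), and $f$ (clock frequency) are introduced here for illustration only:

$$
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
$$

Because the supply voltage enters quadratically, lowering $V_{DD}$ is the strongest of the three levers, while reducing $C_L$ or $\alpha$ gives only a linear improvement.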
The adiabatic technique reduces the power dissipated by switching activity by giving stored energy back to the supply. Thus, the term adiabatic logic is applied to low power VLSI circuits which execute reversible logic. In adiabatic techniques, the main design changes are focused on the power clock, which plays the essential role in the principle of operation. The following major design rules for adiabatic circuit design must be satisfied in each phase of the power clock.

- (1) Never turn on a transistor if voltage exists across it $(V_{\rm DS}>0)$.
- (2) Never turn off a transistor if current is flowing through it $(I_{DS} \neq 0)$.
- (3) Never pass current through a diode.

If these conditions are satisfied in all four phases of the power clock, the recovery phase will restore the energy to the power clock, resulting in considerable energy saving. Some difficulties in adiabatic logic design nevertheless remain. Two of them are the circuit implementation of time-varying power sources and the implementation of computations with low overhead circuit structures [1].

#### 3. Asynchronous Adiabatic Logic (AAL)

Asynchronous adiabatic logic is a unique design technique which combines the energy saving benefits of asynchronous logic and adiabatic logic. Like adiabatic circuits, asynchronous circuits are a promising technology for low power, highly modular digital circuits. One of the properties of asynchronous systems that makes them useful in these applications is a built-in insensitivity to variations in power supply voltage: a lower voltage results in slower operation rather than the functional failures that would be seen if traditional synchronous systems were used.
Another benefit is that an idle asynchronous system does not use clock signals, whereas in synchronous systems the clock signals are propagated throughout the entire system and convert energy to heat, often without performing any useful computation. In contrast to synchronous circuits, asynchronous circuits perform handshaking between their components to carry out all necessary synchronization, communication, and sequencing of operations. Asynchronous circuits fall into different classes, each offering different advantages. The main advantage of this kind of circuit is its low power consumption, which stems from the elimination of clock drivers and from the fact that no transistor ever switches unless it is performing a useful computation.

#### 4. Proposed Design

The main objective of this paper is to design a low power multiplier with an energy efficient full adder cell using double pass transistor with asynchronous adiabatic logic. The logic scheme for the full adder cell is illustrated in Figure 1. The entire system consists of two main blocks: a logical block and a control and regeneration (C&R) block. As shown in Figure 1, the data output signal of any logical block not only goes into the next logical block as a data input but is also used to generate a control signal for the next logical block using C&R block 1, as reported in [15]. This technique reduces the power required from the power clock generator.

4.1. Power Clock. In adiabatic circuits, the supply voltage also acts as the clock of the circuit while providing its power, and for this reason it is called the power clock. One of the main concerns in adiabatic logic circuits is power clock generation. In these circuits, the supply voltage should ideally be a ramping voltage.
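To see why a ramping supply is preferred, the sketch below compares the energy dissipated when a load capacitance is charged abruptly with the energy dissipated when it is charged through a slow linear ramp. It uses the standard first-order textbook results, $\tfrac{1}{2} C V^2$ for abrupt switching and approximately $(RC/T)\,C V^2$ for a ramp of duration $T \gg RC$; these formulas are not given in this chapter. The on-resistance and load capacitance are hypothetical values chosen only for illustration; the 1.8 V supply matches the simulations reported later.

```python
def conventional_switching_energy(c_load, v_dd):
    """Energy dissipated when C is charged abruptly to v_dd: (1/2) * C * V^2."""
    return 0.5 * c_load * v_dd ** 2


def adiabatic_ramp_energy(r_on, c_load, v_dd, t_ramp):
    """Approximate dissipation for a linear supply ramp of duration t_ramp.

    First-order result (R*C / T) * C * V^2, valid only when t_ramp >> r_on * c_load.
    """
    return (r_on * c_load / t_ramp) * c_load * v_dd ** 2


if __name__ == "__main__":
    r_on = 1e3       # hypothetical switch on-resistance in ohms (assumption)
    c_load = 10e-15  # hypothetical load capacitance in farads (assumption)
    v_dd = 1.8       # supply voltage in volts, as used in the chapter's simulations

    e_conv = conventional_switching_energy(c_load, v_dd)
    print(f"abrupt switching: {e_conv * 1e15:.2f} fJ")
    for t_ramp in (1e-9, 10e-9, 100e-9):
        e_adia = adiabatic_ramp_energy(r_on, c_load, v_dd, t_ramp)
        print(f"ramp of {t_ramp * 1e9:.0f} ns: {e_adia * 1e15:.4f} fJ")
```

In this model the dissipation falls in proportion to the ramp time, which is the trade between speed and energy that adiabatic circuits exploit.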
In contrast to conventional synchronous adiabatic circuits, rather than driving each adiabatic logic unit with an externally supplied clock phase, each block is controlled and powered by a control signal generated by the C&R block with the help of the logical output of the previous stage. In the design of VLSI circuits, power clock design is a major issue, because the whole transistor logic system shares the power clock. The power clock switching circuit also dissipates the most power in the logic. Nowadays, multiple phase clocks and clock pipelining are the most widely followed techniques for reducing power dissipation in the power clocks.

FIGURE 1: Logic scheme for full adder cell.

FIGURE 2: Proposed multiplier design scheme.

The synchronous clock system uses the clock source globally; that is, a single clock is shared and restored by a large number of logic gates in parallel. Here the switching loss of the power clock generator is high, as in CMOS circuit operation. The simple construction of pass transistor logic makes it easy to adjust the sizing of the transistors to get the desired charging and discharging time; hence the slope of the output control signal minimizes the power. The clock energy in the asynchronous clock system is stored locally in the C&R block and is used by later gates; the energy lost in each operation is taken from its clock source. The local regeneration stores the intermediate energy, which is provided to the operations required at the next level of logic. The initial power requirement from the clock generator remains the same; however, once the logical sequence has been powered up, the power taken from the power clock is reduced drastically.

FIGURE 3: Control and regeneration (C&R) block.

FIGURE 4: DPL full adder cell [2].

The proposed multiplier design scheme is illustrated in Figure 2. Here the data output signal of any full adder not only goes into the next full adder as a data input but is also used to generate a control signal for the next full adder using C&R block 1. This technique reduces the power required from the power clock generator [15]. This approach makes it feasible to use adiabatic logic in real-time implementations. To reduce the initial power dissipation as well, conventional compensation techniques such as multiple clocks and pipeline architecture can be used.

FIGURE 5: DPTAAL full adder logic diagram.

In this work, we have examined the practical approach of adiabatic logic in full adiabaticity. According to Landauer's principle, to charge and discharge the capacitances of the input nodes adiabatically, the input voltages must be reconstructed from the outputs. This is accomplished by the control and regeneration block. The control block is used to follow and preserve the power clock sequences with the input vectors. Regeneration provides the power saving strategy. All logic gates or logic sequences are connected through the C&R structure. The throughput of the logical system is reduced by the intermediate C&R blocks because of the asynchronous mode of operation. This loss of speed can be compensated by a higher input frequency, owing to the improved speed grade of the proposed asynchronous adiabatic logic, as discussed in [1].

#### 5. Control and Regeneration (C&R) Block

The control and regeneration (C&R) block is given in Figure 3. The C&R block generates the control signal for the next logical stage with the help of the previous stage output signal. The regeneration technique makes the control signal strong enough to drive the next logical block. In the proposed design, asynchronous operation is achieved by the control and regeneration part. The C&R block controls and regenerates the energy required for the next operation and delivers it to the next logical block.
The system energy circulates among the logical circuits, and only a minimum of power is required from the power clock generator for the operation. Generally, the regenerated signal is stored and circulated between the C&R and logical parts. Thus, little power is returned to the power clock system, which helps to reduce the switching losses of the power clock system. The proposed DPTAAL logic gates are used to design the logical blocks. A pass transistor logic implementation is used for the design of the C&R block for reasons of energy efficiency and functionality. The NOR portion of the OR gate acts as the control part, whereas the NOT portion not only performs the desired logical inversion but also regenerates the signal. The regenerated signal energy is used in the next logic circuit for the sequential operation. The NOT portion regenerates the signal again while the operation completes. The construction of the C&R block promotes local storage of the energy and provides the switching circuit for the recovery. No power reduction is achieved in the C&R block itself. However, a power saving of 60% to 70% and a speed improvement of one-third are achieved compared with adiabatic logic driven by the power clock generator, as discussed in [15].

#### 6. Double Pass Transistor (DPL)

Double pass transistor (DPL) is a modified version of complementary pass transistor logic (CPL) that meets the requirements of reduced supply voltage designs. In DPL circuits, full swing operation is achieved by simply adding PMOS transistors in parallel with the NMOS transistors. Thus, the problems of noise margin and speed degradation at reduced supply voltages associated with CPL circuits are avoided. The circuit diagram of the DPL full adder cell is given in Figure 4.

FIGURE 6: Simulation results of DPTAAL full adder cell.

TABLE 1: Transistor sizes used in each design block of the DPTAAL full adder.

| Design blocks | PMOS minimum length (µm) | PMOS width (µm) | NMOS minimum length (µm) | NMOS width (µm) |
|--------------------------------------|------|-----|------|-----|
| Adiabatic DPL gates, MUX, and buffer | 0.18 | 5.0 | 0.18 | 5.0 |
| C&R section | 0.18 | 5.0 | 0.18 | 2.0 |

In this cell, the sum output consists of XOR/XNOR gates, a multiplexer, and a CMOS output buffer. The carry output consists of AND/NAND gates, OR/NOR gates, a multiplexer, and a CMOS output buffer. These DPL gates consist of both NMOS and PMOS pass transistors, in contrast to CPL gates, where only NMOS pass transistors are used. The outputs S bar and C<sub>o</sub> bar act as the current paths when inputs A, B, and C are all low. These current paths include two pass transistors, and there are two current paths for each output. In the double pass transistor logic gates, the inputs to the gates of the PMOS transistors are changed from A to B. This arrangement compensates for the speed degradation of CMOS pass transistors in two ways.

First, it is a symmetrical arrangement whereby any input is connected to the gate of one MOSFET and the source of another. In the case of XOR/XNOR, it is perfectly symmetrical. Any of the inputs A, A bar, B, and B bar is connected to the gates of NMOS and to the sources of the NMOS and PMOS. This results in balanced input capacitance and reduces the dependence of the delay time on data.

Secondly, it has double transmission characteristics. In the DPL gate, both A and B are passed when A and B are low. In both the CPL and CMOS implementations, the gate input A or A bar controls the pass transistors. When A is low, B is passed, and B bar is passed when A is high. In the DPL gate, on the other hand, there are two types of pass transistors: one is controlled by A and the other by B. The A controlled pass transistors operate in the same way as CPL and CMOS.
For the B controlled pass transistors, when B is low, A is passed, and A bar is passed when B is high. As a result, there are always two current paths driving the buffer stage [17].

### 7. Double Pass Transistor with Asynchronous Adiabatic Logic (DPTAAL)

In the DPL design, the widths of the NMOS and PMOS pass transistors are one-third and two-thirds, respectively, of the NMOS pass transistor in the CPL gate, so the input capacitance and the gate area are nearly the same for all these architectures. The resistance, including that of the CMOS buffer of the previous stage, is smallest for the DPL gate because of its double-transmission property. In multiplier circuits, the DPL full adder is as fast as CPL, 18% faster than conventional pass transistor logic, and 37% faster than CMOS, as reported in [17].

TABLE 2: Qualitative comparison of logic designs [2].

| Logic designs | No. of MOS logic networks | Output driving capability | Input/output decoupling | Signal rails | Robustness |
|---------------|---------------------------|---------------------------|-------------------------|--------------|------------|
| CMOS | n + p | Medium-good | Yes | Single | High |
| CPL | 2n | Good | Yes | Dual | Medium |
| DPL | 2n + 2p | Good | Yes | Dual | High |

FIGURE 7: Schematic of DPTAAL multiplier.

In the proposed design, the double pass transistor technique is combined with the asynchronous adiabatic logic (AAL) design technique to obtain significant power benefits in digital circuits. The asynchronous adiabatic full adder uses a double pass transistor logical block with C&R structures.
A simple implementation of this system is a full adder cell in which the logical part is designed using DPTAAL, whereas the control part of the C&R block and the regeneration part are made of pass transistor logic. This pass transistor logic functions as a transmission gate in the output logic of each gate structure. The DPTAAL design of the full adder cell is presented in Figure 5; it consists of a C&R section, an adiabatic DPL full adder circuit, a multiplexer section, and an output buffer. In this DPTAAL full adder, the sum circuit section includes a DPL XOR gate, a DPL multiplexer, a C&R section, and an output buffer. The carry output section consists of a DPL AND gate, a DPL OR gate, a DPL multiplexer, a C&R section, and an output buffer. These adiabatic DPL gates consist of both NMOS and PMOS pass transistors to achieve full swing operation. When the inputs A, B, and C are all low, the outputs SUM bar and CARRY bar act as the current paths. Thus, two current paths for each output can be achieved.

In this DPTAAL design, the power and clock lines are merged into a single power clock line which both powers and times the circuit. The C&R section is the main concept of this DPTAAL design: it generates the control signal for the next logical gate using the output signal of the previous gate. The regeneration technique makes the control signal strong enough to drive the next logical gate. Thus, power consumption from the power clock is reduced drastically. A multiplexer chooses the output to be one of several inputs based on a select signal. In this full adder design, the multiplexer is used to select the required outputs of DPL-XOR, DPL-AND, and DPL-OR, based on the inputs C and C bar.

All these full adder blocks have been designed with PMOS/NMOS transistors, focusing on low power consumption and highly efficient operation. The gate length (L) of all these transistors has been taken as 0.18 µm.
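The multiplexer-based selection described above can be checked at the logic level with a short behavioural model. The sketch below is an assumption-laden illustration, not the authors' netlist: the function names are hypothetical, the exact multiplexer wiring (XOR or XNOR for the sum, AND or OR for the carry, selected by C) is inferred from the text, and no electrical or adiabatic behaviour is modelled. The two pass paths in `dpl_xor` mirror the A controlled and B controlled transistors of Section 6.

```python
from itertools import product


def dpl_xor(a, b):
    """Logic-level model of the double-transmission XOR (hypothetical helper).

    A controlled path: pass B when A is low, B bar when A is high.
    B controlled path: pass A when B is low, A bar when B is high.
    Both paths must drive the same value onto the output node.
    """
    a_path = b if a == 0 else 1 - b
    b_path = a if b == 0 else 1 - a
    assert a_path == b_path, "the two current paths disagree"
    return a_path


def mux_full_adder(a, b, c):
    """Full adder built from XOR/AND/OR gate outputs selected by C (assumed wiring)."""
    x = dpl_xor(a, b)
    total = x if c == 0 else 1 - x          # sum: XOR when C = 0, XNOR when C = 1
    carry = (a & b) if c == 0 else (a | b)  # carry: AND when C = 0, OR when C = 1
    return total, carry


if __name__ == "__main__":
    # Exhaustive check against ordinary binary addition.
    for a, b, c in product((0, 1), repeat=3):
        s, carry = mux_full_adder(a, b, c)
        assert 2 * carry + s == a + b + c
        print(a, b, c, "->", "sum", s, "carry", carry)
```

The exhaustive loop confirms that selecting between the gate outputs with C reproduces the full adder truth table, and the assertion inside `dpl_xor` confirms that the two current paths always agree.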
The width ($W_p$ and $W_n$) of the PMOS/NMOS transistors has been taken as 5.0 µm. For the C&R section, the width ($W_n$) of the NMOS transistors has been taken as 2.0 µm and the width ($W_p$) of the PMOS transistors as 5.0 µm, with a gate length of 0.18 µm. Table 1 lists the final sizes of the transistors used in each design block of the DPTAAL full adder.

FIGURE 8: Simulation results of DPTAAL multiplier.

## 8. Simulation Results and Performance Analysis

Table 2 gives a qualitative comparison of three logic designs, CMOS, complementary pass transistor logic (CPL), and DPL, in terms of characteristics that influence circuit performance and energy consumption. In particular, the number of MOS logic networks, the output driving capabilities, the presence of input/output decoupling, the number of signal rails, and the robustness with respect to voltage scaling are given for the logic styles discussed [2]. In DPL, the robustness with respect to voltage scaling is high, which improves circuit performance at reduced supply ranges. Its symmetrical arrangement and double transmission characteristics compensate for the speed degradation arising from the use of PMOS and NMOS pass transistors.

An energy efficient full adder cell design using asynchronous adiabatic logic with double pass transistor has been implemented. The simulation results of the DPTAAL full adder are presented in Figure 6 for the various combinations of the inputs. The power clock sequence for this full adder structure is shown in these simulation results, and it is based on the conventional structure. These simulation results are obtained for a periodic sequence, as represented in the figure, propagated through the buffer chain.

TABLE 3: Transistor count comparison of full adders.

| Logic style | No. of transistors |
|------------------------------------|-----|
| Conventional CMOS | 28 |
| Double pass transistor logic (DPL) | 48 |
| DPTAAL full adder design | 65* |

\*DPTAAL is 35% larger than DPL.

4-bit and 8-bit multipliers with the energy efficient full adder using DPTAAL have been implemented and compared with conventional CMOS logic. The proposed design of the DPTAAL multiplier is presented in Figure 7 and its simulation results are presented in Figure 8. The transistor count comparison of CMOS, DPL, and DPTAAL full adder cells is given in Table 3. As Table 3 shows, the transistor count of the DPTAAL full adder is higher than that of existing designs; hence a large on-chip area overhead is associated with AAL design. Taking into consideration the gain in energy efficiency, the area overhead is acceptable. The area of AAL can be reduced by using more area efficient logical blocks, as discussed in [15].

The energy performance of the DPTAAL full adder is compared with that of the conventional CMOS full adder in Table 4. The energy of these full adders is specified in fJ (femtojoules) for operating frequencies from 1 MHz to 300 MHz. The symmetrical arrangement and double transmission characteristics of the adiabatic DPL gates improve the circuit performance in the DPTAAL full adder. Asynchronous operation is achieved by the control and regeneration (C&R) block to reduce the power clock system switching losses. By combining these techniques, the DPTAAL full adder design features the lowest energy consumption per addition as compared with the conventional CMOS design. HSPICE simulations showed energy savings up to 84% in this full adder design, maintaining proper functionality.

TABLE 4: Energy consumption of DPTAAL full adder.

| Logic design (energy in fJ) | 1 MHz | 10 MHz | 100 MHz | 200 MHz | 300 MHz |
|--------|----|-----|-----|----|------|
| CMOS | 75 | 75 | 75 | 75 | 75 |
| DPTAAL | 5 | 5.4 | 7.5 | 12 | 17.2 |

TABLE 5: Power comparison of conventional CMOS versus DPTAAL multiplier.

| Logic style (power in µW) | No. of bits | 1 MHz | 10 MHz | 100 MHz | 200 MHz | 300 MHz |
|-------------------|-------|------|------|------|------|-------|
| Conventional CMOS | 4 bit | 0.24 | 0.49 | 2.74 | 3.34 | 5.01 |
| Conventional CMOS | 8 bit | 0.35 | 0.67 | 3.44 | 6.81 | 11.84 |
| DPTAAL | 4 bit | 0.19 | 0.39 | 1.75 | 2.42 | 3.21 |
| DPTAAL | 8 bit | 0.25 | 0.51 | 2.42 | 5.02 | 8.09 |

The power consumption of the simulated $4 \times 4$ and $8 \times 8$ multipliers is reported in Table 5 for operating frequencies as low as 1 MHz and as high as 300 MHz. It can be observed by comparing the data presented in Table 5 that the DPTAAL design achieves significant power savings for clock rates ranging from 200 MHz to 300 MHz. The power comparison graph for the $4\times4$ and $8\times8$ multipliers is shown in Figure 9.

The multiplier design is studied on TSMC 0.18 µm CMOS process models in Tanner EDA tools with SPICE support, at 1.8 V supply voltage. The standard values of gate capacitances and other MOSFET model parameters were included in this simulation. The simulation parameters of this process technology are summarized in Table 6.

#### 9. Conclusion

In this paper, we have proposed a framework for designing a low power multiplier using an energy efficient full adder. A unique approach, double pass transistor with asynchronous adiabatic logic (DPTAAL), has been followed in the full adder design.
Double pass transistor logic (DPL) is a modified version of complementary pass transistor logic, which is used to improve the circuit performance at reduced voltage level. This technique is combined with asynchronous adiabatic logic (AAL) to obtain the energy saving benefits with improved circuit performance in the full adder design. In this DPTAAL design, asynchronous operation has been achieved by the control and regeneration (C&R) section, which generates the control signal for the next logical gate using the output signal of the previous gate.

FIGURE 9: Power comparison graph.

TABLE 6: Process parameters used for this simulation.

| Parameters | Value |
|------------------------------------|----------|
| Process technology | 0.18 µm |
| Supply voltage ($V_{DD}$) | 1.8 V |
| Ambient temperature | 0–70°C |
| Gate oxide thickness ($T_{ox}$) | 4 nm |
| Gate capacitance ($C_g$) | 2 fF/µm² |
| Minimum gate length ($L_{\min}$) | 0.18 µm |
| NFET threshold voltage ($V_{tn}$) | 0.39 V |
| PFET threshold voltage ($V_{tp}$) | -0.42 V |
| NFET drain current ($I_{Dsat}$) | 600 mA |
| PFET drain current ($I_{Dsat}$) | 260 mA |

The regeneration technique makes the control signal strong enough to drive the next logical gate, which helps to reduce the switching losses of the power clock system. The energy performance of the DPTAAL full adder was compared with that of the conventional CMOS full adder over the various frequency ranges, and significant energy savings of up to 84% were achieved. This energy efficient full adder cell is used in the multiplier design. The performance of this design is analyzed with 4-bit and 8-bit multipliers for operating frequencies as low as 1 MHz and as high as 300 MHz. The power results of the proposed multiplier design are compared with the conventional logic designs. It is observed that for frequencies between 200 MHz and 300 MHz, DPTAAL multiplier circuits consume less power than the conventional designs.
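The relative savings behind these comparisons can be recomputed directly from the values reported in Tables 4 and 5. The short script below is only an arithmetic aid: the data are copied from those tables, while the variable and function names are hypothetical and the definition of saving as (reference − proposed) / reference is an assumption about how the percentages were derived.

```python
FREQS_MHZ = [1, 10, 100, 200, 300]

# Table 4: full adder energy in fJ.
ADDER_ENERGY_FJ = {
    "CMOS": [75, 75, 75, 75, 75],
    "DPTAAL": [5, 5.4, 7.5, 12, 17.2],
}

# Table 5: multiplier power in microwatts.
MULT_POWER_UW = {
    "4 bit": {"CMOS": [0.24, 0.49, 2.74, 3.34, 5.01],
              "DPTAAL": [0.19, 0.39, 1.75, 2.42, 3.21]},
    "8 bit": {"CMOS": [0.35, 0.67, 3.44, 6.81, 11.84],
              "DPTAAL": [0.25, 0.51, 2.42, 5.02, 8.09]},
}


def saving(reference, proposed):
    """Fractional saving of the proposed design relative to the reference."""
    return (reference - proposed) / reference


def savings_row(reference_values, proposed_values):
    """Element-wise savings for two equally long rows of a table."""
    return [saving(r, p) for r, p in zip(reference_values, proposed_values)]


adder_savings = savings_row(ADDER_ENERGY_FJ["CMOS"], ADDER_ENERGY_FJ["DPTAAL"])
multiplier_savings = {
    bits: savings_row(rows["CMOS"], rows["DPTAAL"])
    for bits, rows in MULT_POWER_UW.items()
}

if __name__ == "__main__":
    for f, s in zip(FREQS_MHZ, adder_savings):
        print(f"full adder, {f} MHz: {100 * s:.1f}% energy saving")
    for bits, row in multiplier_savings.items():
        for f, s in zip(FREQS_MHZ, row):
            print(f"{bits} multiplier, {f} MHz: {100 * s:.1f}% power saving")
```

With this definition, the 84% figure quoted for the full adder corresponds to the 200 MHz column of Table 4 (12 fJ against 75 fJ).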
The DPTAAL multiplier circuits have been implemented and studied using 0.18 $\mu$ m TSMC technology file with 1.8 V