# IEEE 1972 Fault-Tolerant Computing Symposium A PUBLICATION OF THE IEEE COMPUTER SOCIETY Digest of Papers 1972 INTERNATIONAL SYMPOSIUM ON # FAULT-TOLERANT COMPUTING Newton, Massachusetts, June 19-21, 1972 Sponsored by the Technical Committee on Fault-Tolerant Computing of the IEEE Computer Society in cooperation with The MIT C.S. Draper Laboratory Cambridge, Massachusetts 72CH0623-9C F7460576 Copies available from: IEEE Computer Society 8949 Reseda Boulevard Northridge, California 91324 IEEE, Order Department 345 East 47th Street New York, N. Y. 10017 PRICE: \$12.50 Members \$15.00 Non-members ### TABLE OF CONTENTS | SESSIC | ON I: APPLICATIONS OF CODING THEORY Chairman: C. V. Srinivasan, Rutgers University, Brunswick, New Jersey | | |---------|-----------------------------------------------------------------------------------------------------------|-----| | 1. | A Residue Checker for Arithmetic and Logical Operations P. Monteiro and T. R. N. Rao | | | 2. | Look-Aside Techniques for Minimum Circuit Memory Translators W. C. Carter, K. A. Duke, D. C. Jessep | 1 | | 3. | A Design Technique for Synthesis of Fault-Tolerant Adders D. K. Pradhan and S. M. Reddy | 2 | | 4. | Arithmetic Algorithms for Error-Coded Operands A. Avizienis | 2 | | 5. | Design of Totally Self-Checking Check Circuits for M-Out of-N Codes D. A. Anderson and G. Metze | 3 | | SESSIO | N II: DIAGNOSIS-PRACTICE ORIENTED Chairman: H. Chang, Bell Telephone Laboratories, Naperville, Illinois | | | 1. | Detection of Multiple Faults in Combinational Circuits J. P. Richard, M. Courvoisier, M. Diaz, P. Azema | 3 | | 2. | A Comparison of Digital Simulation and Physical Fault Insertion in Diagnostic Test<br>Development | | | 3. | D. Corman and J. G. Burns | 4: | | 4. | D. A. Bastin, E. M. Girard, J-C. L. Rault | 4 | | 5. | M. A. Breuer | 5 | | | A. K. Susskind | 58 | | 7. | M. E. Fohl | 62 | | 8. | K. Nozawa and K. Ritani | 61 | | SESSION | G. C. Jain and H. G. Adshead | 73 | | | Chairman: J. King, IBM Research Center, Yorktown, New York Fault-Tolerant Programming | | | 2. | W. R. Elmendorf | 79 | | 3. | J. S. Miller and W. H. Vandever | 84 | | | B. R. Borgerson | 89 | | | J. R. Connet, E. J. Pasternak, B. D. Wagner | 94 | | | M O Malanter o W 1 11 Fra o 1 1 | 100 | | | 0 0 0111 | 109 | | 363310 | Chairman: D. Schertz, Bradley University, Peoria, Illinois | | |--------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----| | 1. | System Modeling and Testing Procedures for Microdiagnostics C. V. Ramamoorthy and L. C. Chang | 110 | | 2. | Fault Detection in Uniform Modular Realizations of Sequential Machines | 11/ | | | S. Ramos and A. R. Smith III | 114 | | | Multiple Fault Detection in Combinational Circuits; Algorithms and Experimental Results M-W. Du and C. D. Weiss | 120 | | | Easily Testable Realizations for Logic Functions S. M. Reddy | 126 | | 5. | Fault Locating Test Generation for Combinational Logic Networks Y. Koga and F. Hirata | 131 | | 6. | A Homing Sequence Generation Algorithm for Fault Detection in Asynchronous Sequential Circuits D. K. Chia and M. Y. Hsiao | 137 | | _ | Use of Spoofs for Faulty Logic Network Analysis | - | | 7. | F. W. Clegg | 143 | | 8. | On Numerical Bounds in Diagnosable Systems R. P. Vancura and C. R. Kime | 148 | | SESSI | N V: LOGIC DESIGN Chairman: P. E. Wood, Jr., Honeywell Information Systems, Waltham, Massachusetts | | | 1. | Computer Error Control by Testable Morphic Boolean Functions - A Way of Removing Hardcore W. C. Carter, A. B. Wadia, D. C. Jessep | 154 | | 2. | Design of a Microprogram Control for Self-Checking R. W. Cook, W. H. Sisson, T. F. Storey, W. N. Toy | 160 | | 3. | Recovery Strategies in the No. 2 Electronic Switching System P. J. Kennedy and T. M. Quinn | 165 | | 4. | Design of Asynchronous Sequential Machines for Fault Detection D. H. Sawin III, G. K. Maki, S. R. Groenig | 170 | | 5. | A Fault-Tolerant Asynchronous Sequential Machine W. W. Patterson and G. A. Metze | 176 | | 6. | An Iterative Cell Switch Design for Hybrid Redundancy D. P. Siewiorek and E. J. McCluskey | 182 | | SESSI | ON VI: MATHEMATICAL MODELS Chairman: A. I. Green, M.I.T. C.S. Draper Laboratory, Cambridge, Massachusetts | | | 1. | Figure of Merit for Fault-Tolerant Space Computers H. Hecht | 189 | | 2. | Modeling of a Bubble Memory Organization with Self-Checking Translators to Achieve High<br>Reliability<br>W. G. Bouricius, W. C. Carter, E. P. Hsieh, D. C. Jessep, A. B. Wadia | 19: | | . 3. | The Concept of Coverage and its Effect on the Reliability Model of a Repairable System T. F. Arnold | 200 | | 4. | Inherent Redundancy in Combinational Behaviors | 205 | | | | | | 5. | Probabilistic Models for Software Reliability Pre iction | 21 | ### AUTHORS AND CHAIRMEN OF SESSIONS | | Session | Page | | Session | Page | |--------------------|---------|----------|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------| | dshead, H. G. | 2 | 73 | Schertz, D. | 4 | Chairman | | nderson, D. A. | ī | 30 | Shooman, M. L. | attrast I sa se i <mark>6</mark> -a a con A | 211 | | rnold, T. F. | 6 | 200 | Siewiorek, D. P. | 5 | 182 | | vizienis, A. | i | 25 | Sisson, W. H. | 5 | | | zema, P. | 2 | 36 | | . <b> </b> | 160 | | astin, D. A. | 2 | | Smith, A. R. III | - | 114 | | | | 47 | Srinivasan, C. V. | 1 | Chairman | | orgerson, B. R. | 3 | 89 | Stakeberg, L-O. | 3 | 100 | | ouricius, W. G. | 6 | 193 | Storey, T. F. | 5 | 160 | | reuer, M. A. | 2 | 53 | Susskind, A. K. | 9 9 1 5 14 15 <b>2</b> 11 9 15 19 | 58 | | urns, J. G. | 2 | 42 | Toy, W. N. | 5 | 160 | | arter, W. C. | 1 | 14 | Vancura, R. P. | The state of the state of the | 148 | | | 5 | 154 | Vandever, W. H. | 3 | 84 | | | 6 | 193 | Wadia, A. B. | 5 | 154 | | hang, H. | 2 | Chairman | Wagner, B. D. | stanai lasi ga <b>g</b> a Teru | 94 | | hang, L. C. | 4 | 110 | Weiss, C. D. | 2010 - 10a - 21 <b>4</b> 0 Los 21 2 | 120 | | hia, D. K. | 4 | 137 | Wood, P. E. | 5 | Chai rman | | legg, F. W. | 4 | 143 | | Maradid a magasi i gubayi | Olima Lamia | | onnet, J. R. | 3 | 94 | | | | | ook, R. W. | 5 | 160 | | | | | orman. D. | 2 | 42 | | | | | burvoisier, M. | 2 | 36 | | | | | Maz, M. | 2 | 36 | | | | | Du. M-W. | 4 | 120 | | | | | Duke, K. A. | i | 14 | 45. | of as a way factored | | | Imendorf, W. R. | 3 | 79 | | by 9 5 ban wassast 4. | | | ohl, M. E. | 2 | | | on the the court management that | | | | | 62 | | MONEY DIOM | | | Hilley, G. C. | 3 | 105 | | CONTROL OF THE PROPERTY | | | Birard, E. M. | 2 | 47 | | SORA D A. Chinana Carrier. | | | Fray, F. G. | 6 | 205 | | | | | Green, A. I. | 6 | Chairman | | | | | roenig, S. R. | 5 | 170 | | | | | lecht, H. | 6 | 189 | | | | | dirata, F. | 4 | 131 | | saign of a bicropy of the | | | Haiao, M. Y. | 4 | 137 | | W. Cook, W. H. Sire | | | isieh, E. P. | 6 | 193 | | | | | Jain, G. C. | 2 | 73 | | | | | Jessep, D. C. | 1 | 14 | | | | | | 5 | 154 | | | | | | 6 | 193 | | | | | Cennedy, P. J. | 5 | 165 | | | | | (ime, C. R. | 4 | 148 | | | | | (ing. J. | 3 | Chairman | | | | | (oga, Y. | 4 | 131 | | A . Patterson of P. A. | | | Lindqvist, M. G. | 3 | 100 | | | | | taki, G. K. | 5 | 170 | | | | | AcCluskey, E. J. | 5 | 182 | and the second second | | | | Metze, G. A. | í | 30 | | | | | | 5 | 176 | * | | | | Miller, J. S. | 3 | 84 | | | | | Monteiro, P. | i | 8 | | | | | Montelius, G. | 3 | 100 | | | | | Nozawa, K. | 2 | 68 | | | | | Pasternak, E. J. | 3 | 94 | | | ** ** · | | | 5 | | | sent the set of the sections | | | Patterson, W. W. | - | 176 | | | | | Pradhan, D. K. | 1 | 20 | | | | | Quinn, T. M. | 5 | 165 | | | | | Ramamoorthy, C. V. | 4 | 110 | | | | | Ramos, S. | 4 | 114 | | | / | | Rao, T. R. N. | 1 | 8 | | | | | Rault, J-C. L. | 2 | 47 | | | | | Reddy, S. M. | 1 | 20 | | | | | | 4 | 126 | | | | | Richard, J. P. | 2 | 36 | | | | | Ritani, K. | 2 | 68 | | | | | Sawin, D. H. III | 5 | 170 | | nazemie u | | | | | | | | | # 1972 INTERNATIONAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING Symposium Chairman: Albert L. Hopkins, Jr. C.S. Draper Laboratopy and Department of Aeronautics and Astronautics Massachusetts Institute of Technology Cambridge, Massachusetts Vice Chairman: Sanford Cohen MIT C.S. Draper Laboratory Cambridge, Massachusetts Treasurer: Paul Barr Raytheon Company Sudbury, Massachusetts Registration: Daniel Dolan MIT C.S. Draper Laboratory Cambridge, Massachusetts Arrangements: Robert C. Millard MIT C.S. Draper Laboratory Cambridge, Massachusetts Digest Editor: Lutz P. Henckels General Radio Company Concord, Massachusetts Editor, IEEE-TC Special Issue: W. C. Carter IBM Research Center Yorktown Heights, New York Program Chairman: Gernot Metze University of Illinois Urbana, Illinois ### PROGRAM COMMITTEE Algirdas Avizienis Jet Propulsion Lab Pasadena, California Harvey Garner University of Pennsylvania Philadelphia, Pennsylvania James King IBM Research Center Yorktown Heights, New York Alfred Susskind Lehigh University Bethlehem, Pennsylvania William C. Carter IBM Research Center Yorktown Heights, New York Jack Goldberg Stanford Research Institute Menlo Park, California Donald Schertz Bradley University Peoria, Illinois Stephen Szygenda Southern Methodist University Dallas, Texas Herbert Chang Bell Telephone Labs Naperville, Illinois Alan I. Green MIT Draper Lab Cambridge, Massachusetts C. V. Srinivasan Rutgers University Brunswick, New Jersey Paul E. Wood Honeywell Waltham, Massachusetts The 1972 International Symposium on Fault-Tolerant Computing is the second annual symposium to be sponsored by the Technical Committee on Fault-Tolerant Computing of the IEEE Computer Society. This year's symposium is held in cooperation with the C.S. Draper Laboratory of the Massachusetts Institute of Technology, which has supported several members of the organizing committee. I wish to cite the efforts of Prof. Gernot Metze, Dr. Lutz P. Henckels, and Messrs. Daniel Dolan and Robert Millards who have all given their valuable enthusiastic, and long efforts to this symposium. I wish also to thank all those who have helped in many ways, including the authors themselves for their cooperation with the review and editorial activities, and the reviewers and program committee members whose efforts were indispensible. The papers in this volume are worthy successors to those in the highly successful 1971 Symposium. They reflect the rapid progress and increasing interest in the field of Fault-Tolerant Computing which is currently under way, and which promises to continue to grow. The growth of the field is interdependent with the dissemination and exchange of information among all of those interested in and participating in it. Inevitably, the dissemination process in symposia like this one tends to be selective. To those whose efforts and interests are not so well represented in this volume as they might wish, I urge their continued effort to make their contributions to future symposia in this series. Fault-tolerant computing must advance on a broad front if it is to advance at all. One of the purposes of the sponsoring Technical Committee is to stimulate advancement, and for this we require the effective support of large numbers of workers in the diverse areas of the field. Albert L. Hopkins, Jr., Symposium Chairman The program of the 1972 International Symposium on Fault-Tolerant Computing, the second symposium in a series begun just last year, attests to the continued growth of interest in this new area. The Program Committee has selected 38 papers that present state-of-the-art results in a wide range of topics in fault-tolerance and has arranged them in six sessions. Various aspects of fault diagnosis, from practical comparisons of the efficiency of different test generation methods to theoretical results on the diagnostic capabilities of subsystems, again dominate the program and two sessions with eight papers each are scheduled. There are also two related sessions on fault-tolerant logic design and system design, including software systems, a session on applications of error protection codes and the design of check circuits, and a session on models for reliability prediction in hardware and software and other models for fault-tolerant computing. In addition, a panel of designers of fault-tolerant systems will present a retrospective analysis of the relative merits of different schemes for achieving fault-tolerance. The program packs a wealth of information into three days. We hope you will share our enthusiasm and participate actively in the discussions. Germot Metze, Technical Program Chairman As Editor of the 1972 Conference Digest, I wish to express my appreciation to Albert Hopkins, Jr. for his advice and support and to Linda Barron, my secretary, for her indispensable help in preparing the Digest. I want to thank, in particular, the authors whose splended cooperation made my task an easy and enjoyable one. Lutz P. Henckels, Digest Editor P. Monteiro and T. R. N. Rao Department of Electrical Engineering University of Maryland College Park, Maryland 20742 ### ABSTRACT Until now the application of residue codes had limited itself to arithmetic type operations such as ADD, SHIFT, CYCLE, COMPLEMENT, etc. We establish here that residue codes can be very effectively used to check logical operations (AND, OR, EXOR) as well. The mathematical theory and logic presented here, enables us to design a residue checker organization to detect errors in all arithmetic and logical operations of a processor. Mathematical expressions characterizing the operations of a processor and checker are derived and are used to obtain a hardware implementation schematic. Index Terms: Residue checker, arithmetic and logical operations, separate adder and checker, residue codes, and self-checking processors. ### INTRODUCTION As the complexity of modern computers is increasing, there is a growing need for efficient self detection and correction of hardware faults in computers. The conventional approach of duplication or triplication, and matching or majority vote-taking of the outputs, has given way to the relatively cheaper approach of error detection through arithmetic codes. The theory and implementation of these codes has received considerable attention in recent years, but there remains a gap between the theory and the development of schemes which are easily and cheaply implementable. A class of codes known as AN codes has been studied in detail by Brown, Diamond [1][2]and others, and Avizenis [3] has demonstrated a practical implementation of a 15N code for checking arithmetic operations in a computer. The theory of other residue codes has also been studied in great detail, and various theorems establishing the error correcting properties of such codes have been proved [4][5][6][7][8]. For checking logical operations, various codes have been studied [10][11]. An estimate of the relative cost of error checking through coding, as against that using triplication has been arrived at [12]. However, hitherto error checking by means of residue codes had restricted itself to checking arithmetic operations only. It is our aim in the following paragraphs, to arrive at a set of expressions for the results of all the arithmetic, as well as logical operations in the processor, and then describe a single unit which can be used to perform concurrent diagnosis of the above mentioned operations. It must be mentioned here that the circuitry for addition and logical operations can also be checked by two-rail logic [15][16] where each Boolean variable is represented by two lines, such that the true and complemented logic functions are available at each stage. All errors caused by a single failure are detected by this method. ### Description of processor unit to be checked A block diagram of the unit to be checked is given in Fig. 1. The accumulator A is an n-bit register and holds the result at the end of every operation. $(B_{n-1}, B_{n-2}, \ldots, B_o)$ represents the n-bit memory operand, that is fed in through the n parallel input data lines. In addition we have an n-bit parallel adder, the shift rotate control logic, the circuitry used for performing logical operations on words of length n, and the OP-code decoder. Table 1 is a list of the operations that are performed by the processor (which are all the operations that will be checked). ### GENERAL THEORY OF RESIDUE CHECKING The residue code that will be used in the implementation of the checker, belongs to a class known as separate codes. In a separate residue code, a number N is coded as a pair (N, C(N)). C(N) is the check symbol for N. Addition of two code words $(N_1, C(N_1))$ and( $N_2$ , $C(N_2)$ ) yields the code word $(N_1 + N_2, \bar{C}(N_1))$ C(N2)), where a suitable operation ' \* has to be defined. Note that the addition of code words involves the two operations of '+' and '\*' on the information and check parts respectively. Also, if the equation $C(N_1)*C(N_2) = C(N_1+N_2)$ is satisfied, then the sum of two code words yields another code word. A model of a processor unit using a separate adder and checker is shown in Fig. 2a. Peterson [5] has proven a theorem of fundamental importance to schemes where adder and checker are independent units. Theorem 1. (Peterson). If there are fewer check symbols than integers permissible in the range of N, (of the model given in Fig. 2(a)) and if the check symbols C(N) satisfy $C(N_1)*C(N_2) = C(N_1+N_2)$ , then C(N) must be the residue of N modulo some base b, and \* is addition modulo b of these check symbols. Thus every separate code closed under addition is of the form $(N,|N|_b)$ , $(|N|_b)$ being defined as the least non-negative integer congruent to N modulo b), for some suitable b. Peterson's model can readily be generalized to include all elementary arithmetic and arithmetic type operations, such as SUB, SHIFT, CYCLE, etc. Fig. 2b is a model for the generalized separate processor and checker. Further, if addition in the processor is done modulo m, (where $m=2^n$ or $2^n$ -1 depending on whether 2's or 1's complement arithmetic is used, and n is the number of bits of the operand register), then we have the following result, first established by Garner $\lceil 4 \rceil$ , and also used by Rao $\lceil 8 \rceil$ in his error checking scheme for arithmetic operations. Theorem 2. If $N_1$ and $N_2$ are elements of $Z_m$ (where $Z_m$ denotes the ring of integers modulo m) then the $(N,|N|_b)$ code is closed under addition if and only if b divides m. Here b is called the check base. Thus if N represents the integer value of the accumulator contents at any instant, the residue $|N|_h$ is obtained in a check register, such that the contents of the accumulator, together with the contents of the check register form a code word at the end of an operation cycle. Let $N_1$ denote the integer value of the accumulator contents at the beginning of an operation cycle. Let $N_2$ denote the integer value of the input on the parallel data lines from memory. An operation denoted by $\Phi_1$ can be defined as a function of either $N_1$ , $N_2$ or both $N_1$ and $N_2$ . Thus $\Phi_1$ can be a unary or a binary function. The integer value of the result of this operation is denoted by R, such that (for a binary operation) $R = \Phi_1(N_1, N_2)$ . Thus R is the integer value of the output, which is stored in the accumulator at the end of an operation cycle. Let $N_{1b}$ denote the residue of $N_1$ modulo the check base b. We assume that $N_{1b}$ is maintained in the check register. Let $N_{2b}$ denote the residue of $N_2$ modulo b. Our aim is to find an operator $\Phi_{\rm c}$ , such that $\Phi_{\rm c}(N_{1b},N_{2b})$ = r, and r is the residue of R modulo b, thus preserving the relation of congruence modulo b between the contents of the accumulator and those of the check register. In symbolic language, we wish to preserve the relation $|\Phi(N_1,N_2)|_b - \Phi_{\rm c}(N_{1b},N_{2b})$ . If we can find such a $\Phi_{\rm c}$ for every $\Phi$ , error checking is simply reduced to checking the relation of congruence between the accumulator contents and the check register contents periodically. Any violation of this congruence can be set up as an error signal. In the following discussion we consider the accumulator to be of length n. In order to have simplified checking logic, $l^i$ s complement arithmetic will be used (which is equivalent to considering all operations modulo $2^n$ -1), and b will be of the form $2^k$ -1, where k divides n. The fault coverage at the gate level depends on the scheme of addition and the logic being used. The discussion of error coverage is given in Section 5, with particular reference to the coverage obtainable with a mod 15 residue checker. ### CHECKING ARITHMETIC OPERATIONS The checking of arithmetic operations has been discussed in a previous paper[8]. For each operation that is executed in the processor, there will be a parallel operation in the checker, such that at the end of the operation cycle, the checker maintains a residue of the accumulator contents modulo b (assuming there has been no error in the computation). The first part of Table 2 is a listing of the formulas for the results of various processor arithmetic and arithmetic type operations, and their corresponding checker operations. These can be easily derived along the lines of the derivation given below for a sample arithmetic type operation. Consider the SHR operation, which shifts the contents of the accumulator to the right once. Let A(t) denote the n-bit binary array represented by the con- tents of the accumulator at instant t. Let $A_i(t)$ denote the binary digit stored in the $i\!+\!1^{th}$ bit (from right to left) of the accumulator at time t. Let $\delta_I(A(t))$ denote the integer value of the accumulator at time t, such that $\delta_I(A(t))=N_1$ . Assume each operation takes one unit of time. Let the operator $\Phi$ denote the SHR operation, and let R denote the integer value of the result of the operation, which is stored in the accumulator at the end of the operation. We have, $$R = \delta_{I}(A(t+1)) = \Phi(N_{1}, N_{2})$$ $$A(t+1) = (A_{n-1}(t+1), A_{n-2}(t+1) \dots A_{0}(t+1)) ,$$ where $$A_{n-1}(t+1) = 0$$ $$A_{j}(t+1) = A_{j+1}(t) \quad \text{for} \quad j = 0, 1, \dots, n-2$$ $$R = \delta_{I}(A(t+1)) = \sum_{i=0}^{n-1} A_{i}(t+1) \cdot 2^{i}$$ $$= \sum_{i=0}^{n-2} A_{i+1}(t) \cdot 2^{i} = \sum_{i=0}^{n-2} A_{i+1}(t) 2^{i+1}/2$$ Let $$j = i + 1$$ $$R = 1/2 \sum_{j=1}^{n-1} A_j(t) \cdot 2^j + 1/2 A_0(t) - 1/2 A_0(t)$$ $$= 1/2 \sum_{i=0}^{n-1} A_j(t) \cdot 2^j - 1/2 A_0(t) = (N_1 - A_0(t))/2$$ Therefore the result r of the checker is given by $$r = |1/2(N_1 - A_0(t))|_b = |1/2(N_{lb} - A_0(t))|_b$$ Similarly formulas are derived for all other arithmetic and arithmetic type operations listed in Table 1. We can see that except for corrections in the form of the least and most significant digits from the accumulator during the SHR and SHL operations, the checker operates independently of the processor in a parallel fashion. Thus the checker unit will have a k-bit check register, a k-bit parallel adder, a decoder, and appropriate gating as shown in Fig. 3. ### CHECKING LOGICAL OPERATIONS In trying to use a residue code for checking logical operations, we immediately run into a problem. The residue code is closed under operations like ADD, SUB, CYCLE etc., but is not closed under bitwise logical operations such as AND, OR and exclusive - OR. However by use of a simple relation between arithmetic and logical operations, stated in the theorem below, we can make use of the same unit for checking both-arithmetic as well as logical operations. Notation: Let A(t) and B(t) each represent binary arrays of length n. Let $$\delta_{\mathbf{I}}(\mathbf{A}(t)) = \mathbf{N}_{\mathbf{I}}(t)$$ . $\delta_{\mathbf{I}}(\mathbf{B}(t)) = \mathbf{N}_{\mathbf{I}}(t)$ Let ' ' ', ' V' and ' \theta' denote the bit-by-bit AND, OR and exclusive-OR operations on arrays A(t) and B(t). $$\delta_{\mathbf{I}}(\mathbb{A}(\mathfrak{t})) + \delta_{\mathbf{I}}(\mathbb{B}(\mathfrak{t})) = \delta_{\mathbf{I}}(\mathbb{A}(\mathfrak{t}) \cdot \mathbb{B}(\mathfrak{t})) + \delta_{\mathbf{I}}(\mathbb{A}(\mathfrak{t}) \vee \mathbb{B}(\mathfrak{t}))$$ Also $$\delta_{\mathtt{I}}(\mathbb{A}(t) \oplus \mathbb{B}(t)) = \delta_{\mathtt{I}}(\mathbb{A}(t)) + \delta_{\mathtt{I}}(\mathbb{B}(t)) - 2 \, \delta_{\mathtt{I}}(\mathbb{A}(t) \cdot \mathbb{B}(t))$$ The theorem can easily be proved by showing that the relation is true for a single bit i of the two arrays, and then summing up from i = 0 to i = n-1. As an immediate consequence of the above we have Corollary $$\begin{split} &\left|\delta_{\mathrm{I}}(\mathbf{A}(t) \cdot \mathbf{v} \cdot \mathbf{B}(t))\right|_{\mathbf{b}} = \left|\mathbf{N}_{\mathrm{1b}}(t) + \mathbf{N}_{\mathrm{2b}}(t) - \left|\delta_{\mathrm{I}}(\mathbf{A}(t) \cdot \mathbf{B}(t))\right|_{\mathbf{b}}\right|_{\mathbf{b}} \\ &\left|\delta_{\mathrm{I}}(\mathbf{A}(t) \oplus \mathbf{B}(t))\right|_{\mathbf{b}} = \left|\mathbf{N}_{\mathrm{1b}}(t) + \mathbf{N}_{\mathrm{2b}}(t) - 2\left|\delta_{\mathrm{I}}(\mathbf{A}(t) \cdot \mathbf{B}(t))\right|_{\mathbf{b}}\right|_{\mathbf{b}} \end{split}$$ The formulas for the result of logical operations are given in the second half of Table 2. ### HARDWARE IMPLEMENTATION From the various formulas listed in Table 2 we can obtain sequences of microoperations that make up each arithmetic and logical operation. From the results of the previous section we see that the logical operations considered can be expressed in terms of the ADD, AND, SUB and CYCLE LEFT (CYL) operations. The OR operation reduces to ADD and AND operations followed by SUB. The EXOR operation reduces to ADD and AND followed by CYL and SUB. Thus checking of logical operations by means of a residue code requires additional circuitry in the checker, consisting of just n two-input AND gates, as the circuitry for performing ADD, SUB and CYL operations is already available in the checker. We can derive the Boolean equations corresponding to each sequence of microoperations for the checker, and these can then be translated into a hardware design. Design Studies have been carried out on a mod 15 residue checker for a 16 bit monolithic parallel processor [14]. The processor circuitry consists of about 800 gates. The checker circuitry consists of about 400 gates. The hardware increase is only of the order of 50%. Moreover the checker does not slow down processor operation. Error detection can be carried out during each instruction interpretation cycle, while the next operation is being decoded. A block diagram of the checker is given in Fig. 3. ### Error Coverage Although detailed studies on the exact error coverage obtainable by this scheme are not yet completed, some general comments are in order. The mod 15 residue check detects all failures for which the error is not a multiple of 15. All the gates that affect one, two or three successive bits of the accumulator are covered, as $|\pm 2^j|_{15} |\pm 3.2^j|_{15}$ , $|\pm 5.2^j|_{15}$ and $|\pm 7.2^j|_{15}$ are all non zero. All bursts of length 3 or less, and most bursts of length 4 or greater are detected. The coverage at the gate level for addition circuitry depends on the scheme of addition being used. In the example being considered, the 16 bit operand in the ALU is divided into bytes of four bits each. Thus there will be four groups of digits. During addition of two words, carries are rippled through each group of four bits, but there is a separate fast carry structure look-ahead logic for propagation of the carry between one group and the next. Langdon and Tang [13] have shown, that when such is the case, a failure in a single carry line may cause a burst, such that the error value is not necessarily $2^{i}$ for some $i(0 \le i \le 15)$ . is because the carries do not propagate up to the position expected if they were allowed to ripple all the way through. The faulty carry may affect the sum bits in the particular group within which it is generated, but the carry into the succeeding group is independently and correctly generated, and hence the sum bits of the next group are correct. For a detailed discussion on the type of errors caused by individual gate failures in carry look-ahead adders, the reader is referred to the paper by Langdon and Tang [13]. ### CONCLUSION The main effort in this paper, is directed towards showing how a single checker unit utilizing a residue code can be used for checking both, logical, as well as arithmetic operations in a digital computer. The cost of checking has been found to be about 50% of the cost of the processor, which is far below the cost involved in duplication and matching. Using a mod 15 residue code as an example, we have shown that apart from single errors, all bursts of length 3 or less and most bursts of length 4 are detected with this code. Thus, the mod 15 checker covers a large percentage of errors that may occur due to faults in the ALU. Future research in this area will concentrate on studies on syndrome generation and error correction using bi-residue codes. The theory of error correction for both, logical as well as arithmetic operations will be considered. ### ACKNOWLEDGEMENTS This work is supported in part by research grants NASA NGR-21-002-229, NSF GK-17308. ### REFERENCES - [1] Brown, D. T., "Error Detecting and Error Correcting Binary Codes for Arithemtic Operations," IRE Trans. on EC Vol. EC-9, Sept. 1960, pp. 333-337. - [2] Diamond, J. M., "Checking Codes for Digital Computers," Proc. of the IRE, Vol. 43, April 1955, pp. 487-488. | [3] | Avizenis, A., "Design of Fault Tolerant Com- | |-----|----------------------------------------------| | | puters," Proceedings of 1967 FJCC, Vol. 31, | | | pp. 733-743. | ### [4] Garner, H. L., "Error Codes for Arithemtic Operations," IEEE Trans. MEC Vol. EC-15, No. 5, Oct. 1966, pp. 763-770. - [5] Peterson, W. W., "On Checking an Adder," IBM Journal of Research and Development, Vol. 2, April 1958, pp. 166-168. - [6] Chien, R. T., "Linear Codes for Single Error Correction in Binary Arithmetic and Transmission," 1963 IEEE Natl. Conv. Rec., pt. 4, pp. 133-138. - [7] Henderson, D.S., "Residue Class Error Checking Codes," Proc. 16<sup>th</sup> Natl. Meeting of the ACM, Los Angeles, Calif., Sept. 1961. - [8] Rao, T. R. N., "Error Checking Logic for Arithmetic Type Operations of a Processor," IEEE Trans. on EC Vol. C-17, pp. 845-849, Sept. 1968. - [9] Rao, T.R.N., "Biresidue Error-Correcing Codes for Computers Arithmetic," IEEE Trans. on Computers, Vol. C-19, No. 19, May 1970, pp. 398-402 - [10] Hamming, R. W., "Error Detecting and Error Correcting Codes," Bell System Tech. Journal, Vol. 29, No. 2, April 1960, pp. 147-160. - [11] Peterson, W.W. and Rabin, H.O., "On Codes for Checking Logical Operations," IBM Journal, 3, 163, April 1959. - [12] Garcia, O.N. and Rao, T.R.N., "On the Methods of Checking Logical Operations," Proc. Second Annual, Princeton Conf. on Information Sciences and Systems, March 1968. - [13] Langdon, G.G., and Tang, C.K., "Concurrent Error Detection for Group Look Ahead Binary Adders." - [14] Periodic Progress Report, "Monolithic Parallel Processors, by RCA Electronic Components, Simerville, N.J., Prepared for Goddard Space Flight Center, Greenbelt, Md., Dec. 1968. - [15] F. F. Sellers, M. Y. Hsaio, L. W. Bearnson, "Error Detecting Logic for Digital Computers," McGraw-Hill, 1968. - [16] P. Fagg, "IBM S/360 Engineering," Fall Joint Computer Conference, pp. 205-237, 1964. TABLE 1: Description of Processor Operations. | Symbol | Op. Code | Description | |-----------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------| | Φ1 | ADD | Adds contents of accumulator and parallel data lines and stores result in accumulator. | | <sup>Ф</sup> 2 | SUB | Subtracts contents of parallel data lines from accumulator and stores result in accumulator. | | <sup>Ф</sup> 3 | SM | Subtracts contents of accumulator from parallel data lines, and stores result in accumulator. | | <sup>\$\Phi\$</sup> 4 | COMP | Complements the contents of the accumulator bitwise. | | <sup>Φ</sup> 5 | SHL | Shifts the contents of the accum-<br>ulator to the left once. | | <sup>Ф</sup> 6 | SHR | Shifts the contents of the accumulator to the right once. | | <sup>4</sup> 7 | CYR | Rotates the contents of the accumulator to the right once. | | <sup>\$\Phi\</sup> 8 | CYL | Rotates the contents of the accumulator to the left once. | | <sup>Ф</sup> 9 | CLA | Clears accumulator and loads value on parallel data lines into accumulator. | | Φ <sub>10</sub> | CLZ | Clears accumulator to zero. | | <sup>Ф</sup> 11 | AND | Performs bitwise logical AND operation on contents of accumulator and parallel data lines and stores result in accumulator. | | <sup>Ф</sup> 12 | OR | Performs a bitwise logical OR operation on contents of accumulator and input on parallel data lines and stores result in accumulator. | | <sup>Ф</sup> 13 | EXOR | Performs bitwise logical exclusive - OR operation on contents of accumulator and inpu on parallel data lines, and store result in accumulator. | | | | | TABLE 2: Characterization of Arithmetic and Logical Operations of a Processor | Operation $\phi_{\mathbf{i}}$ | Result $R = \phi_i(N_1, N_2)$<br>$m = 2^n - 1$ | Operation $\phi_{ ext{ci}}$ | $Q = \phi_{ci}(N_{1b}, N_{2b})$<br>$b = 2^k - 1$ | |-------------------------------|-----------------------------------------------------------------------------------------------------------------|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | j. 1-4 | A. ARITHMETIC | OPERATIONS | erection in Dinary Atlantes ()<br>ssion, 1 1963 SEE hall, Cray, 14<br>133, 138, | | $\phi_1(ADD)$ | $ N_1 + N_2 _{m}$ | ф <sub>с1</sub> | N <sub>1b</sub> + N <sub>2b</sub> <sub>b</sub> | | $\phi_2(SUB)$ | $ N_1 - N_2 _{m}$ | ф <sub>с2</sub> | $ N_{lb} - N_{2b} _{b}$ | | $\phi_3(SM)$ | $ N_2 - N_1 _{m}$ | Фс3 | $\left N_{2b} - N_{1b}\right _{b}$ | | $\phi_4$ (COMP) | - N <sub>1</sub> <sub>m</sub> , | ф <sub>с4</sub> | - N <sub>1b</sub> <sub>b</sub> | | $\phi_5(SHL)$ | $\left 2N_1 - A_{n-1}(t) \right _m$ | ф <sub>с5</sub> | $ 2N_{1b} - A_{n-1}(t) _{b}$ | | φ <sub>6</sub> (SHR) | $ 1/2(N_1-A_0(t)) _{m}$ | Фс6 | $ 1/2(N_{1b}-A_{0}(t)) _{b}$ | | φ <sub>7</sub> (CYR) | 1/2N <sub>1</sub> <sub>m</sub> | ф <sub>с7</sub> | 1/2 N <sub>1b</sub> <sub>b</sub> | | φ8(CYL) | 2N <sub>1</sub> <sub>m</sub> | ф <sub>с8</sub> | 2N <sub>lb</sub> b | | φ <sub>9</sub> (CLA) | N <sub>2</sub> | Фс9 | N <sub>2b</sub> | | $\phi_{10}(CLZ)$ | 0 | ф <sub>с10</sub> | O The state of | | φ <sub>11</sub> (SET) | 0 | $\phi_{\mathbf{cll}}$ | ena. The sed Red Like o. The Committee of o | | | B. LOGICAL O | PERATIONS | Systems, Month 1968 | | φ <sub>12</sub> (AND) | $\left \delta_{I}(A(t) \cdot B(t)) \right _{m}$ | ф <sub>с12</sub> | $\left \delta_{\mathbf{I}}(\mathbf{A}(\mathbf{t})\cdot\mathbf{B}(\mathbf{t}))\right _{\mathbf{b}}$ | | φ <sub>13</sub> (OR) | $\left \delta_{\mathbf{I}}(\mathbf{A}(\mathbf{t}) \ \mathbf{v} \ \mathbf{B}(\mathbf{t})) \right _{\mathbf{m}}$ | ф <sub>с13</sub> | $ N_{lb} + N_{2b} - \delta_{I}(A(t)) $ $B(t)) _{b} _{b}$ | | φ <sub>14</sub> (EXOR) | $\left \delta_{\mathbf{I}}(\mathbf{A}(\mathbf{t}) \oplus \mathbf{B}(\mathbf{t})) \right _{\mathbf{m}}$ | <sup>ф</sup> с14 | $ N_{lb} + N_{2b} - 2 \delta_{I}(A(t))$ $ B(t) _{b} _{b}$ | - Fig. 2(a): Peterson's Model for Separate Adder and Checker. $C(N_1)*C(N_1+N_2)$ Fig. 2(b): Generalized Model for Separate Processor and Checker. ### LOOK-ASIDE TECHNIQUES FOR MINIMUM CIRCUIT MEMORY TRANSLATORS W. C. Carter K. A. Duke\* D. C. Jessep, Jr. IBM Thomas J. Watson Research Center Yorktown Heights, New York 10598 ### ABSTRACT This paper will demonstrate two improvements in coding techniques that could be used for memory word coding. First, within the fixed structure of a Hamming SEC/DED code, an improvement can be obtained in circuit cost and operational speed over more conventional code implementations. Second, the mechanics of error correction in a fault tolerant computer may be carried out via conventional hardware means or by use of the existing system facilities such as the combination of the microprogram unit, local store, and the arithmeticlogic unit. These improvements may be obtained by the use of Rotational Coding schemes in conjunction with a technique called "Look-aside Correction". This paper will first show a generalized algorithm for specifying the parity check matrix of Rotational Codes. The structure implemented by the parity check matrix in this paper will be not merely encoding and decoding circuitry, but will translate between rotational code forms and byte-parity encoded forms. The unique feature of these translators is that use of the Rotational Code permits the error correction to be performed on only a subset of the data word bits, and only if a single error condition has been detected. The correction mechanism may be either a hardware logic circuit of firmware. The paper concludes with a comparison of the circuit requirements and correctional speed of the Hamming (72,64) Single-Error-Correcting, Double-Error-Detecting Code, as it would normally be implemented, and a Rotational Code Translator also operating on 64 data bits and eight check bits. The latter translator is seen to provide cheaper implementation with faster average speed of error correction. ### 1. INTRODUCTION ## 1.1. The Use of Codes in a Fault-Tolerant Computing System Computing system designers have utilized a variety of coding techniques to enhance the failure toleration capacity of the system, but normally at a significant increment to system cost. Moreoever, the coding circuitry used for reliability improvement can constitute a significant number of logic gates and even become a liability to its own objective if it is designed in an indiscriminate fashion. A recent paper [1] presented design techniques to be used for implementing memory word coding so that the logic circuitry used for both encoding and decoding can be designed to alert the system CPU when it suffers a logic gate failure itself. This function is performed in addition to the normal memory word error correction and detection functions performed by the circuitry. The extra cost in logic gates of this self-testability property is less than 7.5% of the total cost of coding circuitry for a 32 bit word and approaches 1.0% for larger memory words. An alternate vehicle to coding alone for achieving fault tolerance is that of replacing faulty units by spares. Recent work [2], however, has shown that limiting conditions exist, in a system using only sparing, beyond which no reliability improvements is afforded by the inclusion of additional spares beyond a specifiable maximum allotment. Moreoever, coding in conjunction with sparing can be shown [2,3] to be more efficient than sparing alone. In any such case, the coding must be implemented in an efficient manner so as not to detract heavily from normal system operation. ### 1.2. A Theory of Design of Coding Circuitry This paper will demonstrate two distinct improvements in the application of a particular coding technique as it would be used for memory word coding. First, even within a rather fixed structure of the chosen code, as it is specified by the matrix descriptors of coding theory, a dramatic improvement can be obtained in circuit cost and operational speed over more conventional code implementations. "Look-aside" correction, as explained in this paper, provides the basis for these improvements. Second, the mechanics of error correction in an ultra-reliable computer may be carried out via conventional hardware means or by use of the existing system facilities such as the combination of the microprogram unit, local store, and the arithmetic-logic unit. This, again, is a choice to be made by the designer. In either case, correction in a "Look-aside" mode need only be performed when necessary, not as an inherent part of the memory Read-out process. Hence, the primary purpose of this paper will be to demonstrate how minimum implementation cost and minimum delay coding circuitry can be prescribed for a particular class of codes. <sup>\*</sup> IBM - Systems Development Division South Road Poughkeepsie, New York 12601 ### 2. THE PARTICULAR CODE UNDER CONSIDERATION To properly preface the discussion of the code, it is necessary to delineate carefully some assumptions and terminology to be used in this discussion. The coding circuitry used will actually not merely encode or decode the memory word; the word read out of store will be transformed from its memory coded form into a byte-parity encoded form. The word to be stored will be removed from the bus in a byte-parity encoded form and transformed into an encoded form for storage. Hence, the coding circuitry performing the encoding and decoding is actually translating from one coded form to another, depending on whether data is being placed on, or removed from the bus. Thus, the coding circuitry will be referred to as a translator. It will be assumed in this paper that the reader is familiar with the concept of the Parity Check Matrix (PCM) and the use of the parity relations between data bits and check bits prescribed by the rows of the PCM to form syndromes; this approach has already received ample treatment in the literature [4]. If the word to be encoded according to the PCM is composed of m bytes of b bits per byte, there are k = mb total data bits. Given k, the normal Hamming relationship [4] determines the number of check bits, r. The PCM will then have n = k + r columns and r rows (and syndromes). ### 2.1. The Rotational Code In terms of the PCM just discussed it is now possible to define a class of codes, the Rotational Codes, which will serve as a focal point of this paper. The Rotational Codes are so named because of the appearance of the PCM. The PCM for the Rotational Codes is formed according to the following rules which apply to the case m = r. Modifications for m ≠ r will be discussed later. - (1) For the r check bit columns of the PCM, choose the r combinations of one 1 and (r-1) 0's so that the r columns have a 1 in the 1<sup>st</sup>, r<sup>th</sup>, $(r-1)^{st}$ ,...,2<sup>nd</sup> rows only, with 0's elsewhere in these columns. - (2) Starting with the columns of the PCM corresponding to only the first byte (the first b columns), assign the entire first row of these columns to be 1's. - (3) For the same b columns of the PCM in step (2), assign each column using the list in Table 1. First, pick b columns or as many columns as possible with three 1's and with min r ≤ b. If there are less than b such columns, repeat the process for five 1's, then for seven 1's, nine 1's, eleven 1's, until b columns have been selected. Table 2 shows the maximum number of columns with three, five, seven or nine 1's which may be chosen for r bytes for r = 4,...,11. - (4) The PCM columns for the succeeding bytes are obtained from the columns formulated for the first byte by vertical rotational of the PCM rows corresponding to byte 1. The row of b 1's, formulated in step (2) as the first row of the PCM columns corresponding to byte 1, now becomes the $r^{th}$ row for the columns corresponding to byte 2, the $(r-1)^{st}$ row for the columns corresponding to byte 3. Thus, the row of b 1's will be the $[1+((r+1-j)\text{mod }r)]^{st}$ row of the $j^{th}$ byte, $j=1,2,\ldots,r$ . Each row is then seen to rotate vertically one row as first the columns for the $j^{th}$ byte are observed and then those of the $j+1^{st}$ byte. of the PCM, there should be mb+r columns, each with an odd number of 1's in it. Each b columnwide section of the first mb columns will be a vertical rotation away from the sections immediately adjacent to it. Each of the final r columns will contain only a single 1 and each of these columns will be a vertical rotation away from the columns immediately adjacent to it in this checkbit portion of the PCM. There are two considerations paramount to this technique of specifying the PCM: first, variations in the algorithm for relative sizes of m, b and r, and second, the construction of a PCM in which all columns are distinct (no two are identical. To generalize the above algorithm for PCM construction, the following steps must be taken. Divide the m bytes evenly, if possible, into r sets, $T_1$ . If m = dr + e, o < e < r, put d + 1 bytes into the first e sets, $T_1$ , $T_2, \dots, T_e$ , $((d+1)b \le max b in Table 2)$ and d bytes into the last (r-e) sets, $T_{e+1}$ , $T_{e+2}$ , ..., $T_r$ . Let set $T_r$ correspond to the $T_r$ check bit and the $T_r$ correspond to the $T_r$ because $T_r$ into the first row under $T_r$ , $T_r$ because $T_r$ into the first $T_r$ such that $T_r$ into the first $T_r$ segment $T_r$ into the second $T_r$ segment $T_r$ into the second $T_r$ segment $T_r$ into the second $T_r$ segment $T_r$ into the first $T_r$ set $T_r$ into the second $T_r$ set $T_r$ into the second $T_r$ set $T_r$ into the second $T_r$ set $T_r$ into the second $T_r$ set $T_r$ in $T_r$ second s For the case of m < r, it may be necessary to rotate the rows more than one position at a time. It will not generally be possible, in such a case, to perform r total steps of rotation. If m > r, it may be necessary to subdivide one or more bytes of the PCM to obtain a step-wise rotation for each row. No matter what the relationship of m and b, a minimum number of 1's is used. Use of a minimal number of 1's (starting with three, then five, seven and so on) is advantageous because the fewer 1's there are in the PCM, the fewer the interconnections and the fewer Exclusive—OR (XOR) circuits used. It is best to choose the 1's such that the number in each row is balanced so the circuit delays are approximately equal. The final result of these steps is a PCM in which all columns are distinct and in which as few a number of 1's as possible appears, for minimal circuit or delay implementation. The next step is to consider how this PCM can be used to provide the encoding and decoding operations necessary and to demonstrate the possibilities for error correction by use of either normal hardware correctors or the microprogram unit and the arithmetic unit. ### 2.2. Decoding and Encoding with a Rotational Code The general structure of the PCM formulated for the rotational class of codes in the previous section will have characteristics of importance to coding. The PCM contains a minimum number of 1's for a Hamming Single Error Correcting -Double Error Detecting (SEC/DED) code [5]. set of columns corresponding to one byte of data will have one solid row of 1's appearing in it. Additionally, this same row will have other l's corresponding only to data bits l's appearing in it. This latter set of 1's will be referred to as the Parity Subset. In conventional coding schemes, the parity of all data bits which correspond to a l in the $i^{\rm th}$ row of the PCM is determined directly to form the syndrome S. In rotational coding schemes, the syndrome is formed in such a fashion as to permit formation of byte parity as an intermediate step in the process, thus facilitating bus transmission. Define the following variables: y<sub>i</sub> - the parity over the i<sup>th</sup> row Parity Subset x<sub>i</sub> - the parity over the i<sup>th</sup> byte p<sub>i</sub> - the parity bit to maintain odd parity across the i<sup>th</sup> byte. The distinction is made here on the parity of a byte $(x_1)$ and the parity bit for the same byte $(p_1)$ . If the parity for a byte is even, the parity bit will be a 1 if odd parity is required for error detection. Consider odd parity to be used with a rotational code PCM. Then, for the i row of the PCM, we have the parity equations: and $$x_i \oplus y_i \oplus c_i = 1$$ $x_i \oplus p_i = 1$ where " • " denotes the XOR operation and c<sub>i</sub> is the check bit which corresponds to a 1 entry in the i<sup>th</sup> row. Combining the two by an XOR, (1) $$p_i = y_i \cdot c_i$$ . The implication of this equation is that the parity bit for the i<sup>th</sup> data byte can now be generated as the parity of the i<sup>th</sup> row Parity Subset and the check bit, c, (and does not have to include the actual bits of the i<sup>th</sup> data byte). The i<sup>th</sup> date byte, with parity x, and parity bit p, are put into the MDR. Each byte of the MDR is parity checked to ensure data integrity before use after a READ or before encoding after a WRITE begins. Now, for the rotational code, then, the i syndrome, $s_i$ is formed from the expression $s_i = s_i + p_i$ which checks the MDR. Slight deviations in this are required if m # r because of the way the PCM was formed. If m # r, the parity and syndrome bits are formed as for m = r for syndromes corresponding to PCM rows in which a full byte-wide row of 1's is found. Otherwise, parity and syndrome bits are formed from the KOR of only the bits having 1's in the corresponding row of the PCM. $S_{i} = 1, i = 1, 2, ..., r, for no error$ in the ith byte. The no error signal, NE, then is formed from NE = $\bigwedge_{i=1}^{N}$ S<sub>i</sub> and is 1 for an error free word. If NE = 0, it becomes necessary to determine if it is a single or double error. Anytime a single data error exists, an odd number of syndromes change. If one, and only one, syndrome changes, the error is in a check bit [1,5]. This classification ability derives from the code property that each column has an odd number of 1's in it, with columns having single 1's in them reserved for check bit columns. Thus, a single error has been detected if NE = 0 and if the parity across the syndromes changes. Hence, the single error signal is $SE = \overline{NE} \wedge (S_1 \oplus S_2 \oplus ... + S_r)$ for r even and $SE = \overline{NE}$ $(S_1 \oplus S_2 \oplus ... \oplus S_r)^{\dagger}$ for r odd. If there is an error signal (NE = 0) and it is not a single error (SE = 0), then it is a double error (DE) and DE = $\overline{NE} \wedge \overline{SE} = 1$ . The basic steps of the Memory Read process, then, are given as follows: - 1. Formulate a parity bit for each byte by use of the PCM and Equation (1), - 2. Form the syndromes from the parity check on the byte parity and its parity bit (formed in 1) for each data byte. - 3. Determine if an error condition exists in the data read out. If the data contains no errors, gate the word out to the CPU; otherwise, correct any single error or notify the CPU of any double errors. The Memory Write process consists of accepting a set of parity encoded data bytes from the bus, checking the parity for each byte, re-encoding the data bits according to the PCM, and gating away the newly reencoded word into the memory. Once the byte parity of the incoming word has been checked, the check bits for the newly reencoded form can be generated using the circuitry provided to implement the PCM function for the Read process. Rearrangement of the basic equation, $c_i \oplus y_i = p_i$ , for the Read process gives $p_i \oplus y_i = c_i$ for the Write process. Hence the parity bit for each byte and the Parity Subset denoted by y in the equation above, are XOR'ed together to provide the check bits required by the data bits in each byte prior to storage. Once the check bits are generated, the word (data and check bits) can be stored. It is thus important to note that this translator literally makes use of the same logic circuitry to both decode and encode and the MDR parity check circuitry is used to check and form a syndrome. The hardware implementation of one such system