



ISSN: 2320-9984 (Online)



# International Journal of Modern Engineering & Management Research

Website: www.ijmemr.org

# FPGA Based an Unique Modulo Multiplier Design for High Speed Secure Network

Santosh Chouhan M.Tech Scholar (DC) Shri Ram Institute of Science and Technology, Jabalpur, (M.P.) [INDIA] Email: Santoshchouhan07@yahoo.in Prof. Sumit Sharma

Professor & Head of the Department Shri Ram Institute of Science and Technology, Jabalpur, (M.P.) [INDIA] Email: Sharma.sumit3@gmail.com

Prof. Ravi Mohan

Professor & Guide, Department of Electronics & Communication Engg, Shri Ram Institute of Science and Technology, Jabalpur, (M.P.) [INDIA] Email: ravimohan7677@yahoo.co.in

Abstract—This paper covers the implementation of the implementation of Network Intrusion Detection System (NIDS) using International Data Encryption Algorithm (IDEA). The current era has seen an explosive growth in communications. Applications like online banking, personal digital assistants, mobile communication, smartcards, etc. have emphasized the need for security in resource constrained environments. International Data Encryption Algorithm (IDEA) cryptography serves as a perfect network intrusion detection system (NIDS) tool because of its 128 bits key sizes and high security comparable to that of other algorithms. However, to match the ever increasing requirement for speed in today's applications, hardware acceleration of the cryptographic algorithms is a necessity. This study presents an efficient hardware structure for the modulo  $(2^n + 1)$  Multiplier, which is the most time and space consuming operation in IDEA. The proposed modulo multiplier saves more time, area and cost. The block size considered here is same as of traditional IDEA encryption algorithm which is of 64 bits with 16 bit sub-blocks.

**Keywords**—NIDS, IDEA, Cryptography, mobile communication, Modulo  $(2^n + 1)$ Multiplier.

#### **1. INTRODUCTION**

In the field of networking, role of network intrusion detection is immense. It is a very vital tool which provides the security against various external and internal threats in any network. To understand the theory of network intrusion detection, the understanding and knowledge of different threats in the network. To maintain network intrusion detection in a network involves the fulfilment of the security goals in the network which are Data Confidentiality, Integrity, Authentication and Non-Repudiation. Data confidentiality is achieved by means of Cryptography. International Data Encryption Algorithm (IDEA) is a block cipher designed by Xuejia Lai and James L. Massey of ETH-Zürich and was first described in 1991. It is a minor revision of an earlier cipher, PES (Proposed Encryption Standard); IDEA was originally called IPES (Improved PES). IDEA was used as the symmetric cipher in early versions of the Pretty Good Privacy cryptosystem. IDEA was to develop a strong encryption algorithm,

which would replace the DES procedure developed in the U.S.A. in the seventies. It is also interesting in that it entirely avoids the use of any lookup tables or S-boxes. When the famous PGP email and file encryption product was designed by Phil Zimmermann, the developers were looking for maximum security. IDEA was their first choice for data encryption based on its proven design and its great reputation.

# 2. The IDEA Encryption Algorithm:

- provides high level security not based on keeping the algorithm a secret, but rather upon ignorance of the secret key
- is fully specified and easily understood
- is available to everybody
- is suitable for use in a wide range of applications
- can be economically implemented in electronic components (VLSI Chip)
- can be used efficiently
- may be exported world wide
- Is patent protected to prevent fraud and piracy.

In this paper, the cipher used is a symmetric key block cipher .It takes its input as 64 bit plain text and gives a 64 bit cipher text as output using a 128 bit key. While working on plain text, it divides the input data in to 16 bit sub-blocks and operates on each block. It is described as one of the more secure block algorithm due to its high immunity to attacks. In spite of the fact that Data Encryption standard (DES) is another popular symmetric block cipher which is used in several financial and business application and its drawback is the short key word length .Moreover unlike DES, IDEA doesn't need any S-box or P-box is required for implementing this cipher. The most crucial module part of this algorithm is the design of the multiplier modulo a Fermat prime, which is one of the algebraic group operation used and entire speed of IDEA depends on this module. So designing the multiplier is a major during the hardware or software implementation of IDEA because its speed is a big issue when hardware implemented IDEA is used in real time application. The overall objective for hardware implementation of IDEA is to minimize the hardware requirements which result in efficient use of silicon area and at the same time improve the processing speed and high throughput of data. As the performance of IDEA cipher depends entirely on the modulo (2n+1) multiplier design, the main objective is to design an efficient and fast modulo multiplier which is to be used in the entire IDEA algorithm. The organization of the rest of the paper is as follows. The previous hardware and software implementations are covered in section II. Section III describes the IDEA cipher and its detailed operations as well as modules. Section IV describes the general architecture of the cryptosystem to be implemented and the proposed modulo multiplier architecture. Section V discusses the performance reviews and comparisons with previous schemes and section VI finally concludes the paper.

# **3. PREVIOUS WORK**

In spite of the fact that IDEA works with 16 bit word blocks, software implemented IDEA cannot reach the speed that is required for online encryption in high speed networks. IDEA was implemented in software by Ascom, the patent holder of IDEA, and it achieved an encryption rate of 23.53 Mbps.

# 4. THE IDEA ALGORITHM:

The proposed Encryption Standard (PES) is a block cipher introduced by Lai and Massey. It was then improved by the Lai, Massey and Murphy in 1991. This version, with stronger security against differential analysis and truncated differentials, was called the Improved PES (IPES). IPES was renamed to be the International Data Encryption

Algorithm (IDEA) in 1992. Claims have been made that the algorithm is the most secure block encryption algorithm in the public domain. IDEA is a symmetric, secret-key block cipher. The keys for both encryption and decryption must be kept secret from unauthorized persons. Since the two keys are symmetric, one can divide the decryption key from the encryption one or vice versa. The size of the key is fixed to be 128 bits and the size of the data block which can be handled in one encryption/decryption process is fixed to 64 bits. All data operations in the IDEA cipher are in 16-bit unsigned integers. When processing data which is not an integer multiple of 64- bit block, padding is required. The security of IDEA algorithm is based on the mixing of three different kinds of algebraic operations: EX-OR, addition and modular multiplication. IDEA is based upon a basic function, which is iterated eight times. The first iteration operates on the input 64-bit plain text block and the successive iterations operate on the 64-bit block from the previous iteration. After the last iteration, a final transform step produces the 64 -bit cipher block.



The decryption phase of IDEA is identical to that of the encryption phase. It uses the same sequence of operations as in the encryption phase. The only change is that the sub-keys are reversed and are slightly different. That means the sub-keys which are used in round 1 during encryption phase are manipulated during last round of decryption phase. The sub keys used in decryption are either additive or multiplicative inverse of the sub-keys used in the encryption phase.



Figure: 2- RTL View of IDEA Algorithm

# Key Generation:

The 64-bit plaintext block is partitioned into four 16-bit sub-blocks, since all the algebraic operations used in the encryption process operate on 16-bit numbers. Another process produces for each of the encryption rounds, six 16-bit key sub-blocks from the 128 -bit key. Since a further four 16- bit key-subblocks are required for the subsequent output transformation, a total of  $52(= 8 \times 6 + 4)$ different 16-bit sub-blocks have to be generated from the 128-bit key. The 52 16-bit key sub-blocks which are generated from the 128-bit key are produced as follows:

- First, the 128-bit key is partitioned into eight 16-bit sub-blocks which are then directly used as the first eight key sub-blocks.
- The 128-bit key is then cyclically shifted to the left by 25 positions, after which the resulting 128-bit block is again partitioned into eight 16-bit sub-blocks to be directly used as the next eight key sub-blocks.
- The cyclic shift procedure described above is repeated until all of the required 52 16-bit key subblocks have been generated.

FPGA Based an Unique Modulo Multiplier Design for High Speed Secure Network *Author (s) : Santosh Chouhan, Prof. Sumit Sharma, Prof. Ravimohan | Shri Ram Institute, Jabalpur* 



Figure: 3- IDEA Algorithm Key generation

#### **5. DESIGN AND IMPLEMENTATION**

The main objective of implementing a cryptosystem is to increase the data encryption and decryption rate so as to increase the throughput. FPGA implementations of cryptosystem offer adequate speed so that it can be easily embedded in real time applications. To implement an algorithm in hardware, we have to first realize the architecture of the entire system. The general architecture of hardware implementation of a cryptosystem is shown in Figure 3.

#### A. Hardware Implementation of IDEA:

The IDEA cipher has three separate units other than key generation module. The performance and speed of hardware implemented IDEA depends on these three modules. These modules are:

- Multiplier Module.
- Addition unit.
- Inverse modulo Multiplier.

In each round of the 8 rounds of algorithm, the following sequences of events are performed:

- 1. Multiply\* P1 and K1
- 2. Add\* P2 and K2
- 3. Add\* P3 and K3

- 4. Multiply\* P4 and K4
- 5. XOR the results of step 1 and step 3
- 6. XOR the results of step 2 and step 4
- 7. Multiply\* the results of step 5 with K5
- 8. Add\* the results of step 6 and step 7
- 9. Multiply\* the results of step 8 with K6
- 10. Add\* the results of step 7 and step 9
- 11. XOR the results of step 1 and step 9
- 12. XOR the results of step 3 and step 9
- 13. XOR the results of step 2 and step 10
- 14. XOR the results of step 4 and step 10

Sequence of events followed in the output transformation round:-

- 1. Multiply\* R1 and K1
- 2. Add\* R2 and K2
- 3. Add\* R3 and K3
- 4. Multiply\* R4 and K4

Among these modules, the main component which controls the speed and performance of IDEA is the modulo  $(2_n + 1)$ multiplier module. It consumes the major portion of the clock cycles required by the entire algorithm. The modulus used in the multiplication is a Fermat prime which is  $(2_n +$ 1). One important thing here is that the operand 0 is treated as  $2_n$ . The implementation of this multiplier in hardware is the most difficult task because the word length of the operands is comparatively large and implementing the multiplication sequentially is really time consuming.

#### B. Multiplication Modulo (2n +1):

Multiplication modulo a Fermat prime (p) is used as an important operation in many algorithms and it is crucial in various applications like pseudorandom number generation, Arithmetic processing and Cryptography. In IDEA algorithm, this modulo multiplier plays a very important role in the throughput and speed. In general, a modulo multiplier consists of two stages, Multiplier module and modulo reduction. This paper mainly deals with the various multiplication schemes that have been implemented along with some newly proposed schemes.

The multiplier module is the stage where two binary n bit numbers are multiplied to form a product of 2n bits. The modulo reduction stage produce the product modulo the Fermat prime number and the final output becomes an n bit number. Various implementations have been done on this multiplier module so as to improve the efficiency of the cryptosystem. The most crucial part of multiplying two binary numbers is the generation of partial products. A lot of problems arise when two numbers are multiplied in a straightforward approach in hardware. As per human nature of calculation, when any expression is given as

 $xy \mod (2^{n}+1) = Z' \mod (2^{n}+1) = z$ 

At first the product is calculated by traditional method and then the modulo reduction is done by iterative subtraction method until the value falls under the range of 0 and 2\_ .But the drawbacks of this implementation is inefficient use of silicon area which increases the hardware cost and it is time consuming.

# C. Proposed Modulo Multiplier:

There are several multipliers existing for idea. Here we are presenting a new kind of modulo multiplier which is highly optimized as compared to previous one. The description for proposed multiplier are as follows. In general if we multiply the two n bit number, then we get 2n bit number. Store this 2n bit number (result) in to temporary register ie.t1,then make the length of modulo( $2^n+1$ ) equal to the length of 2n bit number ie.t1, by consented zeroes("0"s) after the LSB bit of the modulo  $(2^n+1)$  and store this value into t2. Here we defining the subsequent steps by RTL diagram, which is easy to understand.



Figure: 4- RTL diagram of Implemented Modulo Multiplier

We show the difference between normal modulo multiplier and the proposed multiplier. In normal method:



Here p=3, and q= 5, when multiply these two number then answer is m=15,the final output of the modulo multiplier is r=m mod  $2_n=1$ ,in this example compare the time taken by normal & proposed modulo multiplier.

In proposed method:

```
1111 <u>15 MOD 2</u>

-1000 (shifted by 2)

0111

- 100 (shifted by 1)

0011

- 10

0001
```

Total used clock pulse = 03

When compare the both method then the time taken by proposed multiplier is very less.

# 6. RESULTS AND OBSERVATIONS

The proposed multiplier is synthesized using VHDL and the synthesis report is:

|                                               |                      |            | mult Pr                   | rojec                 | t Status |                 |                    |                    |                      |       |     |
|-----------------------------------------------|----------------------|------------|---------------------------|-----------------------|----------|-----------------|--------------------|--------------------|----------------------|-------|-----|
| Project File:                                 | modulo2r             | plus 1mul  | xise                      | Parser Errors:        |          |                 |                    | No Errors          |                      |       |     |
| Module Name:                                  | mult                 |            |                           | Implementation State: |          |                 |                    | Synthesized        |                      |       |     |
| Target Device:                                | xc4vlx80             | -12ff1148  | 3                         | •Errors:              |          |                 |                    | No Errors          |                      |       |     |
| Product Version:                              | ISE 12.2             |            |                           | • Warnings:           |          |                 |                    | 2 Warnings (0 new) |                      |       |     |
| Design Goal:                                  | Balanced             |            |                           | Routing Results:      |          |                 |                    |                    |                      |       |     |
| Design Strategy:                              | Xilinx Def           | ault (unlo | cked)                     | • Timing Constraints: |          |                 |                    |                    |                      |       |     |
| Environment:                                  | System S             | ettings    |                           | Final Timing Score:   |          |                 |                    |                    |                      |       |     |
| Device Utilization Summary (estimated values) |                      |            |                           |                       |          |                 |                    |                    |                      |       |     |
| Logic Utilization                             | Used                 |            | Ava                       |                       | Availabl | vailable Util   |                    |                    | zation               |       |     |
| Number of Slices                              |                      |            | 405                       |                       | 35840    |                 |                    |                    | 1%                   |       |     |
| Number of 4 input LUTs                        | mber of 4 input LUTs |            | 7                         |                       |          | 7 71680         |                    |                    | 0%                   |       |     |
| Number of bonded IOBs                         |                      |            | 48                        |                       |          | 768             |                    |                    | 6%                   |       |     |
|                                               |                      |            |                           |                       |          |                 |                    |                    |                      |       |     |
|                                               |                      |            | Detailed Re               | port                  | s        |                 |                    |                    |                      |       | [-] |
| Report Name                                   | Statu                | IS         | Generated                 |                       |          | Errors          | Warnin             | Warnings           |                      | Infos |     |
| Synthesis Report                              | Currer               | nt         | Wed 24. Jul 20:38:45 2013 |                       | 5 20 13  | 0               | 2 Warnings (0 new) |                    | new)                 | 0     |     |
| Translation Report                            | Out of               | Date       | Thu 24. May 18:31:05 201  |                       | 5 2012   | 0 0             |                    |                    | 0                    |       |     |
| Map Report                                    | Out of               | Date       | Thu 24. May 18:31:15 2012 |                       | 5 2012   | 0 2 Warnings (0 |                    | ngs (0 i           | new) 2 Infos (0 new) |       |     |

The performance parameters which are to be accounted for implementing IDEA in hardware are:

### Throughput or data conversion rate:

It is taken as an important tool for measuring the timing performance of IDEA i.e. the amount of data processed and encrypted per unit of time.

# Area Requirements:

It can be reported either in terms of number of Look-Up Tables (LUTs) or number of slices.



Figure: 5- Simulation Waveform of Implemented Modulo Multiplier

### Maximum Clock frequency:

It is the maximum clock frequency that can be achieved by the design.

Single pipelined stage and the number of pipelined stages.

# Latency:

It denotes the delay i.e. the time taken for the input data to move to the output port. For a pipelined design, the latency is measured by product of delay of a single pipelined stage and the number of pipelined stages.

# 7. CONCLUSION

RSA Security goes on to say that IDEA was analyzed to measure its strength against differential cryptanalysis. The analysis concluded that IDEA is immune to that technique. In fact, there are no linear cryptanalytic attacks on IDEA, and there are no known algebraic weaknesses in IDEA. The only weakness of note was discovered by Daemen: using any of a class of  $2^{51}$  weak keys during encryption results in easy detection and recovery of the key. However, since there are 2<sup>128</sup> possible keys, this result has no impact on the practical security of the cipher for encryption provided the encryption keys are chosen at random. IDEA is generally considered to be a very secure cipher and both the cipher development and its theoretical basis have been openly and widely discussed.

# **REFERENCES:**

- [1] Modugu. R, Yong-Bin Kim, Minsu Choi, "Design and performance measurement of efficient IDEA crypto-hardware using novel modular a rith metic components", Instrumentation and Measurement Technology Conference (I2MTC), 2010 IEEE, 3-6 May2010,pp1222-1227.
- [2] R. Zimmermann, A. Curiger, H. Bonnenberg, H. Kaeslin, N. Felber, and W. Fichtner, "A 177mb/s VLSI

implementation of the international data encryption algorithm, "IEEE Journal of Solid-State Circuits, Vol. 29, 1994, pp. 303-307.

- [3] Rahul Ranjan and I. Poonguzhali, "VLSI Implementation of IDEA Encryption Algorithm", Mobile and Pervasive Computing (CoMPC-2008).
- [4] Somayeh Timarchi, Keivan Navi, "Improved Modulo 2<sup>n</sup> +1 Adder Design", International Journal of Computer and Information Engineering 2:7 2008.
- [5] X.Lai and J.L Massey "A Proposal for a New Block Encryption Standard," in advances in Cryptology – EUROCRYPT 90, Berlia, Germany: Springer Verlag pp. 389-404, 1990.
- [6] Antti H"am"al"ainen, Matti Tommiska, and Jorma Skytt", "6.78 Gigabits per Second Implementation of the IDEA Cryptographic Algorithm", 2002 Springer-Verlag, pages 760-769.
- [7] M.P. Leong, O.Y.H. Cheung, K.H. Tsoi and P.H.W. Leong "ABit Serial Implementation of the International Data Encryption Algorithm IDEA"
   ©IEEE 2000.
- [8] P. Kitsos, N. Sklavos, M.D. Galanis, O. Koufopavlou, "64 Bit Blockciphers: Hardware Implementations and Comparison analysis", 593-604, 3rd November,2004,Elsevier
- [9] Thaduri,M., Yoo, S. and Gaede, R, "An Efficient Implementation of IDEA encryption algorithm using VHDL", ©2004 Elsevier.
- [10] Allen Michalski1, Kris Gaj, Tarek El-Ghazawi, "An Implementation

Comparison of an IDEA Encryption Cryptosystemon Two General-Purpose Reconfigurable Computers"

- [11] Sarang Dharmapurikar and John "Fast Lockwood, and Scalable Matching for Network Pattern Intrusion Detection Systems" IEEE Selected Areas Journal on in Communications: Oct. 2006. Volume: 24, pp. 1781-1792.
- [12] Chiranth E, Chakravarthy H.V.A, Naga mohana reddy P, Umesh T.H, Chethan Kumar M., "Implementation of RSA Cryptosystem Using Verilog" International Journal of Scientific & Engineering Research Volume 2, Issue 5, May-2011.
- [13] Rajashekhar Modugu, Yong-Bin Kim and Minsu Choi, "A Fast Low-Power Modulo 2n + 1 Multiplier", Journal of IET Computers & Digital Techniques Jan-2011.

95