S.Roy

Work place: HIT, Haldia, India

Author Articles
A Compression & Encryption Algorithm on DNA Sequences Using Dynamic Look up Table and Modified Huffman Techniques

By Syed Mahamud Hossein, S. Roy

DOI: https://doi.org/10.5815/ijitcs.2013.10.05, Pub. Date: 8 Sep. 2013

Storing, transmitting, and securing DNA sequences are well-known research challenges, and the problem has grown with the increasing discovery and availability of DNA sequences. We present a DNA sequence compression algorithm based on a Dynamic Look Up Table (DLUT) and a modified Huffman technique. The DLUT consists of 4^3 (64) sub-strings, each 3 bases long. Each sub-string is coded by a single ASCII character from 33 (!) to 96 (`), and vice versa. Encoding depends on an encryption key chosen by the user from the four bases {a, t, g, c}, and decoding requires the decryption key provided by the user who encoded the data. Decoding therefore requires authenticated input. The sub-strings are combined into a Dynamic Look Up Table based pre-coding routine. The algorithm is tested on the reverse, complement, and reverse complement of the DNA sequences, and also on artificial DNA sequences of equivalent length. Speed of encryption and security level are two important measures for evaluating any encryption system. With the proliferation of ubiquitous computing systems, where digital content is accessible through resource-constrained biological databases, security is a very important issue. Much research has sought an encryption system that can run effectively in those biological databases. Protecting data from unauthorized users is the most challenging question in information security. The proposed method may protect the data from hackers. It provides three tiers of security: tier one is the ASCII code, tier two is the nucleotide (a, t, g, or c) chosen by the user, and tier three is a change of label or of node position in the Huffman tree. Compressing genome sequences will help increase the efficiency of their use. The greatest advantages of this algorithm are fast execution, small memory footprint, and easy implementation.
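The look-up stage described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the key base alters the table, so the rotation rule used here is an assumption, as are the function names `build_lookup_table`, `encode`, and `decode` and the choice to pass leftover bases through unchanged. The modified Huffman stage is omitted.

```python
from itertools import product

BASES = "atgc"


def build_lookup_table(key_base: str = "a") -> dict[str, str]:
    """Map each of the 64 triplets to one ASCII character in 33..96.

    Assumption: the key base rotates the base order, which changes the
    triplet-to-character assignment. The abstract does not give the
    actual keying rule.
    """
    if key_base not in BASES:
        raise ValueError("key must be one of a, t, g, c")
    start = BASES.index(key_base)
    order = BASES[start:] + BASES[:start]
    triplets = ("".join(t) for t in product(order, repeat=3))
    return {triplet: chr(33 + i) for i, triplet in enumerate(triplets)}


def encode(seq: str, key_base: str = "a") -> tuple[str, str]:
    """Replace each full triplet with its ASCII code.

    Returns the coded text plus any 1-2 leftover bases, which this
    sketch leaves uncoded (an assumption).
    """
    table = build_lookup_table(key_base)
    usable = len(seq) - len(seq) % 3
    coded = "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))
    return coded, seq[usable:]


def decode(coded: str, tail: str, key_base: str = "a") -> str:
    """Invert encode(); the same key base must be supplied."""
    inverse = {char: triplet for triplet, char in build_lookup_table(key_base).items()}
    return "".join(inverse[char] for char in coded) + tail


if __name__ == "__main__":
    coded, tail = encode("atgcatgcat", key_base="g")
    print(repr(coded), repr(tail))
    print(decode(coded, tail, key_base="g"))
```

Under this assumed keying rule, decoding with a different key base yields a different sequence, which is the behaviour the second security tier relies on.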
Since the program implementing the technique was originally written in C (Windows XP platform, TC compiler), it can run on other microcomputers with small changes, depending on the platform and compiler used. Execution is quite fast: all operations complete in fractions of a second, depending on the required task and the sequence length. The technique can approach an effective compression ratio of 1.98 bits/base and even lower. When a user searches for a sequence of an organism, an encrypted, compressed sequence file can be sent from the data source to the user. The file can then be decrypted and decompressed at the client end, reducing transmission time over the Internet. The result is an encryption and compression algorithm that provides a moderately high compression and encryption rate with minimal decryption and decompression time.
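For readers unfamiliar with the bits/base measure, the following short sketch shows how such a figure is usually computed: compressed size in bits divided by the number of bases. The function name `bits_per_base` and the file sizes in the example are hypothetical and are not taken from the paper.

```python
def bits_per_base(compressed_bytes: int, n_bases: int) -> float:
    """Compressed size in bits divided by the number of bases encoded."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return 8 * compressed_bytes / n_bases


if __name__ == "__main__":
    # One 8-bit ASCII character per 3 bases: the look-up stage alone.
    print(bits_per_base(1, 3))
    # Hypothetical sizes chosen only to illustrate a 1.98 bits/base result.
    print(bits_per_base(247_500, 1_000_000))
```

One 8-bit character per triplet works out to about 2.67 bits/base, which is above the 2 bits/base of a plain two-bit encoding of four symbols. A figure such as 1.98 bits/base therefore has to come from the further coding stage applied after the look-up table.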
