Optimizing Smith-Waterman Algorithm Using Residue Number System and Recursive Variable Expansion
[Thesis]
Bello, Hassan Kehinde
Gbolagade, Kazeem A.
Kwara State University (Nigeria)
2020
170
Ph.D.
Kwara State University (Nigeria)
2020
One of the greatest global challenges in bioinformatics is the speed and accuracy of Biological Sequence Alignment (BSA). Several studies on BSA indicate that the computational time required is high. Recently, there is a rapid exponential growth in the size of International biological database. This occurrence necessitates the urgent need for a reliable, accurate and high speed BSA algorithm that can meet the current and future demand in bioinformatics. Several algorithms like the Fast-All Algorithm (FASTA), Basic Local Alignment Search Tool (BLAST), Needleman-Wunsch Algorithm (NWA) and Smith-Waterman Algorithm (SWA) have been used to optimize BSA speed. These algorithms lack accuracy except for SWA which is however found relatively slower, due to number of computations involved in the search. It is therefore, desirable to optimize its performance with regard to its computational time. Some of the techniques that have been used to enhance the performance of SWA include the Recursive Variable Expansion (RVE), Profiling Method (PM) and Linear Systolic Array (LSA). These techniques have the following weaknesses: In RVE, some constraints have to be relaxed in the transformation; in profiling method, acceleration rate varies in each portion of SWA and in LSA, good performance depends on the simplification of Processing Element (PE). Consequently, attributes of Residue Number System (RNS) which include, carry free addition, borrow free subtraction, digit by digit multiplication without partial product and parallel computation were explored in this work to enhance the performance of SWA. RNS has been used in many applications such as Digital Signal Processing (DSP) to reduce power consumption; and cryptography to enhance data security. The approaches used in this work for optimizing SWA included formulation of forward converter architecture, designs of SWA-RNS based modular adder and SWA-RNS based comparator; while a converter was designed using Chinese Remainder Theorem (CRT) method. The schemes that were also proposed include the Acceleration of Algorithm of Smith-Waterman using Recursive Variable Expansion (RVE method vs SWA), Biological Sequence Alignment using RNS (Profiling method vs RNS) and Systolic Array RNS Based Smith-Waterman Algorithm (Linear Systolic array method vs RNS). The Implementation of RNS on SWA (SWA-RNS based) was carried out on MATLAB 2016RA(9.0.341360), Profiling method vs RNS and Systolic Array method vs RNS were carried out on Spartan-III, 64-Bit version (Xilinx family). The metrics used for evaluation were processing time and memory utilization. The results were finally compared with existing state-of-the-art systems. The computational time of SWA and SWA-RNS based were 23.34 and 7.53 respectively, while the memory utilized respectively were 12.86 and 2.53. The run-time for profiling method vs RNS on profiling method were 7.23 and 5.43 respectively. Moreover, LSA method vs RNS based gave 372.4 Giga Cell Update Per Second (GCUPS) with 875 PE. The processing time and memory utilized upon applying RNS on SWA has 51.21% and 47.63% improvement respectively. Also, when the block size was 2, the algorithm became more optimized and redundancies were eliminated. The speed-up ratio of serial and RVE obtained were 1:801. This made RVE 801 times faster than the serial case with acceleration factor of 1.602 faster than state-of-the-art.