SNOW Work Package:
Parallel Fragment Assembly in Computer Clusters (finished)



DNA is a large molecule composed of four different chemical subunits, called bases. The sequence of these bases encodes the information that governs cell activity, and is therefore of great interest for research. Determining the order of the bases along a DNA strand is called sequencing, and can be achieved by a method called shotgun sequencing. This method splits the DNA into many random fragments. The sequence of bases in each fragment is then read by an automated sequencing machine. Once this step is complete, the fragments must be put back in order. This task, known as fragment assembly, is usually performed by computers, using the algorithms developed by Kececioglu and Myers.
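To make the reassembly step concrete, here is a minimal sketch of a greedy overlap-merge assembler in Python. It is a toy illustration, not the Kececioglu and Myers formulation: it assumes error-free fragments that all come from the same strand, and the names `overlap`, `greedy_assemble` and `min_len` are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best match is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Assumption: reads are exact substrings of the target (no sequencing errors).
    """
    # Remove duplicates (keeping input order) and fragments contained in another.
    unique = list(dict.fromkeys(fragments))
    frags = [f for f in unique if not any(f != g and f in g for g in unique)]

    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            # No remaining pair overlaps: what is left are the contigs.
            return frags
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)


reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
contigs = greedy_assemble(reads)
```

In this example the three reads overlap pairwise by four bases, so they collapse into a single contig. Production assemblers must also cope with sequencing errors and with reads taken from either strand, which this sketch ignores.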

Among the many implementations based on these algorithms, Phrap has become one of the most popular. Phrap was developed by Phil Green at the University of Washington.

However, fragment assembly algorithms are very time-consuming because of the intensive processing involved. The goal of this project is to parallelize the existing algorithms. To avoid unnecessary rework, the techniques will be applied to Phrap, since it is one of the most popular implementations and has been well tested. Combined with the cluster technology of the SNOW Project, this work is expected to speed up DNA sequencing and increase the flow of information in biological research.
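As an illustration of where parallelism can help, the sketch below distributes the all-pairs overlap computation over a pool of workers. This is an assumption made for illustration only; the text does not say which part of Phrap is parallelized or how. The function names are hypothetical, and the example uses a thread pool from Python's standard library purely so that it runs anywhere; on a cluster, the same task list would be handed to a process pool or to a message-passing layer instead.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from itertools import permutations


def suffix_prefix_overlap(pair: tuple[str, str]) -> int:
    """Length of the longest suffix of pair[0] that is a prefix of pair[1]."""
    a, b = pair
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def all_overlaps(fragments: list[str],
                 executor: Executor) -> dict[tuple[int, int], int]:
    """Score every ordered pair of fragments, one independent task per pair.

    The tasks share no state, so any Executor implementation can run them.
    """
    index_pairs = list(permutations(range(len(fragments)), 2))
    seq_pairs = [(fragments[i], fragments[j]) for i, j in index_pairs]
    # Executor.map yields results in the same order as the submitted tasks.
    scores = executor.map(suffix_prefix_overlap, seq_pairs)
    return {ij: s for ij, s in zip(index_pairs, scores) if s > 0}


reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
with ThreadPoolExecutor(max_workers=2) as pool:
    overlaps = all_overlaps(reads, pool)
```

The number of ordered pairs grows quadratically with the number of fragments, and each pair can be scored independently, which is why this step is a natural candidate for distribution across cluster nodes.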

Keywords: Bioinformatics, DNA Sequencing, Fragment Assembly, Cluster Computing, Parallel Programming


  1. Survey the state of the art
  2. Learn about bioinformatics, specifically DNA Sequencing
  3. Conduct advanced research on parallelization techniques
  4. Create parallel versions of DNA Sequencing algorithms
  5. Write M.Sc. thesis


             2002       2003
Activity    1 2 3 4    1 2 3 4

Related Work