Letu Qingge PhD Comprehensive Exam
- Tuesday, April 18, 2017 from 9:20am to 12:20pm
- Barnard Hall, Room 258
COMPUTATIONAL INVESTIGATION ON PROTEIN SEQUENCING AND GENOME REARRANGEMENT PROBLEMS
De novo protein sequencing and genome rearrangement are classical problems in bioinformatics. De novo protein sequencing aims to determine the complete amino acid sequence of a protein from mass spectrometry data, without relying on a database search. Genome rearrangement problems aim to reconstruct the evolutionary process relating two species; more specifically, they ask for the minimum number of rearrangement operations needed to transform one genome into another.
In this proposal, we first describe how we construct a target protein sequence from mass spectrometry data obtained by both top-down and bottom-up tandem mass spectrometry. In addition to the mass spectrometry data, we use a homologous protein as a reference during de novo sequencing to fill any remaining gaps in the constructed protein scaffold. Initial results on real data sets yielded over 96% coverage and 73.6-91% accuracy with respect to the target protein sequence.
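As a rough illustration of the basic idea behind de novo sequencing (not the pipeline described in this proposal), the sketch below reads amino acids off an idealized ladder of prefix masses by matching the difference between consecutive masses to known monoisotopic residue masses. The function name `read_ladder`, the tolerance value, and the noise-free ladder are assumptions made for the example; leucine and isoleucine have identical masses, so only `L` is listed.

```python
# Monoisotopic residue masses in daltons (I is omitted: same mass as L).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(prefix_masses, tol=0.01):
    """Hypothetical helper: turn an increasing list of prefix masses into a sequence.

    Each gap between consecutive masses is matched against RESIDUE_MASS.
    A gap with no unique match within `tol` is reported as '?'.
    """
    sequence = []
    for low, high in zip(prefix_masses, prefix_masses[1:]):
        diff = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - diff) <= tol]
        sequence.append(matches[0] if len(matches) == 1 else "?")
    return "".join(sequence)


def ladder_for(peptide):
    """Build the idealized prefix-mass ladder of a peptide (for demonstration only)."""
    masses = [0.0]
    for aa in peptide:
        masses.append(masses[-1] + RESIDUE_MASS[aa])
    return masses


if __name__ == "__main__":
    print(read_ladder(ladder_for("GASPK")))
```

Real spectra contain missing and spurious peaks, which is why gaps arise in practice and why additional evidence, such as the homologous reference mentioned above, is needed to fill them.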
Secondly, for the genome rearrangement problem under the double cut and join (DCJ) operation, we use a new randomized method to devise a fixed-parameter tractable (FPT) approximation algorithm for computing the DCJ distance, with approximation factor 4/3 + ε and running time O*(2^{d*}), where d* denotes the optimal DCJ distance.
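To make the notion of DCJ distance concrete, here is a minimal sketch for the easy special case only: signed genomes in which every gene occurs exactly once and all chromosomes are circular. In that setting the distance follows from the standard adjacency-graph formula d = N - C, where N is the number of genes and C the number of cycles. This is not the FPT approximation algorithm of the proposal, whose details the abstract does not give; the function names and the genome encoding (lists of signed integers) are assumptions made for the example.

```python
def extremities(gene):
    """Return the (left, right) extremities of a signed gene in reading order."""
    tail, head = (abs(gene), "t"), (abs(gene), "h")
    return (tail, head) if gene > 0 else (head, tail)


def adjacency_map(genome):
    """Map every gene extremity to the extremity it is adjacent to.

    `genome` is a list of circular chromosomes, each a list of signed ints.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular chromosome
            right = extremities(gene)[1]
            left = extremities(nxt)[0]
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance_circular(genome_a, genome_b):
    """DCJ distance N - C for signed, duplicate-free, circular genomes."""
    pa, pb = adjacency_map(genome_a), adjacency_map(genome_b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must contain the same genes, each exactly once")
    n_genes = len(pa) // 2
    seen, cycles = set(), 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        while x not in seen:  # alternate an adjacency of A with an adjacency of B
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return n_genes - cycles


if __name__ == "__main__":
    print(dcj_distance_circular([[1, -2, 3]], [[1, 2, 3]]))
```

In the example call, a single inversion of gene 2 separates the two genomes, so the reported distance is 1.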