PBL 5: Gene Translation and Prediction

 

1.         The direct co-linear relationship between a prokaryotic gene and the structure of the encoded polypeptide allows the determination of protein structure from DNA sequence data.

 

            a.         Several software systems are available to relieve us from the tedium of translating base sequences into amino acid sequences. Write an algorithm upon which such a system might be based. For this algorithm assume that:

 

                        i.         The input is a string of bases that includes a gene that may encode a protein.

                        ii.        The output is the sequence of amino acids.

                        iii.       Amino acid sequences shorter than 30 residues will be rejected as erroneous solutions.


                        Do not assume that the first three bases of the input string are the initiator codon. 
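
                        As a starting point for discussion, the assumptions above can be sketched in Python. This is one possible sketch, not a reference solution. It assumes the standard genetic code, reads only the forward strand as given, treats ATG as the only initiator codon, discards a reading frame that runs off the end of the input without a stop codon, and counts the initiator methionine toward the 30-residue minimum. The names CODON_TABLE and translate_orfs are hypothetical and chosen for illustration only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orfs(dna, min_length=30):
    """Return amino acid sequences for open reading frames in `dna`.

    Assumptions (not stated in the problem): forward strand only,
    ATG is the sole initiator, and a frame with no stop codon is dropped.
    """
    dna = dna.upper()
    proteins = []
    for frame in range(3):  # do not assume the gene starts at base 1
        peptide = None  # None means we are not inside a reading frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            aa = CODON_TABLE.get(codon, "X")  # "X" for ambiguous bases
            if peptide is None:
                if codon == "ATG":
                    peptide = ["M"]  # initiator codon opens a frame
            elif aa == "*":
                # Stop codon closes the frame; apply the length filter.
                if len(peptide) >= min_length:
                    proteins.append("".join(peptide))
                peptide = None
            else:
                peptide.append(aa)
    return proteins


if __name__ == "__main__":
    example = "CC" + "ATG" + "GCT" * 30 + "TAA" + "GG"
    for protein in translate_orfs(example):
        print(len(protein), protein)
```

                        A fuller treatment would probably also scan the reverse complement and consider alternative initiator codons such as GTG; both are left out here to keep the sketch short.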

            b.         With respect to protein structure, programs of the type considered above are not totally satisfactory. Explain why they are not.

 

2.         Because of the existence of introns within their genes, the task of translating DNA sequences of eukaryotes into amino acid sequences is much more difficult. You have already wrestled with the problem of gene prediction in considering the enumeration of genes in the human genome. Note that according to Pevzner (p. 154) "... no existing in silico gene recognition algorithm provides reliable gene recognition."

 

             a.         Briefly state the strategy employed in the spliced alignment approach to the prediction of exon-intron structure.

   

             b.         Explain each step in the spliced alignment algorithm given by Pevzner in Section 9.4 (pp. 157-167).