Sequence analysis tools

ICGEB Trieste, July 4, 2001

Elisabeth Gasteiger

Starting point: http://www.expasy.org and http://www.expasy.org/tools/ in particular.
Please use the ExPASy mirror sites.

  1. Translate the (eukaryotic) DNA sequence given in the file ftp://ftp.expasy.org/outgoing/tpanalseq/SEQ1.TXT on the anonymous FTP server of ExPASy. The protein matches the PROSITE pattern PS00236; NEUROTR_ION_CHANNEL; 1. Use this information to find the correct open reading frame.
    Determine the major biochemical characteristics of the protein sequence, i.e. molecular weight, isoelectric point, extinction coefficient…
    Does the protein contain a signal sequence, and/or transmembrane regions?
    Compare your results to the corresponding SWISS-PROT entry.
  2. Rabbit myosin (SWISS-PROT Q28641, heavy chain) is annotated to have a potential coiled coil region. Reproduce this prediction.
  3. Given the EMBL entry AJ012609, determine the functional domains of the corresponding protein from Drosophila. Use ProfileScan, Pfam HMM search (HMM = Hidden Markov Model), and SMART. Then do the same using InterProScan.
  4. Translate the EMBL sequence X94921 from Gulo gulo. Determine the potential transmembrane regions of this protein. Which family or families does it belong to? What is the “biological” validity of the different post-translational modification sites suggested by the ScanProsite tool?

(Hint: codon usage http://www2.ebi.ac.uk/translate/)
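
The frame search in exercise 1 can also be sketched offline. The Python below is a minimal illustration, not the ExPASy Translate or ScanProsite implementation: it translates all six reading frames with the standard genetic code and reports the frames whose translation matches a PROSITE-style pattern. The names six_frames and prosite_to_regex, the toy DNA string, and the toy pattern C-x(2)-[FY]-P are all assumptions made for the example; the real PS00236 pattern should be taken from its PROSITE entry. Only the basic pattern syntax is handled (no "<" or ">" anchors), and a sequence that uses a non-standard genetic code would need a different codon table.

```python
import re
from itertools import product

# Standard genetic code, listed in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def _repeat(match):
    """Turn a PROSITE repeat such as (2) or (2,4) into {2} or {2,4}."""
    low, high = match.group(1), match.group(2)
    return "{" + low + ("," + high if high else "") + "}"


def prosite_to_regex(pattern):
    """Convert a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("x", ".")                     # x -> any residue
        element = re.sub(r"\((\d+)(?:,(\d+))?\)", _repeat, element)
        parts.append(element)
    return "".join(parts)


# Toy input (hypothetical): frame +2 encodes M-C-A-A-F-P.
toy_dna = "GATGTGTGCTGCTTTTCCT"
toy_pattern = "C-x(2)-[FY]-P"  # illustrative only, NOT the PS00236 pattern

regex = re.compile(prosite_to_regex(toy_pattern))
frames = six_frames(toy_dna)
matching = [label for label, protein in frames.items() if regex.search(protein)]
print(matching)
```

For the exercise itself, the sequence from SEQ1.TXT and the pattern from the PS00236 entry would replace the toy values; a frame that matches the pattern and reads through without internal stops is the candidate open reading frame.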