
Introduction to wEMBOSS

Friday May 11, 2007

EMBOSS

Practicals



 

You are going to use wEMBOSS at the Swiss EMBnet node to work through some application examples. wEMBOSS at EMBnet can be accessed anonymously (username: "guestwemboss", password: "guest&wemboss"). Before you start working, please create a new project (e.g., using your family name as the project title). You can then continue working within your personal project.
 
1 Retrieve Sequences and Files

Retrieve the sequence of the Swiss-Prot entry P57727 in FASTA format with the seqret program. (Hint: Don't forget to first run showdb to check which databases are accessible and what they are called.)
Now retrieve the same entry using the EMBOSS application entret. What's the difference? Can you find any sequence variants produced by alternative splicing for this protein? (Hint: look at the Features (FT) lines of the output file.) If yes, how many?
Can you find the EMBL accession code(s) for the gene sequence of the protein? (Hint: look at the Cross-References (DR) lines of the output file.) Try to save the DNA sequence in different formats (e.g., GCG).
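
The FT and DR hints above amount to filtering the flat-file lines by their two-letter line code. A minimal Python sketch of that idea follows. It assumes the entret output has been saved as plain text; the entry shown is an invented toy (the accessions "TOY0001" and "TOY0002", the coordinates and the helper names are all hypothetical), so it does not reveal anything about P57727.

```python
# Toy flat-file excerpt. Every identifier and coordinate here is invented.
ENTRY = """\
ID   TOY_HUMAN     STANDARD;      PRT;   300 AA.
DR   EMBL; TOY0001; TOYP0001.1; -; mRNA.
DR   EMBL; TOY0002; TOYP0002.1; -; Genomic_DNA.
DR   PROSITE; PS99999; TOY_DOMAIN; 1.
FT   DOMAIN       10     50       Example domain.
FT   VAR_SEQ       1    100       Missing (in isoform 2).
FT                                /FTId=VSP_TOY1.
FT   VAR_SEQ     200    210       Missing (in isoform 3).
"""


def count_features(text, key):
    """Count FT lines whose feature key equals `key` (continuation lines are skipped)."""
    count = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) > 1 and fields[0] == "FT" and fields[1] == key:
            count += 1
    return count


def embl_accessions(text):
    """Return the primary accession from every 'DR   EMBL; ...' line."""
    accessions = []
    for line in text.splitlines():
        if line.startswith("DR   "):
            fields = [f.strip() for f in line[5:].split(";")]
            if fields[0] == "EMBL" and len(fields) > 1:
                accessions.append(fields[1])
    return accessions


print(count_features(ENTRY, "VAR_SEQ"))
print(embl_accessions(ENTRY))
```

To try it on real data, replace ENTRY with the contents of the file that entret produced.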

 
2 Find Protein domains/motifs

How many different domains are present in the Swiss-Prot entry P57727? (Hint: Use the EMBOSS application patmatmotifs to search your sequence against the PROSITE motif database. Please select the output option of the program that provides full documentation for the matching patterns.) Is the information about the protein domains also annotated in the Features (FT) lines of the Swiss-Prot entry file?
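
To get a feel for what a PROSITE pattern search does, the following Python sketch converts a pattern into a regular expression and scans a sequence with it. It is only an illustration under simplifying assumptions, not how patmatmotifs is implemented: the function names are hypothetical, the test sequence is invented, and only the basic pattern syntax is handled (x, [..], {..}, repeat counts, and the < and > anchors outside brackets). The pattern used is the well-known N-glycosylation site, N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Translate basic PROSITE pattern syntax into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("<", "^").replace(">", "$")  # terminus anchors
        element = element.replace("x", ".")                    # any residue
        element = element.replace("{", "[^").replace("}", "]")  # excluded residues
        element = element.replace("(", "{").replace(")", "}")  # repeat counts
        parts.append(element)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return 1-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() + 1 for m in regex.finditer(sequence)]


# Invented toy sequence: one valid site (NGSA) and one blocked by proline (NPTK).
print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_motif("MNGSANPTK", "N-{P}-[ST]-{P}"))
```

Short patterns like this one match very often by chance, which is one reason to read the full documentation attached to each hit.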

 
3 Pairwise sequence alignment

Please run blastp (using the web interface at the ExPASy server) against the Swiss-Prot database (tick the checkbox) to search for homologues of the human protein corresponding to the Swiss-Prot entry P57727.
The third best hit in the Swiss-Prot (sp) database in the blastp output corresponds to the mouse protein Q8K1T0. The first, second, fourth and fifth best hits correspond to splice variants of the protein. The sequence of one of the splice variants is stored in the P57727_splice_var.fasta file. Please download the file to your local disk.

Perform a pairwise global alignment between the P57727 sequence and the sequence of its splice variant stored in the P57727_splice_var.fasta file. Compare the result with the one obtained from a pairwise local alignment of the same pair of sequences. How can you explain the differences? Which domain is missing in the splice variant of the protein?
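
The contrast between global and local alignment can be seen on a toy case. The Python sketch below computes only the optimal scores, using a deliberately simple scheme (match +1, mismatch -1, linear gap -1) that is an assumption made for illustration; the EMBOSS alignment programs use substitution matrices and separate gap opening and extension penalties, so their numbers will differ. The sequences are invented: the second is the first with both ends removed.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch optimal score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman optimal score: best-matching pair of subsequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


full = "MKTAYIAKQR"   # invented full-length toy sequence
short = "AYIAK"       # same sequence with both ends missing
print(global_score(full, short), local_score(full, short))
```

The global score is pulled down by the gaps needed to span the missing ends, while the local score only reflects the shared region.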

 
4 Producing a restriction map
Use the DNA sequence (see example 1) of the gene that codes for protein P57727 and search for enzymes that cut at least once and at most twice, and that have a recognition site of at least six bases. (Hint: use the program wossname with the keyword 'restriction' as input, or check the list of programs grouped by type, to find a program that produces restriction maps.)
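
The three selection criteria above (minimum cuts, maximum cuts, minimum site length) can be sketched in a few lines of Python. This is an illustration of the filtering logic only, not of any EMBOSS program: the function names and the toy sequence are invented, only four common enzymes are listed, and because their recognition sites are palindromic, scanning the forward strand is sufficient here.

```python
# Recognition sites of four widely used enzymes (all palindromic).
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "HaeIII": "GGCC",
}


def count_sites(seq, site):
    """Count occurrences of `site` in `seq`, including overlapping ones."""
    count = 0
    pos = seq.find(site)
    while pos != -1:
        count += 1
        pos = seq.find(site, pos + 1)
    return count


def select_enzymes(seq, enzymes, min_cuts=1, max_cuts=2, min_site_len=6):
    """Keep enzymes whose site is long enough and occurs min_cuts..max_cuts times."""
    seq = seq.upper()
    selected = {}
    for name, site in enzymes.items():
        if len(site) < min_site_len:
            continue
        n = count_sites(seq, site)
        if min_cuts <= n <= max_cuts:
            selected[name] = n
    return selected


# Invented toy sequence: 1 EcoRI site, 3 BamHI sites, 0 HindIII sites, 1 HaeIII site.
TOY = "AAGAATTCTTGGATCCAAGGATCCTTGGATCCAAGGCC"
print(select_enzymes(TOY, ENZYMES))
```

Here BamHI is rejected for cutting too often, HindIII for not cutting at all, and HaeIII for its four-base site.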
 
5 Translation
Retrieve the complete sequence entry for the DNA sequence of the gene that codes for protein P57727. Which part of the DNA sequence corresponds to the coding sequence (CDS)? (Hint: use the program entret and check the FT lines.) Translate the coding sequence into the corresponding protein product. (Hint: use the program wossname with the keyword 'translation' as input, or check the list of programs grouped by type, to find a suitable program to translate DNA into protein.)
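
As a rough illustration of what translating an annotated CDS involves, the Python sketch below cuts the region given by 1-based inclusive coordinates (as found in FT lines) out of a DNA sequence and translates it with the standard genetic code. The function name, the toy sequence and its coordinates are invented; joined or reverse-strand CDS locations are not handled.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive), stopping at the first stop codon."""
    cds = dna.upper()[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy sequence with a CDS at positions 3..14 (ATG GCC AAA TAA).
print(translate_cds("ccATGGCCAAATAAgg", 3, 14))
```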

Compare this result with the one obtained by using the EMBOSS application getorf. Report only ORFs with a minimum length of 300 nucleotides. Do you see a difference between the annotated CDS and the predicted one? How can you explain this?
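
When comparing, it helps to keep in mind that an "ORF" can be defined in more than one way. getorf can report regions between two stop codons as well as regions between a start and a stop codon, depending on its settings. The Python sketch below implements only the first definition, on the forward strand, as an assumption-laden illustration; the function name and toy sequence are invented, and coordinates are 0-based with an exclusive end.

```python
STOPS = {"TAA", "TAG", "TGA"}


def stop_to_stop_orfs(dna, min_len=300):
    """Return (frame, start, end) for stop-free regions of at least min_len nucleotides."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = frame
        last = frame + (len(dna) - frame) // 3 * 3  # end of the last complete codon
        for i in range(frame, last, 3):
            if dna[i:i + 3] in STOPS:
                if i - start >= min_len:
                    orfs.append((frame, start, i))
                start = i + 3
        if last - start >= min_len:  # region running to the end of the sequence
            orfs.append((frame, start, last))
    return orfs


# Invented toy sequence: stop, ATG, three GCC codons, stop, two extra bases.
TOY = "TAAATGGCCGCCGCCTGAGC"
print(stop_to_stop_orfs(TOY, min_len=12))
```

Note that a stop-to-stop region may begin upstream of the annotated start codon, and that frames without any stop codon are reported in full.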
 
6 Designing primers

Design the 6 best primers for the DNA sequence of the gene that codes for protein P57727. How many primer pairs are considered OK by the program?
Design primers for the sequence again, but since you suspect vector contamination, exclude the first and the last 12 base pairs of the sequence.
Design an internal oligo to detect one of the sequence variants listed in the protein sequence entry P57727 (see example 1).
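
Two small calculations come up when setting the options for these tasks: the coordinates of the region that remains after trimming both ends, and a quick sanity check of a primer's melting temperature. The Python sketch below shows both. The function names are hypothetical, and the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is only a rough estimate for short oligos; primer design programs generally rely on more detailed thermodynamic models.

```python
def included_region(seq_len, trim=12):
    """1-based inclusive (start, end) left after removing `trim` bases from each end."""
    if seq_len <= 2 * trim:
        raise ValueError("sequence too short to trim both ends")
    return trim + 1, seq_len - trim


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2 * (A + T) + 4 * (G + C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


print(included_region(100))                # region to keep for a 100-base sequence
print(wallace_tm("ATGCATGCATGCATGCATGC"))  # invented 20-mer
```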

 

Questions can be sent to: L. Bordoli (Lorenza.Bordoli@unibas.ch) or L. Falquet (Laurent.Falquet@isb-sib.ch)

 

 

Latest update 2007-05-11