Phylogeny using Maximum Likelihood

Friday October 05, 2007


Exercise 1: What is a fish?
Dataset1: Alignment of cathepsin B protein sequences.

We will use cathepsin B as a marker of vertebrate phylogeny, as it has been sequenced in diverse species and shows little history of duplication or loss. We want to check whether fishes are monophyletic, paraphyletic or polyphyletic.
To find out which species the sequences come from, enter the names from this file into the batch search at Uniprot. Note that one sequence does not come from a vertebrate: use it to root the tree.

1.1. Save the data file and open it in Seaview. Does the alignment look good? Consider the proportion and distribution of gaps, and the amount of variability between the sequences.

1.2. Go to the Phyml website and build a tree from the alignment.

So: are fishes monophyletic, paraphyletic or polyphyletic? Under what definition of the word "fish"? Compare your results with the classification of species at the NCBI (links from Uniprot).
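
To make the vocabulary concrete, here is a minimal sketch of a monophyly test on a rooted tree. It is only an illustration: the nested-tuple tree format, the function names and the toy labels (out, A to E) are assumptions made for this example and are not taken from the cathepsin B dataset. A group is monophyletic if some clade of the rooted tree contains exactly the members of the group and nothing else.

```python
def leaf_set(node):
    """Return the set of leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return {node}
    names = set()
    for child in node:
        names |= leaf_set(child)
    return names


def clades(node):
    """Yield the leaf set of every clade (subtree) of a rooted tree."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if some clade contains exactly the members of `group`."""
    group = set(group)
    return any(clade == group for clade in clades(tree))


# Hypothetical rooted toy tree: "out" is the outgroup, A to E are placeholders.
toy_tree = ("out", (("A", "B"), ("C", ("D", "E"))))

print(is_monophyletic(toy_tree, {"D", "E"}))       # D and E form a clade
print(is_monophyletic(toy_tree, {"A", "B", "C"}))  # no clade holds exactly A, B, C
```

In the toy tree, the smallest clade containing A, B and C also contains D and E, so the group {A, B, C} is not monophyletic. The same test only makes sense on your Phyml tree once it has been rooted with the non-vertebrate sequence.
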
Exercise 2: Evolution of insulin in vertebrates
Dataset2: Alignment of insulin coding genes.

Note that the hagfish (Myxine) is a jawless vertebrate, useful for rooting the tree.

2.1. Consider the alignment in Seaview. How does it differ from the previous one?

2.2. Again, make the Phyml tree using the webserver.

2.3. Reconstruct the history of gene duplication and loss for vertebrate insulins.
Exercise 3: Nuclear receptors 1: Ultraspiracle and RXR
Dataset3: Alignment of retinoid X receptors (RXR) and their insect ortholog, Ultraspiracle (USP).

USP_TRIC1 is from the jellyfish Tripedalia cystophora and can be used to root the tree.

3.1. Look at the alignment and make the tree, as usual. What striking features does this tree have?

3.2. Redo the tree, setting the number of substitution rate categories to 1. What changes, and why?

3.3. Redo the tree without optimizing the topology. This gives you a distance tree instead of a likelihood tree. What do you think of it?
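
For step 3.3, it can help to see what a distance tree is built from: a matrix of pairwise distances between sequences. The sketch below computes two simple protein distances, the proportion of differing sites (p-distance) and its Poisson correction d = -ln(1 - p). This is a simplified stand-in chosen for illustration; it is not claimed to be the distance that Phyml computes, and the function names and the two toy sequences are invented for the example.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def poisson_distance(p):
    """Poisson-corrected distance: accounts for multiple substitutions at a site."""
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Two invented aligned fragments, for illustration only.
p = p_distance("MKT-LV", "MKSALV")
d = poisson_distance(p)
print(p, d)
```

The corrected distance is always at least as large as the raw proportion, and the gap widens as sequences diverge. Neither distance models rate variation among sites, which is worth keeping in mind when comparing the result of 3.3 with that of 3.2.
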
Exercise 4: Nuclear receptors 2: Nematode
In C. elegans, there are more than 270 predicted nuclear receptors, compared with 21 to 72 in other species (21 in Drosophila, 48 in human). They defied classification for several years because their sequences are very divergent.

Dataset4: Alignment of many nuclear hormone receptors, including unclassified receptors from the nematode C. elegans.

4.1. Look at the alignment. What do you think of it? Seaview allows you to save only certain regions of the alignment; try doing this.

4.2. Try making the tree on the webserver. What happens?

4.3. Try making the tree locally, in a Unix terminal window. Open the resulting tree using NJplot. What can you conclude?

4.4. Play with the options, as you did for RXR. What does this tell you about the tree-building methods?
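
Selecting regions of an alignment, as in step 4.1, can also be done programmatically. The sketch below measures the fraction of gaps in each column and keeps only the columns below a threshold. It is an illustration under stated assumptions: the alignment is held in a plain dictionary, the function names and the 0.5 threshold are arbitrary choices for this example, and the three short sequences are invented. It does not reproduce what Seaview does internally.

```python
def column_gap_fractions(alignment, gap="-"):
    """Return, for each column, the fraction of sequences that have a gap there."""
    seqs = list(alignment.values())
    if not seqs:
        raise ValueError("alignment is empty")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [sum(s[i] == gap for s in seqs) / len(seqs) for i in range(length)]


def trim_gappy_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Keep only the columns whose gap fraction is at most `max_gap_fraction`."""
    fractions = column_gap_fractions(alignment, gap)
    keep = [i for i, f in enumerate(fractions) if f <= max_gap_fraction]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Invented toy alignment, for illustration only.
toy_alignment = {
    "seq1": "MK--LV",
    "seq2": "MKA-LV",
    "seq3": "M-A-LV",
}

print(column_gap_fractions(toy_alignment))
print(trim_gappy_columns(toy_alignment))
```

Here the fourth column is a gap in every sequence and is dropped, while the columns with a single gap are kept. With very divergent sequences, how strict the threshold is can change which sites remain, and therefore the tree.
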

Latest update 2007-10-05