Geneva and Lausanne
Joint specialisation in bioinformatics
SIB members affiliated with the Universities of Geneva and Lausanne, as well as members of the departments of biology and computer science in these universities, have created a common bioinformatics program at the master's level. These common courses are integrated into the study plans of the:
- Master in Molecular Life Sciences distinction in bioinformatics (UniL)
- Master in Biology distinction in bioinformatics and data analysis in biology (UniGe)
and they are given either in Lausanne or in Geneva.
List of short projects
| Tutor | Dept/Inst | Title | Description |
|---|---|---|---|
| Evgeny Zdobnov | CEGG/UniGe | Computing trees from orthologous genes | The student(s) will work with sets of nested clusters: the clusters of orthologous genes that OrthoDB predicts for each radiation node of the species tree. The goal is to compute trees in Newick format from these nesting structures and to check the nesting for inconsistencies. All the data are in flat files or MySQL. The student(s) will reuse existing routines for set comparison and Newick Utils for working with trees, and will need to write new routines. |
| Frédérique Lisacek | PIG/UniGe | Extracting biological information to annotate the SugarBind database | The SugarBind database can be searched for bacteria, toxins, and viruses that bind to a particular sequence of sugars at the non-reducing terminus of an oligosaccharide. In its present form, none of the species and gene names cited in database entries are linked to taxIDs or to sequence accession numbers. The purpose of this small project is to write scripts that extract, for each SugarBind entry, the corresponding taxIDs from the Taxonomy DB and accession numbers from UniProtKB. Note that all species are correctly described, but gene names for the lectins/adhesins involved in sugar recognition are often missing or ambiguously cited. |
| Kaessmann Henrik | CIG/UniL | Development of a gene expression visualization tool | The Kaessmann lab (UNIL & SIB) is currently working with an extensive set of gene expression data obtained using RNA sequencing (RNA-Seq). Representing the numerical values of expression is a non-trivial task, since it is not always easy to generate graphical plots that are both precise and eye-catching. This project aims to develop a core program that generates several conventional plots (e.g. barplots) as well as some custom ones from a standard input file containing the gene expression values. One of the custom graphics would be a human or mouse body whose organs are colored as a function of the expression level of a given gene. Minimum development: a core program that can retrieve the relevant gene expression data from an input file and generate the corresponding plot, including the custom one described above. Efforts should be made to support different inputs (e.g. different standard file formats or user input) and to allow advanced customization of the generated graphics. Optional features: ideally the core program would be complemented by a graphical interface (a standalone application or a simple web page) that non-bioinformaticians can easily access and use. The student may consult potential end users to better meet their needs. Requirements: Python programming, use of external libraries (to be defined), basic knowledge of the Unix command line. Optional: a graphical library (Tk, ...) or XHTML/CSS. Supervision: the project will be supervised by a PhD student, Philippe Julien, who can help with programming, biological concepts, choice of libraries, Python and application development, as well as the software architecture in general. |
| Robinson-Rechavi Marc | DEE/UniL | Comparison of alignment quality tools (UniL code 1177) | Multiple Sequence Alignments (MSAs) are central in biology, for similarity search, structural alignment, drug design, domain/profile identification, and phylogeny. The reliability of MSAs is very important because problems in MSAs can strongly impact downstream analyses. Automatic methods for building MSAs are not perfect: they struggle when sequence divergence is high, when they try to align repeated or unstructured regions (loops in structures), or when there are errors in the sequences. To limit such problems, several tools have been proposed to detect the reliable regions of MSAs. They use different methodologies, which fall into three main groups: removing or masking unreliable columns, removing or masking unreliable sequences, and removing or masking unreliable residues. The goal of the project is to provide a framework to evaluate these tools, by parsing the results of the different MSA evaluation tools and mapping them into a common structure. The results will be used in the development of the Selectome database (http://selectome.unil.ch/). |
| Salamin Nicolas | DEE/UniL | Species delimitation using phylogenetic trees (UniL code 1028) | Supervisors: Laurent Vuataz, Glenn Litsios and Nicolas Salamin. Cataloging species diversity worldwide is a very difficult task, but recent technological advances make it possible to use molecular phylogenetic trees to identify the limits of a set of closely related species. Coalescent-based models have been proposed to provide objective criteria for species delineation. In this project, we will assess the relevance of these approaches using a data set of mayflies from Madagascar. The goal is to develop a simulation protocol to test existing methods and to produce tools that compute these models efficiently. Skills learned: phylogenetic tree reconstruction; molecular evolution; coalescent theory; R and/or Python programming. References: Monaghan et al. (2009) Accelerated species inventory on Madagascar using coalescent-based models of species delineation. Systematic Biology 58: 298-311. Papadopoulou et al. (2009) Comparative phylogeography of tenebrionid beetles in the Aegean archipelago: the effect of dispersal ability and habitat preference. Molecular Ecology 18: 2503-2517. |
| Keller Laurent | DEE/UniL | Genome functional annotation (UniL code 1181) | Molecular-genetic analyses have historically been limited to broadly studied organisms, most often those with medical or agricultural relevance. Thanks to new sequencing technologies (454, Illumina, SOLiD...), data that a few years ago would have cost millions of francs and the labor of hundreds of people over several years are now accessible to individual researchers working on non-model organisms that are interesting for their ecology or evolution. However, this wealth of genetic sequences cannot be manually curated and annotated, as was possible not so long ago. The most fundamental tool for assigning function based on protein homology is BLAST. Our purpose here is to use translated BLAST to find loci in a freshly sequenced genome that are homologous to functionally annotated proteins in other species. This project will involve developing an algorithm and implementing it as a pipeline to assist gene prediction, transfer of functional annotation, and extraction of the protein-coding sequence and the predicted protein sequence. The student will test the pipeline rigorously and apply it to the identification of genes related to aging in recently sequenced ant genomes. Supervisor: Eyal Privman (Keller Lab, Department of Ecology and Evolution, Biophore, room 4310, tel: 021 692 4182) |
| Jacques Rougemont | BBCF/EPFL | Completion of a transcription factor motif analysis module within our Python framework | |
| Jacques Rougemont | BBCF/EPFL | Integration and refactoring of 4C analysis scripts (R and shell) in our Python library | |
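
The first short project above (computing trees from nested clusters) can be illustrated with a minimal sketch. It assumes the clusters have already been loaded into a dictionary mapping a node name to a set of gene identifiers, with no two clusters holding exactly the same genes. The names `find_inconsistencies` and `to_newick`, the node labels and the gene identifiers are all hypothetical; this is not the OrthoDB data format or the existing lab routines.

```python
from itertools import combinations


def find_inconsistencies(clusters):
    """Return pairs of cluster names that overlap without one containing the other."""
    bad = []
    for (a, sa), (b, sb) in combinations(sorted(clusters.items()), 2):
        if sa & sb and not (sa <= sb or sb <= sa):
            bad.append((a, b))
    return bad


def to_newick(clusters):
    """Build a Newick string from properly nested clusters (name -> set of leaves)."""
    if find_inconsistencies(clusters):
        raise ValueError("clusters are not properly nested")

    # The parent of a cluster is the smallest cluster that strictly contains it.
    children = {name: [] for name in clusters}
    roots = []
    for name, members in clusters.items():
        supersets = [(len(m), n) for n, m in clusters.items() if members < m]
        if supersets:
            children[min(supersets)[1]].append(name)
        else:
            roots.append(name)

    def render(name):
        # Leaves already covered by a child cluster are printed inside that child.
        covered = set().union(*(clusters[c] for c in children[name]))
        parts = [render(c) for c in sorted(children[name])]
        parts += sorted(clusters[name] - covered)
        return "(" + ",".join(parts) + ")" + name

    if len(roots) == 1:
        return render(roots[0]) + ";"
    # Several top-level clusters: join them under an unnamed root.
    return "(" + ",".join(render(r) for r in sorted(roots)) + ");"


# Hypothetical example data.
example = {
    "Metazoa": {"g1", "g2", "g3", "g4"},
    "Arthropoda": {"g1", "g2"},
    "Vertebrata": {"g3", "g4"},
}
print(to_newick(example))
```

Checking every pair of clusters is quadratic in the number of clusters, which keeps the sketch short; a real data set would probably need a faster containment check.
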
List of master projects
| Tutor | Dept/Inst | Title | Description |
|---|---|---|---|
| Michel Milinkovitch | LANE/UniGe | In-silico whole genome comparisons for the analysis of genome, transcriptome, & interactome evolution | In the context of MANTiS, an application system for genome comparisons within an explicit phylogenetic framework that integrates functional and expression data, we are seeking creative and highly motivated computer-science students, or biology students with a strong interest in genome molecular evolution. Each Master Project can last 4 to 12 months. For students living in Lausanne, reimbursement of travel expenses is negotiable. |
| Michel Milinkovitch | LANE/UniGe | Development of stochastic heuristics for inferring the evolution of DNA and protein sequences | In the context of developing new heuristics for phylogeny inference (reconstructing the phylogenetic tree among living species), we are seeking creative and highly motivated computer-science students with a strong interest in biological molecular evolution and skills in optimisation and/or parallelisation techniques. Each Master Project can last 4 to 12 months. For students living in Lausanne, reimbursement of travel expenses is negotiable. |
| Jacques Rougemont | BBCF/EPFL | Development of a whole genome resequencing pipeline | We have many tools and scripts that have been used to characterize whole (bacterial) genomes where a reference strain is available. We would like to integrate them into our analysis portal at http://htsstation.vital-it.ch/. This implies re-evaluating the methods in light of recent developments, refactoring the code and providing new data representations. |
| Jacques Rougemont | BBCF/EPFL | Design and development of advanced analysis module for 4C-seq data | Chromosome conformation capture is a recent technique to interrogate 3D genome organization in vivo. We would like to integrate several statistical analysis and visualization algorithms into our existing data processing pipeline. |
| Salamin Nicolas | DEE/UniL | The effects of sequence evolutionary rates on phylogenetic reconstruction (UniL code 1048) | Daily supervision: Silvia Moreno. Supervisor: Nicolas Salamin. Building phylogenetic trees has become a simple but essential step in many areas of evolutionary biology. The availability of large numbers of DNA sequences in public databases makes it feasible to quickly assemble comprehensive phylogenetic trees for most organisms. However, the accuracy of the tree reconstruction is difficult to assess, as many biological factors can bias the phylogeny obtained. In this project, we will assess how differential rates of evolution hinder current reconstruction methods. We will focus in particular on the effects of sequence saturation and identify the consequences of using fast-evolving genes to resolve deep phylogenetic relationships. Skills learned: phylogenetic reconstruction; sequence alignment; computer simulations; Perl/Python and R programming. References: Struck et al. (2010) Detecting possibly saturated positions in 18S and 28S sequences and their influence on phylogenetic reconstruction of Annelida (Lophotrochozoa). Mol. Phylogenet. Evol. 48: 628-645. Xia et al. (2003) An index of substitution saturation and its application. Mol. Phylogenet. Evol. 26: 1-7. |
| Kaessmann Henrik | CIG/UniL | The contribution of microRNAs to gene expression change and phenotypic evolution in mammals (UniL code 1061) | Mammals are characterized by specific phenotypic traits that include lactation, hair, and relatively large brains with unique structures. Individual mammalian lineages have, in turn, evolved characteristic traits (e.g. with respect to anatomy, reproduction, life span) that distinguish them from others. In addition to genetic changes affecting the function of gene products, changes in gene expression levels have been suggested to underlie many or even most of these phenotypic differences. However, detailed gene expression comparisons were, until recently, restricted to closely related species such as humans and chimpanzees, owing to technological limitations (species-specific microarrays). Recent technological developments (high-throughput sequencing of RNAs) now allow detailed comparisons of transcriptomes between divergent mammals. Using RNA sequencing, our lab has recently generated comprehensive protein-coding gene expression data for a unique collection of somatic and germline tissues (cortex, cerebellum, liver, heart, kidney, testis) from representatives of all major mammalian lineages: placental mammals (humans, great apes, rodents), marsupials (opossum) and the egg-laying monotremes (platypus). In addition, we have recently generated corresponding data for microRNAs, which are short (∼22 nucleotide) noncoding RNA molecules that bind to complementary sequences on target mRNAs, thus promoting their degradation and/or their translational repression. The goal of this specific project is to assess to what extent changes in microRNA expression levels and mRNA target sequences contributed to changes in expression levels of protein-coding genes during mammalian evolution. This project will thus provide some of the first insights into mechanisms underlying gene expression change in mammals and the role of microRNAs in shaping mammalian phenotypes. Methods used in the project: data analysis using R; basic scripting (Perl, Python and/or Unix); basic statistics. Reference: Brawand, D., Soumillon, M., Necsulea, A., Julien, P., Csárdi, G., Harrigan, P., Weier, M., Liechti, A., Aximu-Petri, A., Kircher, M., Albert, F.W., Zeller, U., Khaitovich, P., Grützner, F., Bergmann, S., Nielsen, R., Pääbo, S., and Kaessmann, H. (2011) The evolution of gene expression levels in mammalian organs. Nature (in press). |
| Robinson-Rechavi Marc | DEE/UniL | Development of bioinformatics tools to study gene expression patterns (UniL code 1176) | Gene expression patterns (where and when a gene is expressed) are a key feature in understanding gene function, notably in development. Comparing gene expression patterns between animals is a major step in the study of gene function as well as of animal evolution. Our lab is developing Bgee (http://bgee.unil.ch/), a database designed to study the evolution of expression patterns in animals. The student will develop innovative tools to analyze gene expression patterns. Examples include identifying anatomical structures that are over-represented among the expression sites of a given list of genes (similar to Gene Ontology enrichment tests, but using anatomical structures; the statistical basis of the tests is already established in the lab), and mapping changes in expression patterns onto phylogenetic trees (to visualize the evolution of expression patterns). This work will have to be integrated into the existing Bgee application. The student will use Java and SQL. |
| Keller Laurent | DEE/UniL | Evolutionary genomics of aging in ants (UniL code 1180) | Background: Well-developed genomic infrastructures for the study of the red imported fire ant are available in the Keller group, including the recently published genomic sequence (Wurm et al. PNAS 2011). These resources open many opportunities to investigate the genomic basis of various aspects of ant biology and evolution, including social structure, immunity, invasive populations, sexual reproduction, and aging. There is tremendous variation in life span between castes in ants, with queens living over 500 times longer than males and 50 times longer than sterile workers. The fact that individuals of these different castes share the same genome yet exhibit such enormous variation in lifespan allows us to ask fundamental questions about the evolution of lifespan with respect to the expression of several groups of genes implicated in the aging process. Proposal: This project aims to identify aging-related genes in insects, refine their annotation in ant genomes, and then study their evolution (positive selection, conservation, duplications). The work consists of retrieving candidate genes related to aging from the literature or from public databases (for example the GenAge Database). The student will search for these genes in the genomes of seven different ant species that have been recently sequenced, and will analyze and characterize them using comparative genomics approaches. For example, the branch-site test for positive selection (implemented in PAML) will be used to look for adaptation of these genes in ants. Special interest will be given to the genes of the fire ant studied in the Keller group. Promising candidate genes will be passed on to experimental study in this species. This project mostly consists of intelligent use of existing software, although some programming might be helpful. Supervisor: Eyal Privman (Keller Lab, Department of Ecology and Evolution, Biophore, room 4310, tel: 021 692 4182) |
| Goudet Jerome | DEE/UniL | Population genetics of bats and their parasites: a molecular and bio-informatic investigation (UniL code 1186) | This project, jointly supervised with Dr P. Christe, will make use of the data available in our group for several species of bats and their parasites. Using existing data and/or generating new data, we will use computer simulations to investigate scenarios that could have led to the observed genetic data. |
| Goudet Jerome | DEE/UniL | Genetic architecture of adaptive traits (UniL code 1187) | Using individual-based computer simulations, you will investigate how selective pressure together with migration, mutation and drift affect the architecture of adaptive traits. |
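
The Bgee master project above mentions enrichment tests on anatomical structures, similar to Gene Ontology enrichment tests. As a rough illustration only, the sketch below uses a generic one-sided hypergeometric test, which is a common choice for GO-style enrichment. It is an assumption, not the statistical basis established in the lab, and the function name and the numbers are hypothetical. The listing says the project itself uses Java and SQL; Python is used here only to keep the example short.

```python
from math import comb


def enrichment_pvalue(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes expressed in the anatomical structure
    sample:     size of the gene list of interest
    hits:       genes from the list that are expressed in the structure
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    # Sum the probabilities of seeing `hits` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 background genes, 4 expressed in the structure,
# a list of 3 genes of which 2 are expressed there.
p = enrichment_pvalue(population=10, annotated=4, sample=3, hits=2)
print(f"{p:.4f}")
```

In practice one such test would be run per anatomical structure, so the resulting p-values would need a correction for multiple testing.
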
Contact
Dr. PALAGI Patricia
Swiss Institute of Bioinformatics
CMU, 1 Michel-Servet
CH-1211 Geneva 4
Switzerland