The Computational Biochemistry Research (CBR) group is interested in the modelling and analysis of biological problems at the molecular level. The CBR group's expertise lies in particular in search algorithms, optimization algorithms, mathematical modelling, and computational systems. The group is currently working on two large-scale projects, the Orthologous Matrix project and the Codon Bias project, and offers its results and general services through the internet and through the distribution of the Darwin system for bioinformatics computations.
The CBR group is interested in bioinformatics problems, in particular the modelling and simulation of molecular sequence data. The group is pursuing large-scale computational problems, such as the Orthologous Matrix Project (OMA) and a joint research project with the Biochemistry Institute, the Codon Bias project.
Projects and Services
The Orthologous Matrix Project (OMA)
The goal of the OMA project is to automatically produce reliable orthologous groups of proteins derived from entire genomes. The entire genomes are processed with various quality controls and then all of their encoded proteins are compared against all of the encoded proteins of other genomes. More than 1200 complete genomes have already been processed, roughly as many as are available.
The all-against-all alignment of proteins is by far the most demanding step from the computational point of view. Once all of the basic homologies have been detected, they are filtered through three steps (stable pairs, verified stable pairs, and cliques) to produce orthologous groups. The research aim is to have high confidence that the groups are indeed orthologous and not paralogous, even at the expense of occasionally breaking some orthology, because further computations, such as phylogenetic trees, suffer more from false positives than from false negatives.
The OMA project is very rich in sub-projects, which provide the main research topics for several graduate students working in the group. Examples include quality indices of phylogenetic trees, algorithms for proving paralogy, models and algorithms for the detection of lateral gene transfer, models for branch-length comparison/decision, various forms of estimating distances between sequences, quality of multiple sequence alignments, and assembling the information of various genes to estimate genomic distances.
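As an illustration of the clique step mentioned above, the following minimal sketch treats verified pairs as edges of a graph and lists its maximal cliques, so that every protein in a group is paired with every other member. It is not the OMA implementation: the function name `maximal_cliques`, the protein identifiers, and the use of plain Bron-Kerbosch enumeration are assumptions made for this example only.

```python
from collections import defaultdict


def maximal_cliques(pairs):
    """Return all maximal cliques of the undirected graph given by `pairs`."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    cliques = []

    def expand(current, candidates, excluded):
        # Bron-Kerbosch recursion without pivoting.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(neighbours), set())
    return cliques


# Hypothetical protein identifiers from three genomes (A, B, C).
verified_pairs = [
    ("A1", "B1"), ("B1", "C1"), ("A1", "C1"),  # fully connected triple
    ("A2", "B2"),                              # isolated pair
    ("B1", "C2"),                              # C2 is paired only with B1
]
groups = maximal_cliques(verified_pairs)
```

In this toy input, C2 does not join the triple because it is not paired with A1 and C1. Requiring full connectivity is one way to favour false negatives over false positives, in line with the aim stated above. The sketch does not decide what to do with a protein such as B1 that appears in more than one clique.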
The Codon Bias project
The Codon Bias project attempts to uncover the causes and purposes of codon bias in eukaryotes, in particular in yeast. This project is funded by the Swiss National Science Foundation and is run in conjunction with Prof. Yves Barral from the Biochemistry Institute at the Swiss Federal Institute of Technology in Zurich (ETHZ). A key result is a provable difference in the rate of expression for different levels of codon reusage in Saccharomyces cerevisiae. A measure of codon reusage called the TPI index has been devised. In the lab we have synthesized GFP with maximum TPI and GFP with minimum TPI (in various versions, but always using exactly the same codons), and the difference in expression has been measured.
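To make the idea of codon reusage concrete, the sketch below computes a toy statistic: among all recurrences of an amino acid in a coding sequence, the fraction that use the same codon as the previous occurrence of that amino acid. This is not the TPI index, whose definition is not given here. The function name `codon_reuse_fraction`, the four-codon table, and the example sequences are assumptions chosen for illustration.

```python
def codon_reuse_fraction(cds, codon_to_aa):
    """Fraction of amino-acid recurrences that reuse the previous codon."""
    # Split the coding sequence into codons, ignoring a trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    last_codon = {}  # amino acid -> codon used at its previous occurrence
    same = 0
    total = 0
    for codon in codons:
        aa = codon_to_aa[codon]
        if aa in last_codon:
            total += 1
            if codon == last_codon[aa]:
                same += 1
        last_codon[aa] = codon
    return same / total if total else 0.0


# Tiny excerpt of the standard genetic code, enough for the toy sequences.
TOY_CODE = {"GCT": "Ala", "GCC": "Ala", "AAA": "Lys", "AAG": "Lys"}

# Two hypothetical sequences with the same codon composition, different order.
clustered = "GCTGCTGCCGCC"    # identical codons placed next to each other
alternating = "GCTGCCGCTGCC"  # codon changes at every recurrence

high = codon_reuse_fraction(clustered, TOY_CODE)
low = codon_reuse_fraction(alternating, TOY_CODE)
```

Both sequences encode the same peptide with exactly the same codons, yet `high` exceeds `low`. This mirrors the design of the GFP variants described above, where only the order of the codons differs.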
New projects
In addition to the above, two new projects have been started in 2010/2011:
Gene Ontology annotation
We are studying methods to assess the quality of the automatic annotations based on the evolution of the automatic/experimental annotations.
Pattern detection in molecular sequences
We are developing algorithms which optimize probabilistic finite automata (PFAs) to recognize certain properties of molecular sequences. The approach can be applied at the DNA or amino acid level and allows these properties to be recognized automatically. The PFAs are trained on known sequences using new techniques akin to machine learning.
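For readers unfamiliar with PFAs, the sketch below shows how such an automaton assigns a probability to a sequence using a forward pass over its states. It covers scoring only, not the training or optimization developed in the project. The function name, the dictionary representation, and the two-state toy automaton over the symbols A and C are assumptions made for this example.

```python
def sequence_probability(sequence, initial, transitions, stop):
    """Probability that the automaton emits `sequence` and then stops.

    initial:     {state: probability of starting in that state}
    transitions: {(state, symbol): [(next_state, probability), ...]}
    stop:        {state: probability of stopping in that state}
    """
    # forward[state] = probability of emitting the prefix read so far
    # and being in `state` afterwards.
    forward = dict(initial)
    for symbol in sequence:
        following = {}
        for state, p in forward.items():
            for target, q in transitions.get((state, symbol), []):
                following[target] = following.get(target, 0.0) + p * q
        forward = following
    return sum(p * stop.get(state, 0.0) for state, p in forward.items())


# Hypothetical toy automaton: a run of A's followed by at least one C.
# For each state, outgoing and stopping probabilities sum to 1.
initial = {"s0": 1.0}
transitions = {
    ("s0", "A"): [("s0", 0.5)],
    ("s0", "C"): [("s1", 0.25)],
    ("s1", "C"): [("s1", 0.5)],
}
stop = {"s0": 0.25, "s1": 0.5}

p_match = sequence_probability("AAC", initial, transitions, stop)
p_other = sequence_probability("CA", initial, transitions, stop)
```

Here `p_match` is positive while `p_other` is zero, because the toy automaton has no transition that emits A after a C. A trained PFA would instead have its probabilities fitted to known sequences, so that sequences with the property of interest score higher than those without it.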
Website for Further Information
CBRG - Computational Biochemistry Research Group: http://www.cbrg.ethz.ch/