Bioinformatics approaches to protein function prediction and genome variation analysis
Modern biology increasingly relies on high-throughput techniques. This trend challenges computational biologists to quickly extract as much useful information from the data as possible. On the genomic side, this primarily means correlating phenotypic differences with observed nucleotide sequence variation. On the protein side, the challenge is generally to annotate protein function with reasonable accuracy. We believe that nucleic and amino acid sequences contain a large portion of the information necessary to address both of these directions.
Our main goal is to develop fast, accurate, and meaningful ways of analyzing this growing deluge of biological data and to bring these developments to the bench (or the bedside). To make our predictions we rely on a number of sequence-based features (including evolutionary information and the output of other predictors) and use a variety of machine learning methods (including neural networks, support vector machines, and random forests).
The active projects in the lab include:
- Development of an in silico mutagenesis methodology to identify functionally important residues in protein sequences. This direction addresses questions in non-synonymous SNP (nsSNP) analysis, mutation combinatorics (possibly applicable to phylogenetics), and function prediction.
- Analysis of the effects of genomic SNPs (non-coding or synonymous) on overall organism fitness. Initial steps in this direction focus on data collection and on outlining SNP characteristics that can distinguish functionally important SNPs from unimportant ones.
- Computational literature analysis (natural language processing) to extract information relevant to the two goals above from free text (scientific publications, lab records, etc.).
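As a rough illustration of the in silico mutagenesis idea in the first project, the sketch below enumerates every single-residue substitution of a protein sequence and averages a per-mutation impact score at each position. It is not the lab's method: the function names are hypothetical, and `toy_impact_score` is a deliberately crude placeholder (it only checks whether a substitution crosses a hydrophobic/non-hydrophobic boundary) standing in for a trained predictor.

```python
from typing import Callable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def enumerate_point_mutants(sequence: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, wild_type, mutant) for every single substitution."""
    for pos, wt in enumerate(sequence):
        for mt in AMINO_ACIDS:
            if mt != wt:
                yield pos, wt, mt

def toy_impact_score(wt: str, mt: str) -> float:
    """Hypothetical placeholder scorer, NOT a real predictor.

    Returns 1.0 when the substitution switches between the hydrophobic
    and non-hydrophobic groups, and 0.0 otherwise.
    """
    hydrophobic = set("AVILMFWC")
    return 1.0 if (wt in hydrophobic) != (mt in hydrophobic) else 0.0

def residue_sensitivity(
    sequence: str,
    score: Callable[[str, str], float] = toy_impact_score,
) -> List[float]:
    """Average the impact score over all substitutions at each position."""
    totals = [0.0] * len(sequence)
    counts = [0] * len(sequence)
    for pos, wt, mt in enumerate_point_mutants(sequence):
        totals[pos] += score(wt, mt)
        counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]

if __name__ == "__main__":
    # Arbitrary example sequence; positions with higher averages are the
    # ones this toy scorer considers more sensitive to substitution.
    for pos, value in enumerate(residue_sensitivity("MKVLA")):
        print(pos, round(value, 3))
```

Passing a different `score` callable would let the same loop rank positions with any per-mutation predictor, which is the part that would carry the real scientific content.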
Publications:
TrAnsFuSE refines the search for protein function: oxidoreductases. Integrative Biology. April 2012, 4(7):765-777
Harel A, Falkowski P, Bromberg Y.
SNPdbe: constructing an nsSNP functional impacts database. Bioinformatics. 15 February 2012, 28(4):601-2
Schaefer C, Meier A, Rost B, Bromberg Y.
Bioinformatics for personal genome interpretation. Brief Bioinform. January 2012, 13(4)
Capriotti E, Nehrt NL, Kann MG, Bromberg Y.
Comparative genomic and physiological analysis provides insights into the role of Acidobacteria in organic carbon utilization in Arctic tundra soils. FEMS Microbiology Ecology. 2012
Rawat S, Männistö MK, Bromberg Y, Häggblom MM
Disease-related mutations predicted to impact protein function. BMC Genomics. 2012, Vol. 13 Issue Suppl 4, p1-6
Schaefer C, Bromberg Y, Achten D, Rost B
SNP-SIG Meeting 2011: Identification and annotation of SNPs in the context of structure, function, and disease. BMC Genomics. 2012, Vol. 13 Issue Suppl 4, p1-2
Bromberg Y, Capriotti E