Computation is the use of computer technology to process information, and it is a major subject of computer science. It usually involves a process that is specified and executed in the form of an algorithm, a well-defined model, or a protocol.
The NIH Biomedical Information Science and Technology Initiative Consortium defines Computational Biology as the “development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems” [1]. The field is multidisciplinary by definition and draws on theoretical computer science, applied mathematics, statistics, biochemistry, chemistry, chemical engineering, biomedical engineering, biophysics, molecular biology, genetics, ecology, evolution, anatomy, neuroscience, and visualization, as well as specific sub-fields of those disciplines. Similarly, Computational Genomics is a field within genomics that studies the genomes of cells and organisms using computational approaches. Computational Genetics refers to the study of genes and their roles in inheritance, again using computational methods.
One specific computer science topic warrants special mention, namely the area of ‘string searching’ or ‘matching’ algorithms and the associated task of database searching. Searching ever larger databases for instances of a query of interest has been the mainstay of key activities in computational biology for more than two decades. Progressively more efficient algorithms for string searching were developed in the 1970s and 1980s, with the Needleman-Wunsch and Smith-Waterman algorithms being the key representatives of the era.
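To make the dynamic-programming idea behind these two algorithms concrete, here is a minimal Python sketch that computes only the optimal alignment score. The function name `alignment_score`, the scoring values (match +1, mismatch -1, gap -1), and the simple linear gap penalty are illustrative assumptions, not the parameters of any particular published tool. With `local=False` it follows the Needleman-Wunsch (global) recurrence; with `local=True` it follows the Smith-Waterman (local) recurrence.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1, local=False):
    """Return the best alignment score of strings a and b.

    local=False: global alignment (Needleman-Wunsch style recurrence).
    local=True:  local alignment (Smith-Waterman style recurrence).
    Scoring values are illustrative assumptions; the gap penalty is linear.
    """
    # prev[j] is the score for aligning the rows of `a` processed so far with b[:j].
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0  # best cell seen anywhere; only used for local alignment
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            score = max(diag, up, left)
            if local:
                score = max(score, 0)  # a local alignment may restart anywhere
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best if local else prev[-1]


print(alignment_score("AC", "AGC"))                            # global score
print(alignment_score("TTTACGTTT", "GGGACGGGG", local=True))   # local score
```

The table has one cell per pair of positions, so the running time grows with the product of the two sequence lengths. Keeping only two rows, as above, saves memory but gives up the traceback needed to recover the alignment itself.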
In the early days, the typical databases comprised amino acid sequences. However, as the ability to sequence DNA improved with time, so did the sizes of the nucleic acid and amino acid sequence databases being searched. This in turn led to the development of heuristic schemes that emphasized speed and statistical significance over exhaustive search results. The FASTA and BLAST families of algorithms have been the most successful and widely used of these approaches.
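The speed of such heuristics comes largely from not comparing the query against every database position. A common idea is to index short words (k-mers) of the database once and then look up only the exact word matches, or "seeds", that a query shares with it. The Python sketch below illustrates that seeding step only; it is a toy under stated assumptions, not the FASTA or BLAST implementation, and it omits seed extension and statistical scoring. The names `build_kmer_index` and `find_seeds` and the example sequences are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(database, k):
    """Map every length-k word to the (sequence id, position) pairs where it occurs.

    `database` is assumed to be a dict of {sequence id: sequence string}.
    """
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def find_seeds(query, index, k):
    """Return (sequence id, query position, database position) for each shared k-mer."""
    hits = []
    for qpos in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for seq_id, dpos in index.get(query[qpos:qpos + k], []):
            hits.append((seq_id, qpos, dpos))
    return hits


# Hypothetical toy database
database = {"s1": "ACGTACGT", "s2": "TTTTTTTT"}
index = build_kmer_index(database, 4)
print(find_seeds("GTAC", index, 4))
```

In a full seed-and-extend scheme, each hit would then be extended into a longer alignment and scored, and only alignments above a significance threshold would be reported. Sequences with no shared word are never examined, which is where both the speed and the loss of exhaustiveness come from.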
Other computational schemes, not related to searching, were also developed over time. Among the many such examples, we list the following representative ones: multiple sequence alignment, used in studying phylogenetic relationships and protein function; gene finding algorithms, used to delineate those DNA regions that correspond to protein-coding genes; protein annotation, used to determine, directly from the amino acid sequence, the function and structure of the corresponding protein, the location of active and post-translational modification sites, etc.; and genome-wide comparisons, which permit the comparison of whole genome sequences, the elucidation of the evolutionary history of a particular organism, and the discovery of complex evolutionary events or genetic susceptibility sites.
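As a small illustration of the gene finding task mentioned above, the following Python sketch scans the forward strand of a DNA string for open reading frames (ORFs): stretches that begin with the start codon ATG and end at the first in-frame stop codon. This is a deliberately simplified assumption about what a gene looks like. Practical gene finders also handle the reverse strand and introns and rely on statistical models. The function name `find_orfs` and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    `end` is exclusive and includes the stop codon. `min_codons` is the
    minimum number of codons before the stop, counting the start codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


seq = "CCATGAAATAGCC"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```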
“Bioinformatics” and “Computational Biology”
Bioinformatics applies principles of information sciences and technologies to make the vast, diverse, and complex life sciences data easier to collect, manage and process. On the other hand, computational biology invents and uses complex mathematical and computational approaches in conjunction with bona fide biological knowledge in order to address existing theoretical and experimental questions in biology and to generate new hypotheses about molecular attributes, molecular interactions, response to drugs, pathways affected by drugs, genetic susceptibility loci, etc.
Both Bioinformatics and Computational Biology are rooted in the information sciences as well as in the life sciences and medicine. Not unexpectedly, the two disciplines overlap at their interface. Moreover, both draw from and rely on results from a variety of other fields of science, including mathematics, physics, statistics, computer science, engineering, biology, biotechnology, and behavioral science. Nonetheless, ‘Bioinformatics’ is in many ways a procedural activity, whereas ‘Computational Biology’ is exploratory in nature.
- A working definition of Bioinformatics and Computational Biology