High Performance Computing
Recent developments in computer hardware offer tantalizing new opportunities to develop novel approaches for efficient analysis of biological information. We are developing new algorithms that maximize performance on specialized (FPGA, Cell) and multi-CPU, multi-core devices.
Real Time Mass Spectrometry Protein Identification
In the discovery of proteins important in human health and disease, mass spectrometry is used to identify proteins from complex mixtures. A critical worldwide limitation is that the protein identification step is disconnected from the data collection step, because the database search takes much longer than collecting the data. This sequential analysis removes our ability to explore less abundant peptides efficiently or, more importantly, to search for peptides that would quickly and uniquely identify proteins. Since parallelizing search software requires doubling the size of a conventional computing cluster to cut the search time in half, alternative approaches must be investigated.
Field programmable gate arrays (FPGAs) are used to create hardware-accelerated algorithms that reduce operating costs and improve search speed compared to large clusters. In previous work, we presented a novel hardware design that takes full spectra and computes 6-frame translation word searches on DNA databases at a rate of approximately 3 billion base pairs per second, with queries of up to 10 amino acids in length and arbitrary wildcard positions. Hardware post-processing identifies in silico tryptic peptides and scores them using a variety of techniques including mass frequency expected values. With faster FPGAs, protein identifications from the human genome can be achieved in less than a second, and this makes it an ideal solution for a number of proteome-scale applications.
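The search described above can be illustrated in software. What follows is a minimal Python sketch, not the hardware design itself: it translates a DNA sequence in all six reading frames, finds a short amino acid query containing wildcard positions, and splits a protein into in silico tryptic peptides. The function names, the use of `?` as the wildcard character, the rule that a wildcard does not match a stop codon, and the "cleave after K or R unless followed by P" trypsin rule are assumptions made for illustration. Scoring (for example, mass frequency expected values) is not shown.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3 forward, -1..-3 reverse) to protein strings."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


def find_word(dna, query):
    """Return (frame, protein_position) for every match of the query.

    Assumption: '?' in the query matches any single residue except a stop.
    A lookahead is used so that overlapping matches are also reported.
    """
    body = "".join("[A-Z]" if c == "?" else re.escape(c) for c in query)
    pattern = re.compile("(?=(" + body + "))")
    hits = []
    for frame, protein in six_frame_translations(dna).items():
        for match in pattern.finditer(protein):
            hits.append((frame, match.start()))
    return hits


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides
```

For example, `find_word("ATGGCCAAATAA", "M?K")` reports a single hit at position 0 of frame +1, and `tryptic_peptides("MKRPAKG")` yields `["MK", "RPAK", "G"]`. A software loop like this processes one position at a time, whereas the hardware approach described above evaluates many positions in parallel.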
We are currently researching novel algorithms that make maximal use of the STI Cell Broadband Engine (Cell BE). The Cell BE, an innovative heterogeneous multi-core processor, possesses unique hardware capabilities that enable low-power, high-throughput, real-time data processing. Development is easier on the Cell than on the FPGA, and the resulting software is expected to be portable to next-generation hardware. New funding from the Canada Foundation for Innovation (CFI) will allow us to develop a platform for real-time data analysis. Students interested in pursuing research towards this goal should contact Dr. Dumontier.
Relevant Publications