High-Performance Computing Algorithms

In single-particle cryo-EM, the transmission projection image of each frozen molecule does not directly convey information about protein dynamics. However, a collection of images of millions of molecular copies contains statistical information on conformational sampling, which is a major manifestation of protein dynamics. The more single-molecule images one can obtain, the more likely one is to sample the dynamic properties of the protein under study sufficiently. In addition, over-sampling has been shown to suppress overfitting in machine learning algorithms. As a result, dataset sizes are growing rapidly in cryo-EM studies that aim to understand complex dynamics or systems-level behavior. The resulting challenges in big data analysis cannot readily be addressed without high-performance computing (HPC) or supercomputing technology. At the Intel® PCCSB, several research projects focus on the following topics:
(1) The development of HPC data exchange algorithms that can handle big data processing effectively with relatively limited HPC resources and in heterogeneous HPC environments.
(2) The optimization of machine learning algorithms specifically for Intel multi-core and Many Integrated Core (MIC) architectures.
(3) The integration of HPC algorithms at various levels to enable statistical and time-series reconstruction of 3D protein dynamics at near-atomic resolution under physiological conditions.