Mining coherent patterns in multidimensional big data based on spectral decomposition
Hong Yan
City University of Hong Kong, Hong Kong
J Comput Eng Inf Technol
Abstract
Although many multidimensional datasets are very large, they may contain coherent patterns of much smaller size. For example, in gene expression data, we may be interested in a subset of genes that co-express under a subset of conditions during a subset of time intervals. Extracting these coherent patterns from a large multidimensional data array is a challenge. In this presentation, I will discuss a robust coherent pattern detection technique, based on spectral decomposition, that our research group has developed recently. In our method, we analyze the data in singular vector spaces and detect hyperplanes that are related to subsets of data features or samples; these hyperplanes correspond to coherent data points. This procedure effectively suppresses noise and removes irrelevant elements from the data. We have found several useful applications of the coherent pattern extraction algorithm in image and biomedical data analysis. Based on these patterns, we were able to recognize human facial expressions from face images and locate the points that play important roles in characterizing different expressions. With DNA microarray gene expression data, we can identify co-expressed genes for different types and subtypes of cancer. We have also used the coherent patterns to analyze the molecular mechanisms of drug resistance in non-small cell lung cancer and to predict drug resistance levels for different protein mutations.
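The idea of looking for coherent subsets in singular vector space can be illustrated with a minimal sketch. This is not the group's algorithm: it handles only a two-dimensional array with a single constant block, and it replaces hyperplane detection with a simple relative threshold on the leading singular vectors. The function name `find_coherent_block`, the parameter `rel_threshold`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def find_coherent_block(data, rel_threshold=0.5):
    """Illustrative only: pick rows and columns that dominate the leading
    singular vectors. A relative threshold stands in for the hyperplane
    detection step described in the abstract (an assumption of this sketch).
    """
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    u1 = np.abs(u[:, 0])   # leading left singular vector (one entry per row)
    v1 = np.abs(vt[0, :])  # leading right singular vector (one entry per column)
    rows = np.flatnonzero(u1 > rel_threshold * u1.max())
    cols = np.flatnonzero(v1 > rel_threshold * v1.max())
    return rows, cols


# Synthetic example: weak noise with one small constant block embedded in it.
rng = np.random.default_rng(0)
data = 0.1 * rng.standard_normal((60, 40))
true_rows = np.arange(10, 25)
true_cols = np.arange(5, 15)
data[np.ix_(true_rows, true_cols)] += 5.0

rows, cols = find_coherent_block(data)
print("rows:", rows)
print("cols:", cols)
```

In this toy setting the rows and columns of the embedded block take nearly equal values in the leading singular vectors, while all other entries stay close to zero, which is why a single threshold is enough. Real data with several overlapping patterns, or with more than two dimensions, would need the fuller hyperplane analysis that the abstract refers to.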
Biography
Email: h.yan@cityu.edu.hk