Unsupervised structure detection in biomedical data Journal Article


Author: Vogt, J. E.
Article Title: Unsupervised structure detection in biomedical data
Abstract: A major challenge in computational biology is to find simple representations of high-dimensional data that best reveal the underlying structure. In this work, we present an intuitive and easy-to-implement method based on ranked neighborhood comparisons that detects structure in unsupervised data. The method is based on ordering objects in terms of similarity and on the mutual overlap of nearest neighbors. This basic framework was originally introduced in the field of social network analysis to detect actor communities. We demonstrate that the same ideas can successfully be applied to biomedical data sets in order to reveal complex underlying structure. The algorithm is very efficient and works on distance data directly without requiring a vectorial embedding of data. Comprehensive experiments demonstrate the validity of this approach. Comparisons with state-of-the-art clustering methods show that the presented method outperforms hierarchical methods as well as density-based clustering methods and model-based clustering. A further advantage of the method is that it simultaneously provides a visualization of the data. Especially in biomedical applications, the visualization of data can be used as a first pre-processing step when analyzing real-world data sets to get an intuition of the underlying data structure. We apply this model to synthetic data as well as to various biomedical data sets, which demonstrate the high quality and usefulness of the inferred structure. © 2015 IEEE.
Keywords: unsupervised learning; cluster analysis; computational biology; bioinformatics; visualization; medical applications; data visualization; biomedical applications; virtual reality; clustering; clustering algorithms; model-based clustering; complex networks; data mining; social networking (online); knowledge discovery; network analysis; structure detection; electric network analysis; density-based clustering; high dimensional data; neighborhood comparisons
Journal Title: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 12
Issue: 4
ISSN: 1545-5963
Publisher: IEEE
Date Published: 2015-07-01
Start Page: 753
End Page: 760
Language: English
DOI: 10.1109/tcbb.2015.2394408
PROVIDER: scopus
Notes: Export Date: 2 September 2015 -- Source: Scopus
MSK Authors
  1. Julia E Vogt