Vector quantization of amino acids: Analysis of the HIV V3 loop region (Journal Article)


Authors: Olshen, A. B.; Cosman, P. C.; Rodrigo, A. G.; Bickel, P. J.; Olshen, R. A.
Article Title: Vector quantization of amino acids: Analysis of the HIV V3 loop region
Abstract: This paper is about techniques for clustering sequences such as nucleic or amino acids. Our application is to defining viral subtypes of HIV on the basis of similarities of V3 loop region amino acids of the envelope (env) gene. The techniques introduced here could apply with virtually no change to other HIV genes as well as to other problems and data not necessarily of viral origin. These algorithms as they apply to quantitative data have found much application in engineering contexts to compressing images and speech. They are called vector quantization and involve a mapping from a large number of possible inputs into a much smaller number of outputs. Many implementations, in particular those that go by the name generalized Lloyd or k-means, exist for choosing sets of possible outputs and mappings. With each there is an attempt to maximize similarities among inputs that map to any single output, or, alternatively, to minimize some measure of distortion between input and output. Here, two standard types of vector quantization are brought to bear upon the cited problem of clustering V3 loop amino acid sequences. Results of this clustering are compared to those of the well known UPGMA algorithms, the unweighted pair group method in which arithmetic averages are employed. © 2004 Elsevier B.V. All rights reserved.
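The generalized Lloyd (k-means) iteration mentioned in the abstract can be illustrated with a minimal sketch. The details below are assumptions made for illustration and are not taken from the paper: distortion is measured as Hamming distance between equal-length sequences, each codeword is the position-wise most frequent residue of its cluster, and the toy sequences are invented, not real V3 loop data. All function names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(seq, codebook):
    """Index of the codeword with the smallest Hamming distortion to seq."""
    return min(range(len(codebook)), key=lambda j: hamming(seq, codebook[j]))


def consensus(seqs):
    """Position-wise most frequent residue (assumed centroid rule)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def lloyd(seqs, codebook, max_iter=50):
    """Alternate nearest-codeword assignment and codeword update."""
    codebook = list(codebook)
    for _ in range(max_iter):
        assign = [nearest(s, codebook) for s in seqs]
        updated = []
        for j, old in enumerate(codebook):
            members = [s for s, a in zip(seqs, assign) if a == j]
            # Keep the old codeword if its cluster is empty.
            updated.append(consensus(members) if members else old)
        if updated == codebook:
            break
        codebook = updated
    assign = [nearest(s, codebook) for s in seqs]
    return codebook, assign


def total_distortion(seqs, codebook, assign):
    """Sum of Hamming distances between each input and its codeword."""
    return sum(hamming(s, codebook[a]) for s, a in zip(seqs, assign))


# Invented toy sequences, for illustration only.
toy = ["CTRPNN", "CTRPNN", "CTRPGN", "CARPSK", "CARPSK", "CARPSR"]
codebook, assign = lloyd(toy, [toy[0], toy[3]])
print(codebook, assign, total_distortion(toy, codebook, assign))
```

Each pass maps every input sequence to one of a small number of outputs and then re-chooses the outputs, so the total distortion never increases from one pass to the next under these assumptions.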
Keywords: hiv; clustering; vector quantization
Journal Title: Journal of Statistical Planning and Inference
Volume: 130
Issue: 1-2
ISSN: 0378-3758
Publisher: Elsevier B.V.
Date Published: 2005-03-01
Start Page: 277
End Page: 298
Language: English
DOI: 10.1016/j.jspi.2003.10.010
PROVIDER: scopus
Notes: Cited By (since 1996): 1; Export Date: 24 October 2012; CODEN: JSPID; Source: Scopus
MSK Authors
  1. Adam B Olshen