Mapping genotype data with multidimensional scaling algorithms

Soledad E. LLerena, C. D. Maciel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Genotype data produced by modern biomolecular techniques are characterized by high dimensionality. Finding patterns in this type of data is complex and time-consuming when performed solely by humans. A Multidimensional Scaling (MDS) technique was recently applied to map genotype data into helpful visual representations. Despite its reported success in helping to identify patterns, that conclusion depended on the chosen MDS algorithm. Various MDS algorithms exist, and it is unknown which of them is most suitable for mapping genotype data. In this paper we present a comparative analysis of four popular MDS algorithms: classical MDS (CMDS), optimized MDS (SMACOF), Landmark MDS (LANDMARK) and FASTMAP. The analysis was performed using three comparison criteria: the stress index, which measures the ability of the algorithms to accommodate the similarity information contained in the data in low-dimensional spaces; the clustering purity index, which measures the ability of the algorithms to preserve true group structures in the mapped space; and the computational time index, which measures the empirical computational cost of the algorithms. The results obtained on three well-known datasets showed some differences in the measured criteria, with SMACOF presenting the best values of stress and clustering purity index, but at an increased computational time. Additionally, SMACOF was used to map a genotype dataset that was previously mapped with the LANDMARK algorithm, resulting in a similar visual representation, with more accurately recognizable clusters.
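The abstract names the comparison criteria without giving formulas, so the following is only a minimal sketch of how classical MDS, a stress index, and a clustering purity index might be computed. It assumes Kruskal's stress-1 as the stress definition and the usual majority-class definition of purity; the paper may define either differently. The function names (`classical_mds`, `kruskal_stress`, `purity`) and the synthetic two-group data are illustrative assumptions, not the authors' code or datasets.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed a distance matrix D into k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    lam = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)


def kruskal_stress(D, Y):
    """Kruskal stress-1 between original distances D and the embedding Y
    (an assumed definition of the 'stress index')."""
    d_hat = pairwise_distances(Y)
    iu = np.triu_indices_from(D, k=1)          # each pair counted once
    return np.sqrt(((D[iu] - d_hat[iu]) ** 2).sum() / (D[iu] ** 2).sum())


def purity(true_labels, cluster_labels):
    """Fraction of points that belong to the majority true class of their cluster."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    total = 0
    for c in np.unique(cluster_labels):
        members = true_labels[cluster_labels == c]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()
    return total / len(true_labels)


# Synthetic stand-in for high-dimensional data: two well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 10)), rng.normal(5.0, 1.0, (20, 10))])
labels = np.array([0] * 20 + [1] * 20)

D = pairwise_distances(X)
Y = classical_mds(D, k=2)

# Crude clustering for illustration only: split on the sign of the first axis.
clusters = (Y[:, 0] > 0).astype(int)

print("stress:", kruskal_stress(D, Y))
print("purity:", purity(labels, clusters))
```

Lower stress means the low-dimensional map distorts the original distances less, and purity closer to 1 means the known groups stay separated in the map. An iterative method such as SMACOF would minimize a stress function directly rather than relying on an eigendecomposition, which is consistent with the trade-off between quality and running time described in the abstract.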
Original language: English
Title of host publication: BIOMAT 2010
Subtitle of host publication: International Symposium on Mathematical and Computational Biology
Editors: Rubem P. Mondaini
Publisher: World Scientific
Pages: 303-319
ISBN (Electronic): 978-981-4460-67-5
ISBN (Print): 978-981-4343-42-8
State: Published - May 2011
Externally published: Yes
Event: BIOMAT 2010 - International Symposium on Mathematical and Computational Biology - Rio de Janeiro, Brazil
Duration: 24 Jul 2010 → 29 Jul 2010

Other

Other: BIOMAT 2010 - International Symposium on Mathematical and Computational Biology
Country/Territory: Brazil
City: Rio de Janeiro
Period: 24/07/10 → 29/07/10
