Mapping genotype data with multidimensional scaling algorithms

Soledad E. LLerena, C. D. Maciel

Research output: Chapter in book/report/conference proceeding, Conference contribution, peer-reviewed

Abstract

Genotype data produced by modern biomolecular techniques are characterized by high dimensionality. Finding patterns in this type of data is complex and time-consuming work if performed solely by humans. A Multidimensional Scaling (MDS) technique was recently applied to map genotype data into helpful visual representations. Despite its reported success in helping to identify patterns, that conclusion depended on the chosen MDS algorithm. Various MDS algorithms exist, and it is unknown which of them is most suitable for mapping genotype data. In this paper we present a comparative analysis of four popular MDS algorithms: classical MDS (CMDS), optimized MDS (SMACOF), Landmark MDS (LANDMARK) and FASTMAP. The analysis was performed using three comparison criteria: the stress index, which measures the ability of the algorithms to accommodate the similarity information contained in the data in low-dimensional spaces; the clustering purity index, which measures the ability of the algorithms to preserve true group structures in the mapped space; and the computational time index, which measures the empirical computational cost of the algorithms. The results obtained on three well-known datasets showed some differences in the measured criteria, with SMACOF presenting the best values of the stress and clustering purity indices, but an increased computational time. Additionally, SMACOF was used to map a genotype dataset that was previously mapped with the LANDMARK algorithm, resulting in a similar visual representation with more accurately recognizable clusters.
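As a rough illustration of the criteria named in the abstract, the following minimal sketch implements classical MDS together with one common form of a normalized stress index and a clustering purity index. It is not the authors' implementation: the function names (`classical_mds`, `normalized_stress`, `purity`), the particular stress normalization, and the synthetic random data are all assumptions made for this example, and the paper's exact definitions and datasets may differ.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed n points in k dimensions from an n x n distance matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    lam = np.clip(eigvals[idx], 0.0, None)    # guard against tiny negative values
    return eigvecs[:, idx] * np.sqrt(lam)

def normalized_stress(D, X):
    """Assumed stress form: sqrt(sum (d_ij - dhat_ij)^2 / sum d_ij^2) over pairs i < j."""
    iu = np.triu_indices(D.shape[0], k=1)
    d = D[iu]                                 # original dissimilarities
    dhat = pairwise_distances(X)[iu]          # distances in the mapped space
    return np.sqrt(((d - dhat) ** 2).sum() / (d ** 2).sum())

def purity(true_labels, cluster_labels):
    """Fraction of points that fall in the majority true class of their cluster."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    total = 0
    for c in np.unique(cluster_labels):
        members = true_labels[cluster_labels == c]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()
    return total / true_labels.size

# Synthetic stand-in data (not one of the paper's datasets): 20 points in 5 dimensions.
rng = np.random.default_rng(0)
points = rng.normal(size=(20, 5))
D = pairwise_distances(points)

X2 = classical_mds(D, k=2)   # lossy 2-D map
X5 = classical_mds(D, k=5)   # full-dimensional map, distances preserved up to rounding

print("stress (2-D):", normalized_stress(D, X2))
print("stress (5-D):", normalized_stress(D, X5))
```

Lower stress means the mapped distances follow the original dissimilarities more closely, and a purity of 1 means every cluster found in the mapped space contains a single true group. The third criterion, computational time, could be estimated by wrapping a call such as `classical_mds(D)` between two `time.perf_counter()` readings.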
Original language: English
Title of host publication: BIOMAT 2010
Subtitle of host publication: International Symposium on Mathematical and Computational Biology
Editors: Rubem P. Mondaini
Publisher: World Scientific
Pages: 303-319
ISBN (electronic): 978-981-4460-67-5
ISBN (print): 978-981-4343-42-8
Status: Published - May 2011
Published externally
Event: BIOMAT 2010 - International Symposium on Mathematical and Computational Biology - Rio de Janeiro, Brazil
Duration: 24 Jul 2010 to 29 Jul 2010

Other

Other: BIOMAT 2010 - International Symposium on Mathematical and Computational Biology
Country/Territory: Brazil
City: Rio de Janeiro
Period: 24/07/10 to 29/07/10
