A projection pursuit framework for supervised dimension reduction of high dimensional small sample datasets

Soledad Espezua; Edwin Villanueva; Carlos D. Maciel; André Carvalho

doi:10.1016/j.neucom.2014.07.057

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)

Abstract

The analysis and interpretation of datasets with a large number of features and few examples remains a challenging problem in the scientific community, owing to the difficulties associated with the curse of dimensionality. Projection Pursuit (PP) has shown promise in circumventing this phenomenon by searching for low-dimensional projections of the data in which meaningful structures are exposed. However, PP faces computational difficulties when dealing with datasets containing thousands of features (typical in genomics and proteomics) because of the vast number of parameters to optimize. In this paper we describe and evaluate a PP framework aimed at relieving these difficulties and thus easing the construction of classifier systems. The framework is a two-stage approach: the first stage performs a rapid compaction of the data, and the second stage implements the PP search using an improved version of the SPP method (Guo et al., 2000, [32]). In an experimental evaluation with eight public microarray datasets, we show that some configurations of the proposed framework can clearly outperform eight well-established dimension reduction methods in their ability to pack more discriminatory information into fewer dimensions.
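The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which compaction method or projection index the authors use, so everything concrete here is an assumption: stage one is stood in for by a PCA-style projection computed with an SVD, the projection index is a simple Fisher-like ratio of between-class to within-class variance, and the search is plain stochastic hill climbing. It is not the SPP method of Guo et al. or the authors' improved version of it, and the function names `compact`, `fisher_index`, and `pursue_direction` are hypothetical.

```python
import numpy as np


def compact(X, k):
    """Stage 1 (assumed stand-in): project centered data onto its top-k principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD: the rows of Vt are orthonormal directions in feature space.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T  # shape (n_features, k)
    return Xc @ W, W, mean


def fisher_index(z, y):
    """Hypothetical 1-D projection index: between-class over within-class scatter."""
    overall = z.mean()
    between = 0.0
    within = 0.0
    for c in np.unique(y):
        zc = z[y == c]
        between += zc.size * (zc.mean() - overall) ** 2
        within += ((zc - zc.mean()) ** 2).sum()
    return between / (within + 1e-12)  # small constant guards against division by zero


def pursue_direction(Z, y, n_iter=500, step=0.5, seed=0):
    """Stage 2 (assumed stand-in): stochastic hill climbing over unit vectors."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=Z.shape[1])
    a /= np.linalg.norm(a)
    best = fisher_index(Z @ a, y)
    for _ in range(n_iter):
        # Perturb the current direction and renormalize to unit length.
        cand = a + step * rng.normal(size=a.size)
        cand /= np.linalg.norm(cand)
        score = fisher_index(Z @ cand, y)
        if score > best:  # keep the candidate only if the index improves
            a, best = cand, score
    return a, best


# Synthetic "many features, few samples" data: 40 samples, 2000 features,
# with the two classes differing only in the first 20 features.
rng = np.random.default_rng(42)
n, p = 40, 2000
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :20] += 1.5

Z, W, mean = compact(X, k=10)     # 2000 features -> 10 compacted dimensions
a, best = pursue_direction(Z, y)  # search runs over 10 parameters instead of 2000
```

The point of the sketch is the parameter count: the search in the second stage optimizes a direction in the compacted space rather than in the original feature space, which is the kind of relief the abstract attributes to the first stage. A direction found this way can be mapped back to the original features as `W @ a`.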

Original language	English
Pages (from-to)	767-776
Number of pages	10
Journal	Neurocomputing
Volume	149
Issue number	PB
DOI	https://doi.org/10.1016/j.neucom.2014.07.057
Status	Published - 3 Feb 2015
Externally published	Yes

Bibliographical note

Funding Information:
We would like to thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), grant #151547/2013-0, and FAPESP (São Paulo Research Foundation), grant #2012/22295-0, for funding this study.

Publisher Copyright:
© 2014 Elsevier B.V.

Cite this

@article{bfdcd12350ad4ab789f3299b33bc82f6,

title = "A projection pursuit framework for supervised dimension reduction of high dimensional small sample datasets",

abstract = "The analysis and interpretation of datasets with large number of features and few examples has remained as a challenging problem in the scientific community, owing to the difficulties associated with the curse-of-the-dimensionality phenomenon. Projection Pursuit (PP) has shown promise in circumventing this phenomenon by searching low-dimensional projections of the data where meaningful structures are exposed. However, PP faces computational difficulties in dealing with datasets containing thousands of features (typical in genomics and proteomics) due to the vast quantity of parameters to optimize. In this paper we describe and evaluate a PP framework aimed at relieving such difficulties and thus ease the construction of classifier systems. The framework is a two-stage approach, where the first stage performs a rapid compaction of the data and the second stage implements the PP search using an improved version of the SPP method (Guo et al., 2000, [32]). In an experimental evaluation with eight public microarray datasets we showed that some configurations of the proposed framework can clearly overtake the performance of eight well-established dimension reduction methods in their ability to pack more discriminatory information into fewer dimensions.",

keywords = "Classification, Dimension reduction, Gene expression, Projection Pursuit",

author = "Soledad Espezua and Edwin Villanueva and Maciel, {Carlos D.} and Andr{\'e} Carvalho",

note = "Publisher Copyright: {\textcopyright} 2014 Elsevier B.V.",

year = "2015",

month = feb,

day = "3",

doi = "10.1016/j.neucom.2014.07.057",

language = "English",

volume = "149",

pages = "767--776",

journal = "Neurocomputing",

issn = "0925-2312",

publisher = "Elsevier",

number = "PB",

}

TY - JOUR

T1 - A projection pursuit framework for supervised dimension reduction of high dimensional small sample datasets

AU - Espezua, Soledad

AU - Villanueva, Edwin

AU - Maciel, Carlos D.

AU - Carvalho, André

PY - 2015/2/3

Y1 - 2015/2/3

N2 - The analysis and interpretation of datasets with large number of features and few examples has remained as a challenging problem in the scientific community, owing to the difficulties associated with the curse-of-the-dimensionality phenomenon. Projection Pursuit (PP) has shown promise in circumventing this phenomenon by searching low-dimensional projections of the data where meaningful structures are exposed. However, PP faces computational difficulties in dealing with datasets containing thousands of features (typical in genomics and proteomics) due to the vast quantity of parameters to optimize. In this paper we describe and evaluate a PP framework aimed at relieving such difficulties and thus ease the construction of classifier systems. The framework is a two-stage approach, where the first stage performs a rapid compaction of the data and the second stage implements the PP search using an improved version of the SPP method (Guo et al., 2000, [32]). In an experimental evaluation with eight public microarray datasets we showed that some configurations of the proposed framework can clearly overtake the performance of eight well-established dimension reduction methods in their ability to pack more discriminatory information into fewer dimensions.

AB - The analysis and interpretation of datasets with large number of features and few examples has remained as a challenging problem in the scientific community, owing to the difficulties associated with the curse-of-the-dimensionality phenomenon. Projection Pursuit (PP) has shown promise in circumventing this phenomenon by searching low-dimensional projections of the data where meaningful structures are exposed. However, PP faces computational difficulties in dealing with datasets containing thousands of features (typical in genomics and proteomics) due to the vast quantity of parameters to optimize. In this paper we describe and evaluate a PP framework aimed at relieving such difficulties and thus ease the construction of classifier systems. The framework is a two-stage approach, where the first stage performs a rapid compaction of the data and the second stage implements the PP search using an improved version of the SPP method (Guo et al., 2000, [32]). In an experimental evaluation with eight public microarray datasets we showed that some configurations of the proposed framework can clearly overtake the performance of eight well-established dimension reduction methods in their ability to pack more discriminatory information into fewer dimensions.

KW - Classification

KW - Dimension reduction

KW - Gene expression

KW - Projection Pursuit

UR - http://www.scopus.com/inward/record.url?scp=85027943239&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2014.07.057

DO - 10.1016/j.neucom.2014.07.057

M3 - Article in a journal

AN - SCOPUS:85027943239

SN - 0925-2312

VL - 149

SP - 767

EP - 776

JO - Neurocomputing

JF - Neurocomputing

IS - PB

ER -
