Documents retrieval for qualitative research: Gender discrimination analysis

Hugo Alatrista-Salas, Pilar Hidalgo-Leon, Miguel Nunez-Del-Prado

Research output: Conference contribution

3 Citations (Scopus)


Gender discrimination is an act of exclusion or differential treatment towards a person because of their sex. This phenomenon has been studied in qualitative research, which seeks to analyze and describe the reality and context of discrimination. Qualitative researchers use collections of documents such as surveys and interviews, among other sources. These large full-text documents tend to be unstructured from a Data Science point of view. The data are often complex and tend to contain similar information across documents. Nevertheless, the process of selecting relevant information is manual, which makes it difficult to categorize and analyze relevant pieces of information, such as victims' surveys. This motivates the use of tools that simplify the task of information selection and perform it efficiently. This article proposes two methods based on the TF-IDF measure to search documents in a corpus. Our findings show that other methods, such as LSA (Latent Semantic Analysis) and LDA (Latent Dirichlet Allocation), consume a lot of memory and are less effective at extracting meaningful words than relying on TF-IDF alone. The information processed in this case consists of testimonies of gender discrimination among university students in Peru.
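The abstract does not describe the two proposed methods in detail, so the following is only a generic illustration of TF-IDF-based document search, not the authors' implementation. It is a minimal sketch that assumes a common weighting (relative term frequency times log inverse document frequency) and cosine-similarity ranking. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `search`) and the sample sentences are hypothetical and were invented for this example; they are not taken from the study's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Return indices of matching documents, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q_counts = Counter(tokenize(query))
    q_total = sum(q_counts.values()) or 1
    # Query terms that never occur in the corpus get weight 0.
    q_vec = {t: (c / q_total) * idf.get(t, 0.0) for t, c in q_counts.items()}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ranked if score > 0]


# Invented toy corpus, for illustration only.
docs = [
    "the professor ignored my question because I am a woman",
    "the cafeteria menu changed this semester",
    "my classmates said women cannot study engineering",
]
print(search("woman professor", docs))
```

Because this sketch matches exact tokens, "woman" and "women" are treated as unrelated terms; a real pipeline would likely add stemming or lemmatization and stop-word removal, which are omitted here for brevity.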
Original language: English
Number of pages: 6
Status: Published - 24 Jan 2019
Event: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Duration: 23 Jan 2019 → …


Conference: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Period: 23/01/19 → …

Keywords

  • Data mining
  • Discrimination
  • Document retrieval
  • Text mining

