Documents retrieval for qualitative research: Gender discrimination analysis

Hugo Alatrista-Salas, Pilar Hidalgo-Leon, Miguel Nunez-Del-Prado

Research output: Contribution to conference › Paper

2 Scopus citations

Abstract

Gender discrimination is an act of exclusion or differential treatment towards a person because of their sex. This phenomenon has been studied in qualitative research, which seeks to analyze and describe the reality and context of discrimination. Qualitative researchers use collections of documents such as surveys and interviews, among other sources. These large full-text documents tend to be unstructured from a Data Science point of view. The data are often complex and tend to repeat similar information across documents. Nevertheless, the process of selecting relevant information is manual, which makes it difficult to categorize and analyze relevant pieces of information, such as victims' surveys. The main need in this process is for tools that simplify the task of information selection and perform it efficiently. This article proposes two methods based on the TF-IDF measure to search documents in a corpus. Our findings show that other methods, such as LSA (Latent Semantic Analysis) and LDA (Latent Dirichlet Allocation), consume a lot of memory and are less effective at extracting meaningful words than relying on TF-IDF alone. The information processed in this case consists of testimonies of gender discrimination among university students in Peru.
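
The abstract does not spell out the two proposed methods, so the following is only a minimal sketch of generic TF-IDF document retrieval, not the authors' implementation. It assumes a simple alphabetic tokenizer, a smoothed IDF, and cosine-similarity ranking; the function names (`tokenize`, `build_index`, `tfidf_vector`, `cosine`, `search`) and the toy corpus are hypothetical and invented for illustration, not taken from the paper's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vector(tokens, idf):
    """Map tokens to a sparse {term: tf * idf} dict; terms without an IDF are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def build_index(docs):
    """Return (idf, vectors) for a list of raw document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so ubiquitous terms keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, top_k=3):
    """Rank documents against the query; return (doc_index, score) pairs with score > 0."""
    idf, vectors = build_index(docs)
    q = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, s) for s, i in scored[:top_k] if s > 0]


# Invented toy sentences, used only to exercise the functions above.
docs = [
    "The professor ignored my questions because I am a woman",
    "The cafeteria menu changed this semester",
    "A classmate said women should not study engineering",
]
results = search("women engineering", docs)
```

With this toy corpus only the third document shares terms with the query, so it is the only one returned. The sketch does no stemming or lemmatization, so "woman" and "women" are treated as unrelated terms; a real pipeline would likely normalize such variants first.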
Original language: English
Number of pages: 6
State: Published - 24 Jan 2019
Event: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Duration: 23 Jan 2019 → …

Conference

Conference: 2018 IEEE Latin American Conference on Computational Intelligence, LA-CCI 2018
Period: 23/01/19 → …

Keywords

  • Data mining
  • Discrimination
  • Document retrieval
  • Text mining
