A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on twitter

Ximena M. Cuzcano; Victor H. Ayma

doi:10.14569/IJACSA.2020.0111018

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Cyberbullying is a social problem in which bullies’ actions are more harmful than in traditional forms of bullying as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying via the analysis of texts in social media publications written in one or more languages; however, few investigations target the cyberbullying detection in the Spanish language. In this work, we aim to compare four traditional supervised machine learning methods performances in detecting cyberbullying via the identification of four cyberbullying-related categories on Twitter posts written in the Peruvian Spanish language. Specifically, we trained and tested the Naive Bayes, Multinomial Logistic Regression, Support Vector Machines, and Random Forest classifiers upon a manually annotated dataset with the help of human participants. The results indicate that the best performing classifier for the cyberbullying detection task was the Support Vector Machine classifier.
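As an illustration of one of the four classifier families named in the abstract, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, written with the Python standard library only. It is not the authors' implementation: the toy sentences, the two labels (`offensive`, `neutral`), and all function names are assumptions made up for this example and do not come from the paper's dataset or its four annotation categories.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented for this sketch (not from the paper).
TOY_SAMPLES = [
    ("eres un tonto", "offensive"),
    ("que tonto eres", "offensive"),
    ("buen dia amigo", "neutral"),
    ("gracias amigo buen partido", "neutral"),
]


def tokenize(text):
    # Lowercase and split on whitespace; real tweets would need more cleaning.
    return text.lower().split()


def train_naive_bayes(samples):
    # Count documents per label and word occurrences per label.
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(samples),
    }


def predict(model, text):
    # Return the label with the highest log posterior score.
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["label_counts"].items():
        score = math.log(doc_count / model["n_docs"])  # log prior
        total_words = sum(model["word_counts"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][tok]
            # Add-one smoothing avoids zero probabilities.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_SAMPLES)
print(predict(model, "eres tonto"))
print(predict(model, "buen dia"))
```

A comparison like the one the abstract describes would train each classifier on the same annotated training split and score all of them on the same held-out split; this sketch covers only a single classifier on toy data.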

Original language	English
Pages (from-to)	132-138
Number of pages	7
Journal	International Journal of Advanced Computer Science and Applications
Volume	11
Issue number	10
DOI	https://doi.org/10.14569/IJACSA.2020.0111018
State	Published - 1 Oct 2020
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 2020 Science and Information Organization. All rights reserved.

Cite this

@article{dbb34de55a4448788cfabd92f063c247,
title = "A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on twitter",
abstract = "Cyberbullying is a social problem in which bullies{\textquoteright} actions are more harmful than in traditional forms of bullying as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying via the analysis of texts in social media publications written in one or more languages; however, few investigations target the cyberbullying detection in the Spanish language. In this work, we aim to compare four traditional supervised machine learning methods performances in detecting cyberbullying via the identification of four cyberbullying-related categories on Twitter posts written in the Peruvian Spanish language. Specifically, we trained and tested the Naive Bayes, Multinomial Logistic Regression, Support Vector Machines, and Random Forest classifiers upon a manually annotated dataset with the help of human participants. The results indicate that the best performing classifier for the cyberbullying detection task was the Support Vector Machine classifier.",
keywords = "Feature extraction, Machine learning, Natural language processing, Cyberbullying detection",
author = "Cuzcano, {Ximena M.} and Ayma, {Victor H.}",
year = "2020",
month = oct,
day = "1",
doi = "10.14569/IJACSA.2020.0111018",
language = "English",
volume = "11",
pages = "132--138",
journal = "International Journal of Advanced Computer Science and Applications",
issn = "2158-107X",
publisher = "Science and Information Organization",
number = "10",
}

TY  - JOUR
T1  - A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on twitter
AU  - Cuzcano, Ximena M.
AU  - Ayma, Victor H.
PY  - 2020/10/1
Y1  - 2020/10/1
N2  - Cyberbullying is a social problem in which bullies’ actions are more harmful than in traditional forms of bullying as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying via the analysis of texts in social media publications written in one or more languages; however, few investigations target the cyberbullying detection in the Spanish language. In this work, we aim to compare four traditional supervised machine learning methods performances in detecting cyberbullying via the identification of four cyberbullying-related categories on Twitter posts written in the Peruvian Spanish language. Specifically, we trained and tested the Naive Bayes, Multinomial Logistic Regression, Support Vector Machines, and Random Forest classifiers upon a manually annotated dataset with the help of human participants. The results indicate that the best performing classifier for the cyberbullying detection task was the Support Vector Machine classifier.
AB  - Cyberbullying is a social problem in which bullies’ actions are more harmful than in traditional forms of bullying as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying via the analysis of texts in social media publications written in one or more languages; however, few investigations target the cyberbullying detection in the Spanish language. In this work, we aim to compare four traditional supervised machine learning methods performances in detecting cyberbullying via the identification of four cyberbullying-related categories on Twitter posts written in the Peruvian Spanish language. Specifically, we trained and tested the Naive Bayes, Multinomial Logistic Regression, Support Vector Machines, and Random Forest classifiers upon a manually annotated dataset with the help of human participants. The results indicate that the best performing classifier for the cyberbullying detection task was the Support Vector Machine classifier.
KW  - Feature extraction
KW  - Machine learning
KW  - Natural language processing
KW  - Cyberbullying detection
UR  - http://www.scopus.com/inward/record.url?scp=85101642625&partnerID=8YFLogxK
UR  - https://www.mendeley.com/catalogue/d031ef0a-1fb2-3db2-83d8-508db9b6154b/
U2  - 10.14569/IJACSA.2020.0111018
DO  - 10.14569/IJACSA.2020.0111018
M3  - Article in a journal
AN  - SCOPUS:85101642625
SN  - 2158-107X
VL  - 11
SP  - 132
EP  - 138
JO  - International Journal of Advanced Computer Science and Applications
JF  - International Journal of Advanced Computer Science and Applications
IS  - 10
ER  -
