Clustering and topic modeling over tweets: A comparison over a health dataset

Juan Antonio Lossio-Ventura, Juandiego Morzan, Hugo Alatrista-Salas, Tina Hernandez-Boussard, Jiang Bian

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

11 Citations (Scopus)

Abstract

Twitter has become the most popular form of social interaction in the healthcare domain. Thus, various teams have evaluated Twitter as an additional source where patients share information about their healthcare, with the potential goal of improving their outcomes. Several existing topic modeling and document clustering applications have been adapted to assess tweets, showing that the performance of these applications is negatively affected by the nature and characteristics of tweets. Moreover, Twitter health research has become difficult to measure because of the absence of comparisons between the existing applications. In this paper, we perform an evaluation, based on internal indexes, of different topic modeling and document clustering applications over two Twitter health-related datasets. Our results show that Online Twitter LDA and Gibbs LDA perform better at extracting topics and grouping tweets. We want to provide health practitioners with this comparison so that they can select the most suitable application for their tasks.
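
An internal index scores a clustering using only the data and the assigned labels, with no ground-truth annotation. The abstract does not name the indexes used in the paper, so the following is only a minimal sketch of one common internal index, the silhouette coefficient, assumed here for illustration. The function names and the toy 2-D points (standing in for tweet vectors) are hypothetical and are not taken from the paper.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    For each point, a is the mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster.
    The point's score is (b - a) / max(a, b); singleton clusters score 0.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two well-separated groups (hypothetical stand-ins for tweet vectors).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good labeling: {good:.3f}, bad labeling: {bad:.3f}")
```

A labeling that follows the natural groups scores close to 1, while one that cuts across them scores below 0, which is what allows such an index to rank applications without labeled tweets.
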
Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Editors: Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1544-1547
Number of pages: 4
ISBN (electronic): 978-172811867-3
Status: Published - 1 Nov 2019
Event: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Duration: 1 Nov 2019 → …

Publication series

Name: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Period: 1 Nov 2019 → …

Bibliographical note

Funding Information:
Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA183962. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Publisher Copyright:
© 2019 IEEE.
