IDENTIFICATION OF HATE SPEECH IN THE 2020 MUNICIPAL ELECTIONS
Abstract
Hate speech has become increasingly prevalent on social media, and evidence suggests that it intensifies during election years. It is a recurring topic in public policy debates, which seek to hold perpetrators accountable and mitigate its effects without infringing on individual freedom of expression. In this context, artificial intelligence can be a valuable aid in identifying hate speech. This study presents the construction and evaluation of a Naïve Bayes classifier for identifying hate speech in posts on the social media platform X about candidates who contested the second round of the 2020 municipal elections. The methodology was quantitative, and the classifier was built and evaluated following CRISP-DM, a well-established process in data science. The training data were collected through an exploratory search on Kaggle, an online repository of datasets for training predictive models. The evaluation data were collected from a survey of posts on X. Performance was measured quantitatively using statistical accuracy metrics. Overall, the proposed classifier achieved an average accuracy of 72.38% on the evaluation dataset. The difference in performance between the proposed classifier and Perspective API, an online hate speech identification tool adopted as the reference in this study, was less than 9.12% for every candidate considered. These results demonstrate that the proposed Naïve Bayes classifier was able to identify hate speech in posts on X about candidates who contested the second round of the 2020 municipal elections.
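To illustrate the kind of model the abstract describes, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, scored by accuracy. It is not the authors' implementation: the whitespace tokenizer, the class and function names, the label names, and the four-post toy training set are all assumptions made for illustration, and the abstract does not state which Naïve Bayes variant or preprocessing the study used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # skip tokens never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of posts whose predicted label matches the reference label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Toy data invented for this sketch; it is not taken from the study's datasets.
train_texts = [
    "you are stupid and worthless",
    "i hate you all",
    "great debate tonight",
    "thanks for the good proposal",
]
train_labels = ["hate", "hate", "neutral", "neutral"]

eval_texts = ["i hate this stupid candidate", "good debate"]
eval_labels = ["hate", "neutral"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
eval_accuracy = accuracy(model, eval_texts, eval_labels)
```

In this sketch the model is fitted on one labeled collection and scored on a separate one, mirroring the abstract's split between Kaggle training data and posts collected from X. A comparison against a reference tool such as Perspective API would amount to computing the same accuracy for that tool's labels on the same evaluation posts and taking the difference.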
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) Conjuncture Bulletin (BOCA)