A novel multimodal model for detecting Vietnamese toxic news using PhoBERT and Swin Transformer V2

Indonesian Journal of Electrical Engineering and Computer Science

Abstract

News articles with fake, toxic, or reactionary content are now posted and spread widely, driven by the popularity of the Internet and especially the explosion of social networks and online services in cyberspace. Toxic news, especially reactionary news aimed at Vietnam, such as online articles that spread false information, slander leaders, or incite destruction of the great national unity bloc, has a great impact on social life because it spreads quickly and takes many forms: text, images, videos, or a combination of text and images. Given the seriousness of this problem, a number of studies in Vietnam and abroad have addressed the detection and prevention of such news. However, most proposals focus on fake and toxic news written in English. Furthermore, because a large number of online news items are posted as images, or as text embedded in images and videos, they are difficult to process, which leads to a relatively low detection rate. This paper proposes a multimodal model that combines PhoBERT and Swin Transformer V2 to detect fake and toxic news in both text and image form. Comprehensive experiments on a dataset of 8,000 text and image news articles show that the proposed multimodal model outperforms both the individual models and previous approaches, achieving 95% accuracy and a 95% F1-score.
