FbMultiLingMisinfo: Challenging Large-Scale Multilingual Benchmark for Misinformation Detection / Barnabò, Giorgio; Siciliano, Federico; Castillo, Carlos; Leonardi, Stefano; Nakov, Preslav; Da San Martino, Giovanni; Silvestri, Fabrizio. - (2022), pp. 1-8. (Paper presented at the International Joint Conference on Neural Networks, held in Padova, Italy) [10.1109/IJCNN55064.2022.9892739].
FbMultiLingMisinfo: Challenging Large-Scale Multilingual Benchmark for Misinformation Detection
Federico Siciliano; Carlos Castillo; Stefano Leonardi; Fabrizio Silvestri
2022
Abstract
According to recent research, geometric deep learning makes it possible to reach unprecedented accuracy in online misinformation detection. By fully leveraging the social context of news, URL propagation paths in social networks are first represented as graphs and then classified using Graph Neural Network (GNN) models. Despite these remarkable efforts, researchers are still hampered by the scarcity of high-quality benchmark datasets, and as a result, the efficacy of state-of-the-art approaches could be overestimated. So far, to obtain a sufficient number of third-party fact-checked URLs, researchers have either sampled news from sources known to be reliable or unreliable using distant supervision, or gathered pre-labeled URLs from third-party fact-checking websites. In the former case, the resulting datasets can be quite large, but also noisy and biased, since pieces of news are labeled as true or false based on their source rather than checked individually.

File | Size | Format |
---|---|---|
Barnabò_FbMultiLingMisinfo_2022.pdf | 832.01 kB | Adobe PDF |

Access: archive administrators only
Note: https://chato.cl/papers/barnabo_siciliano_2022_multilingual_misinformation_dataset.pdf - DOI: 10.1109/IJCNN55064.2022.9892739
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.