Health Misinformation Detection: A Chunking Strategy Integrated to Retrieval-Augmented Generation (short paper)

Alessandro Bruno
2024-01-01

Abstract

Generative AI (GenAI) and Natural Language Processing (NLP) have advanced significantly in recent years, delivering breakthroughs and raising accuracy in text mining. Knock-on effects have been observed in many application domains, spanning text analysis, question answering, classification, and the generation of new textual content. The latter has led many end-users to perceive AI as a ready-to-go solution for optimising their daily workflow. However, textual content generation has both bright and dark sides, since trustworthy and unverified content alike can be generated effortlessly. This has fuelled a significant societal challenge: fake news. Although fake news has existed for a long time, it remains an unsolved problem, and generative AI has taken it to a new level by enabling the automated production of large volumes of high-quality, individually targeted fake content. Our work is part of the HeReFaNMi (Health-Related Fake News Mitigation) project, which addresses health-related fake news using NLP, language models, and a Retrieval-Augmented Generation (RAG) system. We propose a new chunking mechanism that streamlines the overall RAG pipeline. BERT and BERT+RAG are compared on the health-related fake news classification task using a dataset of 2,000 health-related articles split equally into two categories ('fake' and 'credible'). Preliminary experimental results reveal improvements in accuracy, recall, and F1-score.
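
The abstract reports accuracy, recall, and F1-score on a two-class ('fake' vs 'credible') task. As a reminder of how these metrics relate, the following is a minimal sketch in plain Python. The function name `binary_metrics`, the choice of 'fake' as the positive class, and the toy labels are all assumptions made for illustration; none of this comes from the paper or reproduces its results.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "fake",  # assumption: 'fake' is treated as the positive class
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for a two-class task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, balanced across the two classes; not data from the paper.
y_true = ["fake", "fake", "fake", "credible", "credible", "credible"]
y_pred = ["fake", "fake", "credible", "credible", "credible", "fake"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a dataset split equally between the two classes, accuracy is a reasonable summary, while recall on the 'fake' class shows how many fake articles are actually caught and F1 balances that against precision.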
English
2024
https://ceur-ws.org/Vol-3923/Paper_5.pdf
AIxPAC 2024 Artificial Intelligence for Perception and Artificial Consciousness 2024
Bolzano
2024
international
contribution
Proceedings of the 2nd Workshop on Artificial Intelligence for Perception and Artificial Consciousness (AIxPAC 2024) co-located with the 22nd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024), Bolzano, Italy, November 28, 2024
Alessandro Bruno, Arianna Pipitone, Riccardo Manzotti, Agnese Augello, Pier Luigi Mazzeo, Filippo Vella, Giuseppe Mazzola
41
48
Germany
CEUR-WS.org
anonymous experts
Online
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
Sector INFO-01/A - Computer Science
Sector IINF-05/A - Information Processing Systems
9
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10808/69989