CSR & Sentiment Analysis: a new customized dictionary

Emma Zavarrone; Alessia Forciniti

2023-01-01

Abstract

Communication about the pillars of corporate social responsibility (CSR) is key to sustainable corporate development. Sentiment analysis (SA) is a subfield of natural language processing that studies communication by classifying opinions as negative or positive. Measuring sentiment involves pitfalls related to: a) context, since polarity classification depends on the domain; b) method, whether lexicon-based, machine learning, or a combination of the two; and c) language, since the literature shows a lack of resources for languages other than English. No domain-specific resources exist for investigating sentiment in CSR-based strategic communication, either in English or in other languages. Our contribution lies within the methodological setting of SA for the sustainability framework. We combined lexicon-based and machine-learning methods to build a customized lexicon for analyzing CSR. The innovation concerns: 1) a domain corpus-based approach for improving a general pre-constructed dictionary; 2) the application to Italian; and 3) performance assessment through machine learning. We developed a multi-stage algorithm that combines text analysis with network analysis and captures semantic concordances through an index of keyword content in the text. To validate our model from a machine learning perspective, we divided our data collection into five random samples: one sample was used as a training set, or baseline, for implementing the lexicon, and four were used as test sets. The study showed a notable increase in performance metrics across all samples, demonstrating the effectiveness of our proposal for building a customized lexicon for analyzing CSR in the Italian context.
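
The validation design described in the abstract (five random samples, one used to build the customized lexicon and four used as test sets) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the network-analysis and keyword-index stages are not reproduced, and the lexicon extension below is a simple stand-in that adds unseen tokens according to their net label count in the training sample. All names (`split_into_samples`, `extend_lexicon`, `polarity`, `accuracy`), the toy lexicon, and the toy English corpus are hypothetical.

```python
import random
from collections import Counter

def split_into_samples(docs, k=5, seed=0):
    """Shuffle docs and deal them into k roughly equal random samples."""
    rng = random.Random(seed)
    shuffled = list(docs)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def extend_lexicon(train_docs, general_lexicon):
    """Stand-in for the customization step (assumption, not the paper's method):
    add tokens missing from the general lexicon, signed by their net label count."""
    net = Counter()
    for text, label in train_docs:
        sign = {"positive": 1, "negative": -1}.get(label, 0)
        for tok in set(text.lower().split()):
            net[tok] += sign
    custom = dict(general_lexicon)
    for tok, count in net.items():
        if tok not in custom and count != 0:
            custom[tok] = 1 if count > 0 else -1
    return custom

def polarity(text, lexicon):
    """Classify text by summing the lexicon scores of its tokens."""
    score = sum(lexicon.get(tok, 0) for tok in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def accuracy(labeled_docs, lexicon):
    """Share of (text, label) pairs whose predicted polarity matches the label."""
    hits = sum(polarity(text, lexicon) == label for text, label in labeled_docs)
    return hits / len(labeled_docs)

# Hypothetical general-purpose lexicon and labeled toy corpus.
general_lexicon = {"good": 1, "bad": -1}
corpus = [
    ("recycling program expanded", "positive"),
    ("emissions increased again", "negative"),
    ("good governance report", "positive"),
    ("bad safety record", "negative"),
    ("recycling targets met", "positive"),
    ("emissions fines issued", "negative"),
    ("community recycling funded", "positive"),
    ("emissions audit failed", "negative"),
    ("good supplier standards", "positive"),
    ("bad waste handling", "negative"),
]

samples = split_into_samples(corpus, k=5, seed=42)
train, tests = samples[0], samples[1:]
custom_lexicon = extend_lexicon(train, general_lexicon)

# Compare the general and customized lexicons on each of the four test samples.
for i, test in enumerate(tests, start=1):
    print(i, accuracy(test, general_lexicon), accuracy(test, custom_lexicon))
```

With a real corpus, the comparison in the final loop would be reported with the performance metrics chosen for the study; accuracy is used here only to keep the sketch short.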

Language(s): English
Publication date: 2023
Acceptance date: 2023
DOI: https://dx.doi.org/10.1007/978-3-031-39059-3_31
Editor(s): Conte, D.; Fred, A.; Gusikhin, O.; Sansone, C.
Book title: Deep Learning Theory and Applications. DeLTA 2023. Communications in Computer and Information Science.
Series: Communications in Computer and Information Science
Volume: 1875
First page: 466
Last page: 479
Number of pages: 13
Book ISBN: 303139058X
Country of publication: Germany
City of publication: Heidelberg
Publisher: Springer
Referees: anonymous experts
Relevance: international
Format: Print
Scientific-disciplinary sectors (valid until 24/06/2024): SECS-S/05 - Social Statistics; SECS-S/01 - Statistics; INF/01 - Computer Science
Number of authors: 2
Appears in types: 2.01 Contribution in volume (Chapter or Essay)

Files in this record:

File	Size	Format
DeLTA_2023_54_CR.pdf (not accessible)	755.29 kB	Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10808/50105
