SATSal: A Multi-Level Self-Attention Based Architecture for Visual Saliency Prediction

Marouane Tliba; Mohamed A. Kerkouri; Bashir Ghariba; Aladine Chetouani; Arzu Coltekin; Mohamed Sami Shehata; Alessandro Bruno

2022-01-01

Abstract

Human visual attention modelling is a persistent interdisciplinary research challenge that has gained new interest in recent years, mainly due to the latest developments in deep learning. This is particularly evident in saliency benchmarks. Novel deep learning-based visual saliency models show promising results in capturing high-level (top-down) human visual attention processes. They therefore differ strongly from earlier approaches, which were mainly characterised by low-level (bottom-up) visual features. These developments account for innate human selectivity mechanisms that rely on both high- and low-level factors. Moreover, the two factors interact with each other. Motivated by the importance of these interactions, in this project we tackle visual saliency modelling holistically, examining whether we can consider both the high- and low-level features that govern human attention. Specifically, we propose a novel method, SAtSal (Self-Attention Saliency). SAtSal leverages both high- and low-level features using a multilevel merging of skip connections during the decoding stage. To this end, we incorporate convolutional self-attention modules on the skip connections from the encoder to the decoder network to properly integrate the valuable signals from multilevel spatial features. Thus, the self-attention modules learn to separate the latent representation of the salient regions from other irrelevant information, in an embedded manner and jointly with the main encoder-decoder backbone. Finally, we evaluate SAtSal against various existing solutions to validate our approach, using the well-known standard saliency benchmark MIT300. To further examine SAtSal's robustness on other image types, we also evaluate it on the Le-Meur saliency painting benchmark.
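
The idea of placing self-attention on a skip connection before merging it into the decoder can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the attention formulation, so scaled dot-product attention over spatial positions, the residual weight `gamma`, channel concatenation as the merge step, and all function names below are assumptions made for illustration. Per-position linear maps stand in for 1x1 convolutions, and plain Python lists stand in for tensors.

```python
import math
import random


def matvec(w, x):
    """Apply weight matrix w (rows = output channels) to channel vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats, wq, wk, wv, gamma=0.5):
    """Self-attention over the spatial positions of one skip feature map.

    feats: N positions (an H x W map, flattened), each with C channel values.
    wq, wk: C -> D projections; wv: C -> C projection. Applied per position,
            which is what a 1x1 convolution computes.
    gamma: residual weight (hypothetical; it would be learned in a real model).
    """
    q = [matvec(wq, f) for f in feats]
    k = [matvec(wk, f) for f in feats]
    v = [matvec(wv, f) for f in feats]
    scale = math.sqrt(len(q[0]))
    out = []
    for i, f in enumerate(feats):
        # How strongly position i attends to every position j.
        scores = [sum(a * b for a, b in zip(q[i], kj)) / scale for kj in k]
        weights = softmax(scores)
        # Attention-weighted sum of the value vectors, channel by channel.
        attended = [sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))]
        # Residual connection: original features plus the attended signal.
        out.append([fc + gamma * ac for fc, ac in zip(f, attended)])
    return out


def merge_skip(decoder_feats, skip_feats):
    """Merge decoder and skip features by per-position channel concatenation."""
    return [d + s for d, s in zip(decoder_feats, skip_feats)]


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
N, C, D = 6, 4, 2                      # positions, skip channels, query/key size
skip = rand_matrix(N, C)               # encoder features on the skip connection
decoder = rand_matrix(N, 3)            # decoder features at the same resolution
wq, wk, wv = rand_matrix(D, C), rand_matrix(D, C), rand_matrix(C, C)

filtered = self_attention(skip, wq, wk, wv)
merged = merge_skip(decoder, filtered)
print(len(merged), len(merged[0]))     # prints: 6 7
```

In a multilevel design of the kind the abstract describes, a step like this would presumably be repeated at each encoder resolution, with the merged features passed on to the next decoding layer.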

Language(s): English
Publication date: 2022
DOI: https://dx.doi.org/10.1109/ACCESS.2022.3152189
Journal title: IEEE ACCESS
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Volume: 10
First page: 20701
Last page: 20713
Country of publication: United States
Relevance: international
Referee: anonymous experts
ISI Impact Factor: with ISI Impact Factor
Format: Online
Scientific-disciplinary sectors (valid until 24/06/2024): Sector INF/01 - Informatica (Computer Science)
Number of authors: 7
Appears in types: 1.01 Journal article

Files in this item:

File	Type	Size	Format
publication_No_1.pdf (Open Access)	Post-print document	1.27 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10808/49809
