Object Counting via Group and Graph Attention Network
Bruno, Alessandro;
2023-01-01
Abstract
Object counting, defined as the task of accurately predicting the number of objects in static images or videos, has recently attracted considerable interest. However, the unavoidable presence of background noise prevents counting performance from advancing further. To address this issue, we created a group and graph attention network (GGANet) for dense object counting. GGANet is an encoder-decoder architecture incorporating a group channel attention (GCA) module and a learnable graph attention (LGA) module. The GCA module splits the feature map into several subfeatures, each of which is assigned an attention factor by the same channel attention operation. The LGA module views the feature map as a graph in which the channels are the feature vertices and the responses between channels are the edges. Together, the GCA and LGA modules reduce the interference of irrelevant pixels and suppress background noise. Experiments are conducted on four crowd-counting datasets, two vehicle-counting datasets, one remote-sensing counting dataset, and one few-shot object-counting dataset. Comparative results show that the proposed GGANet achieves superior counting performance.

File | Access | Type | Size | Format
---|---|---|---|---
threat_extracted_Final manuscript (1).pdf | Accessible only from the IULM internal network | Pre-print document | 1.62 MB | Adobe PDF
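
The abstract describes the two attention modules only in words, so a small numerical sketch may help make the idea concrete. The NumPy code below is an illustration under stated assumptions, not the authors' implementation. The function names, the squeeze-and-excitation style gating inside the GCA sketch, the weight shapes `w1` and `w2`, and the use of channel-to-channel inner products plus a learnable offset `adj_logits` as edge responses in the LGA sketch are all hypothetical choices made here for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def group_channel_attention(feat, num_groups, w1, w2):
    """Hypothetical GCA sketch for a feature map of shape (C, H, W).

    Channels are split into `num_groups` subfeatures, and the SAME
    channel attention (shared weights w1, w2) is applied to every group.
    The gating form is an assumption, not taken from the paper.
    """
    c, h, w = feat.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    gc = c // num_groups
    groups = feat.reshape(num_groups, gc, h, w)
    pooled = groups.mean(axis=(2, 3))        # (G, gc): global average pooling
    hidden = np.maximum(pooled @ w1, 0.0)    # (G, r): shared bottleneck + ReLU
    attn = sigmoid(hidden @ w2)              # (G, gc): attention factors in (0, 1)
    out = groups * attn[:, :, None, None]    # rescale every channel
    return out.reshape(c, h, w)


def learnable_graph_attention(feat, adj_logits):
    """Hypothetical LGA sketch: channels are vertices, responses are edges.

    Edge weights are a row-wise softmax over channel-to-channel inner
    products plus a learnable (C, C) offset `adj_logits` (assumed form).
    """
    c, h, w = feat.shape
    nodes = feat.reshape(c, h * w)           # one vertex per channel
    responses = nodes @ nodes.T / (h * w)    # (C, C) responses between channels
    logits = responses + adj_logits
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    edges = np.exp(logits)
    edges = edges / edges.sum(axis=1, keepdims=True)     # each row sums to 1
    out = edges @ nodes                      # aggregate features along edges
    return out.reshape(c, h, w), edges


# Toy usage with random data; all sizes are arbitrary.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))        # C=8 channels, 4x4 spatial map
w1 = rng.standard_normal((4, 2))             # 4 channels per group -> 2 hidden
w2 = rng.standard_normal((2, 4))             # 2 hidden -> 4 channels per group
gca_out = group_channel_attention(feat, num_groups=2, w1=w1, w2=w2)
lga_out, edges = learnable_graph_attention(gca_out, adj_logits=np.zeros((8, 8)))
```

In this sketch both modules keep the shape of the feature map, so they could in principle be chained inside an encoder-decoder. How GGANet actually combines them is not stated in the abstract.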
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.