MST Fitness Index and implicit data narratives: a comparative test on alternative unsupervised algorithms

Sacco, Pierluigi
2016-01-01

Abstract

In this paper, we introduce MST Fitness, a new methodology based on the notion of the Minimum Spanning Tree (MST), for evaluating how well alternative algorithms capture the deep statistical structure of datasets of different types and nature. We test this methodology on six different datasets, some of which are artificial and widely used in similar experiments, and some of which relate to real-world phenomena. Our test set consists of eight different algorithms, including some that are widely known and used, such as Principal Component Analysis, Linear Correlation, and Euclidean Distance. We also consider more sophisticated algorithms based on Artificial Neural Networks, such as the Self-Organizing Map (SOM) and a relatively new algorithm called the Auto-Contractive Map (AutoCM). We find that, on our benchmark, AutoCM performs consistently better than all other algorithms on every dataset, and that its global performance exceeds that of the others by several orders of magnitude. Whether AutoCM can be considered a truly general-purpose algorithm for the analysis of heterogeneous categories of datasets remains to be checked in future research.
English
2016
Elsevier
461
726
746
21
Netherlands
international
anonymous expert reviewers
with ISI Impact Factor
In print
Sector SECS-P/02 - Economic Policy
2
Files in this item:
File: Physica A 2016.pdf
Access: accessible only from the IULM internal network
Size: 4.44 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10808/29779