Three-Class Stack Overflow Question Classifier Using 1D CNN and Maxout

Alessandro Bruno; Chintan Bhatt
2025-01-01

Abstract

Internet-sourced data in human-readable text formats continue to proliferate, making natural language processing (NLP) tools helpful for extracting the information relevant to users’ queries. Community question answering (CQA) websites such as Stack Overflow have become popular among developers in their day-to-day work. However, such websites are less helpful when developers get stuck on responses that have not been sorted by quality checks. Therefore, with the number of new queries posted daily on such platforms increasing, there is a need for reliable automatic software solutions that can replace human moderators. To this end, we propose a deep learning stack for an NLP application that categorizes Stack Overflow questions as High Quality, Low-Quality Edit, or Low-Quality Close, using models such as the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), bidirectional models, and a 1D convolutional model with a maxout layer. Our proposed approach achieves a training accuracy of 97.24% and a testing accuracy of 96.44%, higher than the best-performing baseline (LSTM: 94.72% testing accuracy). Furthermore, it delivers a precision of 95.84%, a recall of 95.44%, and an F1-score of 95.64%, highlighting its robustness and reliability in classifying Q&As.
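
As a quick sanity check on the reported figures, the F1-score can be recomputed as the harmonic mean of precision and recall. This is only a sketch: the abstract does not say how the three per-class scores were averaged, and for macro-averaged metrics the F1-score need not equal the harmonic mean of the averaged precision and recall. The function name below is illustrative and not taken from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
reported_precision = 95.84
reported_recall = 95.44
reported_f1 = 95.64

# Assumption: a single harmonic mean applies to the aggregated metrics.
derived_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"derived F1: {derived_f1:.2f}% (reported: {reported_f1:.2f}%)")
```

Under that assumption, the derived value rounds to the reported 95.64%, so the three figures are mutually consistent.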
English
2025
2025
Intelligent Vision and Computing
international
contribution
ICIVC 2025 - International Conference on Intelligent Vision and Computing
Switzerland
Springer Link
anonymous experts
Online
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
Sector INFO-01/A - Computer Science
Sector IINF-05/A - Information Processing Systems
6

Use this identifier to cite or link to this document: https://hdl.handle.net/10808/70029