Review 2: "The Acoustic Dissection of Cough: Diving into Machine Listening-based COVID-19 Analysis and Detection"

This preprint reports on a machine learning model for detecting COVID-19 by analyzing patients’ cough sounds. Reviewers deemed the findings potentially informative and promising, with a few limitations that could be addressed.

Published on Apr 07, 2022
This Pub is a Review of
The Acoustic Dissection of Cough: Diving into Machine Listening-based COVID-19 Analysis and Detection

Abstract

Purpose: The coronavirus disease 2019 (COVID-19) has caused a crisis worldwide. Amounts of efforts have been made to prevent and control COVID-19’s transmission, from early screenings to vaccinations and treatments. Recently, due to the spring up of many automatic disease recognition applications based on machine listening techniques, it would be fast and cheap to detect COVID-19 from recordings of cough, a key symptom of COVID-19. To date, knowledge on the acoustic characteristics of COVID-19 cough sounds is limited, but would be essential for structuring effective and robust machine learning models. The present study aims to explore acoustic features for distinguishing COVID-19 positive individuals from COVID-19 negative ones based on their cough sounds.

Methods: With the theory of computational paralinguistics, we analyse the acoustic correlates of COVID-19 cough sounds based on the COMPARE feature set, i. e., a standardised set of 6,373 acoustic higher-level features. Furthermore, we train automatic COVID-19 detection models with machine learning methods and explore the latent features by evaluating the contribution of all features to the COVID-19 status predictions.

Results: The experimental results demonstrate that a set of acoustic parameters of cough sounds, e. g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, are relevant for the differentiation between COVID-19 positive and COVID-19 negative cough samples. Our automatic COVID-19 detection model performs significantly above chance level, i. e., at an unweighted average recall (UAR) of 0.632, on a data set consisting of 1,411 cough samples (COVID-19 positive/negative: 210/1,201).

Conclusions: Based on the acoustic correlates analysis on the COMPARE feature set and the feature analysis in the effective COVID-19 detection model, we find that the machine learning method to a certain extent relies on acoustic features showing higher effects in conventional group difference testing.

RR:C19 Evidence Scale rating by reviewer:

  • Reliable. The main study claims are generally justified by its methods and data. The results and conclusions are likely to be similar to the hypothetical ideal study. There are some minor caveats or limitations, but they would/do not change the major claims of the study. The study provides sufficient strength of evidence on its own that its main claims should be considered actionable, with some room for future revision.



This study adds to the growing body of evidence on the use of bioacoustic signal analysis and machine learning methods for the detection of COVID-19. The paper focuses on the analysis of cough sounds, assessing the predictive power of several acoustic features previously used in the field of computational paralinguistics. The authors claim that machine learning models trained on data consisting of such acoustic features extracted from recordings of patients' coughs can distinguish between cough samples of symptomatic and asymptomatic COVID-19 patients and controls. They identify a subset of these acoustic features which contribute the most to the model's predictive power, finding some consistency between features identified through machine learning and those identified by conventional statistical testing. Based on RR:C19’s Strength of Evidence Scale, these claims are reliable, generally supported by the data and methods used, and therefore actionable with some limitations.

The paper presents a good, though not exhaustive, review of the growing literature on COVID-19 detection based on speech and cough sounds. However, the main contribution of the study lies not in improved prediction accuracy over the state of the art, but in its detailed analysis of the acoustic features of cough sounds that are relevant to COVID-19 detection, based on a carefully curated, publicly available data set. A subset of the COUGHVID (Orlandic et al. 2021) data set was selected to support three binary prediction tasks: COVID-19 positive versus negative, symptomatic COVID-19 positive versus symptomatic COVID-19 negative patients, and asymptomatic COVID-19 positive versus asymptomatic controls. Overall, the models achieve moderate discriminatory power, with the area under the receiver operating characteristic curve (AUC) ranging between 0.61 and 0.67 for the best model on each task.
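To make these metrics concrete, the following minimal Python sketch (not taken from the preprint) shows one way the unweighted average recall (UAR) quoted in the abstract and the AUC quoted here can be computed. The class sizes are those given in the abstract (210 positive, 1,201 negative); the function names and the "always negative" baseline are illustrative assumptions.

```python
from typing import Sequence


def unweighted_average_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the per-class recalls; each class counts equally, whatever its size."""
    recalls = []
    for c in sorted(set(y_true)):
        # Indices of the samples whose true label is class c.
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties contribute 0.5. This pairwise form is O(n_pos * n_neg), which is fine
    for a small illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Class sizes from the abstract: 210 positive and 1,201 negative cough samples.
y_true = [1] * 210 + [0] * 1201

# Hypothetical baseline that labels every sample as negative.
always_negative = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
uar = unweighted_average_recall(y_true, always_negative)
```

With this class split, the always-negative baseline reaches an accuracy of 1201/1411 (about 0.85) but a UAR of exactly 0.5. This is why 0.5 is the chance level against which a UAR of 0.632, or an AUC between 0.61 and 0.67, should be read.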

The analysis of the bioacoustic features is generally informative. It compares two assessments of the discriminative power of their low-level descriptors: conventional non-parametric statistical testing and machine learning scores. The methods are well described, and the results are carefully presented, summarised, and discussed in relation to features found to be relevant in other studies. The authors, however, do not attempt to discuss what these findings imply about COVID-19 pathology or what their clinical implications might be, which limits this otherwise interesting contribution.
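As an illustration of what ranking features by a non-parametric group difference can look like, here is a minimal Python sketch. It assumes a Mann-Whitney U style comparison with the rank-biserial correlation as effect size; this review does not state which test the authors used, so the choice of test, the function names, and the synthetic data are all assumptions made for this example.

```python
import random
from typing import List, Sequence


def rank_biserial(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Effect size of a Mann-Whitney U comparison, in the range [-1, 1].

    U counts the (positive, negative) pairs in which the positive value is
    larger, with ties counting 0.5; r = 2U / (n_pos * n_neg) - 1.
    """
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return 2.0 * u / (len(pos) * len(neg)) - 1.0


def rank_features(pos_rows: Sequence[Sequence[float]],
                  neg_rows: Sequence[Sequence[float]]) -> List[int]:
    """Return feature indices ordered by decreasing absolute effect size."""
    n_features = len(pos_rows[0])
    effects = [
        abs(rank_biserial([row[j] for row in pos_rows],
                          [row[j] for row in neg_rows]))
        for j in range(n_features)
    ]
    # sorted() is stable, so features with equal effects keep their index order.
    return sorted(range(n_features), key=lambda j: effects[j], reverse=True)


# Synthetic, hypothetical data: feature 0 is shifted between the two groups,
# feature 1 is pure noise. These are not values from the preprint.
rng = random.Random(0)
pos_rows = [[rng.gauss(1.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
neg_rows = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
ranking = rank_features(pos_rows, neg_rows)
```

The quantity U / (n_pos * n_neg) is also the AUC obtained when a single feature is used directly as a score, so such an effect size and a classifier's per-feature contribution measure related things. A ranking like this one can then be compared against the features that contribute most to a trained model.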

Although the authors made great efforts to improve the reliability of the data, the data set remains a major limitation of the study. The authors acknowledge as much in a separate section of the paper, pointing out that the self-reported nature of the data collection procedure makes it impossible to verify the accuracy of the participants' status. Although the risk of selection bias is mentioned, the paper lacks a proper discussion of age bias, which would seem important given that the data consist largely of samples from participants in their 20s and 30s. While this and the relatively low AUC scores undermine the authors' claim regarding the usefulness of the machine learning models in clinical settings, the results show promise, highlighting an area that deserves further research.

Orlandic, L., Teijeiro, T. & Atienza, D. The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms. Sci Data 8, 156 (2021).
