Automated extraction of baleen whale calls based on the pseudo-Wigner–Ville distribution / Pu, Wangyi; Liu, Songzuo; Qing, Xin; Qiao, Gang; Mazhar, Suleman; Ma, Tianlong. - In: THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA. - ISSN 0001-4966. - 153:3(2023), pp. 1564-1579. [10.1121/10.0017457]
Automated extraction of baleen whale calls based on the pseudo-Wigner–Ville distribution
Pu, Wangyi
2023
Abstract
Baleen whales produce a wide variety of frequency-modulated calls. Extraction of the time-frequency (TF) structures of these calls forms the basis for many applications, including abundance estimation and species recognition. Typical methods to extract the contours of whale calls from a spectrogram are based on the short-time Fourier transform and are, thus, restricted by a fixed TF resolution. Considering the low-frequency nature of baleen whale calls, this work represents the contours using a pseudo-Wigner–Ville distribution for a higher TF resolution at the cost of introducing cross terms. An adaptive threshold is proposed followed by a modified Gaussian mixture probability hypothesis density filter to extract the contours. Finally, the artificial contours, which are caused by the cross terms, can be removed in post-processing. Simulations were conducted to explore how the signal-to-noise ratio influences the performance of the proposed method. Then, in experiments based on real data, the contours of the calls of three kinds of baleen whales were extracted in a highly accurate manner (with mean deviations of 5.4 and 0.051 Hz from the ground-truth contours at sampling rates of 4000 and 100 Hz, respectively) with a recall of 75% and a precision of 78.5%.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Pu_Automated extraction_2023.pdf | Archive managers only | Publisher's version (published version with the publisher's layout) | All rights reserved | 3.75 MB | Adobe PDF |
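
The abstract's remark that a pseudo-Wigner–Ville distribution buys TF resolution "at the cost of introducing cross terms" can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pseudo_wigner_ville`, the parameters `half_window` and `n_freq`, the Hann-shaped lag window, and the two-tone test signal are all assumptions chosen for illustration. The adaptive threshold and the probability hypothesis density filter are not sketched here. The input is assumed to be an analytic (complex) signal, so a real recording would first have to be converted to analytic form.

```python
import cmath
import math


def pseudo_wigner_ville(z, half_window, n_freq):
    """Discrete pseudo-Wigner-Ville distribution of an analytic signal z.

    Returns tfr[t][k]; with sampling rate fs, bin k lies at k * fs / (2 * n_freq) Hz.
    """
    n = len(z)
    # Hann-shaped lag window; it smooths along frequency (the "pseudo" part).
    window = [0.5 * (1.0 + math.cos(math.pi * m / (half_window + 1)))
              for m in range(-half_window, half_window + 1)]
    tfr = []
    for t in range(n):
        # Keep t + m and t - m inside the signal near the edges.
        max_lag = min(half_window, t, n - 1 - t)
        lags = range(-max_lag, max_lag + 1)
        # Windowed instantaneous autocorrelation z[t+m] * conj(z[t-m]).
        kernel = [window[m + half_window] * z[t + m] * z[t - m].conjugate()
                  for m in lags]
        row = []
        for k in range(n_freq):
            # Fourier transform over the lag variable; the sum is real by symmetry.
            total = sum(c * cmath.exp(-2j * math.pi * k * m / n_freq)
                        for c, m in zip(kernel, lags))
            row.append(total.real)
        tfr.append(row)
    return tfr


# Illustrative test signal (assumed values): two complex tones at 10 Hz and 30 Hz.
fs = 100.0
signal = [cmath.exp(2j * math.pi * 10 * i / fs) + cmath.exp(2j * math.pi * 30 * i / fs)
          for i in range(128)]
tfr = pseudo_wigner_ville(signal, half_window=31, n_freq=50)  # 1 Hz per bin

row = tfr[65]
print("10 Hz:", round(row[10], 1), "20 Hz (cross term):", round(row[20], 1),
      "30 Hz:", round(row[30], 1))
```

In this toy case the two tones show up at their own frequencies, while a cross term appears midway between them at 20 Hz and oscillates in sign over time. That oscillating component is the kind of artificial contour that the abstract says is removed in post-processing.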
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


