Title: Yet Another Algorithm for Pitch Tracking
Authors: Kavita Kasi, Stephen A. Zahorian
Published: 2002-05-01
Link: https://ieeexplore.ieee.org/document/5743729/authors#authors

Abstract

In this paper, we present a pitch detection algorithm that is extremely robust for both high quality and telephone speech. The kernel method for this algorithm is the “NCCF or Normalized Cross Correlation” reported by David Talkin [1]. Major innovations include: processing of the original acoustic signal and a nonlinearly processed version of the signal to partially restore very weak F0 components; intelligent peak picking to select multiple F0 candidates and assign merit factors; and, incorporation of highly rohust pitch contours obtained from smoothed versions of low frequency portions of spectrograms. Dynamic programming is used to find the “best” pitch track among all the candidates, using both local and transition costs. We evaluated our algorithm using the Keele pitch extraction reference database as “ground truth” for both “high quality” and “telephone” speech. For both types of speech, the error rates obtained are lower than the lowest reported in the literature.