Voice acoustic analysis in ENT practice

Part II: Application of acoustic analysis in practice

F0 analysis

Author: Ph.D. Marcin Just (DiagNova Technologies)

Fundamental frequency analysis is the first of these analyses in which a serious error in the computational method may be encountered. For healthy voices the fundamental frequency is rarely determined incorrectly, but for “pathological” voices the probability of error increases with the severity of the pathology. This must be taken into account when analyzing fundamental frequency plots and determining the mean value. In addition, fundamental frequency plots for a steadily sustained sound are analyzed differently from those for a sentence. The analysis of a sustained sound is the more important one for vocal fold diagnostics, and it is the one presented here. It is also important that the fundamental frequency is the basis for determining many parameters that characterize speech, so errors in determining F0 automatically translate into incorrect values of these parameters.

Thus, an abnormal shape of the fundamental frequency plot may be caused by abnormal vocal fold function as well as by errors in determining the frequency itself. It is important to be able to distinguish between the two causes and, where possible, to eliminate errors by correcting the frequency search range.

The origin of errors in the determination of F0

The basic problem in determining the fundamental frequency is mistaking a subharmonic (usually (1/2) F0) for F0, which happens mostly in pathological voices, or, even in healthy voices, mistaking the first (less often a higher) harmonic for F0. To reduce the risk of an erroneous result, the search range should be narrowed as much as possible. First of all, it is necessary to take into account the sex of the examined person, their age, profession, health condition, past diseases and procedures performed. Once an approximate F0 range has been preset, the program determines the actual value precisely.
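
The effect of narrowing the search range can be illustrated with a minimal Python sketch. It assumes a plain normalized-autocorrelation estimator, which is not necessarily the method used by any particular analysis program. The function name `estimate_f0`, its parameters and the synthetic test signal are hypothetical. The signal stands in for a voice assumed to phonate at 220 Hz with a strong subharmonic at 110 Hz.

```python
import math


def estimate_f0(frame, sample_rate, f0_min, f0_max):
    """Return the F0 (Hz) whose lag maximizes the normalized autocorrelation.

    Only lags corresponding to the range [f0_min, f0_max] are searched.
    Illustrative estimator, not the algorithm of any specific program.
    """
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(len(frame) - 1, int(sample_rate / f0_min))
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        n = len(frame) - lag
        num = sum(frame[i] * frame[i + lag] for i in range(n))
        e1 = sum(frame[i] ** 2 for i in range(n))
        e2 = sum(frame[i + lag] ** 2 for i in range(n))
        denom = math.sqrt(e1 * e2)
        score = num / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 0.1 s frame (assumed values): 220 Hz tone plus a 110 Hz subharmonic.
SAMPLE_RATE = 8800
frame = [
    math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 110 * i / SAMPLE_RATE)
    for i in range(880)
]

wide = estimate_f0(frame, SAMPLE_RATE, 60, 400)     # wide, unspecific range
narrow = estimate_f0(frame, SAMPLE_RATE, 150, 300)  # range narrowed around 220 Hz
print(wide, narrow)
```

With the wide range this estimator locks onto the period of the 110 Hz subharmonic, because the waveform repeats exactly only at that period. With the range narrowed to 150 to 300 Hz it returns a value close to 220 Hz.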

Errors associated with the existence of strong subharmonics

Fig. 14 shows an example oscillogram for which it is practically impossible to determine F0 without assuming a possible range for its value (based, e.g., on gender). Depending on the side from which the analysis starts in this case, a different frequency value would be found.

Fig. 14. Oscillogram with the repeating fragments marked that could each represent a hypothetical fundamental period

In such cases, the fundamental frequency graph usually looks like Fig. 15 and is characterized by abrupt jumps of up to 100% of the frequency value. The solution is to estimate which frequency value is correct (110 Hz or 220 Hz in Fig. 15) and to limit the search range so as to eliminate the errors (corrected graph in Fig. 16).
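
Jumps of this kind can also be flagged automatically before the graph is inspected. The following is a minimal sketch, assuming the F0 track is available as a list of per-frame values in Hz; the function name `find_octave_jumps`, the `tolerance` parameter and the example values are hypothetical.

```python
import math


def find_octave_jumps(f0_track, tolerance=0.15):
    """Return indices of frames whose F0 is roughly double or half the previous one.

    Frames with no F0 (None or non-positive) are skipped and reset the comparison.
    `tolerance` is the allowed deviation from exactly one octave, in octaves.
    """
    jumps = []
    previous = None
    for index, value in enumerate(f0_track):
        if value is None or value <= 0:
            previous = None
            continue
        if previous is not None:
            octaves = abs(math.log2(value / previous))
            if abs(octaves - 1.0) <= tolerance:
                jumps.append(index)
        previous = value
    return jumps


# Hypothetical track alternating between about 220 Hz and about 110 Hz.
track = [220.0, 221.0, 110.0, 110.5, 219.0, 220.0]
print(find_octave_jumps(track))
```

A flagged frame only indicates that the value doubled or halved. Which of the two levels is the correct one still has to be judged as described above.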

Fig. 15. Incorrect F0 graph: influence of subharmonics

Fig. 16. The graph from Fig. 15 after correction

Errors related to irregular vibrations of the folds

In the case of extremely irregular vibrations (example in Fig. 17), or of vibrations produced by different structures, the fundamental frequency found may be erroneous. In such a case, its determination should either be abandoned (and with it the determination of most of the parameters derived from it) or limited to the more regular fragments.

Fig. 17. An example of vibrations of folds that are periodically highly irregular

Errors due to disturbances

Sometimes external disturbances are interpreted as phonation (Fig. 18, from 1700 ms to 2000 ms). Spurious fragments of the graph then appear with a markedly different frequency value, usually in regions with no real phonation or at the "ends" of sounds. The problem is solved by narrowing the frequency search range, trimming the voice sample, or setting the "phonation - no phonation" decision level appropriately.
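
The role of the decision level can be sketched in Python as follows. The sketch assumes a simple rule based on frame energy relative to the loudest frame; the actual decision rule of a given program may differ. The function name `phonation_mask`, the threshold values and the synthetic signal (a low-level tone standing in for a disturbance) are hypothetical.

```python
import math


def phonation_mask(samples, frame_length, threshold_db=-30.0):
    """Mark each frame as phonation (True) or no phonation (False).

    A frame counts as phonation when its RMS level, relative to the loudest
    frame, is at or above threshold_db. Illustrative rule only.
    """
    rms = []
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        frame = samples[start:start + frame_length]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    peak = max(rms, default=0.0)
    if peak == 0.0:
        return [False] * len(rms)
    mask = []
    for value in rms:
        level_db = 20.0 * math.log10(value / peak) if value > 0 else -math.inf
        mask.append(level_db >= threshold_db)
    return mask


# Hypothetical signal: 400 samples of phonation followed by 400 samples of a
# disturbance about 40 dB weaker.
samples = [
    (1.0 if i < 400 else 0.01) * math.sin(2 * math.pi * 200 * i / 8000)
    for i in range(800)
]

print(phonation_mask(samples, 200, threshold_db=-30.0))  # disturbance rejected
print(phonation_mask(samples, 200, threshold_db=-50.0))  # disturbance accepted
```

With the level set too low, the weak disturbance is treated as phonation and would contribute spurious F0 values. Raising the level excludes it.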

Fig. 18. Noise interpreted as phonation: a) before correction; b) after correction

Errors related to F0 search range mismatch

These are perhaps the most serious errors. Fortunately, they affect only post-operative (laryngectomy) voices, prosthetic voices, and singing voices during vocal range testing. For very low or extremely high voices, the frequency value should always be checked against a narrowband spectrogram (watch for hum; see Fig. 2b). An attempt to determine extremely low fundamental frequency values from samples recorded with equipment of insufficient bandwidth (limited at the low end to, e.g., 70 Hz) can manifest in a similar way. An example of an incorrectly determined frequency for a prosthetic voice is shown in Fig. 19. In this case, only defining the search range correctly can help.

Fig. 19. Incorrectly determined fundamental frequency for a very low prosthetic voice: a) before correction; b) after correction

F0 graph interpretation

While determining the fundamental frequency correctly is often somewhat complex, interpreting the graphs is simple (especially for sustained "a" phonation). The smoother the graph, the better. Examples are the graphs in Figs. 16 and 18b.
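
As a rough illustration of what "smoother" means numerically, the short Python sketch below computes the mean relative change between consecutive F0 values. This is a hypothetical indicator chosen for the example, not a standardized clinical parameter; the function name `mean_relative_step` and the sample tracks are assumptions.

```python
def mean_relative_step(f0_track):
    """Return the mean of |F0[i+1] - F0[i]| / F0[i] over a track of positive values."""
    steps = [abs(b - a) / a for a, b in zip(f0_track, f0_track[1:])]
    return sum(steps) / len(steps) if steps else 0.0


smooth = mean_relative_step([220.0, 221.0, 220.0, 219.0])  # steady phonation
jumpy = mean_relative_step([220.0, 110.0, 220.0, 110.0])   # octave jumps
print(smooth, jumpy)
```

The steady track yields a value well below 1%, while the track with octave jumps yields a value two orders of magnitude larger.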