Practical tests with the participation of patients
Author: Ph.D. Marcin Just (DiagNova Technologies)
The results of the earlier tests were verified by a test carried out under conditions that let each microphone perform at its best. This made it possible to determine conclusively whether a given microphone is suitable for voice examination. The best available microphone amplifier was used.
Methodology
A dedicated MobilePRE audio interface, which serves as both a microphone preamplifier and a high-class sound card, was used for testing, together with five microphones:
- miniature condenser microphone,
- classic (cheap) computer microphone (electret condenser microphone),
- cheap dynamic microphone - TDM 205 (usable band according to the manufacturer 80-12000 Hz),
- "entry-level" dynamic vocal microphone - Shure C607N (usable band according to the manufacturer 60-12000 Hz),
- classic dynamic vocal microphone - Behringer XM8500 (usable band 50-12000 Hz).
Additionally, an attempt was made to analyze the samples recorded with the use of a voice recorder on classic cassette tapes. This test was aimed at assessing the quality of this method of archiving voice samples.
The reference microphone was the Behringer B-1 studio condenser microphone. The recordings were made simultaneously (two-channel recording) using the reference and test microphones.
The recording level was set optimally; the sampling frequency was 22050 Hz and the resolution 16 bits. The distance between the speaker's mouth and the microphone was chosen to avoid adverse changes in the characteristics of dynamic microphones. The recordings were made in a quiet, shielded room, using a laptop computer to eliminate electromagnetic interference (including mains hum). As a result, the influence of the microphones on the recording was practically limited to differences in frequency characteristics.
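As a quick check of what the stated recording format can capture, the short sketch below computes the Nyquist frequency and the ideal quantization dynamic range for 22050 Hz and 16 bits. The function name `recording_limits` is only illustrative, and the dynamic range is the theoretical figure for ideal quantization, not a measured property of the equipment used here.

```python
import math

def recording_limits(sample_rate_hz, bits):
    """Return (Nyquist frequency in Hz, ideal quantization dynamic range in dB)."""
    nyquist_hz = sample_rate_hz / 2.0
    # Ratio between full scale and one quantization step, expressed in dB.
    dynamic_range_db = 20.0 * math.log10(2 ** bits)
    return nyquist_hz, dynamic_range_db

nyquist_hz, dynamic_range_db = recording_limits(22050, 16)
print(f"{nyquist_hz:.0f} Hz, {dynamic_range_db:.1f} dB")  # 11025 Hz, 96.3 dB
```

With this format, signal content above 11025 Hz cannot be represented, so the top of the 12000 Hz band quoted for the dynamic microphones lies just outside the recorded range.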
Practical tests (recording and analyzing a voice sample) were carried out for each microphone. Thirteen people took part in the recordings (6 men and 7 women):
- 6 people with no observed anomalies in the voice and no diagnosed pathological conditions of the larynx,
- 2 people with an extremely low voice,
- 2 people generating voice using a prosthesis,
- 3 people diagnosed with a pathological condition of the larynx causing voice disorders.
Ultimately, the material was divided into three groups:
- voices "without anomalies" - correct (6),
- "low" voices - with extremely low frequency (3),
- "pathological" voices - in people with voice disorders (4).
For the analysis, recordings of sustained phonation of the vowel "a" were used. The two recordings made in a single test were shifted in time relative to each other and trimmed using a purpose-built computer program, so as to compensate for the difference in distance between each microphone and the patient and to maximize their similarity (in the sense of correlation). Fig. 10 shows the program window, and Fig. 11 shows examples of audio waveforms (voice samples) recorded simultaneously from two microphones, after time synchronization.
Fig. 10. The result of the automatic timing of the two waveforms
Fig. 11. Program window for time synchronization of waveforms recorded from multiple sources
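The time synchronization described above can be sketched as a search for the lag that maximizes the normalized correlation between the two channels, followed by trimming both to their common overlap. This is a minimal illustration under stated assumptions, not the program shown in the figures: the function name `align_by_correlation`, the `max_lag` search window, and the synthetic noise signal are all hypothetical. For orientation, at 22050 Hz one sample corresponds to roughly 1.6 cm of acoustic path (speed of sound about 343 m/s).

```python
import numpy as np

def align_by_correlation(reference, test, max_lag):
    """Shift and trim two recordings so that their normalized correlation is maximal.

    Returns (reference_trimmed, test_trimmed, lag, score). A positive lag means
    the test channel is delayed relative to the reference channel.
    """
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)

    def overlap(lag):
        # Drop the leading samples of the delayed channel,
        # then truncate both channels to a common length.
        r = reference[max(-lag, 0):]
        t = test[max(lag, 0):]
        n = min(len(r), len(t))
        return r[:n], t[:n]

    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        r, t = overlap(lag)
        norm = np.linalg.norm(r) * np.linalg.norm(t)
        if norm == 0.0:
            continue  # empty or silent overlap, nothing to compare
        score = float(np.dot(r, t) / norm)
        if score > best_score:
            best_lag, best_score = lag, score

    r, t = overlap(best_lag)
    return r, t, best_lag, best_score

# Synthetic stand-in for a two-channel recording (not real voice data):
rng = np.random.default_rng(0)
base = rng.standard_normal(2000)
reference = base[7:1507]
test = 0.5 * base[:1500]  # same signal, 7 samples later and quieter
r, t, lag, score = align_by_correlation(reference, test, max_lag=50)
print(lag, round(score, 3))
```

Normalizing by the signal norms makes the score independent of the level difference between microphones, which matters here because the tested microphones differ widely in sensitivity. For a strongly periodic signal such as a sustained vowel, `max_lag` should be kept small (a few centimetres of path difference), otherwise lags that differ by a whole pitch period can score almost equally well.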
The analyses were performed automatically. Operator intervention, which consisted of a preliminary estimate of the fundamental frequency, was needed only for the low-frequency cases and was identical for all samples. Only two samples were compared directly at a time, the test sample and the reference sample, by determining the ratio of the corresponding parameters computed for the two.
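The comparison step can be sketched as a ratio of matching parameters, expressed here as a relative change so that 0 means "no difference from the reference microphone". The function name, the parameter names `jitter` and `shimmer`, and the numeric values are illustrative placeholders, not parameters or measurements from this study.

```python
def relative_changes(test_params, reference_params):
    """Return test/reference - 1 for every parameter present in both dicts.

    Positive values mean the test microphone overestimates the parameter,
    negative values mean it underestimates it.
    """
    changes = {}
    for name, ref_value in reference_params.items():
        if name not in test_params or ref_value == 0:
            continue  # a ratio is undefined without both values
        changes[name] = test_params[name] / ref_value - 1.0
    return changes

# Made-up example values, for illustration only:
reference = {"jitter": 0.50, "shimmer": 2.0}
test = {"jitter": 0.55, "shimmer": 1.9}
for name, change in relative_changes(test, reference).items():
    print(f"{name}: {change:+.1%}")
```

Working with ratios rather than absolute differences lets parameters with very different units and magnitudes be compared on a single chart.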
Results
Figure 12 shows the relative changes in acoustic parameters for all tested microphones.
Fig. 12. Relative changes in the values of the determined acoustic parameters for different types of voices and different microphones
The three horizontal stripes show the changes in parameter values in the groups of pathological, low and healthy voices, respectively. Yellow indicates that the analysis results are overestimated, blue that they are underestimated. Green indicates no difference between the test microphone and the reference microphone. The graphs do not show the results for the samples recorded with a tape recorder, because in their case the changes in parameters significantly exceeded those observed with the worst microphones.
The graphs show that, once the influence of the microphones is limited to changes in frequency characteristics, only the cheapest dynamic microphone has an adverse effect on normal voices. Its very low sensitivity increases the significance of preamplifier noise, which is reflected in increased values of some parameters.
In "pathological" voices, a minimal adverse effect on the recordings was observed for the cheap condenser microphones (computer and miniature): because their frequency response is somewhat limited at high frequencies, they lower the values of some parameters. In addition, the lack of a protective sponge to prevent blowing directly into the microphone may to some extent affect some parameters related to voice stability (the two "peaks" for "polawew" and R2HDev in the pathological voices chart for the computer microphone). For voices with a very low fundamental frequency, most of the tested microphones have a negative impact on the recording; the exception is the best professional dynamic microphone. Apart from this last case and the cheapest dynamic microphone available on the market, the remaining recording defects related to the frequency characteristics can be corrected numerically, when the parameters of the acoustic analysis are determined.
Problems with the analysis of the lowest-frequency waveforms probably arise because the fundamental frequency lies below the lower limit of the microphones' frequency response, which prevents it from being determined correctly. Fig. 13 shows examples of differences in the fundamental frequency graphs obtained for the same voice sample recorded with a studio microphone and with a dynamic microphone.
Fig. 13. Comparison of frequency charts for a prosthetic voice recorded with two types of microphones; left: high-end dynamic microphone, right: correct graph from a studio condenser microphone
Unfortunately, samples from the studio microphone often contained additional very low frequency components that made the analysis difficult (Fig. 14) and worsened its results. When this type of microphone is used, care should be taken to mount it stably.
Fig. 14. Very low frequency components captured using a studio condenser microphone
Conclusions
- The analysis of a signal previously recorded on audio tape should be preceded by an inspection of the equipment, as the results obtained may be significantly altered and may not be suitable for further analysis.
- The miniature microphone and the cheapest dynamic microphone should not be used for diagnostic purposes.
- A computer microphone combined with a good amplifier fully performs its function when it is used to analyze voices with "normal" fundamental frequencies in good room conditions (a quiet, shielded room). This also applies to the cheapest dynamic microphone. The limitation of the upper cut-off frequency observed in previous tests is probably related mainly to the limitations of the built-in sound card preamplifier. It is advisable to use an additional sponge to suppress the "additional exhalation preceding phonation" and to replace the connecting cable with a better shielded one.
- For the analysis of voices with a fundamental frequency F0 < 100 Hz, either the best dynamic microphones (with the lowest possible lower bandwidth limit) or, preferably, high-class studio condenser microphones should be used. Computer microphones perform surprisingly well here, which can be explained by the similarity of their operating principle to that of a studio microphone.
- Proper use of studio microphones requires good knowledge of acoustic technique, perfect microphone mounting and a lot of attention. Using them in everyday practice for the analysis of voices with F0 > 100 Hz seems inadvisable and, if the microphone is mounted incorrectly, may lead to worse results of the acoustic voice analysis.