Practical tests of the influence of recording conditions
Author: Marcin Just, Ph.D. (DiagNova Technologies)
After the synthetic determination of how disturbances affect the analysis results, and after characterizing selected hardware configurations, a test was carried out under real conditions. Its main goals were to investigate how errors made in a real recording process (mainly an incorrect distance between the microphone and the mouth) affect the final analysis results, and to determine in practice the relationship between the characteristics of the equipment and the calculation results.
Methodology
A Yamaha HS50M studio monitor was used as the sound source to maximize repeatability. Only the recording set and the distance between the speakers and the microphone, in the range of 6–50 cm, were varied. For each set, the two standard voice samples used earlier in the synthetic tests were each played back four times. The recording room was a quiet anechoic chamber. No electromagnetic shielding was applied, and interference from the lighting network and from other equipment operating in the room was not eliminated. This was meant to simulate the best recording conditions achievable in an ordinary room without expensive countermeasures. The same equipment was used as in the previous tests, with some additional combinations of elements. In all cases, the sampling frequency was 22050 Hz and the resolution was 16 bits.
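When repeating such a test, it can help to confirm that every recording really has the stated format before it is analyzed. The following is a minimal sketch using only the Python standard library; the function names and the in-memory test file are illustrative assumptions, not part of the original test procedure.

```python
import io
import wave

EXPECTED_RATE_HZ = 22050   # sampling frequency stated in the text
EXPECTED_SAMPLE_WIDTH = 2  # 16 bits = 2 bytes per sample


def check_recording_format(wav_file):
    """Return a list of format problems found in the WAV header (empty if none)."""
    problems = []
    with wave.open(wav_file, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sampling frequency is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"resolution is {8 * wav.getsampwidth()} bits")
    return problems


def make_blank_wav(rate_hz, sample_width, n_frames=100):
    """Build a short mono WAV of zero bytes in memory (demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate_hz)
        wav.writeframes(bytes(n_frames * sample_width))
    buffer.seek(0)
    return buffer


# A file with the expected format yields no problems; a 44.1 kHz file yields one.
print(check_recording_format(make_blank_wav(22050, 2)))
print(check_recording_format(make_blank_wav(44100, 2)))
```

In practice, `check_recording_format` would be given the path of each recorded file instead of an in-memory buffer.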
Conclusions
Fig. 7 shows the relative changes in the acoustic analysis parameters for the different hardware sets. Each vertical band on the graph shows the results for one combination of microphone, amplifier, and sound card. For each set, four recordings were made of the male voice sample and four of the female voice sample, so each band has four columns. In each band, the leftmost column shows the results for the recording made at the shortest distance between the microphone and the speakers (about 6 cm), with optimal input drive. The rightmost column shows the results for the recording made from about 50 cm, where the appropriate drive was not achieved and the recording was more exposed to interference. The middle columns correspond to intermediate settings.
Ideally, a band would be entirely green, meaning no difference from the parameters of the reference waveform. Blue represents decreased parameter values, and red represents increased values.
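The color coding described above can be sketched as follows. The text does not give the exact formula behind the "relative change", so this sketch assumes the usual definition, (measured - reference) / reference. The 5% tolerance for "green" is an arbitrary illustrative threshold, and the function names are hypothetical.

```python
def relative_change(measured, reference):
    """Relative deviation of a parameter from its reference value (assumed formula)."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (measured - reference) / reference


def colour_for(change, tolerance=0.05):
    """Map a relative change to the colour convention used in Fig. 7.

    The tolerance is an illustrative assumption, not a value from the study.
    """
    if change > tolerance:
        return "red"    # parameter overstated
    if change < -tolerance:
        return "blue"   # parameter understated
    return "green"      # no meaningful difference from the reference


# Hypothetical values: a parameter measured at 1.2 against a reference of 1.0.
print(colour_for(relative_change(1.2, 1.0)))
```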
Fig. 7. Comparison of the impact of errors made during the recording process on the analysis results for six hardware sets: red indicates overstated analysis results, blue indicates understated results
The labels below the figure, which also apply to all the other graphs, have the following meanings:
- intbw – built-in sound card without additional preamplifier,
- intwz – built-in sound card with additional microphone preamplifier (Behringer Mic200),
- sblive – Creative External Live! external sound card (USB),
- mpre – MobilePRE external audio interface from M-Audio,
- komp – the simplest electret computer microphone,
- m8500 – mid-range vocal dynamic microphone,
- b1 – studio condenser microphone Behringer B1.
For the female voice, the type of equipment has a relatively small impact on the analysis results. Only with the built-in sound card (bands A and B) is there a slight increase in the Jitter and U2H parameters for the least carefully made recordings (the greatest distance). For the male voice, the equipment matters considerably more. Only the recordings made with the best equipment (bands I and L) were analyzed correctly regardless of the quality of the recording process itself. The parameters from the Jitter and U2H groups changed the most, which can be explained by interference from the lighting network (50 Hz) combined with the lower fundamental frequency of male voices.
Sample results for the best recording set are presented graphically in Table 1: the reference recording and the test recordings made at the shortest and greatest distances.
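The interaction between 50 Hz interference and the fundamental frequency can be illustrated with a deliberately crude model. The sketch below assumes a pure sine tone as the "voice", a 50 Hz sine as the hum, period detection by interpolated upward zero crossings, and a simple "local" jitter measure. The fundamental frequencies (120 Hz and 220 Hz) and the hum level are hypothetical, and the analysis software used in the study may detect periods quite differently.

```python
import math

SAMPLE_RATE_HZ = 22050  # sampling frequency used in the tests
HUM_HZ = 50.0           # lighting-network frequency named in the text


def tone_with_hum(f0_hz, hum_level, duration_s=1.0):
    """Unit-amplitude sine at f0_hz plus a 50 Hz sine of amplitude hum_level."""
    n = int(SAMPLE_RATE_HZ * duration_s)
    return [
        math.sin(2 * math.pi * f0_hz * i / SAMPLE_RATE_HZ)
        + hum_level * math.sin(2 * math.pi * HUM_HZ * i / SAMPLE_RATE_HZ)
        for i in range(n)
    ]


def periods_from_zero_crossings(signal):
    """Period lengths (s) between upward zero crossings, linearly interpolated."""
    crossings = []
    for i in range(1, len(signal)):
        prev, cur = signal[i - 1], signal[i]
        if prev < 0.0 <= cur:
            # Fractional sample position where the straight line hits zero.
            crossings.append((i - 1 + prev / (prev - cur)) / SAMPLE_RATE_HZ)
    return [b - a for a, b in zip(crossings, crossings[1:])]


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods over the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


def jitter_with_hum(f0_hz, hum_level):
    signal = tone_with_hum(f0_hz, hum_level)
    return local_jitter_percent(periods_from_zero_crossings(signal))


# Hypothetical fundamental frequencies for a male-like and a female-like tone.
for label, f0 in (("120 Hz tone", 120.0), ("220 Hz tone", 220.0)):
    print(label, "clean:", jitter_with_hum(f0, 0.0), "with hum:", jitter_with_hum(f0, 0.05))
```

In this model the hum shifts each zero crossing slightly, and the shift changes more from one cycle to the next when the cycles are longer, so the lower tone shows the larger apparent jitter.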
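| Playback of the reference recording | Test recording, best carried out | Test recording, worst carried out |
|---|---|---|
| (image not available) | (image not available) | (image not available) |
| (image not available) | (image not available) | (image not available) |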
The corresponding results for the worst recording set are presented graphically in Table 2: the reference recording and the test recordings made at the shortest and greatest distances.
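| Playback of the reference recording | Test recording, best carried out | Test recording, worst carried out |
|---|---|---|
| (image not available) | (image not available) | (image not available) |
| (image not available) | (image not available) | (image not available) |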
The influence of the recording equipment was examined in detail only for a few selected parameters, those most commonly used in clinical practice: parameters from the Jitter and Shimmer groups and the NHR parameter. Figure 8 shows a horizontal section through Figure 7 for the Jitter parameter.
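The text does not define the Jitter and Shimmer parameters. As a point of reference, the commonly used "local" variants measure cycle-to-cycle variation of the period (Jitter) and of the peak amplitude (Shimmer). The sketch below assumes these definitions; the analyzer used in the study may compute other variants, and the input values are made up for illustration.

```python
def local_perturbation_percent(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_percent(periods_s):
    """Local jitter: cycle-to-cycle variation of the period length."""
    return local_perturbation_percent(periods_s)


def shimmer_percent(peak_amplitudes):
    """Local shimmer: cycle-to-cycle variation of the peak amplitude."""
    return local_perturbation_percent(peak_amplitudes)


# Made-up cycle data: periods alternating between 10 ms and 11 ms.
print(jitter_percent([0.010, 0.011, 0.010, 0.011]))
print(shimmer_percent([1.0, 0.9, 1.0, 0.9]))
```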
For the recordings made from the greatest distance, a significant increase in the Jitter value was observed for both women and men. The size of this effect depends on the quality of the recording equipment and is negligible only for the best sets. In the best-case scenario, recordings from the shortest distance, the measured Jitter values are slightly underestimated for all sets but one, which uses a studio microphone. The recordings from the second series (a distance of about 10 cm) turned out to be optimal. The remaining parameters from the Jitter and Shimmer groups behave similarly to Jitter itself. The NHR parameter behaves slightly differently (Fig. 9): its values are clearly lowered when recording from too close a distance. For the simplest set (a computer microphone connected to the built-in card), the NHR values are very seriously underestimated, practically regardless of the distance between the microphone and the speakers. All these phenomena correspond well to the characteristics of the equipment.
The lowering of the acoustic analysis parameters (especially NHR) at too close a distance, which is most visible for dynamic microphones, is explained by the change in frequency response that manufacturers declare for sound sources in close proximity. When recording the voice with these microphones, keep a greater distance between the mouth and the microphone. Computer microphones often have a limited response in the high-frequency range, which results in a constant underestimation of the NHR value regardless of recording conditions. Only a studio condenser microphone ensures maximum consistency of the results regardless of errors made during the recording process.
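Why a limited high-frequency response lowers NHR can be shown with a small spectral model. The sketch assumes the commonly cited definition of NHR as inharmonic energy in the 1500-4500 Hz band divided by harmonic energy in the 70-4500 Hz band; the study does not state which definition its software uses. The harmonic amplitudes (1/k), the flat noise density, the first-order low-pass standing in for the microphone, and the 2000 Hz cutoff are all illustrative assumptions, not measured characteristics.

```python
def lowpass_power_gain(freq_hz, cutoff_hz):
    """Power gain of a first-order low-pass, standing in for a microphone roll-off."""
    return 1.0 / (1.0 + (freq_hz / cutoff_hz) ** 2)


def model_nhr(f0_hz, noise_density, cutoff_hz=None):
    """Noise energy in 1500-4500 Hz over harmonic energy in 70-4500 Hz.

    Harmonic k has amplitude 1/k (power 1/k**2); the noise has a flat power
    density (power per Hz). Both are assumptions made for illustration.
    """
    def gain(freq_hz):
        return 1.0 if cutoff_hz is None else lowpass_power_gain(freq_hz, cutoff_hz)

    harmonic = 0.0
    k = 1
    while k * f0_hz <= 4500.0:
        if k * f0_hz >= 70.0:
            harmonic += gain(k * f0_hz) / k ** 2
        k += 1
    # Integrate the noise density over 1500-4500 Hz in 1 Hz steps (midpoints).
    noise = sum(noise_density * gain(f + 0.5) for f in range(1500, 4500))
    return noise / harmonic


flat = model_nhr(120.0, 1e-5)                       # ideal, flat microphone
limited = model_nhr(120.0, 1e-5, cutoff_hz=2000.0)  # hypothetical roll-off
print("flat response:", flat, "limited response:", limited)
```

Because the harmonic energy is dominated by the low harmonics while the noise band lies entirely above 1500 Hz, the roll-off removes much more of the numerator than of the denominator, so the modeled NHR falls regardless of the signal level.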
Fig. 8. Horizontal section through Figure 7 for the Jitter parameter
Fig. 9. Horizontal section through Figure 7 for the NHR parameter











