Medical video recording

Part IV: High-speed camera

Stroboscopy and high-speed video comparison

Authors: Ph.D. Anna Racino, Ph.D. Marcin Just, Ph.D. Michał Tyc (DiagNova Technologies)
Date: 2019.07.01

In the assessment of diseases affecting the work of the vocal folds, it is necessary to analyze a slow-motion sequence showing their movement during phonation. For this purpose, either videostroboscopy or imaging with a high-speed camera is used.

The basic technique for visualizing the work of the vocal folds is the presentation of kymographic sections. They allow assessment both at the level of a single basic period of vocal fold vibration and at the level of whole groups of basic periods.

Kymography is not yet a full data analysis. For full-frame slow-motion recordings of the work of the vocal folds, it is a new form of presentation that significantly improves data readability. For line camera recordings, it is the only way to present the data.
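As a minimal sketch of how a kymographic section can be built from a full-frame recording: the same image row, crossing the glottis, is taken from every frame and the rows are stacked in time order. The frame layout (a list of 2-D grayscale frames), the function names and the synthetic data are assumptions made for illustration only; they do not describe the format used by any particular camera or software.

```python
def kymogram(frames, row):
    """Stack the same image row from every frame, one line per time step.

    frames: list of 2-D grayscale frames (each frame is a list of pixel rows)
    row:    index of the image row that crosses the glottis (assumed known)
    """
    return [list(frame[row]) for frame in frames]


def make_frame(half_gap, height=5, width=7):
    """Synthetic frame: bright tissue (255) with a dark glottal gap (0) in row 2."""
    frame = [[255] * width for _ in range(height)]
    centre = width // 2
    for x in range(centre - half_gap, centre + half_gap):
        frame[2][x] = 0
    return frame


# One synthetic vibration cycle: the gap opens and closes again.
half_gaps = [0, 1, 2, 3, 2, 1, 0, 0]
frames = [make_frame(h) for h in half_gaps]

kymo = kymogram(frames, row=2)
# Width of the dark gap in each kymogram line, in pixels.
gap_widths = [line.count(0) for line in kymo]
print(gap_widths)
```

Each line of `kymo` corresponds to one frame, so the opening and closing of the gap can be read along the time axis of the section.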

    When a single period is analyzed, the following are assessed:
  • asymmetry of the work of the vocal folds (right and left fold), including:
    • amplitude,
    • phase difference;
  • difference in the opening and closing phases;
  • incomplete closure of the glottis, with specification of the part of the vocal fold length over which it occurs and of its degree;
  • closing and opening coefficients (with the caveat that values determined from a single period may be subject to a very large error);
  • dynamic parameters (speed of opening and closing);
  • the occurrence of a mucosal wave.
    When groups of periods are analyzed, the following are additionally assessed:
  • uniformity of the work of the vocal folds, also by determining the jitter and shimmer parameters in the same way as in the case of acoustic analysis;
  • frequency of work of the vocal folds;
  • averaged closing and opening rates and their spread;
  • beginning and end of phonation (the way of starting and ending vibrations).
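The single-period parameters listed above can be illustrated with a short sketch. It assumes that the distance of each fold edge from the glottal midline has already been measured along one kymogram period; the function name and the exact definitions (amplitude asymmetry as a normalized difference, phase difference as the offset between the maxima, open quotient as the fraction of the period with a non-zero gap) are assumptions chosen for illustration, not definitions given in this article.

```python
def single_period_parameters(left, right):
    """Parameters of one vibration period.

    left, right: distance of each fold edge from the glottal midline
    (pixels, >= 0), sampled at equal time steps over exactly one period.
    """
    n = len(left)
    width = [l + r for l, r in zip(left, right)]
    amp_left, amp_right = max(left), max(right)
    open_fraction = sum(1 for w in width if w > 0) / n
    # Phase difference: offset between the moments of maximum excursion,
    # expressed as a fraction of the period.
    phase_diff = (right.index(amp_right) - left.index(amp_left)) / n
    return {
        "amplitude_left": amp_left,
        "amplitude_right": amp_right,
        "amplitude_asymmetry": (amp_left - amp_right) / (amp_left + amp_right),
        "phase_difference": phase_diff,
        "open_quotient": open_fraction,
        "closed_quotient": 1.0 - open_fraction,
    }


# Synthetic period of 10 samples: the right fold moves less and peaks later.
left = [0, 1, 2, 3, 2, 1, 0, 0, 0, 0]
right = [0, 0, 1, 1, 2, 1, 0, 0, 0, 0]
params = single_period_parameters(left, right)
print(params)
```

With only ten samples per period the values are coarse, which is consistent with the caveat that parameters determined from a single period may carry a large error.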

In the high-speed video technique, the work of the vocal folds is recorded by capturing video frames at a high rate, typically about 2000 frames per second:

„Video rates vary from 125 up 10000 f/s. For a clinical exam, typically a 2000 – f/s rate is used. I prefer 2000 f/s rate, although others feels that this rate may not give the ultimate quality for the best possible diagnosis”

K. Izdebski in article KayPENTAX Color High-Speed Video 9710 System [8]

„The introduction of high-speed imaging of the larynx into clinical practice has expanded our ability to image vocal fold vibration to include situations that cannot be successfully evaluated using videostroboscopy. High-speed laryngeal imaging uses a high-speed camera to capture real-time images at a minimal rate of 2000 frames per second. This frequency of image capture is fast enough to obtain multiple images from a single cycle of vibration”

Katherine A. Kendall in article Clinical Applications for High-Speed Laryngeal Imaging [9]

To achieve the slow-motion effect, it is enough to play a movie recorded at high speed at a lower speed, e.g. the standard 25 frames per second. Within 1/10 second of recording it is possible to capture up to 20 cycles of vocal fold vibration. Thanks to this, vibrations can be registered even during a very short phonation, which ensures maximum reliability of the image of vocal fold activity. It also enables the examination of momentary phenomena and of the beginning and end of phonation, and allows irregularities in the work of the vocal folds to be verified correctly.
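The arithmetic behind these figures can be written out as a small sketch. The recording rate of 2000 frames per second and the playback rate of 25 frames per second come from the text; the fundamental frequency of 200 Hz is an assumption, chosen because it gives the 20 cycles per 1/10 second mentioned above.

```python
def slow_motion_factor(record_fps, playback_fps=25.0):
    """How many times slower the playback is than the real movement."""
    return record_fps / playback_fps


def frames_per_cycle(record_fps, f0_hz):
    """Number of frames captured within one vibration cycle."""
    return record_fps / f0_hz


def cycles_recorded(duration_s, f0_hz):
    """Number of vibration cycles contained in a recording."""
    return duration_s * f0_hz


record_fps = 2000.0
f0_hz = 200.0  # assumed fundamental frequency of the voice

factor = slow_motion_factor(record_fps)       # slowdown at 25 f/s playback
per_cycle = frames_per_cycle(record_fps, f0_hz)
cycles = cycles_recorded(0.1, f0_hz)          # cycles in 1/10 s of recording
playback_s = 0.1 * factor                     # playback time of that 1/10 s

print(factor, per_cycle, cycles, playback_s)
```

Under these assumptions every cycle is covered by about 10 real frames, and 1/10 second of phonation plays back for about 8 seconds.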

In the strobe technique, the slow-motion impression is obtained by appropriately timed exposure of individual frames, recorded at 25 frames per second. One cycle of vocal fold vibration observed in stroboscopy is composed of frames from many different real cycles that are distant from each other. This is shown in Fig. 10. The synchronization of the flashes of light or the camera shutter with the movement of the vocal folds, which stroboscopy requires, is usually obtained by analyzing the voice of the examined person recorded in parallel.

Fig. 10. At the top, a kymogram showing many consecutive movements of the vocal folds. Below, a kymogram showing one movement of the vocal folds, approximated from appropriately selected lines of the upper kymogram. In this way the impression of "slow motion" of the vocal folds is obtained in stroboscopy - through appropriate selection of the moments of exposure of individual frames in accordance with the frequency of work of the vocal folds. This is an idealized case which assumes that consecutive frames are taken from consecutive cycles of the vocal folds; in a normal case, successive frames come from cycles that are distant from each other
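The sampling principle of stroboscopy can be sketched as follows. This is an idealized model under stated assumptions: perfectly regular vibration at 200 Hz, one flash per video frame at 25 frames per second, and a phase advance of 1/25 of a cycle per frame (so that one apparent cycle lasts one second). The function names and these values are illustrative and do not describe a specific stroboscope.

```python
import math


def strobe_flashes(f0_hz, n_frames, video_fps=25.0, phase_step=1.0 / 25.0):
    """Return (real cycle index, phase within that cycle) for each video frame."""
    flashes = []
    for k in range(n_frames):
        # First real cycle that starts at or after the start of video frame k
        # (the small epsilon guards against floating-point round-up).
        cycle = math.ceil(k * f0_hz / video_fps - 1e-9)
        # Each frame is exposed slightly later in the cycle than the previous one.
        phase = (k * phase_step) % 1.0
        flashes.append((cycle, phase))
    return flashes


def seconds_for_apparent_cycles(n_cycles, video_fps=25.0, phase_step=1.0 / 25.0):
    """Phonation time needed to show n apparent (slow-motion) cycles."""
    frames_per_apparent_cycle = 1.0 / phase_step
    return n_cycles * frames_per_apparent_cycle / video_fps


flashes = strobe_flashes(f0_hz=200.0, n_frames=5)
cycle_gaps = [b[0] - a[0] for a, b in zip(flashes, flashes[1:])]
needed_s = seconds_for_apparent_cycles(10)
print(flashes, cycle_gaps, needed_s)
```

In this model consecutive video frames come from real cycles that are 8 cycles apart, and 10 apparent cycles require about 10 seconds of phonation.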

Typically 10 seconds of stable phonation are needed to register 10 vocal fold cycles, which is usually not achievable with pathological voices. Correct control of the moments of frame exposure is possible only if the vibrations of the vocal folds are sufficiently regular. Otherwise, the phase of the cycle cannot be selected precisely enough to obtain a correct record of possible irregularities in the work of the vocal folds. Any disturbance in the cyclical work of the vocal folds will therefore appear in stroboscopy only as fraying of the kymographic sections, and not as real changes in the length of individual periods.
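Why irregular periods show up as fraying rather than as true period changes can be illustrated with a deliberately simple model. It assumes that the flash delay is computed from an assumed (expected) period, while the cycle actually being exposed has a different length; the function name and the numbers are hypothetical.

```python
def displayed_phase(intended_phase, assumed_period, actual_period):
    """Phase actually captured when the flash delay is derived from an
    assumed period but the current cycle has a different length."""
    delay = intended_phase * assumed_period  # time after the cycle start
    return (delay / actual_period) % 1.0


assumed = 5.0  # ms, period expected by the synchronization (200 Hz)

# Regular vibration: every flash hits the intended phase.
regular = [displayed_phase(0.5, assumed, t) for t in [5.0, 5.0, 5.0, 5.0]]

# Periods alternating between 4.5 ms and 5.5 ms: the captured phase jumps
# back and forth around the intended value.
irregular = [displayed_phase(0.5, assumed, t) for t in [4.5, 5.5, 4.5, 5.5]]

print(regular)
print(irregular)
```

Consecutive lines of the strobokymogram are then taken alternately too late and too early in the cycle, which is seen as a ragged edge; the real period lengths themselves cannot be recovered from such an image.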

For this reason, slow-motion recordings generated by stroboscopy allow only a single period of vibration to be analyzed, and only to a limited extent. Because of the way they are generated, it is practically impossible to evaluate groups of periods. High-speed camera recordings can be assessed both in terms of a single period and in terms of groups of periods, as in most cases they cover many more periods of vocal fold movement (Fig. 11). To highlight these significant differences between high-speed camera kymograms and those obtained from strobe recordings, the latter are referred to as strobokymograms.

Fig. 11. Horizontal kymogram for high-speed recording

The unreliability of the slow-motion image generated by the stroboscopic method in the case of pathologies and cycle disturbances was the main reason why high-speed cameras, despite their imperfections and high cost, were used more willingly in research centers:

“Since its introduction, videostroboscopy has had tremendous clinical success and is now considered the ‘gold standard’ in laryngeal imaging,” said Dr. Deliyski. “However, due to basic stroboscopic principles and the nature and behavior of human vocal folds, the technology has its limitations, especially for the visual evaluation of human vocal folds. Stroboscopy simply cannot capture the true cycle-to-cycle vibratory behavior of the vocal folds, and as a result, the intra-cycle vibration seen in stroboscopy displays an illusory ‘slow motion.’ Furthermore, stroboscopy has no benefit to persons whose voice disorder causes irregular vocal fold vibration, as the stroboscopic images produced can’t be used to accurately diagnose disorders. This would affect approximately half of patients with voice disorders.”

PHANTOM, Diagnosing Voice Disorders [2]

Figure 12 shows an example of a pathology that makes it impossible to assess the work of the vocal folds by stroboscopic examination, whereas assessment with virtually any high-speed camera is possible.

Fig. 12. An example of vocal fold pathology (the so-called bamboo folds) illustrating the superiority of even a historical model of high-speed camera in the assessment of the function of the vocal folds. Panels: stroboscopy (complete inability to evaluate phonation activity); high-speed camera (approximate evaluation of phonation activity possible)

World-class specialists agree that the stroboscopic technique is not suitable for the assessment of irregular vibrations, and irregular vibrations are exactly what diseases of the voice organ produce:

„Aperiodic vibrations cannot be traced correctly with LVS (laryngovideostroboscopy)”

K. Izdebski in article Advantages of high-speed digital phonoscopy [11]

„Despite being the most widely used method in routine clinical practice, videostroboscopy has some limitations. For the strobe light and fundamental frequency to be synchronized, vocal fold vibration must be relatively periodic. In addition, as it represents a sub sampling of several vibrational cycles, it is not possible to access the variations between and within cycles. Furthermore, videostroboscopy is not capable of recording the onset and offset of phonation.”

Domingos Hiroshi Tsuji in publication Improvement of Vocal Pathologies Diagnosis Using High-Speed Videolaryngoscopy [12]

“It is important to realize that in the case of an aperiodic signal, the near-periodic assumptions do not hold. When the acoustic or EGG signal is aperiodic, the timing of strobe flashes does not correspond with the phases of the glottis cycle in the desired sequence. Even subtle variations in periodicity can produce completely distorted or unrealistic videostroboscopic sequences. Depending on the type of aperiodicity, the distortions may produce random-appearing vibrations, may change the balance between the timing of the opening and closing phases of the glottal cycle, may produce a reverse-appearing motion during a portion of the cycle or through the entire cycle, or may “lock” out of the closed phase, making it appear that the glottis never closes completely.”

Dimitar Deliyski, Laryngeal High-Speed Videoendoscopy [10]

„The stroboscopic flashes need to be synchronized with the VF vibration, which is technically impossible when the vibrations are irregular. Irregular vibrations of the VF, as well as some specific vibratory patterns, such as those related to diplophonia, multiphonia, and vocal fry, cannot be adequately studied stroboscopically.”

H.K. Schutte & F.F.M. de Mul in publication Videokymography – The next step: Investigations between 2003-2008 at the Voice Research Laboratory in Groningen, The Netherlands [13]

Because videostroboscopy is based on the assumption of a regular cycle of vocal fold vibration, the results obtained with this technique are distorted when the vibrations are irregular. Fig. 13 shows a strobokymogram and a high-speed camera kymogram for the same patient: the high-speed camera kymogram shows the irregular work of the vocal folds, while the stroboscopic technique "loses" the irregularities between the cycles.

Unfortunately, the stroboscopic technique fails most often in cases where accurate imaging of the work of the vocal folds is particularly desirable - in the case of significant pathologies.

However, even in the case of minor pathologies, the work of the vocal folds must often be evaluated from one, sometimes two visible basic periods, and the evaluation is limited to a single-period analysis. The uniformity of vibrations cannot be evaluated, because this requires 10 or more periods of vocal fold movement to be visible. In stroboscopy, that means at least 10 s of phonation.

    Two additional problems then arise:
  • hardly any examined person is able to phonate for so long during an uncomfortable examination - this results in phonation separated by respiratory phases, which "destroy" the synchronization between the strobe and the vocal folds,
  • during such a long examination it is difficult for the examiner to keep the endoscope tip stable - apparent movements of the vocal folds appear on the screen, "spoiling" the appearance of kymograms and making analysis difficult.

Figure 14 shows strobe video frames that illustrate the problem of holding the endoscope tip steadily in one position over the vocal folds, and Figure 15 shows a typical kymogram distortion in a long strobe video caused by respiratory phases and endoscope tip movements. In this case practically no analysis is possible.

Fig. 15. Strobokymogram for the recording whose frames are shown in Fig. 14

If the examined person cooperates exceptionally well and the examiner has experience and a "steady hand", longer recordings can be obtained even with stroboscopy, covering as many as 7-10 periods without a respiratory phase and with moderate movements of the endoscope tip. A rare example is shown in Figure 16a.

Fig. 16a. Strobokymogram with eight cycles of vocal folds work. Visible movement of the endoscope tip

In such a situation, it is sometimes possible to additionally stabilize the image in the software. Such functions are provided by the DiagnoScope Specialist software by DiagNova Technologies. Thanks to it, it is possible to "improve" the kymographic cross-section to the form shown in Fig. 16b. Unfortunately, it is practically impossible to remove the problems related to irregularities in the work of the vocal folds, which appear in the strobokymograms as fraying of the edges of the vocal folds (Fig. 16b). It is impossible to determine whether the differences in the shape of the periods in the strobokymogram are due to actual differences or to problems with timing due to non-uniformity of the vocal fold cycles.

Fig. 16b. The strobokymogram from Fig. 16a after image stabilization. Fraying of the edges caused by uneven work of the vocal folds is marked
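The general idea of software stabilization can be sketched on kymogram lines: the horizontal shift of each line relative to a reference line is estimated and then undone. This is a generic illustration that uses a mean absolute difference search over integer shifts; it is not the algorithm used by the DiagnoScope Specialist software, which is not described in this article, and all names below are hypothetical.

```python
def estimate_shift(reference, line, max_shift=5):
    """Integer shift s for which line[i + s] best matches reference[i]."""
    best_shift, best_cost = 0, float("inf")
    n = len(reference)
    for s in range(-max_shift, max_shift + 1):
        pairs = [(reference[i], line[i + s]) for i in range(n) if 0 <= i + s < n]
        # Mean absolute difference over the overlapping part of the two lines.
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


def stabilize(lines, max_shift=5, fill=0):
    """Align every line to the first one by undoing its estimated shift."""
    reference = lines[0]
    aligned = []
    for line in lines:
        s = estimate_shift(reference, line, max_shift)
        n = len(line)
        aligned.append([line[i + s] if 0 <= i + s < n else fill for i in range(n)])
    return aligned


# Synthetic example: the second line is the first one moved 2 pixels right,
# as if the endoscope tip had moved between the two exposures.
base = [0, 0, 0, 0, 9, 1, 9, 0, 0, 0, 0, 0]
moved = [0, 0] + base[:-2]

shift = estimate_shift(base, moved)
stabilized = stabilize([base, moved])
print(shift, stabilized[1] == base)
```

Such a correction compensates only for apparent movement of the whole image. It cannot repair lines that were exposed at the wrong phase of the cycle, which is why fraying caused by uneven vibration remains after stabilization.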

It should be emphasized that in stroboscopy the subsequent lines of the kymogram are created from image frames separated in time by 1/25 s, which causes significant differences between them. Image stabilization and any other operations that increase the legibility of the image are therefore significantly more complicated from a technical point of view and require much more commitment, knowledge and time from the user. Obtaining images such as the high-speed camera kymograms shown in Figure 17 is practically impossible with stroboscopy.

Fig. 17. Examples of kymograms from the ALI Cam HS1 high-speed camera for various disorders of the vocal folds (from the top - period asymmetry, different vibration frequencies of both vocal folds caused by, among others, significant incomplete glottal closure, slight incomplete glottal closure, asymmetry and phase difference of vibrations of both vocal folds, unevenness of work - formation of conglomerates of basic periods)

Instead, kymographic sections usually covering one or two cycles are obtained. Work in clinical conditions or in hospital admissions makes image stabilization impossible in practice (because it is time-consuming and because the quality of the final effect depends on perfect sharpness of the image of the vocal folds). As a result, the images obtained usually look like those in Fig. 18a.

Fig. 18a. Examples of strobokymograms without additional time-consuming image stabilization. The number of periods presented corresponds to the maximum possible number of periods from a given recording

Video 2. Recordings without stabilization, using stroboscopy and high-speed camera, corresponding to the above kymograms

When analysis of kymographic sections is required even from a strobe slow-motion sequence, for example by procedures in occupational medicine, image stabilization can be performed, as mentioned. Depending on the image quality, correct stabilization may take up to half an hour, and it does not always bring satisfactory results, because it does not solve the problem of uneven vibrations and the resulting lack of synchronization of the stroboscopic effect with the vibration of the vocal folds.

Fig. 18b. Examples of kymographic sections for four test persons obtained from a strobe and a high-speed camera (panel labels: stroboscopy, stroboscopy, high-speed camera, high-speed camera). The video recording from the strobe was additionally stabilized. The number of periods presented in the strobokymograms corresponds to the maximum possible number of periods from a given recording; the number of periods in the kymographic sections from the high-speed camera was selected to obtain good visibility.

  • Performing an appropriate videostroboscopic recording requires some skill from the user. Even an experienced user cannot avoid the limitations of this technology.
  • A very important advantage of the high-speed video technique is the speed and ease of performing additional image analyses - reliable kymographic analyses of the image are obtained instantly and for each recording.
  • In the case of stroboscopy, generating strobokymograms is possible only in selected cases and is time-consuming. Performing the examination so that the kymogram analysis can be performed is difficult and often impossible.
  • Without generating kymograms, it is usually not possible to reliably document functional disorders.

An additional advantage of examination with a high-speed camera is the possibility of fully parameterizing the movement of the vocal folds. In stroboscopy, parameterization is limited to cases in which kymographic sections could be generated, and to parameters determined from individual periods. With high-speed video, in most cases it is possible to parameterize efficiently both a single period of vocal fold work (asymmetry, opening factor, amplitude, etc.) and, thanks to the much larger number of recorded cycles, multiple periods. The cyclic operation of the vocal folds can then be assessed by determining, as in acoustic analysis, the fundamental frequency followed by jitter and shimmer; this can be done for both vocal folds jointly or for each fold separately.
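A minimal sketch of multi-period parameterization is given below. It assumes that the length of every cycle (in frames) and one amplitude value per cycle (for example the maximum glottal gap width) have already been read from the kymogram. The article does not state which jitter and shimmer variants are used; the "local" definitions common in acoustic analysis (mean absolute difference between consecutive cycles divided by the mean value) are assumed here, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def local_perturbation_percent(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def multi_period_parameters(period_frames, amplitudes, fps=2000.0):
    """Multi-period parameters from per-cycle lengths and amplitudes."""
    periods_s = [n / fps for n in period_frames]
    return {
        "periods": len(periods_s),
        "f0_avg_hz": 1.0 / mean(periods_s),
        "jitter_percent": local_perturbation_percent(periods_s),
        "shimmer_percent": local_perturbation_percent(amplitudes),
    }


# Synthetic data: five cycles recorded at 2000 f/s.
period_frames = [10, 10, 11, 9, 10]   # cycle lengths in frames
amplitudes = [20, 22, 18, 20, 20]     # assumed per-cycle amplitude, in pixels

result = multi_period_parameters(period_frames, amplitudes)
print(result)
```

The same function can be applied to measurements taken from the left and right fold separately, or to the glottal gap as a whole.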

Figures 19a-19e show examples of kymographic sections from one-second phonations, together with graphs of the glottal gap and of the fundamental frequency of vocal fold vibration. Data obtained in this way are easier to analyze than audio data and allow more precise detection of large irregularities in the length of the basic periods, because they are free of the signal disturbances (influence of formants) characteristic of audio data. This made it possible to determine jitter and shimmer, which are given with each figure.


Periods = 326
F0Avg = 425.4 Hz
Jitter = 0.51%
Shimmer = 9.09%

Fig. 19a. An example of a kymogram, a plot of the fundamental frequency and multi-period parameterization based on phonation lasting about 1 s. No disturbance


Periods = 175
F0Avg = 290.9 Hz
Jitter = 4.70%
Shimmer = 6.89%

Fig. 19b. An example of a kymogram, a plot of the fundamental frequency and multi-period parameterization based on phonation lasting about 1 s. Male. No disturbance


Periods = 18
F0Avg = 176.6 Hz
Jitter = 5.10%
Shimmer = 55.11%

Fig. 19c. An example of a kymogram, a plot of the fundamental frequency and multi-period parameterization. End of phonation, visible periodicity disturbances


Periods = 48
F0Avg = 132.3 Hz
Jitter = 3.87%
Shimmer = 9.11%

Fig. 19d. An example of a kymogram, a plot of the fundamental frequency and multi-period parameterization. The same subject as in Fig. 19c. End of phonation, with similar periodicity disturbances visible


Periods = 114
F0Avg = 220.2 Hz
Jitter = 4.21%
Shimmer = 4.08%

Fig. 19e. An example of a kymogram, a plot of the fundamental frequency and multi-period parameterization based on phonation lasting about 1 s. Female. No disturbance

Determining the irregularity of the work of the vocal folds seems to be one of the most important future directions in image analysis of the work of the vocal folds. Such phenomena occur much more frequently than has been estimated so far. Because observing them is practically impossible in stroboscopy (and evaluating them is not possible at all), they have generally gone unnoticed. Meanwhile, they explain very well the disturbances recorded in acoustic analysis. A completely new concept of the "super-cycle" can be introduced, combining several cycles of the vocal folds into repetitive conglomerates, as shown in Figure 20.

Fig. 20. Conglomerates of basic periods creating characteristic "super-cycles". This phenomenon can only be observed when recording with a high-speed camera
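One possible way to look for such conglomerates in a sequence of measured period lengths is sketched below. The criterion (every period matches the period k cycles later within a relative tolerance), the tolerance value and the function name are assumptions made for illustration; the article does not define a detection method.

```python
def super_cycle_length(periods, tolerance=0.05, max_length=8):
    """Smallest k for which every period matches the one k cycles later.

    Returns 1 for regular vibration, k > 1 for a repeating group of k
    cycles, and None if no repetition up to max_length is found.
    """
    for k in range(1, max_length + 1):
        pairs = list(zip(periods, periods[k:]))
        if len(pairs) < k:  # fewer than two full groups: not enough data
            break
        if all(abs(a - b) <= tolerance * max(a, b) for a, b in pairs):
            return k
    return None


# Period lengths in milliseconds (synthetic examples).
regular = [5.0] * 10
grouped = [4.0, 5.0, 6.0] * 4          # the same group of three cycles repeats
drifting = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

print(super_cycle_length(regular))
print(super_cycle_length(grouped))
print(super_cycle_length(drifting, max_length=3))
```

Because the check needs the true length of every individual period, it can only be applied to high-speed recordings, not to strobokymograms.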

Because it can register arbitrarily irregular vibrations and many coexisting vibrations, the high-speed camera is also the only tool that enables evaluation of the work of the pseudo-glottis. In many cases, the regularity of the work of the structures generating phonational vibrations is surprising (Fig. 21).

Fig. 21. The image of the pseudo-glottis and the kymographic cross-section of the vibrations showing their surprising regularity

Video 3. Work of the pseudo-glottis imaged above