Medical video recording

Part IV: High-speed camera

High-speed video technology development

Authors: Anna Racino, MSc; Marcin Just, Ph.D.; Michał Tyc, Ph.D. (DiagNova Technologies)
Date: 2019.07.01

The condition of the voice organ can be assessed on the basis of its image. Observing the vocal folds with a mirror (a method in use since 1854) or, preferably, with a laryngoscope makes it possible to detect more extensive organic changes within the vocal folds and fundamental changes in their function. A significant advance came in the early 1970s, when it became possible to record the image of the vocal folds with a camera and video system. This removed the time pressure and improved the chances of detecting minor organic changes and milder dysfunctions.

Unfortunately, no reliable methodology has yet been developed for correlating the size and type of organic changes with their influence on the work of the vocal folds. Moreover, because the vocal folds vibrate at about 100 Hz to 300 Hz during normal phonation, observation of their activity with a laryngoscope or videolaryngoscope is generally limited to the respiratory phase (Fig. 1).

Normal laryngeal function
Paresis of the right vocal fold

Observation of the respiratory phase gives only general information about possible phonation disorders. It reveals basic asymmetries in the structure of the larynx and the main functional problems (e.g. shortening of the folds, paresis and paralysis), but these findings do not translate directly into the phonatory function of the vocal folds. To obtain complete information on vocal fold disorders, it is necessary to observe their work in slow motion, so that individual vibration cycles are visible. The high-speed video technique seems made for such a task, but before 1900 it was not technically possible to record the image of the working vocal folds on any medium (the sensitivity of film was too low). Direct observation combined with the stroboscopic effect was therefore used: in 1895, once the flashes of light had been synchronized with the movement of the vocal folds, an apparent movement of the folds was observed. The results were not promising, and at the turn of 1937 and 1938 D.W. Farnsworth of Bell Laboratories made the first attempts to record images at a rate many times higher than the vibration frequency and to play them back at a much slower speed. Although the attempts were fully successful, the medium used at the time (film) practically ruled out medical use. It should be noted, however, that the high-speed camera was the first device that made it possible to record and reproduce the movement of the vocal folds in a way that allows phonation to be assessed.

Advances in electronics led to the first video recording of the vocal folds in 1964. However, electronic cameras of the time were too slow for the high-speed video technique, so stroboscopy was used to develop a technology for observing the movement of the vocal folds. In 1978 (Yoshida, Hirano), the work of the vocal folds was recorded on video for the first time using the stroboscopic method. Because the limitations of the stroboscopic technique were well understood, work on high-speed video never stopped. As soon as it became technically possible, after 1993, the first systems for electronic high-speed recording of the work of the vocal folds appeared. Initially, all of them came in a form that was practical only for scientific and research purposes (Fig. 2).

HSV model 9700; Kay Elemetrics Corporation, available since 1999.

From [1]: Ulrich Eysholdt, Frank Rosanowski, Ulrich H. Hoppe, "Vocal fold vibration irregularities caused by different types of laryngeal asymmetry", European Archives of Oto-Rhino-Laryngology, 2003.

The camera head itself weighs about 400 g, and the accessories weigh about as much again.

“The handling is quite different compared to a conventional endo-camera and has to be carefully learned”

Fig. 2. Two examples of high-speed cameras from the early 2000s in ENT application.

Most early models achieved speeds of up to 2000 fps. From 2003 onward, cameras appeared that reached up to 4000 fps, but with a severe degradation of image quality (Fig. 3).

Fig. 3. Kymogram made from a recording taken with the lower camera in Fig. 2 at about 3700 fps.
From [1]: Ulrich Eysholdt, Frank Rosanowski, Ulrich H. Hoppe, "Vocal fold vibration irregularities caused by different types of laryngeal asymmetry", European Archives of Oto-Rhino-Laryngology, 2003.
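
To put these frame rates in perspective, it helps to count how many frames fall within a single vibration cycle of the vocal folds. The short Python sketch below does this arithmetic for the phonation range quoted earlier (about 100 Hz to 300 Hz) and for the frame rates mentioned in this article. The function name is ours and purely illustrative.

```python
def frames_per_cycle(frame_rate_fps: float, vibration_hz: float) -> float:
    """Return how many recorded frames fall within one vibration cycle."""
    if frame_rate_fps <= 0 or vibration_hz <= 0:
        raise ValueError("frame rate and vibration frequency must be positive")
    return frame_rate_fps / vibration_hz


# Frame rates mentioned in the article, against the quoted phonation range.
for fps in (2000, 4000, 6000):
    for f0 in (100, 300):
        print(f"{fps} fps, {f0} Hz voice: "
              f"{frames_per_cycle(fps, f0):.1f} frames per cycle")
```

At 2000 fps a 300 Hz voice is sampled with fewer than seven frames per cycle, which suggests why higher frame rates were sought for high-pitched voices.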

A few years later, it became possible to build systems reaching even higher speeds (6000 fps and more), but at the cost of a much larger camera head, so that tripods became necessary (Fig. 4).

Fig. 4. High-speed camera set for recording the vocal folds, designed around 2005 for research at the University of South Carolina [2].

The great advantage of high-speed recording over stroboscopy meant that many high-speed camera imaging systems were created before 2006.

“Change in function could be obtained in more patients and for more parameters using HSV than VS”

Powell, M.E. et al. in the article "Efficacy of Videostroboscopy and High-Speed Videoendoscopy to Obtain Functional Outcomes From Perioperative Ratings in Patients With Vocal Fold Mass Lesions" [3]

However, every one of the sets developed came with problems. The most important of them are:
  • The weight of the camera head, often exceeding 1000 g, far above the weight of strobe camera heads (150 to 300 g on average);
  • Heating of the camera head, caused by the need to integrate practically the entire image recording system into it;
  • The technical complexity of the sets and their lack of compliance with medical standards, which prevented their introduction to the market and wider use (the sets mostly worked in single centers as research equipment);
  • Relatively poor image quality, and speed usually limited to 2000 fps;
  • Very high requirements for the illumination of the vocal folds. Basically, the only light source of adequate power at that time (and essentially until recently) was the xenon gas discharge lamp. Its light contains harmful UV radiation (usually between 350 nm and 400 nm), and its spectrum is concentrated in the range strongly absorbed by hemoglobin (~400-550 nm), which further reduces lighting efficiency and raises the power requirements. Hence, the high-power xenon discharge lamps used with high-speed cameras heat all optical components very strongly and may be harmful to the patient with prolonged exposure;
  • The high cost of high-speed camera systems, up to 10 times the cost of strobe systems;
  • Recording time limited to a few seconds (typically 4 seconds) by the limited memory in the camera head. After each recording, the examination had to be stopped and the recording sent to the computer, which took a long time. In addition, because of heating, the entire head had to be cooled down for several minutes after each recording (or faster, under a stream of water);
  • A complicated method of analyzing the examination results, often requiring several dozen minutes of viewing the recorded data.
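
The link between the memory in the camera head and the recording time limit can be illustrated with simple arithmetic. The Python sketch below estimates the size of an uncompressed recording and, conversely, how long a given memory lasts. The 256x256 pixel frame matches the resolution quoted later for the Endocam 5562, but the figure of 1 byte per pixel (8-bit monochrome, no compression) is our assumption, as are the function names; real cameras may store data differently.

```python
def recording_bytes(fps: int, seconds: int, width: int, height: int,
                    bytes_per_pixel: int = 1) -> int:
    """Size of an uncompressed recording (bytes_per_pixel is an assumption)."""
    return fps * seconds * width * height * bytes_per_pixel


def max_recording_seconds(memory_bytes: int, fps: int, width: int, height: int,
                          bytes_per_pixel: int = 1) -> float:
    """How long a recording fits in the given amount of on-board memory."""
    return memory_bytes / (fps * width * height * bytes_per_pixel)


# Assumed example: 4 s at 2000 fps, 256x256 px, 1 byte per pixel.
size = recording_bytes(fps=2000, seconds=4, width=256, height=256)
print(f"{size} bytes = {size / 2**20:.0f} MiB")

# The same memory lasts half as long when the frame rate is doubled.
print(max_recording_seconds(size, fps=4000, width=256, height=256), "s")
```

Under these assumptions a 4-second recording at 2000 fps already occupies 500 MiB, and doubling the frame rate halves the available recording time.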

As a result, the examination methodology was complicated, and the examination itself was long and difficult to perform. Nevertheless, the diagnostic advantages of the high-speed camera prompted attempts to overcome at least some of these difficulties. In 2006, the Wolf Endocam 5562 system appeared on the market (Fig. 5).

Fig. 5. The Endocam 5562 system, the first high-speed camera system fully adapted to medical applications [4]

The system was prepared for medical certification and offered a speed of 4000 fps at a resolution of 256x256 pixels, but because the lighting remained unchanged, in practice only speeds of up to 2000 fps were used. The weight of the camera remained at about 1000 g, which largely ruled out everyday clinical use and limited the camera to scientific research and the diagnosis of specific pathologies. Despite its high price (about $60,000), the camera spread among research centers and contributed significantly to the development of phoniatrics.

In 2011, an updated model (KayPentax 9710/9711) appeared on the market (Fig. 6).

Fig. 6. KayPentax 9710

The weight of the head was significantly reduced, to just over 200 g, comparable to the heads of strobe cameras (especially HD ones). The camera recorded in color at 4000 fps with a resolution of 512x256 pixels. Unfortunately, several problems remained: recording time limited to 4 s (at 2000 fps), significant heating of the head, very strong heating of the optical (lighting) elements, and a construction that made medical certification difficult. The improved parameters and lower weight raised production costs and pushed the price up to $200,000. Because of these disadvantages, the camera could not compete with strobe cameras.

The limitations of stroboscopic technology led research centers to try to build lower-cost sets on their own, using off-the-shelf high-speed camera heads for industrial applications. Numerous systems for scientific research were created (Fig. 7), but practically none went beyond the prototype stage and all were used only for scientific research. The reasons were tightening medical standards and a price that was still too high: the industrial camera head module alone cost about $20,000, so given the conditions on the medical market the complete set would have had to cost the end user well over $50,000. The camera heads used most often in these solutions, the Fastec HiSpec1 and HiSpec4, weighed 280 g without accessories; with optics, handles and cables, the entire camera weighed 400-600 g.

Fig. 7. Examples of high-speed camera systems built around a popular industrial camera (Fastec HiSpec1 and Fastec HiSpec4) [5] [6]

An interesting proposal was a system developed in 1996 that combined two cameras, a regular one and a fast line-scan one (generating a single image line), or used one camera able to work either in normal mode or in single-line mode. It was based on kymography, a technique for presenting medical data described in "Videokymography: high-speed line scanning of vocal fold vibration" (Svec JG, Schutte HK, J Voice 1996 [7]), which works especially well for analyzing the work of the vocal folds. Videokymography, a technique of generating kymographic cross-sections from the video image of a line-scan camera or from selected lines of a full-frame slow-motion sequence (Fig. 8), makes it possible to visualize most disorders of vocal fold function.

Figure labels: video frames; section through the vocal folds; kymographic section (kymography); axes: time domain, spatial domain (location on the vocal fold)

Fig. 8. The principle of generating kymographic sections
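
The principle of generating a kymographic section can be sketched in a few lines of Python: the same image line is taken from every frame of the sequence and the lines are stacked one after another, so that one axis of the result is time and the other is position across the vocal folds. The example below is only an illustration under simplifying assumptions. The frames are synthetic (a dark gap whose width oscillates at a fixed frequency stands in for the glottis), and the function names are ours, not part of any camera software.

```python
import math


def make_synthetic_frames(n_frames, height, width, fps, f0_hz):
    """Toy stand-in for a recording: a dark gap that opens and closes at f0_hz."""
    frames = []
    centre = width // 2
    for t in range(n_frames):
        phase = 2 * math.pi * f0_hz * t / fps
        # Half-width of the gap swings between 0 and width / 4 in each cycle.
        half_gap = round((width / 4) * 0.5 * (1 - math.cos(phase)))
        row = [0 if abs(x - centre) < half_gap else 255 for x in range(width)]
        frames.append([row[:] for _ in range(height)])
    return frames


def kymogram(frames, line_index):
    """Stack one image line from every frame: rows = time, columns = position."""
    return [list(frame[line_index]) for frame in frames]


# 20 frames at 2000 fps of a 200 Hz vibration, i.e. 10 frames per cycle.
frames = make_synthetic_frames(n_frames=20, height=4, width=16,
                               fps=2000, f0_hz=200)
kymo = kymogram(frames, line_index=2)

# Text rendering: '.' marks the dark gap, '#' the surrounding tissue.
for t, line in enumerate(kymo):
    print(f"{t:2d} " + "".join("#" if value else "." for value in line))
```

In the printed output, each row is one moment in time, so the repeated opening and closing of the gap appears as a regular pattern down the page. Irregular vibration would show up as a break in that pattern.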

According to the authors of many publications, the use of kymography is a must when analyzing long recordings of several dozen phonation movements of the vocal folds, because the direct manual analysis of the movement of the vocal folds on many frames of the recording is prone to errors:

“In addition, changes in frequency of vibration may not be as easily identified with high-speed imaging because the observed difference from cycle to cycle may be subtle and requires the examiner to review many cycles to assess. Using the kymography function, aperiodic vibrations are easily identified with high-speed imaging.”

Katherine A. Kendall in the article "Clinical Applications for High-Speed Laryngeal Imaging" [8]

With line cameras, it is easy to achieve operating speeds in excess of 4,000 fps, resulting in high-quality imaging of high-frequency voices. The use of line cameras also slightly reduces the requirements for illuminance, while the limitation is the generation of data from only one place on the vocal folds and the inability to perform vertical image stabilization of the vocal folds during data analysis. It is also not possible to correct the rotated image of the vocal folds, which may lead to some visualization and analysis errors (especially when opening the vocal folds).

In 2017, the cost ratio of a high-speed camera to a simple strobe set was 10:1: a high-speed camera cost at least 100,000 euros, a strobe set 10,000 euros. In Poland, because of the high cost, only one high-speed camera (an Endocam 5562 [4]) had been purchased by 2018.

Despite the inconvenience of use, high costs and complicated examinations, it should be emphasized that worldwide, standard high-speed cameras, in full-frame or line-scan versions, remain the basic equipment used in scientific research and in expanding medical knowledge. Most scientific publications are based on them, and they are the only unquestionably reliable source of data for evaluating the work of the vocal folds. Nevertheless, low cost and relative ease of examination made stroboscopy, despite its disadvantages, the gold standard in phoniatric examination. At the same time, many publications emphasized that once the technical and operational problems were overcome, high-speed cameras should eventually come to dominate the phoniatric imaging market completely.

Thanks to the continuous development of microelectronics, the dominance of stroboscopy is now changing. Previously, all advances in high-speed cameras were limited to the camera heads, and the light source gradually became the main limitation. For technical reasons (the lack of a replacement technology) it was not modified, even though the disadvantages of the solution in use, especially in connection with the existing methodology of working with a high-speed camera, were well known:

“HSV technology requires a lot of light due to the CMOS photon integration principles. Thus, increasing the amount of light can improve HSV image quality and frame rates. The type of light source used with most HSV systems today is 300 W constant xenon light. There is, however, a safety concern that further increasing the amount of light used with HSV can cause tissue damage. Additionally, it is considered possible that long exposures to a 300 W constant xenon light can cause tissue damage. No reports of such damage have been filed to date, but as a precaution it is recommended that the amount of time the vocal folds are exposed to light during an HSV exam be reduced to less than 20 seconds.”

Dimitar Deliyski, a pioneer of high-speed video technology, in the article "Laryngeal High-Speed Videoendoscopy" [9]

The breakthrough came only in 2018. The high-speed ALI Cam HS1 camera from DiagNova Technologies (Fig. 9) was then introduced to the market together with the pioneering ALI Lum laser illuminator.

Fig. 9. ALI Cam HS1 camera head

Numerous improvements were applied throughout the set:
  • An energy-efficient CMOS camera sensor was used, with a lower speed and therefore higher image quality. It delivers up to 3200 fps at an image resolution of 480x400 px, in color, with image parameters at least as good as in stroboscopy. It also significantly reduced the heating of the camera head;
  • The weight of the head was reduced to about 200 g, but its size was not reduced too much, which improved heat dissipation and further limited the temperature of the head during operation;
  • Direct (immediate) data transfer to the computer eliminated all restrictions on the length and number of recordings;
  • The illuminator uses high-power laser diodes, which made it possible to increase the illumination intensity while reducing the heating of the optics, thanks to lower light transmission losses. Additionally, the lighting is narrow-band, which improves the visibility of blood vessels and extends the diagnostic possibilities in oncology;
  • An electronic lens with automatic focusing (autofocus) was used. On the one hand, it yields higher-quality images; on the other, it made it possible to introduce a system for more accurate assessment of the size of objects in endoscopic images, by estimating the distance of the endoscope tip from the observed structures from the current lens settings;
  • Thanks to the laser illuminator, it was possible to introduce dynamic adjustment of the illumination intensity, with a continuous image preview in low-intensity mode and full intensity switched on only while the high-speed sequence is being recorded.
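
To give a sense of what immediate transfer to the computer involves, the raw data rate of the sensor can be estimated from the figures quoted above (3200 fps at 480x400 px). The Python sketch below is a rough, uncompressed estimate only. The number of bytes per pixel is our assumption (1 byte for raw 8-bit sensor data, 3 bytes for full RGB), since the article does not state the transfer format, and the function name is illustrative.

```python
def sensor_data_rate(width: int, height: int, fps: int,
                     bytes_per_pixel: int) -> int:
    """Uncompressed data rate in bytes per second (pixel format is assumed)."""
    return width * height * fps * bytes_per_pixel


# Figures from the article: 480x400 px at 3200 fps.
for label, bpp in (("raw 8-bit (assumed)", 1), ("RGB 24-bit (assumed)", 3)):
    rate = sensor_data_rate(480, 400, 3200, bpp)
    print(f"{label}: {rate / 1e6:.1f} MB/s")
```

Even under the more favorable assumption, the stream exceeds 600 MB/s, which helps explain why earlier designs buffered recordings in the camera head instead of streaming them.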

Thanks to new technologies and a design different from that of the high-speed cameras available before 2018, it became possible to significantly change the philosophy of high-speed camera examination. Simplified sequence recording modes were introduced that retain most of the advantages of a high-speed examination (especially its high reliability and the ability to visualize the movement of the vocal folds in all pathologies) while making the examination simpler and faster than even a stroboscopic one. A complete examination of the patient, with full data archiving, may take less than 15 seconds, and unlike a stroboscopic examination it allows all the commonly used analyses of vocal fold function to be carried out and kymographic sections to be generated. The higher light intensity also provided higher image quality and made it possible to use the maximum speed of 3200 fps without noticeable image degradation. The result was the first system intended for use in clinical conditions, and not only for scientific work.

Film 1. Examples of high-speed camera recordings visualizing the movements of the vocal folds with pathologies, impossible to obtain with stroboscopic recording. More recordings are available in our gallery.

Thanks to new technologies, it will be possible to arrive at the order of things postulated by scientists, in which stroboscopy serves as a screening tool and, once a pathology is detected, proper diagnostics and rehabilitation are carried out with a high-speed camera.

“Hoarseness, the clinical complaint usually investigated with laryngeal imaging, is the result of abnormalities in vocal fold vibratory function and is an indication for high-speed laryngeal imaging”

Katherine A. Kendall in the article "Clinical Applications for High-Speed Laryngeal Imaging" [8]