Medical video recording

Part I: Basic issues related to medical video recording

Endoscopic cameras (and others)

Authors: Marcin Just, Ph.D., Michał Tyc, Ph.D. (DiagNova Technologies)

The camera is the first and most basic element needed to transfer an image to the computer disk. The basic processing step, measuring the electric charge accumulated in the individual cells of the camera's detector (sensor), always takes place in the analog domain. From the viewpoint of converting light into an electrical signal, all cameras are therefore analog: the signal is represented by a voltage or current level within a certain range. For a computer to do its part of the job, the analog signal from the camera's sensor must be digitized. Two solutions are possible here:

  • the signal from the camera sensor is amplified, conditioned and sent in analog form to a separate grabber card, where it is converted into digital form and passed to the computer,
  • the sensor itself delivers a digital signal, or the analog signal from the sensor is digitized inside the camera, and the digital data are fed to the computer without a separate grabber.

Cameras with the second solution applied are commonly referred to as digital cameras.


CCD and CMOS sensors

The heart of the camera is a sensor that converts images into electrical signals. In most cameras it takes the form of a matrix of miniature photosensitive elements. Sensors are made in two different technologies: CCD and CMOS. These technologies differ in design and performance. As a simplification, it can be assumed that CMOS sensors take over many more of the "duties" related to image processing: they usually contain built-in converters that turn the analog electrical signal from the photosensitive cells into digital form, they provide a simple implementation of the shutter function, etc. Due to their greater complexity, CMOS sensors usually heat up much more during operation and show greater differences in parameters between individual photosensitive cells. This increases thermal noise, which manifests itself in a greater so-called dark current, i.e. the current generated by photosensitive cells in the absence of light. These phenomena have recently lost much of their importance and need to be taken into account practically only with the very long exposure times characteristic of photography in poor lighting conditions (especially astronomical photography). CCD technology is free of most of the disadvantages of CMOS technology, but at the expense of more complex circuitry accompanying the image sensor. The construction of CCD sensors makes them particularly suited to interlaced cameras, whereas CMOS sensors are usually of the progressive type.


Progressive and interlaced cameras

This division is particularly important, because progressive cameras operate very differently from interlaced ones. Interlaced technology is much older. Its characteristic feature is that the sensor reads out the image in two stages: first the even lines of the sensor matrix, then the odd ones (or vice versa). The separation of the two stages in time is particularly important here. The readout of the even and odd lines is separated by at least the time needed to collect and process the data from all lines, usually about 1/100–1/50 s. The result is not one full-resolution image but two half-resolution images (half-images, or fields), shifted vertically by one line height relative to each other and evenly distributed in time. This technique is perfect for viewing the image on the screen of a classic CRT TV (or monitor). The image is then displayed twice as often (not 25 but 50 times per second) with the appropriate vertical shift, giving the impression of full resolution for still images and of better fluidity for moving images. The situation is different when both half-images are presented on the screen together at full resolution, without their separation in time. Only for completely static images (e.g. an image of a preparation under a microscope) is it possible to obtain the full declared resolution from an interlaced camera. When such a camera records a moving image, so-called artifacts appear that degrade the image quality (see, for example, Fig. 1).

[Panels: original frame; frame with interlacing removed; enlarged frame]

Fig. 1. Interlacing in the video material (endoscopic recording of the vocal folds during phonation)

Additional deterioration of the interlaced image occurs after rescaling (Fig. 2).

Fig. 2. The interlacing effect becomes more pronounced when the image is scaled

There are image processing algorithms that remove the above-mentioned disturbances, but regardless of their complexity they deliver full vertical resolution only by lucky coincidence (either there is no vertical shift of the image between the fields, or the shift is exactly a multiple of the image line height). A comparison of several deinterlacing algorithms is shown in Fig. 3.

[Panels: original frame; intelligent blending; deinterlacing; advanced deinterlacing with content shift detection]

Fig. 3. Comparison of three deinterlacing algorithms on real video, illustrating that advanced methods offer practically minimal benefit over simple removal of every second line

Using advanced deinterlacing algorithms to obtain full image resolution proves unjustified in practice; they should only be used on old recordings to remove (or rather reduce) artifacts. Advanced algorithms usually do not improve the appearance even of almost static frames (Fig. 4).


[Panels: simple deletion of every other line; advanced deinterlacing with content shift detection]

Fig. 4. Comparison of the operation of the advanced algorithm and simple line removal for a seemingly static image

For an interlaced camera signal, it is better to double the frame rate and apply a slight vertical shift between adjacent fields, simulating the way the image is shown on a classic CRT monitor. The recorded movie then gives the impression of better fluidity and better resolution. At the cost of a practically negligible loss of vertical sharpness, the image is non-interlaced and contains twice as many frames, which makes it easier to select the frame with the most interesting content (Fig. 5).

Fig. 5. How the deinterlacing and framerate doubling mechanism works: top original video, bottom after deinterlacing with frame rate doubling.
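
The frame rate doubling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular program: the function name bob_deinterlace is hypothetical, the even lines are assumed to form the earlier field, and each field is brought back to full height by plain line repetition (real software may interpolate instead).

```python
def bob_deinterlace(frame):
    """Turn one interlaced frame into two full-height progressive frames.

    `frame` is a list of rows with at least two rows. Even rows are assumed
    to belong to the earlier field and odd rows to the later one (an
    assumption; real cameras may use the opposite field order).
    """
    height = len(frame)
    fields = [frame[0::2], frame[1::2]]  # top field, bottom field

    def line_double(field, offset):
        # Repeat every field row to restore the original height.
        # `offset` is 0 for the top field and 1 for the bottom field,
        # which reproduces the one-line vertical shift between fields.
        rows = []
        for y in range(height):
            src = (y - offset) // 2
            src = min(max(src, 0), len(field) - 1)
            rows.append(list(field[src]))
        return rows

    return [line_double(fields[0], 0), line_double(fields[1], 1)]


# A 4-line "frame" whose rows are labelled by their line number.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
first, second = bob_deinterlace(frame)
print(first)   # built only from lines 0 and 2
print(second)  # built only from lines 1 and 3
```

Each input frame yields two output frames, so a 25 frames/s interlaced recording becomes a 50 frames/s non-interlaced one, and no output frame mixes lines captured at different moments.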

You can get a non-interlaced image from any interlaced camera by selecting one of the lower resolutions (e.g. 352×288 instead of 720×576). In principle, this method should only be used when the computer is not powerful enough to cope with the higher resolution. The difference in image quality between deinterlacing with frame doubling and a simple resolution reduction is visible (Fig. 6); in addition, the reduced-resolution mode does not double the number of frames.

[Panels: enlarged fragment at 320×240; enlarged fragment at 720×576 after deinterlacing by splitting the frames]

Fig. 6. Comparison of images with different resolutions

Progressive technology is definitely better suited to modern computer monitors and TV sets. Because of subtleties in the construction of progressive and interlaced image sensors, progressive cameras need somewhat more light to perform well; in extreme cases the difference may be a factor of about 2.

If the lack of light is not an obstacle, progressive cameras are definitely a better solution. They usually have higher resolution as well.


Digital vs. analog cameras

As already mentioned, digital cameras differ from analog ones in the way the signal is sent to the computer (digitally or in analog form). Thanks to progressive miniaturization, growing computing power and falling power requirements of electronic components, integrating the analog-to-digital converter (the grabber) into the camera does not reduce its quality. Digital data transmission is much less sensitive to interference and is a future-proof solution. All this would suggest that digital cameras are definitely the better choice. However, there is no rose without thorns. With a classic grabber, installed in a desktop computer as an expansion card or in a notebook as an ExpressCard (or an older PCMCIA card), the transfer of digital data is very efficient and puts little load on the computer's processor. The entire processing power can then be used to process the images: to improve their quality, reduce their size, compress them, etc. Paradoxically, the oldest (and usually the cheapest) solution, an analog camera with a grabber card, makes the best use of the computer's power. A digital camera can be connected through a dedicated expansion card inserted into the computer, but usually it is connected through one of the computer's ports: USB, FireWire (IEEE 1394a or 1394b) or Gigabit Ethernet. While FireWire and Gigabit Ethernet handle the incoming data stream without overloading the processor, the most popular of them, USB (USB 2.0), unfortunately requires constant connection control by the processor and ties up a large part of its computing power, which can then no longer be used for image processing. In addition, among the standards mentioned, USB 2.0 provides the smallest practical bandwidth: around 30 MB (megabytes) per second (theoretically a little more than 50 MB per second).
The bandwidth problem of the USB link is described later in the document. At this point it is enough to mention that USB 2.0 can transfer about 25 frames/s at a resolution of 720×576 pixels (just enough for a standard TV image in the PAL system used in Poland), about 30 frames/s at 640×480, but only 7.5 frames/s at 1280×960. The basic version of FireWire has a bandwidth similar to USB (in practice about 40 MB/s), and the extended version (1394b) about twice as much; in addition, this standard includes some data compression that even allows the transfer of HD (high-resolution) video.
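
The frame rates quoted above can be checked with simple arithmetic. The sketch below assumes uncompressed video with three bytes per pixel (as explained in the section on the data stream) and a usable USB 2.0 bandwidth of 30 MB/s, taking 1 MB as 1,000,000 bytes; the function names are illustrative only.

```python
def frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def max_frame_rate(width, height, bandwidth_bytes_per_s=30_000_000,
                   bytes_per_pixel=3):
    """Highest frame rate that fits into the given link bandwidth."""
    return bandwidth_bytes_per_s / frame_bytes(width, height, bytes_per_pixel)


for w, h in [(720, 576), (640, 480), (1280, 960)]:
    print(f"{w}x{h}: {max_frame_rate(w, h):.1f} frames/s")
```

With these assumptions the result is roughly 24, 33 and 8 frames/s, which agrees with the approximate values given in the text (cameras then offer the nearest standard rates, such as 25, 30 and 7.5 frames/s).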


Sensor size (diagonal) and quantum efficiency

Colloquially (mainly in relation to digital cameras) it is said that "small" sensors are noisy. There is a lot of truth in this statement. It is not, however, that small sensors generate a lot of interference (noise); rather, they are only able to generate a small signal. With a small signal, all the noise associated with converting light into electrical signals becomes more significant. Noise arises directly in the conversion of light into the electric charge of the individual sensor cells, and also from the random generation of charges in the cells (this depends on temperature: the higher it is, the more such charges appear, which is why until recently CMOS sensors, which heat up more, were considered noisier).

There are many types of noise, but for a camera generating a real-time video image the most important is the noise related to the quantum nature of light and the conversion of light into electric charge. What is very important, this noise does not depend on the manufacturer of the sensor, the camera, etc. It is a physical phenomenon that cannot be avoided in the ordinary sensors used in classic cameras. In simple terms, this irreducible fundamental noise is equal to the square root of the electric charge (counted in elementary charges) generated in the sensor cell by the incident light.

At this point it is important to realize that an ordinary camera is a place where everyday experience meets quantum physics. Light is generally perceived as something "continuous". The fact that it can be interpreted as a stream of particles seems at first sight to be a pure abstraction that can only be dealt with in a large laboratory. Meanwhile, while the camera collects the image for a single frame, a perfectly countable number of light particles (photons) reaches each sensor cell. And we are not talking about billions here: these are thousands, hundreds of thousands at best. Only some of these photons will generate an elementary charge (electron) that adds to the total charge of the cell (this fraction is the quantum efficiency of the sensor, and usually ranges from 0.2 to 0.5), and the noise will be the square root of the number of generated charges. Assuming that an average camera generates about 10,000 charges in each sensor cell while collecting one image, the noise is about 100 charges (simplifying) and the signal-to-noise ratio (the generated charge divided by the noise) is also only 100, which is a level the human eye can already detect in the picture. As you can see, the signal-to-noise ratio grows in proportion to the square root of the generated charge.
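
The square-root relationship can be illustrated numerically. The sketch below is a simplified model that considers only the photon (shot) noise described above and ignores all other noise sources; the function name and the example photon counts are assumptions chosen so that the first case reproduces the 10,000 charges used in the text.

```python
import math


def shot_noise(photons, quantum_efficiency):
    """Return (charges, noise, signal-to-noise ratio) for one sensor cell."""
    charges = photons * quantum_efficiency   # electrons actually generated
    noise = math.sqrt(charges)               # shot noise, in electrons
    return charges, noise, charges / noise


charges, noise, snr = shot_noise(photons=25_000, quantum_efficiency=0.4)
print(f"charges={charges:.0f}, noise={noise:.0f}, SNR={snr:.0f}")

# Four times more light gives only twice the signal-to-noise ratio.
_, _, snr_4x = shot_noise(photons=100_000, quantum_efficiency=0.4)
print(f"SNR with 4x the light: {snr_4x:.0f}")
```

The second call shows why lighting matters so much: quadrupling the number of photons only doubles the signal-to-noise ratio.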

The recipe for low noise is simple: maximize the number of charges generated in the sensor cells. This calls for high quantum efficiency, but most of all for getting as many photons as possible to the sensor. So the key issue turns out to be lighting: the better it is, the more photons reach the camera sensor. The quality of the lenses, or of the endoscopic optics in medicine, also plays an important role, since they have to transmit a lot of light. And here we come to the heart of the matter: the cells of the sensor matrix have a strictly limited capacity. They will not accumulate more generated charges (and therefore will not accept more photons) than stated in the sensor manufacturer's specification. Exceeding this number causes characteristic discoloration, white spots, etc. As a rule, the capacity of a cell depends on its surface area: the larger the cell, the better the noise performance it can achieve. The cell area depends to some extent on the type of sensor (CMOS or CCD, progressive or interlaced), but mainly on the area of the entire sensor and the number of cells (i.e. the resolution). It may seem that a high sensor resolution directly implies higher noise. However, a high-resolution image can always be scaled down to a lower resolution by computer processing, which reduces the noise accordingly (by averaging the brightness of neighboring pixels). High resolution does slightly increase noise, but only through second-order phenomena, for example the thermal noise already mentioned, or the fraction of each cell's (pixel's) area that is actually used, which is slightly worse in higher-resolution sensors. What is critically important is the surface area of the sensor, or of several sensors: if the camera uses a separate sensor for each primary color, as in the popular 3CCD cameras, their areas add up.

In practice, cameras (including medical cameras) usually use sensors with diagonals from 1/6 inch to about 2/3 inch. For example, for the same number of pixels and the two extreme sizes in common use, 1/6 inch and 2/3 inch, the ratio of surface areas is 16 (!). The 2/3 inch sensor can therefore easily reach 16 times the cell capacity and up to 4 times better noise performance. This is how much you gain from a larger sensor, whereas the gain from choosing the best manufacturer over the worst is currently about 1.5 in popular applications. Continuing the comparison, a camera with three 1/4 inch sensors (3CCD) will in effect probably be inferior to a camera with one 1/2 inch sensor.
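
The comparison above follows from the fact that, for a fixed aspect ratio, area grows with the square of the diagonal. The sketch below treats the nominal sensor sizes as proportional to the real diagonals, which is a simplifying assumption, and uses a hypothetical helper named relative_area.

```python
def relative_area(diagonal_inch, sensors=1):
    """Total light-collecting area in arbitrary units (diagonal squared)."""
    return sensors * diagonal_inch ** 2


area_ratio = relative_area(2 / 3) / relative_area(1 / 6)
snr_gain = area_ratio ** 0.5   # noise performance scales with sqrt(charge)
print(f"area ratio: {area_ratio:.1f}, noise improvement: {snr_gain:.1f}x")

three_quarter_inch = relative_area(1 / 4, sensors=3)   # 3CCD camera
single_half_inch = relative_area(1 / 2)                # one larger sensor
print(f"3 x 1/4 inch: {three_quarter_inch:.4f}")
print(f"1 x 1/2 inch: {single_half_inch:.4f}")
```

Three 1/4 inch sensors give a total of 3/16 in these units, less than the 4/16 of a single 1/2 inch sensor, which is the basis of the last sentence above.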

It is worth mentioning that interlaced CCD cameras are the best in terms of noise and progressive CMOS cameras the worst. Nowadays, however, the differences are relatively small, so you should consider only the area of the sensor and the desired resolution of the resulting images, rather than the manufacturing technology and operating principle of the sensor. It is also very important that a sensor with a larger diagonal pays off only when it "gets" enough light. This requires either good lighting or a longer exposure time for each frame, which can blur the image of fast-moving objects.


Resolution and color space

The influence of resolution on image quality is explained in the chapter on sensor size. In this chapter, only camera resolution formats and their actual usefulness will be discussed.

The continued existence of the NTSC and PAL standards is an anachronism from the era of analog technology. Although these standards essentially describe how color information is transmitted, they are usually associated with certain resolutions. Cameras compatible with the PAL system used in Poland are definitely more common. Of the relevant information, this system defines primarily the number of lines making up the television picture. In most variants of the system it is 625 interlaced lines, i.e. two half-images of 312 and 313 lines. In television, some of these lines carry no visual information (only, for example, blanking or synchronization pulses). There are only 576 useful lines (two half-images of 288 lines each), so after digitizing and saving to a file the vertical resolution will be at most 576 lines. It is true that the standard assumes interlaced information, but nothing prevents two adjacent half-images from coming from the same moment in time, i.e. from being recorded progressively (this requires only a minimal extension of the camera circuitry). Of course, every second or every fourth line can be selected from the transmitted image to reduce the computing power and the bandwidth of the digital connection (USB) needed to capture the image correctly. The standard defines a frame rate of 25 frames per second. As already mentioned, this is actually 50 half-images per second, with two half-images making up one frame.

Ultimately, the following image resolutions are associated with the PAL standard:

  • 768×576 (extended PAL adapted to computer monitors) – rare,
  • 720×576 (full PAL),
  • 704×576,
  • 720×288 (every second line, i.e. a single half-image, no interlacing),
  • 384×288 (older screen monitors, rarely available),
  • 352×288 (half PAL),
  • 176×144.

Full PAL (720×576) with actual interlacing can also be interpreted as 50 frames/s at a resolution of 720×288. This produces a smoother, non-interlaced image with half the vertical resolution, and requires the playback device (or software) to correct the aspect ratio automatically. Some video playback programs can perform such a correction (e.g. Media Player Classic), as can the DiagNova video recording software (DiagnoScope).

In camera practice, the NTSC system is associated with non-interlaced video at 30 frames per second (the analog broadcast NTSC signal itself is interlaced) and with resolutions derived from the size 640×480:

  • 640×480,
  • 480×480,
  • 640×240,
  • 320×240,
  • 160×120.

From the camera user's point of view, having a PAL camera usually means that the available resolutions have 720 pixels per line, and that resolutions above 288 pixels vertically usually involve interlacing. PAL cameras usually also offer NTSC resolutions, but this adaptation generally does not extend to raising the frame rate to 30 frames per second or eliminating interlacing.

Data stream

In the case of digital cameras, resolution also has an aspect related to data transmission. One frame in the full PAL format (720×576) has 414,720 pixels. The easiest way to encode the color of each pixel is to specify the intensity of each primary color component (RGB: red, green and blue). Reasonable quality requires each component to be given with a resolution of 256 levels, i.e. 8 bits (one byte) of data per component. As a result, three bytes are needed to encode the color of one pixel and 1,244,160 bytes for the entire frame. At 25 frames per second this gives a data stream of around 30 MB (megabytes) per second, which is the maximum usable bandwidth of the popular USB 2.0. However, camera designers have to allow for an extra safety margin, so they use a trick. It exploits a certain limitation of human eyesight, thanks to which the effective number of bytes needed to encode one pixel can be reduced from three to two. The data stream requirements then drop, and video with a resolution of 720×576 can be transferred over inexpensive USB. At this point it is appropriate to introduce the concept of a color space.

Color space

For various reasons (to reduce the amount of data, to add extra information, to match the capabilities of the device), digital data are often sent in a format other than the natural one, in which the intensity of each primary color of each pixel is encoded separately. In addition, to improve quality, color accuracy has recently been increased by allocating more than the 8 bits (one byte) per component that until recently were considered perfectly adequate.

In addition to this basic format, many derived formats as well as significantly different formats have been created. For now we leave aside algorithmic reduction of data volume, i.e. compression.

Given the color coding and the number of bits needed to encode one pixel (3 × 8 = 24), the basic format is called RGB_24. The practically useful formats derived from it are:

  • RGB_16 (5, 5 and 5 or 5, 6 and 5 bits for the respective components); some artifacts and distortions are clearly visible in the image,
  • RGB_32 (an additional 8 bits for transparency information, or left unused, which speeds up operations because the size of the pixel data then matches the size of the microprocessor registers),
  • RGB_36 (12 bits for each color component).

The most commonly used formats that differ fundamentally from RGB are the YUV formats, in which color is encoded not by giving the values of the color components but by specifying the brightness (luminance, Y) for each pixel and the color (chrominance, U and V) jointly for a group of pixels. The main types are listed below:

  • YUY2 – color encoded per horizontal pair of pixels, resulting in 16 bits per pixel,
  • UYVY – like YUY2, but with a different order of the data (not compatible),
  • YV12 – color encoded per group of four pixels (two vertically by two horizontally), resulting in 12 bits per pixel,
  • YV9 – color encoded per group of sixteen pixels (four vertically by four horizontally), resulting in 9 bits per pixel.
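
The bits-per-pixel figures in the list follow directly from the size of the pixel group that shares one pair of color values. The sketch below assumes 8 bits per sample; the function name, the dictionary and the choice of 720×576 at 25 frames/s as an example are illustrative only.

```python
def bits_per_pixel(group_width, group_height, bits_per_sample=8):
    """Average bits per pixel when one U and one V sample serve a whole group."""
    pixels_per_group = group_width * group_height
    # One Y sample for every pixel, plus U and V shared by the group.
    return bits_per_sample + 2 * bits_per_sample / pixels_per_group


# Group sizes (width, height) as described in the list above.
FORMATS = {"YUY2": (2, 1), "UYVY": (2, 1), "YV12": (2, 2), "YV9": (4, 4)}

for name, (gw, gh) in FORMATS.items():
    bpp = bits_per_pixel(gw, gh)
    stream_mb_per_s = 720 * 576 * 25 * bpp / 8 / 1_000_000
    print(f"{name}: {bpp:.0f} bits/pixel, {stream_mb_per_s:.1f} MB/s at 720x576, 25 frames/s")
```

For comparison, RGB_24 at the same resolution and frame rate needs about 31 MB/s, so YUY2 reduces the stream by one third and YV12 by half.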

Frames recorded in one color space format can easily be converted to another, though with a minimal loss of color quality. For a computer the RGB format is the most convenient, and if the camera can send data in different formats, RGB_24 should always be the first choice.
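
The small loss caused by conversion can be demonstrated on single pixels. The sketch below uses the full-range BT.601 coefficients known from JPEG/JFIF; this particular matrix is an assumption made for illustration, since cameras and drivers may use other variants, and the example deliberately leaves out the sharing of color between neighboring pixels.

```python
def clamp8(x):
    """Round to the nearest integer and limit to the 0..255 byte range."""
    return max(0, min(255, int(round(x))))


def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(u), clamp8(v)


def yuv_to_rgb(y, u, v):
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def round_trip_error(rgb):
    """Largest per-channel difference after RGB -> YUV -> RGB."""
    restored = yuv_to_rgb(*rgb_to_yuv(*rgb))
    return max(abs(a - b) for a, b in zip(rgb, restored))


for color in [(128, 128, 128), (255, 0, 0), (0, 255, 0), (0, 0, 255)]:
    print(color, "->", yuv_to_rgb(*rgb_to_yuv(*color)),
          "max error:", round_trip_error(color))
```

The errors come only from rounding to whole byte values and stay within a level or two out of 256, which is consistent with the "minimal loss" mentioned above.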

However, you should not be afraid to use other color spaces. For ordinary photos, and especially for medical photos, the color spaces YUY2 and YV12 do not cause any visible deterioration of colors. Even the rarely used YV9 space usually produces an acceptable quality. We tested the use of different color spaces to code the same frame from a videostroboscope (vocal folds during phonation) and a simple synthetic image, selected to emphasize possible distortions. The results are shown in Figures 7 and 8.

[Panels: original frame (RGB_32); YV12; YUY2]

Fig. 7. Comparison of the appearance of a frame from a real video sequence encoded with different color spaces

[Panels: original frame (RGB_32); YV12; YUY2]

Fig. 8. Comparison of the appearance of a synthetic frame (with content selected so that it is sensitive to the color coding method) encoded with different color spaces

While a special frame with sharp edges separating completely different colors shows artifacts related to the use of a compressed color coding method (Fig. 8), the real video image with a fairly uniform color does not show the influence of different ways of recording color information.

It can be added that most compression procedures explicitly or implicitly convert video frames to a YUV format to save space (even the popular DVD format uses the YV12 color space, which is sometimes noticeable when watching animated films). The benefit of using the RGB_24 or RGB_32 color space in the camera will therefore be visible practically only when the video is deliberately compressed in the RGB color space; otherwise a frame selected from the video will in fact use the YUV color space with YUY2 or YV12 encoding.

Some formats may, unfortunately, additionally use a different value scale (e.g. color components described by a byte, with values 0–255, may vary only within the range 16–240). The algorithms that process data from the camera sensor (always in RGB format) may encode it according to different standards. Although this leads only to very slight distortions, it is advisable for the recording program to let you select the algorithm that gives the best image quality (DiagNova software has options for correcting these minimal inconsistencies, see Fig. 9).

Fig. 9. Advanced options for selecting the color coding and decoding standard in the YUY2 system (DiagnoScope program)
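
The scale difference mentioned before Fig. 9 amounts to a linear remapping of values. The sketch below is not the DiagnoScope algorithm; it only shows the principle. The limits are parameters, and the defaults of 16 and 235 are the brightness limits commonly used with the ITU-R BT.601 convention (the corresponding limits for the color components are 16 and 240).

```python
def expand_range(value, low=16, high=235):
    """Map a limited-range value (low..high) onto the full 0..255 scale."""
    scaled = (value - low) * 255 / (high - low)
    return max(0, min(255, round(scaled)))


def compress_range(value, low=16, high=235):
    """Map a full-range value (0..255) onto the limited low..high scale."""
    return round(low + value * (high - low) / 255)


print(expand_range(16), expand_range(128), expand_range(235))
print(compress_range(0), compress_range(255))
```

If a program decodes limited-range data as if it were full-range (or the other way round), blacks turn slightly gray or highlights are clipped, which is the kind of small inconsistency a selectable decoding standard is meant to correct.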

An example of these image color correction methods is presented in the chapter on practical video recording.

Non-standard resolution – HD cameras

For several years cameras have been appearing that record images at resolutions greater than the standard 720×576. Various resolutions are available, and the first issue to consider is the aspect ratio of the image. Video from classic cameras has an aspect ratio of 4:3 (NTSC, less often PAL) or 5:4 (PAL), adapted to classic TV sets and older computer monitors. On widescreen monitors such an image will either not fill the entire screen, or part of it will be lost when displayed in stretch mode. On the other hand, such an image is well suited to recording medical video data, where the examined objects can have any shape and are not necessarily elongated horizontally. The advantage of HD images with a near-square aspect ratio is therefore that they match the typical content of medical images. Practically any resolution can occur here, for example:

  • 800×600,
  • 1024×768,
  • 1280×960,
  • 1280×1024,
  • 1600×1200

and higher.

There are also cameras that offer the panoramic resolutions popular on the consumer market, i.e. with an aspect ratio of 16:9 or 16:10. They are less suited to presenting medical data, but fill the screen of modern monitors better. Examples include:

  • 1280×720,
  • 1280×800,
  • 1360×768,
  • 1440×900,
  • 1680×1050,
  • 1920×1080.

A higher-resolution image is undoubtedly more useful, but there are limits to that usefulness. Buying a camera with the maximum resolution will in most cases not bring a commensurate improvement in the image. Moreover, using the full resolution of such cameras can in many cases be counterproductive, and a reasonable compromise is to use a significantly reduced resolution. It has already been mentioned that more pixels mean more noise. Most of this extra noise can be removed by computer processing that reduces the resolution of the images. Most, but not all. Because of the limited bandwidth of computer buses mentioned earlier, higher-resolution images are transmitted at a significantly reduced number of frames per second. For uncompressed images, the reduction (relative to 30 frames/s for a 640×480 image) ranges from 4 times for a 1280×960 image (i.e. 7.5 frames/s) to 16 times for a 2560×1920 image (i.e. about 2 frames/s).

Compressing the transmitted data to increase the frame rate in turn degrades image quality, so the gains from the additional resolution are illusory. In addition, when a high-resolution image is saved on the computer's hard drive as a video file, the huge data stream either produces very large output files (difficult to manage and archive) or requires a high degree of compression, which further degrades the quality and makes the excess resolution useless. With HD images you should therefore consider recording not only full video sequences but also single frames. With single frames, data volume and the required computing power are no longer a problem. What does appear, however, is the problem of frame capture delay: when the operator watches the live image and selects frames, about 1/10 s usually passes between the operator's decision and the actual frame selection. As a result, the captured frame is not the desired one but one that follows it a few frames later. The solution is a mode that records a short series of frames (4 to 8), which is still very rare in recording software (the DiagnoScope program includes such a function).
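
One possible way to implement such a series mode is to keep the most recent frames in a small ring buffer, so that the frames shown just before the operator reacted are still available. This is a sketch of the general idea only, not a description of how DiagnoScope works; the class and method names are hypothetical, and the text does not specify whether the series is taken before or after the trigger.

```python
from collections import deque


class FrameSeriesRecorder:
    """Keeps the last `series_length` frames so a late trigger still catches them."""

    def __init__(self, series_length=8):
        # A deque with maxlen drops the oldest frame automatically.
        self._recent = deque(maxlen=series_length)

    def on_frame(self, frame):
        """Call for every frame delivered by the camera."""
        self._recent.append(frame)

    def capture(self):
        """Call when the operator presses the capture key; returns the series."""
        return list(self._recent)


# Frames are represented by their sequence numbers for illustration.
recorder = FrameSeriesRecorder(series_length=4)
for frame_number in range(10):
    recorder.on_frame(frame_number)
series = recorder.capture()
print(series)
```

With a delay of about 1/10 s and 25 frames/s, the desired frame lies two or three frames before the trigger, so a series of 4 to 8 frames leaves a comfortable margin for choosing the right one afterwards.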

HD cameras also require a greater amount of light. There is also a serious problem with recording fast-changing phenomena (e.g. the vocal folds during phonation). To avoid blurring the image and to use the full resolution of an HD camera, the exposure (shutter) time must be 2 to 4 times shorter than with conventional cameras. This further increases the demands on the lighting. A good HD camera requires perfect (above all very strong) lighting.

The last aspect of using HD cameras is data presentation. If the captured images are shown on the monitor in a small window, the potential of HD will never be exploited. This is the case, for example, when images are compared side by side; a resolution of 640×480 is then sufficient. Paradoxically, the problem also applies to printouts. Assume that the resolution of a normal eye in good lighting is about 1 arc minute (the eye separates two points if their centers are at least one arc minute apart, which corresponds to about 0.3 mm at a distance of 1 m), and that the images are presented in the popular two-column format (about 7.5 cm wide) and viewed from a distance of 0.5 m. The human eye will then see no grain in frames 500 pixels wide! For viewing from a distance of 35 cm, a width of about 700 pixels is sufficient, which is what classic PAL cameras deliver. The capabilities of an HD camera remain untapped. Higher-resolution images prove useful only when the content is skillfully cropped and the essential elements of the image are enlarged.
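
The pixel counts given above can be reproduced with a short calculation. The sketch assumes, as in the text, an eye resolution of 1 arc minute and a printed image 75 mm wide; the function names are illustrative.

```python
import math


def smallest_detail_mm(viewing_distance_mm, acuity_arcmin=1.0):
    """Size of the smallest detail the eye can separate at a given distance."""
    angle_rad = math.radians(acuity_arcmin / 60)
    return viewing_distance_mm * math.tan(angle_rad)


def resolvable_pixels(image_width_mm, viewing_distance_mm, acuity_arcmin=1.0):
    """Number of pixels across the image beyond which the eye sees no more detail."""
    return image_width_mm / smallest_detail_mm(viewing_distance_mm, acuity_arcmin)


print(f"detail at 1 m: {smallest_detail_mm(1000):.2f} mm")
print(f"75 mm image at 0.5 m: {resolvable_pixels(75, 500):.0f} pixels")
print(f"75 mm image at 35 cm: {resolvable_pixels(75, 350):.0f} pixels")
```

The results, about 0.29 mm, a little over 500 pixels and a little over 700 pixels, match the rounded values quoted in the paragraph.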

However, HD cameras can be useful even without using their full capabilities, because due to their greater technological advancement, they are usually progressive, which can be a significant advantage.

In summary, HD cameras are intended for users who:

  • record static images (microscope slides, etc.), or
  • capture moving images and have an excellent high-power light source, or
  • capture fast-changing images and have a good source of strobe light,
  • crop the presented data,
  • require a progressive camera,
  • have a fast computer and a lot of space for videos, or advanced software, ideally able to record series of frames,
  • do not require highly fluid video (are satisfied with a low number of frames per second).

In other situations, a classic camera will be a better choice. The choice between an HD camera and a classic one should be distinguished here from the choice between a digital and analog camera. These are completely different matters.

In our opinion, for most people recording medical video images the optimal solution is a progressive camera with a resolution of 800×600 or 640×480 pixels and a 1/3 inch sensor. For users with the most demanding requirements, it is a camera with a resolution of 1280×960 and a 1/2 inch sensor. Larger sensors may make it difficult to build endoscopic optics that can properly exploit their capabilities, and higher resolutions will be practically impossible to use in everyday work, except for recording completely static images (microscopic preparations, fundus photos, etc.).

Single-sensor and three-sensor cameras (3CCD)

The difference between these types of cameras has already been outlined in the chapter on sensor size. It should only be added that using three monochrome sensors (one for each color component) instead of one sensor with a color mosaic slightly improves the quality of color reproduction, because better color-separating filters can be used. Currently, however, the difference is minimal for new cameras.