An Introduction to Video capturing with the APAS system


Basic Video capturing and the APAS

An Introduction to video capturing

In order to understand how the computer can capture pictures from the video, it is necessary to consider some history and explanations of previous techniques and technologies. The use of the video camera/VCR with the computer was considered, at the time, a major breakthrough. A faster, more streamlined approach that enables the user to send the pictures directly from the video camera/camcorder to the computer, however, is now available. A brief explanation of the earlier systems will enhance understanding of this new technique.

Consider what happens when a video camera is used to take a picture. A video camera captures only one image at a time, but it does so in a rapid series for as long as it is recording. This sequence of individual images may record a collection of positions of an athlete as he or she performs a certain athletic movement. Information can be extracted from these pictures if one of them was taken at the moment when a position crucial to the movement or event occurred, such as the position of a discus thrower just before he releases the discus.

In order to capture the needed position on video, it is necessary to start recording with the camcorder before the desired movement begins and to stop after the action is complete. For this example, the camera would be started before the discus thrower began his spin and stopped after the discus was released. Playing the discus thrower's videotape in a VCR provides an opportunity to view the sequence as many times as desired. Using the various controls on the VCR (play, stop, pause and so on), it is easy to view and analyze interesting positions in the movement. The greatest strength of the combined camcorder/VCR/computer approach is the extensive amount of dynamic information that can be extracted by the sophisticated software. Dynamic information includes not just the position but also the variation or progression from image to image.

The standard camcorder is capable of recording 60 (NTSC) or 50 (PAL) images every second onto the videotape inserted in the camcorder. Every image in the video is made of multiple lines and a large number of points in every line. The actual image can be compared to a huge table where each cell in the table represents a color. There are two different numbers of images per second because different countries have adopted different standards. The two standards are PAL (mostly used in European countries) and NTSC (the North American standard). The major difference between the two standards is, of course, the number of images that are stored every second. In addition, these standards also regulate how the color is represented in the actual picture/image and the number of lines and points in every line in the image (i.e. the table principle). In summary, some of the differences:

	PAL	NTSC
Number of images/sec	50	60
Number of lines	384	240
Number of points in line	576	640
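
The "table" picture of an image can be made concrete with a short sketch. The Python below is an illustration only and is not part of the APAS software. The names STANDARDS and make_blank_image are hypothetical, the figures are copied from the table above, and storing each cell as a (red, green, blue) triple is an assumption made for the example.

```python
# Illustrative only: figures copied from the table above.
STANDARDS = {
    "PAL":  {"images_per_sec": 50, "lines": 384, "points_per_line": 576},
    "NTSC": {"images_per_sec": 60, "lines": 240, "points_per_line": 640},
}

def make_blank_image(standard):
    """Return an image as a table: one row per line, one color cell per point."""
    spec = STANDARDS[standard]
    black = (0, 0, 0)  # assumption: a color is a (red, green, blue) triple
    return [[black] * spec["points_per_line"] for _ in range(spec["lines"])]

image = make_blank_image("PAL")
image[0][0] = (255, 0, 0)  # set the top-left point to pure red
print(len(image), "lines x", len(image[0]), "points per line")
```

Each row of the nested list is one line of the picture, and each entry in a row is one point.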

The camcorder/VCR has been used extensively during the last 10 years to extract visual information about movement. The complex movements associated with athletic performances have become easier to dissect with the development and application of these technological improvements. Evaluations that compare two different trials by the same person are quite useful. However, this requires that both the athlete and the coach be highly skilled at viewing movement and determining the appropriate changes. More often, it is impossible to identify movement patterns, or the deceleration and acceleration of various body parts throughout the activity, using only the human eye. Merely examining the videotape does not provide as much information as processing the movement patterns with computerized software. A better use of the videotape is to process the recorded video sequence on the computer with "easy to use" software packages that quantify the movement using mathematical techniques.

In order to understand more precisely how the computer is able to utilize the videotape for subsequent mathematical applications, a brief explanation is needed. When video is recorded onto the videotape, it is stored as magnetic fields. The video is normally viewed using a VCR connected to a television. The connection between the VCR and the television is a small cable in which electric signals represent the actual images. In other words, the image stored on the videotape is converted from magnetic fields to electric signals that are sent to the television and displayed.

These same electric signals (which represent the images from the videotape) can be transformed into images in the computer. Converting electric signals to something the computer can understand requires specialized hardware called an A/D (Analog to Digital) converter. An A/D converter converts the analog electric signal representing the video image into a large quantity of digital numbers which the computer recognizes. The computer cannot do anything with the analog electric signal unless it has been converted into a meaningful form, that is, a digital signal. This is analogous to two people speaking different languages, such as a Frenchman trying to talk with someone who speaks only Japanese. If neither individual understands the language of the other, they need someone or something to translate from one language to the other. This is the function of an A/D converter for computers, which need to have electrical signals translated into digital form.
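
The idea of A/D conversion can be sketched in a few lines of Python. This is a simplified model and not a description of any particular converter: the function name analog_to_digital, the 0 to 1 volt input range, and the 8-bit output are all assumptions chosen for illustration.

```python
def analog_to_digital(voltage, v_min=0.0, v_max=1.0, bits=8):
    """Map an analog voltage onto one of 2**bits whole-number levels."""
    levels = 2 ** bits
    clamped = min(max(voltage, v_min), v_max)       # keep the input inside the range
    fraction = (clamped - v_min) / (v_max - v_min)  # 0.0 at v_min, 1.0 at v_max
    return min(int(fraction * levels), levels - 1)  # top of range maps to the last level

# A few made-up voltage samples along one line of the picture.
samples = [0.0, 0.25, 0.5, 1.0]
print([analog_to_digital(v) for v in samples])
```

Every voltage, however finely it varies, ends up as one of 256 whole numbers that the computer can store.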

After the analog electric signal has been converted into digital numbers, the computer can store those numbers in memory or on the hard drive. The hardware that performs the A/D conversion of video images and passes this digital information to the memory/hard drive is called a Video Capture System. In many books and other literature, this conversion and storing of images to a computer's memory/hard drive is referred to as Capturing or Video Capturing.

As explained earlier, every image in the video sequence consists of several lines and a number of points in every line. This was called the "table principle". When capturing and storing images in the computer, the size of an image, that is, the amount of space required to store it on the computer, is a function of several parameters. These image parameters include the number of lines in the image, the number of points in every line, and the number of bytes used to represent the color of each point. A sample calculation of the size follows, using the PAL mode as the example:

size of image = (number of lines) x (number of points in line) x (number of bytes for every color)

PAL Image size = 384 x 576 x 3 = 663552 bytes

Normally the number of bytes for the color is 3. This gives 3 x 8 bits = 24-bit color, which is very close to the maximum the human eye can distinguish. Taking into account the number of images captured every second, the amount of data that the computer has to move to the hard drive each second is 50 images/sec x 663552 bytes = 33177600 bytes, which is more than 30 MB per second. People with a little knowledge of computers are aware that no computer currently available can transport this amount of data to the hard drive. The storage capacity of the hard drive is sufficient, but the flow rate from the Video Capture System to the hard drive is simply too high for the system. Therefore, an alternative method is employed which reduces the amount of data that has to go to the computer's hard drive.
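
The size and data-rate arithmetic above can be checked with a small Python sketch. The function names are hypothetical; the line, point, and image-rate figures are the ones given in this document, and the NTSC result is simply the same formula applied to the NTSC column of the earlier table.

```python
def image_size_bytes(lines, points_per_line, bytes_per_color=3):
    """Size of one uncompressed image in bytes."""
    return lines * points_per_line * bytes_per_color

def data_rate_bytes_per_sec(lines, points_per_line, images_per_sec, bytes_per_color=3):
    """Bytes that must reach the hard drive every second without compression."""
    return image_size_bytes(lines, points_per_line, bytes_per_color) * images_per_sec

pal_size = image_size_bytes(384, 576)
pal_rate = data_rate_bytes_per_sec(384, 576, 50)
ntsc_rate = data_rate_bytes_per_sec(240, 640, 60)

print("PAL image size:", pal_size, "bytes")       # 663552
print("PAL data rate:", pal_rate, "bytes/sec")    # 33177600
print("NTSC data rate:", ntsc_rate, "bytes/sec")  # 27648000
```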

Since the beginning of the computer, programmers have tried to reduce the amount of data that has to be stored, because no computer has unlimited storage. There are several ways to reduce the amount of data. One way would be to decrease the number of lines or the number of points in every line. Unfortunately, either of these choices would decrease the quality of the image significantly. Reducing the number of colors would be another option. Using only a gray scale (8 bits = 1 byte) would decrease the amount of data by a factor of three. But about 11 MB/sec is still very high, and gray scale is not always as nice as color images. Fortunately, the solution for this specific Video Capturing System is a special compression algorithm, which allows large amounts of data to be stored in a relatively small amount of memory with very little loss of information in the reduction process.
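
The simple reduction options mentioned above can be compared side by side. The sketch below is illustrative only: the function name and the choice of "half the lines" and "half the points" as examples are assumptions, while the PAL figures come from this document.

```python
def pal_rate(lines=384, points_per_line=576, bytes_per_color=3, images_per_sec=50):
    """Uncompressed data rate in bytes per second for the given image layout."""
    return lines * points_per_line * bytes_per_color * images_per_sec

options = {
    "full color, full size": pal_rate(),
    "gray scale (1 byte per point)": pal_rate(bytes_per_color=1),
    "half the lines": pal_rate(lines=192),
    "half the points per line": pal_rate(points_per_line=288),
}

for name, rate in options.items():
    print(f"{name}: {rate / 1_000_000:.1f} million bytes/sec")
```

None of these options brings the rate down far enough on its own, which is why compression is needed.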

One compression algorithm is called JPG and has become a standard used extensively on the Internet for compressing images. The Internet is another part of the computer world which requires a reduction in the amount of data moved from one end of the world to the other. If compression were not used on today's networks, it would be very difficult to download anything from the Internet, and downloading or even just viewing web pages would take forever. JPG applied to video sequences is called MotionJPG or just MJPG. A large and varied number of compression algorithms are currently available, most of them developed with specific goals in mind. However, JPG has been determined to be the best standard for video-based motion analysis systems.

The 33 MB/second delivered by the Video Capturing System can be reduced to as little as 1.1 MB/second. Of course, with this amount of compression there is some loss of information, but if the compression is set to yield about 4-6 MB/second, the quality of the image will remain very high. Even 4-6 MB/sec produces a large amount of data for storage on the hard drive, and most older computers (like the old 486 or older Pentiums) are unable to accommodate such files. The requirements for a computer to adequately store these larger data files are as follows:
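
To put the compressed rates in perspective, the following sketch works out how strongly the raw stream must be compressed to reach each target. The function name is hypothetical, and treating 1 MB as one million bytes is an assumption.

```python
RAW_RATE = 50 * 663552  # bytes per second, the uncompressed PAL figure from the text
MB = 1_000_000          # assumption: 1 MB taken as one million bytes

def compression_ratio(raw_rate, compressed_rate):
    """How many times smaller the compressed stream is than the raw stream."""
    return raw_rate / compressed_rate

for target_mb in (1.1, 4, 6):
    ratio = compression_ratio(RAW_RATE, target_mb * MB)
    print(f"{target_mb} MB/sec needs roughly {ratio:.1f}:1 compression")
```

The high-quality range of 4-6 MB/sec corresponds to fairly mild compression, while 1.1 MB/sec requires the data to shrink to about one thirtieth of its raw size.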

Pentium 200 MHz
More than 2 GB hard drive
The hard drive must be using Ultra DMA (which is the case for any new computer) or Ultra SCSI when communicating with the processor.
More than 16 MB of memory
Windows 95/98 or Windows NT
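
As a rough guide to why a large hard drive is asked for, the sketch below estimates how much video fits on a drive at the compressed rates quoted above. It assumes the whole drive is free for video and that 1 MB and 1 GB mean one million and one billion bytes; the function name is hypothetical.

```python
MB = 1_000_000      # assumption: decimal megabytes
GB = 1_000_000_000  # assumption: decimal gigabytes

def recording_seconds(disk_bytes, rate_bytes_per_sec):
    """Seconds of video that fit on the disk at a constant data rate."""
    return disk_bytes / rate_bytes_per_sec

for rate_mb in (4, 6):
    seconds = recording_seconds(2 * GB, rate_mb * MB)
    print(f"At {rate_mb} MB/sec a 2 GB drive holds about {seconds:.0f} seconds of video")
```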

In order to make the Video Capturing System a success, an appropriate compression algorithm to reduce the amount of data is essential. The algorithm used in JPG is a complicated mathematical process which discards certain information. Letting the computer's processor perform this function is not practical, since even a Pentium 450 MHz is not fast enough. This is because computer processors were not built specifically to execute these highly mathematical processes. Although they can do such work, they were not designed for these tasks. Therefore, most computers use specialized video processors mounted on the Video Capture System in order to compress the video without losing images. These processors are specialized and dedicated to this purpose only.
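
The kind of mathematics involved can be hinted at with a toy example. JPG compression is built around a discrete cosine transform followed by coarse rounding of the resulting numbers. The sketch below applies that idea to a single row of eight brightness values. It is a one-dimensional simplification and not the real JPG algorithm, which works on two-dimensional blocks and adds further coding steps; the sample values and the rounding step of 10 are made up for illustration.

```python
import math

def dct(samples):
    """Orthonormal discrete cosine transform (DCT-II) of a short list."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, s in enumerate(samples))
        out.append(scale * total)
    return out

def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

row = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up brightness values
STEP = 10                               # made-up rounding step (larger = more loss)

quantized = [round(c / STEP) for c in dct(row)]   # the lossy step
restored = idct([q * STEP for q in quantized])

nonzero = sum(1 for q in quantized if q != 0)
max_error = max(abs(a - b) for a, b in zip(row, restored))
print("numbers to store:", nonzero, "of", len(row))
print("largest error after restoring:", round(max_error, 2))
```

After rounding, several of the transformed numbers become zero and need not be stored, at the cost of a small error in the restored values. Doing this for every block of every image, 50 or 60 times a second, is the workload that the dedicated video processor takes over.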

After the video has been stored on the computer's hard drive, displaying it on the computer's monitor requires decompressing the video in order to view the actual images again. The same speed problem occurs and, therefore, instead of using the computer's processor to decompress the video, the computer uses the video processor on the Video Capturing System before the data is displayed on the computer monitor.

The COmpression and DECompression unit is called a CODEC. A codec, as explained above, can be either hardware or software. A codec is installed in Windows 95/98/NT in the same manner as hardware. If you would like to see which codecs are installed on your system, please follow these few steps:

1: Click the Start button in the task bar
2: Under the menu called Settings, choose the Control Panel
3: In the Control Panel Window, find the item called Multimedia
4: Double click the Multimedia Icon
5: In the Multimedia Properties, click the Tab called Devices
6: Under Devices it is possible to view all components installed for Multimedia purposes.
7: Click the Plus sign next to the title Video Compression Codecs
8: You will now be presented with a list of all the codecs installed on your computer

When you open a video clip on your computer, the system checks the Multimedia Properties to see whether a codec is installed that can decompress the video you selected. If one is installed, you will see the video directly on your computer. If the computer cannot find a codec that can decompress the video, you will get an error message explaining that no codec could be found for this video. In that case, you will have to install a suitable codec before the video clip can be successfully decompressed for your use.
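
The lookup described above can be pictured as a simple table search. The Python below is a conceptual sketch only and does not use or imitate any Windows interface; installed_codecs, open_clip, CodecNotFound, and the format tags are all hypothetical names invented for the example.

```python
class CodecNotFound(Exception):
    """Raised when no installed codec can decompress the clip."""

# Hypothetical registry: format tag -> function that decompresses the data.
installed_codecs = {
    "MJPG": lambda data: f"decoded {len(data)} bytes of MJPG video",
}

def open_clip(format_tag, data):
    """Find a codec for the clip's format and use it, or report that none exists."""
    decoder = installed_codecs.get(format_tag)
    if decoder is None:
        raise CodecNotFound(f"no codec installed for format {format_tag!r}")
    return decoder(data)

print(open_clip("MJPG", b"\x00" * 1024))
try:
    open_clip("XYZ1", b"\x00" * 1024)
except CodecNotFound as problem:
    print("Error:", problem)
```

Installing a codec corresponds to adding an entry to the table, after which the same clip opens without an error.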

Moving the captured video from one computer to another requires that both computers have the correct codec installed, in either hardware or software form. The video can be viewed on the computer where it was captured or on any other computer which has the appropriate codec installed. The only software MJPG codec currently available is free on the Internet and is called Paradigem MJPG Codec.

Following this generalized explanation of video capturing, it should be easier to relate to the operation of the APAS system. Most of the information was provided as background on how video signals can be captured into computers. The connection between this information and the APAS system is essential because the APAS system is a computer-based analysis system that relies on video.

In general, the analysis of human movement can be divided into a few steps:

1. Record the event using video cameras (camcorders)
2. Capture the video from the Camcorder to the computer
3. Digitize the event utilizing the video
4. Transform the video-digitized coordinates to a 3D coordinate system
5. Filter the transformed coordinates
6. Analyze the movement

Step 2 is the actual capturing of the video from the camcorder to the computer. For this purpose, the APAS system provides two modules: Capture and RealCap.

The Capture module employs a special technique which uses the connected VCR to capture the actual video. This process is based on stepping the video forward for each image and capturing that image individually. This is just like creating animated cartoons where the drawing is moved a fraction of a millimeter and every time the drawing is moved a picture is taken.
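
The step-and-grab idea can be sketched as a simple loop. The code below is a simulation and not the Capture module itself: SimulatedVCR and its methods are hypothetical stand-ins for a VCR that can be paused, read, and stepped forward one image at a time.

```python
class SimulatedVCR:
    """Hypothetical stand-in for a VCR paused on a tape of numbered images."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._position = 0

    def grab_paused_frame(self):
        """Return the image currently shown while the tape is paused."""
        return self._frames[self._position]

    def step_forward(self):
        """Advance one image; return False when the end of the tape is reached."""
        self._position += 1
        return self._position < len(self._frames)

def capture_by_stepping(vcr, count):
    """Capture up to `count` images, one per single step of the tape."""
    captured = []
    for _ in range(count):
        captured.append(vcr.grab_paused_frame())
        if not vcr.step_forward():
            break
    return captured

tape = SimulatedVCR(["image 0", "image 1", "image 2"])
print(capture_by_stepping(tape, 2))
```

Because every image requires its own pause, grab, and step, capturing even a short event this way takes far longer than the event itself.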

The RealCap module utilizes the MJPG video compression described previously. This technique enables the module to stream the video directly onto the hard drive while the video is in play mode. This approach has several advantages: (1) the time for capturing a 3 sec event is much shorter than with the Capture module, (2) the quality of the images is much better, and (3) there is no need for external VCR equipment, which reduces the price of the overall system.

The images are better with RealCap because the Capture module uses the pause mode during image capturing. While pausing, the VCR tries to adjust the video heads to acquire a stable image. This adjustment can introduce a small jitter in the image. During play mode, this jitter is not present.

Although the RealCap module appears to have more pluses than the Capture module, there is a small minus. When more than one view of the recorded event is used (this is required when analyzing 3D movement), the first image of every view must show the same instant of the movement. If this is not the case, a huge error will be introduced into the analysis.

Capturing video while the VCR/camcorder is in play mode requires a certain level of finesse by the user. This difficulty is related to positioning the video with such accuracy that it can be reproduced from one view to another, that is, from one video sequence to another. This necessitates some kind of trimming of each individual view. The RealCap module provides an opportunity to adjust the individual view when saving the video to the hard drive. This adjustment enables the individual views to have the same starting position and also the same length. The best technique is to locate a clearly defined position or event within the movement and select this as the synchronizing image. An example of a clearly defined position could be the heel strike during walking or the release of a discus in a discus throw. However, the most important point to remember is that the synchronizing image must be clearly defined and must be the same in all the views used for the analysis.
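
The effect of trimming can be illustrated with a short sketch. This is not the RealCap trimming procedure, only one possible way to express the idea: given the position of the synchronizing image in each view, every view is cut so that the synchronizing image lands at the same position and all views have the same length. The function name, the view names, and the frame numbers are made up.

```python
def trim_views(views, sync_indices):
    """Cut each view so the synchronizing image sits at the same index in all of them.

    views:        view name -> list of captured images
    sync_indices: view name -> index of the synchronizing image in that view
    """
    # Images that every view has available before and from the sync image onward.
    before = min(sync_indices[name] for name in views)
    after = min(len(frames) - sync_indices[name] for name, frames in views.items())

    trimmed = {}
    for name, frames in views.items():
        sync = sync_indices[name]
        trimmed[name] = frames[sync - before: sync + after]
    return trimmed

# Two made-up views of the same throw, started at slightly different moments.
views = {
    "front": list(range(100, 130)),  # 30 images, release seen at index 12
    "side": list(range(200, 225)),   # 25 images, release seen at index 8
}
sync_indices = {"front": 12, "side": 8}

trimmed = trim_views(views, sync_indices)
for name, frames in trimmed.items():
    print(name, "length:", len(frames), "first image:", frames[0])
```

After trimming, both views contain the same number of images and the synchronizing image is found at the same position in each, so the first images of the two views correspond to the same instant.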

An explanation of the trimming process is presented in greater detail elsewhere. Please refer to the online manuals of the RealCap for this description.