Raquel Dosil Lago

Motion Segmentation using Composite Frequency Features

Some Examples

In the following, some results are presented to show the behavior of the method in problematic situations. The results are compared to an alternative implementation where:

the initial state for the first frame is defined by user interaction,
the initial state for subsequent frames is the segmentation of the previous frame and
the external energy is the same as with our method, but replacing the odd-symmetric representation of the visual pattern by the inter-frame difference:
I_t(x, y, t_k) = |I(x, y, t_k) - I(x, y, t_{k-1})|
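
As a minimal sketch of the inter-frame difference feature used by the alternative implementation, the snippet below computes the absolute difference between consecutive frames with NumPy. The function name, the (T, H, W) array layout and the toy sequence are assumptions made for illustration; they are not taken from the original implementation.

```python
import numpy as np

def inter_frame_difference(frames):
    """Return |I(x, y, t_k) - I(x, y, t_{k-1})| for k = 1 .. T-1.

    `frames` is assumed to be an array of shape (T, H, W).
    """
    # Convert to float first so that unsigned integer images do not wrap around.
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(frames[1:] - frames[:-1])

# Hypothetical toy sequence: a bright 2x2 square moving one pixel per frame.
sequence = np.zeros((3, 4, 6), dtype=np.uint8)
for k in range(3):
    sequence[k, 1:3, k:k + 2] = 255

diff = inter_frame_difference(sequence)
```

In this toy case the difference is nonzero only where the square enters or leaves a pixel; pixels covered by the square in both frames give zero.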


Example #1

The following example shows the ability of the method to deal with changes in illumination and shape. The sequence presents several moving parts, with variable shape, speed and direction, over a textured static background. The constant brightness assumption does not hold in this case because of contrast changes. The model is able to isolate the visual pattern associated with the moving hand.

	Original data
	Motion pattern (even-symmetric representation)
	Segmentation using composite-feature based active model
	Segmentation using inter-frame based active model

Example #2

In the next example, a static scene is recorded by a moving camera. Estimating the temporal derivative between frames produces large values at every image contour. In contrast, visual pattern decomposition allows the isolation of motion patterns with different speeds, which appear as patterns with different orientations in the 3D spatio-temporal domain. This becomes clear if we visualize a cut of the image and the visual patterns in a plane parallel to the temporal axis.

Consequently, the external energy estimated from the temporal derivative feature presents deep minima all over the image and the active model is not able to distinguish foreground objects from background. The external energy in our implementation considers only the motion pattern corresponding to the foreground object, leading to a correct segmentation.
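
The relation between speed and spatio-temporal orientation can be illustrated with a small synthetic sketch. It assumes the cut is taken along a plane that contains the temporal axis (an x-t slice), and it uses a hypothetical toy sequence in which a background contour moves at 1 pixel per frame (camera motion) and a foreground object at 3 pixels per frame. The helper names are illustrative only and the code is not the visual pattern decomposition itself.

```python
import numpy as np

def xt_slice(frames, row):
    """Cut a (T, H, W) volume along one image row, giving a (T, W) x-t slice."""
    return np.asarray(frames)[:, row, :]

def trace_speed(slice_xt, value):
    """Displacement per frame of the leftmost pixel equal to `value`."""
    positions = np.array([np.flatnonzero(line == value)[0] for line in slice_xt])
    return np.diff(positions)

T, H, W = 5, 3, 40
frames = np.zeros((T, H, W), dtype=np.int64)
for t in range(T):
    frames[t, :, 2 + t:4 + t] = 1            # background contour, 1 px/frame
    frames[t, :, 10 + 3 * t:12 + 3 * t] = 2  # foreground object, 3 px/frame

cut = xt_slice(frames, row=1)
background_speed = trace_speed(cut, 1)  # slope of the background trace
object_speed = trace_speed(cut, 2)      # slope of the object trace

# The temporal difference responds to both structures alike.
temporal_difference = np.abs(np.diff(cut, axis=0))
```

In the slice, each structure leaves a straight trace whose slope equals its speed, so the two can be told apart by orientation, while the temporal difference is nonzero at both.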

	Motion Feature	External Potential	Segmentation
Temporal derivative
Rectified odd-symmetric representation of the visual pattern

Example #3

When the sampling rate is too low in relation to the speed of the moving object, it is difficult to find the correspondence between the positions of the object in two consecutive frames. This is no longer a problem when using frequency features: a correct segmentation can be achieved by using a composite frequency feature to initialize the model at each frame. When initializing with the previous segmentation instead, the model is not able to track the target if the previous segmentation does not intersect the object in the following frame.
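
The failure condition described above can be sketched with binary masks. This is only an illustration of the overlap condition under assumed values (an 8-pixel-wide object displaced by 3 or 20 pixels between frames); the helper names are hypothetical and no active model is involved.

```python
import numpy as np

def box_mask(shape, top, left, height, width):
    """Binary mask with a filled rectangle, standing in for a segmented object."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + height, left:left + width] = True
    return mask

def overlaps(previous_segmentation, current_object):
    """True if the previous segmentation shares at least one pixel with the object."""
    return bool(np.any(previous_segmentation & current_object))

shape = (20, 60)
object_at_t = box_mask(shape, 5, 10, 8, 8)
slow_object_next = box_mask(shape, 5, 13, 8, 8)  # displaced by 3 px
fast_object_next = box_mask(shape, 5, 30, 8, 8)  # displaced by 20 px

slow_case = overlaps(object_at_t, slow_object_next)
fast_case = overlaps(object_at_t, fast_object_next)
```

When the displacement exceeds the object width, as in the second case, an initial state taken from the previous segmentation has no pixel in common with the object, which is the situation in which initialization from the motion pattern is needed.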

Frame t	Frame t+1
		Original Data
		Segmentation using inter-frame based active model
		Motion pattern.
		Segmentation using composite feature based active model

Example #4

Occlusions give rise to the same problem as fast-moving objects. Again, initialization with composite frequency features leads to a correct segmentation even when the object disappears for several frames.

Frame 11	Frame 15	Frame 23
			Original Data
			Segmentation using inter-frame based active model.
			Initial state from motion pattern
			Segmentation using composite-feature based active model

In this particular example, the feature employed for initialization is the half-wave rectified even-symmetric representation of the motion pattern, rather than the energy of the motion pattern, since it provides a better localization of the object. Nevertheless, the result produced using the energy is very similar.
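
To make the two features concrete, here is a minimal 1D sketch using the conventional definitions: half-wave rectification keeps the positive part of the even-symmetric response, and the energy is the sum of the squared even- and odd-symmetric responses. The Gabor quadrature pair, its parameters and the bar signal are assumptions chosen for illustration; they are not the filters used by the method.

```python
import numpy as np

def gabor_pair(sigma=4.0, frequency=0.1, half_width=16):
    """Even (cosine) and odd (sine) 1D Gabor filters; parameters are illustrative."""
    x = np.arange(-half_width, half_width + 1)
    envelope = np.exp(-x**2 / (2.0 * sigma**2))
    even = envelope * np.cos(2.0 * np.pi * frequency * x)
    odd = envelope * np.sin(2.0 * np.pi * frequency * x)
    # Subtract the mean so the even filter does not respond to constant signals.
    return even - even.mean(), odd

def features(signal):
    """Return the half-wave rectified even response and the local energy."""
    even_filter, odd_filter = gabor_pair()
    even = np.convolve(signal, even_filter, mode="same")
    odd = np.convolve(signal, odd_filter, mode="same")
    rectified_even = np.maximum(even, 0.0)  # half-wave rectification
    energy = even**2 + odd**2               # local energy of the quadrature pair
    return rectified_even, energy

# Hypothetical input: a bright bar on a dark background.
signal = np.zeros(128)
signal[60:68] = 1.0
rectified_even, energy = features(signal)
```

Because the squared rectified response never exceeds the energy, the rectified feature can only be nonzero where the energy is nonzero, so its support is never wider than that of the energy.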

Example #5

The following example (from URL http://www.psi.toronto.edu/computerVision.html) is similar to the previous one, except that the occluding object is not static but moves in a different direction from the occluded object. The composite feature model is able to segment the two objects independently, which makes the tracking of both objects robust to occlusions. Segmentation using the inter-frame based approach fails to track both objects independently.

Frame 10	Frame 25	Frame 36
			Original Data
			Segmentation of the two people using one active model for each of them, based on the inter-frame difference.
			Visual pattern employed for segmentation in green color in bottom row (even-symmetric representation)
			Visual pattern employed for segmentation in red color in bottom row (even-symmetric representation)
			Segmentation of the two people using one active model for each of them, based on the composite-feature representation

Example #6

The following example is a fragment of the standard video sequence “coast guard”. This sequence shows two mobile objects, one of them being followed by the camera. The model classifies the filters corresponding to the moving background as non-active, excluding them from the subsequent analysis. The model is able to identify two motion patterns, one for each of the moving objects. Segmentation of the static object is not disturbed by background texture or mobile contours.

frame 8	frame 44
		Two frames of the original sequence

Corresponding frames of the amplitude representation of one of the detected composite energy features
		Corresponding frames of the amplitude representation of the other detected composite energy feature
		Segmentation results for the second composite energy feature

Example #7

This example is a sequence showing two motion patterns with different scales, speeds and directions of motion. Both patterns are occluded in different parts of the sequence. The background is not completely static: it contains local motion patterns, since the branches of the trees are moving in the wind. It is a low-resolution video with a high level of noise.

The proposed model classifies the filters contributing to the background, including the local motion in the branches of the trees, as non-active. The active filters give rise to two motion patterns, one for each moving object. In tracking, initialization with composite features solves the problem of occlusions in both situations. Image potentials from composite features avoid interference between the two motion patterns and interference from motion in background pixels.

			Three frames of the original video sequence
			Transversal cut of the original sequence (left) and the amplitude representation of the two detected composite energy features. Frame index increases downwards.
			Three frames of the tracking results using both composite energy features