SYDE 770-7 Assignment 1

SD 770-7 Course Webpage

On this page you will find some items that I worked on during the SD 770-7 course.

Mean Shift Algorithm

The mean shift algorithm was implemented in MATLAB as defined in [1]. The code for this implementation can be found below. The Epanechnikov weighting function with d = 1 and Cd = 2 was used for the histogram calculation. The object models are initialized manually by the user on the first frame.
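To make the location update concrete, here is a minimal sketch of one mean shift step in Python (not the MATLAB code used for the results). It assumes the form given in [1] for the Epanechnikov kernel, where the new center is the average of the pixel positions weighted by sqrt(q_u / p_u) for each pixel's histogram bin u. The function name and the (x, y, bin) pixel representation are hypothetical choices made for illustration.

```python
import math


def mean_shift_step(pixels, q, p):
    """One mean shift location update for an Epanechnikov kernel.

    pixels: list of (x, y, bin_index) for pixels inside the current window.
    q: target model histogram; p: candidate histogram at the current location.
    Returns the new window center (x, y), or None if every weight is zero.
    """
    sum_w = sum_x = sum_y = 0.0
    for x, y, b in pixels:
        if p[b] <= 0.0:
            continue  # skip empty candidate bins to avoid dividing by zero
        w = math.sqrt(q[b] / p[b])
        sum_w += w
        sum_x += w * x
        sum_y += w * y
    if sum_w == 0.0:
        return None
    return (sum_x / sum_w, sum_y / sum_w)


# Toy example: the model is entirely in bin 1, and the bin-1 pixels lie to the right.
pixels = [(0, 0, 0), (1, 0, 0), (2, 0, 1), (3, 0, 1)]
q = [0.0, 1.0]
p = [0.5, 0.5]
new_center = mean_shift_step(pixels, q, p)
```

In the toy example the window is pulled toward the pixels whose bin is over-represented in the model. A full tracker would repeat this step until the center stops moving.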

There are two main limitations of the implementation. First, scale change was not implemented: the ellipse surrounding the tracked object keeps a fixed size. Second, the ellipse is always oriented horizontally or vertically. Ideally, we would perform the tracking per [1] by obtaining the translation using the mean shift algorithm, then the scale change by maximizing the Bhattacharyya coefficient. We would then obtain the orientation of the ellipse by searching for the orientation that maximizes the Bhattacharyya coefficient.
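The scale selection described above was not implemented, but the idea can be sketched briefly. The Python example below computes the Bhattacharyya coefficient between two normalized histograms and picks, from a few candidate scales, the one whose histogram best matches the model. The function names and the dictionary of candidate histograms are assumptions made for illustration.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def best_scale(candidates, q):
    """Return the scale whose candidate histogram best matches the model q.

    candidates: dict mapping a scale factor to the histogram measured at that scale.
    """
    return max(candidates, key=lambda s: bhattacharyya(candidates[s], q))


# Toy example: the unscaled window matches the model exactly.
q = [0.5, 0.5]
candidates = {0.9: [0.9, 0.1], 1.0: [0.5, 0.5], 1.1: [0.7, 0.3]}
chosen = best_scale(candidates, q)
```

The same search could be run over candidate orientations instead of scales.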

I present the results for some test cases below, with discussions and video. (Keep in mind that scale and orientation changes have not been implemented.) Let us begin by looking at the object model representation.

Object Model

A very simple object model is used for this algorithm. We represent an object by its grayscale histogram. However, the histogram is weighted so that pixels near the center of the tracked region count more than pixels near its edges. The Epanechnikov weighting function is used per [1].
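The weighting can be illustrated with a short Python sketch (again, not the MATLAB code used here). It assumes a rectangular patch given as a list of rows of 8-bit gray values and normalizes pixel coordinates so that the patch border lies near radius 1. Because the histogram is normalized at the end, the constant factor of the Epanechnikov profile is dropped. All names are hypothetical.

```python
def epanechnikov_profile(r2):
    """Epanechnikov profile up to a constant factor; r2 is the squared normalized radius."""
    return 1.0 - r2 if r2 < 1.0 else 0.0


def weighted_histogram(patch, n_bins, max_value=255):
    """Kernel-weighted, normalized grayscale histogram of a rectangular patch."""
    height, width = len(patch), len(patch[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ry, rx = height / 2.0, width / 2.0
    hist = [0.0] * n_bins
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            r2 = ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2
            b = min(value * n_bins // (max_value + 1), n_bins - 1)
            hist[b] += epanechnikov_profile(r2)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy example: one bright center pixel surrounded by dark pixels.
patch = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
hist = weighted_histogram(patch, n_bins=2)
```

In the toy patch the bright pixel is 1 of 9 pixels, but its bin receives more than 1/9 of the histogram mass because it sits at the center.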

Test 1: Change in Number of Histogram Bins

Figure 1: 5 Histogram Bins   Figure 2: 25 Histogram Bins

From Figures 1 and 2 we can see a slight improvement in the trajectory when the number of bins is increased from 5 to 25. This improvement is very minor. As a result, we can say that the algorithm is robust to the number of bins, as long as the number is chosen so that the histograms of the tracked object and the background remain separable.

Test 2: Similar Background Histogram

Figure 3: Tracking lost due to background sharing same histogram distribution as target.

When a region of the background has a histogram distribution similar to that of the object, the object and the background region cannot be differentiated when the object moves over this region. The result is that the algorithm can converge to the background and lose the object being tracked (as illustrated in Figure 3). Select frames from this test sequence are presented below.

Figure 4: Shows where tracking fails.

Test 3: Illumination Changes

Figure 5: Tracking lost due to illumination changes causing model and target histograms to differ.

In Figure 5 the object is lost due to illumination changes. When the tracked object passes from a well-illuminated region to a poorly illuminated region, the model histogram no longer represents the target. As a result the object is no longer tracked. Figure 6 provides some frames from this test case.

Figure 6: Shows where tracking fails.

Test 4: Object to Object Occlusion

Figure 7: Object 1 Tracked Results   Figure 8: Object 2 Tracked Results

Finally, if two objects with the same grayscale distribution pass by or overlap each other, the two objects can be confused. In some cases the original object continues to be tracked (e.g., Figure 7), and in other cases we end up tracking the second object (e.g., Figure 8). Select frames of these sequences are shown in Figures 9 and 10.

Figure 9: Object is correctly tracked after object to object overlap.
Figure 10: Object is not correctly tracked after object to object overlap.


Based on the test sequences we can see that the mean shift algorithm works when there is a clear separation between the features of the background and those of the object being tracked. Furthermore, the features of the tracked object must stay constant over time, and the object must not pass by another object with similar features.

The selection of the feature is important. Choosing a feature that is less sensitive to illumination changes would avoid most of the failures caused by illumination changes.

KLT Tracker

The KLT algorithm implementation found online at [2] is used for testing, with the default parameter settings. As with the Mean Shift algorithm, rotations and scale changes of the object ellipse (ellipse surrounding the object) are not considered when tracking. Furthermore, as with the Mean Shift algorithm, the starting location of the object must be manually initialized.

Object tracking is accomplished using two methods: simple track and advanced track.

Simple Track:

All KLT features within the object region (ellipse) are identified, then these features are tracked using implementation [2]. The center location of the tracked object is assumed to be the centroid of all KLT feature points. The size of the object is fixed to the initial size specified by the user.
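As a rough illustration of the simple track method, the Python sketch below selects the features that fall inside the initial ellipse and then places the object center at the centroid of those features in a later frame. The dictionaries mapping a feature id to an (x, y) position are a hypothetical stand-in for the output of the KLT tracker, and the function names are made up for this example.

```python
def inside_ellipse(pt, center, axes):
    """True if pt lies inside the axis-aligned ellipse (center, semi-axes)."""
    dx = (pt[0] - center[0]) / axes[0]
    dy = (pt[1] - center[1]) / axes[1]
    return dx * dx + dy * dy <= 1.0


def select_features(features, center, axes):
    """Ids of the features that lie inside the user-specified ellipse."""
    return {fid for fid, pt in features.items() if inside_ellipse(pt, center, axes)}


def centroid(points):
    """Average position of a list of (x, y) points, or None if the list is empty."""
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Toy example: features 0 and 1 are on the object, feature 2 is far away.
first_frame = {0: (10.0, 10.0), 1: (12.0, 11.0), 2: (50.0, 50.0)}
next_frame = {0: (13.0, 10.0), 1: (15.0, 11.0), 2: (50.0, 50.0)}
ids = select_features(first_frame, center=(11.0, 10.0), axes=(5.0, 3.0))
object_center = centroid([next_frame[i] for i in ids if i in next_frame])
```

Features lost by the tracker simply drop out of the centroid, which is why the result depends so strongly on the initial selection.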

Advanced Track:

This is the same as the first method, except that at each new frame, once the ellipse has been tracked to the current frame, all new KLT features within the ellipse are added to the set of features being tracked. If these newly added features move out of the ellipse or are stationary for the next T frames, they are removed from the set. Replacing KLT features gives us the potential to track the object for a longer time period. However, a newly added feature can belong to the background or to another object in the frame. If it belongs to the background, it will be stationary as the tracked object moves, and thus will be eliminated after T frames. However, if the feature belongs to another object and that object is moving, the feature will not be removed. This will be illustrated later in the results section.
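The removal rule can be sketched in Python as follows. This is an illustration under stated assumptions rather than the MATLAB code used for the results: "stationary" is taken to mean that a feature moved less than a small threshold eps between frames, the stationary count is over consecutive frames, and the track dictionary layout is hypothetical.

```python
import math


def inside_ellipse(pt, center, axes):
    """True if pt lies inside the axis-aligned ellipse (center, semi-axes)."""
    dx = (pt[0] - center[0]) / axes[0]
    dy = (pt[1] - center[1]) / axes[1]
    return dx * dx + dy * dy <= 1.0


def prune_features(tracks, center, axes, T, eps=0.5):
    """Drop features that left the ellipse or were stationary for T consecutive frames.

    tracks: dict id -> {"pos": (x, y), "prev": (x, y), "still": frames without motion}.
    eps is an assumed motion threshold in pixels.
    """
    kept = {}
    for fid, tr in tracks.items():
        step = math.hypot(tr["pos"][0] - tr["prev"][0], tr["pos"][1] - tr["prev"][1])
        still = tr["still"] + 1 if step < eps else 0
        if still >= T or not inside_ellipse(tr["pos"], center, axes):
            continue
        kept[fid] = {"pos": tr["pos"], "prev": tr["prev"], "still": still}
    return kept


tracks = {
    "moving": {"pos": (12.0, 10.0), "prev": (10.0, 10.0), "still": 0},
    "background": {"pos": (8.0, 9.0), "prev": (8.0, 9.0), "still": 2},
    "escaped": {"pos": (40.0, 10.0), "prev": (38.0, 10.0), "still": 0},
}
kept = prune_features(tracks, center=(10.0, 10.0), axes=(6.0, 4.0), T=3)
```

Note that a feature on a second moving object inside the ellipse would pass both checks, which is the failure mode described above.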

Object Model

The features used to represent the object are points at which the minimum eigenvalue of the 2x2 matrix of image gradient products, summed over a small window, is above a threshold [2]. The result of thresholding the minimum eigenvalue is very similar to using the Harris corner detector [6]; as a result we can say that we represent our model with corner-like features.
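To show why this criterion picks corner-like points, the Python sketch below computes the minimum eigenvalue of the 2x2 matrix of summed gradient products used by the criterion in [2]. Central differences and a 3x3 window are assumptions made for brevity, and this is not the code of the KLT implementation used in the tests.

```python
import math


def min_eigenvalue(img, cx, cy, half=1):
    """Smaller eigenvalue of the summed gradient-product matrix around (cx, cy).

    img is a list of rows; the window must stay one pixel away from the border.
    """
    gxx = gxy = gyy = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
    trace = gxx + gyy
    root = math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy)
    return (trace - root) / 2.0


# A vertical edge: intensity changes in x only.
edge = [[0, 0, 10, 10, 10] for _ in range(5)]
# A corner: the bright region occupies the lower-right quadrant.
corner = [[10 if (x >= 2 and y >= 2) else 0 for x in range(5)] for y in range(5)]

edge_score = min_eigenvalue(edge, 2, 2)
corner_score = min_eigenvalue(corner, 2, 2)
```

The edge scores zero because its gradient varies in one direction only, while the corner has gradient energy in both directions and scores well above zero.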

  • MATLAB code: read KLT tracker output files, and MATLAB code for replacing KLT features on the object.

Test 1: Changes in Starting Location

Figure 11: Simple Track Figure 12: Simple Track
Figure 13: Simple Track
Figure 14: Advanced Track Figure 15: Advanced Track
Figure 16: Advanced Track

When we do not replace features (i.e., Simple Track), we find that the tracking performance depends greatly on the initial features chosen. If the features are chosen at a point where they remain visible for the rest of the video sequence, the object can be tracked for longer. This is illustrated in Figures 11 to 13 (above), where tracking results are presented for the same video sequence with the initial features chosen at three different frames (time periods). The initial features for these three sequences are shown in Figure 17.

Test Sequence 1 Test Sequence 2 Test Sequence 3
Figure 17: Initial KLT Features

Figure 18 illustrates how, in the advanced track method, a background KLT feature is added to the object being tracked. The added background KLT feature is then removed after T frames of no motion.

Figure 18: Background features added to tracked object are removed after T frames. The two points on the bottom of the image belong to the background and are removed after T frames.

Test 2: Object Rotation

Figure 19: Simple Track Figure 20: Advanced Track

Figure 12 (above) illustrates how KLT features vanish under out-of-plane rotations. The KLT features are chosen when the person is in a profile view, but when the person turns around we lose the selected features. When the object being tracked does not go through a rotation (Figure 13), it can be tracked for longer.

The loss of KLT features due to rotation is further illustrated in Figure 19, where tracking is lost when a change in the object's trajectory causes the object to rotate. However, the feature replacement used in the advanced track method can overcome this problem (Figure 20).

Test 3: Illumination

Figure 21: Simple Track Figure 22: Advanced Track

In Figure 21 the tracked object is lost early in the process. This is a result of high illumination and the similarity of the object to the background. When the KLT features are chosen, a shadow creates contrast between the person's shirt and the background, allowing features to be selected. However, once the person starts moving and the shadow disappears, the contrast between the shirt and the background vanishes, causing the features to be lost.

Test 4: Object to Object Occlusion

Figure 23: Simple Track Figure 24: Simple Track
Figure 25: Advanced Track Figure 26: Advanced Track

Figure 26 illustrates a scenario in which the object is tracked even though it undergoes an out-of-plane rotation. This seems to contradict the previous observation that rotations cause features to be lost. It does not: the object is successfully tracked because a KLT feature is selected on the top of the head (Figure 27), which looks similar even under rotation.

Figure 27: KLT feature is selected at a location (the head) that looks the same under rotations.

Figure 28 illustrates how two objects can be confused as a result of adding new features during tracking in the advanced track method. In this case the two objects are moving in the same direction, so both objects are tracked instead of just the single object. This may not happen if enough of the original features on the object are visible during the interaction of the two objects (Figure 29).

Figure 28: Confusion between moving nearby objects as a result of adding new KLT feature points.
Figure 29: The nearby objects are resolved correctly.


The KLT features are tracked very well as long as the corner-like features tracked using correlation are visible in all frames. However, due to 3D rotations and object deformations, these corner-like features can be lost. Consequently, the object cannot be tracked based only on the features initially selected for tracking. We must add new features from the object as the object moves.

Adding new features is a tricky business as we do not know if the new feature is part of the object being tracked, the background or another object. A simple method of adding features has been implemented here: any features within the object region that are not stationary over T frames are part of the object being tracked. This method is good at differentiating between the object being tracked and the background. However, it is very poor at differentiating between the tracked object and other nearby moving objects. Furthermore, if the object were to stand still for more than T frames, all points on the object will be assumed to belong to the background when using this method.

Another limitation of both the simple and advanced methods presented here is how the center of the tracked object is located. We set the center of the object to the centroid (average) of all KLT features that are part of the object. However, the KLT features are not distributed uniformly over the object. As a result, the centroid shifts around too much as KLT features are found on different parts of the tracked object. A better scheme would be to translate the initial center given by the user as a function of all KLT features currently being tracked. This has its own problems, such as what happens when some of the features on the object are translating and others are not.
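One possible form of that better scheme, which was not implemented here, is sketched below in Python: the center is moved by the mean frame-to-frame displacement of the features present in both frames, so newly added features do not pull the center toward themselves. The choice of the mean as the "function of all KLT features" is an assumption, as are the function name and the id-to-position dictionaries.

```python
def update_center(center, prev_pts, curr_pts):
    """Translate the center by the mean displacement of features seen in both frames.

    prev_pts, curr_pts: dict mapping a feature id to its (x, y) position.
    Returns the center unchanged if no feature is present in both frames.
    """
    common = prev_pts.keys() & curr_pts.keys()
    if not common:
        return center
    n = len(common)
    dx = sum(curr_pts[i][0] - prev_pts[i][0] for i in common) / n
    dy = sum(curr_pts[i][1] - prev_pts[i][1] for i in common) / n
    return (center[0] + dx, center[1] + dy)


# Toy example: features 1 and 2 both move by (2, 1); feature 3 is newly added.
prev_pts = {1: (0.0, 0.0), 2: (5.0, 5.0)}
curr_pts = {1: (2.0, 1.0), 2: (7.0, 6.0), 3: (100.0, 100.0)}
new_center = update_center((10.0, 10.0), prev_pts, curr_pts)
```

A median displacement could be substituted for the mean to reduce the influence of features that are not moving with the object, though that does not fully resolve the mixed-motion problem noted above.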


[1] D. Comaniciu, V. Ramesh, and P. Meer. Kernel-Based Object Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564-577, May 2003.

[2] Jianbo Shi and Carlo Tomasi. Good Features to Track. IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.

[3] PETS 2004 Dataset

[4] mmread by Micah Richert

[5] Yilmaz, A., Javed, O., and Shah, M. 2006. Object tracking: A survey. ACM Comput. Surv. 38, 4 (Dec. 2006), 13.