Object Detection and Tracking¶

            This project uses morphological transformations to process the video frames and identify the target object, and an OpenCV tracker to follow it. The video below shows a moving spot on a black screen. The goal is to track the moving spot automatically.

White Spot

Object Detection¶

Split Color Channel¶

            The video frame has three color channels. Since the target spot is white, all three channels (red, green, and blue) are at maximum pixel intensity at the spot. Therefore, a single channel is sufficient to identify the object.

blueIm, greenIm, redIm=cv2.split(frame)

Reference:
https://docs.opencv.org/4.x/d3/df2/tutorial_py_basic_ops.html

Morphological Transformation¶

            To identify the target object clearly, image processing such as morphological transformation is used to clean up the white spot's edges and remove any small noisy spots in the video frame.

            This requires setting up a kernel sized appropriately for the target object.

Morphology Closing- Dilation followed by Erosion¶

            The Dilation operation expands the white region. A pixel element is set to '1' (white) if any one pixel under the kernel is '1'.

            The Erosion operation trims the white region. A pixel element is set to '1' (white) only if all pixels under the kernel are '1'.

            The Opening operation is Erosion followed by Dilation. It is useful for removing small white noise spots in the black background: erosion first eliminates the small white spots, then dilation restores the surviving white regions to roughly their original size. The operation can be applied multiple times (or with more iterations) until all minor white spots are removed.

            The Closing operation is Dilation followed by Erosion. It is useful for filling small black holes inside white objects: dilation first fills the small black holes, then erosion trims the expanded white regions back to roughly their original size. Applying the operation repeatedly removes all minor black holes.

For example,
kernelclose = np.ones((2,2),np.uint8)
mask = cv2.morphologyEx(blueIm, cv2.MORPH_CLOSE, kernelclose, iterations= 5)

Reference:
https://docs.opencv.org/4.x/d9/d61/tutorial_py_morphological_ops.html

findContours- Identify the contour of shapes¶

            Contours are the boundaries of shapes with the same intensity. The function returns the detected contours and an optional hierarchy describing the relationships between them. In this project, the processed image contains a single round target on a black background, so findContours detects the round target directly.

For example,
contours, hierarchy = cv2.findContours(mask, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

Reference:
https://docs.opencv.org/4.x/d4/d73/tutorial_py_contours_begin.html

minEnclosingCircle- Identify the fitted radius¶

            cv2.minEnclosingCircle determines the smallest circle that completely contains a shape or contour. The output is the location of the circle's center and its radius.

            In this project, there is only one round object. Therefore, the goal is to find the round object with the largest radius from the list of found contours.

((x,y),radius) = cv2.minEnclosingCircle(EachContour)

Reference:
https://docs.opencv.org/3.4/dd/d49/tutorial_py_contour_features.html

Rectangle- Draw the bounding box¶

            To draw a rectangle around a round object, the upper-left and lower-right corners of the rectangle are computed from the center and radius of the round object. Note that (x, y) = (0, 0) is the upper-left corner of the frame.

cv2.rectangle(frame, (cornerUx,cornerUy),(cornerLx,cornerLy), lineColor, lineThickness)

Reference:
https://docs.opencv.org/4.x/dc/da5/tutorial_py_drawing_functions.html

Object Tracking¶

            Object tracking locates an object in successive frames of a video. Tracking is faster than detection because much of the needed information is already available from the previous frame, such as the object's appearance, its location, and the direction and speed of its motion. In the current frame, only a small region around the estimated location needs to be searched.

            Tracking can maintain continuity when detection fails, for example while the object is briefly occluded. Tracking can also assign an ID to each tracked object, which a per-frame detector does not provide. A tracker may additionally use an online classifier to confirm that the target object is still inside the bounding box.

Types of Tracking algorithms¶

Boosting¶

Boosting is a legacy algorithm that evaluates the classifier only at the current location of the object. The initial bounding box is taken as a positive example for the object classifier. Its successors are MIL and KCF.

Multiple Instance Learning ( MIL )¶

In addition to evaluating the current location of the object, MIL searches a small neighborhood around it. However, MIL does not recover from full occlusion.

Kernelized Correlation Filters (KCF)¶

KCF is an upgrade of MIL. It exploits the overlap among the multiple positive samples, whose mathematical properties make tracking more efficient. It is known for its speed and good performance in many situations. However, it still does not recover from full occlusion.

Tracking, Learning, and Detection (TLD)¶

TLD is a legacy algorithm that decomposes long-term tracking into three components: tracking, learning, and detection. The detector localizes the object's appearance and corrects the tracker when necessary. TLD copes with larger changes in scale, motion, and occlusion, and can recover an object after it has been occluded over multiple frames. However, the tracker may jump to a similar object nearby and tends to produce many false positives.

MedianFlow¶

MedianFlow is a legacy algorithm that tracks the object along both the forward and backward trajectories and compares them, so it can reliably report tracking failure. However, it only works for small motion between frames without occlusion.

GOTURN¶

GOTURN is based on a Convolutional Neural Network (CNN). It is robust to viewpoint changes, lighting changes, and object scale changes. However, it does not work well under occlusion.

Channel and Spatial Reliability Tracking (CSRT )¶

CSRT is a feature-based tracker that uses a spatial reliability map and channel information, which lets it track non-rectangular regions or objects. CSRT is a more advanced and accurate but slower version of the KCF tracker and is more robust to changes in the object's appearance.

Minimum Output Sum of Squared Error (MOSSE)¶

MOSSE is similar to the KCF tracker but uses the Minimum Output Sum of Squared Error (MOSSE) metric to train the correlation filters, which makes it faster than KCF. MOSSE produces stable correlation filters which minimize the sum of squared errors between the actual correlation output and the predicted correlation output.

It is a lightweight tracker robust to variations in lighting, scale, pose, and non-rigid deformations. It also detects occlusion based on the peak-to-sidelobe ratio, which lets the tracker pause and resume where it left off when the object reappears. However, it is not as accurate as KCF or CSRT. To further improve tracking performance, users can switch to deep-learning-based trackers.



Reference:
https://learnopencv.com/object-tracking-using-opencv-cpp-python/

Example¶

            In this project, the CSRT algorithm is used to demonstrate the tracking performance.

The steps are:

(1) Initialization:
Create a tracker object with the factory function for the desired tracker, e.g. cv2.TrackerCSRT_create() for CSRT.

(2) Define the initial bounding box of the object to be tracked.
Initialize the tracker with the first frame and the bounding box using tracker.init(frame, bbox).

(3) Update:
In each subsequent frame, update the tracker using tracker.update(frame).
The update method returns a boolean indicating success or failure and the updated bounding box.

### Set Up Trackers #################################
tracker_types = ['BOOSTING', 'MIL','KCF', 'TLD', 'MEDIANFLOW', 'GOTURN', 'CSRT', 'MOSSE']
tracker_type = tracker_types[6]

if tracker_type == 'BOOSTING':
    # Note: in OpenCV 4.5+ the legacy trackers (Boosting, TLD,
    # MedianFlow, MOSSE) live in the cv2.legacy module,
    # e.g. cv2.legacy.TrackerBoosting_create()
    tracker = cv2.TrackerBoosting_create()
elif tracker_type == 'MIL':
    tracker = cv2.TrackerMIL_create()
elif tracker_type == 'KCF':
    tracker = cv2.TrackerKCF_create()
elif tracker_type == 'TLD':
    tracker = cv2.TrackerTLD_create()
elif tracker_type == 'MEDIANFLOW':
    tracker = cv2.TrackerMedianFlow_create()
elif tracker_type == 'GOTURN':
    tracker = cv2.TrackerGOTURN_create()
elif tracker_type == "CSRT":
    tracker = cv2.TrackerCSRT_create()
elif tracker_type == "MOSSE":
    tracker = cv2.TrackerMOSSE_create()
else:
    tracker = None
    print('Incorrect tracker name')
    print('Available trackers are:')
    for t in tracker_types:
        print(t)
        

cap = cv2.VideoCapture(filename)
if not cap.isOpened():
    print("Error opening video stream or file")

while cap.isOpened():

Capture frame-by-frame¶

    ret, frame = cap.read()

    if ret:

        if count == 0:
            # First frame: detect the spot and initialize the tracker.
            count += 1
            bbox = findObject(frame)
            ok = tracker.init(frame, bbox)

            frame = DrawInitialDetectedBox(frame, bbox)

            cv2.imshow(windowName, frame)

        else:

The update method¶

The update method returns the location of the newly tracked object. It returns false when the track is lost, either because the object left the video frame or because the tracker failed to follow it.
In either case, the detection algorithm is run again: the moving spot is re-detected and the tracker is re-initialized.

            ok, bbox = tracker.update(frame)

            if ok:

                frame=DrawDetectedBox(frame, bbox)

            else :

Tracking failure¶

                tracker = cv2.TrackerCSRT_create()
                bbox=findObject(frame)

                ok = tracker.init(frame, bbox)

                frame=DrawDetectedBox(frame, bbox)

                cv2.imshow(windowName, frame)

Spot Tracking