Electrical & Computer Engineering, Department of


Date of this Version

Spring 4-17-2013


A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Electrical Engineering, Under the Supervision of Professors Sina Balkır and Senem Velipasalar. Lincoln, Nebraska: May, 2013

Copyright (c) 2013 Youlu Wang


Multiple cameras have been used to improve the coverage and accuracy of visual surveillance systems. An estimated 30 million surveillance cameras are now deployed in the United States. The large amount of video data generated by these cameras necessitates automatic activity analysis, and automatic object detection and tracking are essential steps before any activity or event analysis.

Most work on automatic tracking of objects across multiple camera views has considered systems that rely on a back-end server to process video inputs from multiple cameras. In this dissertation, we propose distributed camera systems that communicate peer to peer. Each camera in the proposed systems performs object detection and tracking individually and exchanges only a small amount of data for consistent labeling. With lightweight and robust algorithms running in each camera, the systems are capable of tracking multiple objects in real time. The cameras in the system may have overlapping or non-overlapping views. With partially overlapping views, the object labels can be handed off between cameras based on geometric relations. Most camera systems with overlapping views attach cameras to PCs and communicate via Ethernet, which limits flexibility and scalability. With the advances in VLSI technology, smart cameras have been introduced. A smart camera not only captures images, but also includes a processor, memory, and a communication interface, making it a stand-alone unit. We first present a wireless embedded smart camera system for cooperative object tracking and detection of composite events. Each camera is a CITRIC mote consisting of a camera board and a wireless mote. All the processing is performed on the camera boards. Power consumption of the proposed system is analyzed based on measurements of operating currents for different scenarios.
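As an illustration of handing off labels between partially overlapping views, the sketch below maps an object's ground-plane point from one camera into a neighboring camera and gives the label to the nearest local track. The abstract says only that handoff is "based on geometric relations"; the use of a planar homography, the function names, and the distance threshold are assumptions made for this example, not the dissertation's stated method.

```python
import math

def apply_homography(H, point):
    """Map an image point through a 3x3 homography H (nested lists)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)

def hand_off_label(H_a_to_b, foot_point_a, label_a, tracks_b, max_dist=20.0):
    """Assign camera A's label to the closest track in camera B.

    tracks_b maps a local track id to its (x, y) foot point in camera B.
    Returns (track_id, label_a), or None if no track lies within max_dist
    pixels of the projected point. max_dist is a hypothetical threshold.
    """
    px, py = apply_homography(H_a_to_b, foot_point_a)
    best_id, best_dist = None, max_dist
    for track_id, (tx, ty) in tracks_b.items():
        dist = math.hypot(px - tx, py - ty)
        if dist <= best_dist:
            best_id, best_dist = track_id, dist
    if best_id is None:
        return None
    return (best_id, label_a)

# Toy example: the two views differ by a 100-pixel horizontal shift.
H = [[1.0, 0.0, 100.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
tracks_in_b = {7: (152.0, 198.0), 8: (400.0, 50.0)}
match = hand_off_label(H, (50.0, 200.0), "person-3", tracks_in_b)
```

In this form, only the label and one point per object would need to cross the wireless link, which is consistent with the small data exchange described above.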

On the other hand, in wide-area tracking applications, it is not always realistic to assume that all the cameras in the system have overlapping fields of view. Tracking across non-overlapping views presents more challenges due to the lack of spatial continuity. To address this problem, we present another distributed camera system based on a probabilistic Petri Net framework. We combine appearance features of objects with travel-time evidence for target matching and consistent labeling across disjoint camera views. Multiple features are combined using adaptive weights, which are assigned based on the reliability of the features and updated online. We employ a probabilistic Petri Net to account for the uncertainties of the vision algorithms and to incorporate the available domain knowledge.
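The idea of combining several features with adaptive weights can be sketched as a weighted sum whose weights are nudged toward a per-feature reliability score after each observation. The abstract does not give the weighting rule, so the feature names, the reliability values, and the exponential update with rate `rate` are all hypothetical choices for illustration.

```python
def fuse_scores(similarities, weights):
    """Weighted average of per-feature similarity scores in [0, 1]."""
    total = sum(weights[name] for name in similarities)
    return sum(weights[name] * similarities[name] for name in similarities) / total

def update_weights(weights, reliabilities, rate=0.2):
    """Move each weight toward its feature's reliability, then renormalize.

    reliabilities holds a score in [0, 1] per feature. How such a score is
    measured is not specified in the abstract; it is an assumed input here.
    """
    blended = {
        name: (1.0 - rate) * weights[name] + rate * reliabilities[name]
        for name in weights
    }
    total = sum(blended.values())
    return {name: value / total for name, value in blended.items()}

# Toy example with two cues: an appearance similarity and a travel-time score.
weights = {"appearance": 0.5, "travel_time": 0.5}
candidate = {"appearance": 0.8, "travel_time": 0.4}
score = fuse_scores(candidate, weights)

# Suppose appearance proved fully reliable and travel time did not.
new_weights = update_weights(weights, {"appearance": 1.0, "travel_time": 0.0})
```

Renormalizing after each update keeps the weights summing to one, so fused scores stay comparable over time.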

Synchronization is another important problem for multi-camera systems, because it is essential to know the precise temporal correspondence between the video data captured by different cameras. We present a computationally efficient and robust method for temporally calibrating video sequences from unsynchronized cameras. As opposed to expensive hardware-based synchronization methods, our algorithm is based solely on video processing. The algorithm matches and aligns object trajectories using the Longest Consecutive Common Subsequence, and thereby recovers the frame offset between video sequences.
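A minimal sketch of recovering a frame offset from a longest consecutive common subsequence follows. It assumes each camera's trajectory has already been encoded as one symbol per frame (for example, a quantized motion direction). That encoding and the function names are assumptions for this example, since the abstract does not describe how trajectories are represented.

```python
def longest_consecutive_common_subsequence(a, b):
    """Return (length, start_a, start_b) of the longest run of consecutive
    elements that appears in both sequences, using dynamic programming."""
    best_len, best_start_a, best_start_b = 0, 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of elements.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_start_a = i - curr[j]
                    best_start_b = j - curr[j]
        prev = curr
    return best_len, best_start_a, best_start_b

def estimate_frame_offset(symbols_a, symbols_b):
    """Offset k such that frame i of A matches frame i - k of B,
    or None when the sequences share no symbol."""
    length, start_a, start_b = longest_consecutive_common_subsequence(
        symbols_a, symbols_b
    )
    if length == 0:
        return None
    return start_a - start_b

# Toy example: camera B sees the same motion pattern two frames later.
cam_a = [0, 0, 1, 2, 3, 4, 5, 0]
cam_b = [9, 9, 9, 9, 1, 2, 3, 4, 5, 9]
offset = estimate_frame_offset(cam_a, cam_b)
```

The table-based search takes time proportional to the product of the two sequence lengths and keeps only two rows in memory.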

As the number of cameras in a system increases, cost and flexibility become important factors to consider. The cost of each camera node rises with the resolution of its image sensor. We present a possible way of employing low-cost, low-resolution sensors to achieve higher-resolution images. In this system, four embedded cameras with customized low-resolution sensors are tiled in different arrangements. With the customized CMOS imager, we perform edge and motion detection on the focal plane, then stitch the four edge images together to obtain a higher-resolution edge map.

Advisers: Sina Balkır and Senem Velipasalar