Jingjiang College of Jiangsu University
Foreign Literature Translation

Student ID: 3081155033
Student Name: Miao Chengpeng
Major and Class: J Electronic Information Engineering 0802
Advisor: Li Zhengming (Professor)
June 2012

A System for Remote Video Surveillance and Monitoring

The thrust of CMU research under the DARPA Video Surveillance and Monitoring (VSAM) project is cooperative multi-sensor surveillance to support battlefield awareness. Under our VSAM Integrated Feasibility Demonstration (IFD) contract, we have developed automated video understanding technology that enables a single human operator to monitor activities over a complex area using a distributed network of active video sensors. The goal is to automatically collect and disseminate real-time information from the battlefield to improve the situational awareness of commanders and staff. Other military and federal law enforcement applications include providing perimeter security for troops, monitoring peace treaties or refugee movements from unmanned air vehicles, providing security for embassies or airports, and staking out suspected drug or terrorist hide-outs by collecting time-stamped pictures of everyone entering and exiting the building.

Automated video surveillance is an important research area in the commercial sector as well. Technology has reached a stage where mounting cameras to capture video imagery is cheap, but finding available human resources to sit and watch that imagery is expensive. Surveillance cameras are already prevalent in commercial establishments, with camera output being recorded to tapes that are either rewritten periodically or stored in video archives. After a crime occurs (a store is robbed or a car is stolen), investigators can go back after the fact to see what happened, but of course by then it is too late. What is needed is continuous 24-hour monitoring and analysis of video surveillance data to alert security officers to a burglary in progress, or to a suspicious individual loitering in the parking lot, while options are still open for avoiding the crime.

Keeping track of people, vehicles, and their interactions in an urban or battlefield environment is a difficult task. The role of VSAM video understanding technology in achieving this goal is to automatically “parse” people and vehicles from raw video, determine their geolocations, and insert them into a dynamic scene visualization. We have developed robust routines for detecting and tracking moving objects. Detected objects are classified into semantic categories such as human, human group, car, and truck using shape and color analysis, and these labels are used to improve tracking using temporal consistency constraints. Further classification of human activity, such as
walking and running, has also been achieved. Geolocations of labeled entities are determined from their image coordinates using either wide-baseline stereo from two or more overlapping camera views, or intersection of viewing rays with a terrain model from monocular views. These computed locations feed into a higher-level tracking module that tasks multiple sensors with variable pan, tilt, and zoom to cooperatively and continuously track an object through the scene. All resulting object hypotheses from all sensors are transmitted as symbolic data packets back to a central operator control unit,
where they are displayed on a graphical user interface to give a broad overview of scene activities. These technologies have been demonstrated through a series of yearly demos, using a testbed system developed on the urban campus of CMU.

Detection of moving objects in video streams is known to be a significant, and difficult, research problem. Aside from the intrinsic usefulness of being able to segment video streams into moving and background components, detecting moving blobs provides a focus of attention for recognition, classification, and activity analysis, making these later processes more efficient, since only “moving” pixels need be considered. There are three conventional approaches to moving object detection: temporal differencing, background subtraction, and optical flow. Temporal differencing is very adaptive to dynamic environments, but generally does a poor job of extracting all relevant feature pixels. Background subtraction provides the most complete feature data, but is extremely sensitive to dynamic scene changes due to lighting and extraneous events. Optical flow can be used to detect independently moving objects in the presence of camera motion; however, most optical flow computation methods are computationally complex and cannot be applied to full-frame video streams in real time without specialized hardware.

Under the VSAM program, CMU has developed and implemented three methods for moving object detection on the VSAM testbed. The first is a combination of adaptive background subtraction and three-frame differencing. This hybrid algorithm is very fast and surprisingly effective; indeed, it is the primary algorithm used by the majority of the SPUs in the VSAM system. In addition, two new prototype algorithms have been developed to address shortcomings of this standard approach. First, a mechanism for maintaining temporal object layers is developed to allow greater disambiguation of moving objects that stop for a while, are occluded by other objects,
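The hybrid scheme described above can be sketched in a few lines of code. The following is a minimal illustration, not the VSAM implementation: three-frame differencing flags pixels that differ from both of the two preceding frames, adaptive background subtraction recovers more of the object region, and the background model is updated only at pixels currently judged stationary, so a stopped object is not immediately absorbed. As a simplification, the final mask here is the union of the two cues (the full method grows regions outward from the differencing seeds); the threshold and learning rate are arbitrary illustrative values.

```python
import numpy as np

def detect_moving_pixels(frames, threshold=15.0, alpha=0.95):
    """Hybrid moving-object detection on a sequence of grayscale frames
    (NumPy arrays). Returns one boolean mask per frame from the third on."""
    background = frames[0].astype(np.float64)  # initial background estimate
    prev2 = frames[0].astype(np.float64)
    prev1 = frames[1].astype(np.float64)
    masks = []
    for frame in frames[2:]:
        cur = frame.astype(np.float64)
        # Three-frame differencing: a pixel is "legitimately moving" only if
        # it differs from BOTH of the two preceding frames.
        moving = (np.abs(cur - prev1) > threshold) & \
                 (np.abs(cur - prev2) > threshold)
        # Background subtraction: fills in interior pixels of slowly moving
        # or uniformly colored objects that frame differencing misses.
        foreground = np.abs(cur - background) > threshold
        mask = moving | foreground  # simplified combination of the two cues
        # Adapt the background only where nothing is detected, so transient
        # changes do not corrupt the model.
        stationary = ~mask
        background[stationary] = (alpha * background[stationary]
                                  + (1.0 - alpha) * cur[stationary])
        masks.append(mask)
        prev2, prev1 = prev1, cur
    return masks
```

Gating the background update on the detection mask is what makes the subtraction "adaptive": gradual lighting changes are folded into the model, while pixels covered by a detected object keep their old background values.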