Welcome to CSML


Background subtraction (BS) is a widely used technique for generating a foreground mask, that is, a binary image marking the pixels that belong to moving objects in a scene captured by a static camera. In this study, we present a background subtraction pipeline for video that combines classical computer vision with deep learning techniques.
The dataset, CDNET, can be downloaded from here. It contains 11 video categories with 4–6 video sequences in each category. We used only eight of these sequences: busStation, canoe, fountain02, highway, office, park, peopleInShade, and sidewalk. Each video file (.zip or .7z) can be downloaded separately. Alternatively, all the video files within one category can be downloaded as a single .zip or .7z file. When uncompressed, each video file becomes a directory that contains the following:
a sub-directory named “input” containing a separate JPEG file for each frame of the input video
a sub-directory named “groundtruth” containing a separate BMP file for each frame of the groundtruth
an empty folder named “results” for binary results (one binary image per frame for each video you have processed)
files named “ROI.bmp” and “ROI.jpg” showing the spatial region of interest
a file named “temporalROI.txt” containing two frame numbers. Only the frames in this range will be used to calculate your score
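The directory layout above can be walked with a few lines of Python. This is a minimal sketch, not the loader used in this study: the helper names `read_temporal_roi` and `frames_in_roi` are hypothetical, and it assumes that “temporalROI.txt” holds two whitespace-separated integers and that each JPEG file name in “input” contains its frame number.

```python
from pathlib import Path


def read_temporal_roi(video_dir):
    """Return the (first, last) frame numbers stored in temporalROI.txt."""
    text = (Path(video_dir) / "temporalROI.txt").read_text()
    # Assumption: the file holds two whitespace-separated integers.
    first, last = (int(token) for token in text.split()[:2])
    return first, last


def frame_number(path):
    """Extract the frame number from a file name, or None if it has no digits."""
    # Assumption: the digits in the file stem are the frame number.
    digits = "".join(ch for ch in path.stem if ch.isdigit())
    return int(digits) if digits else None


def frames_in_roi(video_dir):
    """List input JPEG paths inside the temporal ROI, ordered by frame number."""
    first, last = read_temporal_roi(video_dir)
    numbered = []
    for path in (Path(video_dir) / "input").glob("*.jpg"):
        number = frame_number(path)
        if number is not None and first <= number <= last:
            numbered.append((number, path))
    return [path for _, path in sorted(numbered)]
```

Calling `frames_in_roi` on one uncompressed video directory returns only the frames that count toward the score, which keeps evaluation code from touching frames outside the range.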
The groundtruth images contain five labels, namely:
0 : Static
50 : Hard shadow
85 : Outside region of interest
170 : Unknown motion (usually around moving objects, due to semi-transparency and motion blur)
255 : Motion
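To compare a predicted binary mask with the groundtruth, the five labels have to be reduced to foreground and background. The sketch below shows one way to do this with a 256-entry lookup table. It assumes that only label 255 counts as foreground, that static pixels and hard shadows count as background, and that pixels labelled 85 or 170 are excluded from scoring; these choices and the function name `split_groundtruth` are illustrative, not taken from the pipeline described here.

```python
# Groundtruth label values as listed above.
STATIC, HARD_SHADOW, OUTSIDE_ROI, UNKNOWN, MOTION = 0, 50, 85, 170, 255


def _lookup_table(labels_mapped_to_one):
    """Build a 256-byte table mapping the given labels to 1 and all else to 0."""
    return bytes(1 if value in labels_mapped_to_one else 0 for value in range(256))


# Assumption: only "Motion" is foreground; shadows are treated as background.
FOREGROUND_TABLE = _lookup_table({MOTION})
# Assumption: "Outside region of interest" and "Unknown motion" are not scored.
VALID_TABLE = _lookup_table({STATIC, HARD_SHADOW, MOTION})


def split_groundtruth(pixels):
    """Map raw 8-bit groundtruth pixels to (foreground, valid) 0/1 byte strings."""
    foreground = pixels.translate(FOREGROUND_TABLE)
    valid = pixels.translate(VALID_TABLE)
    return foreground, valid


if __name__ == "__main__":
    sample = bytes([STATIC, HARD_SHADOW, OUTSIDE_ROI, UNKNOWN, MOTION])
    foreground, valid = split_groundtruth(sample)
    print(list(foreground), list(valid))
```

The input is the flat byte content of a single-channel groundtruth image; with Pillow, for example, it could be obtained via `Image.open(path).convert("L").tobytes()`. Metrics would then be accumulated only where `valid` is 1.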