
Visual Search


Overview

This page gives an overview of the steps taken in order to browse/search a new image/video collection. The image or video set will be called demoset. The steps are essentially executions of programs. The structure of the programs and their command-line options is described in Applications.

At some points in the process we make use of annotations. The annotations typically originate from another data set called annoset. Naturally, demoset and annoset may designate the same data set.

For clarity, we assume that each data set has its own subdirectory in a VideoSearch directory. So, we start with two directories : VideoSearch/demoset and VideoSearch/annoset.

Global Activity Diagram

inline_dotgraph_23

Legend:

inline_dotgraph_24

inline_dotgraph_25

A very terse data definition

RawData = video | still (| text)

Label = string

Annotation = (Label, shot | still | region | track) = (Label, Subject)?

FeatureDefinition = (name, Parameter(s))

Parameter = (name, value)

Feature = (FeatureDefinition, value(s), point | region | image)

TemporalFeature = (interval, Feature)

PrototypicalFeature = (Label, Feature)

Classifier = (FeatureDefinition, svm model | ...)

Concept = (Annotation, Classifier(s))

RankThread = (Label, shots)
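For readers who prefer executable definitions, the terse data model above can be mirrored as Python dataclasses. This is purely illustrative; the actual Impala classes are C++ and considerably richer.

```python
from dataclasses import dataclass
from typing import List

# Illustrative mirror of the terse data definitions above.

@dataclass
class Parameter:
    name: str
    value: str

@dataclass
class FeatureDefinition:
    name: str
    parameters: List[Parameter]

@dataclass
class Feature:
    definition: FeatureDefinition
    values: List[float]
    scope: str          # "point" | "region" | "image"

@dataclass
class Annotation:
    label: str          # Label = string
    subject: str        # "shot" | "still" | "region" | "track"

@dataclass
class Concept:
    annotation: Annotation
    classifiers: List[str]   # e.g. names of svm model files
```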

Scripts

inline_dotgraph_26


1 - Video data preparation

Preparation of the video data set includes video segmentation and the export of keyframes, thumbnails (small keyframes), and stills (up to 16 frames per shot). It comprises the following steps:

Scripts:

The actual scripts are:

Activity diagram:

inline_dotgraph_27

Class diagram:

inline_dotgraph_28

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D101 | VideoSet | VideoData/videoset.txt | video set definition | A101
D100 | RgbDataSrc | VideoData/vid1.mpg | video file | external
D102a | RgbDataSrcInfo | VideoData/vid1.mpg.info | video info file | A102
D102b | Mpeg7Doc | MetaData/shots/vid1.mpg/ | shots in mpeg7 | A102
D103a | Keyframes | VideoIndex/keyframes.tab | | A103
D103b | Segmentation | VideoIndex/segmentation.tab | | A103
D106a | Stills | VideoIndex/stills.tab | | A106
D103c | ImageSet | ImageData/thumbnails.txt | image set definition | A103
D103d | | ImageData/keyframes.txt | | A103
D106b | | ImageData/stills.txt | | A106
D105 | ImageSet | ImageArchive/thumbnails.raw | archive for whole set | A105
D104 | ImageSet | ImageArchive/keyframes/vid1.mpg/ | archive per video | A104
D107 | ImageSet | ImageArchive/stills/vid1.mpg/ | | A107
D108 | RgbDataSrcKeyframes | Frames/vid1.mpg/images.raw | archive per video | A108
D102c | SimilarityTableSet | SimilarityData/Frames/streamConcepts.txt/no_model/direct/vid1.mpg/ | shot boundary similarity | A102

Computation:

Activity | windows | linux | (data) parallel | distributed | media grid
ShotSegmentation | + | + | ? | + | todo
IndexSegmentation | + | + | - | - | todo
ExportKeyframes | + | + | - | + | todo
MakeThumbnails | + | + | - | - | ?
ExportStills def | + | + | - | - | ?
ExportStills data | + | + | - | + | todo
ExportFrames | + | + | - | + | todo

A101 - Definition of video collection

We assume that the raw video data is stored in (subdirectories of) VideoSearch/demoset/VideoData (from here on, we omit VideoSearch in path specifications). To define the set of videos to be processed, we generate a text file with one line per video. Each line is the path to the video file, relative to the VideoData directory. The text file is stored in demoset/VideoData/demoset.txt.

# generate video set definition with .mpg files on linux
cd demoset/VideoData
/usr/bin/find . -iname "*.mpg" -print > demoset.txt

Within Impala, the text file is used via the Impala::Core::VideoSet::VideoSet class.
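As an illustration, the sketch below mimics how such a set definition file could be read: one relative path per line, rooted at the VideoData directory. This is hypothetical code, not the actual VideoSet implementation.

```python
import os

def read_video_set(definition_file, root="VideoData"):
    """Read a video set definition: one relative video path per line,
    rooted at the VideoData directory (a sketch of what the VideoSet
    class does with the file; the real parsing is C++)."""
    videos = []
    with open(definition_file) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # find(1) output starts with "./"; normpath removes it
            videos.append(os.path.normpath(os.path.join(root, line)))
    return videos
```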

A102 - Shot segmentation

Segments each video into a number of shots. The feature values used to decide upon the likelihood of a shot boundary are also stored. The --src lavcwriteidx option will generate an index file for each video to facilitate efficient and reliable random access to frame data in the future.

vidset shotsegmenter demoset.txt --report 100 --src lavcwriteidx

An Impala::Core::VideoSet::ShotSegmenter object produces an Impala::Core::VideoSet::Mpeg7Doc and an Impala::Core::Table::SimilarityTableSet per video. The Mpeg7 documents are stored in MetaData/shots/. The shot boundary similarity scores are stored in SimilarityData/Frames/streamConcepts.txt/no_model/direct/.

A103 - Index shot segmentation

Produces an index for the shot segmentation from the Mpeg7 documents of all videos.

vidset indexsegmentation demoset.txt --virtualWalk

An Impala::Core::VideoSet::IndexSegmentation object produces an Impala::Core::VideoSet::Segmentation (stored in VideoIndex/segmentation.tab), an Impala::Core::VideoSet::Keyframes (stored in VideoIndex/keyframes.tab) and Impala::Core::ImageSet::ImageSet 's for keyframes and thumbnails (stored in ImageData/keyframes.txt and ImageData/thumbnails.txt). The shot segmentation may be inspected using util dumpsegmentation demoset.txt and the keyframe definition using util dumpkeyframes demoset.txt .

A104 - Export keyframe images

The keyframes are extracted from the videos for visualization purposes. It is most convenient to store the compressed jpg data of all keyframes in one archive per video. This is called a split archive.

# separate file for each keyframe
#vidset exportkeyframes demoset.txt keyframes.txt --keyframes --report 100 --src lavc

# keyframes in one archive per video (split archive)
vidset exportkeyframes demoset.txt keyframes.txt split 90 --keyframes --report 100 --src lavc

An Impala::Core::VideoSet::ExportKeyframes object does jpg compression on the frames and puts the data in a std::vector of Impala::Core::Array::Array2dScalarUInt8's. After each video Impala::Core::Array::WriteRawListVar writes the data to ImageArchive/keyframes/*/images.raw. The result may be inspected visually using show images.raw .
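To illustrate the archive idea, the sketch below stores a list of compressed images as length-prefixed binary records. Note that this layout is an assumption for illustration only: the actual on-disk format produced by WriteRawListVar is Impala-specific and may differ.

```python
import struct

def write_archive(path, blobs):
    # Hypothetical layout: each compressed image is written as a
    # 4-byte little-endian length followed by its raw bytes.
    # (Not the real WriteRawListVar format.)
    with open(path, "wb") as f:
        for blob in blobs:
            f.write(struct.pack("<I", len(blob)))
            f.write(blob)

def read_archive(path):
    # Read the records back until end of file.
    blobs = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            (n,) = struct.unpack("<I", header)
            blobs.append(f.read(n))
    return blobs
```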

A105 - Make thumbnails of keyframe images

Configured to store compressed jpg data of all thumbnails in one archive for the whole video set at a scale of 0.5.

# separate file for each thumbnail
#imset thumbnails keyframes.txt thumbnails.txt 0.5 --imSplitArchive --report 100

# thumbnails in one archive per video (split archive)
#imset thumbnails keyframes.txt thumbnails.txt 0.5 split --imSplitArchive --report 100

# thumbnails in one archive for the whole video set
imset thumbnails keyframes.txt thumbnails.txt 0.5 archive --imSplitArchive --report 100

An Impala::Core::ImageSet::Thumbnails object scales the frames and puts the data in a std::vector of Impala::Core::Array::Array2dScalarUInt8's. At the end, Impala::Core::Array::WriteRawListVar writes the data to ImageArchive/thumbnails.raw

A106 - Define still image set

Stills provide a more elaborate overview of a shot than (a) keyframe(s). From each shot a maximum of 16 images is taken with the constraint that they should be at least 15 frames apart.

vidset exportstills demoset.txt def --stepSize 10000000 --report 1 --src lavc

An Impala::Core::VideoSet::ExportStills object produces a Impala::Core::VideoSet::Stills (stored in VideoIndex/stills) and an Impala::Core::ImageSet::ImageSet (stored in ImageData/stills.txt). The still definition may be inspected using util dumpstills demoset.txt .
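The selection constraint from A106 (at most 16 stills per shot, at least 15 frames apart) could be implemented along the following lines. This is a sketch of the constraint, not the actual ExportStills code.

```python
def select_stills(shot_start, shot_end, max_stills=16, min_gap=15):
    """Pick up to max_stills frame numbers from [shot_start, shot_end),
    spread evenly and at least min_gap frames apart (sketch of the
    A106 constraint; the real selection logic may differ)."""
    length = shot_end - shot_start
    if length <= 0:
        return []
    # The gap constraint caps how many stills fit in the shot.
    n = min(max_stills, 1 + (length - 1) // min_gap)
    if n == 1:
        return [shot_start]
    step = (length - 1) / (n - 1)   # evenly spaced, step >= min_gap
    return [shot_start + round(i * step) for i in range(n)]
```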

A107 - Export still image data

Configured to store compressed jpg data of all stills in one archive per video. The --stepSize 10000000 option means that vidset actually skips all frames after providing the first one; the ExportStills object walks over the stills by itself because vidset has no knowledge of the stills.

vidset exportstills demoset.txt data 90 --stepSize 10000000 --report 1 --src lavc

An Impala::Core::VideoSet::ExportStills object does jpg compression on the stills and puts the data in a std::vector of Impala::Core::Array::Array2dScalarUInt8's. After each video Impala::Core::Array::WriteRawListVar writes the data to ImageArchive/stills/*/images.raw.

A108 - Export (key) frames.

Export (key) frames as png images to allow for processing of frames without access to the video data. Uses png instead of jpg to avoid compression artefacts.

vidset exportframes demoset.txt split --keyframes --report 100 --src lavc

An Impala::Core::VideoSet::ExportFrames object does png compression on the frames and puts the data in a std::vector of Impala::Core::Array::Array2dScalarUInt8's. After each video Impala::Core::Array::WriteRawListVar writes the data to Frames/*/images.raw. The result may be inspected visually using show images.raw --png .


2 - Annotation process

At the moment, we use two kinds of annotations: rectangular parts of consecutive frames for proto concepts and keyframes for high level concepts.

Activity diagram:

inline_dotgraph_29

Class diagram:

todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D200 | vector<string> | Annotations/concepts.txt | labels | external
D201 | TableVxs | Annotations/annotation.vxs | video region annotation | A201
D202 | Mpeg7Doc | MetaData/annotations/concepts.txt/vid1.mpg/ | concept annotations in mpeg7 | A202
D203a | AnnotationTableSet | Annotations/Frame/concepts.txt/con.tab | frame annos for whole set | A203
D203b | AnnotationTableSet | Annotations/Shot/concepts.txt/con.tab | shot annos for whole set | A203

Computation:

Activity | windows | linux | (data) parallel | distributed | annotation grid
VisSemAnnotation | + | + | - | - | -
ConceptAnnotation | + | + | - | - | -
IndexAnnotation | + | + | - | - | -

A201 - Annotation of visual proto concepts

Visual proto concepts are based on a set of annotations. Basically, an annotation is a rectangular part of consecutive frames in a video denoting what is visible within that rectangle. The vidbrowse application supports the video region annotation process.

vidbrowse anno annotation.vxs --videoSet trec2005fsd.txt

Jan has assembled an annotation set from the videos in the TRECVID 2005 development set. This annotation set (an Impala::Core::Table::TableVxs) is stored in VideoSearch/trec2005devel/Annotations/annotation.vxs.

A202 - Concept annotation on (key)frames

For training of high level concepts we use annotations of frames. Basically, such an annotation is a Label, i.e. a string, indicating that a concept, e.g. a car, is visible within the frame. The annotation does not provide any spatial information. The annovidset application supports the frame annotation process.

annovidset

The set of Labels is defined by a plain text file and is stored in Annotations/concepts.txt. The annotations are stored as an Impala::Core::VideoSet::Mpeg7Doc per Label and per video in MetaData/annotations/concepts.txt/ .

A203 - Index concept annotations

Indexes the Mpeg7 files into a set of AnnotationTables. One AnnotationTable holds the annotations for one concept for all videos. The --segmentation option will map the annotations onto shots, i.e. a shot is positive in case it has an overlap with some positive annotation. The --keyframes option will map the annotations onto keyframes.

vidset indexannotation annoset.txt concepts.txt --segmentation --keyframes --virtualWalk

An Impala::Core::VideoSet::IndexAnnotation object produces two Impala::Core::Table::AnnotationTableSet s. The keyframe table is written to Annotations/Frame/concepts.txt/*.tab and the shot table is written to Annotations/Shot/concepts.txt/*.tab. An individual Impala::Core::Table::AnnotationTable may be inspected using table dumpannotationtable Annotations/Frame/concepts.txt/concept.tab . An overview of the annotations may be generated using table dumpannotationtableset annoset.txt concepts.txt Frame 0 .
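The overlap rule used by the --segmentation option can be sketched as follows. Shots and annotations are modeled as hypothetical (start, end) frame intervals with exclusive end; a shot is positive if it overlaps any positively annotated interval.

```python
def shots_positive(shots, annotations):
    """shots: list of (start, end) frame intervals, end exclusive.
    annotations: list of positively annotated (start, end) intervals.
    Returns one boolean per shot: True if the shot overlaps some
    positive annotation (sketch of the --segmentation mapping)."""
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]
    return [any(overlaps(shot, ann) for ann in annotations)
            for shot in shots]
```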


3a - Prototyping with annotation

Compute prototypical features of each annotation to serve as a codebook.

Activity diagram:

inline_dotgraph_30

Class diagram:

todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D300 | FeatureDefinition | | just a string | external
D301 | FeatureTableSet | Prototypes/annotation.vxs/vissem/ | Weibull prototypes | A301
D302 | FeatureTableSet | Prototypes/annotation.vxs/vissemgabor/ | Gabor prototypes | A302

Computation:

Activity | windows | linux | (data) parallel | distributed | annotation grid
VisSemTrain | + | + | ? | todo | todo
VisSemGaborTrain | + | + | ? | todo | todo

A301 - Training of VisSem proto concepts using Weibulls

Computes prototypical Weibull features for each annotation.

vidset vistrain Annotations/annotation.vxs --videoSet annoset.txt --ini ../script/vissem.ini --report 100

vissem.ini:

# feature parameters
nrScales 2
spatialSigma_s0 1.0
spatialSigma_s1 3.0
histBinCount 1001

# region definition
borderWidth 15
nrRegionsPerDimMin 2
nrRegionsPerDimMax 6
nrRegionsStepSize 4

# annotation set (for reading only)
data .;../annoset
protoDataFile annotation.vxs
protoDataType vid
protoDataSet annoset.txt

mainVidSet loads the annotations as bookmarks and gives each one to an Impala::Core::VideoSet::VisSemTrain object to process. The result is an Impala::Core::Feature::FeatureTableSet with two Impala::Core::Feature::FeatureTable s (one for each scale : sigma=1.0 and sigma=3.0). Each table contains two Weibull parameters of six features (Wx, Wy, Wlx, Wly, Wllx, and Wlly) for all annotations.

inline_dotgraph_31

The tables are stored in Prototypes/annotation.vxs/vissem/ .

A302 - Training of VisSem proto concepts using Gabor

This is very similar to A301 - Training of VisSem proto concepts using Weibulls except it uses Gabor features instead of Weibulls.

vidset visgabortrain Annotations/annotation.vxs  --videoSet annoset.txt --ini ../script/vissem.ini --report 100

An Impala::Core::VideoSet::VisSemGaborTrain object produces an Impala::Core::Feature::FeatureTableSet that is stored in Prototypes/annotation.vxs/vissemgabor/ .


3b - Prototyping without annotation

Compute prototypical features to serve as a codebook, typically using clustering.

Scripts:

The actual scripts are:

Activity diagram:

inline_dotgraph_32

Class diagram:

todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D300 | FeatureDefinition | | just a string | external
D311 | FeatureTableSet | Prototypes/clusters/vissem/ | Weibull prototypes | A311
D312 | FeatureTableSet | Prototypes/clusters/vissemgabor/ | Gabor prototypes | A312

Computation:

Activity | windows | linux | (data) parallel | distributed | cluster grid
ClusterFeaturesWeibull | + | + | ? | todo | todo
ClusterFeaturesGabor | + | + | ? | todo | todo

A311 - Clustering of Weibulls

Computes prototypical Weibull features using radius based clustering.

vidset clusterfeatures demoset.txt weibull "radius;weisim;1.0;1.5;0.5" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt weibull "radius;weisim;2.0;2.5;0.5" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt weibull "radius;weisim;3.0;3.5;0.5" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25

vissemClusterSP.ini:

doTemporalSmoothing 0
doTemporalDerivative 0
temporalSigma 0.75

# feature parameters
nrScales 2
spatialSigma_s0 1.0
spatialSigma_s1 3.0
doC 0
doRot 0

histBinCount 1001

# region definition
borderWidth 15
nrRegionsPerDimMin 2
#nrRegionsPerDimMax 14
nrRegionsPerDimMax 10
nrRegionsStepSize 4

# Lazebnik Spatial Pyramid (cvpr'06)
# 1x1 = level 0 = no spatial pyramid
# set number of 'bins' per dimension
nrLazebnikRegionsPerDimMinX 1
nrLazebnikRegionsPerDimMaxX 1
nrLazebnikRegionsStepSizeX 2
nrLazebnikRegionsPerDimMinY 1
nrLazebnikRegionsPerDimMaxY 3
nrLazebnikRegionsStepSizeY 2
overlapLazebnikRegions 0

protoDataFile clusters

An Impala::Core::VideoSet::ClusterFeatures object with an Impala::Core::Feature::VisSem object is given each keyframe and uses an Impala::Core::Feature::Clusteror object to cluster the features. The result is an Impala::Core::Feature::FeatureTableSet with an Impala::Core::FeatureTable for each parameter setting.

The tables are stored in Prototypes/clusters/vissem/ .
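Radius-based clustering itself can be sketched as a greedy pass over the feature points. Euclidean distance stands in here for the Weibull similarity measure named in the "radius;weisim;…" parameter string; the actual Clusteror implementation and its radius parameters may differ.

```python
import math

def radius_cluster(points, radius):
    """Greedy radius-based clustering: a point joins the first
    prototype within `radius`, otherwise it becomes a new prototype.
    Euclidean distance is a stand-in for the Weibull / histogram
    intersection similarity used by Impala's Clusteror."""
    prototypes = []
    for p in points:
        for proto in prototypes:
            if math.dist(p, proto) <= radius:
                break                   # absorbed by an existing cluster
        else:
            prototypes.append(p)        # founds a new cluster
    return prototypes
```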

A312 - Clustering of Gabors

This is very similar to A311 - Clustering of Weibulls except it uses Gabor features instead of Weibulls.

vidset clusterfeatures demoset.txt gabor "radius;histint;5.5;5.5;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;6.0;6.0;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;6.5;6.5;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;7.0;7.0;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;7.5;7.5;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;8.0;8.0;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;8.5;8.5;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25
vidset clusterfeatures demoset.txt gabor "radius;histint;9.0;9.0;1.0" start 500 5000 --keyframes --keyframeSrc --ini ../script/vissemClusterSP.ini --report 25

An Impala::Core::VideoSet::ClusterFeatures object with an Impala::Core::Feature::VisSem object is given each keyframe and uses an Impala::Core::Feature::Clusteror object to cluster the features. The result is an Impala::Core::Feature::FeatureTableSet with an Impala::Core::FeatureTable for each parameter setting.

The tables are stored in Prototypes/clusters/vissemgabor/ .

4 - Feature similarity computation

Compute similarity of features from bulk data to prototypical features, e.g. features from annotated data. Typically, the bulk data comprises all keyframes in the video set.

Scripts:

The actual scripts are:

Activity diagram:

inline_dotgraph_33

Class diagram:

todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D401 | FeatureTableSet | FeatureData/Keyframes/vissem/vid1.mpg/*scale* | features per video | A401
D402 | | FeatureData/Keyframes/vissem/vid1.mpg/*nrScales* | | A402
D403 | | FeatureData/Keyframes/vissemgabor/vid1.mpg/*scale* | | A403
D404 | | FeatureData/Keyframes/vissemgabor/vid1.mpg/*nrScales* | | A404
D405 | | FeatureData/Keyframes/fusionvissemgabor/vid1.mpg/*nrScales* | | A405
D406a | Mpeg7Doc | MetaData/features/vid1.mpg/vissem.xml | features in Mpeg7 | A406
D406b | | MetaData/features/vid1.mpg/vissemgabor.xml | | A406
D407a | FeatureTableSet | FeatureIndex/vissem/*nrScales* | features for whole set | A407
D407b | | FeatureIndex/vissemgabor/*nrScales* | | A407
D407c | | FeatureIndex/fusionvissemgabor/*nrScales* | | A407

Computation:

Activity | windows | linux | (data) parallel | distributed | media grid
ProtoSimilarityEval | + | + | ? | + | todo
IndexFeatures | + | + | - | - | ?
ConcatenateFeatures | + | + | - | + | todo
ExportFeatures | + | + | - | + | todo

A401 - Compute vissem Weibull features for keyframes

Compute similarity of images/frames w.r.t. the visual proto concepts.

vidset viseval demoset.txt --keyframes --keyframeSrc --ini ../script/vissem.ini --report 10

An Impala::Core::VideoSet::ProtoSimilarityEval object with an Impala::Core::Feature::VisSem object is given each keyframe and computes its similarity to all proto concepts. First, the image is divided into a number of rectangles of varying size (the size is determined by the number of regions per dimension). For each rectangle, features are computed and compared to the features of all annotations using a Weibull similarity function. The average similarity over all annotations of one concept is stored in an Impala::Core::Feature::FeatureTable as an intermediate result. The final result is computed by taking both the average and the maximum of these similarities for each combination of a sigma and a rectangle size. The result is an Impala::Core::Feature::FeatureTableSet with four Impala::Core::Feature::FeatureTable s: vissem_proto_annotation_scale_1_rpd_2, vissem_proto_annotation_scale_1_rpd_6, vissem_proto_annotation_scale_3_rpd_2, and vissem_proto_annotation_scale_3_rpd_6 (rpd stands for regions per dimension). The feature vector of one of these tables typically contains 30 values : average and maximum similarity to 15 proto concepts.

inline_dotgraph_34

The tables are stored per video in FeatureData/Keyframes/vissem/vid.mpg/ .
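The aggregation step described above (average and maximum similarity per proto concept, over the regions of one keyframe) can be sketched as below. The ordering of values inside Impala's FeatureTable is an assumption here.

```python
def proto_similarity_vector(region_sims):
    """region_sims: dict mapping proto concept name -> list of
    similarity scores, one per image region. Returns the averages
    followed by the maxima, i.e. 2 values per proto concept
    (sketch of the A401 aggregation; the exact value ordering in
    the real FeatureTable may differ)."""
    concepts = sorted(region_sims)
    avgs = [sum(region_sims[c]) / len(region_sims[c]) for c in concepts]
    maxs = [max(region_sims[c]) for c in concepts]
    return avgs + maxs
```

With 15 proto concepts this yields the 30-value vector mentioned above.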

A402 - Concatenate vissem Weibull features

Concatenate feature vectors into one long vector. Such a vector typically contains 120 values.

vidset concatfeatures demoset.txt vissem_proto_annotation_nrScales_2_nrRects_130 \
       vissem_proto_annotation_scale_1_rpd_2 vissem_proto_annotation_scale_1_rpd_6 \
       vissem_proto_annotation_scale_3_rpd_2 vissem_proto_annotation_scale_3_rpd_6 \
       --keyframes --virtualWalk

An Impala::Core::VideoSet::ConcatFeatures object is given each video id (the --keyframes and --virtualWalk options tell the Walker not to look for any actual data), locates the tables from an Impala::Core::Feature::FeatureTableSet, and concatenates them into a single Impala::Core::Feature::FeatureTable. The result is stored in FeatureData/Keyframes/vissem/vid.mpg/ and may be inspected using table dumpfeaturetable vissem_proto_annotation_nrScales_2_nrRects_130.tab .
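The concatenation itself is conceptually simple. A minimal sketch, assuming each table is a list of per-keyframe feature vectors (e.g. four 30-value vissem tables yielding one 120-value vector per keyframe):

```python
def concat_features(tables):
    """tables: list of feature tables, each a list of per-keyframe
    feature vectors in the same keyframe order. Returns one
    concatenated vector per keyframe (sketch of ConcatFeatures)."""
    # zip pairs up the rows for the same keyframe across all tables;
    # sum(rows, []) joins the vectors end to end.
    return [sum(rows, []) for rows in zip(*tables)]
```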

A403 - Compute vissem Gabor features for keyframes

This step is very similar to A401 - Compute vissem Weibull features for keyframes except that it uses Gabor features instead of Weibulls.

vidset visegaborval demoset.txt --keyframes --keyframeSrc --ini ../script/vissem.ini --report 10

An Impala::Core::VideoSet::ProtoSimilarityEval object with an Impala::Core::Feature::VisSemGabor object produces an Impala::Core::Feature::FeatureTableSet. The tables are stored per video in FeatureData/Keyframes/vissemgabor/vid.mpg/ .

A404 - Concatenate vissem Gabor features

Concatenate feature vectors into one long vector. Such a vector typically contains 120 values.

vidset concatfeatures demoset.txt vissemgabor_proto_annotation_nrScales_2_nrRects_130 \
       vissemgabor_proto_annotation_scale_2.828_rpd_2 vissemgabor_proto_annotation_scale_2.828_rpd_6 \
       vissemgabor_proto_annotation_scale_1.414_rpd_2 vissemgabor_proto_annotation_scale_1.414_rpd_6 \
       --keyframes --virtualWalk

An Impala::Core::VideoSet::ConcatFeatures object is given each video id (the --keyframes and --virtualWalk options tell the Walker not to look for any actual data), locates the tables from an Impala::Core::Feature::FeatureTableSet, and concatenates them into a single Impala::Core::Feature::FeatureTable. The result is stored in FeatureData/Keyframes/vissemgabor/vid.mpg/ and may be inspected using table dumpfeaturetable vissemgabor_proto_annotation_nrScales_2_nrRects_130.tab .

A405 - Fusion of vissem Weibull and Gabor features

Concatenate the VisSem Weibull and Gabor feature vectors into a single vector.

vidset concatfeatures demoset.txt fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 \
       vissem_proto_annotation_nrScales_2_nrRects_130 vissemgabor_proto_annotation_nrScales_2_nrRects_130 \
       --keyframes --virtualWalk

An Impala::Core::VideoSet::ConcatFeatures object is given each video id (the --keyframes and --virtualWalk options tell the Walker not to look for any actual data), locates the tables from an Impala::Core::Feature::FeatureTableSet, and concatenates them into a single Impala::Core::Feature::FeatureTable. The result is stored in FeatureData/Keyframes/fusionvissemgabor/vid.mpg/ and may be inspected using table dumpfeaturetable fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130.tab .

A406 - Export features (optional)

Export features to an Mpeg7 file per video.

vidset exportfeatures demoset.txt 25 vissem_proto_annotation_nrScales_2_nrRects_130 --keyframes --virtualWalk --segmentation
vidset exportfeatures demoset.txt 25 vissemgabor_proto_annotation_nrScales_2_nrRects_130 --keyframes --virtualWalk --segmentation

An Impala::Core::VideoSet::ExportFeatures object transfers data from Impala::Core::Feature::FeatureTable s with the given Impala::Core::Feature::FeatureDefinition into an Impala::Core::VideoSet::Mpeg7Doc. The result is stored in MetaData/features/*.mpg/vissem.xml and MetaData/features/*.mpg/vissemgabor.xml.

A407 - Index features

Create a single table with the given features of all videos in the set.

vidset indexfeatures demoset.txt vissem_proto_annotation_nrScales_2_nrRects_130 --keyframes --virtualWalk
vidset indexfeatures demoset.txt vissemgabor_proto_annotation_nrScales_2_nrRects_130 --keyframes --virtualWalk
vidset indexfeatures demoset.txt fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 --keyframes --virtualWalk

An Impala::Core::VideoSet::IndexFeatures object assembles FeatureTables with the given Impala::Core::Feature::FeatureDefinition (stored per video) into one table. The result table is stored in FeatureIndex/vissem/, FeatureIndex/vissemgabor, and FeatureIndex/fusionvissemgabor.


5 - High level concept model training

Train an (SVM) model based on annotated feature vectors.

Scripts:

The actual scripts are:

Activity diagram:

inline_dotgraph_35

Class diagram:

Todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D501a | PropertySet | ConceptModels/concepts.txt/modelname/vissem/*.best | parameters per concept | A501
D501b | | ConceptModels/concepts.txt/modelname/vissemgabor/ | | A501
D501c | | ConceptModels/concepts.txt/modelname/fusion/ | | A501
D502a | Svm | ConceptModels/concepts.txt/modelname/vissem/*.model | model per concept | A502
D502b | | ConceptModels/concepts.txt/modelname/vissemgabor/ | | A502
D502c | | ConceptModels/concepts.txt/modelname/fusion/ | | A502

Computation:

Activity | windows | linux | (data) parallel | distributed | model (parameter) grid
TrainConcepts | + | + | - | + | todo
TrainOnHoldout | + | + | - | + | todo

A501 - Parameter search for optimal model

Do a parameter search to find the optimal SVM model parameters for each feature set. Optimal parameters are determined using cross validation.

trainconcepts annoset.txt concepts.txt modelname vissem_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini
trainconcepts annoset.txt concepts.txt modelname vissemgabor_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini
trainconcepts annoset.txt concepts.txt modelname fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini

train.ini:

w1 [log0:3]
w2 [log0:3]
gamma [log-2:2]
kernel rbf
repetitions 1
episode-constrained 1
cache 900

Impala::Samples::mainTrainConcepts uses Impala::Core::Table::AnnotationTable s to assemble Impala::Core::Feature::FeatureTable s for positive and negative examples. An Impala::Core::Training::ParameterSearcher object produces an Impala::Util::PropertySet with the best model parameters. The result is stored in ConceptModels/concepts.txt/modelname/vissem/*.best.
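Assuming the [logA:B] notation in train.ini denotes powers of ten from 10^A to 10^B (an interpretation, not documented here), the enumeration of candidate SVM settings can be sketched as:

```python
import itertools

def expand(spec):
    """Expand a '[logA:B]' range as powers of ten (an assumed reading
    of the train.ini notation), or return the literal value as-is."""
    if spec.startswith("[log") and spec.endswith("]"):
        lo, hi = spec[4:-1].split(":")
        return [10.0 ** e for e in range(int(lo), int(hi) + 1)]
    return [spec]

def parameter_grid(params):
    """Yield every combination of parameter values, as a parameter
    searcher would enumerate settings for cross validation."""
    names = sorted(params)
    for combo in itertools.product(*(expand(params[n]) for n in names)):
        yield dict(zip(names, combo))
```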

A502 - Train optimal model

Use the best model parameters to train a model on the complete feature set. This step also indicates how the model performs.

trainonholdout annoset.txt concepts.txt modelname vissem_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini
trainonholdout annoset.txt concepts.txt modelname vissemgabor_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini
trainonholdout annoset.txt concepts.txt modelname fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 --ini ../script/train.ini

Impala::Samples::mainTrainConcepts uses Impala::Core::Table::AnnotationTable s to assemble Impala::Core::Feature::FeatureTable s for positive and negative examples. An Impala::Core::Training::Svm object uses the best model parameters to compute an SVM model and stores it in ConceptModels/concepts.txt/modelname/vissem/*.model. The performance measure is stored in ConceptModels/concepts.txt/modelname/vissem/*.ScoreOnSelf.


6 - High level concept classification

Compute a similarity score that indicates how well an object (frame, shot, etc.) fits a concept.

Scripts:

The actual scripts are:

Activity diagram:

inline_dotgraph_36

Class diagram:

todo

Persistency in file system (data is stored in VideoSearch/setname):

Ref | Class | Directory | Comment | Producer
D601a | SimilarityTableSet | SimilarityData/Keyframes/concepts.txt/modelname/vissem/vid.mpg/ | similarities per video | A601
D601b | | SimilarityData/Keyframes/concepts.txt/modelname/gabor/vid.mpg/ | | A601
D601c | | SimilarityData/Keyframes/concepts.txt/modelname/fusion/vid.mpg/ | | A601
D602 | | SimilarityData/Keyframes/concepts.txt/modelname/combined/vid.mpg/ | | A602
D603 | Mpeg7Doc | MetaData/similarities/concepts.txt/vid1.mpg/con.xml | similarities in mpeg7 | A603
D604 | SimilarityTableSet | SimilarityIndex/concepts.txt/modelname/combined/ | similarities for set | A604

Computation:

Activity | windows | linux | (data) parallel | distributed | media grid
ApplyConcepts | + | + | ? | + | todo
CombineConcepts | + | + | - | + | todo
ExportConcepts | + | + | - | + | todo
IndexConcepts | + | + | - | - | todo

A601 - Apply concept classifiers

Use concept classifiers to label features with a concept similarity.

vidset applyconcepts demoset.txt annoset.txt concepts.txt modelname vissem_proto_annotation_nrScales_2_nrRects_130 \
       --keyframes --virtualWalk --data ".;../annoset"
vidset applyconcepts demoset.txt annoset.txt concepts.txt modelname vissemgabor_proto_annotation_nrScales_2_nrRects_130 \
       --keyframes --virtualWalk --data ".;../annoset"
vidset applyconcepts demoset.txt annoset.txt concepts.txt modelname fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 \
       --keyframes --virtualWalk --data ".;../annoset"

An Impala::Core::VideoSet::ApplyConcepts object creates an Impala::Core::Feature::ConceptSet with the given concepts. For each concept an Impala::Core::Training::Classifier is instantiated and asked to score all features in the Impala::Core::Feature::FeatureTable found for each video. The result is an Impala::Core::Table::SimilarityTableSet with concept similarities as well as a ranking per video. It is stored in SimilarityData/Keyframes/concepts.txt/modelname/vissem/vid.mpg/.

A602 - Combine concept similarities

Combine concept similarity scores of individual classifiers into one concept similarity score.

vidset combineconcepts demoset.txt concepts.txt modelname combined avg vissem_proto_annotation_nrScales_2_nrRects_130 \
        vissemgabor_proto_annotation_nrScales_2_nrRects_130 fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130 \
        --keyframes --virtualWalk

An Impala::Core::VideoSet::CombineConcepts object loads an Impala::Core::Table::SimilarityTableSet for each feature definition. The scores in the SimTable's are averaged and ranked. The result is again a SimilarityTableSet and is stored in SimilarityData/Keyframes/concepts.txt/modelname/combined/vid.mpg/.
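The avg fusion can be sketched as follows; this is illustrative, since the real CombineConcepts operates on SimilarityTableSets rather than plain lists.

```python
def combine_and_rank(score_tables):
    """score_tables: per-classifier score lists over the same shots,
    all in the same shot order. Averages the scores per shot and
    returns (avg_scores, ranking), where ranking lists shot indices
    by descending combined score (sketch of avg fusion)."""
    n = len(score_tables[0])
    avg = [sum(table[i] for table in score_tables) / len(score_tables)
           for i in range(n)]
    ranking = sorted(range(n), key=lambda i: avg[i], reverse=True)
    return avg, ranking
```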

A603 - Export concept similarities (optional)

Export concept similarities to an Mpeg7 file per concept per video.

vidset exportconcepts demoset.txt 25 concepts.txt modelname combined --keyframes --virtualWalk --segmentation

An Impala::Core::VideoSet::ExportConcepts object transfers data from Impala::Core::Table::SimilarityTableSet with the given Impala::Core::Feature::FeatureDefinition into an Impala::Core::VideoSet::Mpeg7Doc. The result is stored in MetaData/similarities/concepts.txt/vid.mpg/concept.xml.

A604 - Index concept similarities

Create a single table with the given similarity scores of all videos in the set.

vidset indexconcepts demoset.txt concepts.txt modelname combined --keyframes --virtualWalk

An Impala::Core::VideoSet::IndexConcepts object assembles Impala::Core::Table::SimilarityTableSet s related to the given Impala::Core::Feature::FeatureDefinition (stored per video) into one table. The resulting set of tables is stored in SimilarityIndex/concepts.txt/modelname/combined/concept_*.tab.


7 - Evaluation and Search

Two ways to explore the results.

Scripts:

The actual scripts are:

A701 - Show video set processing results

A basic GUI to browse/check the results of most of the processing steps.

showvidset

showvidset.ini:

data .

videoSet demoset.txt
imageSetThumbnails thumbnails.txt
imageSetKeyframes keyframes.txt
imageSetStills stills.txt

conceptSet concepts.txt
#conceptSetSubDir vissem_proto_annotation_nrScales_2_nrRects_130
#conceptSetSubDir vissemgabor_proto_annotation_nrScales_2_nrRects_130
#conceptSetSubDir fusionvissemgabor_proto_annotation_nrScales_2_nrRects_130
featureSet vissem_proto_annotation_nrScales_2_nrRects_130;vissemgabor_proto_annotation_nrScales_2_nrRects_130
conceptAnnotations concepts.txt

imageStills 1

A702 - TRECVID search interface

GUI developed specifically for the interactive search task of TRECVID.

trecsearch

trecsearch.ini:

videoSet demoset.txt
imageSetThumbnails thumbnails.txt
imageSetKeyframes keyframes.txt

# if commented out, no stills:
imageSetStills stills.txt

#xTextSearchServer http://licor.science.uva.nl:8000/axis/services/TextSearchWS
#xTextSuggestServer http://licor.science.uva.nl:8000/axis/services/DetectorSuggestWS
#xRegionQueryServer http://146.50.0.57:8080/axis/QueryGateway.jws

maxImagesOnRow 2
initialBrowser 2


xTextSearchSet test
# xTextSearchSet devel
xTextSearchYear 2006

year 2006
trecTopicSet topics.2006.xml

threadFile tv2006_threads

#judgeFile search.qrels.tv05

#searchTopicImages mixed.txt
#searchTopicImages trec2006topic_keyframes.txt
#searchTopicThumbnails trec2006topic_thumbnails.txt

conceptSet tv2006_interactive
conceptCat tv2006_categories
conceptStripToDot 0
conceptStripExtension 1
conceptQuality tv2006_quality
conceptMapping tv2006_mapping


Job Activity Diagram

inline_dotgraph_37


Generated on Tue Mar 30 13:39:14 2010 for ImpalaDoc by  doxygen 1.5.1