Free-hand sketch provides a natural and expressive modality for interacting with computers. This project explores methods for intuitively searching video databases using sketches. Although video search is typically performed using keywords that specify content, text is cumbersome for describing scene appearance. Instead, a sketched depiction of a scene offers an orthogonal channel for constraining search. Although sketch-based image retrieval (SBIR) has received much attention, the related problem of sketch-based video retrieval (SBVR) is only sparsely researched – especially the fusion of text and sketch.
Building upon prior work in SBIR (Hu, 2011, ICIP) and SBVR (Collomosse, 2009, ICCV), we describe results from such a hybrid system. Our sketches describe objects by their semantics (e.g. horse), a sketch of their motion trajectory, and their colour; collectively these represent a natural interface for conveying multiple facets of an event. A "fingerprint" (descriptor) is extracted for each object; descriptors are matched to identify relevant videos.
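To make the matching step concrete, the following is a minimal illustrative sketch (not the system's actual descriptor or distance measure; the field names, weights, and combination rule are all assumptions): each object fingerprint pairs a semantic label with a resampled motion trajectory and a mean colour, and candidate clips are ranked by a weighted sum of trajectory and colour distances, gated on label agreement.

```python
import math

def trajectory_distance(a, b):
    # Mean point-wise Euclidean distance between two trajectories,
    # assumed resampled to the same number of points.
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def match_score(query, candidate, w_traj=1.0, w_col=1.0):
    # Lower is better; label mismatch excludes the candidate outright.
    # The weights and the hard label gate are illustrative choices.
    if query["label"] != candidate["label"]:
        return math.inf
    d_traj = trajectory_distance(query["traj"], candidate["traj"])
    d_col = math.dist(query["colour"], candidate["colour"])
    return w_traj * d_traj + w_col * d_col

def rank_videos(query, database):
    # Return (video_id, score) pairs sorted by ascending distance.
    scored = [(vid, match_score(query, d)) for vid, d in database.items()]
    return sorted(scored, key=lambda p: p[1])
```

For example, a query fingerprint `{"label": "horse", "traj": [...], "colour": [...]}` would rank a horse clip with a similar trajectory and colour ahead of one moving differently, while clips of other object classes are excluded by the label gate.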
Our results show annotated sketches retrieving clips from a database of sports footage. Natural user interfaces such as sketch will have significant impact in the near term, given the trend toward non-classical platforms such as tablets, mobile phones, and large-scale touch-screen devices. SBVR also motivates cross-disciplinary studies of how users perceive and depict events.