Challenges in complexity and variability of multimedia data have led to revolutions in machine learning techniques. Multimedia data, such as digital images, audio streams and motion video programs, exhibit richer structures than simple, isolated data items. A number of pixels in a digital image collectively conveys certain visual content to viewers. A TV video program consists of both audio and image streams that unfold the underlying story.