Current Research Areas:

1. Multi-sensor Coordination and Control

2. Multi-sensor Social Interaction Analysis

3. Tweeting Cameras for Event Detection

4. Multi-Output Active Learning

5. Media Aesthetics and Role of Social Media

6. Contextual Video Advertising

1. Multi-sensor Coordination and Control
Advances in embedded systems and sensor technology are producing sensors that can process data on board and communicate with other sensors (e.g., smart cameras). Using multiple heterogeneous sensors is becoming increasingly practical for multimedia applications such as surveillance, smart environments, and 3D-TV. However, the processing and communication capabilities of these sensors are not fully exploited: in current deployments, sensors do not actively share information with one another to make decisions; instead, they are controlled by a human operator or a centralized server. Their potential for autonomous decision making, in tasks such as tracking and best-view computation, has received little attention in the literature. Automating a sensor system requires enabling its sensors to collaborate, which in turn calls for a framework or protocol through which sensors can communicate and reach group decisions. This project develops the theoretical background for sensors to take part in group actions and group decisions in multimedia applications. Many open problems in multimedia and computer vision can be solved more robustly when sensors collaborate with each other, and we envision future sensor systems to be intelligent, with sensors that learn from the system and make their decisions autonomously.
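As a minimal illustration of the kind of group decision such a framework enables, the Python sketch below shows smart cameras exchanging view-quality bids for a tracked target and independently agreeing on the best-view camera without a central server. The classes, message format, and scoring stub are hypothetical assumptions, not the project's actual protocol.

    # Hypothetical peer-to-peer best-view protocol among smart cameras;
    # names and the scoring stub are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class Bid:
        camera_id: str
        target_id: str
        view_score: float  # local estimate of view quality (size, angle, occlusion)

    class SmartCamera:
        def __init__(self, camera_id, peers=None):
            self.camera_id = camera_id
            self.peers = peers or []   # other cameras this one can talk to
            self.inbox = []            # bids received from the group

        def local_view_score(self, target_id):
            # Stub: a real camera would score target size, pose, occlusion, etc.
            return hash((self.camera_id, target_id)) % 100 / 100.0

        def broadcast_bid(self, target_id):
            bid = Bid(self.camera_id, target_id, self.local_view_score(target_id))
            for peer in self.peers:
                peer.inbox.append(bid)
            self.inbox.append(bid)     # a camera also considers its own bid

        def decide_best_view(self, target_id):
            # Every camera applies the same deterministic rule to the same set
            # of bids, so the group agrees without a centralized server.
            bids = [b for b in self.inbox if b.target_id == target_id]
            return max(bids, key=lambda b: b.view_score).camera_id

    # Usage: three cameras bid on the same target and reach the same decision.
    cams = [SmartCamera(f"cam{i}") for i in range(3)]
    for c in cams:
        c.peers = [p for p in cams if p is not c]
    for c in cams:
        c.broadcast_bid("person-7")
    print({c.camera_id: c.decide_best_view("person-7") for c in cams})
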
People: Prabhu Natarajan

2. Multi-sensor Social Interaction Analysis
Social interactions play an important role in our daily lives: people organize themselves into groups to share views, opinions, and thoughts. In the literature, social interaction analysis is regarded as a type of complex human activity analysis, which is an important area of computer vision research. However, different works on human activity analysis define social interaction differently. From one view, these definitions reflect the multifaceted, intuitive nature of interaction; from another, they highlight the lack of a common (and general) notion of interaction. In addition, recent developments in sensor technology, such as the emergence of new sensors, advanced processing techniques, and improved processing hardware, provide an opportunity to improve interaction analysis technologies by making use of more sensors in terms of both modality and quantity.
People: Gan Tian

3. Tweeting Cameras for Event Detection
This research revolves around multi-modal sensor fusion for situation awareness and recognition. In particular, it focuses on fusing multimedia data from physical sensors and social sensors for event detection, and on paradigms and architectures that organize sensors of multiple modalities socially.
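The Python sketch below illustrates the basic fusion step: a camera detector's recent firings are combined with keyword hits from a social stream by weighted late fusion. The weights, threshold, and function names are assumptions for illustration, not the project's actual design.

    # Toy late fusion of a physical sensor (camera detector) with a social
    # sensor (keyword counts from a tweet stream); parameters are assumed.
    def camera_event_score(detections):
        """Fraction of recent frames in which the detector fired."""
        return sum(detections) / max(len(detections), 1)

    def social_event_score(tweets, keywords):
        """Fraction of recent tweets mentioning any event keyword."""
        hits = sum(any(k in t.lower() for k in keywords) for t in tweets)
        return hits / max(len(tweets), 1)

    def fused_event_detected(detections, tweets, keywords,
                             w_camera=0.6, w_social=0.4, threshold=0.5):
        # A weak visual cue can still trigger an alert when social chatter
        # corroborates it, and vice versa.
        score = (w_camera * camera_event_score(detections)
                 + w_social * social_event_score(tweets, keywords))
        return score >= threshold, score

    detected, score = fused_event_detected(
        detections=[1, 1, 0, 1],                       # detector fired in 3 of 4 frames
        tweets=["Smoke near the station!", "nice day"],
        keywords=["smoke", "fire"],
    )
    print(detected, round(score, 2))                   # True 0.65
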
People: Wang Yuhui

4. Multi-Output Active Learning
This research addresses the problem of active learning of a multi-output Gaussian process (MOGP) model representing multiple types of coexisting, correlated environmental phenomena. In contrast to existing works, our active learning problem involves selecting not just the most informative sampling locations to be observed but also the types of measurements at each selected location, so as to minimize the predictive uncertainty (i.e., posterior joint entropy) of a target phenomenon of interest under a sampling budget. Unfortunately, such an entropy criterion scales poorly in the number of candidate sampling locations and selected observations when optimized. To resolve this issue, we first exploit a structure common to sparse MOGP models to derive a novel active learning criterion. Then, we exploit a relaxed form of the submodularity property of our new criterion to devise a polynomial-time approximation algorithm that guarantees a constant-factor approximation of the criterion value achieved by the optimal set of selected observations. Empirical evaluation on real-world datasets (e.g., concentrations of heavy metals in polluted soil, indoor environmental qualities such as temperature and lighting) shows that our proposed approach outperforms existing algorithms for active learning of MOGP and single-output GP models.
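The toy Python sketch below conveys the flavor of greedy, budgeted selection of (location, measurement type) observations, using a simple intrinsic coregionalization model in place of the sparse MOGP and a naive greedy entropy rule. It is a simplified stand-in for intuition only, not the approximation algorithm developed in this work.

    # Naive greedy active learning on a toy two-output GP (ICM kernel);
    # kernel parameters, noise levels, and the greedy rule are illustrative.
    import numpy as np

    locations = np.linspace(0.0, 1.0, 20)   # candidate sampling locations
    n_types = 2                             # measurement types (outputs)
    target_type = 0                         # phenomenon we want to predict

    # ICM covariance: Cov[(x, i), (x', j)] = B[i, j] * k(x, x')
    B = np.array([[1.0, 0.8], [0.8, 1.0]])  # correlation between output types

    def k(x, xp, ls=0.2):
        return np.exp(-0.5 * (x - xp) ** 2 / ls ** 2)

    pairs = [(x, t) for t in range(n_types) for x in locations]
    K = np.array([[B[ti, tj] * k(xi, xj) for (xj, tj) in pairs]
                  for (xi, ti) in pairs]) + 1e-6 * np.eye(len(pairs))
    target_idx = [i for i, (_, t) in enumerate(pairs) if t == target_type]

    def posterior_entropy(obs_idx):
        # Log-determinant of the target's posterior covariance, a proxy for
        # its posterior joint entropy given the selected observations.
        if not obs_idx:
            S = K[np.ix_(target_idx, target_idx)]
        else:
            Koo = K[np.ix_(obs_idx, obs_idx)] + 0.01 * np.eye(len(obs_idx))
            Kto = K[np.ix_(target_idx, obs_idx)]
            S = K[np.ix_(target_idx, target_idx)] - Kto @ np.linalg.solve(Koo, Kto.T)
        return np.linalg.slogdet(S + 1e-9 * np.eye(len(target_idx)))[1]

    # Greedy budgeted selection: each step adds the (location, type) pair
    # that most reduces the target phenomenon's predictive uncertainty.
    budget, selected = 5, []
    for _ in range(budget):
        candidates = [i for i in range(len(pairs)) if i not in selected]
        best = min(candidates, key=lambda i: posterior_entropy(selected + [i]))
        selected.append(best)
        x, t = pairs[best]
        print(f"observe type {t} at location {x:.2f}")
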
People: Zhang Yehong

5. Media Aesthetics and Role of Social Media
Advances in digital photography and wireless network technologies enable us to capture and share our experiences on the move. To assist photographers, camera devices offer intelligent features such as automatic focus and face detection, yet capturing high-quality images remains a challenge for amateur users. Professional photographers tune various parameters to the context, which requires training and considerable experience. Post-processing tools can enhance photo quality, but they are time-consuming, and little can be done about the quality and composition of a badly captured image without tedious editing. This research aims to provide real-time assistance so that amateur users can capture high-quality photographs and home videos. Our approach focuses on learning the art of photography and videography from multimedia content shared on social media.
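As one example of a real-time cue such an assistant could compute in the viewfinder, the Python sketch below scores how closely a subject sits to a rule-of-thirds power point; the scoring function and tolerance are hypothetical choices for illustration only.

    # Rule-of-thirds composition score; the tolerance is an assumed value.
    import math

    def rule_of_thirds_score(subject_xy, frame_wh, tolerance=0.15):
        """Score in [0, 1]: 1.0 when the subject lies on a thirds intersection."""
        x, y = subject_xy
        w, h = frame_wh
        # The four rule-of-thirds intersections, in normalized coordinates.
        power_points = [(px, py) for px in (1 / 3, 2 / 3) for py in (1 / 3, 2 / 3)]
        nx, ny = x / w, y / h
        d = min(math.hypot(nx - px, ny - py) for px, py in power_points)
        return max(0.0, 1.0 - d / tolerance)

    # A live assistant could suggest re-framing whenever the score is low.
    print(rule_of_thirds_score((640, 360), (1920, 1080)))   # on a thirds point: 1.0
    print(rule_of_thirds_score((960, 540), (1920, 1080)))   # dead center: 0.0
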
People: Yogesh S Rawat

6. Contextual Video Advertising
The explosive growth of multimedia data on the Internet has created huge opportunities for online video advertising. In this work, we propose a novel advertising technique called SalAd, which utilizes textual information, visual content, and webpage saliency to automatically associate the most suitable companion ads with online videos. Unlike most existing approaches, which focus only on selecting the most relevant ads, SalAd further considers the saliency of the selected ads to reduce intentional ignorance. SalAd consists of three basic steps. Given an online video and a set of advertisements, we first roughly identify a set of relevant ads through textual matching. We then carefully select a subset of candidates through visual content matching, so that the selected ads are contextually relevant to the online video in terms of both textual information and visual content. Finally, we select the most salient ad among the relevant candidates as the most appropriate one. To demonstrate the effectiveness of our method, we conducted a rigorous eye-tracking experiment on two ad datasets. The experimental results show that our method enhances user engagement with the ad content while maintaining users' quality of video viewing experience.
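The Python sketch below mirrors SalAd's three-stage structure at a schematic level: coarse textual filtering, visual refinement, then a saliency-based pick. The similarity functions, feature vectors, and ad records are placeholders, not the actual features or models used in this work.

    # Schematic three-stage ad selection; features and scores are mock data.
    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def select_ad(video, ads, text_top=10, visual_top=3):
        # Step 1: coarse filter by textual relevance (e.g., keyword vectors).
        by_text = sorted(ads, key=lambda a: cosine(video["text"], a["text"]),
                         reverse=True)[:text_top]
        # Step 2: refine by visual content similarity (e.g., CNN features).
        by_visual = sorted(by_text, key=lambda a: cosine(video["visual"], a["visual"]),
                           reverse=True)[:visual_top]
        # Step 3: among the contextually relevant candidates, keep the most
        # salient ad to reduce intentional ignorance by viewers.
        return max(by_visual, key=lambda a: a["saliency"])

    rng = np.random.default_rng(1)
    ads = [{"text": rng.normal(size=8), "visual": rng.normal(size=8),
            "saliency": rng.uniform()} for _ in range(50)]
    video = {"text": rng.normal(size=8), "visual": rng.normal(size=8)}
    print(round(select_ad(video, ads)["saliency"], 3))
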
People: Chen Xiang

Copyright © 2016 Multimedia Analysis and Synthesis Lab