1. Multi-Sensor Coordination and Control
Advances in embedded systems and sensor technology are producing sensors that can process data on board and communicate with other sensors (e.g., smart cameras). Using multiple heterogeneous sensors is becoming increasingly practical for multimedia applications such as surveillance, smart environments, and 3D-TV, yet the processing and communication capabilities of these sensors are not fully exploited. In current deployments, sensors do not actively share information with one another to make decisions; instead, they are controlled by a human operator or a centralized server, and their capacity for decision making, such as tracking or best-view computation, has received little attention in the literature. Automating a sensor system requires enabling its sensors to collaborate, and making group decisions requires a framework or protocol through which sensors can talk to one another. This project develops the theoretical background for sensors to take part in group actions and group decisions in multimedia applications, as sketched below. Many open problems in multimedia and computer vision can be solved more robustly by making sensors collaborate with each other. We envision future sensor systems that are intelligent, in which sensors learn from the system and make their decisions autonomously.
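As a concrete illustration, here is a minimal, hypothetical sketch (all class and field names are ours, not the project's) of the kind of serverless group decision described above: each smart camera broadcasts a local best-view score for a target, and every camera independently arrives at the same decision about which of them should track it.

```python
from dataclasses import dataclass

@dataclass
class ViewReport:
    camera_id: str
    target_id: int
    score: float            # local best-view estimate, e.g. target size * frontalness

class SmartCamera:
    """Each camera keeps its own inbox and decides locally from the
    reports it has seen; no central server is involved."""
    def __init__(self, camera_id, group):
        self.camera_id = camera_id
        self.group = group  # shared list of all cameras in the system
        self.inbox = []

    def propose_view(self, target_id, score):
        report = ViewReport(self.camera_id, target_id, score)
        for cam in self.group:         # broadcast to every camera, self included
            cam.inbox.append(report)

    def decide(self, target_id):
        """Group decision: the camera with the highest reported score
        tracks the target; every peer computes the same answer."""
        reports = [r for r in self.inbox if r.target_id == target_id]
        return max(reports, key=lambda r: r.score).camera_id

group = []
a, b = SmartCamera("cam_a", group), SmartCamera("cam_b", group)
group.extend([a, b])
a.propose_view(target_id=7, score=0.42)
b.propose_view(target_id=7, score=0.81)
print(a.decide(7), b.decide(7))        # both agree: cam_b has the best view
```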
People: Prabhu Natarajan
2. Multi-Sensor Social Interaction Analysis
Social interactions play an important role in our daily lives: people organize themselves in groups to share views, opinions, and thoughts. In the literature, social interaction analysis is regarded as one type of complex human activity analysis, which is an important area of computer vision research. However, works on human activity analysis define social interaction in different ways. From one view, these definitions describe the multifaceted nature of interaction as an intuitive concept; from another, they highlight the lack of a common (and general) notion of interaction. In addition, recent developments in sensor technology, such as the emergence of new sensors, advanced processing techniques, and improved processing hardware, provide an opportunity to improve interaction analysis by making use of more sensors in terms of both modality and quantity.
People: Gan Tian
3. Tweeting Cameras for Event Detection
This research revolves around multi-modal sensor fusion for situation awareness and recognition. In particular, it focuses on fusing multimedia data from physical sensors and social sensors for event detection, and on paradigms and architectures that organize sensors of multiple modalities socially.
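As one hedged illustration of what such physical-social fusion might look like (the keyword set, time window, and threshold below are our assumptions, not the project's method), a physical-sensor alert can be confirmed as an event only when a burst of matching posts from a social stream co-occurs with it in time:

```python
from datetime import datetime, timedelta

KEYWORDS = {"fire", "smoke", "crowd"}   # hypothetical event vocabulary
WINDOW = timedelta(minutes=5)           # co-occurrence window around an alert

def detect_events(sensor_alerts, tweets, min_tweets=3):
    """sensor_alerts: list of (timestamp, label) from cameras/detectors.
    tweets: list of (timestamp, text) from a social stream.
    Returns alerts corroborated by >= min_tweets keyword-matching posts
    within +/- WINDOW of the alert time."""
    events = []
    for t_alert, label in sensor_alerts:
        support = [text for t, text in tweets
                   if abs(t - t_alert) <= WINDOW
                   and KEYWORDS & set(text.lower().split())]
        if len(support) >= min_tweets:
            events.append((t_alert, label, support))
    return events

now = datetime.now()
alerts = [(now, "smoke_detected")]
tweets = [(now + timedelta(minutes=i), "I can see smoke downtown") for i in range(3)]
print(detect_events(alerts, tweets))    # the alert is confirmed by three posts
```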
People: Wang Yuhui
4. Multi-Output Active Learning
This research
work addresses the problem of active learning of a
multi-output Gaussian process (MOGP) model
representing multiple types of coexisting correlated
environmental phenomena. In contrast to existing
works, our active learning problem involves selecting
not just the most informative sampling locations to be
observed but also the types of measurements at each
selected location for minimizing the predictive
uncertainty (i.e., posterior joint entropy) of a
target phenomenon of interest given a sampling budget.
Unfortunately, such an entropy criterion scales poorly
in the numbers of candidate sampling locations and
selected observations when optimized. To resolve this
issue, we first exploit a structure common to sparse
MOGP models for deriving a novel active learning
criterion. Then, we exploit a relaxed form of submodularity of our new criterion to devise a polynomial-time approximation algorithm that guarantees a constant-factor approximation of the performance achieved by the optimal set of selected observations. Empirical evaluation on real-world datasets (e.g., heavy metal concentrations in polluted soil, and indoor environmental qualities such as temperature and lighting) shows that our proposed approach outperforms existing algorithms for active learning of MOGP and single-output GP models.
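For intuition, here is a self-contained toy sketch (our own simplification, not the paper's algorithm) of the underlying selection problem: observations are (location, measurement-type) pairs under an intrinsic-coregionalization MOGP, and a brute-force greedy loop picks the pair that most reduces the posterior entropy of the target type. It is exactly this criterion's poor scaling that the paper avoids via sparse MOGP structure and the relaxed-submodularity guarantee; the kernel, coregionalization matrix, noise level, and budget below are all illustrative.

```python
import numpy as np

def rbf(x1, x2, ls=2.0):
    """Squared-exponential kernel over 1-D locations."""
    return np.exp(-0.5 * (np.subtract.outer(x1, x2) / ls) ** 2)

def joint_cov(la, ta, lb, tb, B, ls=2.0):
    """ICM covariance between (location, type) pairs: B[i, j] * k(x, x')."""
    return B[np.ix_(ta, tb)] * rbf(la, lb, ls)

def target_entropy(sel_locs, sel_types, targ_locs, targ_types, B, noise=0.1):
    """0.5 * logdet of the posterior covariance of the target pairs given
    noisy observations at the selected pairs (additive constants dropped,
    since we only compare candidates)."""
    Ktt = joint_cov(targ_locs, targ_types, targ_locs, targ_types, B)
    if len(sel_locs):
        Kss = joint_cov(sel_locs, sel_types, sel_locs, sel_types, B) \
              + noise ** 2 * np.eye(len(sel_locs))
        Kts = joint_cov(targ_locs, targ_types, sel_locs, sel_types, B)
        Ktt = Ktt - Kts @ np.linalg.solve(Kss, Kts.T)
    return 0.5 * np.linalg.slogdet(Ktt + 1e-9 * np.eye(len(targ_locs)))[1]

B = np.array([[1.0, 0.8],
              [0.8, 1.0]])                       # correlation between the two types
grid = np.linspace(0.0, 10.0, 21)
cands = [(x, t) for x in grid for t in (0, 1)]   # candidate (location, type) pairs
targ_locs = np.linspace(0.0, 10.0, 50)           # predict type 0 everywhere
targ_types = np.zeros(50, dtype=int)

sel_locs, sel_types = [], []
for _ in range(6):                               # sampling budget
    best = min(cands, key=lambda c: target_entropy(
        np.array(sel_locs + [c[0]]), np.array(sel_types + [c[1]], dtype=int),
        targ_locs, targ_types, B))
    sel_locs.append(best[0]); sel_types.append(best[1]); cands.remove(best)

print(list(zip(np.round(sel_locs, 2), sel_types)))
```

Because the two types are strongly correlated here, the greedy picks may include type-1 measurements even though only type 0 is the target, which is the crux of the multi-output setting.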
People: Zhang Yehong
5. Media Aesthetics and Role of Social Media
Advances in digital photography and wireless network technologies enable us to capture and share our experiences on the move. To assist photographers in taking better photos, camera devices offer intelligent features such as automatic focus and face detection, but it still remains a challenge for an amateur user to capture high-quality images. Professional photographers exploit various parameters based on the context, which requires training and a great deal of experience. Although post-processing tools are available to enhance photo quality, they are very time consuming, and once an image is badly captured, little can be done about its quality and composition without tedious post-processing. This research aims to provide real-time assistance to amateur users so that they can capture high-quality photographs and home videos. Our approach focuses on learning the art of photography and videography from multimedia content shared on social media.
People: Yogesh S Rawat
6. Contextual Video Advertising
The explosive growth of multimedia data on the Internet has created huge opportunities for online video advertising. This project proposes a novel advertising technique called SalAd, which utilizes textual information, visual content, and webpage saliency to automatically associate the most suitable companion ads with online videos. Unlike most existing approaches, which focus only on selecting the most relevant ads, SalAd further considers the saliency of the selected ads to reduce intentional ignorance. SalAd consists of three basic steps. Given an online video and a set of advertisements, we first roughly identify a set of relevant ads based on textual matching. We then carefully select a subset of candidates based on visual content matching; in this regard, the selected ads are contextually relevant to the online video content in terms of both textual information and visual content. We finally choose the most salient ad among these relevant ads as the most appropriate one. To demonstrate the effectiveness of our method, we have conducted a rigorous eye-tracking experiment on two ad datasets. The experimental results show that our method enhances user engagement with the ad content while maintaining the users' video viewing experience.
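A minimal sketch of that three-step pipeline, assuming stand-in inputs (the 'visual' feature vectors and precomputed 'saliency' scores below are hypothetical placeholders for the richer features the project computes):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_ad(video_text, video_visual, ads, k_text=20, k_visual=5):
    """SalAd-style selection (illustrative re-creation). Each ad is a dict
    with hypothetical fields 'text', 'visual' (feature vector), and
    'saliency' (precomputed saliency score)."""
    # Step 1: coarse filtering by textual relevance (TF-IDF cosine).
    tfidf = TfidfVectorizer().fit_transform([video_text] + [a["text"] for a in ads])
    text_sim = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    pool = sorted(range(len(ads)), key=lambda i: -text_sim[i])[:k_text]

    # Step 2: refine by visual-content similarity (cosine of feature vectors).
    v = np.asarray(video_visual, dtype=float)
    def vis_sim(i):
        a = np.asarray(ads[i]["visual"], dtype=float)
        return a @ v / (np.linalg.norm(a) * np.linalg.norm(v) + 1e-9)
    pool = sorted(pool, key=lambda i: -vis_sim(i))[:k_visual]

    # Step 3: among the contextually relevant candidates, pick the most salient ad.
    return max(pool, key=lambda i: ads[i]["saliency"])

# toy usage with made-up features
rng = np.random.default_rng(0)
ads = [{"text": "sports shoes sale", "visual": rng.random(128),
        "saliency": rng.random()} for _ in range(50)]
best = select_ad("highlights of the football match", rng.random(128), ads)
print("selected ad index:", best)
```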
People: Chen Xiang