With the current exponential growth of video-based social networks, video retrieval using natural language is receiving ever-increasing attention. Most existing approaches tackle this task by extracting individual frame-level spatial features to represent ...
Predictive scene parsing is a task of assigning pixel-level semantic labels to a future frame of a video. It has many applications in vision-based artificial intelligent systems, e.g., autonomous driving and robot navigation. Although previous work has sho ...
Social media have transformed the Web into an interactive sharing platform where users upload data and media, comment on, and share this content within their social circles. Each content item is associated with an abundance of metadata and related informat ...
A common trend in machine learning and pattern classification research is the exploitation of massive amounts of information in order to achieve an increase in performance. In particular, learning from huge collections of data obtained from the web, and us ...