Multimedia knowledge extraction: Get things right about complex information
Heng Ji
Rensselaer Polytechnic Institute, USA
Abstract
Knowledge extraction and representation have been common goals for both the text domain and the visual domain. Several significant benchmarking efforts, such as TREC and TRECVID, have demonstrated important progress in information extraction from data of different modalities. However, research in no single modality is complete or fully reliable. Systems built on text Knowledge Base Population (KBP) tools cover important high-level events, entities, and relations, but they often miss the fine-grained details of physical scenes, objects, or activities. Visual recognition systems, despite recent progress, still fall short of extracting high-level semantics comparable to those obtained from text. In this talk, we will present our recent efforts in developing a scalable, portable, and adaptive multimedia knowledge construction framework that exploits cross-media knowledge, resource transfer, and bootstrapping to dramatically scale up cross-media knowledge extraction. We have developed novel cross-media methods (including a cross-media deep learning model and "Liberal" KBP) to automatically construct multimodal semantic schemas for events, improve extraction through inference and conditional detection, and enrich knowledge through cross-media, cross-lingual event co-reference and linking.
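To make the complementarity of the two modalities concrete, the following is a minimal, purely illustrative sketch (not the framework described above): it assumes a hypothetical text KBP output with an event type, argument roles, and a confidence score, plus hypothetical visual object detections with labels and scores, and merges them into a single multimodal event record. All names, fields, and scores are invented for illustration.

    # Illustrative only: fusing text-derived event structure with visual detections.
    from dataclasses import dataclass, field

    @dataclass
    class EventRecord:
        event_type: str                                       # e.g., "Transport" from a text KBP pipeline
        arguments: dict = field(default_factory=dict)         # role -> entity mention (text side)
        visual_objects: list = field(default_factory=list)    # object labels kept from the vision side
        confidence: float = 0.0                               # simple average of the two modality scores

    def fuse(text_event: dict, visual_detections: list, threshold: float = 0.5) -> EventRecord:
        """Keep visual detections above a confidence threshold and attach them
        to the text-extracted event, averaging the two modality confidences."""
        kept = [d for d in visual_detections if d["score"] >= threshold]
        vis_conf = max((d["score"] for d in kept), default=0.0)
        return EventRecord(
            event_type=text_event["type"],
            arguments=text_event.get("args", {}),
            visual_objects=[d["label"] for d in kept],
            confidence=(text_event["score"] + vis_conf) / 2,
        )

    if __name__ == "__main__":
        text_event = {"type": "Transport",
                      "args": {"Agent": "troops", "Destination": "border"},
                      "score": 0.8}
        detections = [{"label": "truck", "score": 0.9}, {"label": "tree", "score": 0.3}]
        print(fuse(text_event, detections))

The sketch only shows the data-structure view of cross-media fusion; the actual methods presented in the talk (cross-media deep learning, "Liberal" KBP, cross-lingual event co-reference and linking) go well beyond this kind of simple score averaging.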
Biography
Email: jih@rpi.edu