With the explosive growth of web data, effective and efficient technologies are in urgent needs for retrieving semantically relevant contents of heterogeneous modalities. Previous studies construct global transformations to project the heterogeneous data into a measurable subspace. However, global projections cannot appropriately adapt to diverse contents, and the naturally existing multi-level semantic relation in web data is ignored. We study the problem of semantic coherent retrieval, where documents from different modalities should be ranked by the semantic relevance to the queries. Accordingly, we propose TINA, a correlation learning method by Adaptive Hierarchical Semantic Aggregation. First, by joint modeling of content and ontology similarities, we build a semantic hierarchy to measure multi-level semantic relevance. Second, with a set of local linear projections aggregated by gating functions, we optimize the structure risk objective function that involves semantic coherence measurement, local projection consistency and the complexity penalty of local projections. Therefore, semantic coherence and a better bias-variance trade-off can be achieved by TINA. Extensive experiments on widely used NUS-WIDE and ICML-Challenge datasets demonstrate that TINA outperforms state-of-the-art, and achieves better adaptation to the multi-level semantic relation and content divergence.
|Original language||English (US)|
|Number of pages||10|
|Journal||Proceedings - IEEE International Conference on Data Mining, ICDM|
|State||Published - Jan 26 2015|
|Event||14th IEEE International Conference on Data Mining, ICDM 2014 - Shenzhen, China|
Duration: Dec 14 2014 → Dec 17 2014
All Science Journal Classification (ASJC) codes