Minimally Supervised Learning for Unconstrained Conceptual Property Extraction

Abstract

We present a highly performant, minimally supervised system for the challenging task of unconstrained conceptual property extraction (e.g., "banana is fruit", "spoon used for eating"). Our technique employs lightly supervised support vector machines to acquire promising features from our corpora (Wikipedia and UKWAC) and uses those features to anchor the search for plausible unconstrained relations in the same corpora. We introduce a novel backing-off method to find the most likely relation for each concept/feature pair, and we compute a number of metrics that act as potential indicators of true relations. The system is trained with a stochastic search algorithm that finds the optimal reweighting of these metrics. We also introduce a human semantic-similarity dataset; our output correlates strongly with human similarity judgements. Both our gold standard comparison and our direct human evaluation improve on previous approaches, with the human-judgement evaluation showing a significant increase of 20 percentage points.
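
To make the metric-reweighting step concrete, the following is a minimal sketch of one possible stochastic search, assuming a simple random hill climb over metric weights. The abstract does not specify the search algorithm, the metrics, or the objective, so the function names (`score`, `accuracy`, `stochastic_search`), the two unnamed metrics, and the toy candidates below are all hypothetical illustrations rather than the system's actual components.

```python
import random

def score(weights, metrics):
    """Weighted sum of one candidate relation's metric values."""
    return sum(w * m for w, m in zip(weights, metrics))

def accuracy(weights, candidates, gold):
    """Fraction of concepts whose top-scoring candidate is a gold relation."""
    hits = 0
    for concept, cands in candidates.items():
        best = max(cands, key=lambda c: score(weights, c[1]))
        hits += best[0] in gold[concept]
    return hits / len(candidates)

def stochastic_search(candidates, gold, n_metrics, steps=500, seed=0):
    """Random hill climb (an assumed stand-in for the paper's search)."""
    rng = random.Random(seed)
    weights = [1.0] * n_metrics
    best_acc = accuracy(weights, candidates, gold)
    for _ in range(steps):
        # Perturb every weight with Gaussian noise.
        trial = [w + rng.gauss(0, 0.5) for w in weights]
        acc = accuracy(trial, candidates, gold)
        if acc >= best_acc:  # accept ties so the search can cross plateaus
            weights, best_acc = trial, acc
    return weights, best_acc

# Hypothetical toy data: each candidate is (relation, [metric_1, metric_2]).
candidates = {
    "banana": [("is fruit", [0.9, 0.1]), ("is vehicle", [0.2, 0.95])],
    "spoon": [("used for eating", [0.8, 0.3]), ("is animal", [0.1, 0.9])],
}
gold = {"banana": {"is fruit"}, "spoon": {"used for eating"}}

weights, best_acc = stochastic_search(candidates, gold, n_metrics=2)
```

With uniform weights the toy example ranks "is vehicle" above "is fruit" for banana; the search only needs to shift weight toward the first metric to recover both gold relations. A real system would optimise against held-out gold-standard data rather than the training items themselves.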

