Comparative Analysis of Semantic Models and Corpora Choice when using Semantic Fields to Predict Eye Movement on Webpages

Abstract

Nine models are compared on their ability to predict eye-tracking data collected from 49 participants performing goal-oriented search tasks on a total of 1809 webpages. Six of these models are variants of the Semantic Fields model (Stone and Dennis, 2007), which estimates the semantic saliency of different areas displayed on webpages; the variants are formed by combining three semantic models with two corpus types. Latent Semantic Analysis, Sparse Non-Negative Matrix Factorization, and Vectorspace were used to generate similarity comparisons between goal text and webpage text in the semantic component of the Semantic Fields model. Surprisingly, Vectorspace was consistently the best-performing semantic model in this study. Two types of corpora, or knowledge bases, were used to inform the semantic models: the well-known TASA corpus and corpora constructed from the Wikipedia encyclopedia. In all cases, the Wikipedia corpora outperformed the TASA corpora. Three further baseline models (Flat, Non-Flat, and No-Model) were included as points of comparison for evaluating the effectiveness of the Semantic Fields models. In all cases, the Semantic Fields models outperformed the baseline models when predicting the participants' eye-tracking data.
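As a rough illustration of the kind of goal-to-webpage text comparison described above, the following minimal sketch scores page regions against a search goal using cosine similarity over raw term counts. It is not the authors' implementation: the abstract does not specify term weighting, how the corpus informs the Vectorspace model, or how the Semantic Fields model turns similarities into saliency. The function names, the tokenizer, and the example regions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the raw term-count vectors of two texts.

    Assumption: plain term counts with no corpus-derived weighting.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction in the vector space
    return dot / (norm_a * norm_b)


def rank_regions(goal, regions):
    """Return (region_name, similarity) pairs, most similar to the goal first."""
    scored = [(name, cosine_similarity(goal, text)) for name, text in regions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical goal and page regions, for illustration only.
    goal = "cheap flights sydney"
    regions = {
        "navigation": "home about contact",
        "deals": "cheap flights to sydney and melbourne",
    }
    for name, score in rank_regions(goal, regions):
        print(name, round(score, 3))
```

In a sketch like this, regions whose text overlaps more with the goal receive higher scores, which is the intuition behind treating those areas as more semantically salient. A corpus-informed model would replace the raw counts with vectors derived from a knowledge base such as TASA or Wikipedia.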

