Computing the Canonical Subset of User Protocols


A common problem in cognitive science research is the large volume of behavioral protocol data recorded during the execution of the tasks being studied. The analysis of these large data sets has often been a tedious and time-consuming process, and automated analysis methods have been slow to develop. We have developed an automated method to find canonical behaviors: a small subset of protocols that is most representative of the full data set, providing a "big picture" view of the data with as few protocols as possible. The method takes advantage of recent algorithmic developments in computational vision, which we have adapted to the comparison of behavioral protocols. No a priori model is required, just a similarity measure between pairs of behaviors. Initial experiments show that canonical sets of web-browsing protocols found by our method compare well with those found by expert human coders.
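To make the idea of a canonical subset concrete, the following is a minimal sketch in Python. It is not the method described above, whose algorithm is not given here. Instead it assumes a simple greedy "coverage" heuristic: repeatedly add the protocol that most improves how well every protocol is matched by its closest chosen representative. The function name `greedy_canonical_subset`, the toy similarity matrix, and the assumption that similarities lie in [0, 1] are all illustrative choices, not details from the original work.

```python
from typing import List, Sequence


def greedy_canonical_subset(sim: Sequence[Sequence[float]], k: int) -> List[int]:
    """Greedily pick up to k row indices of a pairwise similarity matrix.

    Hypothetical helper: sim[i][j] is assumed to be a similarity in [0, 1]
    between protocols i and j. Each step adds the protocol that most increases
    the total similarity of all protocols to their best chosen representative.
    """
    n = len(sim)
    chosen: List[int] = []
    # best[j] = similarity of protocol j to its closest representative so far
    best = [0.0] * n
    for _ in range(min(k, n)):
        best_gain, best_idx = -1.0, -1
        for c in range(n):
            if c in chosen:
                continue
            # Improvement in coverage if protocol c became a representative
            gain = sum(max(sim[c][j] - best[j], 0.0) for j in range(n))
            if gain > best_gain:
                best_gain, best_idx = gain, c
        chosen.append(best_idx)
        best = [max(best[j], sim[best_idx][j]) for j in range(n)]
    return chosen


# Toy data (invented for illustration): protocols 0-2 resemble one another,
# and protocols 3-4 resemble one another.
sim = [
    [1.00, 0.95, 0.80, 0.10, 0.20],
    [0.95, 1.00, 0.90, 0.20, 0.10],
    [0.80, 0.90, 1.00, 0.30, 0.10],
    [0.10, 0.20, 0.30, 1.00, 0.90],
    [0.20, 0.10, 0.10, 0.90, 1.00],
]
reps = greedy_canonical_subset(sim, k=2)
print(reps)
```

On this toy matrix the sketch selects one representative from each of the two groups. Note that it needs only the pairwise similarities, with no model of the underlying behavior, which is the property the abstract emphasizes.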
