This paper addresses the nature of the visual representations associated with complex structured objects, and the role of these representations in perceptual organization. We use a novel experimental paradigm to probe subjects' intuitions about parsing a scene consisting of overlapping two-dimensional objects. The objects are generated from an abstract two-dimensional image grammar, which specifies the set of possible configurations of object parts. We show that participants' performance on the task depends both on prior experience with the object class and on structural cues. This indicates that structural representations exert a top-down influence on parsing. To address the question of representation type, we use a computational model of object matching in conjunction with various probabilistic representational models. Our simulations indicate that grammar-based representations derived from the original grammars explain human performance on this task better than both more restrictive exemplar-based representations and more inclusive, over-generalizing grammar-based representations.